
Building an Enterprise-Grade Claude Code Harness

April 12, 2026
3 min read
Tianli Zeng
ai
llm
claude
developer-tools
automation

Why We Need Harness Engineering

Claude Code is a powerful AI conversational tool out of the box, but using it as-is, with no supporting structure, creates four problems:

  1. Context loss: Every new session starts from scratch — you have to re-explain the project background
  2. Repetitive operations: The same workflows get described manually each time, which is inefficient
  3. Inconsistent quality: Without a standardized process, output quality depends entirely on prompt quality
  4. Knowledge silos: Experience accumulated in one session can't be transferred to the next

Harness Engineering is the systematic answer to these problems — it wraps AI conversational capability into reusable, manageable, extensible engineering infrastructure.

Architecture

A Layered System

Commands (43)  → Standardized operations triggered by the user
Skills (14)    → Context-aware intelligent activation
Hooks          → Lifecycle automation
Memory         → Cross-session persistent memory
MCP            → External system integration

Commands: Standardized Operations

Commands are the most direct interaction layer. Each command is a Markdown file that defines a complete prompt, its input/output specification, and its constraints.
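
As a minimal sketch of how a harness might read such a file, the snippet below splits a command definition into its parts. The section names (Prompt, Input, Output, Constraints), the heading layout, and the `parse_command` helper are all assumptions for illustration; the actual file format used in this setup is not shown here.

```python
import re

# Hypothetical command file; the section names are assumed, not taken from the real setup.
EXAMPLE_COMMAND = """\
# fix-heading

## Prompt
Normalize heading levels in the given document.

## Input
Path to a Markdown file.

## Output
The same file with consistent heading levels.

## Constraints
- Do not change body text.
"""


def parse_command(text: str) -> dict[str, str]:
    """Split a command file into a name plus one entry per '## Section'."""
    result: dict[str, str] = {}
    current = None
    buffer: list[str] = []
    for line in text.splitlines():
        title = re.match(r"^# (.+)$", line)
        section = re.match(r"^## (.+)$", line)
        if title and "name" not in result:
            result["name"] = title.group(1).strip()
        elif section:
            # Close the previous section before starting a new one.
            if current is not None:
                result[current] = "\n".join(buffer).strip()
            current = section.group(1).strip().lower()
            buffer = []
        elif current is not None:
            buffer.append(line)
    if current is not None:
        result[current] = "\n".join(buffer).strip()
    return result


command = parse_command(EXAMPLE_COMMAND)
print(command["name"])
print(sorted(command))
```

Keeping the specification in named sections means a command can be checked for completeness (for example, a missing Constraints section) before it is ever sent to the model.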

Organized by domain, including:

  • Document processing (8): draft, review, fix-refs, fix-heading, md2word, review-deep, edit-docx, fix-numbering
  • Project management (7): ship, deploy, audit, promote, groom, health, tidy
  • System tools (5): scan, recap, handoff, context, harness

Design principles:

  • Single responsibility: each command does one thing
  • Composable: groom = pull + audit + fix + review + ship
  • Idempotent: re-running a command yields the same result, with no additional side effects
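
The composability and idempotence principles can be sketched with plain functions. This is only an illustration under stated assumptions: the `State` dictionary, `make_step`, and `compose` are hypothetical names, and the real commands are prompts rather than Python functions.

```python
from typing import Callable

# Shared state that each step reads and updates; the keys are illustrative.
State = dict[str, bool]
Step = Callable[[State], State]


def make_step(name: str) -> Step:
    """Build a step that marks itself done; running it again changes nothing more."""
    def step(state: State) -> State:
        return {**state, name: True}
    step.__name__ = name
    return step


def compose(*steps: Step) -> Step:
    """Chain steps left to right into a single command."""
    def combined(state: State) -> State:
        for step in steps:
            state = step(state)
        return state
    return combined


pull, audit, fix, review, ship = (
    make_step(n) for n in ["pull", "audit", "fix", "review", "ship"]
)
groom = compose(pull, audit, fix, review, ship)

once = groom({})
twice = groom(once)
print(once == twice)  # True: the second run adds nothing
```

Because every step is idempotent, the composed command is too, which is what makes it safe to re-run groom after a partial failure.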

Skills: Intelligent Activation

Skills are a layer above commands — they carry trigger conditions and activate automatically when a matching scenario appears.

# Example: bid skill
name: Bid Writing
trigger: "When working on bid projects under ~/Work/bids/"
phases: [Parse RFP, Build chapter framework, Inventory references, Four-phase writing]
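
The trigger above is written in natural language, so the model presumably decides when it applies. For testing or documentation, the same rule can be approximated deterministically. The sketch below assumes a simple "working directory is under the skill's root" rule; the `Skill` class, `active_skills`, and the example paths are hypothetical, with `/home/user` standing in for `~`.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Skill:
    name: str
    root: PurePosixPath  # directory under which the skill applies (assumed rule)


def active_skills(cwd: str, skills: list[Skill]) -> list[str]:
    """Return the names of skills whose root contains the working directory."""
    path = PurePosixPath(cwd)
    return [s.name for s in skills if path == s.root or s.root in path.parents]


skills = [Skill("Bid Writing", PurePosixPath("/home/user/Work/bids"))]

# Hypothetical project directories, for illustration only.
print(active_skills("/home/user/Work/bids/example-project", skills))
print(active_skills("/home/user/Dev/devtools", skills))
```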

Key skills:

  • context: Project context injection (identifies project type, loads relevant config)
  • migrate: Python/Streamlit → Tauri migration (analyzes compute logic, generates Rust + React code)
  • harness: Configuration scaffold generation (builds CLAUDE.md + skills based on project nature)

Hooks: Lifecycle Automation

Hooks fire automatically on specific events:

  • startup: Detects HANDOFF.md and auto-loads context from the previous session
  • session-reflect: At session end, analyzes the work, generates a summary, sends a macOS notification
  • pre-commit: Quality checks before commits
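
The startup behavior can be sketched as a small function that looks for HANDOFF.md and returns its contents for injection into the new session. The `load_handoff` name and the `max_chars` limit are assumptions; how the real hook hands the text to Claude Code is not described here.

```python
import tempfile
from pathlib import Path
from typing import Optional


def load_handoff(project_dir: Path, max_chars: int = 4000) -> Optional[str]:
    """Return handoff notes to inject at session start, or None if there are none."""
    handoff = project_dir / "HANDOFF.md"
    if not handoff.is_file():
        return None
    text = handoff.read_text(encoding="utf-8").strip()
    if not text:
        return None
    # Truncate so a long handoff does not crowd out the rest of the context.
    return text[:max_chars]


# Demo in a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    print(load_handoff(project))  # None: no handoff file yet
    (project / "HANDOFF.md").write_text("Next: finish chapter 3\n", encoding="utf-8")
    print(load_handoff(project))
```

Returning None for a missing or empty file lets the hook stay silent on fresh projects instead of injecting an empty context block.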

Memory: Cross-Session Recall

A structured memory system divided into four types:

Type        Purpose                Example
user        User profile           "Deep Go background, React beginner"
feedback    Behavior calibration   "commit+push as one step"
project     Project state          "Merge freeze starts March 5"
reference   External pointer       "Pipeline bugs in Linear INGEST project"

Each memory entry is a standalone Markdown file, and MEMORY.md serves as the index. The system automatically detects stale memories and cleans them up.
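
A minimal sketch of the index and the staleness check follows. The `MemoryEntry` fields, the file names, the dates, and the 90-day age rule are all assumptions made for illustration; the criteria the real system uses to call a memory stale are not stated.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MEMORY_TYPES = ("user", "feedback", "project", "reference")


@dataclass
class MemoryEntry:
    filename: str   # standalone Markdown file holding the entry
    type: str       # one of MEMORY_TYPES
    summary: str
    updated: date   # last time the entry was confirmed (assumed field)


def build_index(entries: list[MemoryEntry]) -> str:
    """Render MEMORY.md-style index text, grouped by memory type."""
    lines: list[str] = []
    for mem_type in MEMORY_TYPES:
        group = [e for e in entries if e.type == mem_type]
        if group:
            lines.append(mem_type)
            lines.extend(f"  {e.filename}: {e.summary}" for e in group)
    return "\n".join(lines)


def stale(entries: list[MemoryEntry], today: date, max_age_days: int = 90) -> list[MemoryEntry]:
    """Entries not confirmed within max_age_days (an assumed staleness rule)."""
    cutoff = today - timedelta(days=max_age_days)
    return [e for e in entries if e.updated < cutoff]


# Hypothetical file names and dates.
entries = [
    MemoryEntry("feedback_commit_push.md", "feedback", "commit+push as one step", date(2026, 4, 1)),
    MemoryEntry("project_merge_freeze.md", "project", "Merge freeze starts March 5", date(2025, 12, 1)),
]

print(build_index(entries))
print([e.filename for e in stale(entries, today=date(2026, 4, 12))])
```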

MCP Integration

Connect to external systems via Model Context Protocol:

  • auggie: Semantic search of Git repositories (matches by meaning rather than by literal text, as grep does)
  • cclog: Search and analysis across 512+ historical sessions
  • gmail: Email read/write (used by the briefing system)

Engineering Practices

Skill Distribution

All skills are managed centrally in the cc-configs repository and distributed to every project via a sync mechanism:

# Check skill configuration status across all projects
python3 ~/Dev/devtools/scripts/tools/skill_sync.py status

# Sync to a specific project
python3 ~/Dev/devtools/scripts/tools/skill_sync.py sync hydro-toolkit
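
The internals of skill_sync.py are not shown, so the following is only a guess at what a status check could do: compare each project's skill files against the central copies by content hash. The `sync_status` function, the status labels, and the file names are hypothetical.

```python
import hashlib


def digest(content: str) -> str:
    """Content hash used to detect drift between two copies of a file."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def sync_status(central: dict[str, str], project: dict[str, str]) -> dict[str, str]:
    """Classify each skill file as ok, outdated, missing, or extra."""
    status: dict[str, str] = {}
    for name, content in central.items():
        if name not in project:
            status[name] = "missing"
        elif digest(project[name]) != digest(content):
            status[name] = "outdated"
        else:
            status[name] = "ok"
    # Files that exist only in the project are reported rather than deleted.
    for name in sorted(project.keys() - central.keys()):
        status[name] = "extra"
    return status


# File name -> file content; in-memory stand-ins for the two directories.
central = {"context/SKILL.md": "v2", "harness/SKILL.md": "v1"}
project = {"context/SKILL.md": "v1", "local/SKILL.md": "x"}

print(sync_status(central, project))
```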

Session Management

The cclog MCP provides full session lifecycle management:

  • search_sessions: Search sessions by keyword
  • get_session_detail: Retrieve full session context
  • get_daily_digest: Generate a daily work summary
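
Under the Model Context Protocol, a client invokes a server tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for search_sessions. The `tool_call` helper and the `query` argument name are assumptions, since the tool's actual input schema is not listed here.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per connection


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# "query" is a hypothetical argument name for the search keyword.
request = tool_call("search_sessions", {"query": "skill_sync"})
print(request)
```

In practice Claude Code issues these requests itself; the message shape matters mainly when debugging a server or writing one.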

Dashboard Monitoring

dashboard.tianlizeng.cloud visualizes:

  • Skill configuration state across every repo
  • Recent tasks and progress
  • System health

What This Enables

This setup lets a single person:

  • Manage 29 GitHub repositories
  • Operate 24 production services
  • Produce high-quality technical documentation daily
  • Work in parallel across water conservancy, DevTools, and AI

The core value isn't "knowing how to use AI" — it's "turning AI into a production-grade toolchain."