Implementation Delta

The patterns in this guide were derived from theory: studying what platforms ship, extracting the common architecture, and naming the recurring structures. But building an agent harness reveals patterns that theory misses — patterns that only surface when you write the code, handle the edge cases, and watch things break at runtime.

This page documents ten patterns discovered by analyzing real open-source agent harness implementations (notably claw-code, a 30K+ LOC Rust reimplementation of a production agent harness). Each pattern is load-bearing in practice but absent from the existing taxonomy.


1. Prompt Compilation

The pattern: The system prompt is not a file. It is a runtime artifact assembled from a dependency graph of inputs — project context files, tool descriptions, permission state, plugin manifests, session history, hook declarations, and MCP server capabilities.

Why theory misses it: The Instruction Files pattern describes static markdown loaded at session start. In practice, the instruction file is one input to a build process that produces the final prompt.

What it looks like in code:

```python
prompt = []
prompt += load_instruction_files(walk_up_tree(cwd))
prompt += render_tool_descriptions(active_tools)
prompt += render_permission_context(current_mode)
prompt += render_mcp_capabilities(connected_servers)
prompt += render_plugin_manifests(loaded_plugins)
prompt += render_session_summary(compacted_history)
prompt += render_hook_declarations(active_hooks)
```

Why it matters: Getting prompt compilation wrong — wrong ordering, missing context, exceeding the token budget — silently degrades every downstream behavior. The model doesn’t error; it just gets worse. This makes prompt compilation the most consequential code in the harness, yet it has no dedicated pattern.
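A minimal sketch of budget-aware compilation, assuming a `sections` list of `(text, priority)` pairs and a crude character-based token estimator (both the helper names and the heuristic are illustrative, not a real tokenizer):

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return len(text) // 4 + 1

def compile_prompt(sections: list[tuple[str, int]], budget: int) -> str:
    """sections: (text, priority) pairs; lower number = more important.
    Inclusion is decided by priority, but output keeps declaration order,
    since section ordering itself affects model behavior."""
    chosen, used = set(), 0
    for idx, (text, _prio) in sorted(enumerate(sections), key=lambda e: e[1][1]):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            chosen.add(idx)
            used += cost
    # Dropping a section is explicit here; overflowing the budget never is.
    return "\n\n".join(t for i, (t, _) in enumerate(sections) if i in chosen)
```

The key design choice is that over-budget sections are dropped deliberately, by priority, instead of letting the provider truncate the prompt arbitrarily.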

Design guidance:


2. Streaming as an Architectural Constraint

The pattern: Streaming (SSE/WebSocket) is not a UX feature layered on top. It is a structural constraint that changes error handling, cancellation, rendering, tool execution, and backpressure throughout the entire stack.

Why theory misses it: The existing patterns assume request-response semantics. Real harnesses stream tokens as they arrive, which means every component — from the API client to the terminal renderer — must handle partial, in-flight state.

What it forces you to solve:

| Problem | Non-streaming | Streaming |
|---|---|---|
| Error handling | Check response status | Stream can break mid-token, mid-tool-call, or mid-markdown block |
| Cancellation | Don’t send the request | User hits Ctrl+C during a file write — must abort cleanly, discard partial output, and leave the conversation in a valid state |
| Tool calls | Parse complete JSON | Tool call JSON arrives incrementally — must buffer, detect boundaries, and dispatch |
| Rendering | Render complete markdown | Render partial markdown that may have unclosed blocks, incomplete tables, or half-written code fences |
| Backpressure | N/A | Model produces tokens faster than the terminal renders — must buffer without unbounded memory growth |
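The incremental tool-call row can be sketched with a small buffer that re-parses on each fragment; this is a simplification (a real harness would track brace depth rather than re-parse every chunk):

```python
import json

class ToolCallBuffer:
    """Accumulates streamed JSON fragments and reports when the buffered
    tool-call payload first parses as complete JSON."""
    def __init__(self):
        self._buf = ""

    def feed(self, fragment: str):
        self._buf += fragment
        try:
            return json.loads(self._buf)   # complete: return the parsed call
        except json.JSONDecodeError:
            return None                    # still partial: keep buffering
```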

Design guidance:


3. Session Persistence and Resumption

The pattern: Users stop working and come back. The agent must serialize its full conversation state to disk and rehydrate it later, handling the reality that the world changed in between.

Why theory misses it: Context Management addresses within-session concerns (compaction, budgeting). It says nothing about the across-session lifecycle: save, quit, resume, and the stale-context problem that follows.
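A sketch of the save/resume cycle under one simplifying assumption: staleness is detected by recording the mtimes of files the agent has read, then re-checking them on resume. The schema and helper names are illustrative:

```python
import json, os, time

def save_session(path, messages, files_read):
    # Record mtimes so resume can detect edits made while the session
    # was closed (hypothetical on-disk schema).
    state = {
        "messages": messages,
        "file_mtimes": {f: os.path.getmtime(f) for f in files_read},
        "saved_at": time.time(),
    }
    with open(path, "w") as fh:
        json.dump(state, fh)

def resume_session(path):
    with open(path) as fh:
        state = json.load(fh)
    # Anything deleted or modified since save is flagged for re-reading.
    stale = [f for f, mtime in state["file_mtimes"].items()
             if not os.path.exists(f) or os.path.getmtime(f) != mtime]
    return state["messages"], stale
```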

What goes wrong on resume:

Design guidance:


4. Tool Failure and Recovery

The pattern: Tools fail. bash returns exit code 1. File writes fail because the path doesn’t exist. MCP servers crash mid-call. The harness must treat tool failures as structured data the model reasons about, not exceptions that crash the loop.

Why theory misses it: The Lifecycle Hooks pattern covers pre/post-tool automation. The Tool Protocols pattern covers tool discovery and invocation. Neither addresses what happens when a tool call fails at runtime.

The failure taxonomy:

| Failure type | Example | Correct handling |
|---|---|---|
| Expected error | bash exits with code 1 | Return stderr as tool result. Model adapts. |
| Transient failure | Network timeout on web_fetch | Retry with backoff. Include attempt count in result. |
| Permanent failure | MCP server process died | Mark server as unavailable. Remove its tools from the active registry. Inform the model. |
| Partial result | Streaming tool output cut short | Return what was received with a truncation marker. |
| Permission denial | User rejected the tool call | Return denial as a structured result. Model must not retry the same call. |
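One way to make failures structured data rather than exceptions is a result type whose `kind` field mirrors the taxonomy above. The names and the `run_bash` wrapper are illustrative:

```python
import subprocess
from dataclasses import dataclass

@dataclass
class ToolResult:
    """Structured result fed back to the model instead of raising."""
    kind: str            # "ok" | "error" | "transient" | "unavailable" | "truncated" | "denied"
    content: str
    retryable: bool = False

def run_bash(command: str) -> ToolResult:
    try:
        proc = subprocess.run(command, shell=True, capture_output=True,
                              text=True, timeout=30)
    except subprocess.TimeoutExpired:
        # Transient: the harness may retry with backoff.
        return ToolResult("transient", "command timed out", retryable=True)
    if proc.returncode != 0:
        # Expected error: return stderr as data; the model adapts.
        return ToolResult("error", proc.stderr or f"exit code {proc.returncode}")
    return ToolResult("ok", proc.stdout)
```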

Design guidance:


5. Credential Lifecycle Management

The pattern: The agent needs API keys, OAuth tokens, and service credentials. These expire, need rotation, and must be stored securely. The harness must manage the full credential lifecycle: acquire, store, refresh, and revoke.

Why theory misses it: Sandboxing & Permissions discusses the two-phase runtime (secrets available during setup, removed during execution). It doesn’t address how the harness itself authenticates to the services it depends on — the Anthropic API, MCP servers, OAuth providers.

What a real implementation requires:
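At minimum, a refresh-before-expiry cache. A minimal sketch, assuming a `refresh_fn` callback that performs the actual OAuth exchange; real storage would use the OS keychain rather than process memory:

```python
import time

class OAuthToken:
    """Caches an access token and refreshes it ahead of expiry, so a
    request never goes out with a token about to lapse mid-call."""
    def __init__(self, refresh_fn, skew: float = 60.0):
        self._refresh_fn = refresh_fn   # returns (access_token, expires_in_seconds)
        self._skew = skew               # refresh this many seconds early
        self._token, self._expires_at = None, 0.0

    def get(self) -> str:
        if time.time() >= self._expires_at - self._skew:
            self._token, expires_in = self._refresh_fn()
            self._expires_at = time.time() + expires_in
        return self._token
```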

Design guidance:


6. Project Context Discovery

The pattern: The agent doesn’t just load a file at a fixed path. It walks up the directory tree from the current working directory, discovers all relevant context files, and merges them by proximity (closer files win).

Why theory misses it: Instruction Files describes a four-layer precedence model (managed > user > project > local). In practice, the “project” layer is itself hierarchical — a monorepo may have a root CLAUDE.md and subdirectory-specific overrides, and the agent must discover and merge all of them.

The discovery algorithm:

```python
context_files = []
dir = cwd
while dir != filesystem_root:
    for pattern in ["CLAUDE.md", ".claude/settings.json", ...]:
        if exists(dir / pattern):
            # Prepending as we ascend leaves the list ordered root -> cwd,
            # so closer files are applied last and override.
            context_files.insert(0, dir / pattern)
    dir = parent(dir)
```

Files closer to cwd override files further up. This handles:

Design guidance:


7. LSP as a Semantic Tool Layer

The pattern: Language Server Protocol gives the agent structured code understanding — go-to-definition, find-references, type checking, diagnostics — that is orders of magnitude more reliable than text search.

Why theory misses it: Tool Protocols documents MCP, A2A, and WebMCP. LSP is absent from the protocol stack despite being a mature, widely-deployed standard that every major language supports.

What LSP provides that text tools don’t:

| Capability | Text tools (grep, glob) | LSP |
|---|---|---|
| “What calls this function?” | Regex search, high false-positive rate | textDocument/references — precise, semantic |
| “What type does this return?” | Heuristic parsing | textDocument/hover — compiler-accurate |
| “Is this code valid?” | Run the compiler (slow, noisy) | textDocument/diagnostic — incremental, real-time |
| “Rename this symbol everywhere” | Find-and-replace (breaks strings, comments) | textDocument/rename — semantic, safe |
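The LSP base protocol frames each JSON-RPC message with a Content-Length header. A sketch of building a textDocument/references request (the file URI and position are arbitrary examples):

```python
import json

def lsp_frame(method: str, params: dict, req_id: int) -> bytes:
    """Encode a JSON-RPC request using LSP base protocol framing:
    a Content-Length header, a blank line, then the JSON body."""
    body = json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params}).encode()
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

# A find-references request for the symbol at line 10, column 4:
frame = lsp_frame("textDocument/references", {
    "textDocument": {"uri": "file:///src/main.rs"},
    "position": {"line": 10, "character": 4},
    "context": {"includeDeclaration": True},
}, req_id=1)
```

The frame is written to the language server's stdin; the response arrives on stdout with the same framing.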

Design guidance:


8. Editor and IDE Compatibility

The pattern: The same agent core must work as a CLI REPL, a VS Code extension, a JetBrains plugin, and a web interface. The harness must separate the core (conversation loop, tools, permissions) from the interface (terminal rendering, JSON-RPC, HTTP).

Why theory misses it: The existing patterns describe what the agent does, not how it presents itself. In practice, the interface layer is a major engineering surface with its own constraints.

The compatibility matrix:

| Interface | Transport | Input | Output | Constraints |
|---|---|---|---|---|
| CLI REPL | stdin/stdout | Line editing (rustyline) | Streaming markdown + syntax highlighting | Terminal width, color support, signal handling |
| VS Code | JSON-RPC over stdio | Extension API messages | Webview panels, editor decorations | Extension host lifecycle, webview security |
| JetBrains | HTTP/WebSocket | Plugin API messages | Tool windows, editor annotations | JVM process model, Kotlin/Java interop |
| Web | HTTP/SSE | REST API | JSON events | CORS, authentication, no filesystem access |
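The core/interface split can be sketched as a frontend protocol the core calls into, so each deployment target only implements rendering callbacks. The event shapes here are hypothetical:

```python
from typing import Protocol

class Frontend(Protocol):
    """Interface layer: each target (terminal, webview, HTTP) implements
    these callbacks; the core never touches a renderer directly."""
    def on_token(self, text: str) -> None: ...
    def on_tool_call(self, name: str, args: dict) -> None: ...

class AgentCore:
    """Shared conversation loop, independent of presentation."""
    def __init__(self, frontend: Frontend):
        self.frontend = frontend

    def handle_stream_event(self, event: dict) -> None:
        # Dispatch model stream events to whatever frontend is attached.
        if event["type"] == "token":
            self.frontend.on_token(event["text"])
        elif event["type"] == "tool_call":
            self.frontend.on_tool_call(event["name"], event.get("args", {}))
```

A new interface then means writing one adapter class, not forking the conversation loop.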

Design guidance:


9. Model Abstraction and Aliasing

The pattern: The harness maintains a model router that resolves human-friendly aliases (e.g., claude-opus-4-6) to actual API model IDs, supports multiple providers, and insulates the prompt logic from model identity changes.

Why theory misses it: Cost Management mentions model routing for cost optimization (cheap model for simple tasks). It doesn’t address the mechanical pattern of how the harness abstracts model identity.

What the abstraction must handle:
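A sketch of alias resolution, assuming a static alias table plus a provider/model pass-through syntax; apart from claude-opus-4-6, which appears above, the names here are hypothetical:

```python
# Alias -> (provider, provider-specific model ID). Prompt logic refers
# only to aliases, so model identity can change without code changes.
MODEL_ALIASES = {
    "opus": ("anthropic", "claude-opus-4-6"),
    "fast": ("someprovider", "some-small-model"),   # hypothetical entry
}

def resolve_model(name: str) -> tuple[str, str]:
    """Resolve an alias, or pass an explicit provider/model pair through."""
    if name in MODEL_ALIASES:
        return MODEL_ALIASES[name]
    provider, _, model = name.partition("/")
    if model:
        return provider, model
    raise KeyError(f"unknown model alias: {name}")
```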

Design guidance:


10. The Plugin Trust Boundary

The pattern: Plugins extend the agent with new tools, commands, and hooks. But the trust model for plugins is fundamentally different from MCP servers, and this difference is rarely made explicit.

Why theory misses it: Tool Protocols documents MCP’s transport-level isolation (stdio/WebSocket). Sandboxing documents OS-level enforcement. Neither addresses the middle ground: locally installed plugins that run in-process with the harness.

The trust spectrum:

More isolated ◄──────────────────────────────────► Less isolated

Cloud sandbox    MCP server     Plugin (sandboxed)    Plugin (in-process)
  (Codex)      (stdio/WebSocket)   (WASM, Deno)       (dynamic linking)
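The spectrum can be encoded directly, with policy keyed on isolation level. This is an illustrative policy, not a prescription:

```python
from enum import IntEnum

class TrustLevel(IntEnum):
    # Ordered along the spectrum above: higher value = less isolated,
    # so higher value demands more scrutiny before loading.
    CLOUD_SANDBOX = 0
    MCP_SERVER = 1
    PLUGIN_SANDBOXED = 2
    PLUGIN_IN_PROCESS = 3

def requires_explicit_approval(level: TrustLevel) -> bool:
    """Illustrative policy: anything running in-process with the harness
    needs an explicit user grant, since no boundary contains it."""
    return level >= TrustLevel.PLUGIN_IN_PROCESS
```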

Design guidance:


Summary: Theory vs. Implementation

| Existing pattern | What theory says | What implementation adds |
|---|---|---|
| Instruction Files | Static markdown at fixed paths | Runtime prompt compilation from a dependency graph |
| Tool Protocols | MCP/A2A discovery and invocation | Tool failure taxonomy and structured error propagation |
| Context Management | Budget bands and compaction triggers | Session persistence, resumption, and stale-context handling |
| Lifecycle Hooks | Pre/post-tool automation | Streaming as a cross-cutting architectural constraint |
| Sandboxing | OS-level isolation and permission models | Plugin trust boundaries and in-process vs. transport isolation |
| Settings Architecture | Four-tier config scope | Project context discovery via directory tree walking |
| Cost Management | Model routing for cost optimization | Model abstraction, aliasing, and provider-agnostic architecture |
| — | Not covered | LSP as a semantic tool layer |
| — | Not covered | Editor/IDE compatibility and interface abstraction |
| — | Not covered | Credential lifecycle management (OAuth, token refresh, storage) |

These patterns are not speculative. They are extracted from running code that handles real conversations, real tool calls, and real failures. The gap between the patterns we document and the patterns we build is the gap between architecture and engineering — and closing it is what turns a framework diagram into a working system.