Seven teams. One architecture. No coordination.
Claude Code, OpenAI Codex, Gemini CLI, LangGraph, CrewAI, Google ADK, Amazon Bedrock — built by different companies, in different languages, under different constraints. They converged on the same design.
Not because they copied each other. Because the constraints are physics. Finite context windows. Tools that need a protocol. Safety that can’t depend on the model obeying. Tasks too complex for a single invocation. Any team that builds long enough arrives here.
The 8 Postulates
These are not suggestions. They are the load-bearing walls of every production agentic system. Violate them and you will rediscover why they exist.
| # | Postulate | What to do |
|---|---|---|
| 1 | Start with a persistent instruction file | Create a CLAUDE.md, AGENTS.md, or GEMINI.md before writing any agent config. Cover conventions, stack, testing, git, and security. Keep it under 200 lines. |
| 2 | Enforce safety outside the prompt | Put style preferences in the instruction file. Put linting in hooks. Put destructive command blocking in permissions. Never rely on the model remembering a safety rule. |
| 3 | Budget your context window | Reserve 10-15% for instructions, 30-40% for conversation, 20-30% for tool results. Compact when usage reaches 70% of the window. Clear at 80%. Separate cacheable content from compactable content. |
| 4 | Build tools on MCP | Use .mcp.json for tool connections. MCP has 97M+ downloads/month across every major platform. If you need agent-to-agent communication across systems, add A2A, but start with MCP. |
| 5 | Coordinate through shared state | Within a system, agents read from and write to shared state rather than sending messages to each other. Between systems or organizations, use messaging protocols (A2A). Default to state; reach for messaging only when you must. |
| 6 | Decompose before you hit the cliff | Agent coherence degrades after extended sessions. The threshold moves with each model generation. Don’t find the limit — stay well under it. Break work into sub-tasks that complete in the safe zone. |
| 7 | Track cost per task from day one | Set token budgets per session. Route simple work to cheap models. Cache stable prompts. Monitor with alerts at 50%, 75%, and 90% of budget. Cost management is infrastructure, not optimization. |
| 8 | Add complexity in weekly increments | Week 1: instruction file. Week 2: hooks. Week 3: MCP tools. Week 4: skills. Month 2+: sub-agents. If your team has distributed systems experience, you can move faster — but still validate each layer before adding the next. |
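
The thresholds in postulates 3 and 7 can be sketched as two small checks. This is a minimal illustration, not any vendor's implementation: the function names `context_action` and `crossed_alerts` are hypothetical, and it assumes "compact at 70%, clear at 80%" refers to the fraction of the context window in use.

```python
# Thresholds come from postulates 3 and 7; all names here are hypothetical.
COMPACT_AT = 0.70                 # compact the conversation at 70% of the window
CLEAR_AT = 0.80                   # clear the context at 80% of the window
COST_ALERTS = (0.50, 0.75, 0.90)  # alert at these fractions of the token budget


def context_action(used_tokens: int, window_tokens: int) -> str:
    """Return 'ok', 'compact', or 'clear' for the current context usage."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    usage = used_tokens / window_tokens
    if usage >= CLEAR_AT:
        return "clear"
    if usage >= COMPACT_AT:
        return "compact"
    return "ok"


def crossed_alerts(spent_before: int, spent_after: int, budget: int) -> list[float]:
    """Return the alert thresholds crossed by the most recent spend."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in COST_ALERTS if spent_before < t * budget <= spent_after]
```

Calling `context_action` before each model invocation and `crossed_alerts` after each one keeps both checks in code rather than in the prompt, which is the same separation postulate 2 asks for.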
The Architecture
```mermaid
graph TD
A["<b>Instruction Layer</b><br/>CLAUDE.md · AGENTS.md · GEMINI.md<br/><i>user → project → directory (most specific wins)</i>"] --> B
B["<b>Settings Layer</b><br/>settings.json · config.toml<br/><i>Permissions, hooks, env vars</i>"] --> C
C["<b>Tool Registry — MCP</b><br/>.mcp.json<br/><i>stdio (local) · http (remote)</i>"] --> D
subgraph loop ["Agent Execution Loop"]
D["Input"] --> E["Pre-Hooks"]
E -->|"BLOCK if gate fails"| F["Reasoning"]
F --> G["Tool Selection"]
G --> H["Tool Hooks"]
H -->|"BLOCK if denied"| I["Execution"]
I --> J["Post-Hooks"]
J --> K{"Continue?"}
K -->|Yes| F
K -->|No| L["Output"]
end
L --> M1
L --> M2
L --> M3
subgraph ext ["Extensions"]
M1["Skills<br/><i>reusable prompts</i>"]
M2["Subagents<br/><i>bounded contexts</i>"]
M3["Memory<br/><i>state, checkpoints</i>"]
end
```
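
The tool-hook gate in the execution loop can be sketched in a few lines. This is a simplified illustration under stated assumptions: the reasoning step is replaced by a fixed list of planned tool calls, and `ToolCall`, `deny_destructive_shell`, and `run_loop` are hypothetical names, not part of any of the frameworks listed above.

```python
from typing import Callable

# Hypothetical types: a tool call is a (tool_name, argument) pair, and a hook
# returns True to allow the call or False to block it.
ToolCall = tuple[str, str]
Hook = Callable[[ToolCall], bool]


def deny_destructive_shell(call: ToolCall) -> bool:
    """Tool hook: block shell commands that match a small deny list."""
    tool, arg = call
    blocked = ("rm -rf", "git push --force")
    return not (tool == "shell" and any(pattern in arg for pattern in blocked))


def run_loop(
    plan: list[ToolCall],
    tools: dict[str, Callable[[str], str]],
    tool_hooks: list[Hook],
    max_steps: int = 10,
) -> list[str]:
    """Run each planned call through every hook before executing it."""
    log: list[str] = []
    for call in plan[:max_steps]:
        tool, arg = call
        # The gate runs outside the model: a denied call never reaches the tool.
        if not all(hook(call) for hook in tool_hooks):
            log.append(f"BLOCKED {tool}: {arg}")
            continue
        log.append(f"OK {tool}: {tools[tool](arg)}")
    return log
```

Because the hook runs in the loop and not in the prompt, a blocked command stays blocked even if the model ignores or forgets the rule.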
Who This Is For
| Role | What you get |
|---|---|
| Agent developers | Patterns for instruction files, hooks, MCP tools, and context management. |
| Platform engineers | Multi-agent architecture, shared state, delegation, and cost controls. |
| Infrastructure teams | Observability, token accounting, safety enforcement, and production runbooks. |
| Engineering managers | Adoption roadmaps, cost models, and risk frameworks. |
Reading Order
| Section | Key questions answered |
|---|---|
| Foundation | What are the patterns? Why do they exist? |
| Runtime | How does an agent run? How do I manage context? |
| Coordination | How do agents work together? When do I need more than one? |
| Workflows | How do I encode repeatable processes? |
| Operations | How do I run agents reliably and affordably at scale? |
First agent? Start with Foundation → Runtime. Skip Coordination until one agent works reliably.
Scaling? Jump to Coordination and Operations. That’s where the failure modes live.