Skip to content

Custom subagents — protect context with specialized delegates

Q12 · Extensibility Do you use subagents (specialized sub-agents with tool restrictions)?

Max-score answer: “Full set: code-reviewer, code-explorer, code-architect — and I know upfront what to delegate.”

Why it matters: Subagents protect main-context budget and parallelize independent research. Without them, your context fills with raw tool output.

Context is the scarcest resource in any 2026 agentic coding session. Even with Anthropic’s 1M-token window on Sonnet 4.6 and Opus 4.7, the useful context budget for a single agent collapses long before you hit the wall — by the time you’ve read 30 files, run 10 test suites, and grepped a dozen call sites, half your tokens are raw tool output the model has to re-read on every turn. Long-context evaluations through 2025–2026 (Anthropic’s “Effective context engineering” post, plus the Chroma “context rot” series) all show the same shape: multi-hop reasoning accuracy starts decaying around 50k–80k tokens and gets noticeably worse past 200k, even when the raw window allegedly supports a million.

Subagents are the fix that scales. A subagent runs in its own context window with its own system prompt and its own tool allowlist; the orchestrator never sees the raw output of its grep, its Read, its bash — only the structured summary the subagent chooses to return. Delegate one round of “find every place we call Stripe.checkout.sessions.create” to a code-explorer on Haiku 4.5, and your main agent’s context grows by ~1k tokens (the summary) instead of ~40k (the actual reads). Run three explorers in parallel and you cut wall-clock time ~3×. That’s the whole game: protect the budget that holds your plan, file map and active edit; outsource everything else.

The cost math reinforces it. Subagents on Haiku 4.5 ($1/$5 per million tokens) cost roughly one-fifth of Sonnet 4.6 and one-fifteenth of Opus 4.7. Four parallel Haiku explorers to map a feature is still cheaper than asking Sonnet to read the same files in your main thread — and you keep your top-tier model fresh for planning. Skip subagents and you pay 3–5× more per task and hit context degradation earlier.

You get full marks on Q12 only when all four of these are true:

  • A full set of role-based subagents installed. Minimum code-reviewer, code-explorer (or Claude Code’s built-in Explore), and code-architect, plus 2–4 task-specific (e.g. migration-planner, test-author, doc-writer, sql-reviewer). They live in .claude/agents/ (project) and ~/.claude/agents/ (user), each as a single .md file with YAML frontmatter.
  • Tools whitelisted per subagent. A code-reviewer gets Read, Grep, Glob — no Bash, no Write, no Edit. A code-architect gets read-only plus WebSearch. A test-author gets Read, Write, Edit, Bash (scoped to test commands). Least privilege is enforced in the frontmatter, not in your head.
  • Models routed per subagent. Explorers, reviewers and bulk-scan agents on claude-haiku-4-5-20251001. Architects and planners on claude-opus-4-7. Test-authors and refactor-bots typically on claude-sonnet-4-6. The orchestrator stays on Sonnet and decides who to delegate to.
  • You can name, upfront, what gets delegated. Before you ship a task, you’ve already decided which steps the main agent does itself and which it fans out. “Map the call sites” → explorer. “Review this diff” → reviewer. “Design the migration plan” → architect. If you’re inventing the delegation mid-task, you don’t have a strategy — you have a toy.

Anything less — “I have one custom agent I never use”, “I rely on the built-in general-purpose agent for everything”, or “I let the model decide whether to subagent” — is mid-tier on Q12.

Claude Code built-ins (Explore, general-purpose)

Section titled “Claude Code built-ins (Explore, general-purpose)”

Claude Code ships with two subagent-like primitives out of the box, and you should know both before you build your own.

  • Explore. A built-in subagent specialized for codebase reconnaissance — file discovery, call-site mapping, “where is feature X implemented” questions. Explore is invoked automatically when the orchestrator decides it needs a read-only sweep, and it returns a structured summary instead of raw file contents. This is the agent that single-handedly protects your context from the “I just read 40 files to find one function” anti-pattern.
  • General-purpose. The catch-all Task agent. Useful when you want fan-out without authoring a custom agent yet, but it has no role specialization and no tool restrictions — which means it can do anything and therefore tends to do too much. Treat it as a temporary scaffold while you write the proper specialized agent.

Both are fine as defaults, but neither replaces a curated set. The whole point of Q12 is that you have decided what your common delegations are and codified them.

Custom subagents in .claude/agents/ (markdown frontmatter spec, tools whitelist)

Section titled “Custom subagents in .claude/agents/ (markdown frontmatter spec, tools whitelist)”

Custom subagents are markdown files with YAML frontmatter. Two locations:

  • .claude/agents/<name>.md — project-scoped, checked into the repo, available only in this project.
  • ~/.claude/agents/<name>.md — user-scoped, available in every project.

Project-scoped wins on collision. The agent file is read by Claude Code at session start; you can also create or edit them via the /agents slash command.

The frontmatter spec, with everything that actually matters:

---
name: code-reviewer
description: Reviews diffs against project style and architecture rules. Read-only.
tools: Read, Grep, Glob, WebSearch
model: claude-haiku-4-5-20251001
memory: user
---
# code-reviewer
You are a senior reviewer for this TypeScript/Next.js codebase. When invoked
on a diff or a set of changed files:
1. **Map the change.** Use Grep/Glob to find every call site touched by the
diff. Read enough surrounding code to understand the contract, not the
whole file.
2. **Score against the rules.** Apply, in order:
- The project's CLAUDE.md style rules (single quotes, import ordering,
interface for object shapes).
- The relevant skill (if any) under .claude/skills/.
- Standard quality: dead code, missing error paths, leaky abstractions,
test coverage on changed branches.
3. **Return a structured summary.** Top section: must-fix items with file:line.
Middle: nits and style. Bottom: praise-worthy choices (so the orchestrator
knows what to keep). Never paste the whole file back — return diff hunks
or line ranges only.
Hard rules:
- You have no Write/Edit/Bash. Do not propose to run anything.
- If you find a structural problem requiring an architectural decision, stop
and surface it. Do not redesign — that's the code-architect's job.

Fields that matter most:

  • tools — comma-separated allowlist. If missing, the subagent inherits all tools from the orchestrator — the default that gets people in trouble. Always set it. Common patterns: Read, Grep, Glob for read-only; add Write, Edit, MultiEdit for editors; add Bash(npm test:*) for narrowly-scoped runners.
  • model — pin the model per agent. Reviewers, explorers and bulk-scanners on claude-haiku-4-5-20251001. Architects and planners on claude-opus-4-7. Most others on claude-sonnet-4-6.
  • description — the first sentence is what the orchestrator reads when deciding to delegate. Lead with the verb and the scope (“Reviews diffs against project style rules”), not the role name.
  • memory — opts the subagent into a persistent knowledge directory across conversations. Useful for code-explorer (builds a feature map over time) and code-architect (accumulates design decisions). Don’t enable for stateless reviewers.

Cursor shipped first-class subagents in 2.4 and consolidated them through 3.0 (2026). Mental model is identical to Claude Code: a parent delegates to a child agent with its own context, prompt and tool restrictions. Three Cursor-specific notes:

  • Tool inheritance is opt-in, not automatic. Custom rules and “efficiency prompts” defined for the main agent do not propagate to subagents unless you explicitly include them. Subagent ignores your style guide because you never gave it the style guide.
  • Agent(<agent_type>) syntax restricts which agents can spawn which. Useful for nested orchestration: a top-level architect spawns explorers but not editors; an editor spawns a test-author but not another architect. Finite delegation tree instead of a recursive free-for-all.
  • Cloud Agents have a reduced surface. The task/subagent tool in local IDE sessions is not exposed to Cloud Agents as of early 2026, and on request-based plans some flows require Max mode or Composer 1.5. If you run Cursor mostly in Cloud, your subagent strategy still leans on local sessions or Claude Code.

OpenAI’s Codex CLI has no documented subagent architecture as of May 2026. There is no .codex/agents/ equivalent, no subagent delegation tool in the CLI’s tool surface, no “Task” or “Explore” primitive. Codex agents run as single-context agents with whatever tool surface you configured. If your primary tool is Codex, you do not have a path to max score on Q12 inside Codex itself — you either keep Claude Code or Cursor in the loop for delegated work, or you live with single-context limits.

A working set on a TypeScript/Next.js + Cloudflare repo:

.claude/agents/
code-reviewer.md # Haiku · Read/Grep/Glob · diffs vs CLAUDE.md rules
code-explorer.md # Haiku · Read/Grep/Glob · call sites & feature maps
code-architect.md # Opus · Read/Grep/Glob/WebSearch · plan-mode partner
migration-planner.md # Opus · Read/Grep/Glob/WebSearch · multi-PR migrations
test-author.md # Sonnet · Read/Write/Edit/Bash(npm test:*) · writes tests
sql-reviewer.md # Haiku · Read/Grep · D1 schema + queries review
~/.claude/agents/
pr-writer.md # Haiku · Read/Bash(gh,git) · drafts PR titles & bodies
doc-writer.md # Sonnet · Read/Write/Edit · MDX docs in project voice

Seven specialized agents covers ~90% of the delegations a real coding day will produce.

  1. Pick the next delegation you’re tired of doing yourself. Open your last week of Claude Code transcripts and find the task you ran more than three times. “Map every call site of X.” “Write tests for this function.” That’s your first subagent.

  2. Run /agents and create the file. The /agents slash command scaffolds the markdown file with frontmatter and an empty system prompt. Pick project scope (.claude/agents/) if it’s repo-specific, user scope (~/.claude/agents/) if you want it everywhere.

  3. Write the description first, prompt second. The description is what the orchestrator sees when deciding to delegate. Lead with a verb and a scope (“Reviews diffs”, “Maps call sites”) — not a noun (“A reviewer agent”).

  4. Whitelist tools explicitly. Never leave tools: blank. Code-reviewer: Read, Grep, Glob. Code-architect: add WebSearch. Editor: add Write, Edit, MultiEdit. Bash-running agent: scope it (Bash(npm test:*), not bare Bash).

  5. Pin the model. Read-only / bulk-scan → claude-haiku-4-5-20251001. Plan-mode partner / architecture → claude-opus-4-7. Editors and test-authors → claude-sonnet-4-6. If you don’t pin, the subagent inherits the orchestrator’s model and you lose the cost win.

  6. Tell the agent what not to do. Every prompt should end with a “Hard rules” section: when to stop, what to escalate, what not to redesign. A code-reviewer that starts redesigning the architecture is worse than no subagent.

  7. Test it on three real cases. Invoke the subagent on three actual backlog tasks. If the summary is >500 tokens, tighten compression. If it’s too short, widen it. Iterate on the prompt, not the orchestrator.

  8. Commit the file and add the delegation rule to CLAUDE.md. Project-scoped agents into the repo so teammates get the same behavior; user-scoped into your dotfiles. Then add a one-line bullet to CLAUDE.md (“When asked to review a diff, delegate to the code-reviewer subagent”). Without this hint, the orchestrator may forget the agent exists.

  • Over-delegating. Subagents are a sharp tool. Delegating “fix this typo” adds two network round-trips and a context handoff for a 5-token edit. Rule of thumb: delegate when the task would otherwise consume >5k tokens of main context, or when it’s embarrassingly parallel.
  • No model override. Default behavior is for the subagent to inherit the orchestrator’s model — usually Sonnet. A code-reviewer on Sonnet costs ~5× a code-reviewer on Haiku for ~zero quality gain. Pin the model in every agent.
  • Leaky context — returning raw output. The point of a subagent is that its reads, greps and tool calls do not enter the orchestrator’s context. If your prompt says “return the file contents”, you’ve defeated the purpose. Force a summary.
  • No tool restrictions. A subagent that inherits all tools is just the main agent in a different window. Always whitelist. A code-reviewer with Write access will eventually try to “helpfully” edit the file it’s reviewing.
  • One mega-agent instead of a set. A single “do-everything” custom agent isn’t a subagent strategy; it’s a second main agent. The win comes from role-specialized prompts, tools and models. If your agents all have the same tools: line, you’re not specializing.
  • Forgetting Cursor’s prompt-inheritance rule. Cursor subagents don’t inherit your project rules unless you explicitly include them. The fix is a single @.cursor/rules/style.md include in the subagent prompt.
  • Trying to push subagents through Codex. Codex has no subagent primitive. Don’t fake it with chained CLI calls; use Claude Code or Cursor for delegated work.
  • Letting Explore be your “subagent strategy”. Built-in agents are great defaults, but max score requires that you’ve decided what to delegate. A curated set in .claude/agents/ is the artifact that proves it.
  • Your project has at least three role-based subagents (code-reviewer, code-explorer, code-architect) in .claude/agents/ or ~/.claude/agents/.
  • At least two more task-specific subagents exist (e.g. migration-planner, test-author, doc-writer, pr-writer).
  • Every subagent has an explicit tools: whitelist in its frontmatter — none rely on inherited tools.
  • At least one subagent is pinned to claude-haiku-4-5-20251001 and at least one is pinned to claude-opus-4-7 — you can read the cost difference in your spend dashboard.
  • Your CLAUDE.md has at least one line per agent explaining when to delegate to it.
  • You can name, without looking, the three or four tasks you delegate by default — and the agent each one goes to.
  • Subagent summaries returned to the orchestrator are <500 tokens on average, not raw file contents.
  • You’ve run the same task once without and once with subagent delegation in the last month, and you noticed the context-usage difference.