Orchestrate multi-agent systems with coordinator-subagent patterns
Multi-agent systems get their reliability from structure: one coordinator acts as the single hub that decomposes the request, delegates to specialized subagents, and combines their results. Getting the coordinator's decomposition, dynamic subagent selection, and context passing right is what separates a system that produces complete, cited reports from one where every subagent succeeds yet the final answer is silently incomplete. In production, most multi-agent failures trace back to the coordinator, not the subagents.
Hub-and-spoke: the coordinator is the only hub
In a coordinator-subagent system, the coordinator is the single hub through which every message flows. Subagents are the spokes: they receive a task from the coordinator, do their specialized work in isolation, and return a result to the coordinator. Subagents never call each other directly. If the synthesis agent needs a fact the search agent could provide, it returns control to the coordinator, which then invokes the search agent.
This centralization is a deliberate design choice, not an accident. Routing all traffic through one place gives you observability (a single trace of who was asked what and what came back), consistent error handling (one place decides how to react to a subagent failure), and controlled information flow (the coordinator decides exactly what context each subagent sees). Peer-to-peer chatter between subagents would scatter that logic and make the system nearly impossible to debug.
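To make the routing concrete, here is a minimal sketch in plain Python rather than any particular SDK. The `Coordinator` class, its `delegate` method, and the two stub agents are hypothetical names introduced for illustration. Subagents are modelled as ordinary functions so the single trace and the single error-handling point are easy to see.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a subagent is modelled as a plain function from prompt to result.
Subagent = Callable[[str], str]


@dataclass
class Coordinator:
    """Single hub: every call to a subagent goes through `delegate`."""
    subagents: Dict[str, Subagent]
    trace: List[Tuple[str, str, str]] = field(default_factory=list)

    def delegate(self, name: str, prompt: str) -> str:
        try:
            result = self.subagents[name](prompt)
        except Exception as exc:  # one place decides how failures are handled
            result = f"ERROR: {exc}"
        self.trace.append((name, prompt, result))  # one trace for the whole run
        return result


def search_agent(prompt: str) -> str:
    return f"search results for: {prompt}"


def synthesis_agent(prompt: str) -> str:
    # Synthesis cannot call search_agent itself; it only sees its own prompt.
    return f"summary of [{prompt}]"


coordinator = Coordinator({"search": search_agent, "synthesis": synthesis_agent})
facts = coordinator.delegate("search", "AI in music production")
report = coordinator.delegate("synthesis", facts)  # coordinator passes results on
```

Because the stub agents never reference each other, the only way search output reaches synthesis is through the coordinator, and `coordinator.trace` records both hops in order.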
In the Claude Agent SDK, the coordinator spawns subagents with the Task tool, so the coordinator's allowedTools must include "Task". Each subagent is described by an AgentDefinition (its own system prompt and restricted tool set). Because the coordinator holds the only complete picture of the run, it is also the natural owner of retries, partial-result handling, and the decision to stop.
Subagents start with an empty, isolated context
The most tested fact in this task statement: a subagent does not inherit the coordinator's conversation history. It starts fresh. Whatever the subagent needs to know (prior findings, the specific question, the source URLs, the output format) must be written explicitly into the prompt the coordinator sends when it spawns the subagent.
This isolation is a feature. It keeps each subagent's context window small and focused, which improves reliability and lets you run many subagents in parallel without their contexts colliding. But it means "the coordinator already discussed this" is never a reason a subagent will know something. If the synthesis agent needs the web search results, the coordinator must pass those results into the synthesis prompt; the synthesis agent cannot reach back into the coordinator's memory.
A practical consequence: design subagents to return structured, self-contained results (claims plus source URLs plus dates) so the coordinator can hand them to the next subagent without loss. Verbose free text that only makes sense inside the producing agent's context is expensive to pass and easy to misread downstream.
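One way to picture "structured, self-contained results" is a small record type plus a function that writes everything into the next subagent's prompt. This is a sketch under assumptions: the `Finding` dataclass, `build_synthesis_prompt`, and the placeholder URLs and claims are invented for illustration and are not part of any SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    claim: str
    source_url: str
    published: str  # date as an ISO string, e.g. "2024-01-31"


def build_synthesis_prompt(question: str, findings: List[Finding],
                           output_format: str) -> str:
    """Write everything the subagent needs into the prompt itself."""
    lines = [f"Question: {question}", "Findings:"]
    for i, f in enumerate(findings, start=1):
        lines.append(f"{i}. {f.claim} (source: {f.source_url}, date: {f.published})")
    lines.append(f"Output format: {output_format}")
    return "\n".join(lines)


# Placeholder data only; the claims and URLs are not real sources.
findings = [
    Finding("Placeholder claim A", "https://example.com/a", "2024-01-31"),
    Finding("Placeholder claim B", "https://example.com/b", "2024-02-15"),
]
prompt = build_synthesis_prompt(
    "How is AI affecting music production?", findings, "bulleted summary with citations"
)
```

The resulting prompt carries the question, every claim with its source and date, and the expected output format, so the synthesis subagent needs nothing from the coordinator's history.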
What the coordinator actually does: decompose, delegate, select, aggregate
The coordinator has four jobs. First, task decomposition: break the user's request into subtasks that map to specialized subagents. Second, delegation: send each subtask, with all needed context, to the right subagent. Third, dynamic selection: decide which subagents to invoke based on the query, rather than always running every stage. Fourth, aggregation: collect the results and combine them into a coherent answer.
Dynamic selection is worth emphasizing. A well-designed coordinator analyzes the query first and invokes only the subagents that query needs. A simple factual lookup might need only the search agent; a full research report needs search, analysis, synthesis, and report generation. Hard-coding every request through the entire pipeline wastes tokens and latency and can even degrade quality by forcing empty or irrelevant stages.
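The selection step can be sketched as a function from query to the list of stages to run. In a real coordinator this analysis would be done by the model itself; the keyword heuristic below is only a stand-in, and `select_subagents`, `FULL_PIPELINE`, and the trigger words are hypothetical.

```python
from typing import List

FULL_PIPELINE = ["search", "analysis", "synthesis", "report"]


def select_subagents(query: str) -> List[str]:
    """Pick only the stages the query needs (toy heuristic, not a real classifier)."""
    q = query.lower()
    wants_report = any(word in q for word in ("report", "analyze", "compare", "impact"))
    if wants_report:
        return list(FULL_PIPELINE)  # broad research: run every stage
    return ["search"]  # simple factual lookup: search alone is enough


simple = select_subagents("When was the transistor invented?")
broad = select_subagents("Write a report on the impact of AI on creative industries")
```

Here `simple` contains only the search stage while `broad` gets the full pipeline, which is the behavior the paragraph above describes.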
The coordinator prompt should specify goals and quality criteria (what a good result looks like) rather than a rigid step-by-step script. That lets the coordinator adapt its delegation to the specific query instead of mechanically following one fixed path.
Partition scope to eliminate duplicate work
When you fan out to multiple subagents (say, several search agents), give each a distinct slice of the problem. Assign non-overlapping subtopics or distinct source types: one agent handles academic papers, another handles news, another handles the company's own docs. Without explicit partitioning, parallel agents tend to converge on the same obvious queries and return largely the same results, so you pay for N agents and get roughly one agent's worth of coverage.
Good partitioning does two things at once: it minimizes duplication and it maximizes breadth. Each agent's isolated context stays focused on its slice, and the union of slices covers the topic. The coordinator is responsible for choosing the partition, because it is the only component that sees the whole scope.
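A non-overlapping partition can be sketched by enumerating every (subtopic, source type) slice once and dealing the slices out to agents. The function name `partition_scope` and the round-robin scheme are assumptions made for this example; a real coordinator might choose the partition through model reasoning instead.

```python
from itertools import product
from typing import Dict, List, Tuple

Slice = Tuple[str, str]  # (subtopic, source type)


def partition_scope(subtopics: List[str], source_types: List[str],
                    n_agents: int) -> Dict[int, List[Slice]]:
    """Deal slices round-robin so each slice belongs to exactly one agent."""
    slices = list(product(subtopics, source_types))
    assignment: Dict[int, List[Slice]] = {i: [] for i in range(n_agents)}
    for idx, piece in enumerate(slices):
        assignment[idx % n_agents].append(piece)
    return assignment


assignment = partition_scope(
    ["music", "film"], ["academic papers", "news", "company docs"], n_agents=3
)
```

With three source types and three agents, the round-robin happens to give each agent one source type across both subtopics: agent 0 owns academic papers, agent 1 news, and agent 2 company docs. The union of the slices still covers the whole scope.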
Iterative refinement: check coverage, re-delegate, repeat
A single pass of decompose, delegate, and synthesize is often not enough for open-ended research. The stronger pattern is a loop: after the synthesis subagent produces a draft, the coordinator evaluates that output for gaps against the original goal. Where coverage is thin or missing, the coordinator formulates targeted follow-up queries, re-delegates them to the search and analysis subagents, and then re-invokes synthesis with the new material. It repeats until coverage is sufficient.
This loop is what turns a brittle one-shot pipeline into a system that can recover from its own blind spots. It is also the direct remedy for narrow decomposition (see the next concept): even if the first decomposition missed a dimension, a coverage check against the topic can surface the gap and trigger another round.
Concretely, the coordinator's evaluation step asks "does this synthesis address every part of the user's request?" and produces a list of uncovered subtopics. Those become the next round of subagent tasks. The stopping condition is coverage sufficiency, not a fixed iteration count.
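The evaluate-and-re-delegate cycle can be sketched with stubs. Several things here are assumptions for illustration: the draft is modelled as a dict of sections keyed by subtopic, the coverage check is a simple set difference (a real coordinator would use a model judgment), and `max_rounds` is a safety backstop added for this sketch, not the stopping condition.

```python
from typing import Callable, Dict, List, Set


def uncovered(required: Set[str], draft_sections: Dict[str, str]) -> List[str]:
    """Coverage check: which required subtopics have no section in the draft?"""
    return sorted(required - set(draft_sections))


def refine(required: Set[str], first_pass: List[str],
           research: Callable[[str], str], max_rounds: int = 5) -> Dict[str, str]:
    """Re-delegate uncovered subtopics until the draft covers every required one."""
    draft = {topic: research(topic) for topic in first_pass}
    gaps = uncovered(required, draft)
    rounds = 0
    while gaps and rounds < max_rounds:  # coverage decides; the cap is a backstop
        for topic in gaps:
            draft[topic] = research(topic)  # targeted follow-up task
        gaps = uncovered(required, draft)
        rounds += 1
    return draft


required = {"music", "writing", "film", "visual arts"}
first_gaps = uncovered(required, {"visual arts": "notes"})
draft = refine(required, ["visual arts"], research=lambda t: f"notes on {t}")
```

Starting from a first pass that only covered visual arts, `first_gaps` lists the three missing subtopics, and `refine` fills them in a single extra round.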
The narrow-decomposition failure mode
The classic multi-agent bug: every subagent succeeds, yet the final report is incomplete. The web search agent found good articles, the analysis agent summarized them correctly, and the synthesis agent wrote coherent prose, but whole branches of the topic are missing. When this happens, the subagents are not at fault; they did exactly what they were told. The root cause is upstream, in how the coordinator decomposed the task.
The tell is in the coordinator's decomposition log. If a request to research "creative industries" was split into "digital art," "graphic design," and "photography," the coordinator silently narrowed the topic to visual arts and never assigned music, writing, or film. No downstream agent can recover coverage that was never delegated.
Guard against this in two ways: write coordinator prompts that decompose at the right altitude (enumerate the major dimensions of a broad topic before slicing it), and add the iterative refinement loop so a coverage check can catch and fill gaps the first decomposition missed.
Anti-patterns to avoid
Anti-pattern: Letting subagents communicate with each other directly.
Why it fails: It scatters error handling and logging across agents, destroys the single observable trace of the run, and makes information flow uncontrollable and hard to debug.
Instead: Route every subagent-to-subagent exchange through the coordinator. If the synthesis agent needs a fact, it returns to the coordinator, which invokes the search agent and passes the result back.
Anti-pattern: Always routing every request through the full pipeline.
Why it fails: Simple queries pay the token and latency cost of stages they do not need, and forcing empty or irrelevant stages can degrade output quality.
Instead: Have the coordinator analyze the query first and dynamically select only the subagents that request actually needs.
Anti-pattern: Assuming a subagent already knows what the coordinator discussed.
Why it fails: Subagents have isolated context and do not inherit the coordinator's history; the missing context produces off-target or hallucinated results.
Instead: Write all required context (prior findings, source URLs, output format, the precise question) explicitly into the subagent's prompt when spawning it.
Anti-pattern: Relying on a single decomposition pass with no coverage check.
Why it fails: If the initial decomposition misses a dimension of the topic, no downstream agent can recover the missing coverage, and the report ships with silent gaps.
Instead: Decompose at the right altitude and add an iterative refinement loop where the coordinator evaluates synthesis for gaps and re-delegates until coverage is sufficient.
Worked example: Diagnosing a research report that only covers visual arts
You run the multi-agent research system (Scenario 3) on the topic "impact of AI on creative industries." Every subagent reports success: the web search agent returns relevant articles, the document analysis agent summarizes papers correctly, and the synthesis agent produces coherent, well-written output. Yet the final report covers only visual arts. Music, writing, and film production are missing entirely.
Step 1: Do not blame the subagents. They each completed their assigned task correctly. In a hub-and-spoke system, when the pieces succeed but the whole is incomplete, suspect the coordinator's decomposition, the one component that decides what gets assigned.
Step 2: Read the coordinator's decomposition log. You find it split the topic into three subtasks:
subtask_1: "AI in digital art creation"
subtask_2: "AI in graphic design"
subtask_3: "AI in photography"
All three are visual arts. The coordinator narrowed "creative industries" to a single sector and never delegated music, writing, or film. This is the narrow-decomposition failure mode: no downstream agent can cover a dimension that was never assigned.
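The diagnosis in Step 2 can be approximated mechanically by checking the logged subtasks against an enumerated list of sectors. This is a crude sketch: the `SECTOR_KEYWORDS` map and `missing_sectors` function are hypothetical, and substring matching is only a stand-in for the judgment a reviewer or model would apply.

```python
from typing import Dict, List

# Hypothetical keyword map: sector -> words that signal a subtask covers it.
SECTOR_KEYWORDS: Dict[str, List[str]] = {
    "visual arts": ["art", "design", "photography"],
    "music": ["music"],
    "writing": ["writing", "journalism"],
    "film": ["film", "video"],
}


def missing_sectors(subtasks: List[str]) -> List[str]:
    """Return sectors that no logged subtask mentions."""
    text = " ".join(subtasks).lower()
    return [sector for sector, words in SECTOR_KEYWORDS.items()
            if not any(word in text for word in words)]


logged = [
    "AI in digital art creation",
    "AI in graphic design",
    "AI in photography",
]
print(missing_sectors(logged))  # → ['music', 'writing', 'film']
```

Running the same check on a decomposition that names music, writing, film, and visual arts returns an empty list, which is the signal that the first cut was made at the right altitude.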
Step 3: Fix the decomposition altitude. Instruct the coordinator to first enumerate the major dimensions of the topic, then partition across them so each search subagent owns a distinct sector:
subtasks = [
    "AI in music production",
    "AI in writing and journalism",
    "AI in film and video",
    "AI in visual arts and design",
]
Step 4: Add an iterative refinement loop so a bad first cut self-corrects. After synthesis, the coordinator evaluates coverage against the original request and re-delegates any gaps:
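# First pass: synthesize, then check the draft against the original goal.
draft = invoke(synthesis, findings)
gaps = coordinator.evaluate_coverage(draft, goal="all creative industries")
while gaps:  # stop on coverage sufficiency, not a fixed count
    # Re-delegate targeted queries for the uncovered subtopics only.
    new = invoke(web_search, targeted_queries(gaps))
    findings += invoke(analysis, new)
    # Re-invoke synthesis with the enlarged findings and re-check coverage.
    draft = invoke(synthesis, findings)
    gaps = coordinator.evaluate_coverage(draft, goal="all creative industries")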
The loop stops on coverage sufficiency, not a fixed count. With both fixes, breadth is guaranteed by the decomposition and protected by the gap check. Note that all of this control flow lives in the coordinator; the subagents stay simple and specialized.
Exam tips
- In hub-and-spoke, subagents never talk to each other directly. All communication flows through the coordinator, which delivers observability, consistent error handling, and controlled information flow.
- Subagents do NOT inherit the coordinator's conversation history. Any context they need (prior findings, sources, format) must be written explicitly into their prompt.
- A good coordinator dynamically selects which subagents to invoke based on the query. Always routing every request through the full pipeline is an anti-pattern.
- When every subagent succeeds but a broad report is incomplete, the root cause is narrow coordinator decomposition. Look at the decomposition log, not the subagents.
- Iterative refinement loop: the coordinator evaluates synthesis output for gaps, re-delegates targeted queries to the search and analysis subagents, and re-invokes synthesis until coverage is sufficient.
- Partition scope across parallel subagents (distinct subtopics or source types) to minimize duplicated work and maximize breadth.
Official exam objectives for 1.2
- Hub-and-spoke architecture where a coordinator agent manages all inter-subagent communication, error handling, and information routing
- How subagents operate with isolated context — they do not inherit the coordinator's conversation history automatically
- The role of the coordinator in task decomposition, delegation, result aggregation, and deciding which subagents to invoke based on query complexity
- Risks of overly narrow task decomposition by the coordinator, leading to incomplete coverage of broad research topics
- Designing coordinator agents that analyze query requirements and dynamically select which subagents to invoke rather than always routing through the full pipeline
- Partitioning research scope across subagents to minimize duplication (e.g., assigning distinct subtopics or source types to each agent)
- Implementing iterative refinement loops where the coordinator evaluates synthesis output for gaps, re-delegates to search and analysis subagents with targeted queries, and re-invokes synthesis until coverage is sufficient
- Routing all subagent communication through the coordinator for observability, consistent error handling, and controlled information flow
Flashcards from this lesson
In a hub-and-spoke multi-agent architecture, who manages inter-subagent communication?
The coordinator. Subagents never communicate directly; every task and result routes through the coordinator for observability, consistent error handling, and controlled information flow.
Do subagents inherit the coordinator's conversation history?
No. Subagents run in isolated context. Any information they need must be included explicitly in the prompt the coordinator sends when it spawns them.
A broad research report is missing whole subtopics, yet every subagent succeeded. What is the root cause?
Overly narrow task decomposition by the coordinator. It never assigned the missing dimensions, so no subagent could cover them. Check the decomposition log.
How should a coordinator decide which subagents to invoke?
Dynamically, based on the query's requirements and complexity. Simple queries may need only one subagent; do not force every request through the full pipeline.
What is an iterative refinement loop in a coordinator-subagent system?
The coordinator evaluates the synthesis output for coverage gaps, re-delegates targeted queries to the search and analysis subagents, and re-invokes synthesis, repeating until coverage is sufficient.
How do you minimize duplicated work when fanning out to parallel subagents?
Partition the scope: assign each subagent a distinct subtopic or source type so their coverage does not overlap, maximizing breadth while eliminating redundancy.
In the Agent SDK, what must a coordinator's allowedTools include to spawn subagents?
"Task". The Task tool is the mechanism for spawning subagents, and each subagent's role and tools are defined by an AgentDefinition.