Level 2 — Detail

Agent Communication

How the 8 agents collaborate: dispatch pattern, batch strategy, intermediate files, context injection, and normalization.

On this page
  • Dispatch Pattern — Main agent dispatches sub-agents
  • Batch Strategy — 20-30 files per batch
  • Intermediate Files — Communication channel
  • Context Injection — README, manifest, dir tree
  • Normalization — merge-batch-graphs.py
01
Dispatch Pattern
Main agent dispatches sub-agents with prompt templates from agents/

The main agent (controlled by skill.md) acts as a dispatcher. It reads the phase definition, determines which agent is needed, and launches it as a sub-agent via Claude Code's Task tool. A sub-agent is an isolated Claude instance that receives its own prompt and works in a limited context: only the files and data it needs.

# Dispatch flow (pseudocode)
def dispatch_agent(agent_name, batch_data):
    # 1. Load prompt template
    template = read(f"agents/{agent_name}.md")

    # 2. Inject context
    prompt = (template
        .replace("{{FILES}}", batch_data.files)
        .replace("{{README}}", project.readme)
        .replace("{{IMPORT_MAP}}", project.import_map)
        .replace("{{LANG_ADDENDA}}", get_addenda(batch_data)))

    # 3. Start sub-agent (Task tool)
    result = task(
        prompt=prompt,
        description=f"{agent_name} batch {batch_data.id}"
    )

    # 4. Parse result as JSON
    return parse_json(result)

# Dispatch for Phase 2: file-analyzer x5
batches = split(inventory, size=25)
results = parallel(
    dispatch_agent("file-analyzer", b) for b in batches
)

Step 1: The prompt template is loaded from agents/file-analyzer.md. It contains instructions for which nodes and edges the agent should extract.

Step 2: Placeholders in the template are replaced with real data: the batch's files, README, import map, and language-specific hints.

Step 3: Claude Code starts a new Claude instance (Task tool) with the prepared prompt. The sub-agent works in isolation.

Step 4: The result (JSON with nodes + edges) is parsed and written to the intermediate file.

The dispatch pattern isolates each agent from the overall context. A file-analyzer only sees its 25 files, not the entire codebase. This saves context tokens and enables parallelization.

02
Batch Strategy
20-30 files per batch, up to 5 in parallel, import map injection

The codebase is split into batches because a single Claude context cannot analyze hundreds of files simultaneously. The batch strategy optimizes the trade-off between context quality and parallelization:

Inventory (150 files)
  ▼ split(size=25)
Batch 1 (25 files)   Batch 2 (25 files)   Batch 3 (25 files)
Batch 4 (25 files)   Batch 5 (25 files)   Batch 6 (25 files)
  ▼ parallel(max=5)
Merge: 6 batch JSONs → 1 graph
# Batch configuration in skill.md
batch_config:
  target_size: 25        # files per batch
  min_size: 10           # minimum (after retry split)
  max_parallel: 5        # simultaneous sub-agents
  sort_by: "directory"   # related files in same batch
  import_map_injection:
    enabled: true
    scope: "cross_batch" # imports of ALL files, not just batch
    purpose: "Enables cross-batch edges"

Batch sorting: Files are sorted by directory so related modules (e.g., all files in src/auth/) end up in the same batch. This improves edge detection quality.
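
Sorting then chunking can be sketched in a few lines of Python; `make_batches` and the sample paths are illustrative, not the skill's actual code:

```python
import os
from itertools import islice

def make_batches(files, target_size=25):
    """Sort files by directory so related modules share a batch,
    then chunk the ordered list into fixed-size batches."""
    ordered = sorted(files, key=lambda f: (os.path.dirname(f), f))
    it = iter(ordered)
    batches = []
    while chunk := list(islice(it, target_size)):
        batches.append(chunk)
    return batches

files = ["src/auth/login.py", "src/db/conn.py", "src/auth/token.py"]
print(make_batches(files, target_size=2))
# [['src/auth/login.py', 'src/auth/token.py'], ['src/db/conn.py']]
```

Sorting before chunking is what keeps both src/auth/ files in the same batch even though they were interleaved with other directories in the raw inventory.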

Import map injection: The key element. The import map contains ALL import relationships across the entire codebase. Each batch receives this map so files in different batches can still produce correct edges to each other.

Without import map injection, cross-batch relationships would be lost. File A in Batch 1 imports File B in Batch 3 — without the map, Batch 1 wouldn't know B exists. The map solves this elegantly.
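
A minimal sketch of how the injected map enables this, assuming a simple file-to-imports dict shape for the map (the real format may differ):

```python
# Hypothetical import map: file -> files it imports, project-wide
import_map = {
    "src/a.py": ["src/pkg/b.py"],  # A (Batch 1) imports B (Batch 3)
    "src/pkg/b.py": [],
}

batch_files = {"src/a.py"}  # the only file this sub-agent analyzes

def cross_batch_edges(batch_files, import_map):
    """Emit import edges even when the target lies outside the batch."""
    edges = []
    for src in batch_files:
        for target in import_map.get(src, []):
            edges.append({"source": src, "target": target, "type": "imports"})
    return edges

print(cross_batch_edges(batch_files, import_map))
# [{'source': 'src/a.py', 'target': 'src/pkg/b.py', 'type': 'imports'}]
```

The agent never reads src/pkg/b.py; the map alone is enough to name it as an edge target.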

03
Intermediate Files
.claude-learning/intermediate/ as the communication channel between agents

Agents don't communicate directly — they communicate through the file system. Each agent writes its results as JSON to the intermediate/ folder. The next agent reads these files as input.

# Intermediate folder structure
.claude-learning/
  intermediate/
    manifest.json         # P1: project-scanner
    import-map.json       # P1: project-scanner
    dir-tree.txt          # P1: project-scanner
    batch-1.json          # P2: file-analyzer
    batch-2.json          # P2: file-analyzer
    batch-3.json          # P2: file-analyzer
    assembled-graph.json  # P3: assemble-reviewer
    architecture.json     # P4: architecture-analyzer
    tour.json             # P5: tour-builder
    review-log.json       # P6: review-validator

# Data flow:
# P1 output → P2 input (manifest, import-map)
# P2 output → P3 input (batch-*.json)
# P3 output → P4 input (assembled-graph.json)
# P4 output → P5 input (architecture.json)
# P5 output → P6 input (tour.json)

Why files instead of context? Claude sub-agents have isolated contexts. They cannot share variables. The only way to transfer data between agents is the file system.

Advantage: Intermediate files are inspectable. When something goes wrong, you can open the JSON files and see what each agent produced. This makes debugging dramatically easier.

Cleanup: After P7 (Save), intermediate files are optionally deleted. The final Knowledge Graph contains all relevant data.

This pattern is inspired by classic Unix pipelines: each process reads from stdin, transforms its input, and writes to stdout. Here it's JSON files instead of streams, but the principle is identical — loose coupling through defined interfaces.
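
The handoff then reduces to plain JSON reads and writes; a sketch using the folder layout above (`write_stage`/`read_stage` are illustrative helper names, not the skill's API):

```python
import json
from pathlib import Path

INTERMEDIATE = Path(".claude-learning/intermediate")

def write_stage(name, data):
    """A phase persists its output for the next phase."""
    INTERMEDIATE.mkdir(parents=True, exist_ok=True)
    (INTERMEDIATE / name).write_text(json.dumps(data, indent=2))

def read_stage(name):
    """The next phase picks the file up as its input."""
    return json.loads((INTERMEDIATE / name).read_text())

# P2 writes, P3 reads — no shared in-memory context required
write_stage("batch-1.json", {"nodes": [], "edges": []})
assert read_stage("batch-1.json") == {"nodes": [], "edges": []}
```

Because each stage boundary is just a file, you can re-run a single phase against the previous phase's output without re-running the whole pipeline.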

04
Context Injection
README, manifest, dir tree are injected into agent prompts

Each agent receives shared context alongside its specific data, injected into its prompt template. Three files form the "shared context":

# agents/file-analyzer.md — Template excerpt
---
name: file-analyzer
role: Extract nodes and edges from source files
output: JSON with nodes[] and edges[]
---

## Injected Context

### Project README
{{README}}

### Directory Tree
{{DIR_TREE}}

### Import Map (cross-batch)
{{IMPORT_MAP}}

### Language Addenda
{{LANG_ADDENDA}}  # e.g. languages/python.md

README: Gives the agent project context. What does the project do? Which technologies are used? Without README, the agent produces more generic descriptions.

Dir Tree: Shows the entire folder structure. Helps the agent recognize dependencies between directories.

Import Map: The map of all import relationships. Allows the agent to produce correct cross-batch edges.

Language Addenda: Language-specific instructions from the languages/ folder. For Python files, decorator patterns and __init__.py conventions are explained.

Context injection ensures that each agent — despite being isolated — has enough knowledge about the overall project. The art lies in the balance: too much context wastes tokens, too little context produces lower-quality results.
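
One way to make that balance explicit is a per-section character budget; the budget numbers and helper names here are invented for illustration:

```python
def truncate(text, max_chars):
    """Trim an injected section to a rough size budget."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[...truncated]"

def inject_context(template, sections, budgets):
    """Replace {{PLACEHOLDER}} slots, capping each section's size."""
    for key, text in sections.items():
        template = template.replace(
            "{{" + key + "}}", truncate(text, budgets.get(key, 4000))
        )
    return template

prompt = inject_context(
    "README:\n{{README}}",
    {"README": "x" * 10},
    {"README": 5},
)
print(prompt)  # the README section, capped at 5 chars plus a truncation marker
```

A fixed cap is crude (a real implementation might rank sections by relevance instead), but it makes the token trade-off visible and tunable per agent.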

05
Normalization
merge-batch-graphs.py: Node ID normalization, deduplication, edge cleanup

After parallel batch analysis, results must be merged. This is non-trivial: different batches may produce the same node under slightly different IDs. The merge is executed by the assemble-reviewer and can optionally be supported by the Python script merge-batch-graphs.py, which produces more deterministic results than purely LLM-based merging. The process consists of three steps:

# merge-batch-graphs.py — Core logic

def normalize_node_id(node):
    # Step 1: Deterministic ID generation
    type_prefix = node["type"]
    path = node["filePath"].replace("/", "-").replace(".", "-")
    name = node["name"]
    return f"{type_prefix}:{path}::{name}"

def deduplicate_nodes(all_nodes):
    # Step 2: Merge duplicates
    seen = {}
    for node in all_nodes:
        nid = normalize_node_id(node)
        if nid in seen:
            # Longer description wins
            if len(node["description"]) > len(seen[nid]["description"]):
                seen[nid]["description"] = node["description"]
        else:
            seen[nid] = node
    return list(seen.values())

def clean_edges(edges, valid_node_ids):
    # Step 3: Remove dangling edges
    return [
        e for e in edges
        if e["source"] in valid_node_ids and e["target"] in valid_node_ids
    ]

Step 1 — ID normalization: Node IDs are deterministically generated from type, file path, and name. This ensures identical nodes across batches receive the same ID, regardless of how the LLM named them.

Step 2 — Deduplication: Nodes with identical IDs are merged. On conflicts, the longer (presumably more informative) description wins.

Step 3 — Edge cleanup: Edges referencing non-existent nodes (e.g., because a batch failed) are removed. This ensures referential integrity.

Normalization is the fragile point of the pipeline. When two batches name the same node differently (e.g., "LoginForm" vs. "loginForm"), duplicates emerge. The Python script resolves this through case-insensitive comparisons and path-based heuristics — but edge cases are inevitable.
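
The case-insensitive comparison can be folded directly into the generated ID; a simplified sketch (not the script's exact heuristics):

```python
def normalize_node_id_ci(node):
    """Lowercase the whole ID so 'LoginForm' and 'loginForm' collapse."""
    path = node["filePath"].replace("/", "-").replace(".", "-")
    return f'{node["type"]}:{path}::{node["name"]}'.lower()

a = {"type": "component", "filePath": "src/auth/login.tsx", "name": "LoginForm"}
b = {"type": "component", "filePath": "src/auth/login.tsx", "name": "loginForm"}
assert normalize_node_id_ci(a) == normalize_node_id_ci(b)
```

The trade-off is exactly the edge case mentioned above: lowercasing can over-merge two genuinely distinct names that differ only in case, so the heuristic trades occasional false merges for far fewer duplicates.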
