How the 8 agents collaborate: dispatch pattern, batch strategy, intermediate files, context injection, and normalization.
The main agent (controlled by skill.md) acts as a dispatcher. It reads the phase definition, determines which agent is needed, and launches it as a sub-agent via Claude Code's Task tool. A sub-agent is an isolated Claude instance, started via the Task tool, that receives its own prompt and works in a limited context: only the files and data it needs.
# Dispatch flow (pseudocode)
def dispatch_agent(agent_name, batch_data):
    # 1. Load prompt template
    template = read(f"agents/{agent_name}.md")

    # 2. Inject context (str.replace takes one pair at a time,
    #    so the placeholders are substituted in a loop)
    context = {
        "{{FILES}}": batch_data.files,
        "{{README}}": project.readme,
        "{{IMPORT_MAP}}": project.import_map,
        "{{LANG_ADDENDA}}": get_addenda(batch_data),
    }
    prompt = template
    for placeholder, value in context.items():
        prompt = prompt.replace(placeholder, value)

    # 3. Start sub-agent (Task tool)
    result = task(
        prompt=prompt,
        description=f"{agent_name} batch {batch_data.id}",
    )

    # 4. Parse result as JSON
    return parse_json(result)

# Dispatch for Phase 2: file-analyzer x5
batches = split(inventory, size=25)
results = parallel(
    dispatch_agent("file-analyzer", b) for b in batches
)
Step 1: The prompt template is loaded from agents/file-analyzer.md. It contains instructions for which nodes and edges the agent should extract.
Step 2: Placeholders in the template are replaced with real data: the batch's files, README, import map, and language-specific hints.
Step 3: Claude Code starts a new Claude instance (Task tool) with the prepared prompt. The sub-agent works in isolation.
Step 4: The result (JSON with nodes + edges) is parsed and written to the intermediate file.
The dispatch pattern isolates each agent from the overall context. A file-analyzer only sees its 25 files, not the entire codebase. This saves context tokens and enables parallelization.
The codebase is split into batches because a single Claude context cannot analyze hundreds of files simultaneously. The batch strategy optimizes the trade-off between context quality and parallelization:
# Batch configuration in skill.md
batch_config:
  target_size: 25       # files per batch
  min_size: 10          # minimum (after retry split)
  max_parallel: 5       # simultaneous sub-agents
  sort_by: "directory"  # related files in same batch

import_map_injection:
  enabled: true
  scope: "cross_batch"  # imports of ALL files, not just batch
  purpose: "Enables cross-batch edges"
Batch sorting: Files are sorted by directory so related modules (e.g., all files in src/auth/) end up in the same batch. This improves edge detection quality.
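A minimal sketch of this sorting-plus-chunking step (the helper is illustrative, not the skill's actual code; target_size matches the config above):

# Directory-sorted batching (sketch)
import os

def build_batches(files, target_size=25):
    # Sort by directory so sibling modules (e.g. all of src/auth/) stay together
    ordered = sorted(files, key=lambda f: (os.path.dirname(f), f))
    # Chunk the sorted list into batches of target_size
    return [ordered[i:i + target_size]
            for i in range(0, len(ordered), target_size)]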
Import map injection: The key element. The import map contains ALL import relationships across the entire codebase. Each batch receives this map so files in different batches can still produce correct edges to each other.
Without import map injection, cross-batch relationships would be lost. File A in Batch 1 imports File B in Batch 3 — without the map, Batch 1 wouldn't know B exists. The map solves this elegantly.
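The schema of import-map.json isn't reproduced above; one plausible shape that would support this lookup (field names here are assumptions):

# import-map.json: plausible shape (hypothetical field names)
{
  "src/auth/login.py": {
    "imports": ["src/core/session.py", "src/db/users.py"],
    "imported_by": ["src/api/routes.py"]
  },
  "src/core/session.py": {
    "imports": [],
    "imported_by": ["src/auth/login.py"]
  }
}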
Agents don't communicate directly — they communicate through the file system. Each agent writes its results as JSON to the intermediate/ folder. The next agent reads these files as input.
# Intermediate folder structure
.claude-learning/
  intermediate/
    manifest.json          # P1: project-scanner
    import-map.json        # P1: project-scanner
    dir-tree.txt           # P1: project-scanner
    batch-1.json           # P2: file-analyzer
    batch-2.json           # P2: file-analyzer
    batch-3.json           # P2: file-analyzer
    assembled-graph.json   # P3: assemble-reviewer
    architecture.json      # P4: architecture-analyzer
    tour.json              # P5: tour-builder
    review-log.json        # P6: review-validator
# Data flow:
# P1 output → P2 input (manifest, import-map)
# P2 output → P3 input (batch-*.json)
# P3 output → P4 input (assembled-graph.json)
# P4 output → P5 input (architecture.json)
# P5 output → P6 input (tour.json)
Why files instead of context? Claude sub-agents have isolated contexts. They cannot share variables. The only way to transfer data between agents is the file system.
Advantage: Intermediate files are inspectable. When something goes wrong, you can open the JSON files and see what each agent produced. This makes debugging dramatically easier.
Cleanup: After P7 (Save), intermediate files are optionally deleted. The final Knowledge Graph contains all relevant data.
This pattern is inspired by classic Unix pipelines: each process reads stdin, processes, writes stdout. Here it's JSON files instead of streams, but the principle is identical — loose coupling through defined interfaces.
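In code terms, the handoff between two phases is just a JSON write followed by a JSON read. A minimal sketch, assuming the folder layout above (the function names are illustrative):

# File-based handoff between P2 and P3 (sketch)
import json
from pathlib import Path

INTERMEDIATE = Path(".claude-learning/intermediate")

def write_batch_result(batch_id, result):
    # P2: each file-analyzer result lands in its own batch file
    path = INTERMEDIATE / f"batch-{batch_id}.json"
    path.write_text(json.dumps(result, indent=2))

def read_batch_results():
    # P3: the assemble-reviewer picks up every batch file as input
    return [json.loads(p.read_text())
            for p in sorted(INTERMEDIATE.glob("batch-*.json"))]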
Each agent receives shared context alongside its specific data, injected into its prompt template. Three project-wide files form the "shared context" (README, directory tree, import map); a fourth injection, the language addenda, is selected per batch:
# agents/file-analyzer.md — Template excerpt
---
name: file-analyzer
role: Extract nodes and edges from source files
output: JSON with nodes[] and edges[]
---
## Injected Context
### Project README
{{README}}
### Directory Tree
{{DIR_TREE}}
### Import Map (cross-batch)
{{IMPORT_MAP}}
### Language Addenda
{{LANG_ADDENDA}} # e.g. languages/python.md
README: Gives the agent project context. What does the project do? Which technologies are used? Without README, the agent produces more generic descriptions.
Dir Tree: Shows the entire folder structure. Helps the agent recognize dependencies between directories.
Import Map: The map of all import relationships. Allows the agent to produce correct cross-batch edges.
Language Addenda: Language-specific instructions from the languages/ folder. For Python files, decorator patterns and __init__.py conventions are explained.
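How the right addendum is chosen isn't spelled out above; a simple extension lookup would do the job. A sketch (the mapping and helper are hypothetical; read is the same file helper as in the dispatch pseudocode):

# Addenda lookup (sketch)
from pathlib import Path

ADDENDA = {
    ".py": "languages/python.md",
    ".ts": "languages/typescript.md",
    ".go": "languages/go.md",
}

def get_addenda(batch_data):
    # Collect one addendum per language that occurs in the batch
    exts = {Path(f).suffix for f in batch_data.files}
    return "\n\n".join(read(ADDENDA[e]) for e in sorted(exts) if e in ADDENDA)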
Context injection ensures that each agent — despite being isolated — has enough knowledge about the overall project. The art lies in the balance: too much context wastes tokens, too little context produces lower-quality results.
After parallel batch analysis, results must be merged. This is non-trivial: different batches may produce the same node under slightly different IDs. The merge process, executed by the assemble-reviewer and optionally supported by the Python script merge-batch-graphs.py (which produces more deterministic results than purely LLM-based merging), consists of three steps:
# merge-batch-graphs.py — Core logic
def normalize_node_id(node):
    # Step 1: Deterministic ID generation, case-insensitive so that
    # "LoginForm" and "loginForm" collapse into the same ID
    type_prefix = node["type"]
    path = node["filePath"].replace("/", "-").replace(".", "-")
    name = node["name"]
    return f"{type_prefix}:{path}::{name}".lower()

def deduplicate_nodes(all_nodes):
    # Step 2: Merge duplicates
    seen = {}
    for node in all_nodes:
        nid = normalize_node_id(node)
        if nid in seen:
            # Longer description wins
            if len(node["description"]) > len(seen[nid]["description"]):
                seen[nid]["description"] = node["description"]
        else:
            seen[nid] = node
    return list(seen.values())

def clean_edges(edges, valid_node_ids):
    # Step 3: Remove dangling edges
    return [
        e for e in edges
        if e["source"] in valid_node_ids
        and e["target"] in valid_node_ids
    ]
Step 1 — ID normalization: Node IDs are deterministically generated from type, file path, and name. This ensures identical nodes across batches receive the same ID, regardless of how the LLM named them.
Step 2 — Deduplication: Nodes with identical IDs are merged. On conflicts, the longer (presumably more informative) description wins.
Step 3 — Edge cleanup: Edges referencing non-existent nodes (e.g., because a batch failed) are removed. This ensures referential integrity.
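Chained together, the three steps turn N batch files into one consistent graph. A sketch, assuming each batch node carries the raw "id" its batch assigned (so edge endpoints can be remapped to normalized IDs before cleanup):

# Merging all batches (sketch)
import json
from pathlib import Path

def merge_batches(intermediate_dir=".claude-learning/intermediate"):
    all_nodes, all_edges, id_map = [], [], {}
    for path in sorted(Path(intermediate_dir).glob("batch-*.json")):
        batch = json.loads(path.read_text())
        for node in batch["nodes"]:
            # Remember which normalized ID each raw batch ID maps to
            id_map[node["id"]] = normalize_node_id(node)
        all_nodes += batch["nodes"]
        all_edges += batch["edges"]

    nodes = deduplicate_nodes(all_nodes)  # Steps 1 + 2
    for node in nodes:
        node["id"] = normalize_node_id(node)
    # Remap edge endpoints to normalized IDs before dropping dangling ones
    edges = [{**e,
              "source": id_map.get(e["source"], e["source"]),
              "target": id_map.get(e["target"], e["target"])}
             for e in all_edges]
    edges = clean_edges(edges, {n["id"] for n in nodes})  # Step 3
    return {"nodes": nodes, "edges": edges}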
Normalization is the fragile point of the pipeline. When two batches name the same node differently (e.g., "LoginForm" vs. "loginForm"), duplicates emerge. The Python script resolves this through case-insensitive comparisons and path-based heuristics — but edge cases are inevitable.