
The Agent Pipeline

Eight specialized agents work together to transform a codebase into a complete Knowledge Graph. This page covers five of them.

On this page
  • project-scanner — Inventory and import map
  • file-analyzer — Extract nodes + edges (5x parallel)
  • architecture-analyzer — Layer detection
  • tour-builder — Guided learning paths
  • graph-reviewer — Graph validation
01
project-scanner
Discovers files, languages, frameworks and creates the import map

The project-scanner is the first agent to run in the understand pipeline. It is called once and provides the foundation for all subsequent agents. It scans the entire project directory, detects programming languages from file extensions, identifies frameworks via configuration files (package.json, Cargo.toml, etc.), and creates a complete file inventory.

// project-scanner output (excerpt)
{
  "files": [
    { "path": "src/index.ts", "loc": 142, "language": "TypeScript" }
  ],
  "languages": { "TypeScript": { "files": 47 } },
  "frameworks": { "Express": "4.18.2" }
}

What does the scanner deliver? A complete inventory of all code files with paths, line counts, detected languages, and external imports, plus a summary of the detected languages and frameworks.
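As a rough illustration of extension-based language detection, the sketch below builds a summary in the same shape as the "languages" field of the scanner output. The extension table and the function names detectLanguage and summarizeLanguages are assumptions made for this example, not the scanner's actual implementation.

```javascript
// Hypothetical extension table; the real scanner may recognize many more.
const EXTENSION_TO_LANGUAGE = {
  ".ts": "TypeScript",
  ".tsx": "TypeScript",
  ".js": "JavaScript",
  ".py": "Python",
  ".rs": "Rust",
  ".go": "Go",
};

// Returns the language for a path, or null if the extension is unknown.
function detectLanguage(path) {
  const dot = path.lastIndexOf(".");
  const slash = path.lastIndexOf("/");
  // No extension, or a dotfile such as ".env" or "src/.eslintrc".
  if (dot <= slash + 1) return null;
  return EXTENSION_TO_LANGUAGE[path.slice(dot).toLowerCase()] ?? null;
}

// Counts files per detected language, e.g. { TypeScript: { files: 2 } }.
function summarizeLanguages(paths) {
  const languages = {};
  for (const path of paths) {
    const language = detectLanguage(path);
    if (language === null) continue;
    languages[language] = languages[language] ?? { files: 0 };
    languages[language].files += 1;
  }
  return languages;
}

console.log(summarizeLanguages(["src/index.ts", "src/app.tsx", "Makefile"]));
```

Files without a recognized extension are skipped here; a real scanner would likely also look at file names and configuration files.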

This data serves two purposes: (1) the orchestrator loads the matching language and framework addenda, and (2) the files are split into batches of 20-30 for the file-analyzer.
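The batching step can be sketched as follows. The function name splitIntoBatches and the even-spreading strategy are assumptions; the source only states that batches contain 20-30 files.

```javascript
// Splits the inventory into batches of at most `maxBatchSize` files,
// spreading files evenly so no batch is much smaller than the others.
function splitIntoBatches(files, maxBatchSize = 30) {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batchCount = Math.ceil(files.length / maxBatchSize);
  const batches = [];
  for (let i = 0; i < batchCount; i++) {
    const start = Math.floor((i * files.length) / batchCount);
    const end = Math.floor(((i + 1) * files.length) / batchCount);
    batches.push(files.slice(start, end));
  }
  return batches;
}

// 150 files with a cap of 30 yield five batches of 30 files each.
const inventory = Array.from({ length: 150 }, (_, i) => `src/file${i}.ts`);
console.log(splitIntoBatches(inventory).map((batch) => batch.length));
```

Spreading evenly avoids a tiny trailing batch; very small projects still end up with a single batch below 20 files.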

02
file-analyzer
Analyzes batches of 20-30 files, produces nodes + edges (5x parallel)

The file-analyzer is the most work-intensive agent and the only one that runs as multiple parallel instances. Up to five instances process different file batches simultaneously, and each instance receives only its batch of 20-30 files plus the language addenda. For each file, it extracts nodes (functions, classes, interfaces, exports) and edges (imports, calls, inheritance).
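One way to cap the number of simultaneous instances at five is a small worker-lane pattern, sketched below. The helper runWithLimit and the stand-in analyzeBatch are hypothetical names for this example; the actual orchestrator may schedule instances differently.

```javascript
// Runs `worker` over all items with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unprocessed item until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}

// Stand-in for a real file-analyzer call (hypothetical): reports batch size only.
async function analyzeBatch(batch) {
  return { batchSize: batch.length };
}

runWithLimit([["a.ts"], ["b.ts", "c.ts"]], 5, analyzeBatch)
  .then((results) => console.log(results));
```

Because each lane starts a new batch as soon as its previous one finishes, a slow batch does not hold up the remaining work.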

Parallel Processing (diagram): Inventory (150 files) → Batch 1, Batch 2, Batch 3, Batch 4, Batch 5 (in parallel) → Merge
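The merge step combines the per-batch results into one graph. A minimal sketch, assuming each batch returns { nodes, edges }, nodes carry a unique id, and duplicates are resolved by keeping the first occurrence (none of which is confirmed by the source):

```javascript
// Merges batch results into one graph, dropping duplicate nodes and edges.
// Field names (id, type, source, target) are assumptions for this sketch.
function mergeBatchResults(batchResults) {
  const nodesById = new Map();
  const edgesByKey = new Map();

  for (const { nodes = [], edges = [] } of batchResults) {
    for (const node of nodes) {
      if (!nodesById.has(node.id)) nodesById.set(node.id, node);
    }
    for (const edge of edges) {
      const key = `${edge.type}|${edge.source}|${edge.target}`;
      if (!edgesByKey.has(key)) edgesByKey.set(key, edge);
    }
  }

  return {
    nodes: [...nodesById.values()],
    edges: [...edgesByKey.values()],
  };
}
```

Edges that point at files from another batch only become checkable after this step, which is why referential integrity is validated on the merged graph.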
// Node type definitions (from graph-reviewer.md)
const STRUCTURAL_NODE_TYPES = [
  "file", "function", "class", "interface", "module", "component", "type",
  "config", "test", "route", "middleware", "schema", "constant"
];
const DOMAIN_NODE_TYPES = ["concept", "process", "entity"];

03
architecture-analyzer
Identifies 3-10 logical layers and assigns each file

The architecture-analyzer receives the assembled graph and identifies 3 to 10 logical architecture layers. The layer structure is stored as a separate field in the final graph. Typical layers include Core/Domain, Infrastructure, API/Routes, Frontend/UI, Database, Configuration, and Tests.

Each file node is assigned to exactly one layer. Assignment is based on path patterns (e.g., src/api/ = API layer), the import graph (files that are imported by others but import nothing themselves are often Core), and framework conventions (e.g., React components = UI layer).

// Layer assignment output (excerpt)
{
  "layers": [
    { "name": "Core", "nodes": ["src/models/user.ts"] },
    { "name": "API", "nodes": ["src/routes/users.ts"] }
  ]
}
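The path-pattern part of the assignment could look roughly like this. The rule table, the layer names, and the "Unassigned" fallback are assumptions chosen for illustration; the agent also weighs the import graph and framework conventions, which this sketch leaves out.

```javascript
// Hypothetical path rules, checked in order; the first match wins,
// so every file ends up in exactly one layer.
const LAYER_RULES = [
  { layer: "Tests", pattern: /(^|\/)(tests?|__tests__)\/|\.(test|spec)\.[jt]sx?$/ },
  { layer: "API", pattern: /(^|\/)(api|routes)\// },
  { layer: "UI", pattern: /(^|\/)components\/|\.[jt]sx$/ },
  { layer: "Core", pattern: /(^|\/)(models|domain|core)\// },
];

function assignLayer(path, fallback = "Unassigned") {
  const rule = LAYER_RULES.find(({ pattern }) => pattern.test(path));
  return rule ? rule.layer : fallback;
}

// Groups paths into the { layers: [{ name, nodes }] } shape shown above.
function buildLayers(paths) {
  const byLayer = new Map();
  for (const path of paths) {
    const layer = assignLayer(path);
    if (!byLayer.has(layer)) byLayer.set(layer, []);
    byLayer.get(layer).push(path);
  }
  return {
    layers: [...byLayer].map(([name, nodes]) => ({ name, nodes })),
  };
}

console.log(JSON.stringify(buildLayers(["src/models/user.ts", "src/routes/users.ts"])));
```

Placing the Tests rule first keeps a file such as src/routes/users.test.ts out of the API layer.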
04
tour-builder
Creates a guided learning path through the codebase

The tour-builder analyzes the graph and creates an ordered learning path, the guided_tour: an ordered sequence of nodes that gives the learner a logical entry point. Instead of showing all nodes unsorted, it guides the learner step by step through the codebase. The tour typically starts at the main entry point (main, index), follows the most important dependency chains, and ascends in complexity.

// guided_tour output (excerpt)
{
  "guided_tour": {
    "steps": [
      { "order": 1, "node_id": "src/index.ts", "why": "Everything starts here — Express server setup" },
      { "order": 2, "node_id": "src/config.ts", "why": "Essential for understanding all defaults" }
    ]
  }
}

The learning path typically contains 8-15 stops covering about 80% of core functionality. Each stop explains why it matters (the why field), which is what distinguishes the tour from a plain file listing.
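The ordering idea (start at the entry point, then follow dependency chains) can be approximated with a breadth-first walk over import edges. This is only a sketch under assumptions: the function name buildTour, the edge fields, and the use of plain breadth-first order are not taken from the source, and the why text is omitted because the agent writes it.

```javascript
// Walks "imports" edges breadth-first from the entry node and returns
// at most `maxStops` steps in the guided_tour shape (without "why").
function buildTour(graph, entryId, maxStops = 15) {
  const importsOf = new Map();
  for (const edge of graph.edges) {
    if (edge.type !== "imports") continue;
    if (!importsOf.has(edge.source)) importsOf.set(edge.source, []);
    importsOf.get(edge.source).push(edge.target);
  }

  const visited = new Set([entryId]);
  const queue = [entryId];
  const steps = [];

  while (queue.length > 0 && steps.length < maxStops) {
    const nodeId = queue.shift();
    steps.push({ order: steps.length + 1, node_id: nodeId });
    for (const target of importsOf.get(nodeId) ?? []) {
      if (!visited.has(target)) {
        visited.add(target);
        queue.push(target);
      }
    }
  }

  return { guided_tour: { steps } };
}
```

Breadth-first order visits direct dependencies of the entry point before deeper ones, which loosely matches the idea of ascending complexity; ranking by importance would need an additional scoring step.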

05
graph-reviewer
Validates the assembled Knowledge Graph

The graph-reviewer is the last agent in the analysis pipeline and acts as its quality gatekeeper. It validates the final graph against the schema along three dimensions: schema validation (allowed node and edge types), referential integrity (every edge references existing nodes), and completeness (every file from the inventory has at least one node). Errors are either auto-repaired or reported as warnings.

// graph-reviewer validation rules
const EDGE_TYPES = [
  "imports", "exports", "calls", "contains",
  "implements", "extends", "uses", "tests",
  // ... 29 types total (26 structural + 3 domain)
];

function validateGraph(graph) {
  const errors = [];
  // 1. Schema: Only allowed types
  // 2. Referential integrity
  // 3. Completeness
  return errors;
}

Three validation dimensions:

1. Schema Validation: Every node must have one of 16 allowed types, every edge one of 29 allowed types.

2. Referential Integrity: Every edge references a source and target. Both must exist as nodes.

3. Completeness: Every file from the inventory must have at least one node in the graph.
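The three checks above could be filled in along these lines. This is a sketch, not the reviewer's actual code: the node fields id, type, and filePath, the edge fields source and target, and the extra inventoryPaths parameter are assumptions, and only the eight edge types named in the source are listed, so the edge set is incomplete.

```javascript
// The 16 node types from the source (13 structural + 3 domain).
const NODE_TYPES = new Set([
  "file", "function", "class", "interface", "module", "component", "type",
  "config", "test", "route", "middleware", "schema", "constant",
  "concept", "process", "entity",
]);

// Only the eight edge types named in the source; the full schema has 29.
const KNOWN_EDGE_TYPES = new Set([
  "imports", "exports", "calls", "contains",
  "implements", "extends", "uses", "tests",
]);

function validateGraph(graph, inventoryPaths) {
  const errors = [];
  const nodeIds = new Set(graph.nodes.map((node) => node.id));

  // 1. Schema: only allowed node and edge types.
  for (const node of graph.nodes) {
    if (!NODE_TYPES.has(node.type)) {
      errors.push(`node ${node.id}: unknown type "${node.type}"`);
    }
  }
  for (const edge of graph.edges) {
    if (!KNOWN_EDGE_TYPES.has(edge.type)) {
      errors.push(`edge ${edge.source} -> ${edge.target}: unknown type "${edge.type}"`);
    }
    // 2. Referential integrity: both endpoints must exist as nodes.
    if (!nodeIds.has(edge.source)) {
      errors.push(`edge source "${edge.source}" is not a node`);
    }
    if (!nodeIds.has(edge.target)) {
      errors.push(`edge target "${edge.target}" is not a node`);
    }
  }

  // 3. Completeness: every inventoried file has at least one node.
  const coveredFiles = new Set(graph.nodes.map((node) => node.filePath));
  for (const path of inventoryPaths) {
    if (!coveredFiles.has(path)) {
      errors.push(`file "${path}" has no node in the graph`);
    }
  }

  return errors;
}
```

Collecting all errors instead of stopping at the first one fits the description that problems are either auto-repaired or reported as warnings.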
