Eight specialized agents work together to transform a codebase into a complete Knowledge Graph.
The project-scanner is the first agent in the understand pipeline. It is called once and provides the foundation for all subsequent agents. It scans the entire project directory, detects programming languages from file extensions, identifies frameworks via configuration files (package.json, Cargo.toml, etc.), and creates a complete file inventory.
// project-scanner output (excerpt)
{
"files": [
{ "path": "src/index.ts", "loc": 142, "language": "TypeScript" }
],
"languages": { "TypeScript": { "files": 47 } },
"frameworks": { "Express": "4.18.2" }
}
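Detecting languages from file extensions can be sketched as a simple lookup. The extension table and the function names `detectLanguage` and `summarizeLanguages` are illustrative assumptions, not the scanner's actual implementation; only the shape of the `languages` summary follows the excerpt above.

```javascript
// Hypothetical extension table; the scanner's real mapping is not shown in the source.
const EXTENSION_LANGUAGES = {
  ".ts": "TypeScript",
  ".tsx": "TypeScript",
  ".js": "JavaScript",
  ".rs": "Rust",
  ".py": "Python",
};

// Return the language for a path, or null if the extension is missing or unknown.
function detectLanguage(filePath) {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return null;
  const ext = filePath.slice(dot).toLowerCase();
  return EXTENSION_LANGUAGES[ext] ?? null;
}

// Count files per detected language, in the shape { Language: { files: n } }.
function summarizeLanguages(filePaths) {
  const languages = {};
  for (const filePath of filePaths) {
    const language = detectLanguage(filePath);
    if (language === null) continue;
    if (!languages[language]) languages[language] = { files: 0 };
    languages[language].files += 1;
  }
  return languages;
}

const summary = summarizeLanguages(["src/index.ts", "src/app.tsx", "lib.rs", "README"]);
console.log(JSON.stringify(summary)); // {"TypeScript":{"files":2},"Rust":{"files":1}}
```

Files without a recognized extension are skipped here; how the real scanner treats them is not stated in the source.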
What does the scanner deliver? A complete inventory of all code files with paths, line counts, detected languages, and external imports, plus a summary of the detected languages and frameworks.
This data serves two purposes: (1) the orchestrator loads the matching language and framework addenda, and (2) the files are split into batches of 20-30 for the file-analyzer.
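The batching step can be sketched as a plain chunking function. The name `splitIntoBatches` and the default size of 25 are assumptions chosen from the middle of the stated 20-30 range; the orchestrator's actual splitting logic is not shown in the source.

```javascript
// Split a file inventory into batches of at most `batchSize` entries.
// The default of 25 is an assumed value inside the documented 20-30 range.
function splitIntoBatches(files, batchSize = 25) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < files.length; i += batchSize) {
    batches.push(files.slice(i, i + batchSize));
  }
  return batches;
}

// Example: 47 files (the TypeScript count from the scanner excerpt).
const inventory = Array.from({ length: 47 }, (_, i) => ({ path: `src/file${i}.ts` }));
const batches = splitIntoBatches(inventory);
console.log(batches.map((batch) => batch.length)); // [ 25, 22 ]
```

Only the last batch can fall below the chosen size, so a real splitter might rebalance the tail to stay within the 20-30 range.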
The file-analyzer is the most work-intensive agent and the only one that runs in multiple parallel instances. Up to five instances process different file batches simultaneously, and each instance receives only its batch of 20-30 files plus the language addenda. For each file, it extracts nodes (functions, classes, interfaces, exports) and edges (imports, calls, inheritance).
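Running at most five analyzer instances at a time can be sketched with a small promise pool. Both `runWithLimit` and `fakeAnalyzer` are hypothetical stand-ins; how the orchestrator actually dispatches file-analyzer instances is not shown in the source.

```javascript
// Run `worker` over all batches with at most `limit` calls in flight at once.
async function runWithLimit(batches, limit, worker) {
  const results = new Array(batches.length);
  let next = 0;
  // Each lane keeps claiming the next unprocessed batch until none remain.
  async function lane() {
    while (next < batches.length) {
      const index = next++;
      results[index] = await worker(batches[index], index);
    }
  }
  const laneCount = Math.min(limit, batches.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}

// Hypothetical stand-in for a file-analyzer instance: it only counts files
// and records how many instances were active at the same time.
let inFlight = 0;
let peak = 0;
async function fakeAnalyzer(batch) {
  inFlight += 1;
  peak = Math.max(peak, inFlight);
  await Promise.resolve(); // yield, as a real analyzer would while working
  inFlight -= 1;
  return { files: batch.length };
}

const demoBatches = Array.from({ length: 12 }, () => ["a.ts", "b.ts"]);
const done = runWithLimit(demoBatches, 5, fakeAnalyzer);
done.then((results) => console.log(results.length, peak)); // 12 5
```

Results are stored by batch index, so the output order matches the input order regardless of which instance finishes first.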
// Node type definitions (from graph-reviewer.md)
const STRUCTURAL_NODE_TYPES = [
"file", "function", "class", "interface",
"module", "component", "type", "config",
"test", "route", "middleware", "schema", "constant"
];
const DOMAIN_NODE_TYPES = ["concept", "process", "entity"];
The architecture-analyzer receives the assembled graph and identifies 3 to 10 logical architecture layers. The resulting layer structure is stored as a separate field in the final graph. Typical layers include Core/Domain, Infrastructure, API/Routes, Frontend/UI, Database, Configuration, and Tests.
Each file node is assigned to exactly one layer. Assignment is based on path patterns (e.g., src/api/ maps to the API layer), the import graph (files that are only imported by others and import little themselves are often Core), and framework conventions (e.g., React components belong to the UI layer).
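The path-pattern part of this assignment can be sketched as an ordered rule list. The rules, the `Unassigned` fallback, and the function names are hypothetical; the import-graph and framework signals described above are deliberately left out of this sketch.

```javascript
// Hypothetical path rules, checked in order; the first match wins.
const LAYER_RULES = [
  { pattern: /(^|\/)(tests?|__tests__)\//, layer: "Tests" },
  { pattern: /^src\/(api|routes)\//, layer: "API" },
  { pattern: /^src\/(models|domain)\//, layer: "Core" },
  { pattern: /^src\/components\//, layer: "UI" },
];

// Map one file path to exactly one layer name.
function assignLayer(filePath, fallback = "Unassigned") {
  const rule = LAYER_RULES.find(({ pattern }) => pattern.test(filePath));
  return rule ? rule.layer : fallback;
}

// Group paths into the { layers: [{ name, nodes }] } shape.
function groupByLayer(filePaths) {
  const byLayer = new Map();
  for (const filePath of filePaths) {
    const layer = assignLayer(filePath);
    if (!byLayer.has(layer)) byLayer.set(layer, []);
    byLayer.get(layer).push(filePath);
  }
  return { layers: [...byLayer].map(([name, nodes]) => ({ name, nodes })) };
}

const grouped = groupByLayer(["src/models/user.ts", "src/routes/users.ts"]);
console.log(JSON.stringify(grouped));
// {"layers":[{"name":"Core","nodes":["src/models/user.ts"]},{"name":"API","nodes":["src/routes/users.ts"]}]}
```

Because every path resolves to the first matching rule or the fallback, no file can end up in two layers.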
// Layer assignment output (excerpt)
{
"layers": [
{ "name": "Core", "nodes": ["src/models/user.ts"] },
{ "name": "API", "nodes": ["src/routes/users.ts"] }
]
}
The tour-builder analyzes the graph and creates an ordered learning path, the guided_tour: an ordered sequence of nodes that provides a logical way into the project. The tour typically starts at the main entry point (main, index) and follows the most important dependency chains. Instead of showing all nodes unsorted, it guides the learner step by step through the codebase, ascending in complexity.
// guided_tour output (excerpt)
{
"guided_tour": {
"steps": [
{ "order": 1, "node_id": "src/index.ts",
"why": "Everything starts here — Express server setup" },
{ "order": 2, "node_id": "src/config.ts",
"why": "Essential for understanding all defaults" }
]
}
}
The learning path typically contains 8-15 stops covering about 80% of core functionality. Each stop explains why it matters (the why field), which is what distinguishes the tour from a plain file listing.
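One simple way to derive such an ordering is a breadth-first walk over import edges from the entry point. This is a simplification and an assumption: the source does not say how the tour-builder ranks dependency chains or complexity, the edge shape `{ source, target, type }` is hypothetical, and the why texts would have to come from the agent itself.

```javascript
// Build an ordered tour by walking "imports" edges breadth-first from the entry.
// The cap of 15 stops mirrors the upper end of the typical 8-15 range.
function buildTour(entryId, edges, maxStops = 15) {
  const imports = new Map();
  for (const edge of edges) {
    if (edge.type !== "imports") continue;
    if (!imports.has(edge.source)) imports.set(edge.source, []);
    imports.get(edge.source).push(edge.target);
  }
  const steps = [];
  const seen = new Set([entryId]);
  const queue = [entryId];
  while (queue.length > 0 && steps.length < maxStops) {
    const nodeId = queue.shift();
    steps.push({ order: steps.length + 1, node_id: nodeId });
    for (const target of imports.get(nodeId) ?? []) {
      if (!seen.has(target)) {
        seen.add(target);
        queue.push(target);
      }
    }
  }
  return { guided_tour: { steps } };
}

const tour = buildTour("src/index.ts", [
  { source: "src/index.ts", target: "src/config.ts", type: "imports" },
  { source: "src/index.ts", target: "src/routes/users.ts", type: "imports" },
  { source: "src/routes/users.ts", target: "src/models/user.ts", type: "imports" },
  { source: "src/routes/users.ts", target: "src/config.ts", type: "imports" },
]);
console.log(tour.guided_tour.steps.map((step) => step.node_id));
// [ 'src/index.ts', 'src/config.ts', 'src/routes/users.ts', 'src/models/user.ts' ]
```

The `seen` set keeps each node to a single stop even when several files import it, as `src/config.ts` shows in the example.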
The graph-reviewer is the last agent in the analysis pipeline and acts as the quality gatekeeper. It validates the final graph against the schema in three dimensions: schema validation (allowed node and edge types), referential integrity (every edge references existing nodes), and completeness (every file from the inventory has at least one node). Errors are either auto-repaired or reported as warnings.
// graph-reviewer validation rules
const EDGE_TYPES = [
"imports", "exports", "calls", "contains",
"implements", "extends", "uses", "tests",
// ... 29 types total (26 structural + 3 domain)
];
function validateGraph(graph) {
const errors = [];
// 1. Schema: Only allowed types
// 2. Referential integrity
// 3. Completeness
return errors;
}
Three validation dimensions:
1. Schema Validation: Every node must have one of 16 allowed types, every edge one of 29 allowed types.
2. Referential Integrity: Every edge references a source and target. Both must exist as nodes.
3. Completeness: Every file from the inventory must have at least one node in the graph.
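The three dimensions above can be sketched as one pass over the graph. The graph shape (`nodes` with `id`, `type`, `filePath`; `edges` with `source`, `target`, `type`) and the name `reviewGraph` are assumptions made for illustration. The edge-type set contains only the eight types listed earlier, not the full 29.

```javascript
// The 16 node types from the definitions above (13 structural + 3 domain).
const NODE_TYPES = new Set([
  "file", "function", "class", "interface", "module", "component", "type",
  "config", "test", "route", "middleware", "schema", "constant",
  "concept", "process", "entity",
]);
// Only the eight edge types shown in the excerpt; the full list has 29.
const KNOWN_EDGE_TYPES = new Set([
  "imports", "exports", "calls", "contains",
  "implements", "extends", "uses", "tests",
]);

// Assumed shapes: node { id, type, filePath }, edge { source, target, type }.
function reviewGraph(graph, inventoryPaths) {
  const errors = [];
  const nodeIds = new Set(graph.nodes.map((node) => node.id));
  // 1. Schema: only allowed node and edge types.
  for (const node of graph.nodes) {
    if (!NODE_TYPES.has(node.type)) errors.push(`schema: node ${node.id} has type "${node.type}"`);
  }
  for (const edge of graph.edges) {
    if (!KNOWN_EDGE_TYPES.has(edge.type)) errors.push(`schema: edge has type "${edge.type}"`);
    // 2. Referential integrity: both endpoints must exist as nodes.
    for (const end of ["source", "target"]) {
      if (!nodeIds.has(edge[end])) errors.push(`integrity: ${end} "${edge[end]}" does not exist`);
    }
  }
  // 3. Completeness: every inventory file needs at least one node.
  const coveredFiles = new Set(graph.nodes.map((node) => node.filePath));
  for (const filePath of inventoryPaths) {
    if (!coveredFiles.has(filePath)) errors.push(`completeness: no node for ${filePath}`);
  }
  return errors;
}

const problems = reviewGraph(
  {
    nodes: [
      { id: "src/index.ts", type: "file", filePath: "src/index.ts" },
      { id: "src/index.ts#main", type: "function", filePath: "src/index.ts" },
      { id: "src/x.ts", type: "widget", filePath: "src/x.ts" },
    ],
    edges: [
      { source: "src/index.ts", target: "src/index.ts#main", type: "contains" },
      { source: "src/index.ts", target: "src/missing.ts", type: "imports" },
    ],
  },
  ["src/index.ts", "src/x.ts", "src/config.ts"]
);
console.log(problems.length); // 3
```

This sketch only collects errors; the auto-repair step mentioned for the graph-reviewer is not modeled here.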