Level 1 — Deep Dive

Knowledge Graph Pipeline

Phase 0 through Phase 7: From pre-flight checks to the finished Knowledge Graph with 16 node types and 29 edge types.

On this page
  • Phase 0: Pre-flight — Git status, config, incremental check
  • Phase 1-2: Scan & Analyze — Discover files, analyze batches
  • Phase 3: Assemble Review — Merge the graph
  • Phase 4-5: Architecture & Tour — Layers and learning path
  • Phase 6: Review — Validation
  • Phase 7: Save — Persist and cleanup
01
Phase 0: Pre-flight
Git status, configuration, incremental check

Before analysis begins, Phase 0 verifies prerequisites. The incremental check is particularly important: it checks whether a knowledge-graph.json already exists and, via Git diff, which files have changed since the last run. In incremental mode, only changed files are re-analyzed.

# Phase 0: Pre-flight checks
pre_flight:
  steps:
    - check_git_status      # Uncommitted changes?
    - load_config           # .claude-learning/config.json
    - check_existing_graph  # knowledge-graph.json present?
    - compute_diff          # git diff --name-only since last run
    - decide_mode           # full vs. incremental

What happens? Phase 0 checks if a prior graph exists. If so, only changed files (via Git diff) are re-analyzed, saving significant time on large projects.
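The mode decision can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the pipeline's actual implementation: the function names `changed_files_since` and `decide_mode` are hypothetical, and so is the idea that the previous run's commit hash is available as `last_commit`. Only `git diff --name-only` and the full vs. incremental distinction come from the step list above.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def changed_files_since(last_commit: str, repo_dir: str = ".") -> list[str]:
    """Return paths that differ between last_commit and the working tree.

    Assumes the previous run recorded its commit hash somewhere (hypothetical).
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", last_commit],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def decide_mode(graph_path: Path, changed: list[str]) -> tuple[str, list[str]]:
    """Pick 'full' when no prior graph exists, otherwise 'incremental'."""
    if not graph_path.exists():
        return "full", []          # no prior graph: analyze everything
    return "incremental", changed  # re-analyze only the changed files
```

Keeping the Git call separate from the decision makes the decision logic testable without a repository.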

02
Phase 1-2: Scan & Analyze
Discover files, split into batches, analyze in parallel

Phase 1 (SCAN) dispatches the project-scanner agent to inventory all code files. Phase 2 (ANALYZE) splits files into batches of 20-30 and dispatches up to five file-analyzer instances in parallel.
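The batching and bounded parallelism described above might look roughly like the following. It is a sketch only: `analyze_batch` is a hypothetical stand-in for a file-analyzer run, the batch size of 25 is an arbitrary value inside the stated 20-30 range, and a thread pool is just one way to cap concurrency at five.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 25   # the text gives 20-30 files per batch; 25 is an arbitrary pick
MAX_WORKERS = 5   # "up to five file-analyzer instances in parallel"


def make_batches(files, size=BATCH_SIZE):
    """Split the file inventory into consecutive batches of at most `size`."""
    return [files[i:i + size] for i in range(0, len(files), size)]


def analyze_batch(batch):
    """Hypothetical stand-in for one file-analyzer run; returns a sub-graph."""
    return {"nodes": [{"id": path} for path in batch], "edges": []}


def analyze_all(files):
    """Analyze all batches with at most MAX_WORKERS running concurrently."""
    batches = make_batches(files)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # pool.map returns results in the same order as the input batches
        return list(pool.map(analyze_batch, batches))
```

With 60 files, this produces three batches (25, 25, and 10 files), each yielding one sub-graph for Phase 3 to merge.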

Phase 1-2 Flow: P0 Pre-flight → P1 Scan → P2 Analyze (5x parallel) → P3 Assemble → P4 Arch → P5 Tour → P6 Review → P7 Save
03
Phase 3: Assemble Review
Merge batch graphs with merge-batch-graphs.py

Phase 3 merges all batch sub-graphs into a single graph. The assemble-reviewer takes all batch results (batch-1.json through batch-N.json) and merges them into assembled-graph.json. In doing so it solves three problems: node deduplication, cross-batch edge resolution, and consistency checks.

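# merge-batch-graphs.py: core logic
def merge_graphs(batch_files):
    """Merge parsed batch graphs, keeping the first node seen for each ID."""
    merged = {"nodes": [], "edges": []}
    seen_ids = set()
    for batch in batch_files:  # each item: a loaded batch dict with "nodes" and "edges"
        for node in batch["nodes"]:
            if node["id"] not in seen_ids:  # skip nodes already added by an earlier batch
                merged["nodes"].append(node)
                seen_ids.add(node["id"])
        # Carry edges over as well so cross-batch edges can be resolved later
        merged["edges"].extend(batch.get("edges", []))
    return merged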
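Cross-batch edge resolution, the second of the three problems, could be handled along these lines. This is an illustrative sketch, not the script's actual logic: `resolve_edges` is a hypothetical helper, and it assumes each edge is a dict with `source` and `target` IDs plus an optional `type` key.

```python
def resolve_edges(nodes, edges):
    """Deduplicate edges and split them into resolved and dangling lists."""
    node_ids = {node["id"] for node in nodes}
    seen = set()
    resolved, dangling = [], []
    for edge in edges:
        key = (edge["source"], edge["target"], edge.get("type"))
        if key in seen:
            continue  # same edge reported by more than one batch
        seen.add(key)
        if edge["source"] in node_ids and edge["target"] in node_ids:
            resolved.append(edge)
        else:
            dangling.append(edge)  # an endpoint is missing from the merged node set
    return resolved, dangling
```

Returning dangling edges separately, rather than dropping them silently, gives the consistency check something concrete to report.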
04
Phase 4-5: Architecture & Tour
Layer assignment and learning path creation

Phase 4 dispatches the architecture-analyzer to identify 3-10 logical layers. Phase 5 dispatches the tour-builder to create an ordered learning path starting from the main entry point.

// After Phase 5: Graph structure
{
  "nodes": [/* 16 types, 50-500 nodes */],
  "edges": [/* 29 types, 100-2000 edges */],
  "layers": [{ "name": "Core" }, { "name": "API" }],
  "guided_tour": {
    "steps": [{ "order": 1, "node_id": "src/index.ts" }]
  }
}
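To make the "ordered learning path starting from the main entry point" concrete, here is one simple way to derive such an ordering. The tour-builder is an agent and its real strategy is not documented here, so the breadth-first traversal below is a swapped-in technique for illustration, and `build_tour` is a hypothetical name. The output shape follows the `guided_tour` structure shown above.

```python
from collections import deque


def build_tour(graph, entry_id):
    """Order nodes breadth-first from the entry point along outgoing edges."""
    neighbors = {}
    for edge in graph["edges"]:
        neighbors.setdefault(edge["source"], []).append(edge["target"])

    order, seen, queue = [], {entry_id}, deque([entry_id])
    while queue:
        node_id = queue.popleft()
        order.append(node_id)
        for nxt in neighbors.get(node_id, []):
            if nxt not in seen:  # visit each node once, even in cyclic graphs
                seen.add(nxt)
                queue.append(nxt)

    steps = [{"order": i, "node_id": nid} for i, nid in enumerate(order, start=1)]
    return {"steps": steps}
```

Breadth-first order visits what the entry point touches directly before going deeper, which is one plausible reading of a learning path. Nodes unreachable from the entry point are left out of the tour in this sketch.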
05
Phase 6: Review
Inline validation or LLM review

Phase 6 runs final quality validation. The graph-reviewer checks schema conformity, referential integrity, and completeness.

Validation Checks
  • Check 1: Schema Validation. Only allowed node types (16) and edge types (29)?
  • Check 2: Referential Integrity. All source/target IDs in edges reference existing nodes?
  • Check 3: Completeness. Every file from inventory has at least one node in graph?
  • Check 4: Layer Coverage. Every node assigned to exactly one layer?
  • Check 5: Tour Validation. All tour steps reference existing nodes?
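The five checks above translate fairly directly into code. The following is a sketch under assumptions rather than the graph-reviewer's real implementation: `validate_graph` is a hypothetical function, the allowed type sets are passed in because the 16 node types and 29 edge types are not listed on this page, and the per-node `type`, `file_path`, and `layer` fields are assumed field names.

```python
def validate_graph(graph, allowed_node_types, allowed_edge_types, inventory):
    """Return a list of problems; an empty list means all five checks pass."""
    problems = []
    node_ids = {n["id"] for n in graph["nodes"]}

    # Check 1: schema validation
    for n in graph["nodes"]:
        if n["type"] not in allowed_node_types:
            problems.append(f"unknown node type: {n['type']}")
    for e in graph["edges"]:
        if e["type"] not in allowed_edge_types:
            problems.append(f"unknown edge type: {e['type']}")

    # Check 2: referential integrity
    for e in graph["edges"]:
        for end in ("source", "target"):
            if e[end] not in node_ids:
                problems.append(f"edge {end} not found: {e[end]}")

    # Check 3: completeness (assumes each node records its file in "file_path")
    covered = {n.get("file_path") for n in graph["nodes"]}
    for path in inventory:
        if path not in covered:
            problems.append(f"no node for file: {path}")

    # Check 4: layer coverage (assumes each node carries a single "layer" field)
    layer_names = {layer["name"] for layer in graph["layers"]}
    for n in graph["nodes"]:
        if n.get("layer") not in layer_names:
            problems.append(f"node without valid layer: {n['id']}")

    # Check 5: tour validation
    for step in graph["guided_tour"]["steps"]:
        if step["node_id"] not in node_ids:
            problems.append(f"tour step references missing node: {step['node_id']}")

    return problems
```

Collecting every problem instead of stopping at the first lets a single review pass report everything that needs fixing.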
06
Phase 7: Save
Persist knowledge-graph.json, clean up intermediates

The final phase saves the validated graph as knowledge-graph.json in .claude-learning/. Intermediate files (batch JSONs, assembled-graph.json) are cleaned up. Optionally, an HTML dashboard is generated and opened in the browser.

# Phase 7: Save + Cleanup
save:
  steps:
    - write_knowledge_graph   # .claude-learning/knowledge-graph.json
    - generate_dashboard      # .claude-learning/dashboard.html
    - cleanup_intermediate    # batch-*.json removed
    - auto_launch_dashboard   # open in browser
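The write and cleanup steps could be sketched as follows. This is a sketch under assumptions: `save_graph` is a hypothetical function, the intermediate files are assumed to live in the same directory as the final graph, and writing through a temporary file is a defensive choice of this example rather than documented pipeline behavior. Dashboard generation and launch are left out.

```python
import json
from pathlib import Path


def save_graph(graph, out_dir):
    """Write knowledge-graph.json, then delete intermediate files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    target = out_dir / "knowledge-graph.json"
    tmp = out_dir / "knowledge-graph.json.tmp"
    tmp.write_text(json.dumps(graph, indent=2), encoding="utf-8")
    tmp.replace(target)  # swap the finished file into place in one step

    # Clean up only after the final graph is on disk
    intermediates = list(out_dir.glob("batch-*.json")) + [out_dir / "assembled-graph.json"]
    for path in intermediates:
        path.unlink(missing_ok=True)
    return target
```

Deleting the intermediates last means a failed write leaves the batch results in place for a retry.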