
Phase Control in Detail

Bootstrap, analysis, curriculum derivation, and polish algorithm — the core phases with complete pseudocode, edge cases, and decision logic

01

Phase 0 — Bootstrap Logic

Before the orchestrator starts any analysis, it needs to know where the source code lives. Phase 0 detects the source and ensures all mandatory questions are answered before a single token is spent on analysis.

Source detection — three paths:

🌐 GitHub URL → 📥 git clone
📂 Local Path → 📁 Direct Access
💬 “this project” → 📍 Use CWD

HARD BLOCK — Mandatory Questions

After source detection, Phase 0 presents a block of mandatory questions. The orchestrator must not proceed to Phase 1 until all are answered. Missing answers do not generate defaults — they trigger re-prompting. This prevents the skill from building an entire course on false assumptions.

Language(s)? — de, en, or both (determines filename suffixes and content language)
Integration mode? — standalone or embedded (determines header/footer generation)
Audiences? — Which of the three audiences should be served

Step 1: The orchestrator checks the user input against three patterns: GitHub URL (starts with https://github.com/...), local file path (absolute or relative), or a phrase like “this project”.

Step 2: Depending on the pattern, it clones, resolves the path, or uses the current working directory. On failure (private repo, non-existent path, empty directory), Phase 0 aborts with a clear error message.

Step 3: Then come the mandatory questions. The three questions (language, integration mode, audiences) are asked in a loop. No question may be skipped. There are no default values. The loop repeats until a valid answer is provided.
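
The re-prompting loop from Step 3 could be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: the names `MANDATORY_QUESTIONS`, `parse_answer`, and `ask_until_valid`, the injected `ask` callable, and the spelling of the audience labels are all assumptions.

```python
from typing import Callable, Dict, List, Set

# Allowed answers per mandatory question. The audience labels are an
# assumed spelling; the accepted values for the other two questions
# come from the question list above.
MANDATORY_QUESTIONS: Dict[str, Set[str]] = {
    "language": {"de", "en", "both"},
    "integration_mode": {"standalone", "embedded"},
    "audiences": {"executives", "users", "developers"},
}


def parse_answer(key: str, raw: str) -> List[str]:
    """Return the normalized answer, or an empty list if it is invalid."""
    items = [part.strip().lower() for part in raw.split(",") if part.strip()]
    allowed = MANDATORY_QUESTIONS[key]
    if not items or any(item not in allowed for item in items):
        return []
    # Only the audience question accepts more than one value.
    if key != "audiences" and len(items) != 1:
        return []
    return items


def ask_until_valid(ask: Callable[[str], str]) -> Dict[str, List[str]]:
    """Ask every mandatory question; re-prompt instead of defaulting."""
    answers: Dict[str, List[str]] = {}
    for key in MANDATORY_QUESTIONS:
        parsed: List[str] = []
        while not parsed:  # hard block: no default, no skipping
            parsed = parse_answer(key, ask(key))
        answers[key] = parsed
    return answers
```

Passing `ask` in as a parameter keeps the loop independent of how the question is actually shown to the user, which also makes it easy to exercise with canned replies.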

Edge case: If a user provides a GitHub URL pointing to a private repo and no token is available, the clone fails. The error message includes a hint about the missing token — no silent failure.
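
Source detection and its failure modes might look like the sketch below. The names `BootstrapError`, `Source`, `classify_source`, and `resolve_source` are hypothetical, as are the exact trigger phrases for the CWD path and the choice of a shallow clone; only the three patterns and the abort-with-message behavior come from the description above.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path


class BootstrapError(Exception):
    """Raised when Phase 0 cannot locate usable source code."""


@dataclass
class Source:
    kind: str  # "github", "local", or "cwd"
    path: Path


def classify_source(user_input: str) -> str:
    """Match the input against the three patterns from Step 1."""
    text = user_input.strip()
    if text.startswith("https://github.com/"):
        return "github"
    if text.lower() in {"this project", "this repo"}:  # assumed phrases
        return "cwd"
    return "local"


def resolve_source(user_input: str, clone_dir: Path) -> Source:
    """Clone, resolve the path, or use the CWD (Step 2)."""
    kind = classify_source(user_input)
    if kind == "github":
        # Shallow clone is an assumption made for this sketch.
        result = subprocess.run(
            ["git", "clone", "--depth", "1", user_input.strip(), str(clone_dir)],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            raise BootstrapError(
                "git clone failed. If the repository is private, "
                "check that an access token is configured.\n" + result.stderr
            )
        path = clone_dir
    elif kind == "cwd":
        path = Path.cwd()
    else:
        path = Path(user_input.strip()).expanduser().resolve()
    if not path.is_dir():
        raise BootstrapError(f"Source path does not exist: {path}")
    if not any(path.iterdir()):
        raise BootstrapError(f"Source directory is empty: {path}")
    return Source(kind, path)
```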

02

Phase 1 — Analysis Strategy

Phase 1 reads the source code and builds a theme tree with complexity ratings. The approach is top-down: README and entry points first, then progressively deeper into the structure.

Analysis order:

1. Read README + Docs: Provides project context: What does the project do? Which technologies? What is the purpose? This information frames all subsequent decisions.
2. Identify Entry Points: main.py, index.ts, App.vue, Dockerfile: the files that start the system. From here, the dependency chain is traced.
3. Map Actors and Data Flows: Who or what interacts with the system? APIs, users, databases, external services. Each actor becomes a potential theme.
4. Detect Patterns and Architecture: MVC, event-driven, microservices, monolith? The architecture pattern determines how themes are grouped and nested.
5. Rate Complexity per Theme: Each theme gets an initial complexity estimate (0–3). This determines whether the theme is a candidate for deeper levels.

Theme tree structure:

The theme tree is the central artifact of Phase 1. Each theme receives a complexity rating from 0 to 3 and a rationale:

Complexity 0: Trivial — mentioned in one sentence on the L0 overview. No dedicated module.
Complexity 1: Needs one paragraph. Candidate for an L1 module, but no deeper.
Complexity 2: Needs multiple sections. L1 + L2 candidate.
Complexity 3: Needs its own page with diagrams, code examples, and edge cases. L1 + L2 + L3 candidate.

The rationale field documents why this complexity was chosen. This is critical for traceability — when the depth map later shows that a theme got no L3, you can navigate back to the reasoning.
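
A theme node with its complexity rating and rationale could be modeled roughly like this. The class name `Theme`, the `children` field, and the `CANDIDATE_LEVELS` table are illustrative assumptions; the mapping from complexity to candidate levels follows the four ratings listed above.

```python
from dataclasses import dataclass, field
from typing import List

# Candidate levels per complexity rating. Complexity 0 yields no
# dedicated module, only a one-sentence mention on the L0 overview.
CANDIDATE_LEVELS = {0: [], 1: ["L1"], 2: ["L1", "L2"], 3: ["L1", "L2", "L3"]}


@dataclass
class Theme:
    name: str
    complexity: int  # 0-3, the initial estimate from the analysis
    rationale: str   # why this complexity was chosen
    children: List["Theme"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.complexity not in CANDIDATE_LEVELS:
            raise ValueError(f"complexity must be 0-3, got {self.complexity}")

    def candidate_levels(self) -> List[str]:
        """Levels this theme may receive a dedicated page on."""
        return list(CANDIDATE_LEVELS[self.complexity])
```

Keeping the rationale as a required field means a theme cannot be rated without recording the reasoning that the depth map later refers back to.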

Edge case: Monorepos

For monorepos with multiple packages, Phase 1 analyzes each package as a separate subtree. Themes are then grouped at the top level (e.g., “Frontend”, “Backend”, “Shared”). This prevents a monorepo from producing a flat, unstructured tree.
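
One way to picture the top-level grouping is the sketch below. The directory prefixes in `GROUP_BY_PREFIX` and the fallback group "Other" are invented for illustration; a real monorepo would need its own mapping.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping from package directory prefix to top-level group.
GROUP_BY_PREFIX = {
    "apps/web": "Frontend",
    "services": "Backend",
    "packages": "Shared",
}


def group_packages(package_dirs: List[str]) -> Dict[str, List[str]]:
    """Group package paths under top-level headings.

    Packages that match no known prefix are collected under "Other".
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for pkg in sorted(package_dirs):
        group = next(
            (name for prefix, name in GROUP_BY_PREFIX.items()
             if pkg == prefix or pkg.startswith(prefix + "/")),
            "Other",
        )
        groups[group].append(pkg)
    return dict(groups)
```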

03

Phase 2 — Curriculum per Audience

Phase 2 takes the theme tree from Phase 1 and creates separate curricula for each audience. Each audience receives only the topics and depths that are relevant to them.

Maximum depth per audience:

📊 Executives (L0 + L1): max_level = L1. No L2, no L3. Executives need an overview and a basis for decisions, not implementation details.
👤 Users (L0 + L1 + L2): max_level = L2. Get L0, L1, and selected L2 pages. No L3, which is too technical.
🔧 Developers (L0 + L1 + L2 + L3): max_level = L3. Full access. All levels including deep-dives with complete code.

Curriculum derivation — algorithm:

The algorithm works as follows:

1. Each audience has a maximum depth. Executives: L1. Users: L2. Developers: L3. Planning never exceeds this.

2. For each theme, the Helpfulness Score (HS) is calculated — specific to this audience. The same theme has different scores per audience.

3. The HS is checked against thresholds. Each level has a threshold per audience. A level is planned only if the score meets the threshold and the maximum depth allows it.

4. Each topic gets a stop reason: Why was it not planned deeper? Three possible reasons: audience maximum depth reached, score below threshold, or the topic simply lacks substance for more.

Core rule: Never plan pages that won't be built. If executives max out at L1, no L2 is planned for them — even if the HS would theoretically be high enough.
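
The four steps and the core rule can be sketched in a few lines. The threshold values in `THRESHOLDS` are invented for illustration, since the actual numbers are not given here, and the stop-reason labels other than `audience_max_level` are assumed spellings. The function name `plan_levels` is likewise hypothetical.

```python
from typing import Dict, List, Tuple

LEVELS = ["L0", "L1", "L2", "L3"]
MAX_LEVEL = {"executives": "L1", "users": "L2", "developers": "L3"}

# Hypothetical thresholds: minimum HS required to plan each level.
THRESHOLDS: Dict[str, Dict[str, int]] = {
    "executives": {"L0": 0, "L1": 5},
    "users": {"L0": 0, "L1": 4, "L2": 6},
    "developers": {"L0": 0, "L1": 3, "L2": 5, "L3": 8},
}


def plan_levels(audience: str, hs: int, complexity: int) -> Tuple[List[str], str]:
    """Return the planned levels and the reason planning stopped."""
    planned: List[str] = []
    max_index = LEVELS.index(MAX_LEVEL[audience])
    for index, level in enumerate(LEVELS):
        if index > max_index:
            # Core rule: never plan past the audience's maximum depth.
            return planned, "audience_max_level"
        if index > complexity:
            # A theme of complexity n has substance for levels up to Ln.
            return planned, "complexity_exhausted"
        if hs < THRESHOLDS[audience][level]:
            return planned, "score_below_threshold"
        planned.append(level)
    # L3 was planned and nothing deeper exists.
    return planned, "complexity_exhausted"
```

The audience limit is checked before the score, so a high HS can never push a level past the maximum depth. With these assumed thresholds and a complexity of 3, `plan_levels("executives", 9, 3)` yields L0 and L1 with stop reason `audience_max_level`.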

Example: Topic “Authentication” — three curricula

Audience | HS | Planned Levels | Stop Reason
📊 Executives | 9 | L0, L1 | audience_max_level (max=L1)
👤 Users | 7 | L0, L1, L2 | audience_max_level (max=L2)
🔧 Developers | 10 | L0, L1, L2, L3 | complexity exhausted

04

Phase 5 — Polish Algorithm

Phase 5 is the final pass. No content changes happen here, only verification, repair, and consistency enforcement. The polish algorithm is a systematic checklist.

Checks in detail:

Verify cross-level links: Every link from L1→L2 and L2→L3 must point to an existing file. Orphaned links (target not generated because HS was too low) are removed, not left dangling.
Set audience switch on L0: The L0 overview must contain links to all generated audience variants. If only developers were generated, no switch appears. If all three exist, all three are linked.
Check relative paths: L0→L1: l1/file.html. L1→L2: ../l2/file.html. L2→L3: ../l3/file.html. A wrong prefix breaks navigation.
Validate breadcrumbs: Each page must have the correct breadcrumb path. L2 pages: L0 › L1 › L2. L3 pages: L0 › L1 › L2 › L3. Breadcrumb links must point to the correct audience variant.
Ensure language versions: If both languages were selected, every DE file must have an EN counterpart and vice versa. The language switch in the nav must point to the correct partner file.
Check sibling navigation: All L2 pages of the same audience must share the same sibling nav at the bottom. The current page is highlighted, all others are linked. Missing siblings are removed from the nav.
Verify level dots: The dots in the hero must correctly display the current level. L2 page: two done, one active, one empty. Faulty dots create incorrect visual hierarchy.
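
Several of these checks reduce to small pure functions of a page's level. The sketch below assumes that L0 sits at the root and that each deeper level lives in its own sibling directory (l1/, l2/, l3/), as the path check implies; the function names are hypothetical.

```python
from typing import List

LEVELS = ["L0", "L1", "L2", "L3"]


def link_prefix(from_level: str, to_level: str) -> str:
    """Relative directory prefix for a link between two levels."""
    if from_level == to_level:
        return ""
    up = "" if from_level == "L0" else "../"
    down = "" if to_level == "L0" else to_level.lower() + "/"
    return up + down


def breadcrumb(level: str) -> List[str]:
    """Breadcrumb chain from L0 down to the current level."""
    return LEVELS[: LEVELS.index(level) + 1]


def level_dots(level: str) -> List[str]:
    """Dot states for the hero: done, active, or empty."""
    current = LEVELS.index(level)
    return [
        "done" if i < current else "active" if i == current else "empty"
        for i in range(len(LEVELS))
    ]
```

A checker can compare these expected values with what is found in the generated HTML and flag or repair any mismatch.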

The polish algorithm makes five passes:

1. Find and handle dead links: Every internal link is checked against the list of generated files. Deep-dive links to non-generated L3 pages are removed. Other dead links produce an error.

2. Inject audience switch: If multiple audiences were generated, the L0 overview gets links to all variants. If only one audience exists, no switch is shown.

3. Fix path prefixes: Links between levels must reflect the correct directory structure. A link from an L1 page to an L2 page must start with ../l2/, not l2/.

4. Synchronize breadcrumbs: The breadcrumb chain is computed from the file's position in the level tree and compared with the actual HTML.

5. Check language pairs: Every _de file needs an _en counterpart (if both languages were selected). Missing partners are reported as errors.
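
Passes 1 and 5 could be approximated as shown here. This is a sketch under several assumptions: pages are held in memory as a mapping from relative path to HTML, links are found with a simple regular expression rather than an HTML parser, L3 pages live under `l3/`, and language variants use the suffixes `_de.html` and `_en.html`. All function names are hypothetical.

```python
import posixpath
import re
from typing import Dict, List, Set, Tuple

# Matches relative .html links only; absolute URLs contain ":" and are skipped.
HREF = re.compile(r'href="([^"#:]+\.html)"')

Link = Tuple[str, str]  # (page containing the link, href as written)


def classify_dead_links(
    pages: Dict[str, str], generated: Set[str]
) -> Tuple[List[Link], List[Link]]:
    """Split dead links into removable L3 deep-dive links and errors."""
    removable: List[Link] = []
    errors: List[Link] = []
    for page, html in pages.items():
        for href in HREF.findall(html):
            # Resolve the href relative to the directory of the page.
            target = posixpath.normpath(
                posixpath.join(posixpath.dirname(page), href)
            )
            if target in generated:
                continue
            bucket = removable if target.startswith("l3/") else errors
            bucket.append((page, href))
    return removable, errors


def missing_language_partners(generated: Set[str]) -> List[str]:
    """Return the partner files that should exist but do not."""
    missing: List[str] = []
    for name in sorted(generated):
        for suffix, partner in (("_de.html", "_en.html"), ("_en.html", "_de.html")):
            if name.endswith(suffix):
                expected = name[: -len(suffix)] + partner
                if expected not in generated:
                    missing.append(expected)
    return missing
```

The language-pair check only makes sense when both languages were selected, so a caller would skip `missing_language_partners` for single-language runs.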

✏️ Knowledge Check

Phase 2 creates a curriculum for 📊 Executives. Should it include L2/L3 candidate topics?

No — max level is L1, don't plan what won't be built
Yes — the HS might be high enough, so it should plan them
Only L2, no L3 — one level deeper is still fine