
Phase Control in Detail

Bootstrap, analysis, curriculum derivation, and polish algorithm — the core phases with complete pseudocode, edge cases, and decision logic


01

Phase 0 — Bootstrap Logic

Before the orchestrator starts any analysis, it needs to know where the source code lives. Phase 0 detects the source and ensures all mandatory questions are answered before a single token is spent on analysis.

Source detection — three paths:

• GitHub URL → git clone
• Local path → direct access
• “this project” → use CWD

HARD BLOCK — Mandatory Questions

After source detection, Phase 0 presents a block of mandatory questions. The orchestrator must not proceed to Phase 1 until all are answered. Missing answers do not generate defaults — they trigger re-prompting. This prevents the skill from building an entire course on false assumptions.

Language(s)? — de, en, or both (determines filename suffixes and content language)

Integration mode? — standalone or embedded (determines header/footer generation)

Audiences? — Which of the three audiences should be served
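One way to picture the hard block: each question has a closed set of valid answers, and anything else triggers a re-prompt. The sketch below is a minimal illustration in Python, not the orchestrator's implementation. The names `VALID_ANSWERS`, `KNOWN_AUDIENCES`, `is_valid`, and `ask_until_valid` are hypothetical, the audience identifiers are assumed spellings of the three audiences, and the `ask` callback stands in for whatever prompt mechanism is actually used.

```python
from typing import Callable

# Hypothetical answer sets; the spellings are assumptions for illustration.
VALID_ANSWERS = {
    "language": {"de", "en", "both"},
    "integration_mode": {"standalone", "embedded"},
}
KNOWN_AUDIENCES = {"executives", "users", "developers"}


def is_valid(question: str, answer: str) -> bool:
    """Return True only for a non-empty answer from the allowed set."""
    answer = answer.strip().lower()
    if question == "audiences":
        # One or more audiences, comma-separated, all of them known.
        chosen = {a.strip() for a in answer.split(",") if a.strip()}
        return bool(chosen) and chosen <= KNOWN_AUDIENCES
    return answer in VALID_ANSWERS.get(question, set())


def ask_until_valid(question: str, ask: Callable[[str], str]) -> str:
    """Re-prompt until the answer is valid. No default, no skip."""
    answer = ask(question)
    while not is_valid(question, answer):
        answer = ask(question)
    return answer.strip().lower()
```

Passing Python's built-in `input` as `ask` would turn this into a console prompt that repeats until the answer is acceptable.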

// Phase 0: Source Detection
function detect_source(user_input):
    // Pattern 1: GitHub URL
    if matches(user_input, /^https?:\/\/github\.com\/[\w-]+\/[\w.-]+/):
        repo_url = extract_repo_url(user_input)
        if is_private(repo_url) and !has_token():
            // Edge case: private repos need auth
            raise "Repository is private. Please provide a token."
        local_path = git_clone(repo_url, temp_dir())
        return { type: "github", path: local_path, url: repo_url }

    // Pattern 2: "this project" / "dieses Projekt"
    // Checked before the path test: a phrase like "this project" is also a
    // syntactically valid relative path. The regex is anchored so that a path
    // merely containing "current" is not captured.
    if matches(user_input, /^\s*(dieses?\s*projekt|this\s*project|current)\s*$/i):
        cwd = get_working_directory()
        if !has_readable_files(cwd):
            raise "CWD contains no analyzable files."
        return { type: "cwd", path: cwd, url: null }

    // Pattern 3: Local path
    if is_absolute_path(user_input) or is_relative_path(user_input):
        resolved = resolve_path(user_input)
        if !exists(resolved):
            raise "Path does not exist: " + resolved
        if !has_readable_files(resolved):
            raise "No readable files found in: " + resolved
        return { type: "local", path: resolved, url: null }

    // Fallback: no recognized pattern
    raise "Source not recognized. Please provide a GitHub URL, path, or 'this project'."

// HARD BLOCK: Mandatory Questions
function mandatory_questions(source):
    answers = {}
    for question in [LANGUAGE, INTEGRATION_MODE, AUDIENCES]:
        while !is_valid(answers[question]):
            answers[question] = ask_user(question)
            if !is_valid(answers[question]):
                notify("Please answer the question.")
                // NO default. NO skip. Loop.
    return answers
            

Step 1: The orchestrator checks the user input against three patterns: GitHub URL (starts with https://github.com/...), local file path (absolute or relative), or a phrase like “this project”.

Step 2: Depending on the pattern, it clones, resolves the path, or uses the current working directory. On failure (private repo, non-existent path, empty directory), Phase 0 aborts with a clear error message.

Step 3: Then come the mandatory questions. The three questions (language, integration mode, audiences) are asked in a loop. No question may be skipped. There are no default values. The loop repeats until a valid answer is provided.

Edge case: If a user provides a GitHub URL that points to a private repo and no token is available, Phase 0 stops before cloning. The error message names the missing token as the cause, so the failure is never silent.
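The pattern matching itself can be tried in isolation. The following Python sketch only classifies the input and touches neither disk nor network. It makes two assumptions that go beyond the text: the phrase check runs before the path case (because “this project” is also a syntactically valid relative path), and any other non-empty input is treated as a path whose existence is verified later. `classify_source` is a hypothetical name.

```python
import re

GITHUB_URL = re.compile(r"^https?://github\.com/[\w-]+/[\w.-]+")
# Anchored so that a path that merely contains "current" is not captured.
CWD_PHRASE = re.compile(
    r"^\s*(dieses?\s*projekt|this\s*project|current)\s*$", re.IGNORECASE
)


def classify_source(user_input: str) -> str:
    """Return "github", "cwd", or "local" for a raw user input string."""
    if GITHUB_URL.match(user_input):
        return "github"
    # Assumption: the phrase wins over the path interpretation.
    if CWD_PHRASE.match(user_input):
        return "cwd"
    # Assumption: everything else that is non-empty is a path candidate;
    # existence and readability are checked in a later step.
    if user_input.strip():
        return "local"
    raise ValueError("Source not recognized.")
```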

02

Phase 1 — Analysis Strategy

Phase 1 reads the source code and builds a theme tree with complexity ratings. The approach is top-down: README and entry points first, then progressively deeper into the structure.

Analysis order:

1. Read README + Docs: Provides project context. What does the project do? Which technologies does it use? What is its purpose? This information frames all subsequent decisions.

2. Identify Entry Points: main.py, index.ts, App.vue, and Dockerfile are typical examples of the files that start the system. From here, the dependency chain is traced.

3. Map Actors and Data Flows: Who or what interacts with the system? APIs, users, databases, external services. Each actor becomes a potential theme.

4. Detect Patterns and Architecture: MVC, event-driven, microservices, monolith? The architecture pattern determines how themes are grouped and nested.

5. Rate Complexity per Theme: Each theme gets an initial complexity estimate (0–3). This determines whether the theme is a candidate for deeper levels.
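As an illustration of the entry-point step, the sketch below filters a file listing for well-known entry-point filenames. It is a simplified assumption about how such a step could work, not the orchestrator's actual logic: `ENTRY_POINT_NAMES` contains only the four names mentioned above, and the "shallowest first" ordering is a heuristic chosen for this example.

```python
from pathlib import PurePosixPath

# Hypothetical list; only the four filenames named in the text are included.
ENTRY_POINT_NAMES = {"main.py", "index.ts", "App.vue", "Dockerfile"}


def find_entry_points(paths: list[str]) -> list[str]:
    """Return paths whose filename is a known entry point, shallowest first."""
    hits = [p for p in paths if PurePosixPath(p).name in ENTRY_POINT_NAMES]
    # Heuristic (assumption): files closer to the repository root come first.
    return sorted(hits, key=lambda p: (len(PurePosixPath(p).parts), p))
```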

Theme tree structure:

// Theme Tree Output (Phase 1)
theme_tree = {
  project: "MyProject",
  detected_patterns: ["REST API", "React SPA", "PostgreSQL"],
  themes: [
    {
      id: "auth",
      name: "Authentication & Authorization",
      complexity: 3,  // needs own page with diagrams
      sub_themes: ["OAuth2 Flow", "JWT Handling", "Role-Based Access"],
      l1_candidate: true,
      l2_candidate: true,
      l3_candidate: true,
      rationale: "Multiple flows, edge cases, security aspects"
    },
    {
      id: "api-design",
      name: "API Design & Endpoints",
      complexity: 2,  // needs multiple sections
      sub_themes: ["REST Conventions", "Error Responses", "Pagination"],
      l1_candidate: true,
      l2_candidate: true,
      l3_candidate: false,
      rationale: "Patterns explainable, but not as deep as auth"
    },
    {
      id: "setup",
      name: "Project Setup",
      complexity: 1,  // one paragraph suffices
      sub_themes: ["Installation", "Env Variables"],
      l1_candidate: true,
      l2_candidate: false,
      l3_candidate: false,
      rationale: "Simple instructions, no deep explanation needed"
    },
    {
      id: "license",
      name: "License",
      complexity: 0,  // one sentence
      sub_themes: [],
      l1_candidate: false,
      l2_candidate: false,
      l3_candidate: false,
      rationale: "Trivial, mentioned on L0"
    }
  ]
}
            

The theme tree is the central artifact of Phase 1. Each theme receives one of four complexity ratings:

• Complexity 0: Trivial — mentioned in one sentence on the L0 overview. No dedicated module.
• Complexity 1: Needs one paragraph. Candidate for an L1 module, but no deeper.
• Complexity 2: Needs multiple sections. L1 + L2 candidate.
• Complexity 3: Needs its own page with diagrams, code examples, and edge cases. L1 + L2 + L3 candidate.

The rationale field documents why this complexity was chosen. This is critical for traceability — when the depth map later shows that a theme got no L3, you can navigate back to the reasoning.
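In the example tree, the candidate flags follow directly from the rating. The short Python sketch below assumes that this is always the case, which matches the four example themes but is not stated as a rule in the text. `level_candidates` is a hypothetical helper name.

```python
def level_candidates(complexity: int) -> dict[str, bool]:
    """Derive the l1/l2/l3 candidate flags from a complexity rating of 0 to 3."""
    if not 0 <= complexity <= 3:
        raise ValueError(f"complexity must be 0..3, got {complexity}")
    return {
        "l1_candidate": complexity >= 1,  # one paragraph or more
        "l2_candidate": complexity >= 2,  # multiple sections
        "l3_candidate": complexity >= 3,  # own page with diagrams and code
    }
```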

Edge case: Monorepos

For monorepos with multiple packages, Phase 1 analyzes each package as a separate subtree. Themes are then grouped at the top level (e.g., “Frontend”, “Backend”, “Shared”). This prevents a monorepo from producing a flat, unstructured tree.
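The grouping step can be sketched as follows. The `package` field is a hypothetical addition for this example; the theme tree shown above does not contain it, and the real grouping criteria may differ.

```python
from collections import defaultdict


def group_by_package(themes: list[dict]) -> dict[str, list[str]]:
    """Group theme ids under the top-level package they were found in."""
    groups: dict[str, list[str]] = defaultdict(list)
    for theme in themes:
        # "package" is an assumed field, e.g. "Frontend", "Backend", "Shared".
        groups[theme["package"]].append(theme["id"])
    return dict(groups)
```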

03

Phase 2 — Curriculum per Audience

Phase 2 takes the theme tree from Phase 1 and creates separate curricula for each audience. Each audience receives only the topics and depths that are relevant to them.

Maximum depth per audience:

• 📊 Executives (L0 + L1): max_level = L1. No L2, no L3. Executives need an overview and a basis for decisions, not implementation details.
• 👤 Users (L0 + L1 + L2): max_level = L2. Users get L0, L1, and selected L2 pages. No L3, which is too technical for this audience.
• 🔧 Developers (L0 + L1 + L2 + L3): max_level = L3. Full access to all levels, including deep-dives with complete code.

Curriculum derivation — algorithm:

// Phase 2: Curriculum per Audience
function derive_curriculum(theme_tree, audience):
    max_level = get_max_level(audience)
    // Executives: 1, Users: 2, Developers: 3

    curriculum = { audience: audience, topics: [] }

    for theme in theme_tree.themes:
        // Calculate Helpfulness Score for this audience
        hs = calculate_hs(theme, audience)

        // Determine planned levels
        planned_levels = []

        // L0 mention: if HS >= 1
        if hs >= 1:
            planned_levels.push("L0_mention")

        // L1 own module: if HS >= threshold_l1
        if hs >= THRESHOLD_L1[audience] and max_level >= 1:
            planned_levels.push("L1")

        // L2 own page: if HS >= threshold_l2
        if hs >= THRESHOLD_L2[audience] and max_level >= 2:
            planned_levels.push("L2")

        // L3 deep-dive: if HS >= threshold_l3
        if hs >= THRESHOLD_L3[audience] and max_level >= 3:
            planned_levels.push("L3")

        // CRITICAL: Do NOT plan beyond max_level!
        // Executives NEVER get L2/L3, regardless of HS.

        if planned_levels.length > 0:
            curriculum.topics.push({
                theme: theme,
                hs: hs,
                levels: planned_levels,
                stop_reason: determine_stop(hs, max_level, theme, audience, planned_levels)
            })

    return curriculum

// Stop reasons (for transparency in the depth map)
function determine_stop(hs, max_level, theme, audience, planned_levels):
    // next_level is the first level that was NOT planned:
    // 1 if only the L0 mention exists, otherwise deepest planned level + 1
    next_level = deepest_level(planned_levels) + 1
    if next_level > max_level:
        return "audience_max_level"  // e.g., executives can't go deeper than L1
    // threshold_for(level, audience) looks up THRESHOLD_L1/L2/L3[audience]
    if hs < threshold_for(next_level, audience):
        return "hs_below_threshold"  // score too low for next depth
    if theme.complexity < next_level:
        return "complexity_insufficient"  // topic lacks substance for more
    return null  // no stop condition applies
            

The algorithm works as follows:

1. Each audience has a maximum depth. Executives: L1. Users: L2. Developers: L3. Planning never exceeds this.

2. For each theme, the Helpfulness Score (HS) is calculated — specific to this audience. The same theme has different scores per audience.

3. The HS is checked against thresholds. Each level has a threshold per audience. A level is planned only if the score meets the threshold and the maximum depth allows it.

4. Each topic gets a stop reason: Why was it not planned deeper? Three possible reasons: audience maximum depth reached, score below threshold, or the topic simply lacks substance for more.

Core rule: Never plan pages that won't be built. If executives max out at L1, no L2 is planned for them — even if the HS would theoretically be high enough.

Example: Topic “Authentication” — three curricula

Audience	HS	Planned Levels	Stop Reason
📊 Executives	9	L0, L1	audience_max_level (max=L1)
👤 Users	7	L0, L1, L2	audience_max_level (max=L2)
🔧 Developers	10	L0, L1, L2, L3	audience_max_level (max=L3)
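The Planned Levels column of this example can be reproduced with a few lines of Python. The threshold values below are invented for illustration, since the text does not state the real ones, and they are shared across audiences to keep the sketch small (the pseudocode uses a separate threshold per audience). `MAX_LEVEL`, `THRESHOLDS`, and `planned_levels` are hypothetical names.

```python
MAX_LEVEL = {"executives": 1, "users": 2, "developers": 3}

# Hypothetical thresholds: minimum HS required for L1, L2, and L3.
THRESHOLDS = {1: 3, 2: 5, 3: 8}


def planned_levels(hs: int, audience: str) -> list[str]:
    """Return the levels planned for one theme and one audience."""
    levels = []
    if hs >= 1:
        levels.append("L0")  # L0 mention
    for level in (1, 2, 3):
        # Both gates must pass: audience cap and score threshold.
        if level <= MAX_LEVEL[audience] and hs >= THRESHOLDS[level]:
            levels.append(f"L{level}")
    return levels


for audience, hs in [("executives", 9), ("users", 7), ("developers", 10)]:
    print(audience, hs, planned_levels(hs, audience))
```

Under these assumed thresholds, the executive curriculum stops at L1 even though a score of 9 would clear every threshold, which is the core rule in action.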

04

Phase 5 — Polish Algorithm

Phase 5 is the final pass. It adds no new content; it only verifies, repairs, and enforces consistency across the generated files. The polish algorithm works through the following checklist.

Checks in detail:

✓ Verify cross-level links: Every link from L1→L2 and L2→L3 must point to an existing file. Orphaned links (target not generated because HS was too low) are removed, not left dangling.

✓ Set audience switch on L0: The L0 overview must contain links to all generated audience variants. If only developers were generated, no switch appears. If all three exist, all three are linked.

✓ Check relative paths: L0→L1: l1/file.html. L1→L2: ../l2/file.html. L2→L3: ../l3/file.html. A wrong prefix breaks navigation.

✓ Validate breadcrumbs: Each page must have the correct breadcrumb path. L2 pages: L0 › L1 › L2. L3 pages: L0 › L1 › L2 › L3. Breadcrumb links must point to the correct audience variant.

✓ Ensure language versions: If both languages were selected, every DE file must have an EN counterpart and vice versa. The language switch in the nav must point to the correct partner file.

✓ Check sibling navigation: All L2 pages of the same audience must share the same sibling nav at the bottom. The current page is highlighted, all others are linked. Missing siblings are removed from the nav.

✓ Verify level dots: The dots in the hero must correctly display the current level. L2 page: two done, one active, one empty. Faulty dots create incorrect visual hierarchy.
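Three of these checks reduce to small pure functions. The Python sketch below assumes a directory layout in which L0 pages sit at the root and Ln pages in a folder named ln/, which is what the listed paths suggest. The upward and same-level prefixes are assumptions, as are all function names.

```python
def expected_prefix(from_level: int, to_level: int) -> str:
    """Relative href prefix for a link between two levels."""
    if from_level == to_level:
        return ""  # assumption: siblings share one directory
    if from_level == 0:
        return f"l{to_level}/"  # L0 -> L1: "l1/"
    if to_level == 0:
        return "../"  # assumption: L0 lives at the root
    return f"../l{to_level}/"  # L1 -> L2: "../l2/", L2 -> L3: "../l3/"


def breadcrumb_chain(level: int) -> list[str]:
    """Breadcrumb labels from L0 down to the current level."""
    return [f"L{n}" for n in range(level + 1)]


def level_dots(level: int, deepest: int = 3) -> list[str]:
    """Dot states for the hero: completed levels, current level, remaining levels."""
    return ["done"] * level + ["active"] + ["empty"] * (deepest - level)
```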

// Phase 5: Polish
function polish(generated_files):
    errors = []
    fixes = 0  // number of automatic repairs

    // 1. Cross-level links
    for file in generated_files:
        for link in extract_internal_links(file):
            target = resolve_relative(file.path, link.href)
            if !exists_in(generated_files, target):
                if link.is_deep_dive:
                    remove_link(file, link)  // L3 not generated? Remove link
                    log("Removed dead deep-dive link: " + link.href)
                    fixes += 1
                else:
                    errors.push({ file: file, link: link, type: "dead_link" })

    // 2. Audience switch on L0
    audiences_present = get_audiences(generated_files)
    l0_files = filter(generated_files, level == 0)
    for l0 in l0_files:
        if audiences_present.length > 1:
            inject_audience_switch(l0, audiences_present)

    // 3. Relative paths
    for file in generated_files:
        for link in extract_internal_links(file):
            // The prefix depends on each link's target level,
            // so it is computed inside the loop.
            // L0→L1: "l1/", L1→L2: "../l2/", L2→L3: "../l3/"
            expected_prefix = get_prefix(file.level, link.target_level)
            if !link.href.startsWith(expected_prefix):
                fix_prefix(file, link, expected_prefix)
                log("Fixed path prefix: " + link.href)
                fixes += 1

    // 4. Breadcrumbs
    for file in generated_files:
        expected_crumbs = build_breadcrumb_chain(file)
        actual_crumbs = parse_breadcrumbs(file)
        if expected_crumbs != actual_crumbs:
            replace_breadcrumbs(file, expected_crumbs)
            fixes += 1

    // 5. Language pairs
    for file in generated_files:
        partner = get_language_partner(file)  // _de ↔ _en
        if partner and !exists_in(generated_files, partner):
            errors.push({ file: file, type: "missing_lang_partner" })

    return { fixed: fixes, errors: errors }
            

The polish algorithm makes five passes:

1. Find and handle dead links: Every internal link is checked against the list of generated files. Deep-dive links to non-generated L3 pages are removed. Other dead links produce an error.

2. Inject audience switch: If multiple audiences were generated, the L0 overview gets links to all variants. If only one audience exists, no switch is shown.

3. Fix path prefixes: Links between levels must reflect the correct directory structure. A link from an L1 page to an L2 page must start with ../l2/, not l2/.

4. Synchronize breadcrumbs: The breadcrumb chain is computed from the file's position in the level tree and compared with the actual HTML.

5. Check language pairs: Every _de file needs an _en counterpart (if both languages were selected). Missing partners are reported as errors.
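Passes 1 and 5 can be tried on an in-memory file set. The sketch below is a simplified Python illustration under stated assumptions: files are represented as a mapping from path to the internal hrefs they contain, and the language suffix sits directly before the extension (for example `auth_de.html`). Both representations and all function names are hypothetical.

```python
import posixpath
from typing import Optional


def resolve_relative(file_path: str, href: str) -> str:
    """Resolve an href against the directory of the file that contains it."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(file_path), href))


def find_dead_links(links_by_file: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (file, href) pairs whose target is not among the generated files."""
    generated = set(links_by_file)
    return [
        (path, href)
        for path, hrefs in links_by_file.items()
        for href in hrefs
        if resolve_relative(path, href) not in generated
    ]


def language_partner(path: str) -> Optional[str]:
    """Swap an assumed _de/_en suffix; return None if the file has neither."""
    stem, ext = posixpath.splitext(path)
    if stem.endswith("_de"):
        return stem[:-3] + "_en" + ext
    if stem.endswith("_en"):
        return stem[:-3] + "_de" + ext
    return None
```

A dead link found this way would then be removed if it is a deep-dive link, or reported as an error otherwise.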

✏️ Knowledge Check

Phase 2 creates a curriculum for 📊 Executives. Should it include L2/L3 candidate topics?

No — max level is L1, don't plan what won't be built

Yes — the HS might be high enough, so it should plan them

Only L2, no L3 — one level deeper is still fine
