AI Content Generation - Chris-Cullins/wiki_bot GitHub Wiki

# AI Content Generation

## Overview

The AI Content Generation area is responsible for transforming repository structure and source code into comprehensive wiki documentation. It orchestrates the entire documentation generation process by interfacing with Claude AI (via multiple provider options), processing responses, and ensuring output quality through validation and templating.

**Core Responsibilities:**

- Generate wiki pages (Home, Architecture, Area-specific docs)
- Manage AI query execution across different LLM providers
- Process and normalize AI responses into structured Markdown
- Apply templates and ensure content quality
- Handle incremental documentation updates

## Key Components

### WikiGenerator (`src/wiki-generator.ts`)

The central orchestrator for all documentation generation. Manages the complete lifecycle of AI-powered wiki creation, from prompting to response processing to template application.

**Key capabilities:**

- Multi-page generation (home, architecture, areas)
- Response collection from streaming/non-streaming APIs
- Content validation and normalization
- Architectural area extraction and file identification
- Integration with template rendering system

### PromptLoader (`src/prompt-loader.ts`)

Simple utility for loading and hydrating prompt templates from markdown files.

**Responsibilities:**

- Load prompt files from `src/prompts/`
- Variable interpolation using `{{variableName}}` syntax
- Clean separation of prompt content from generation logic
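
The `{{variableName}}` interpolation can be sketched as follows. This is a minimal illustration, not the project's implementation: `hydratePrompt` and `escapeRegExp` are hypothetical names, and leaving unknown placeholders untouched is an assumption.

```typescript
// Hypothetical helper: escape regex metacharacters so a variable name is matched literally.
// `$&` in the replacement string re-inserts the matched character after the backslash.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace every {{name}} placeholder with its value.
// Assumption: placeholders without a matching variable are left as-is.
function hydratePrompt(template: string, variables: Record<string, string>): string {
  let result = template;
  for (const name of Object.keys(variables)) {
    const value = variables[name];
    const pattern = new RegExp(`\\{\\{\\s*${escapeRegExp(name)}\\s*\\}\\}`, 'g');
    // A replacer function inserts the value verbatim, so `$&` or `$1`
    // sequences inside source code are not interpreted by String.replace.
    result = result.replace(pattern, () => value);
  }
  return result;
}
```

Passing a replacer function rather than a plain string matters when the injected value is source code, which may itself contain `$` sequences.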

### TemplateRenderer (`src/template-renderer.ts`)

Post-processing engine that applies customizable templates to AI-generated content.

**Features:**

- Multi-directory template search (custom + default)
- Variant-based template selection
- Template caching for performance
- Variable interpolation in templates

### QueryFactory (`src/query-factory.ts`)

Abstraction layer for LLM provider selection and query execution.

**Supported Providers:**

- `agent-sdk`: Anthropic's official Claude Agent SDK
- `claude-cli`: Claude CLI tool with system prompt injection
- `codex-cli`: Codex CLI with JSON response parsing
- `mock`: Test mode with synthetic responses

### MockAgentSDK (`src/mock-agent-sdk.ts`)

Testing utility that simulates AI responses without API calls.

**Capabilities:**

- Context-aware mock responses based on prompt content
- JSON/Markdown response type handling
- Supports all generation workflow types

## How It Works

### Documentation Generation Flow

1. **Initialization**
   ```
   const generator = new WikiGenerator(queryFn, config, logger);
   ```
   - Query function selected based on `llmProvider` config
   - Debug logging configured
   - Template renderer initialized

2. **Page Generation**

   User Request → Load Prompt → Inject Variables → Execute Query →
   Collect Response → Strip Wrappers → Normalize Content →
   Apply Template → Validate Output

3. **Response Processing Pipeline**
   - `collectResponseText()`: Streams events and extracts text from various message types
   - `stripFenceWrappers()`: Removes markdown code fences that LLMs often add
   - `stripLeadingCommentary()`: Removes meta-commentary before actual content
   - `ensureHeading()`: Guarantees proper H1 heading with title
   - `hasMeaningfulBody()`: Validates content substance

4. **Quality Gates**
   - Meta-description detection: Filters out AI responses that describe what they're doing rather than actual content
   - Meaningful body check: Ensures pages have substantive content beyond just headings
   - Fallback to existing: Preserves prior valid content when regeneration fails
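
Three of the processing steps above can be sketched as small pure functions. These are illustrative versions under stated assumptions, not the code in `src/wiki-generator.ts`: the regexes, the 40-character threshold, and the choice to keep an existing H1 are all assumptions.

```typescript
// Remove one outer ```lang ... ``` wrapper when the whole response is fenced.
// Limitation of this sketch: a document that merely starts and ends with
// separate code blocks would also be unwrapped.
function stripFenceWrappers(content: string): string {
  const match = content.trim().match(/^```[\w-]*[ \t]*\r?\n([\s\S]*?)\r?\n```$/);
  return match ? match[1].trim() : content.trim();
}

// Prepend "# title" unless the content already starts with an H1.
function ensureHeading(content: string, title: string): string {
  const trimmed = content.trim();
  return /^#\s+\S/.test(trimmed) ? trimmed : `# ${title}\n\n${trimmed}`;
}

// Treat a page as meaningful when the text outside headings is long enough.
// The threshold of 40 characters is an arbitrary assumption.
function hasMeaningfulBody(content: string, title: string): boolean {
  const body = content
    .split(/\r?\n/)
    .filter((line) => !/^#{1,6}\s/.test(line)) // drop heading lines
    .join(' ')
    .replace(title, '')
    .trim();
  return body.length >= 40;
}
```

Keeping each step a pure string-to-string function makes the pipeline easy to test in isolation and to reorder.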

### Architectural Area Discovery

Architecture Overview → Extract Areas (JSON) → For Each Area:
  → Identify Relevant Files (JSON) → Read File Contents →
  Generate Area Documentation

The system uses a two-pass approach:

1. Extract area names from the architecture overview (JSON response)
2. For each area, identify relevant files from the full file list (JSON response)

### Provider Abstraction

The `QueryFunction` type provides a unified interface:

```typescript
type QueryFunction = (params: { prompt: string; options?: any }) => Query;
```

Each provider implementation:

- `agent-sdk`: Direct async iteration over SDK messages
- `claude-cli`: Spawns process, captures stdout, wraps in iterator
- `codex-cli`: Spawns process, parses JSON lines, extracts `agent_message` types
- `mock`: Generates contextual responses based on prompt patterns
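
To show what "wraps in iterator" could look like, here is a minimal sketch of a provider built from any function that maps a prompt to text. The `AssistantMessage` shape, `fromTextProducer`, `echoProvider`, and `collect` are hypothetical; the real `Query` type carries richer SDK messages.

```typescript
// Assumed, simplified message shape for illustration only.
type AssistantMessage = { type: 'assistant'; content: string };
type Query = AsyncIterable<AssistantMessage>;
type QueryFunction = (params: { prompt: string; options?: any }) => Query;

// Wrap a "prompt in, text out" function as a QueryFunction that yields one message.
function fromTextProducer(run: (prompt: string) => Promise<string>): QueryFunction {
  return ({ prompt }) => ({
    async *[Symbol.asyncIterator]() {
      yield { type: 'assistant' as const, content: await run(prompt) };
    },
  });
}

// Hypothetical stand-in for a CLI provider, which would instead spawn a
// process and resolve with its captured stdout.
const echoProvider = fromTextProducer(async (prompt) => `# Echo\n\n${prompt}`);

// Consumers iterate the query without knowing which provider produced it.
async function collect(query: Query): Promise<string> {
  let text = '';
  for await (const message of query) text += message.content;
  return text;
}
```

Because every provider returns the same async-iterable shape, the generator code never needs to branch on the provider name.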

## Important Functions/Classes

### WikiGenerator Core Methods

#### `generateHomePage(repoStructure, existingDoc?)`

Creates the wiki home page with project overview, features, and getting started guide.

**Process:**

1. Selects the `generate-home-page` or `update-home-page` prompt
2. Injects repository structure and root path
3. Ensures proper heading and meaningful content
4. Applies the `home` template
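
The choice between the `generate-*` and `update-*` prompts might be expressed as below. This is a sketch: `selectPromptName` is a hypothetical helper, and requiring both the `incrementalDocs` flag and a non-empty existing document is an assumption about how the two inputs combine.

```typescript
interface IncrementalConfig {
  incrementalDocs?: boolean;
}

// Pick "update-<page>" only when incremental mode is on and there is prior content.
function selectPromptName(
  pageType: string,
  config: IncrementalConfig,
  existingDoc?: string,
): string {
  const canUpdate =
    config.incrementalDocs === true && !!existingDoc && existingDoc.trim().length > 0;
  return `${canUpdate ? 'update' : 'generate'}-${pageType}`;
}
```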

#### `generateArchitecturalOverview(repoStructure, existingDoc?)`

Produces structured architecture documentation with sections, diagram, and areas.

**Special handling:**

- `ensureArchitectureOutline()`: Enforces standardized section structure
- Extracts/normalizes Mermaid diagrams
- Adds TODO placeholders for missing sections
- Guarantees required sections: Summary, Pattern, Directories, Areas, Interactions, Data Flow, Diagram
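
Adding TODO placeholders for missing sections could work roughly like this. The sketch assumes the required sections appear as `##` headings with exactly these names; `appendMissingSections` is a hypothetical name and the placeholder wording is invented for illustration.

```typescript
const REQUIRED_SECTIONS = [
  'Summary', 'Pattern', 'Directories', 'Areas', 'Interactions', 'Data Flow', 'Diagram',
];

// Append a TODO stub for every required H2 section that is not already present.
function appendMissingSections(markdown: string): string {
  const present = new Set<string>();
  for (const line of markdown.split(/\r?\n/)) {
    const heading = line.match(/^##\s+(.+?)\s*$/);
    if (heading) present.add(heading[1].toLowerCase());
  }
  const placeholders = REQUIRED_SECTIONS
    .filter((name) => !present.has(name.toLowerCase()))
    .map((name) => `## ${name}\n\nTODO: document ${name.toLowerCase()}.`);
  return [markdown.trim(), ...placeholders].join('\n\n');
}
```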

#### `extractArchitecturalAreas(architecturalOverview)`

Parses JSON array of area names from architecture content.

**Returns:** `string[]` of area names, or an empty array on parse failure
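
The "empty array on parse failure" contract can be illustrated with a defensive parser. This is a sketch with a hypothetical name (`parseAreaNames`); tolerating prose or a code fence around the JSON array is an assumption about typical model output.

```typescript
// Extract a string[] from a model response, returning [] on any failure.
function parseAreaNames(responseText: string): string[] {
  // Assumption: take the outermost [...] span so surrounding prose or fences are ignored.
  const start = responseText.indexOf('[');
  const end = responseText.lastIndexOf(']');
  if (start === -1 || end <= start) return [];
  try {
    const parsed: unknown = JSON.parse(responseText.slice(start, end + 1));
    if (!Array.isArray(parsed)) return [];
    return parsed
      .filter((item): item is string => typeof item === 'string')
      .map((name) => name.trim())
      .filter((name) => name.length > 0);
  } catch {
    return []; // malformed JSON is treated the same as "no areas found"
  }
}
```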

#### `identifyRelevantFiles(area, allFiles, repoStructure)`

Determines which files belong to a specific architectural area.

**Validation:**

- Filters non-existent paths
- Deduplicates results
- Logs warnings for invalid suggestions
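
The three validation steps can be sketched in one pass. `validateSuggestedFiles` is a hypothetical helper, and checking suggestions against the known file list (rather than the file system) is an assumption.

```typescript
// Keep only suggestions that appear in allFiles, drop duplicates, warn about the rest.
function validateSuggestedFiles(
  suggested: string[],
  allFiles: string[],
  warn: (message: string) => void = console.warn,
): string[] {
  const known = new Set(allFiles);
  const accepted = new Set<string>();
  for (const raw of suggested) {
    const path = raw.trim();
    if (!known.has(path)) {
      warn(`Ignoring unknown file suggested by the model: ${path}`);
      continue;
    }
    accepted.add(path); // a Set keeps first-seen order and removes duplicates
  }
  return Array.from(accepted);
}
```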

#### `generateAreaDocumentation(area, relevantFiles, existingDoc?)`

Creates detailed documentation for a single architectural area.

**Features:**

- Reads all relevant file contents
- Formats each file as a `--- filepath ---\ncontent` block
- Applies depth instruction
- Validates against meta-description patterns
- Supports variant-specific templates via `slugify(area)`
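
The file-block format and the slug used for template variants might look like this. Both functions are sketches: the blank line between blocks and the exact slug rules (lowercase, hyphen-separated) are assumptions, chosen so that "Config Management" maps to `config-management`.

```typescript
interface SourceFile {
  path: string;
  content: string;
}

// Join files into "--- filepath ---\ncontent" blocks separated by blank lines.
function formatFileBlocks(files: SourceFile[]): string {
  return files.map((file) => `--- ${file.path} ---\n${file.content}`).join('\n\n');
}

// Assumed slug rules: lowercase, runs of non-alphanumerics become one hyphen,
// leading and trailing hyphens are trimmed.
function slugify(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}
```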

### Response Processing

#### `collectResponseText(query)`

Unified response collector supporting multiple message formats:

- Stream events (`content_block_delta`, `content_block_start`)
- SDK assistant messages with content blocks
- Mock assistant messages (simple string content)

**Priority:** stream > SDK message > mock
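
A collector that honors this priority could be sketched as follows. The message shapes are simplified assumptions modeled on the three formats listed above, not the SDK's actual type definitions, and only `content_block_delta` events are handled here.

```typescript
// Assumed, simplified message shapes for illustration.
type TextBlock = { type: 'text'; text: string };
type StreamEvent = {
  type: 'stream_event';
  event: { type: string; delta?: { type: string; text?: string } };
};
type SdkAssistant = { type: 'assistant'; message: { content: Array<TextBlock | { type: string }> } };
type MockAssistant = { type: 'assistant'; content: string };
type Message = StreamEvent | SdkAssistant | MockAssistant;

function isTextBlock(block: { type: string }): block is TextBlock {
  return block.type === 'text' && typeof (block as TextBlock).text === 'string';
}

// Accumulate each format separately, then return the highest-priority non-empty one.
async function collectResponseText(query: AsyncIterable<Message>): Promise<string> {
  let streamed = '';
  let sdkText = '';
  let mockText = '';
  for await (const message of query) {
    if (message.type === 'stream_event') {
      const delta = message.event.delta;
      if (message.event.type === 'content_block_delta' && delta?.type === 'text_delta') {
        streamed += delta.text ?? '';
      }
    } else if ('message' in message) {
      for (const block of message.message.content) {
        if (isTextBlock(block)) sdkText += block.text;
      }
    } else {
      mockText += message.content;
    }
  }
  return streamed || sdkText || mockText; // stream > SDK message > mock
}
```

Collecting into separate buffers avoids double-counting when a provider emits both partial stream events and a final assistant message for the same text.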

#### `stripFenceWrappers(content)`

Removes markdown code fences while preserving language hints:

````markdown
```markdown
# Content
```
````

→ Returns: `# Content`

#### `isMetaDescription(content)`
Detects when AI describes what it's doing rather than providing actual documentation.

**Trigger patterns:**
- "I've created/provided/assembled this documentation..."
- "This documentation includes/covers/contains..."
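
The trigger patterns above translate naturally into a small regex check. This is a sketch, not the project's detector: the exact regexes and the decision to inspect only the first non-heading line are assumptions.

```typescript
const META_PATTERNS: RegExp[] = [
  /^I(?:'ve| have) (?:created|provided|assembled)\b/i,
  /^This documentation (?:includes|covers|contains)\b/i,
];

// Returns true when the first line of body text reads like a description of the
// response rather than documentation content.
function isMetaDescription(content: string): boolean {
  const firstLine = content
    .split(/\r?\n/)
    .map((line) => line.trim())
    .find((line) => line.length > 0 && !line.startsWith('#'));
  return firstLine !== undefined && META_PATTERNS.some((pattern) => pattern.test(firstLine));
}
```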

### Template System

#### `TemplateRenderer.render(templateName, context, options?)`
Applies templates with variant support:

**Search order:**
1. `{variantSubdir}/{variant}.md` (if both specified)
2. `{templateName}-{variant}.md` (if variant)
3. `{variant}.md` (if variant)
4. `{templateName}.md` (base)

**Context variables** replaced via `{{key}}` syntax
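
The search order can be expressed as a function that lists candidate file names in priority order. `templateCandidates` and the `RenderOptions` shape are hypothetical; only the ordering is taken from the list above.

```typescript
interface RenderOptions {
  variant?: string;
  variantSubdir?: string;
}

// Build the candidate template paths, most specific first.
function templateCandidates(templateName: string, options: RenderOptions = {}): string[] {
  const { variant, variantSubdir } = options;
  const candidates: string[] = [];
  if (variant && variantSubdir) candidates.push(`${variantSubdir}/${variant}.md`);
  if (variant) candidates.push(`${templateName}-${variant}.md`, `${variant}.md`);
  candidates.push(`${templateName}.md`); // base template is always the last resort
  return candidates;
}
```

A renderer would then try each candidate in every template directory (custom first, then default) and use the first file that exists.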

## Developer Notes

### Critical Gotchas

1. **Escaping Bug in Template Regex**
   ```typescript
   // CURRENT (WRONG):
   return value.replace(/[.*+?^${}()|[\]\\]/g, '\\{{fileContentText}}');

   // SHOULD BE:
   return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
   ```
   The literal `{{fileContentText}}` replacement breaks regex escaping: every special character is replaced by that placeholder text instead of being prefixed with a backslash. `$&` inserts the matched character.

2. **Provider Command Errors**
   - CLI providers (`claude-cli`, `codex-cli`) require external binaries
   - `ENOENT` errors indicate missing installation
   - Usage limit detection only works for stdout messages

3. **Response Type Detection**
   - Order matters: check mock → stream → SDK messages
   - Missing type guards cause silent failures
   - `isTextBlock()` and `isTextDelta()` validate content structure

4. **Incremental Updates**
   - Controlled by the `config.incrementalDocs` flag
   - Switches between `generate-*` and `update-*` prompts
   - Falls back to existing content when regeneration produces empty/meta results

### Best Practices

**When adding new page types:**

1. Create prompt in `src/prompts/{action}-{pagetype}.md`
2. Add generation method to `WikiGenerator`
3. Implement validation (meaningful body check)
4. Create template in `src/templates/{pagetype}.md`
5. Add fallback behavior for empty responses

**When supporting new providers:**

1. Add provider type to `Config.llmProvider`
2. Implement in `createQueryFunction()`
3. Return async iterator matching `Query` type
4. Handle both streaming and complete responses
5. Add error handling for command/API failures

**Quality validation flow:**

```typescript
const raw = await collectResponseText(query);
const stripped = stripFenceWrappers(raw);
const withHeading = ensureHeading(stripped, title);

if (isMetaDescription(withHeading) || !hasMeaningfulBody(withHeading, title)) {
  return existingDoc || fallback;
}

return await templates.render(templateType, { content: withHeading });
```

### Configuration Interaction

- `config.debug`: Enables detailed logging
- `config.promptLoggingEnabled`: Writes prompts/responses to disk
- `config.documentationDepth`: Affects `getDepthInstruction()` output
- `config.templateDir`: Custom template directory (falls back to defaults)
- `config.testMode`: Activates mock provider

## Usage Examples

### Basic Generation

```typescript
import { WikiGenerator } from './wiki-generator.js';
import { createQueryFunction } from './query-factory.js';

const queryFn = createQueryFunction(config, repoPath);
const generator = new WikiGenerator(queryFn, config);

// Generate home page
const homePage = await generator.generateHomePage(repoStructure);

// Generate architecture
const architecture = await generator.generateArchitecturalOverview(repoStructure);

// Extract and document areas
const areas = await generator.extractArchitecturalAreas(architecture);
for (const area of areas) {
  const files = await generator.identifyRelevantFiles(area, allFiles, repoStructure);
  const doc = await generator.generateAreaDocumentation(area, files);
}
```

### With Incremental Updates

```typescript
const config = {
  incrementalDocs: true,
  // ... other config
};

const generator = new WikiGenerator(queryFn, config);

// Updates existing content or generates new
const updatedHome = await generator.generateHomePage(
  repoStructure,
  existingHomePage // Pass existing content
);
```

### Using Different Providers

```typescript
// Agent SDK (default)
const sdkQuery = createQueryFunction({ llmProvider: 'agent-sdk' }, repoPath);

// Claude CLI
const cliQuery = createQueryFunction({ llmProvider: 'claude-cli' }, repoPath);

// Test mode
const mockQuery = createQueryFunction({ testMode: true }, repoPath);
```

### Custom Templates

```typescript
const config = {
  templateDir: '/path/to/custom/templates',
  // ...
};

// Will search: custom dir → default dir
// Supports variants: area-config-management.md, config-management.md, area.md
```