Git Wiki Publishing - Chris-Cullins/wiki_bot GitHub Wiki
Git/Wiki Publishing
Overview
The Git/Wiki Publishing area handles all aspects of generating documentation content from repository analysis and persisting it to GitHub Wiki repositories. This system orchestrates the complete documentation pipeline: crawling the repository structure, generating wiki pages via LLM prompts, managing git operations for the wiki repository, and ensuring changes are committed and pushed correctly.
Key Responsibilities:
- Managing git repository cloning, updates, and state tracking
- Writing generated documentation to GitHub wiki repositories
- Orchestrating the full documentation generation workflow
- Handling incremental updates and selective regeneration
- Providing configurable repository modes (fresh, incremental, reuse-or-clone)
Key Components
GitRepositoryManager (src/github/git-repository-manager.ts)
Core git operations handler that encapsulates all repository state management. Supports three operational modes:
- fresh: Always removes and re-clones the repository
- incremental: Updates existing repository or clones if missing
- reuse-or-clone: Uses existing repository as-is, only clones if absent
Key Features:
- Authenticated URL construction with token injection
- Repository status tracking (branch, commits ahead/behind, uncommitted changes)
- Safe update operations with uncommitted change detection
- Credential sanitization in error messages
GitHubWikiWriter (src/github/github-wiki-writer.ts)
High-level interface for wiki documentation persistence. Manages the full lifecycle of wiki page generation and publication.
Core Capabilities:
- Page name normalization and file mapping
- Sidebar generation with intelligent page ordering (Home → Architecture → alphabetized areas)
- Content change detection to avoid unnecessary commits
- Cleanup of existing markdown files in fresh mode
- Lazy repository preparation
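The sidebar ordering listed above (Home, then Architecture, then the remaining areas alphabetically) can be sketched as follows. This is a minimal illustration rather than the project's code: the names `orderSidebarPages` and `renderSidebar`, the exact page title "Architecture", and the `[[Page]]` link syntax are all assumptions.

```typescript
// Hypothetical helper: pins "Home" and "Architecture" first, sorts the rest A-Z.
function orderSidebarPages(pageNames: string[]): string[] {
  const pinned = ["Home", "Architecture"]; // assumed page titles
  const head = pinned.filter((p) => pageNames.indexOf(p) !== -1);
  const rest = pageNames
    .filter((p) => pinned.indexOf(p) === -1)
    .sort((a, b) => a.localeCompare(b));
  return head.concat(rest);
}

// Hypothetical renderer: one wiki link per line (link syntax is an assumption).
function renderSidebar(pageNames: string[]): string {
  return orderSidebarPages(pageNames)
    .map((p) => `- [[${p}]]`)
    .join("\n") + "\n";
}
```

Pinned pages that are absent from the input are skipped, so the sketch also works for a partially generated wiki.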
WikiGenerator (src/wiki-generator.ts)
Orchestrates LLM-powered documentation generation across all wiki pages. Handles prompt construction, response parsing, and content normalization.
Generated Page Types:
- Home Page: Repository overview with structure summary
- Architecture Overview: High-level architectural patterns and area identification
- Area Documentation: Deep-dive into specific architectural areas with file analysis
Content Processing Pipeline:
- Strips markdown fence wrappers from LLM responses
- Ensures proper heading hierarchy
- Guards against meta-descriptive content (responses about the documentation process itself)
- Template rendering for customizable output formats
- Depth-aware content generation (summary/standard/deep)
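The first two steps of this pipeline, fence stripping and heading enforcement, might look roughly like the sketch below. The function names and the exact regular expressions are assumptions made for illustration; the real implementation may handle more edge cases.

```typescript
// Removes one outer fence wrapper (three backticks plus optional language tag)
// when the whole response is wrapped in it. Unwrapped text is only trimmed.
function stripFenceWrapper(text: string): string {
  const trimmed = text.trim();
  const match = /^`{3}[\w-]*\r?\n([\s\S]*?)\r?\n`{3}$/.exec(trimmed);
  return match ? match[1] : trimmed;
}

// Prepends a level-1 heading when the content does not already start with one.
function ensureHeading(markdown: string, title: string): string {
  const body = markdown.replace(/^\s+/, "");
  return /^#\s/.test(body) ? body : `# ${title}\n\n${body}`;
}
```

A response that merely begins and ends with separate code blocks would also match the wrapper pattern, so a production version would likely need a stricter check.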
Main Application Flow (src/index.ts)
Entry point that coordinates the complete documentation workflow:
- Configuration & Setup: Load config, parse CLI args, initialize logger
- Repository Crawling: Scan repository structure and enumerate files
- Target Resolution: Match CLI --target-file arguments to actual paths
- Documentation Generation:
  - Home page generation
  - Architectural overview
  - Area extraction from overview
  - Per-area file identification and documentation
- Wiki Persistence: Commit and push all generated pages
Selective Regeneration Mode:
When --target-file is specified, only areas touching those files are regenerated. Existing pages are reused for untouched areas.
Configuration System (src/config.ts)
Environment-driven configuration supporting multiple LLM providers (agent-sdk, claude-cli, codex-cli) and repository modes. Key configuration options:
- WIKI_REPO_MODE: Controls repository management strategy
- INCREMENTAL_DOCS: Enables reuse of existing wiki content
- WIKI_FRESH_CLEAN: Removes existing markdown files in fresh mode
- DOC_DEPTH: Controls documentation verbosity (summary/standard/deep)
- PROMPT_LOG_ENABLED: Persists prompt/response transcripts for debugging
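As an illustration of how these variables could be read, the sketch below parses them from a plain key/value record. The interface `WikiConfigSketch`, the helper names, and the fallback values (`incremental`, `standard`, and `false` for the flags) are assumptions; the actual defaults live in src/config.ts.

```typescript
type RepoModeSketch = "fresh" | "incremental" | "reuse-or-clone";
type DocDepthSketch = "summary" | "standard" | "deep";

interface WikiConfigSketch {
  repoMode: RepoModeSketch;
  incrementalDocs: boolean;
  freshClean: boolean;
  docDepth: DocDepthSketch;
  promptLogEnabled: boolean;
}

// Treats only the literal string "true" (case-insensitive) as enabled.
function parseFlag(value: string | undefined): boolean {
  return value !== undefined && value.trim().toLowerCase() === "true";
}

// Returns the value if it is one of the allowed strings, else the fallback.
function pickOne<T extends string>(
  value: string | undefined,
  allowed: readonly T[],
  fallback: T
): T {
  const found = allowed.filter((a) => a === value);
  return found.length > 0 ? found[0] : fallback;
}

// Fallback values below are assumptions, not the project's documented defaults.
function loadConfigSketch(env: Record<string, string | undefined>): WikiConfigSketch {
  return {
    repoMode: pickOne<RepoModeSketch>(
      env["WIKI_REPO_MODE"], ["fresh", "incremental", "reuse-or-clone"], "incremental"),
    incrementalDocs: parseFlag(env["INCREMENTAL_DOCS"]),
    freshClean: parseFlag(env["WIKI_FRESH_CLEAN"]),
    docDepth: pickOne<DocDepthSketch>(
      env["DOC_DEPTH"], ["summary", "standard", "deep"], "standard"),
    promptLogEnabled: parseFlag(env["PROMPT_LOG_ENABLED"]),
  };
}
```

In a Node process the record would typically be `process.env`; taking it as a parameter keeps the sketch easy to test.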
Template System (src/template-renderer.ts)
Flexible template loader supporting variant-specific overrides:
- Searches custom template directory first, then built-in defaults
- Supports variant subdirectories (e.g., templates/areas/{area-slug}.md)
- Simple {{placeholder}} interpolation
- Gracefully falls back to raw content if templates are missing
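The `{{placeholder}}` interpolation and the raw-content fallback can be sketched in a few lines. The function names and the choice to leave unknown placeholders untouched are assumptions for illustration, not confirmed behavior of src/template-renderer.ts.

```typescript
// Replaces {{key}} with values[key]; unknown keys are left as-is (an assumption).
function renderTemplateSketch(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (whole: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : whole
  );
}

// Falls back to the raw content when no template was found.
function applyTemplateSketch(
  template: string | undefined,
  values: Record<string, string>
): string {
  if (template === undefined) {
    return values["content"] !== undefined ? values["content"] : "";
  }
  return renderTemplateSketch(template, values);
}
```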
How It Works
Documentation Generation Workflow
1. Repository Analysis
└─> RepoCrawler scans file tree
2. Home Page Generation
└─> Prompt includes full repository structure
└─> LLM generates overview markdown
3. Architecture Overview
└─> LLM analyzes structure for architectural patterns
└─> Identifies major areas (e.g., "CLI", "Git/Wiki Publishing")
4. Area Extraction
└─> Parse architecture overview for area names
└─> Returns JSON array of area strings
5. Per-Area Documentation
├─> Identify relevant files for each area
├─> Read file contents
├─> Generate documentation from source code
└─> Apply depth-specific instructions
6. Wiki Publication
├─> Write pages to local wiki checkout
├─> Generate sidebar with ordered page links
├─> Commit changes
└─> Push to remote wiki repository
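Step 4 relies on the LLM returning a JSON array of area names. A defensive parser for that response might look like the sketch below; the name `parseAreaList` and the tolerance for surrounding prose are assumptions, since the source only states that a JSON array of strings is returned.

```typescript
// Extracts the first "[" ... last "]" span and parses it as a JSON string array.
// Returns an empty list when nothing usable is found.
function parseAreaList(responseText: string): string[] {
  const start = responseText.indexOf("[");
  const end = responseText.lastIndexOf("]");
  if (start === -1 || end <= start) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(responseText.slice(start, end + 1));
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];

  // Keep non-empty strings only, trimmed and de-duplicated in original order.
  const areas: string[] = [];
  for (const item of parsed) {
    if (typeof item !== "string") continue;
    const trimmedArea = item.trim();
    if (trimmedArea !== "" && areas.indexOf(trimmedArea) === -1) {
      areas.push(trimmedArea);
    }
  }
  return areas;
}
```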
Repository State Management
GitRepositoryManager tracks repository state through the RepositoryStatus interface:
interface RepositoryStatus {
  exists: boolean;              // Local .git directory present
  clean: boolean;               // No uncommitted changes
  branch: string;               // Current branch name
  ahead: number;                // Commits ahead of remote
  behind: number;               // Commits behind remote
  uncommittedChanges: string[]; // Porcelain status lines
}
Update Safety: The update() method throws if uncommitted changes are detected, preventing accidental data loss when resetting to remote state.
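One way such a status object could be populated is by parsing the output of `git status --porcelain --branch`, whose first line has the form `## main...origin/main [ahead 1, behind 2]`. The sketch below assumes that command is the data source, which the source does not confirm; `parsePorcelainStatus` and `assertSafeToUpdate` are hypothetical names.

```typescript
interface RepositoryStatusSketch {
  exists: boolean;
  clean: boolean;
  branch: string;
  ahead: number;
  behind: number;
  uncommittedChanges: string[];
}

// Parses `git status --porcelain --branch` output (assumed data source).
function parsePorcelainStatus(output: string): RepositoryStatusSketch {
  let branch = "";
  let ahead = 0;
  let behind = 0;
  const uncommittedChanges: string[] = [];

  for (const line of output.split(/\r?\n/)) {
    if (line.length === 0) continue;
    if (line.indexOf("## ") === 0) {
      // Header line: branch name, optional upstream, optional ahead/behind counts.
      const header = line.slice(3);
      branch = header.split("...")[0].split(" ")[0];
      const aheadMatch = /ahead (\d+)/.exec(header);
      const behindMatch = /behind (\d+)/.exec(header);
      if (aheadMatch) ahead = parseInt(aheadMatch[1], 10);
      if (behindMatch) behind = parseInt(behindMatch[1], 10);
    } else {
      uncommittedChanges.push(line);
    }
  }

  const clean = uncommittedChanges.length === 0;
  return { exists: true, clean, branch, ahead, behind, uncommittedChanges };
}

// Mirrors the documented safety rule: refuse to update a dirty checkout.
function assertSafeToUpdate(status: RepositoryStatusSketch): void {
  if (!status.clean) {
    throw new Error(
      `Refusing to update: ${status.uncommittedChanges.length} uncommitted change(s)`
    );
  }
}
```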
Content Change Detection
GitHubWikiWriter.hasContentChanged() normalizes content before comparison:
- Ensure trailing newline
- Normalize line endings (CRLF → LF)
- Compare normalized strings
This prevents spurious commits caused solely by line-ending or trailing-newline differences.
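The normalization steps listed above translate into a short comparison routine. The sketch below uses hypothetical names (`normalizeContent`, `hasContentChangedSketch`) and treats a missing file as changed, which is an assumption.

```typescript
// Converts CRLF to LF and guarantees exactly the trailing newline the text needs.
function normalizeContent(text: string): string {
  const lf = text.replace(/\r\n/g, "\n");
  return lf.charAt(lf.length - 1) === "\n" ? lf : lf + "\n";
}

// Returns true when the new content differs after normalization.
// `existing` is undefined when the page has not been written yet.
function hasContentChangedSketch(existing: string | undefined, next: string): boolean {
  if (existing === undefined) return true;
  return normalizeContent(existing) !== normalizeContent(next);
}
```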
Meta-Description Filtering
WikiGenerator.isMetaDescription() detects when the LLM responds with commentary about the documentation process rather than actual documentation content. Triggers include patterns like:
- "I've created this documentation..."
- "This wiki covers..."
- "The documentation includes..."
When detected, the system falls back to existing documentation or returns a placeholder.
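A pattern-based detector with the described fallback could be sketched as below. The regular expressions are derived from the three example phrases above; checking only the first line, the helper names, and the placeholder handling are assumptions.

```typescript
// Patterns modeled on the documented trigger phrases (the exact list is an assumption).
const META_PATTERNS_SKETCH: RegExp[] = [
  /^i['\u2019]ve (?:created|written|generated)\b/i,
  /^this wiki covers\b/i,
  /^the documentation includes\b/i,
];

// Flags responses whose first line talks about the documentation itself.
function isMetaDescriptionSketch(content: string): boolean {
  const firstLine = content.trim().split(/\r?\n/)[0] || "";
  return META_PATTERNS_SKETCH.some((pattern) => pattern.test(firstLine));
}

// Prefers generated content, then the existing page, then a placeholder.
function chooseContent(
  generated: string,
  existing: string | undefined,
  placeholder: string
): string {
  if (generated.trim() !== "" && !isMetaDescriptionSketch(generated)) {
    return generated;
  }
  return existing !== undefined ? existing : placeholder;
}
```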
Important Functions/Classes
GitRepositoryManager.prepare()
Orchestrates repository setup based on configured mode:
- fresh: Removes existing repo → clones fresh
- incremental: Updates if exists, otherwise clones
- reuse-or-clone: Only clones if missing
Location: src/github/git-repository-manager.ts:52
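The mode dispatch can be expressed as a small planning function that returns the steps to run instead of executing git directly. This is only a sketch of the documented behavior; `planPrepare` and the action names are hypothetical.

```typescript
type WikiRepoMode = "fresh" | "incremental" | "reuse-or-clone";
type PrepareAction = "remove" | "clone" | "update" | "reuse";

// Maps (mode, whether a local checkout exists) to an ordered list of actions.
function planPrepare(mode: WikiRepoMode, repoExists: boolean): PrepareAction[] {
  if (!repoExists) return ["clone"];           // every mode clones when absent
  if (mode === "fresh") return ["remove", "clone"];
  if (mode === "incremental") return ["update"];
  return ["reuse"];                            // reuse-or-clone keeps the checkout as-is
}
```

Separating the plan from its execution makes the three modes easy to unit test without touching the file system.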
GitHubWikiWriter.writeDocumentation(pages: Map<string, string>)
Main entry point for wiki persistence. Handles the complete write workflow:
- Ensure repository is prepared
- Check for uncommitted changes (warns but continues)
- Write all pages with change detection
- Generate sidebar
- Commit and push if changes detected
Location: src/github/github-wiki-writer.ts:54
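The write-then-commit decision can be illustrated with an in-memory sketch. Here `toWikiFileName` assumes the common GitHub wiki convention of replacing spaces with hyphens, and `planWrites` compares raw strings for brevity; both helpers are hypothetical and do not reflect the real method's file I/O.

```typescript
// Assumed mapping: "CLI Configuration" becomes "CLI-Configuration.md".
function toWikiFileName(pageName: string): string {
  return pageName.trim().replace(/\s+/g, "-") + ".md";
}

// Returns the files whose content differs from what is already on disk.
// `onDisk` maps file name to current content; absent files count as changed.
function planWrites(pages: Map<string, string>, onDisk: Map<string, string>): string[] {
  const changed: string[] = [];
  pages.forEach((content, pageName) => {
    const fileName = toWikiFileName(pageName);
    if (onDisk.get(fileName) !== content) {
      changed.push(fileName);
    }
  });
  return changed;
}

// A commit and push would only be needed when at least one file changed.
function shouldCommit(changedFiles: string[]): boolean {
  return changedFiles.length > 0;
}
```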
WikiGenerator.generateAreaDocumentation(area, relevantFiles, existingDoc?)
Generates documentation for a specific architectural area:
- Reads all relevant file contents
- Constructs prompt with file contents and depth instructions
- Collects LLM response
- Strips fence wrappers and ensures heading
- Checks for meta-descriptions
- Applies templates
- Falls back to existing doc if generation produces empty content
Location: src/wiki-generator.ts:449
WikiGenerator.ensureArchitectureOutline(content)
Normalizes architecture page structure to ensure consistent sections:
- Summary
- Architectural Pattern
- Key Directories
- Architectural Areas
- Component Interactions
- Data Flow
- Diagram (Mermaid)
Extracts any stray Mermaid diagrams and places them in the Diagram section.
Location: src/wiki-generator.ts:148
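A simplified version of the outline check is sketched below: it detects which required sections are missing and appends stubs for them. It does not reorder sections or relocate Mermaid diagrams, and the level-2 heading style, the stub text, and the function names are assumptions.

```typescript
const REQUIRED_SECTIONS_SKETCH = [
  "Summary",
  "Architectural Pattern",
  "Key Directories",
  "Architectural Areas",
  "Component Interactions",
  "Data Flow",
  "Diagram",
];

// Collects lower-cased heading titles from ATX-style markdown headings.
function headingTitles(markdown: string): string[] {
  const titles: string[] = [];
  for (const line of markdown.split(/\r?\n/)) {
    const match = /^#{1,6}\s+(.*\S)\s*$/.exec(line);
    if (match) titles.push(match[1].toLowerCase());
  }
  return titles;
}

// Appends a stub section for every required heading that is absent.
function ensureOutlineSketch(markdown: string): string {
  const present = headingTitles(markdown);
  let result = markdown.replace(/\s*$/, "\n");
  for (const section of REQUIRED_SECTIONS_SKETCH) {
    if (present.indexOf(section.toLowerCase()) === -1) {
      result += `\n## ${section}\n\n_Not yet documented._\n`;
    }
  }
  return result;
}
```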
resolveTargetFiles(inputs, filePaths, repoPath)
Matches CLI --target-file arguments to actual repository file paths. Handles:
- Relative paths (e.g., ./src/index.ts)
- Absolute paths
- Paths relative to current working directory
- Cross-platform path normalization (forward slashes)
Location: src/index.ts:78
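The matching logic can be approximated with plain string handling. The sketch below assumes `filePaths` holds repository-relative paths and covers relative and absolute inputs plus slash normalization; resolution against the current working directory is omitted, and the return shape is an assumption.

```typescript
// Normalizes Windows separators to forward slashes.
function toPosixPath(p: string): string {
  return p.replace(/\\/g, "/");
}

// Splits inputs into paths found in the repository and paths that were not.
function resolveTargetFilesSketch(
  inputs: string[],
  filePaths: string[],
  repoPath: string
): { matched: string[]; unmatched: string[] } {
  const root = toPosixPath(repoPath).replace(/\/+$/, "");
  const known = filePaths.map((p) => toPosixPath(p));
  const matched: string[] = [];
  const unmatched: string[] = [];

  for (const input of inputs) {
    let candidate = toPosixPath(input.trim());
    // Absolute path inside the repository: strip the repository root.
    if (candidate.indexOf(root + "/") === 0) {
      candidate = candidate.slice(root.length + 1);
    }
    // Relative path: drop any leading "./" segments.
    candidate = candidate.replace(/^(\.\/)+/, "");

    if (known.indexOf(candidate) !== -1) {
      if (matched.indexOf(candidate) === -1) matched.push(candidate);
    } else {
      unmatched.push(input); // caller would log a warning and continue
    }
  }
  return { matched, unmatched };
}
```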
Developer Notes
Git Authentication
The system injects tokens into repository URLs for HTTPS authentication:
// Input: https://github.com/user/repo.wiki.git
// Output: https://x-access-token:<token>@github.com/user/repo.wiki.git
Security: URLs that appear in git commands are replaced with <redacted-url> in error messages, so tokens are not exposed.
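Token injection and URL redaction could be implemented along the lines of the sketch below. The function names and regular expressions are assumptions; only the `x-access-token` user name and the `<redacted-url>` marker come from the description above.

```typescript
// Inserts x-access-token:<token>@ after the scheme of an HTTPS URL.
// Non-HTTPS URLs and empty tokens are returned unchanged.
function buildAuthenticatedUrl(url: string, token: string): string {
  if (!/^https:\/\//i.test(url) || token === "") return url;
  const credentials = "x-access-token:" + encodeURIComponent(token) + "@";
  // Any credentials already present in the URL are replaced.
  return url.replace(/^https:\/\/(?:[^@\/]+@)?/i, "https://" + credentials);
}

// Replaces every http(s) URL in a message so embedded tokens never reach logs.
function redactUrls(message: string): string {
  return message.replace(/https?:\/\/[^\s'"]+/g, "<redacted-url>");
}
```

Redacting the whole URL, rather than only the credential part, avoids having to parse partially malformed URLs in error output.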
Shallow Clones
Enable shallow: true for faster initial clones when full git history isn't needed. The system adds --depth 1 to the clone command.
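Trade-off: A shallow clone contains only the most recent commit, so operations that need older history (for example, merging branches whose common ancestor lies outside the fetched depth) can fail. Only use with simple linear workflows.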
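For illustration, the clone arguments might be assembled as follows. `buildCloneArgs` is a hypothetical helper; only the `--depth 1` flag is taken from the description above.

```typescript
// Builds the argument list for `git clone`, adding --depth 1 for shallow clones.
function buildCloneArgs(url: string, targetDir: string, shallow: boolean): string[] {
  const args: string[] = ["clone"];
  if (shallow) {
    args.push("--depth", "1");
  }
  args.push(url, targetDir);
  return args;
}
```

Passing arguments as an array, rather than a single command string, avoids shell quoting issues when the URL contains credentials.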
Incremental Documentation
When INCREMENTAL_DOCS=true, the system attempts to load existing wiki pages before generation. The LLM receives existing content in the prompt and can produce minimal updates.
Fallback Logic: If generation produces empty/meta-descriptive content, the original page is preserved.
Template Override System
Custom templates are searched before built-in defaults. For area-specific templates:
- {custom}/areas/{area-slug}.md
- {custom}/area-{area-slug}.md
- {custom}/{area-slug}.md
- {custom}/area.md
- {default}/area.md
The first match wins. If no template exists, raw content is used.
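The search order can be captured as a candidate list plus a first-match lookup. In this sketch the existence check is injected as a callback so no file system access is needed; both function names are hypothetical.

```typescript
// Lists candidate template paths for an area in priority order.
function areaTemplateCandidates(
  customDir: string | undefined,
  defaultDir: string,
  areaSlug: string
): string[] {
  const candidates: string[] = [];
  if (customDir) {
    candidates.push(
      `${customDir}/areas/${areaSlug}.md`,
      `${customDir}/area-${areaSlug}.md`,
      `${customDir}/${areaSlug}.md`,
      `${customDir}/area.md`
    );
  }
  candidates.push(`${defaultDir}/area.md`);
  return candidates;
}

// Returns the first candidate that exists, or undefined to signal "use raw content".
function firstExistingTemplate(
  candidates: string[],
  exists: (path: string) => boolean
): string | undefined {
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return undefined;
}
```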
Selective Regeneration
CLI flag --target-file src/foo.ts enables targeted updates:
- Only areas containing src/foo.ts are regenerated
- Other areas reuse existing documentation
- Automatically enables incrementalDocs mode
- Home and Architecture pages are reused if they exist
Warning Behavior: Unmatched target files trigger a warning but don't halt execution.
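Selecting which areas to regenerate reduces to an intersection test between each area's files and the resolved targets. The sketch below assumes an area-to-files mapping is already available; `areasToRegenerate` is a hypothetical name.

```typescript
// Returns the areas that contain at least one of the target files.
function areasToRegenerate(
  areaFiles: Record<string, string[]>,
  targetFiles: string[]
): string[] {
  return Object.keys(areaFiles).filter((area) =>
    areaFiles[area].some((file) => targetFiles.indexOf(file) !== -1)
  );
}
```

Areas not returned by this function would keep their existing pages.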
LLM Response Parsing
collectResponseText() handles three message types:
- Mock messages (test mode): { type: 'assistant', content: string }
- SDK messages: { type: 'assistant', message: { content: [...] } }
- Stream events: { type: 'stream_event', event: { ... } }
Stream events are prioritized, falling back to SDK messages, then mock content.
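The priority order can be sketched with a loosely typed message shape. The inner structure of stream events (`event.delta.text`) and of SDK content blocks (`{ type: 'text', text }`) is an assumption, since the source elides those details; `LooseMessage` and `collectResponseTextSketch` are hypothetical names.

```typescript
// Loose union of the three documented shapes; inner fields are assumed.
interface LooseMessage {
  type: string;
  content?: string;                                               // mock messages
  message?: { content: Array<{ type: string; text?: string }> };  // SDK messages
  event?: { delta?: { text?: string } };                          // stream events (assumed shape)
}

// Accumulates text per source, then returns the highest-priority non-empty one.
function collectResponseTextSketch(messages: LooseMessage[]): string {
  let streamed = "";
  let sdk = "";
  let mock = "";

  for (const msg of messages) {
    if (msg.type === "stream_event") {
      const delta = msg.event ? msg.event.delta : undefined;
      if (delta && typeof delta.text === "string") streamed += delta.text;
    } else if (msg.type === "assistant" && msg.message) {
      for (const block of msg.message.content) {
        if (block.type === "text" && typeof block.text === "string") sdk += block.text;
      }
    } else if (msg.type === "assistant" && typeof msg.content === "string") {
      mock += msg.content;
    }
  }

  // Stream events win, then SDK messages, then mock content.
  return streamed || sdk || mock;
}
```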
Page Name Sanitization
sanitizePageName() transforms area names into wiki-friendly page names:
- Trims whitespace
- Converts path separators to spaces
- Removes special characters (keeps alphanumeric, spaces, hyphens, underscores)
- Collapses multiple spaces
- Preserves "Home" capitalization
Example: "CLI / Configuration" → "CLI Configuration"
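The listed steps map directly onto a chain of string replacements. The sketch below is an approximation under assumptions: the exact character class and the reading of "Preserves Home capitalization" as normalizing any casing of "home" to "Home" are guesses, and `sanitizePageNameSketch` is a hypothetical name.

```typescript
// Applies the documented steps in order: trim, separators to spaces,
// strip special characters, collapse spaces, then normalize "Home".
function sanitizePageNameSketch(name: string): string {
  const cleaned = name
    .trim()
    .replace(/[\\/]+/g, " ")            // path separators become spaces
    .replace(/[^A-Za-z0-9 _-]/g, "")    // keep alphanumerics, spaces, hyphens, underscores
    .replace(/\s+/g, " ")               // collapse runs of whitespace
    .trim();
  return cleaned.toLowerCase() === "home" ? "Home" : cleaned;
}
```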
Usage Examples
Generating Fresh Documentation
# Full fresh generation with cleanup
WIKI_REPO_MODE=fresh \
WIKI_FRESH_CLEAN=true \
GITHUB_WIKI_URL=https://github.com/user/repo.wiki.git \
GITHUB_TOKEN=ghp_xxx \
npm start
Incremental Updates
# Update only changed areas
WIKI_REPO_MODE=incremental \
INCREMENTAL_DOCS=true \
GITHUB_WIKI_URL=https://github.com/user/repo.wiki.git \
GITHUB_TOKEN=ghp_xxx \
npm start
Selective Regeneration
# Regenerate only areas touching specific files
npm start -- --target-file src/index.ts --target-file src/config.ts
Deep Documentation Depth
# Generate exhaustive documentation
DOC_DEPTH=deep \
GITHUB_WIKI_URL=https://github.com/user/repo.wiki.git \
GITHUB_TOKEN=ghp_xxx \
npm start
Using Custom Templates
# Override default templates
TEMPLATE_DIR=./custom-templates \
GITHUB_WIKI_URL=https://github.com/user/repo.wiki.git \
GITHUB_TOKEN=ghp_xxx \
npm start
Create ./custom-templates/area.md:
# {{title}}
**Area:** {{area}}
---
{{content}}
---
*Generated with custom template*