Architecture Guidance - rovo79/wordcraft GitHub Wiki

This page outlines a high-level design for a Novel Generator built as an LLM-powered VS Code extension, using specialized agents and a central coordinator (MCP):

1. Core Architecture

| Component | Role |
|---|---|
| MCP (Master Control Program) | Central coordinator for agents, task routing, and state management. |
| Agents | Specialized LLM-driven modules for specific novel-writing tasks. |
| VS Code Extension | UI/UX layer for user input/output and integration with the editor. |
| LLM Backend | APIs (e.g., OpenAI, Anthropic) or local models (via Ollama/Llamafile). |

2. Component Breakdown

A. VS Code Extension Layer

Input/Output:
- Webview Panel: For users to input concepts/settings (e.g., genre, tone, brief outline).
- Custom Editor: To display the generated novel in Markdown/rich text.
- Sidebar UI: Track progress, toggle agents, or adjust LLM parameters.
Workspace Integration:
- Save generated chapters/drafts as .md or .txt files in the workspace.
- Use VS Code's FileSystemWatcher API (vscode.workspace.createFileSystemWatcher) to trigger agents when workspace files change.

B. MCP (Master Control Program)

Orchestration:
- Break the novel into tasks (e.g., outline → chapter 1 → character arcs).
- Assign tasks to agents over a messaging layer (e.g., JSON-RPC messages sent over a WebSocket).
State Management:
- Track novel structure (outline, characters, settings) in a JSON file.
- Manage context windows for LLMs (e.g., summarize prior chapters so prompts stay within token limits).
RAG (Retrieval-Augmented Generation):
- Use a vector DB (e.g., ChromaDB) to store novel details (plot points, characters) for retrieval during generation.
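
The orchestration and state-management bullets above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `NovelState`, its fields, `make_task_request`, and the method name used in the usage note are all hypothetical names chosen for this example. Only the JSON-RPC 2.0 envelope fields (`jsonrpc`, `id`, `method`, `params`) come from the JSON-RPC specification.

```python
import json
from dataclasses import dataclass, field, asdict
from itertools import count
from pathlib import Path

_request_ids = count(1)  # increasing ids so responses can be matched to requests


@dataclass
class NovelState:
    """Novel structure the MCP tracks between agent calls (hypothetical shape)."""
    concept: str
    outline: list[str] = field(default_factory=list)
    characters: dict[str, dict] = field(default_factory=dict)
    chapter_summaries: dict[int, str] = field(default_factory=dict)

    def save(self, path: Path) -> None:
        """Persist the state as a JSON file in the workspace."""
        path.write_text(json.dumps(asdict(self), indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "NovelState":
        raw = json.loads(path.read_text(encoding="utf-8"))
        # JSON object keys are always strings, so restore integer chapter numbers.
        raw["chapter_summaries"] = {
            int(number): text for number, text in raw["chapter_summaries"].items()
        }
        return cls(**raw)


def make_task_request(method: str, params: dict) -> str:
    """Wrap one agent task in a JSON-RPC 2.0 request envelope."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}
    )
```

For example, the MCP might call `make_task_request("chapter_writer.draft", {"chapter": 1})` and send the resulting string to the agent process, then update and save `NovelState` when the response arrives.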

C. Specialized Agents

Story Expansion Agent:
- Expands a brief concept into a detailed outline (acts, chapters).
- Uses few-shot prompting with genre-specific examples.
Character Agent:
- Generates character profiles (motivations, arcs, dialogue styles).
- Ensures consistency (e.g., "Character X never uses metaphors").
Chapter Writer Agent:
- Generates prose for individual chapters using outline + RAG context.
- Style Transfer: Adjust tone (e.g., "rewrite this in Cormac McCarthy's voice").
Plot Consistency Agent:
- Fact-checker for plot holes, timeline issues, or contradictions.
- Runs validation via LLM queries (e.g., "Does [event] align with [character]’s backstory?").
Editing Agent:
- Polishes prose (grammar, pacing, show-don’t-tell).
- Flags sections for human review.
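
One way the agents above could share a common shape is to give each one an injected LLM callable and a single task method. The sketch below assumes a hypothetical `LLM` type (any function from prompt text to response text); the class names, method names, and prompt wording are illustrative and not taken from the project.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's text.
LLM = Callable[[str], str]


@dataclass
class CharacterAgent:
    llm: LLM

    def run(self, outline: str) -> str:
        """Ask the model for character profiles based on the outline."""
        prompt = (
            "Create character profiles (motivations, arcs, dialogue styles) "
            f"for this outline:\n{outline}"
        )
        return self.llm(prompt)


@dataclass
class PlotConsistencyAgent:
    llm: LLM

    def check(self, event: str, character: str, backstory: str) -> bool:
        """Return True if the model judges the event consistent with the backstory."""
        prompt = (
            f"Backstory of {character}: {backstory}\n"
            f"Does this event align with the backstory? Event: {event}\n"
            "Answer YES or NO."
        )
        # Treat anything that does not start with YES as a flagged inconsistency.
        return self.llm(prompt).strip().upper().startswith("YES")
```

Injecting the callable keeps the agents independent of any particular backend, so the same class can be exercised with a stub function in tests and with a cloud or local model in the extension.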

D. LLM Backend

Cloud APIs (OpenAI, Anthropic, Mistral):
- Use function calling for structured outputs (e.g., "Return chapter 1 as JSON with keys 'plot', 'characters'").
Local Models (via Ollama, LM Studio):
- Run smaller fine-tuned models (e.g., Mistral-7B) for specific tasks.
Hybrid Approach:
- Use GPT-4 for creative tasks, smaller models for consistency checks.
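
The hybrid approach and the structured-output idea above could look something like the sketch below. The `TaskKind` categories, the two model identifiers, and the function names are placeholders invented for this example; the required keys `plot` and `characters` are the ones mentioned in the function-calling bullet.

```python
import json
from enum import Enum


class TaskKind(Enum):
    OUTLINE = "outline"
    CHAPTER_DRAFT = "chapter_draft"
    CONSISTENCY_CHECK = "consistency_check"
    COPY_EDIT = "copy_edit"


# Placeholder identifiers; substitute whatever your provider or local runtime exposes.
LARGE_MODEL = "large-cloud-model"
SMALL_MODEL = "small-local-model"

CREATIVE_TASKS = {TaskKind.OUTLINE, TaskKind.CHAPTER_DRAFT}


def pick_model(kind: TaskKind) -> str:
    """Send creative work to the larger model and everything else to the smaller one."""
    return LARGE_MODEL if kind in CREATIVE_TASKS else SMALL_MODEL


REQUIRED_KEYS = {"plot", "characters"}


def parse_chapter_json(raw: str) -> dict:
    """Parse a structured model response and reject it if required keys are missing."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("model response is not a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model response is missing keys: {sorted(missing)}")
    return data
```

Validating the response on the MCP side is a cheap safeguard even when the backend supports function calling, since local models in particular may return malformed or incomplete JSON.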

3. Communication Flow

User inputs a concept via the VS Code webview.
MCP splits the task into subtasks (e.g., outline → characters → chapters).
Agents process tasks asynchronously:
- Story Expansion Agent → Outline
- Character Agent → Profiles
- Chapter Writer Agent → First draft
- Plot Consistency Agent → Validation
MCP aggregates results and updates the novel state.
Final output is rendered in the VS Code custom editor.
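
The flow above can be sketched with `asyncio`, using stub coroutines in place of real agent calls. Everything here is an assumption made for illustration: the function names, their return values, and the simplification that chapter drafts are independent of one another (a real pipeline would likely feed earlier chapter summaries into later drafts).

```python
import asyncio


async def story_expansion_agent(concept: str) -> str:
    return f"Outline for: {concept}"  # stub standing in for an LLM call


async def character_agent(outline: str) -> str:
    return f"Profiles based on [{outline}]"  # stub


async def chapter_writer_agent(outline: str, profiles: str, chapter: int) -> str:
    return f"Draft of chapter {chapter}"  # stub


async def plot_consistency_agent(draft: str, profiles: str) -> list[str]:
    return []  # stub: an empty list means no issues were found


async def run_pipeline(concept: str, chapters: int = 2) -> dict:
    """MCP role: run dependent steps in order, independent steps concurrently."""
    outline = await story_expansion_agent(concept)
    profiles = await character_agent(outline)
    drafts = await asyncio.gather(
        *(chapter_writer_agent(outline, profiles, n) for n in range(1, chapters + 1))
    )
    reports = await asyncio.gather(
        *(plot_consistency_agent(draft, profiles) for draft in drafts)
    )
    # Aggregate everything into one state object for the editor to render.
    return {
        "outline": outline,
        "characters": profiles,
        "chapters": list(drafts),
        "issues": [issue for report in reports for issue in report],
    }
```

Calling `asyncio.run(run_pipeline("A cyberpunk heist"))` returns the aggregated state as a dictionary.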

4. Technical Setup

VS Code Extension:
- Built with TypeScript/JavaScript using the VS Code Extension API.
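- Use the Webview UI Toolkit (@vscode/webview-ui-toolkit) for VS Code-styled UI components; Microsoft has announced the toolkit's deprecation, so check its maintenance status before adopting it.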
Agents:
- Python-based for ML workflows (e.g., LangChain, LlamaIndex).
- Expose agents as REST/WebSocket services (or run in a background process).
Context Management:
- Store chapter summaries in a vector DB for retrieval during generation.
- Embed each query with the same embedding model used for the stored summaries, then retrieve the most similar plot points/characters.
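
To make the retrieval step concrete, here is a toy, dependency-free sketch. It deliberately swaps in word-count vectors and an in-memory list for the embedding model and vector DB, so it only illustrates the store-then-query-by-similarity pattern; `embed`, `cosine`, and `SummaryStore` are hypothetical names, and a real setup would use an actual embedding model and a store such as ChromaDB.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SummaryStore:
    """In-memory stand-in for a vector DB holding chapter summaries."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, summary: str) -> None:
        self._items.append((summary, embed(summary)))

    def query(self, text: str, k: int = 2) -> list[str]:
        """Return the k stored summaries most similar to the query text."""
        q = embed(text)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [summary for summary, _ in ranked[:k]]
```

The Chapter Writer Agent would then prepend the top-k summaries to its prompt instead of the full text of earlier chapters.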

5. Example Workflow

User Input: "A cyberpunk heist where rebels steal an AI from a megacorp."
MCP triggers Story Expansion Agent to generate an outline.
Character Agent creates a team of rebels (hacker, ex-soldier, rogue AI).
Chapter Writer Agent drafts Chapter 1, using the outline and character profiles.
Plot Consistency Agent flags a contradiction: "The hacker’s backstory doesn’t explain her grudge against the megacorp."
MCP reroutes the issue to the Character Agent for revision.
Final draft is displayed in VS Code, editable by the user.
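
The rerouting step in this workflow could be handled with a small routing table, sketched below. The `Issue` type, the issue categories, the agent names, and the `human_review` fallback are all hypothetical; the example only shows how the MCP might decide which agent revises which flagged problem.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    kind: str    # hypothetical category, e.g. "character" or "timeline"
    detail: str  # the contradiction reported by the Plot Consistency Agent


# Hypothetical routing table: which agent revises which kind of issue.
ROUTES = {
    "character": "character_agent",
    "timeline": "story_expansion_agent",
}


def route_issues(issues: list[Issue], default: str = "human_review") -> dict[str, list[str]]:
    """Group flagged issues by the agent (or fallback queue) that should handle them."""
    work: dict[str, list[str]] = {}
    for issue in issues:
        target = ROUTES.get(issue.kind, default)
        work.setdefault(target, []).append(issue.detail)
    return work
```

In the heist example, the hacker's unexplained grudge would be tagged as a `character` issue and land in the Character Agent's queue, while anything the table does not recognize falls back to the user.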

6. Challenges & Solutions

Token Limits: Use RAG to retrieve only the context relevant to the current task (e.g., the passages from prior chapters that bear on the scene being written).
Cost: Cache frequent responses and use smaller models for non-critical tasks.
Consistency: Store character/plot metadata as structured JSON and validate it against a JSON Schema.
User Control: Allow manual overrides (e.g., "Regenerate Chapter 3 with more action").
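
As an illustration of the caching idea under Cost, the sketch below memoizes responses by model name and exact prompt. `ResponseCache` and the `(model, prompt) -> text` callable signature are assumptions made for this example, and the cache lives only in memory.

```python
import hashlib
import json
from typing import Callable


class ResponseCache:
    """Memoize LLM responses keyed by model name plus exact prompt text."""

    def __init__(self, llm: Callable[[str, str], str]) -> None:
        self._llm = llm  # hypothetical signature: (model, prompt) -> response text
        self._store: dict[str, str] = {}
        self.misses = 0  # number of calls that actually reached the backend

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Serialize as a JSON list so ("ab", "c") and ("a", "bc") hash differently.
        payload = json.dumps([model, prompt])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._llm(model, prompt)
        return self._store[key]
```

Exact-match caching suits repeatable tasks such as consistency checks; for creative generation, where the user may want a different result on each "Regenerate" request, the cache should probably be bypassed.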

7. Optional Enhancements

Version Control: Integrate with Git to track novel iterations.
Collaboration: Allow multiple users to edit/comment via Live Share.
Multi-Agent Debate: Have agents "debate" plot choices (e.g., "Should Character X die?").

This architecture balances automation with user control, leveraging VS Code's ecosystem for a seamless writing experience.