Timmy AI Assistant
Timmy is TMI's planned conversational AI assistant for threat model analysis. Timmy operates within the scope of a single threat model and reasons over its data -- assets, threats, diagrams, documents, repositories, and notes -- to help you understand, analyze, and improve your threat models.
Status: Timmy is under active development on the `dev/1.4.0` branch. The backend is functional: chat API endpoints, LLM integration via LangChainGo, vector embedding pipeline, content providers, and streaming responses are all implemented. The frontend chat UI is not yet implemented. See Implementation Status below for details.
Development demo video (YouTube)
Purpose
Timmy is inspired by Google's NotebookLM: a "grounded" chat that reasons over specific sources rather than answering from general knowledge alone. You control which sub-entities are included in the conversation via each sub-entity's `timmy_enabled` flag, letting you focus the discussion on relevant material.
Problems Timmy Solves
Threat models are dense and hard to reason about holistically
A mature threat model contains dozens of assets, threats, data flows, and supporting documents. Humans struggle to hold all of that in mind simultaneously. Timmy can synthesize across the full model and surface connections, gaps, or inconsistencies that a person might miss.
Security review is bottlenecked on expert availability
Not every team has a senior security reviewer on hand. Timmy acts as an always-available collaborator -- it cannot replace a human reviewer, but it can help teams self-serve on initial analysis, ask better questions, and arrive at a review better prepared.
Threat modeling artifacts are underutilized after creation
Teams build threat models and then rarely revisit them conversationally. Timmy makes the model queryable: "What are the highest-risk data flows?", "Which assets lack mitigations?", "Summarize the threats related to authentication."
Onboarding to an existing threat model is slow
A new team member or reviewer joining a threat model must read through everything. Timmy can provide guided summaries and answer targeted questions, dramatically reducing ramp-up time.
How Users Will Interact with Timmy
You navigate to a threat model's chat page, see your sources (sub-entities) in a sidebar, toggle which ones to include, and have a conversation. You can ask Timmy to:
- Analyze threats -- identify highest-risk areas, evaluate threat severity, and assess coverage
- Identify gaps -- find assets without threats, threats without mitigations, and incomplete data flows
- Explain data flows -- summarize how data moves through the system based on DFD diagrams
- Suggest mitigations -- recommend security controls based on identified threats
- Summarize content -- provide overviews of the threat model or specific sub-entities
- Answer questions -- respond to targeted queries about any aspect of the threat model
Previous chat sessions will be preserved and can be resumed.
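Until the UI lands, the backend endpoints can be exercised directly. Below is a minimal Go sketch of sending one message and printing the streamed SSE tokens; the route, payload shape, and headers are assumptions for illustration only (see REST-API-Reference for the actual contract):

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Hypothetical route and payload shape -- see REST-API-Reference for
	// the actual session and message endpoints.
	threatModelID, sessionID := "<threat-model-id>", "<session-id>"
	endpoint := fmt.Sprintf(
		"https://tmi.example.com/threat_models/%s/timmy/sessions/%s/messages",
		threatModelID, sessionID)
	body := []byte(`{"content": "Which assets lack mitigations?"}`)

	req, err := http.NewRequest("POST", endpoint, bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer <token>")
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "text/event-stream")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// SSE frames arrive as "data: <chunk>" lines; print tokens as they stream.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "data: ") {
			fmt.Print(strings.TrimPrefix(line, "data: "))
		}
	}
}
```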
Implementation Status
Completed (`dev/1.4.0`)
- `timmy_enabled` field on all threat model sub-entity types: diagrams, assets, threats, documents, notes, and repositories. Defaults to `true`. Also present on team notes and project notes
- Database models for chat sessions, messages, embeddings, and usage tracking (`TimmySession`, `TimmyMessage`, `TimmyEmbedding`, `TimmyUsage`)
- Database schema definitions for the four Timmy tables with proper indexes and foreign key constraints
- Server configuration (`TimmyConfig`) with settings for LLM provider/model, dual embedding providers (text + code), retrieval parameters, rate limits, memory budgets, and chunking. See Configuration-Reference#timmy-ai-assistant for all variables. Timmy is disabled by default
- Chat API endpoints -- REST endpoints for creating sessions (with SSE progress), sending messages (with SSE token streaming), listing sessions, and listing message history
- LLM integration via LangChainGo -- provider-agnostic chat completion and embedding with OpenTelemetry instrumentation
- Dual-index vector embedding pipeline -- text index for assets, threats, diagrams, documents, and notes; code index for repositories (optional, requires a separate code embedding model). In-memory brute-force cosine similarity search with LRU eviction and memory budgeting
- Content provider abstraction (`ContentProvider` interface and `ContentProviderRegistry`) for extracting plain text from source entities for embedding (see the interface sketch after this list), including:
  - Direct text extraction from database-resident entities (assets, threats, notes, repositories)
  - JSON semantic extraction from DFD diagrams
  - HTTP/HTML content extraction with SSRF protection
  - PDF content extraction
  - Content pipeline with pluggable sources and extractors (Google Drive, general HTTP)
- Two-tier context building -- Tier 1 (entity overview) and Tier 2 (vector search results) assembled into LLM prompts
- Rate limiting -- per-user message rate limiting (sliding window, sketched below) and system-wide LLM concurrency control
- SSRF validator for safely fetching external document URLs during content extraction (sketched below)
- Import/export support in the frontend for the `timmy_enabled` field
- Dual-index RAG (#241) -- text and code vector indexes with separate embedding models, external embedding ingestion API, optional query decomposition and cross-encoder reranking
- Embedding automation API -- `embedding-automation` built-in group, `/automation/embeddings/` endpoints for external tools to push pre-computed embeddings
- Optional query decomposition -- LLM-driven splitting of user queries into index-specific sub-queries (off by default)
- Optional cross-encoder reranking -- API-based reranker rescores merged results from both indexes for higher-precision context
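The wiki names the `ContentProvider` interface and `ContentProviderRegistry` without showing their shape. A minimal Go sketch of how such an abstraction is typically structured; the method names and signatures here are assumptions, not the actual TMI code:

```go
package timmy

import "context"

// ContentProvider turns one kind of source entity into plain text suitable
// for chunking and embedding. (Sketch -- method names are assumptions.)
type ContentProvider interface {
	// EntityType identifies which sub-entity type this provider handles,
	// e.g. "asset", "threat", "diagram", "document", "repository".
	EntityType() string
	// Extract returns the plain-text representation of the entity.
	Extract(ctx context.Context, entityID string) (string, error)
}

// ContentProviderRegistry maps entity types to providers so the embedding
// pipeline can dispatch without knowing concrete provider types.
type ContentProviderRegistry struct {
	providers map[string]ContentProvider
}

func NewContentProviderRegistry() *ContentProviderRegistry {
	return &ContentProviderRegistry{providers: make(map[string]ContentProvider)}
}

func (r *ContentProviderRegistry) Register(p ContentProvider) {
	r.providers[p.EntityType()] = p
}

func (r *ContentProviderRegistry) For(entityType string) (ContentProvider, bool) {
	p, ok := r.providers[entityType]
	return p, ok
}
```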
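The per-user limiter is a sliding window: a message is allowed only if fewer than the configured limit were sent in the trailing window. A self-contained sketch of that technique (field names and exact policy are assumptions):

```go
package timmy

import (
	"sync"
	"time"
)

// slidingWindowLimiter allows at most `limit` messages per user in any
// trailing `window`. Illustrative only -- TMI's limiter may differ.
type slidingWindowLimiter struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	sent   map[string][]time.Time // userID -> timestamps of recent messages
}

func newLimiter(limit int, window time.Duration) *slidingWindowLimiter {
	return &slidingWindowLimiter{
		limit:  limit,
		window: window,
		sent:   make(map[string][]time.Time),
	}
}

// Allow reports whether userID may send a message now, recording it if so.
func (l *slidingWindowLimiter) Allow(userID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	// Drop timestamps that have fallen out of the window (in-place filter).
	cutoff := time.Now().Add(-l.window)
	recent := l.sent[userID][:0]
	for _, t := range l.sent[userID] {
		if t.After(cutoff) {
			recent = append(recent, t)
		}
	}
	if len(recent) >= l.limit {
		l.sent[userID] = recent
		return false
	}
	l.sent[userID] = append(recent, time.Now())
	return true
}
```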
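The SSRF validator guards the HTTP content extractors above. A sketch of the core check such a validator performs: require http(s), resolve the host, and refuse private, loopback, and link-local destinations (the real validator may do more, such as pinning resolved IPs and re-validating redirects):

```go
package timmy

import (
	"fmt"
	"net"
	"net/url"
)

// validateDocumentURL rejects URLs that could reach internal services.
// Illustrative sketch -- not the actual TMI implementation.
func validateDocumentURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("scheme %q not allowed", u.Scheme)
	}
	ips, err := net.LookupIP(u.Hostname())
	if err != nil {
		return fmt.Errorf("resolving %q: %w", u.Hostname(), err)
	}
	for _, ip := range ips {
		if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
			ip.IsLinkLocalMulticast() || ip.IsUnspecified() {
			return fmt.Errorf("address %s is not publicly routable", ip)
		}
	}
	return nil
}
```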
In Progress (`dev/1.4.0`)
- Additional content providers (#249) -- Confluence, OneDrive/SharePoint, and Google Workspace delegated access
Not Yet Implemented
- Frontend chat UI -- Angular components for the chat page, source sidebar, and session management
Architecture Decisions
Key decisions from the backend design discussion:
- LLM integration: Provider-agnostic via LangChainGo, allowing operators to choose their LLM provider.
- Vector store: In-memory brute-force cosine similarity search with database-serialized embeddings (one row per embedding). No separate vector database required. Dual indexes (text + code) with composite keys per threat model. See the search sketch after this list.
- Conversation storage: Normal relational tables in the existing threat model database.
- Memory management: Explicit budget with LRU eviction and session admission control under memory pressure. Single shared memory pool across both index types.
- Scope: Two vector indexes per threat model (text + code), loaded on demand, evicted independently after inactivity.
- Entity-to-index mapping: Strict -- repositories go to the code index, all other entity types go to the text index.
- Query pipeline: Optional two-stage enhancement -- LLM-driven query decomposition splits user questions into index-specific sub-queries (off by default), and cross-encoder reranking rescores merged results for higher precision (requires separate reranker model). Both degrade gracefully when not configured.
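To make the vector-store decision concrete, here is a minimal sketch of brute-force cosine search over an in-memory index; the type and function names are illustrative, not TMI's:

```go
package timmy

import (
	"math"
	"sort"
)

// embedding pairs a stored vector with the chunk it was computed from.
type embedding struct {
	ChunkID string
	Vector  []float32
}

// scored is one search result.
type scored struct {
	ChunkID string
	Score   float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// search scores every embedding in the index against the query vector and
// returns the top-k. O(n) per query, which is acceptable because each
// index holds only a single threat model's chunks.
func search(index []embedding, query []float32, k int) []scored {
	results := make([]scored, 0, len(index))
	for _, e := range index {
		results = append(results, scored{e.ChunkID, cosine(e.Vector, query)})
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score > results[j].Score })
	if k < len(results) {
		results = results[:k]
	}
	return results
}
```

With one index per threat model, the scan stays small enough that brute force is cheaper to operate than a dedicated vector database.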
Query Pipeline Architecture
```mermaid
flowchart TD
    A[User Message] --> B{Query Decomposer\nconfigured?}
    B -->|yes| C[LLM Decompose:\ntext_query + code_query]
    B -->|no| D[Use original query\nfor both indexes]
    C --> E[Embed text_query\nwith text model]
    C --> F[Embed code_query\nwith code model]
    D --> E
    D --> F
    E --> G[Search Text Index\ntop-K results]
    F --> H{Code Index\nconfigured?}
    H -->|yes| I[Search Code Index\ntop-K results]
    H -->|no| J[Skip]
    G --> K[Merge Candidates]
    I --> K
    J --> K
    K --> L{Reranker\nconfigured?}
    L -->|yes| M[Cross-Encoder Rerank\nwith original query]
    L -->|no| N[Use merged results\nas-is]
    M --> O[Apply Rerank top-K\ncutoff]
    O --> P[Format Tier 2 Context]
    N --> P
    P --> Q[Assemble Full Prompt:\nBase + Tier 1 + Tier 2]
    Q --> R[LLM Synthesis\nstreaming response]
```
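Read as code, the flowchart reduces to roughly the following Go-shaped sketch, including the Base + Tier 1 + Tier 2 prompt assembly. Every type and method here is a placeholder for the corresponding pipeline stage, not a real TMI API:

```go
package timmy

import (
	"context"
	"strings"
)

// chunk is a placeholder for one retrieved context passage.
type chunk struct {
	Text  string
	Score float64
}

// pipeline records which optional stages are configured.
type pipeline struct {
	decomposer bool // query decomposition enabled?
	codeIndex  bool // code embedding model configured?
	reranker   bool // cross-encoder reranker configured?
	topK       int  // per-index retrieval depth
	rerankTopK int  // cutoff applied after reranking
}

func (p *pipeline) answer(ctx context.Context, q string) string {
	// Optional stage: LLM decomposition into index-specific sub-queries.
	textQ, codeQ := q, q
	if p.decomposer {
		textQ, codeQ = p.decompose(ctx, q)
	}

	// Embed each sub-query with its index's model and search; the code
	// index is skipped entirely when no code embedding model is set.
	candidates := p.searchText(ctx, textQ, p.topK)
	if p.codeIndex {
		candidates = append(candidates, p.searchCode(ctx, codeQ, p.topK)...)
	}

	// Optional stage: cross-encoder rerank against the *original* query,
	// then cut to the rerank top-K. Without a reranker, the merged
	// candidates pass through as-is.
	if p.reranker {
		candidates = p.rerank(ctx, q, candidates)
		if len(candidates) > p.rerankTopK {
			candidates = candidates[:p.rerankTopK]
		}
	}

	// Assemble Base + Tier 1 (entity overview) + Tier 2 (retrieved
	// chunks) and hand the prompt to the LLM for streamed synthesis.
	prompt := strings.Join([]string{
		p.basePrompt(), p.tier1Overview(ctx), formatTier2(candidates),
	}, "\n\n")
	return p.synthesize(ctx, prompt)
}

// Stubs standing in for the real stages (illustrative only).
func (p *pipeline) decompose(ctx context.Context, q string) (string, string) { return q, q }
func (p *pipeline) searchText(ctx context.Context, q string, k int) []chunk  { return nil }
func (p *pipeline) searchCode(ctx context.Context, q string, k int) []chunk  { return nil }
func (p *pipeline) rerank(ctx context.Context, q string, c []chunk) []chunk  { return c }
func (p *pipeline) basePrompt() string                                       { return "You are Timmy." }
func (p *pipeline) tier1Overview(ctx context.Context) string                 { return "" }
func (p *pipeline) synthesize(ctx context.Context, prompt string) string     { return prompt }

func formatTier2(c []chunk) string { return "" }
```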
Related Pages
- Architecture-and-Design -- System architecture and design decisions
- REST-API-Reference -- API endpoint reference
Related Issues
- Server backend: ericfitz/tmi#214
- Client UX: ericfitz/tmi-ux#293