ZED AI Assistant - Project Documentation
This document contains the complete project details for the ZED AI Developer Assistant, from product requirements to the final project plan.
- Author: AI Expert
- Status: Version 1.0 - Draft
- Target Release: V1 [MVP] - Q2 2025
ZED is an AI-powered developer assistant designed to radically improve engineering team productivity by accelerating onboarding and providing instant, context-aware technical knowledge. In today's fast-paced development environments, new engineers can spend weeks getting up to speed, and even senior engineers waste valuable time searching for scattered information. ZED solves this by acting as a centralized, conversational knowledge expert, integrated directly into the developer's primary workspace: Slack.
- Slow Onboarding: New hires struggle to understand the "larger picture" of the product and its architecture. They spend an excessive amount of time on "hand-holding" for environment setup, finding documentation, and understanding service workflows.
- Knowledge Silos: Critical information is scattered across Confluence, GitHub, Google Docs, and tribal knowledge within individual team members' heads. Finding the correct, up-to-date answer to a specific question is inefficient and frustrating.
- Repetitive Questions: Senior engineers are frequently interrupted to answer the same questions about setup, dependencies, and feature logic, detracting from their focus on high-priority development tasks.
- Context-Switching: Developers lose significant time and focus switching between their code editor, communication tools, and various documentation platforms to find the information they need.
- Goal 1: Reduce the time-to-first-commit for new developers by 50%.
- Goal 2: Decrease the volume of repetitive, non-critical questions in public engineering channels by 70%.
- Goal 3: Establish a single, trusted source of truth for all technical and product documentation.
- Non-Goal for V1: ZED will not write production-ready feature code. It will focus on providing information, boilerplate, and test code. It is an assistant, not an autonomous agent.
- Priya (New SDE-1): Fresh out of college, technically skilled but new to the company's large codebase. She needs help understanding service interactions, setting up her local environment, and finding the right person to ask for help.
- David (Senior SDE-3): A team lead who is a primary source of knowledge. He is often interrupted with questions that could be answered by documentation. He needs to offload this work to focus on complex architectural decisions and mentoring.
- Epic 1: Seamless Onboarding & Setup: As Priya, I want to ask ZED "How do I set up the local environment for the authentication service?" so that I can get a step-by-step playbook without having to ask a teammate.
- Epic 2: Conversational Q&A in Slack: As Priya, I want to ask ZED questions in natural language directly in a Slack channel and get an immediate, accurate answer with source citations.
- Epic 3: Automated Knowledge Ingestion: As an admin, I want to configure ZED to automatically sync documentation from our GitHub markdown files and Confluence space every 24 hours so that its knowledge base is always up-to-date.
- Onboarding Time: Average time from a new developer's start date to their first merged pull request.
- Adoption Rate: Percentage of engineering team members who have interacted with ZED in a given week.
- Answer Satisfaction: Ratio of "thumbs up" to "thumbs down" reactions on ZED's answers in Slack.
- Question Deflection: Reduction in the number of questions asked in the #dev-help channel that could have been answered by ZED.
- Author: AI Expert
- Status: Version 1.0 - Draft
This document provides the High-Level Design for the ZED AI Developer Assistant. The architecture is based on a Retrieval-Augmented Generation (RAG) model, which separates the knowledge base from the language model's reasoning capabilities. This design ensures that answers are factually grounded in our internal documentation and that the knowledge base can be updated efficiently without retraining the core AI model.
The system is logically divided into two main workflows: an Offline Ingestion Pipeline for processing knowledge and an Online Inference Pipeline for answering user queries.
```mermaid
flowchart TD
    %% Clients
    slack([Slack])
    vscode(["VS Code (Future)"])

    %% Data Sources
    github([GitHub])
    confluence([Confluence])
    gdrive([Google Drive])

    %% Ingestion Pipeline
    ingestion_service(["Ingestion Service<br>AWS Lambda"])
    s3[[S3 Raw Doc Store]]
    vector_db1[(Vector DB)]

    %% Inference Pipeline
    api_server(["API Server<br>FastAPI on ECS/Fargate"])
    llm(["Large Language Model<br>GPT-4o / Claude 3"])
    vector_db2[(Vector DB)]

    %% Groupings (boxes for docs only)
    subgraph CLIENTS [ ]
        direction LR
        slack -->|User asks question| api_server
        vscode -->|User asks question| api_server
    end

    subgraph DATA_SOURCES [ ]
        direction LR
        github
        confluence
        gdrive
    end

    subgraph INGESTION ["Offline Ingestion Pipeline<br>(Scheduled/Webhook)"]
        github --> ingestion_service
        confluence --> ingestion_service
        gdrive --> ingestion_service
        ingestion_service -- Stores Raw Document --> s3
        ingestion_service -- Chunks, Embeds & Upserts --> vector_db1
    end

    subgraph INFERENCE ["Online Inference Pipeline<br>(Real-time)"]
        api_server -- Embeds query --> api_server
        api_server -- Searches for context --> vector_db2
        vector_db2 -- Returns relevant chunks --> api_server
        api_server -- Constructs prompt --> llm
        llm -- Generates answer --> api_server
        api_server -- Formats & sends response --> slack
    end

    %% Dotted links: both Vector DB nodes represent the same store,
    %% which the S3 raw doc store also feeds
    vector_db1 -.-> vector_db2
    s3 -.-> vector_db2
```
- Clients: Slack (V1) and a future VS Code extension.
- Data Sources: GitHub, Confluence, and Google Drive.
- ZED Backend System (Hosted on AWS):
  - Ingestion Pipeline (Offline): An AWS Lambda function orchestrates fetching, chunking, and embedding documents into a Vector Database.
  - Data Stores: An S3 bucket for raw documents and a Vector Database (e.g., AWS OpenSearch) for storing text chunks and their embeddings.
  - Real-time Inference Pipeline (Online): A FastAPI server on AWS ECS handles user queries, retrieves context from the Vector DB, and uses an LLM (e.g., GPT-4o) to generate answers (see the sketch after this list).
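To make the online flow concrete, here is a minimal sketch of the retrieve-then-generate step. It assumes the official OpenAI Python client; the `vector_db.search` call, the embedding model name, and the prompt wording are illustrative stand-ins, not the final implementation.

```python
# Minimal RAG inference sketch (illustrative only).
# Assumes the official OpenAI Python client; `vector_db` is a hypothetical
# wrapper around whatever search API the chosen Vector DB exposes.
from openai import OpenAI

llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_question(question: str, vector_db, top_k: int = 5) -> str:
    # 1. Embed the user's query.
    query_vector = llm.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding

    # 2. Retrieve the most relevant documentation chunks (hypothetical API).
    chunks = vector_db.search(vector=query_vector, top_k=top_k)
    context = "\n\n".join(chunk["content"] for chunk in chunks)

    # 3. Construct a grounded prompt and generate the answer.
    response = llm.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Answer ONLY from the provided context. If the answer is not "
                "in the context, say 'I could not find an answer...'"
            )},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

Keeping retrieval and generation as separate steps is what lets the knowledge base be refreshed without retraining or swapping the model.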
- Author: AI Expert
- Status: Version 1.0 - Draft
Vector Database Schema:
```json
{
  "id": "string",
  "source_id": "string",
  "source_type": "string",
  "document_title": "string",
  "content": "string",
  "vector": "array<float>",
  "metadata": {
    "created_at": "timestamp",
    "url": "string"
  }
}
```
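As a usage illustration, the sketch below builds one chunk record matching this schema and writes it with the opensearch-py client (AWS OpenSearch assumed). The host, index name, field values, and the `embed` helper are hypothetical placeholders.

```python
# Sketch: upserting one chunk record into the Vector DB.
# Host, index name, field values, and embed() are illustrative placeholders.
from datetime import datetime, timezone
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "search-zed.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)

record = {
    "source_id": "confluence-12345",
    "source_type": "confluence",
    "document_title": "Auth Service Local Setup",
    "content": "To run the authentication service locally, ...",
    "vector": embed("To run the authentication service locally, ..."),  # hypothetical embedder
    "metadata": {
        "created_at": datetime.now(timezone.utc).isoformat(),
        "url": "https://example.atlassian.net/wiki/spaces/ENG/pages/12345",
    },
}

# Using a deterministic id per chunk makes re-ingestion an upsert, not a duplicate.
client.index(index="zed-chunks", id="confluence-12345#chunk-0", body=record)
```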
```yaml
openapi: 3.0.0
info:
  title: ZED AI Assistant API
  version: 1.0.0
paths:
  /v1/slack/events:
    post:
      summary: Handles incoming events from Slack
  /v1/query:
    post:
      summary: Submits a query to the RAG pipeline (for internal testing)
```
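Since the spec above omits request/response schemas, the following FastAPI sketch of the internal `/v1/query` endpoint assumes a simple question/answer shape; the `zed.rag` module path is hypothetical wiring for the pieces sketched earlier.

```python
# Sketch of the internal /v1/query endpoint; request/response shapes are
# assumptions, since the OpenAPI spec above defines no schemas.
from fastapi import FastAPI
from pydantic import BaseModel

# Hypothetical module wiring in the RAG pieces sketched earlier.
from zed.rag import answer_question, vector_db

app = FastAPI(title="ZED AI Assistant API", version="1.0.0")

class QueryRequest(BaseModel):
    question: str

class QueryResponse(BaseModel):
    answer: str

@app.post("/v1/query", response_model=QueryResponse)
def query(req: QueryRequest) -> QueryResponse:
    # Delegate to the RAG pipeline and return the grounded answer.
    return QueryResponse(answer=answer_question(req.question, vector_db))
```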
- Text Chunking Strategy: Uses a `RecursiveCharacterTextSplitter` with a chunk size of ~1000 characters and an overlap of ~200 characters (see the first sketch after this list).
- Prompt Engineering Template: The prompt explicitly instructs the LLM to answer only based on the provided context and to state "I could not find an answer..." if the information isn't present.
- Slack Integration Logic: Uses the Slack Bolt library and an asynchronous response flow to avoid Slack's 3-second timeout (see the second sketch after this list).
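Two short sketches of the details above, in order. First, the chunking step, assuming LangChain's `langchain-text-splitters` package (the class named above is LangChain's):

```python
# Sketch of the chunking step, assuming LangChain's text-splitters package.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # ~1000 characters per chunk
    chunk_overlap=200,  # ~200 characters of overlap to preserve context at boundaries
)

raw_document_text = "..."  # placeholder: the raw document fetched from S3
chunks = splitter.split_text(raw_document_text)
```

Second, the asynchronous Slack response flow with Bolt for Python. `answer_question` and `vector_db` refer to the hypothetical wiring from the inference sketch earlier; the worker-thread pattern is one way to stay under the 3-second acknowledgement window, not necessarily the final design.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from slack_bolt import App

app = App(
    token=os.environ["SLACK_BOT_TOKEN"],
    signing_secret=os.environ["SLACK_SIGNING_SECRET"],
)
executor = ThreadPoolExecutor(max_workers=4)

def answer_in_thread(client, channel, thread_ts, question):
    # Hypothetical wiring to the RAG inference sketch shown earlier.
    answer = answer_question(question, vector_db)
    client.chat_postMessage(channel=channel, thread_ts=thread_ts, text=answer)

@app.event("app_mention")
def handle_mention(event, client):
    # Returning quickly lets Bolt acknowledge within Slack's 3-second window;
    # the slow RAG call runs on a worker thread and replies in-thread later.
    executor.submit(answer_in_thread, client, event["channel"], event["ts"], event["text"])
```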
| Epic | Task / User Story | Priority | Estimated Effort |
|---|---|---|---|
| Infra & Setup | Setup AWS Core Infrastructure | P0 - Blocker | M |
| Infra & Setup | Setup Project & CI/CD | P0 - Blocker | M |
| Data Ingestion | Orchestrate Ingestion Pipeline | P1 - High | L |
| Conversational Q&A | Implement RAG Query Logic | P0 - Blocker | M |
| Slack Integration | Setup Slack Bolt App | P1 - High | M |
| Security | Implement Secrets Management | P0 - Blocker | M |
| Observability | Create Monitoring Dashboard | P1 - High | M |
| Deployment | Finalize Production CI/CD Pipeline | P1 - High | L |