ZED AI Assistant - Project Documentation

This document contains the complete project details for the ZED AI Developer Assistant, from product requirements to the final project plan.

1. Product Requirements Document (PRD)

  • Author: AI Expert
  • Status: Version 1.0 - Draft
  • Target Release: V1 [MVP] - Q2 2025

Introduction & Purpose

ZED is an AI-powered developer assistant designed to radically improve engineering team productivity by accelerating onboarding and providing instant, context-aware technical knowledge. In today's fast-paced development environments, new engineers can spend weeks getting up to speed, and even senior engineers waste valuable time searching for scattered information. ZED solves this by acting as a centralized, conversational knowledge expert, integrated directly into the developer's primary workspace: Slack.

Problem Statement

  • Slow Onboarding: New hires struggle to understand the "larger picture" of the product and its architecture. They spend an excessive amount of time on "hand-holding" for environment setup, finding documentation, and understanding service workflows.
  • Knowledge Silos: Critical information is scattered across Confluence, GitHub, Google Docs, and tribal knowledge within individual team members' heads. Finding the correct, up-to-date answer to a specific question is inefficient and frustrating.
  • Repetitive Questions: Senior engineers are frequently interrupted to answer the same questions about setup, dependencies, and feature logic, detracting from their focus on high-priority development tasks.
  • Context-Switching: Developers lose significant time and focus switching between their code editor, communication tools, and various documentation platforms to find the information they need.

Goals & Objectives

  • Goal 1: Reduce the time-to-first-commit for new developers by 50%.
  • Goal 2: Decrease the volume of repetitive, non-critical questions in public engineering channels by 70%.
  • Goal 3: Establish a single, trusted source of truth for all technical and product documentation.
  • Non-Goal for V1: ZED will not write production-ready feature code. It will focus on providing information, boilerplate, and test code. It is an assistant, not an autonomous agent.

User Personas

  • Priya (New SDE-1): Fresh out of college, technically skilled but new to the company's large codebase. She needs help understanding service interactions, setting up her local environment, and finding the right person to ask for help.
  • David (Senior SDE-3): A team lead who is a primary source of knowledge. He is often interrupted with questions that could be answered by documentation. He needs to offload this work to focus on complex architectural decisions and mentoring.

Features & User Stories (V1)

  • Epic 1: Seamless Onboarding & Setup: As Priya, I want to ask ZED "How do I set up the local environment for the authentication service?" so that I can get a step-by-step playbook without having to ask a teammate.
  • Epic 2: Conversational Q&A in Slack: As Priya, I want to ask ZED questions in natural language directly in a Slack channel and get an immediate, accurate answer with source citations, so that I can get unblocked without leaving my workspace.
  • Epic 3: Automated Knowledge Ingestion: As an admin, I want to configure ZED to automatically sync documentation from our GitHub markdown files and Confluence space every 24 hours so that its knowledge base is always up-to-date (a scheduling sketch follows this list).
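
For illustration, the 24-hour sync in Epic 3 could be driven by an Amazon EventBridge rule that triggers the ingestion Lambda on a fixed schedule. A minimal sketch with boto3; the rule name and Lambda ARN below are hypothetical:

import boto3

events = boto3.client("events")

# Fire the ingestion pipeline once every 24 hours.
# "zed-ingestion-schedule" and the target ARN are hypothetical names.
events.put_rule(
    Name="zed-ingestion-schedule",
    ScheduleExpression="rate(24 hours)",
    State="ENABLED",
)
events.put_targets(
    Rule="zed-ingestion-schedule",
    Targets=[{
        "Id": "zed-ingestion-lambda",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:zed-ingestion",
    }],
)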

Success Metrics

  • Onboarding Time: Average time from a new developer's start date to their first merged pull request.
  • Adoption Rate: Percentage of engineering team members who have interacted with ZED in a given week.
  • Answer Satisfaction: Ratio of "thumbs up" to "thumbs down" reactions on ZED's answers in Slack.
  • Question Deflection: Reduction in the number of questions asked in the #dev-help channel that could have been answered by ZED.

2. High-Level Design (HLD)

  • Author: AI Expert
  • Status: Version 1.0 - Draft

Overview

This document provides the High-Level Design for the ZED AI Developer Assistant. The architecture is based on a Retrieval-Augmented Generation (RAG) model, which separates the knowledge base from the language model's reasoning capabilities. This design ensures that answers are factually grounded in our internal documentation and that the knowledge base can be updated efficiently without retraining the core AI model.

The system is logically divided into two main workflows: an Offline Ingestion Pipeline for processing knowledge and an Online Inference Pipeline for answering user queries.
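
To make the split concrete, here is a minimal sketch of the online inference flow; the embed, vector_db, and llm clients are hypothetical placeholders, with the concrete choices discussed in the component breakdown and LLD below:

def answer_question(question: str) -> str:
    # 1. Embed the user's query (hypothetical embedding helper).
    query_vector = embed(question)

    # 2. Retrieve the most relevant chunks from the vector store.
    chunks = vector_db.search(query_vector, top_k=5)  # hypothetical client

    # 3. Ground the LLM in the retrieved context only.
    context = "\n\n".join(chunk["content"] for chunk in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"

    # 4. Generate the final answer.
    return llm.generate(prompt)  # hypothetical LLM client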

Architecture Diagram

flowchart TD
    %% Clients
    subgraph CLIENTS ["Clients"]
      direction LR
      slack([Slack])
      vscode(["VS Code (Future)"])
    end

    %% Data Sources
    subgraph DATA_SOURCES ["Data Sources"]
      direction LR
      github([GitHub])
      confluence([Confluence])
      gdrive([Google Drive])
    end

    %% Offline Ingestion Pipeline
    subgraph INGESTION ["Offline Ingestion Pipeline<br>(Scheduled/Webhook)"]
      ingestion_service([Ingestion Service<br>AWS Lambda])
    end

    %% Online Inference Pipeline
    subgraph INFERENCE ["Online Inference Pipeline<br>(Real-time)"]
      api_server([API Server<br>FastAPI on ECS/Fargate])
      llm(["Large Language Model<br>GPT-4o / Claude 3"])
    end

    %% Shared data stores (written by ingestion, read by inference)
    s3[[S3 Raw Doc Store]]
    vector_db[(Vector DB)]

    %% Ingestion flow
    github --> ingestion_service
    confluence --> ingestion_service
    gdrive --> ingestion_service
    ingestion_service -- Stores raw document --> s3
    ingestion_service -- Chunks, embeds & upserts --> vector_db

    %% Inference flow
    slack -->|User asks question| api_server
    vscode -->|User asks question| api_server
    api_server -- Embeds query & searches for context --> vector_db
    vector_db -- Returns relevant chunks --> api_server
    api_server -- Constructs prompt --> llm
    llm -- Generates answer --> api_server
    api_server -- Formats & sends response --> slack

Component Breakdown

  • Clients: Slack (V1) and a future VS Code extension.
  • Data Sources: GitHub, Confluence, and Google Drive.
  • ZED Backend System (Hosted on AWS):
    • Ingestion Pipeline (Offline): An AWS Lambda function orchestrates fetching, chunking, and embedding documents into a Vector Database (see the sketch after this breakdown).
    • Data Stores: An S3 bucket for raw documents and a Vector Database (e.g., Amazon OpenSearch Service) for storing text chunks and their embeddings.
    • Real-time Inference Pipeline (Online): A FastAPI server on AWS ECS handles user queries, retrieves context from the Vector DB, and uses an LLM (e.g., GPT-4o) to generate answers.
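
A minimal sketch of the ingestion Lambda described above; the connector, chunking, and embedding helpers (fetch_changed_documents, chunk_text, embed, vector_db) and the bucket name are hypothetical placeholders for the real implementations:

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    """Entry point, triggered on a schedule or by a source webhook."""
    for doc in fetch_changed_documents(event):  # hypothetical connector
        # Keep the raw document in S3 for auditing and re-processing.
        s3.put_object(Bucket="zed-raw-docs",    # hypothetical bucket name
                      Key=doc["id"], Body=doc["text"])

        # Chunk, embed, and upsert each piece into the vector store.
        for i, chunk in enumerate(chunk_text(doc["text"])):
            vector_db.upsert(                   # hypothetical DB client
                id=f"{doc['id']}#chunk-{i}",
                vector=embed(chunk),
                content=chunk,
            )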

3. Low-Level Design (LLD)

  • Author: AI Expert
  • Status: Version 1.0 - Draft

Data Models / Schemas

Vector Database Schema:

{
  "id": "string",
  "source_id": "string",
  "source_type": "string",
  "document_title": "string",
  "content": "string",
  "vector": "array<float>",
  "metadata": {
    "created_at": "timestamp",
    "url": "string"
  }
}
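
For illustration, indexing a single chunk that matches this schema could look like the following, assuming the opensearch-py client and an index named zed-chunks (both assumptions, not settled decisions; the embed helper is hypothetical):

from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "search-zed.example.com", "port": 443}],
                    use_ssl=True)

chunk = {
    "source_id": "github:docs/auth-setup.md",
    "source_type": "github",
    "document_title": "Auth Service Local Setup",
    "content": "First, install Docker and start the local stack...",
    "vector": embed("First, install Docker and start the local stack..."),
    "metadata": {
        "created_at": "2025-01-15T10:00:00Z",
        "url": "https://github.com/example/docs/auth-setup.md",
    },
}
client.index(index="zed-chunks", id="github:docs/auth-setup.md#0", body=chunk)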

API Specification (OpenAPI 3.0 Snippet)

openapi: 3.0.0
info:
  title: ZED AI Assistant API
  version: 1.0.0
paths:
  /v1/slack/events:
    post:
      summary: Handles incoming events from Slack
  /v1/query:
    post:
      summary: Submits a query to the RAG pipeline (for internal testing)
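
An internal test call against /v1/query might look like this; the host and the request/response shapes are illustrative, since the snippet above does not yet pin them down:

import requests

resp = requests.post(
    "https://zed.internal.example.com/v1/query",  # hypothetical host
    json={"question": "How do I set up the auth service locally?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"answer": "...", "sources": ["https://..."]}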

Key Logic & Algorithms

  • Text Chunking Strategy: Uses a RecursiveCharacterTextSplitter with a chunk size of ~1000 characters and an overlap of ~200 characters, so that sentences spanning a chunk boundary are preserved in at least one chunk (sketched below).
  • Prompt Engineering Template: The prompt explicitly instructs the LLM to answer only from the provided context and to reply "I could not find an answer..." when the information isn't present (sketched below).
  • Slack Integration Logic: Uses the Slack Bolt library with an asynchronous response flow: the event is acknowledged within Slack's 3-second timeout while the slower RAG call completes in the background (sketched below).
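
The chunking step, sketched with LangChain's RecursiveCharacterTextSplitter (this assumes the langchain-text-splitters package; the exact sizes remain tunable):

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # ~1000 characters per chunk
    chunk_overlap=200,  # ~200-character overlap preserves boundary context
)
chunks = splitter.split_text(document_text)  # document_text: raw markdown string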
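
A plausible shape for the prompt template; the exact wording below is a sketch, but it encodes both rules (context-only answers, explicit fallback):

PROMPT_TEMPLATE = """You are ZED, an internal developer assistant.
Answer the question using ONLY the context below.
If the context does not contain the answer, reply exactly:
"I could not find an answer in the documentation."

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    return PROMPT_TEMPLATE.format(context=context, question=question)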
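
One way to implement the asynchronous flow with Bolt for Python: return from the event handler quickly (Bolt acknowledges the event on return) and post the answer later from a worker thread. The run_rag_pipeline helper is hypothetical:

import os
from concurrent.futures import ThreadPoolExecutor

from slack_bolt import App

app = App(token=os.environ["SLACK_BOT_TOKEN"],
          signing_secret=os.environ["SLACK_SIGNING_SECRET"])
executor = ThreadPoolExecutor(max_workers=4)

@app.event("app_mention")
def handle_mention(event, say):
    # Returning fast keeps us inside Slack's 3-second window;
    # the answer is posted afterwards from the worker thread.
    executor.submit(reply_with_answer, event, say)

def reply_with_answer(event, say):
    answer = run_rag_pipeline(event["text"])  # hypothetical RAG helper
    say(text=answer, thread_ts=event.get("ts"))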

4. V1 Project Plan & Task Breakdown

| Epic | Task / User Story | Priority | Estimated Effort |
|---|---|---|---|
| Infra & Setup | Setup AWS Core Infrastructure | P0 - Blocker | M |
| Infra & Setup | Setup Project & CI/CD | P0 - Blocker | M |
| Data Ingestion | Orchestrate Ingestion Pipeline | P1 - High | L |
| Conversational Q&A | Implement RAG Query Logic | P0 - Blocker | M |
| Slack Integration | Setup Slack Bolt App | P1 - High | M |
| Security | Implement Secrets Management | P0 - Blocker | M |
| Observability | Create Monitoring Dashboard | P1 - High | M |
| Deployment | Finalize Production CI/CD Pipeline | P1 - High | L |