GSoC 2024 Contributions ‐ Aviral Kaintura - The-OpenROAD-Project/ORAssistant GitHub Wiki

The project aims to develop a conversational assistant for OpenROAD, with a focus on data engineering and evaluation. Key contributions include:

  1. Data Engineering: Curating and processing OpenROAD-related data, including GitHub issues, and structuring it into JSONL format for easier use by the assistant.

  2. Automated Evaluation Systems:

    • Basic Abbreviation Evaluation: Ensures the assistant can correctly identify and explain abbreviations used within the OpenROAD community.
    • LLM Judge-Based Evaluation: Uses large language models (LLMs) such as GPT-4o and Gemini 1.5 Flash as judges that compare the assistant’s responses against ground-truth answers.
  3. Exploratory Data Analysis (EDA): Analyzed OpenROAD GitHub issues, retrieved through GitHub’s GraphQL API, and classified them by category (Build, Query, Installation, Runtime) and by tool.

  4. Future Work:

    • Incorporate GitHub Discussions data into the knowledge base.
    • Use the expanded dataset to improve the assistant’s RAG architecture.
    • Continue refining the assistant based on evaluation results.
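
As an illustration of the Data Engineering step, the sketch below shows one way GitHub issue records could be structured into JSONL (one JSON object per line). The field names (`title`, `body`, `category`, `tool`), the sample issues, and the helper functions are assumptions made for this example; they are not taken from the project's actual schema or code.

```python
import json


def issues_to_jsonl(issues):
    """Serialize issue dicts to a JSONL string: one JSON object per line."""
    lines = []
    for issue in issues:
        record = {
            "title": issue["title"].strip(),
            # An issue body may be missing or None, so fall back to "".
            "body": (issue.get("body") or "").strip(),
            "category": issue.get("category", "Unknown"),
            "tool": issue.get("tool", "Unknown"),
        }
        # json.dumps escapes embedded newlines, so each record stays on one line.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def jsonl_to_records(text):
    """Parse a JSONL string back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample data, for illustration only.
sample_issues = [
    {
        "title": " Build fails on a fresh checkout ",
        "body": "Steps to reproduce:\n1. Clone the repository\n2. Run the build",
        "category": "Build",
        "tool": "OpenROAD",
    },
    {"title": "How do I set the clock period?", "body": None, "category": "Query"},
]

jsonl_text = issues_to_jsonl(sample_issues)
records = jsonl_to_records(jsonl_text)
```

Keeping each record on a single line means the file can be streamed or appended to without parsing it as a whole, which is the usual reason for choosing JSONL over a single JSON array.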

The project is progressing well and aims to improve user support for OpenROAD.
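
The LLM judge-based evaluation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the prompt wording, the `CORRECT`/`INCORRECT` verdict format, and every function name are hypothetical, and the judge is passed in as a plain callable so that a real client for a model such as GPT-4o or Gemini 1.5 Flash could be substituted. A stub judge stands in for the model so the example runs offline.

```python
def build_judge_prompt(question, ground_truth, response):
    """Assemble the text sent to the judge model (hypothetical wording)."""
    return (
        "You are grading an answer about OpenROAD.\n"
        f"Question: {question}\n"
        f"Ground truth: {ground_truth}\n"
        f"Assistant response: {response}\n"
        "Reply with exactly one word: CORRECT or INCORRECT."
    )


def parse_verdict(judge_output):
    """Return True for CORRECT, False for INCORRECT, None if unparseable."""
    words = judge_output.strip().upper().split()
    first = words[0].strip(".,:;!") if words else ""
    if first == "CORRECT":
        return True
    if first == "INCORRECT":
        return False
    return None


def evaluate(examples, judge):
    """Score each example with the judge callable and aggregate the verdicts."""
    verdicts = []
    for ex in examples:
        prompt = build_judge_prompt(ex["question"], ex["ground_truth"], ex["response"])
        verdicts.append(parse_verdict(judge(prompt)))
    scored = [v for v in verdicts if v is not None]
    accuracy = sum(scored) / len(scored) if scored else 0.0
    return {"accuracy": accuracy, "unparsed": len(verdicts) - len(scored)}


def stub_judge(prompt):
    """Offline stand-in for an LLM: CORRECT if the ground truth appears verbatim."""
    fields = dict(line.split(": ", 1) for line in prompt.splitlines() if ": " in line)
    truth = fields["Ground truth"].lower()
    answer = fields["Assistant response"].lower()
    return "CORRECT" if truth in answer else "INCORRECT"


# Hypothetical abbreviation-style examples, for illustration only.
examples = [
    {
        "question": "What does CTS stand for?",
        "ground_truth": "clock tree synthesis",
        "response": "CTS stands for Clock Tree Synthesis.",
    },
    {
        "question": "What does DRC stand for?",
        "ground_truth": "design rule check",
        "response": "DRC means Data Rate Control.",
    },
]

result = evaluate(examples, stub_judge)
```

Tracking unparseable verdicts separately keeps a malformed judge reply from being silently counted as a wrong answer.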