GSoC 2024 Contributions ‐ Aviral Kaintura - The-OpenROAD-Project/ORAssistant GitHub Wiki
The project aims to develop a conversational assistant for OpenROAD, with a focus on data engineering and evaluation. Key contributions include:
- Data Engineering: Curating and processing OpenROAD-related data, including GitHub issues, and structuring it into JSONL format for easier use by the assistant.
- Automated Evaluation Systems:
  - Basic Abbreviation Evaluation: Ensures the assistant can correctly identify and explain abbreviations used within the OpenROAD community.
  - LLM Judge-Based Evaluation: Uses large language models (LLMs) like GPT-4o and Gemini 1.5 Flash to assess the assistant’s responses by comparing them with ground truth answers.
- Exploratory Data Analysis (EDA): Analyzed OpenROAD GitHub issues, retrieved through GitHub’s GraphQL API, and classified them by category (Build, Query, Installation, Runtime) and by tool.
- Future Work:
  - Incorporate GitHub Discussions data into the knowledge base.
  - Use the expanded dataset to improve the assistant’s RAG architecture.
  - Continuously refine and improve the assistant based on evaluation results.
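
To make the data engineering step more concrete, here is a minimal sketch of how GitHub issues might be structured as JSONL (one JSON object per line). The field names (`number`, `title`, `body`, `labels`) and the helper functions are assumptions for illustration, not the project's actual schema or code.

```python
import json


def issue_to_record(issue: dict) -> dict:
    """Flatten one raw issue (hypothetical shape) into a compact record."""
    return {
        "id": issue["number"],
        "title": issue["title"].strip(),
        # A missing or null body becomes an empty string.
        "body": (issue.get("body") or "").strip(),
        "labels": sorted(issue.get("labels", [])),
    }


def to_jsonl(issues: list[dict]) -> str:
    """Serialize issues as JSONL: one JSON object per line."""
    return "\n".join(
        json.dumps(issue_to_record(issue), ensure_ascii=False) for issue in issues
    )


# Made-up sample data, not real OpenROAD issues.
sample_issues = [
    {"number": 1, "title": " Build fails on Ubuntu ", "body": "cmake error", "labels": ["Build"]},
    {"number": 2, "title": "How do I run global placement?", "labels": ["Query"]},
]

jsonl_text = to_jsonl(sample_issues)
print(jsonl_text)
```

Because `json.dumps` escapes newlines inside strings, each record stays on a single line even when an issue body spans several paragraphs, which keeps the file easy to stream line by line.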
The project is progressing well and aims to enhance OpenROAD’s user support capabilities.
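
As an illustration of the basic abbreviation evaluation listed above, the following sketch scores an assistant by checking whether the expected expansion of each abbreviation appears in its answer. The function name `abbreviation_accuracy`, the question template, the substring-matching rule, and the stand-in `fake_assistant` are all assumptions; the project's actual evaluation may work differently.

```python
from typing import Callable


def abbreviation_accuracy(ground_truth: dict[str, str], ask: Callable[[str], str]) -> float:
    """Return the fraction of abbreviations whose expected expansion
    appears (case-insensitively) in the assistant's answer."""
    if not ground_truth:
        return 0.0
    hits = 0
    for abbreviation, expansion in ground_truth.items():
        answer = ask(f"What does {abbreviation} stand for?")
        if expansion.lower() in answer.lower():
            hits += 1
    return hits / len(ground_truth)


def fake_assistant(question: str) -> str:
    """Hypothetical stand-in for the real assistant, with canned answers."""
    canned = {
        "DRC": "DRC stands for Design Rule Check.",
        "CTS": "CTS is a step in the flow.",  # deliberately unhelpful answer
    }
    for abbreviation, answer in canned.items():
        if abbreviation in question:
            return answer
    return "I am not sure."


expected = {"DRC": "design rule check", "CTS": "clock tree synthesis"}
score = abbreviation_accuracy(expected, fake_assistant)
print(f"abbreviation accuracy: {score:.2f}")
```

Substring matching is strict about wording, so a correct but paraphrased expansion would count as a miss; that limitation is one reason to complement this check with an LLM judge that compares answers against ground truth more flexibly.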