Include a project as context - amosproj/amos2025ss04-ai-driven-testing GitHub Wiki
Overview
Augmented generation is a setting in which a trained model's knowledge base is extended with additional information. In our case, we want to use this to include projects or existing files as context, so that the models can eventually create tests for them. There are multiple ways to give a model access to an entire project.
RAG
I chose Retrieval Augmented Generation (RAG) for the following reasons:
- RAG allows us to access the specific files we want to create tests for, instead of the whole project as context. This enables us to stay within all necessary character limits for queries.
- The database can be modified easily, which is necessary for creating tests for continually changing code in development.
- The sources used are provided by the retriever, which enables developers to better understand the generated response.
RAG works like this:
Fill database:
- The project is uploaded into a (vector) database using an embedding model.
- The data is separated into chunks and then sent to an embedding model.
- This generates a vector location for each chunk and stores it in a vector database.
- If the database is not deleted, features can be added later, such as updating the stored data when the original files change, or checking whether certain files already exist in the database and adding only the "new" ones. These features are not yet present in the proof of concept.
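The fill-database steps above can be sketched roughly as follows. This is a minimal, self-contained illustration: the `embed` function is a toy stand-in for a real embedding model, the fixed-size `chunk` function is deliberately naive, and the "vector database" is just an in-memory list. All names and the example file are illustrative, not part of the proof of concept.

```python
import hashlib
import math

def embed(text: str, dims: int = 8) -> list[float]:
    # Toy stand-in for a real embedding model: derives a deterministic
    # unit-length vector from a hash of the text. A real setup would
    # call an actual embedding model here instead.
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in digest[:dims]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(source: str, size: int = 200) -> list[str]:
    # Naive fixed-size chunking; real pipelines often split on function
    # or class boundaries and add some overlap between chunks.
    return [source[i:i + size] for i in range(0, len(source), size)]

# In-memory "vector database": one record per chunk.
vector_db: list[dict] = []

def add_file(path: str, content: str) -> None:
    # Separate the file into chunks, embed each chunk, and store the
    # resulting vector together with the text and its source file.
    for piece in chunk(content):
        vector_db.append({"vector": embed(piece), "text": piece, "source": path})

add_file("calculator.py", "def add(a, b):\n    return a + b\n" * 20)
print(len(vector_db), "chunks stored")  # 640 characters -> 4 chunks of <= 200
```

In a real pipeline, splitting on syntactic boundaries (functions, classes) usually retrieves better than fixed character windows, because each chunk then carries a complete, testable unit.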
Prompt:
- Each user prompt is also sent to a RAG retriever.
- This RAG retriever sends the prompt to the same embedding model, performs a similarity search, and returns the most relevant chunks, i.e. those most likely to contain the information needed to answer the user's prompt. It is very important that the same embedding model is used; otherwise, the wrong chunks may be returned as context.
- The information from these chunks can then be placed into the context window of the LLM.
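The retrieval side can be sketched in the same spirit. Again everything here is illustrative: the toy hash-based `embed` produces no *semantic* similarity (a real embedding model does), the database is a hard-coded list standing in for chunks stored earlier, and `build_prompt` shows only the general shape of stuffing retrieved chunks into the LLM's context.

```python
import hashlib
import math

def embed(text: str, dims: int = 8) -> list[float]:
    # Toy deterministic embedding; stands in for the real embedding
    # model, which must be the SAME one used when filling the database.
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in digest[:dims]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Dot product suffices because all vectors are unit-length.
    return sum(x * y for x, y in zip(a, b))

# Pretend these chunks were stored earlier when filling the database.
vector_db = [
    {"vector": embed(t), "text": t, "source": s}
    for t, s in [
        ("def add(a, b): return a + b", "calculator.py"),
        ("def send_mail(to, body): ...", "mailer.py"),
    ]
]

def retrieve(prompt: str, k: int = 1) -> list[dict]:
    # Similarity search: rank stored chunks against the embedded prompt.
    query_vec = embed(prompt)
    ranked = sorted(vector_db, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]

def build_prompt(user_prompt: str) -> str:
    # Place the retrieved chunks (with their sources) into the context
    # window ahead of the actual task.
    context = "\n".join(f"# {c['source']}\n{c['text']}" for c in retrieve(user_prompt))
    return f"Context:\n{context}\n\nTask:\n{user_prompt}"

print(build_prompt("Write unit tests for the add function"))
```

Because each retrieved chunk carries its `source`, the final prompt (and the response shown to the developer) can cite which files the answer was based on, which is the traceability benefit mentioned above.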
Alternatives:
CAG:
- All documents are formatted to fit inside the model's context window.
- The LLM takes this input and processes it. The model's internal state after processing, the key-value cache (KV-cache), is captured and stored.
- A user can submit a query in addition to the KV-cache. No extra reprocessing needs to occur for the KV-cache.
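Conceptually, CAG trades one expensive preprocessing pass for cheap queries afterwards. The sketch below only simulates that idea in plain Python (no real transformer is involved): `process_documents` stands in for the forward pass whose KV-cache is captured, and the counter shows that it runs exactly once no matter how many queries follow. All names are illustrative.

```python
PROCESS_CALLS = 0

def process_documents(docs: list[str]) -> dict:
    # Stands in for the expensive forward pass over all documents whose
    # key/value activations (the KV-cache) are then captured and stored.
    global PROCESS_CALLS
    PROCESS_CALLS += 1
    return {"state": " ".join(docs)}

class CagModel:
    def __init__(self, docs: list[str]):
        # The "KV-cache" is computed once, up front.
        self.kv_cache = process_documents(docs)

    def query(self, prompt: str) -> str:
        # Only the new prompt needs processing; the cached document
        # state is reused without any reprocessing.
        return f"answer to {prompt!r} using {len(self.kv_cache['state'])} cached chars"

model = CagModel(["doc one", "doc two"])
print(model.query("first question"))
print(model.query("second question"))
print("document processing ran", PROCESS_CALLS, "time(s)")
```

The flip side, and the reason RAG was chosen here, is that everything must fit in the context window at once, and any change to the documents invalidates the cache and forces a full reprocessing pass.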
Fine tuning:
- Retrain model on specific data.
- This has a high computational cost and does not allow for dynamic changes.
Sources:
- Medium - This site goes into detail and explains the different approaches very nicely.
- [IBM](https://www.youtube.com/watch?v=HdafI0t3sEY)
- [IBM2](https://www.youtube.com/watch?v=00Q0G84kq3M)
- [pixegami](https://www.youtube.com/watch?v=2TJxpyO3ei4)
- [Agentforce](https://www.salesforce.com/eu/agentforce/what-is-rag/?d=701ed000000wZq4AAE&nc=701ed000000x22PAAQ&utm_content=701ed000000wZq4AAE&utm_source=google&utm_medium=paid_search&utm_campaign=21860714809&utm_adgroup=177401895771&utm_term=retrieval%20augmented%20generation&utm_matchtype=e&gclsrc=aw.ds&gad_source=1&gad_campaignid=21860714809&gclid=Cj0KCQjwotDBBhCQARIsAG5pinNYfZVoSXZBwGfch_mXzVyOICKdru7orCWDIKZNvk0aS1SUSW9aFMQaAqFEEALw_wcB)
- [Janssen](https://dev.to/stephanj/the-power-of-full-project-context-using-llms-463c)