Welcome to the amos2025ss04-ai-driven-testing wiki!
# Our project

## How to use
User and Developer Guide: for a quick start and the correct setup. You may also want to check out specific sections for certain applications:
- How to start the Web-interface Frontend: tutorial on how to start the frontend.
- Ollama Setup Information: possible ways to use Ollama to run LLMs.
- Possible Ways to run Ollama and Settings (a minimal usage sketch follows right after this list).
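
As a quick orientation before diving into the linked Ollama pages, here is a minimal sketch of how a locally running Ollama instance can be queried over its REST API. It assumes Ollama is already installed and serving on its default port 11434, and that the model named in the script (`mistral` is only an example) has been pulled beforehand; it is not the project's actual backend code.

```python
# Minimal sketch: send a prompt to a locally running Ollama instance.
# Assumes Ollama is serving on its default port (11434) and that the
# model used below ("mistral" is only an example) has been pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def ask_ollama(prompt: str, model: str = "mistral") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # return the full answer as one JSON object
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    print(ask_ollama("Write a pytest unit test for a function add(a, b)."))
```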
## How to work on this project
- Contributing
- How to be a Release Manager: how to do the weekly release. This is relevant for any future team that takes part in the AMOS course.
## Structure
- Architecture: overview of the system architecture and component interactions.
- Backend API Design: the structure that receives prompts, interacts with LLMs, and returns the results to the client.
- CI Pipeline: overview of the CI/CD (continuous integration) pipeline for the project.
- Execution Flow Design: the sequential process in our project.
- Model Configuration Design: how we handle models.
- Module System Design: how we handle extensions.
- Prompt Design: how we send prompts in compact and effective formats.
- Robot Framework: information about potential integration with the Robot Framework.
## Models
A list of all language models evaluated for use in the project. We only chose model versions that are available under an open-source license.
- DeepCoder
- DeepSeek‐Coder V1
- Google Gemma 3
- LLMs incompatibility with our project: overview of models that were evaluated but deemed unsuitable.
- Mistral AI
- OpenHermes 2.5
- Phi4‐Mini
- Phi‑4 Reasoning
- Qwen 2.5 Coder
- Qwen3
- Smollm2
- StarCoder
- StarCoder2
- TinyLlama
## Our research and experiments
- Docker Performance: performance of different Docker configurations when running LLMs.
- Evaluating Large Language Model Responses to Spelling Errors
- LLM Code Understanding Evaluation
- Comparison of 1b, 3b, 7b and 14b models
- Running AI LLM Projects in CI
## LLM components and assessment tools
- AI‐Model-Benchmark: standard benchmarks used to evaluate LLMs.
- Benefits of chaining LLMs
- Code Complexity: description of the code complexity metrics used for evaluation, including MCC and CCC.
- Code Coverage
- Include a project as context: explains how to include repositories in the LLM prompt as context using RAG, and the alternatives.
- Iterative Refinement (Multi‐Pass Generation): passing the model's response back in as the next prompt input (see the sketch below).
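
To illustrate the idea behind the iterative-refinement page, the sketch below feeds each model response back in as part of the next prompt. The `ask_ollama` helper mirrors the example given in the "How to use" section; the refinement instruction and the number of passes are arbitrary placeholder choices, not the project's actual configuration.

```python
# Illustrative sketch of multi-pass generation: the previous response
# becomes part of the next prompt. The helper, prompts, and pass count
# are placeholders, not the project's real settings.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def ask_ollama(prompt: str, model: str = "mistral") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def iterative_refinement(task: str, passes: int = 3) -> str:
    answer = ask_ollama(task)
    for _ in range(passes - 1):
        # Each pass asks the model to improve its own previous output.
        answer = ask_ollama(
            f"{task}\n\nHere is a previous attempt:\n{answer}\n\n"
            "Improve this answer and return only the revised version."
        )
    return answer


if __name__ == "__main__":
    print(iterative_refinement("Write a pytest unit test for a function add(a, b)."))
```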
## Training an LLM by yourself
- How to train an LLM
- How to run on the FAU HPC
- Choosing the Right Dataset for LLM Training on the University HPC
- How to finetune an LLM
Maintenance of this wiki officially stops on 16.07.2025, the demo day of our project, as no further contributions are expected from the team after that date.