# Running AI LLM Projects in CI
## 🎯 Goal
Evaluate the feasibility of running AI-based projects (especially LLMs) in Continuous Integration (CI) pipelines.
## ✅ Summary
- Running AI models, especially LLMs, inside CI pipelines is feasible and increasingly common.
- Projects like Ollama and `transformers` support containerized execution, making them well suited for CI integration.
- Self-hosted runners with GPU support allow for local inference workloads within CI jobs.
- A strong and reusable approach is to Dockerize the AI project so it can be run across different GitHub Actions pipelines as a prebuilt image. However, appropriate modules are still needed for interaction with the CI pipeline.
## 💡 Key Findings
### 🔧 How CI Runners Work with AI
- GitHub Actions supports both cloud-hosted and self-hosted runners.
- Self-hosted runners can be provisioned on local machines, servers, or even containers with access to GPUs (e.g., via `nvidia-docker`).
- These runners can execute AI-related workflows (see the sketch after this list) such as:
  - Model inference and evaluation
  - Prompt testing
  - Auto-generated tests
  - Code summarization and documentation
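
A minimal workflow sketch of this setup, assuming a self-hosted runner registered with a `gpu` label, the NVIDIA container toolkit installed on the host, and a placeholder model name; none of these values are taken from this project:

```yaml
# Sketch only: runner labels, model name, and wait time are assumptions.
name: llm-smoke-test

on: [push]

jobs:
  inference:
    runs-on: [self-hosted, gpu]   # self-hosted runner registered with a "gpu" label
    steps:
      - uses: actions/checkout@v4

      # Start an Ollama container with GPU access (needs the NVIDIA container toolkit).
      - name: Start Ollama
        run: |
          docker run -d --gpus all --name ollama -p 11434:11434 ollama/ollama
          sleep 15   # crude wait for the server to come up

      # Pull a model and run one prompt as a smoke test.
      - name: Run a test prompt
        run: |
          docker exec ollama ollama pull llama3
          docker exec ollama ollama run llama3 "Write a unit test skeleton for a function that adds two numbers."

      - name: Clean up
        if: always()
        run: docker rm -f ollama
```

The same pattern covers prompt testing, auto-generated tests, or documentation tasks by swapping the prompt or the command in the inference step.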
### 🔄 Projects Using AI in CI
- Ollama in CI: Examples exist where Ollama is run in GitHub Actions via Docker, using a local LLM to run inference tasks (a sketch of this pattern follows the list).
- ExecutionAgent: A prototype project where an LLM autonomously sets up and runs tests in a CI-like loop.
- awesome-local-llms: A GitHub repo listing local-first LLMs, many of which are CI-compatible using Docker.
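
As an illustration of the Ollama-in-CI pattern, the hypothetical job below starts Ollama as a service container on a GitHub-hosted runner and calls its HTTP API. The model tag, port mapping, and timings are placeholders, CPU-only inference on hosted runners is realistic only for small models, and the request fields should be checked against the Ollama API docs for the version in use:

```yaml
# Illustrative only: model tag and timings are placeholders.
name: ollama-in-ci

on: [workflow_dispatch]

jobs:
  prompt-test:
    runs-on: ubuntu-latest
    services:
      ollama:
        image: ollama/ollama
        ports:
          - 11434:11434
    steps:
      # Service containers have no health check here, so poll until the server responds.
      - name: Wait for Ollama to accept connections
        run: |
          for i in $(seq 1 30); do
            curl -sf http://localhost:11434/ && exit 0
            sleep 2
          done
          exit 1

      # Pull a small model and run one prompt through the HTTP API.
      - name: Run inference via the API
        run: |
          curl -sf http://localhost:11434/api/pull -d '{"model": "llama3.2:1b"}'
          curl -sf http://localhost:11434/api/generate \
            -d '{"model": "llama3.2:1b", "prompt": "Say hello from CI.", "stream": false}'
```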
## 📦 Recommended Approach: Dockerize the AI Project
Dockerizing the app makes it:
| Benefit | Description |
|---|---|
| Portable | Can be used in GitHub Actions (or other CI tools), locally, and in other repos or CI pipelines |
| Consistent | Same behavior across environments |
| Reusable | Prebuilt image can be shared across projects |
| Scalable | Works with self-hosted GPU runners |
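
A sketch of how another repository could reuse such a prebuilt image; the image name and entry command are hypothetical placeholders, not this project's actual artifacts:

```yaml
# Hypothetical reuse of a prebuilt, Dockerized AI project image.
name: reuse-prebuilt-image

on: [pull_request]

jobs:
  generate-tests:
    runs-on: ubuntu-latest
    # Run every step inside the prebuilt image, so behavior matches local runs
    # and other pipelines that pull the same image.
    container:
      image: ghcr.io/example-org/ai-driven-testing:latest   # placeholder image name
    steps:
      - uses: actions/checkout@v4
      - name: Run the AI tooling against this repository
        run: ai-testing-cli --input ./src --output ./generated_tests   # placeholder command
```

A private registry additionally needs a `credentials` entry under `container`, and GPU-backed inference still requires the self-hosted runner setup described above.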
## 📚 Sources
- Ollama Docs – Running Ollama in CI