# Model Configuration Design
This document describes how Large Language Models (LLMs) are defined, configured, and managed within the backend system, focusing primarily on the `allowed_models.json` file and the logic that loads and uses it (potentially in `model_manager.py` or integrated within `main.py`/`api.py`).
## 1. `allowed_models.json`

- **Purpose:** This file serves as a simple, external configuration for specifying which LLM identifiers are recognized and permitted for use by the backend. It acts as an allow-list.
- **Location:** `backend/allowed_models.json`
- **Format:** A JSON array of strings. Each string is a unique identifier for an LLM.

  Example `allowed_models.json`:

  ```json
  [
    "gpt-3.5-turbo",
    "gpt-4",
    "gpt-4-turbo",
    "claude-3-opus-20240229",
    "claude-3-sonnet-20240229",
    "openhermes-2.5-mistral-7b",
    "mistral-7b-instruct",
    "ollama-custom-model-id"
  ]
  ```

- **Management:**
  - To add support for a new model (assuming the LLM Manager can handle its provider), simply add its identifier string to this JSON array.
  - To disallow a previously supported model, remove its identifier from the array.
  - The application needs to be restarted (if it loads this file at startup) for changes to take effect.
## 2. Model Loading and Usage
- **Loading Logic** (in `model_manager.py` or `main.py`/`api.py`):
  - At application startup, the backend reads `allowed_models.json`.
  - The list of model identifiers is loaded into a Python list or set in memory (e.g., a global variable like `ALLOWED_MODELS`); see the sketch after this list.
- **Request Validation:**
  - When a request is made to the `/generate-test` endpoint, the `model_name` provided in the request payload is validated against this in-memory list of `ALLOWED_MODELS`.
  - If the requested `model_name` is not found in `ALLOWED_MODELS`, the API returns an error (e.g., `HTTPException(400, detail="Model not allowed or not found.")`).
- **Client-Side Information:**
  - The `GET /models` API endpoint exposes this list of `ALLOWED_MODELS` to clients. This allows user interfaces or other client applications to dynamically populate model selection dropdowns or inform users about supported models.
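The following is a minimal sketch of how these three steps could fit together, assuming a FastAPI backend (suggested by the `HTTPException` example above). The request model, its fields, and the endpoint bodies are illustrative assumptions, not the project's actual implementation:

```python
import json
from pathlib import Path

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Read the allow-list once at startup; changes to the file require a restart.
ALLOWED_MODELS_PATH = Path("backend/allowed_models.json")
ALLOWED_MODELS: set[str] = set(json.loads(ALLOWED_MODELS_PATH.read_text(encoding="utf-8")))


class GenerateTestRequest(BaseModel):
    model_name: str
    source_code: str  # hypothetical field; the real payload may differ


@app.post("/generate-test")
def generate_test(request: GenerateTestRequest):
    # Reject any model identifier that is not on the allow-list.
    if request.model_name not in ALLOWED_MODELS:
        raise HTTPException(status_code=400, detail="Model not allowed or not found.")
    # ...hand the validated model_name off to the LLM Manager here...
    return {"model_name": request.model_name, "status": "accepted"}


@app.get("/models")
def list_models() -> list[str]:
    # Expose the allow-list so clients can populate model-selection dropdowns.
    return sorted(ALLOWED_MODELS)
```

Holding the identifiers in a `set` rather than a list makes the per-request membership check constant-time, which is one reason the loading step mentions either structure.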
## 3. Relationship with LLM Manager
- While `allowed_models.json` lists *which* models can be requested, the LLM Manager is responsible for *how* to instantiate a client for a model once it is deemed allowed.
- The LLM Manager uses the `model_name` (which has already been validated against `allowed_models.json`) to infer the provider and configure the appropriate SDK client; a sketch of this inference follows.
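As a hedged illustration of that inference step, the prefix-based mapping below is an assumption derived from the identifiers in the example allow-list; the actual `llm_manager.py` may use a different scheme entirely:

```python
def infer_provider(model_name: str) -> str:
    """Map a validated model identifier to a provider name (illustrative only)."""
    if model_name.startswith("gpt-"):
        return "openai"
    if model_name.startswith("claude-"):
        return "anthropic"
    # Fallback assumption: treat everything else (e.g., mistral/openhermes
    # variants or "ollama-custom-model-id") as a locally hosted Ollama model.
    return "ollama"
```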
## 4. Benefits of this Approach
- **Configuration over Code:** Model support can be updated by editing a JSON file without changing Python code (for models whose providers are already supported by `llm_manager.py`).
- **Clarity:** Provides a clear, single source of truth for which models are intended to be used.
- **Security (Basic):** Prevents arbitrary model names from being passed to the LLM client instantiation logic, offering a basic level of control.