Model Configuration Design - amosproj/amos2025ss04-ai-driven-testing GitHub Wiki
# Model Configuration Design
This document describes how Large Language Models (LLMs) are defined, configured, and managed within the backend system, primarily focusing on the `allowed_models.json` file and the logic that loads and uses it (potentially in `model_manager.py`, or integrated within `main.py`/`api.py`).
## 1. `allowed_models.json`
- **Purpose:** Serves as a simple, external configuration specifying which LLM identifiers are recognized and permitted for use by the backend. It acts as an allow-list.
- **Location:** `backend/allowed_models.json`
- **Format:** A JSON array of strings; each string is a unique identifier for an LLM.

Example `allowed_models.json`:

```json
[
  "gpt-3.5-turbo",
  "gpt-4",
  "gpt-4-turbo",
  "claude-3-opus-20240229",
  "claude-3-sonnet-20240229",
  "openhermes-2.5-mistral-7b",
  "mistral-7b-instruct",
  "ollama-custom-model-id"
]
```
- **Management:**
  - To add support for a new model (assuming the LLM Manager can handle its provider), simply add its identifier string to this JSON array.
  - To disallow a previously supported model, remove its identifier from the array.
  - Because the file is loaded at startup, the application must be restarted for changes to take effect.
## 2. Model Loading and Usage
- **Loading Logic** (in `model_manager.py` or `main.py`/`api.py`):
  - At application startup, the backend reads `allowed_models.json`.
  - The list of model identifiers is loaded into a Python list or set in memory (e.g., a global variable like `ALLOWED_MODELS`).
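The loading step above can be sketched as follows; the function name, default path, and the `ALLOWED_MODELS` global are assumptions for illustration, not the project's actual identifiers:

```python
import json


def load_allowed_models(path: str = "backend/allowed_models.json") -> set[str]:
    """Read the allow-list file and return its identifiers as a set.

    A set is used so later membership checks are O(1).
    """
    with open(path, encoding="utf-8") as f:
        return set(json.load(f))


# Loaded once at application startup, e.g. as a module-level global:
# ALLOWED_MODELS = load_allowed_models()
```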
- **Request Validation:**
  - When a request is made to the `/generate-test` endpoint, the `model_name` provided in the request payload is validated against this in-memory list of `ALLOWED_MODELS`.
  - If the requested `model_name` is not found in `ALLOWED_MODELS`, the API returns an error (e.g., `HTTPException(status_code=400, detail="Model not allowed or not found.")`).
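A minimal sketch of that check, assuming an in-memory `ALLOWED_MODELS` set; the real endpoint would raise `fastapi.HTTPException`, replaced here by a plain `ValueError` so the snippet stays framework-free:

```python
ALLOWED_MODELS = {"gpt-3.5-turbo", "gpt-4", "mistral-7b-instruct"}  # example values


def validate_model_name(model_name: str) -> str:
    """Reject any model identifier that is not on the allow-list."""
    if model_name not in ALLOWED_MODELS:
        # In the FastAPI endpoint this would be:
        #   raise HTTPException(status_code=400, detail="Model not allowed or not found.")
        raise ValueError("Model not allowed or not found.")
    return model_name
```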
- **Client-Side Information:**
  - The `GET /models` API endpoint exposes this list of `ALLOWED_MODELS` to clients, allowing user interfaces or other client applications to dynamically populate model-selection dropdowns or inform users about supported models.
## 3. Relationship with LLM Manager
- While `allowed_models.json` lists *which* models can be requested, the LLM Manager is responsible for *how* to instantiate a client for a model once it is deemed allowed.
- The LLM Manager uses the `model_name` (which has already been validated against `allowed_models.json`) to infer the provider and configure the appropriate SDK client.
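That inference step might look like the following; the prefix rules are purely illustrative assumptions about how `llm_manager.py` could map identifiers to providers:

```python
def infer_provider(model_name: str) -> str:
    """Guess the provider from an already-validated model identifier."""
    if model_name.startswith("gpt-"):
        return "openai"
    if model_name.startswith("claude-"):
        return "anthropic"
    # Everything else is assumed to be served locally, e.g. via Ollama.
    return "ollama"
```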
## 4. Benefits of this Approach
- **Configuration over Code:** Model support can be updated by editing a JSON file, without changing Python code (for models whose providers are already supported by `llm_manager.py`).
- **Clarity:** Provides a clear, single source of truth for which models are intended to be used.
- **Security (Basic):** Prevents arbitrary model names from being passed to the LLM client instantiation logic, offering a basic level of control.