Model Configuration Design

This document describes how Large Language Models (LLMs) are defined, configured, and managed within the backend system. It focuses on the allowed_models.json file and the logic that loads and uses it, which may live in model_manager.py or be integrated into main.py/api.py.

1. allowed_models.json

  • Purpose: This file serves as a simple, external configuration for specifying which LLM identifiers are recognized and permitted for use by the backend. It acts as an allow-list.

  • Location: backend/allowed_models.json

  • Format: A JSON array of strings. Each string is a unique identifier for an LLM.

    Example allowed_models.json:

    [
      "gpt-3.5-turbo",
      "gpt-4",
      "gpt-4-turbo",
      "claude-3-opus-20240229",
      "claude-3-sonnet-20240229",
      "openhermes-2.5-mistral-7b",
      "mistral-7b-instruct",
      "ollama-custom-model-id"
    ]
    
  • Management:

    • To add support for a new model (assuming the LLM Manager can already handle its provider), add its identifier string to this JSON array.
    • To disallow a previously supported model, remove its identifier from the array.
    • If the application reads this file only at startup, it must be restarted for changes to take effect (a minimal loading sketch follows this list).
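
A minimal sketch of how the backend might load this file at startup, assuming the file lives at backend/allowed_models.json as documented above. The function and variable names (load_allowed_models, ALLOWED_MODELS) are illustrative; the actual logic may live in model_manager.py or main.py/api.py (see Section 2).

    import json
    from pathlib import Path

    # Assumed path, based on the documented location backend/allowed_models.json.
    ALLOWED_MODELS_PATH = Path(__file__).parent / "allowed_models.json"

    def load_allowed_models(path: Path = ALLOWED_MODELS_PATH) -> set[str]:
        """Read the allow-list and return it as a set for fast membership checks."""
        with path.open("r", encoding="utf-8") as f:
            models = json.load(f)
        if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
            raise ValueError("allowed_models.json must be a JSON array of strings")
        return set(models)

    # Loaded once at startup; a restart is required to pick up file changes.
    ALLOWED_MODELS: set[str] = load_allowed_models()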

2. Model Loading and Usage

  • Loading Logic (in model_manager.py or main.py/api.py):

    • At application startup, the backend reads allowed_models.json.
    • The model identifiers are loaded into an in-memory Python list or set (e.g., a module-level variable such as ALLOWED_MODELS; see the loading sketch at the end of Section 1).
  • Request Validation:

    • When a request is made to the /generate-test endpoint, the model_name provided in the request payload is validated against the in-memory ALLOWED_MODELS.
    • If the requested model_name is not in ALLOWED_MODELS, the API returns an error (e.g., HTTPException(400, detail="Model not allowed or not found.")). A combined sketch of this validation and the /models endpoint follows this list.
  • Client-Side Information:

    • The GET /models API endpoint exposes this list of ALLOWED_MODELS to clients. This allows user interfaces or other client applications to dynamically populate model selection dropdowns or inform users about supported models.
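
The following minimal FastAPI sketch ties these two behaviors together. The endpoint paths and the model_name field come from this document; the request schema, the model_config module, and the error wording are illustrative assumptions, with ALLOWED_MODELS loaded as in the Section 1 sketch.

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    from model_config import ALLOWED_MODELS  # hypothetical module from the Section 1 sketch

    app = FastAPI()

    class GenerateTestRequest(BaseModel):
        model_name: str
        prompt: str  # assumed field; the real payload may carry more context

    @app.get("/models")
    def list_models() -> list[str]:
        """Expose the allow-list so clients can populate model-selection dropdowns."""
        return sorted(ALLOWED_MODELS)

    @app.post("/generate-test")
    def generate_test(request: GenerateTestRequest):
        """Reject any model identifier that is not on the allow-list before doing work."""
        if request.model_name not in ALLOWED_MODELS:
            raise HTTPException(status_code=400, detail="Model not allowed or not found.")
        # ... hand off to the LLM Manager (see Section 3) ...
        return {"status": "accepted", "model": request.model_name}

With this in place, a client can discover the supported models via GET /models and receives a 400 response if it posts an unknown model_name to /generate-test.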

3. Relationship with LLM Manager

  • While allowed_models.json determines which models may be requested, the LLM Manager determines how to instantiate a client for a model once it is deemed allowed.
  • The LLM Manager uses the model_name (already validated against allowed_models.json) to infer the provider and configure the appropriate SDK client, as sketched below.
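
As an illustration only, the provider-inference step might look like the sketch below; the prefix rules are hypothetical and not the actual mapping in llm_manager.py.

    # Hypothetical provider inference inside the LLM Manager; the real
    # mapping rules in llm_manager.py may differ.
    def infer_provider(model_name: str) -> str:
        """Map an already-validated model identifier to a provider label."""
        if model_name.startswith("gpt-"):
            return "openai"
        if model_name.startswith("claude-"):
            return "anthropic"
        # Fall back to a locally served model, e.g. via Ollama.
        return "ollama"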

4. Benefits of this Approach

  • Configuration over Code: Model support can be updated by editing a JSON file without changing Python code (for models whose providers are already supported by llm_manager.py).
  • Clarity: Provides a clear, single source of truth for which models are intended to be used.
  • Security (Basic): Prevents arbitrary model names from reaching the LLM client instantiation logic, providing a coarse but useful layer of control.