Model Configuration Design

This document describes how Large Language Models (LLMs) are defined, configured, and managed within the backend system. It focuses on the allowed_models.json file and the logic that loads and uses it, which may live in model_manager.py or be integrated into main.py/api.py.

1. allowed_models.json

  • Purpose: This file serves as a simple, external configuration for specifying which LLM identifiers are recognized and permitted for use by the backend. It acts as an allow-list.

  • Location: backend/allowed_models.json

  • Format: A JSON array of strings. Each string is a unique identifier for an LLM.

    Example allowed_models.json:

    [
      "gpt-3.5-turbo",
      "gpt-4",
      "gpt-4-turbo",
      "claude-3-opus-20240229",
      "claude-3-sonnet-20240229",
      "openhermes-2.5-mistral-7b",
      "mistral-7b-instruct",
      "ollama-custom-model-id"
    ]
    
  • Management:

    • To add support for a new model (assuming the LLM Manager can already handle its provider), add its identifier string to this JSON array.
    • To disallow a previously supported model, remove its identifier from the array.
    • If the application reads this file only at startup, it must be restarted for changes to take effect (a minimal loading sketch follows this list).
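
A minimal sketch of how the backend might load this file at startup, assuming the file lives at backend/allowed_models.json as documented above. The function and variable names (load_allowed_models, ALLOWED_MODELS) are illustrative; the actual logic may live in model_manager.py or main.py/api.py (see Section 2).

    import json
    from pathlib import Path

    # Assumed path, based on the documented location backend/allowed_models.json.
    ALLOWED_MODELS_PATH = Path(__file__).parent / "allowed_models.json"

    def load_allowed_models(path: Path = ALLOWED_MODELS_PATH) -> set[str]:
        """Read the allow-list and return it as a set for fast membership checks."""
        with path.open("r", encoding="utf-8") as f:
            models = json.load(f)
        if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
            raise ValueError("allowed_models.json must be a JSON array of strings")
        return set(models)

    # Loaded once at startup; a restart is required to pick up file changes.
    ALLOWED_MODELS: set[str] = load_allowed_models()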

2. Model Loading and Usage

  • Loading Logic (in model_manager.py or main.py/api.py):

    • At application startup, the backend reads allowed_models.json.
    • The model identifiers are loaded into an in-memory Python list or set (e.g., a module-level variable such as ALLOWED_MODELS; see the loading sketch at the end of Section 1).
  • Request Validation:

    • When a request is made to the /generate-test endpoint, the model_name provided in the request payload is validated against the in-memory ALLOWED_MODELS.
    • If the requested model_name is not in ALLOWED_MODELS, the API returns an error (e.g., HTTPException(400, detail="Model not allowed or not found.")). A combined sketch of this validation and the /models endpoint follows this list.
  • Client-Side Information:

    • The GET /models API endpoint exposes this list of ALLOWED_MODELS to clients. This allows user interfaces or other client applications to dynamically populate model selection dropdowns or inform users about supported models.
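
The following minimal FastAPI sketch ties these two behaviors together. The endpoint paths and the model_name field come from this document; the request schema, the model_config module, and the error wording are illustrative assumptions, with ALLOWED_MODELS loaded as in the Section 1 sketch.

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    from model_config import ALLOWED_MODELS  # hypothetical module from the Section 1 sketch

    app = FastAPI()

    class GenerateTestRequest(BaseModel):
        model_name: str
        prompt: str  # assumed field; the real payload may carry more context

    @app.get("/models")
    def list_models() -> list[str]:
        """Expose the allow-list so clients can populate model-selection dropdowns."""
        return sorted(ALLOWED_MODELS)

    @app.post("/generate-test")
    def generate_test(request: GenerateTestRequest):
        """Reject any model identifier that is not on the allow-list before doing work."""
        if request.model_name not in ALLOWED_MODELS:
            raise HTTPException(status_code=400, detail="Model not allowed or not found.")
        # ... hand off to the LLM Manager (see Section 3) ...
        return {"status": "accepted", "model": request.model_name}

With this in place, a client can discover the supported models via GET /models and receives a 400 response if it posts an unknown model_name to /generate-test.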

3. Relationship with LLM Manager

  • While allowed_models.json determines which models may be requested, the LLM Manager determines how to instantiate a client for a model once it is deemed allowed.
  • The LLM Manager uses the model_name (already validated against allowed_models.json) to infer the provider and configure the appropriate SDK client, as sketched below.
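
As an illustration only, the provider-inference step might look like the sketch below; the prefix rules are hypothetical and not the actual mapping in llm_manager.py.

    # Hypothetical provider inference inside the LLM Manager; the real
    # mapping rules in llm_manager.py may differ.
    def infer_provider(model_name: str) -> str:
        """Map an already-validated model identifier to a provider label."""
        if model_name.startswith("gpt-"):
            return "openai"
        if model_name.startswith("claude-"):
            return "anthropic"
        # Fall back to a locally served model, e.g. via Ollama.
        return "ollama"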

4. Benefits of this Approach

  • Configuration over Code: Model support can be updated by editing a JSON file without changing Python code (for models whose providers are already supported by llm_manager.py).
  • Clarity: Provides a clear, single source of truth for which models are intended to be used.
  • Security (Basic): Prevents arbitrary model names from reaching the LLM client instantiation logic, providing a coarse but useful layer of control.