Possible Ways to run Ollama

Ollama Setup Information

Ollama allows you to run open-source large language models, such as Llama 3, Mistral, Gemma, and others, locally on your own machine. This guide provides information on how to set up and use Ollama with the AI-Driven Testing project.

Our backend can interact with models served by Ollama through its OpenAI-compatible API endpoint.

1. Installing Ollama

  1. Download Ollama: Visit the official Ollama website at ollama.com (or ollama.ai) and download the installer for your operating system (macOS, Windows, Linux).
  2. Run the Installer: Follow the installation instructions provided.
    • On Linux, this typically means running a single curl command (see the example after this list).
    • On Windows and macOS, it's typically a standard application installer.
  3. Verify Installation (Optional): After installation, you can open a terminal or command prompt and type ollama --version to check if it's installed correctly.
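For reference, the Linux install script can be fetched and run with a single curl command; this is the command published on ollama.com at the time of writing, so check the website for the current version:

    curl -fsSL https://ollama.com/install.sh | sh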

2. Running the Ollama Server

By default, the Ollama server starts automatically after installation or whenever you launch the Ollama application.

  • macOS: Ollama typically runs as a menu bar application.
  • Windows: Ollama runs as a background service.
  • Linux: Ollama runs as a systemd service (ollama.service). You can check its status with systemctl status ollama.

If it's not running, you might need to start it manually (e.g., sudo systemctl start ollama on Linux, or by launching the Ollama app on macOS/Windows).

The Ollama server, by default, listens on http://localhost:11434.
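A quick way to confirm the server is reachable is to send a plain HTTP request to that address; Ollama responds with a short status message:

    curl http://localhost:11434
    # Expected output: "Ollama is running"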

3. Pulling LLM Models

Before you can use a model with Ollama, you need to download (or "pull") it.

  1. Open your terminal or command prompt.

  2. Use the ollama pull command followed by the model name. You can find a list of available models in the Ollama Library at ollama.com/library.

    Examples:

    ollama pull mistral
    ollama pull llama3
    ollama pull codellama
    ollama pull gemma:2b # Pull a specific version like gemma 2B
    

    The download process may take some time depending on the model size and your internet connection.

  3. List Downloaded Models: To see which models you have downloaded locally, use:

    ollama list
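
Optionally, you can sanity-check a pulled model directly from the terminal before involving the backend. ollama run loads the model and answers a one-off prompt (or starts an interactive session if no prompt is given):

    ollama run mistral "Write a short pytest test for an add(a, b) function."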
    

4. Configuring the AI-Driven Testing Backend for Ollama

Our backend interacts with Ollama using its OpenAI-compatible API.

  1. OLLAMA_BASE_URL Environment Variable:

    • The backend uses an environment variable OLLAMA_BASE_URL to know where your Ollama server is running.
    • This should be set in your .env file in the backend/ directory.
    • Default Value (if Ollama is running on the same machine as the backend):
      OLLAMA_BASE_URL="http://localhost:11434"
      
    • If the backend runs in Docker and Ollama runs on the host (Docker Desktop):
      OLLAMA_BASE_URL="http://host.docker.internal:11434"
      
    • If Ollama is running in a separate Docker container or on a different machine, adjust the URL accordingly.
  2. Model Identifiers in allowed_models.json:

    • To use an Ollama-hosted model with our backend, its identifier must be present in the backend/allowed_models.json file.
    • The identifier you add to allowed_models.json should match the model name you used with ollama pull (and that appears in ollama list). For example, if you pulled mistral, you would add "mistral" to allowed_models.json (see the illustrative snippet after this list).
    • The LLM Manager in our backend will typically route requests for models not explicitly identified as "openai" or "anthropic" (by prefix or configuration) to the Ollama endpoint, assuming they are listed in allowed_models.json.
  3. How the Backend Connects:

    • The backend's llm_manager.py uses the OpenAI Python SDK to communicate with Ollama: it configures the OpenAI client with the OLLAMA_BASE_URL and a placeholder API key, since Ollama does not require an API key for local access (a sketch of this pattern follows below).
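
The snippet below is purely illustrative: it assumes allowed_models.json is a flat JSON list of model identifiers matching the names from ollama list. The real file in backend/ may use a different schema (for example, objects with extra metadata), so follow the format of the existing entries in the repository rather than this sketch:

    [
      "mistral",
      "llama3",
      "gemma:2b"
    ]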

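As a minimal sketch of that pattern (not the project's actual llm_manager.py), the OpenAI Python SDK can be pointed at a local Ollama server like this; the model name and prompt are placeholders, and the /v1 suffix is Ollama's OpenAI-compatible API path:

    # Sketch: talking to a local Ollama server through the OpenAI Python SDK.
    import os
    from openai import OpenAI

    client = OpenAI(
        # Ollama's OpenAI-compatible endpoint lives under /v1
        base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434") + "/v1",
        api_key="ollama",  # placeholder; Ollama ignores the key for local access
    )

    response = client.chat.completions.create(
        model="mistral",  # must match a name shown by `ollama list`
        messages=[{"role": "user", "content": "Generate a pytest test for add(a, b)."}],
    )
    print(response.choices[0].message.content)
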
5. Using Ollama Models with the Backend

Once Ollama is set up, models are pulled, and the backend is configured:

  • Ensure your Ollama server is running.

  • Ensure the AI-Driven Testing backend server is running.

  • You can then select an Ollama-hosted model (that's in allowed_models.json) when using the CLI or the API directly.

    Example (CLI):

    # Assuming "mistral" was pulled via ollama and added to allowed_models.json
    python cli.py generate --code-path ./my_code.py --model-name "mistral"
    

Troubleshooting

  • "Connection refused": Ensure the Ollama server is running and accessible at the OLLAMA_BASE_URL configured in your .env file. Check firewall settings if necessary.
  • Model not found:
    • Verify the model name in your request matches exactly what's listed by ollama list.
    • Ensure the model name is present in backend/allowed_models.json.
  • Performance: Running large models locally can be resource-intensive (CPU, RAM, GPU if configured). Performance will depend on your hardware.
  • Ollama Logs: Check Ollama's server logs for more detailed error information if you encounter issues. How to access these logs varies by OS (e.g., journalctl -u ollama on Linux).
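
For connection problems, a direct request to the Ollama server from wherever the backend runs (host shell or backend container) quickly shows whether the URL configured in .env is reachable:

    # From the host machine:
    curl http://localhost:11434
    # From inside a backend container on Docker Desktop:
    curl http://host.docker.internal:11434
    # Either request should answer with "Ollama is running"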

For further information on Ollama itself, refer to the official Ollama documentation.