Prompt Design

The effectiveness of the AI-Driven Testing tool heavily relies on the quality of the prompts sent to the Large Language Models (LLMs). This document outlines the design and structure of these prompts. The primary prompt content is often defined in backend/prompt.txt or constructed dynamically within the backend logic (e.g., in main.py or api.py).

(Review backend/prompt.txt and the prompt construction logic in your Python code to fill in the specific details accurately.)

1. Goals of the Prompt

  • Instruct the LLM Clearly: To generate unit tests for a given code snippet.
  • Specify Output Format: To ensure the LLM returns the test case in a usable format (e.g., a specific JSON structure, a Python code block).
  • Incorporate User Preferences: To allow users to specify target testing frameworks (e.g., pytest, unittest) and mocking frameworks.
  • Handle Dependencies: To provide the LLM with necessary context if the code under test relies on other code snippets.
  • Encourage High-Quality Tests: To guide the LLM towards generating meaningful, correct, and comprehensive tests.
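
To make the output-format goal concrete, the backend might expect a response shaped like the example below and parse it before returning it to the user. The exact schema this project uses is whatever the prompt requests (see section 2.1); the parsing snippet here is only an assumed sketch, not the project's actual code.

```python
# Assumed response shape: a JSON object with "test_case" and "explanation".
# This parsing snippet is a sketch, not the project's actual code.
import json

raw_response = '''{
  "test_case": "def test_example():\\n    assert example() == 42",
  "explanation": "Covers the default return value."
}'''

parsed = json.loads(raw_response)
print(parsed["test_case"])     # runnable Python unit test code
print(parsed["explanation"])   # short rationale for the generated tests
```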

2. Prompt Structure

Prompts sent to chat-based LLMs typically consist of a sequence of messages, often including a "system" message and one or more "user" messages.

2.1. System Message

  • Purpose: Sets the overall context, role, and high-level instructions for the LLM. It's like telling the LLM "You are an expert test generation assistant."
  • Example Content (Conceptual - check prompt.txt or code):
    You are an expert AI programming assistant specialized in generating high-quality unit tests for Python code.
    Your goal is to create comprehensive and correct test cases.
    The user will provide you with a Python code snippet and optionally some dependencies and framework preferences.
    You MUST return your response as a well-formed JSON object containing two keys:
    1. "test_case": A string containing the complete, runnable Python unit test code.
    2. "explanation": A brief explanation of the tests you generated.
    
    Ensure the generated test code is for Python.
    If a testing framework is specified (e.g., pytest, unittest), adhere to its conventions.
    If a mocking framework is specified (e.g., unittest.mock), use it appropriately for mocking dependencies.
    Focus on testing different paths, edge cases, and ensuring correctness.
    
    (This example assumes the prompt instructs the LLM to return a JSON object. If your system expects raw Python code directly, the prompt would be phrased differently.)

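Conceptually, the backend pairs this static system message with a dynamically built user message before calling the LLM. The sketch below is illustrative only: the prompt.txt path and the helper names (`load_system_message`, `build_messages`) are assumptions, and the actual wiring lives in main.py/api.py.

```python
# Illustrative sketch only -- the real loading and message layout are defined
# in backend/prompt.txt and main.py/api.py and may differ.
from pathlib import Path


def load_system_message(path: str = "backend/prompt.txt") -> str:
    """Read the static system prompt from disk (the path is an assumption)."""
    return Path(path).read_text(encoding="utf-8")


def build_messages(user_prompt: str) -> list[dict]:
    """Compose the chat-style message list: one system message, one user message."""
    return [
        {"role": "system", "content": load_system_message()},
        {"role": "user", "content": user_prompt},
    ]
```
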
2.2. User Message

  • Purpose: Contains the specific task for the LLM, including the code snippet to be tested and any user-provided parameters.

  • Dynamic Construction: This part of the prompt is usually built dynamically in the backend code (e.g., main.py/api.py) based on the user's request.

  • Example Content (Conceptual - check code for exact formatting):

    Please generate unit tests for the following Python code snippet:

    ```python
    {{CODE_SNIPPET_FROM_USER}}
    ```

    {{#if DEPENDENCIES}} The code snippet depends on the following helper code/dependencies:

    ```python
    {{#each DEPENDENCIES}}
    {{this}}
    {{/each}}
    ```
    {{/if}}

    {{#if TESTING_FRAMEWORK}} Please generate the tests using the '{{TESTING_FRAMEWORK}}' framework. {{/if}}

    {{#if MOCKING_FRAMEWORK}} Please use the '{{MOCKING_FRAMEWORK}}' framework for any necessary mocking. {{/if}}

    Remember to provide your output as a JSON object with "test_case" and "explanation" keys.

  • `{{PLACEHOLDERS}}` indicate where actual user data (code snippet, dependencies list, framework choices) is injected into the template.
    
    
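In the Python backend, a template like the one above is most likely realized with plain string formatting rather than a Handlebars-style engine. The following is a minimal sketch of that idea; the function name `build_user_prompt` and its parameters are assumptions, so check main.py/api.py for the actual construction logic.

```python
# Rough sketch of how the user message might be assembled in the backend.
# Parameter names and formatting are assumptions, not the project's API.
def build_user_prompt(
    code_snippet: str,
    dependencies: list[str] | None = None,
    testing_framework: str | None = None,
    mocking_framework: str | None = None,
) -> str:
    parts = [
        "Please generate unit tests for the following Python code snippet:",
        f"```python\n{code_snippet}\n```",
    ]
    if dependencies:
        deps = "\n".join(dependencies)
        parts.append(
            "The code snippet depends on the following helper code/dependencies:\n"
            f"```python\n{deps}\n```"
        )
    if testing_framework:
        parts.append(f"Please generate the tests using the '{testing_framework}' framework.")
    if mocking_framework:
        parts.append(f"Please use the '{mocking_framework}' framework for any necessary mocking.")
    parts.append(
        'Remember to provide your output as a JSON object with "test_case" and "explanation" keys.'
    )
    return "\n\n".join(parts)
```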

3. Key Considerations in Prompt Engineering

  • Clarity and Specificity: The more precise the instructions, the better the LLM's output.
  • Role Playing: Assigning a role (e.g., "expert AI programming assistant") can improve results.
  • Output Formatting: Explicitly requesting a specific output format (like JSON or a particular code structure) is crucial for programmatic parsing of the LLM's response.
  • Examples (Few-Shot Prompting - Optional): For some LLMs or complex tasks, including a few examples of good input/output pairs within the prompt can significantly improve performance; a hypothetical sketch follows this list. (It is not clear from the file list whether this project currently uses few-shot examples.)
  • Iterative Refinement: Prompt engineering is often an iterative process. The content of prompt.txt or the prompt construction logic may have evolved based on experimentation and observing LLM outputs.
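
If few-shot examples were added, one common pattern is to place a worked input/output pair as an extra user/assistant exchange ahead of the real request, as sketched below. The example snippet, the JSON answer, and the helper name are all hypothetical.

```python
# Hypothetical few-shot variant: a worked input/output pair is inserted between
# the system message and the real request. It is NOT confirmed that this
# project uses few-shot examples; names and content here are illustrative.
import json

EXAMPLE_SNIPPET = "def add(a, b):\n    return a + b"
EXAMPLE_ANSWER = json.dumps({
    "test_case": "def test_add():\n    assert add(1, 2) == 3",
    "explanation": "Checks a simple addition case.",
})


def build_few_shot_messages(system_message: str, user_prompt: str) -> list[dict]:
    """Prepend one example exchange so the LLM sees the expected answer shape."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": "Please generate unit tests for:\n"
                                    f"```python\n{EXAMPLE_SNIPPET}\n```"},
        {"role": "assistant", "content": EXAMPLE_ANSWER},
        {"role": "user", "content": user_prompt},
    ]
```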

4. Location of Prompt Definitions

  • Static Parts: Often stored in a file like backend/prompt.txt (for the system message or a base user message template).
  • Dynamic Construction: Implemented in the Python backend code (e.g., in the /generate-test endpoint handler in main.py/api.py) to inject user-specific data; a hedged sketch follows this list.
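
For orientation, the sketch below shows how such a handler could tie the static system message and the dynamic user prompt together, assuming a FastAPI-style backend. The framework choice, request fields, helper names, and LLM client are assumptions; only the /generate-test route name comes from this page.

```python
# Hedged sketch of a /generate-test handler, assuming a FastAPI-style backend.
# Request fields, helper names, and the LLM client are assumptions; the real
# handler in main.py/api.py may differ.
from pathlib import Path

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
SYSTEM_MESSAGE = Path("backend/prompt.txt").read_text(encoding="utf-8")


class GenerateTestRequest(BaseModel):
    code_snippet: str
    dependencies: list[str] = []
    testing_framework: str | None = None
    mocking_framework: str | None = None


def call_llm(messages: list[dict]) -> dict:
    """Placeholder for the actual LLM client call; assumption only."""
    raise NotImplementedError


@app.post("/generate-test")
def generate_test(req: GenerateTestRequest):
    # build_user_prompt is the hypothetical helper sketched in section 2.2.
    user_prompt = build_user_prompt(
        req.code_snippet, req.dependencies,
        req.testing_framework, req.mocking_framework,
    )
    messages = [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": user_prompt},
    ]
    # The response is expected to contain the "test_case" and "explanation" keys.
    return call_llm(messages)
```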

Review these locations in your project to understand the exact prompts being used.