
Mistral Documentation



Table of Contents

  1. Installation
  2. Configuration
  3. Implementation
  4. Usage
  5. Troubleshooting

1. Installation

This section outlines the steps to set up Mistral and integrate it into the project environment using Docker. Before starting, verify that Python is installed by checking its version.

Figure: Checking the installed Python version.

Step 1: Build and Deploy the Mistral Environment

Navigate to the Project Repository: Clone the repository or navigate to the project folder:

git clone https://github.com/DrAlzahraniProjects/csusb_fall2024_cse6550_team4.git  
cd csusb_fall2024_cse6550_team4  

Build the Docker Image: Mistral is included as part of the Docker image build process in the project's Dockerfile. Use the provided Dockerfile to build the project environment:

docker build -t team4_chatbot .  

The Dockerfile is configured to install Mistral and other dependencies necessary for the project.

Run the Docker Container: Start the container with the following command:

docker run -d -p 19530:19530 -p 5004:5004 -p 6004:6004 --name chatbot team4_chatbot  

Verify Mistral Deployment: Inside the running Docker container, ensure that Mistral is installed and working correctly. You can check this by executing a Python command to import Mistral:

docker exec -it chatbot bash  
python -c "from langchain_mistralai import ChatMistralAI; print('Mistral is installed!')"  

Required Python packages

If Python is already installed on your system, you can install the required packages from the command line:

pip install langchain langchain-mistralai python-dotenv pymilvus

Figure: Installing the required Python packages with pip.

This should print "Mistral is installed!" confirming that Mistral is properly installed within the Docker container.
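To confirm the remaining packages in one go, a quick sanity check along these lines can be run inside the container (a minimal sketch, assuming the package names listed above):

import importlib

# Try importing each required package and report its version if it exposes one
for package in ["langchain", "langchain_mistralai", "dotenv", "pymilvus"]:
    module = importlib.import_module(package)
    version = getattr(module, "__version__", "unknown")
    print(f"{package} imported successfully (version: {version})")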

Figure 1: The Mistral package that the Dockerfile installs.

Figure 2: The statement that imports the ChatMistralAI class from the langchain_mistralai package.

Alternatively, the environment can be set up manually in the Dockerfile.

Step 1: Create a New Mamba Environment

RUN mamba create -n team2_env python=3.10 -y
SHELL ["mamba", "run", "-n", "team2_env", "/bin/bash", "-c"]

Step 2: Add the Mistral AI Installation Command

In your Dockerfile, add:

RUN /bin/bash -c "source ~/.bashrc && mamba install mistral -c conda-forge"

Step 3: Update requirements.txt for Hugging Face Transformers

Ensure your requirements.txt file includes:

transformers>=4.0

Note: If using GPUs, ensure Docker has GPU support configured for performance optimization.

2. Configuration

Configuring Mistral involves setting up environment variables and preparing the connection to your vector store. Here’s how you can do that:


FAISS Vector Store Setup - This figure illustrates the creation of a FAISS vector store from loaded documents. The vector store is essential for fast document retrieval based on semantic similarity.

Environment Variables: Create a .env file in your project directory to store sensitive information such as your Mistral API key.


API Key for Mistral

Mistral requires an API key for interaction. Ensure you have your key and configure it as follows:

Create or Update .env File: Add the following configuration details to your .env file in the project root directory:

MISTRAL_API_KEY=your_api_key_here  

Note: Replace your_api_key_here with your actual Mistral API key.

Figure: The API key configuration in the .env file.

Figure 4: Loading of the API key.
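A minimal sketch of loading the key in Python, assuming the .env file sits in the project root and python-dotenv is installed:

import os
from dotenv import load_dotenv

# Read variables from .env into the process environment
load_dotenv()

MISTRAL_API_KEY = os.getenv("MISTRAL_API_KEY")
if not MISTRAL_API_KEY:
    raise ValueError("MISTRAL_API_KEY not found in .env")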


Connecting to Mistral

from langchain_mistralai import ChatMistralAI

def initialize_mistral(api_key: str = MISTRAL_API_KEY):
    """
    Initialize the Mistral model for generating responses.

    Args:
        api_key (str): The API key for Mistral.

    Returns:
        model: The Mistral model initialized for use.
    """
    # Initialize Mistral model with the provided API key
    model = ChatMistralAI(model='open-mistral-7b', api_key=api_key, temperature=0.2)
    print("Mistral model initialized.")
    return model

Key Steps:

  • API Key: The API key is loaded from the environment variable MISTRAL_API_KEY set in the .env file.
  • Model Initialization: The ChatMistralAI class is used to initialize the Mistral model with the provided API key. The model used is 'open-mistral-7b', and a temperature setting of 0.2 controls the randomness of the generated responses.
  • Connection Confirmation: A log message "Mistral model initialized." is printed to confirm that the model has been successfully initialized and is ready to generate responses (see the usage sketch below).
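As a quick usage sketch (assuming the initialize_mistral function and MISTRAL_API_KEY variable defined above), the model can be invoked directly to confirm the connection:

model = initialize_mistral(api_key=MISTRAL_API_KEY)

# Send a short test prompt; the reply is a message object whose text is in .content
test_reply = model.invoke("Reply with a one-sentence greeting.")
print(test_reply.content)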

3. Implementation

1. Mistral Model Integration

In this project, Mistral is used to generate answers for user queries. The connection to the Mistral API is made through the initialize_mistral function, which loads the model for use in generating responses.

Here is the relevant code snippet for initializing the Mistral model:

from langchain_mistralai import ChatMistralAI

def initialize_mistral(api_key: str = MISTRAL_API_KEY):
    """
    Initialize the Mistral model for generating responses.

    Args:
        api_key (str): The API key for Mistral.

    Returns:
        model: The Mistral model initialized for use.
    """
    model = ChatMistralAI(model='open-mistral-7b', api_key=api_key, temperature=0.2)
    print("Mistral model initialized.")
    return model

Figure: Hybrid Retriever Setup - This figure demonstrates the setup of a hybrid retriever that combines document retrieval from the FAISS vector store and user input for enhanced information retrieval.

In this implementation:

  • create_stuff_documents_chain: This function sets up a chain that combines document retrieval with answer generation (a minimal sketch follows the figure below).
  • System Prompt: The AI is guided to use the context provided in the retrieved documents to formulate answers. This approach is crucial for providing relevant and contextually accurate responses.


Mistral Model Chain Setup - This figure illustrates the setup of the model chain that integrates document retrieval with the Mistral AI to generate accurate answers based on user queries.
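A minimal sketch of such a chain with LangChain's create_stuff_documents_chain and create_retrieval_chain, assuming the initialized Mistral model from above and a retriever built on the vector store (the prompt wording is illustrative, not the project's exact system prompt):

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

# System prompt instructing the model to answer from the retrieved context only
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using only the following context:\n\n{context}"),
    ("human", "{input}"),
])

# Chain that stuffs the retrieved documents into the prompt and calls Mistral
combine_docs_chain = create_stuff_documents_chain(model, prompt)

# Full pipeline: retrieve documents, then generate the answer from them
rag_chain = create_retrieval_chain(retriever, combine_docs_chain)

result = rag_chain.invoke({"input": "What is software engineering?"})
print(result["answer"])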

Functionality:

  • API Key Configuration: The function takes the API key as an argument, which is required to interact with the Mistral model. This key is stored in the .env file.
  • Model Initialization: The ChatMistralAI class is used to initialize the Mistral model, which is then used to generate responses for incoming queries.

2. Querying with Mistral

Once the Mistral model is initialized, it is used to generate responses based on user input. Here is an example of how the model is used to process a query:

def generate_response(query, model):
    """
    Generate a response using the Mistral model based on the input query.

    Args:
        query (str): The input query from the user.
        model: The initialized Mistral model.

    Returns:
        str: The generated response from the model.
    """
    # ChatMistralAI exposes invoke(); the reply is a message object whose text is in .content
    response = model.invoke(query)
    return response.content
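A quick usage sketch, assuming the model has already been initialized with initialize_mistral():

model = initialize_mistral(api_key=MISTRAL_API_KEY)
# Pass a user query through the helper and print the generated answer
print(generate_response("What is software engineering?", model))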

3. Integrating with Retrieval-Augmented Generation (RAG)

In the context of this project, Mistral is part of a Retrieval-Augmented Generation (RAG) pipeline. This means that, before generating a response, relevant documents are retrieved from Milvus (vector store). The documents are then passed to Mistral for generating context-aware responses. Below is the integration code:

Figure: Fallback context implementation.

def query_rag(query):
    """
    Entry point for the RAG model to generate an answer to a given query.

    Args:
        query (str): The query string for which an answer is to be generated.

    Returns:
        str: The formatted answer with a unique source link (if available).
    """
    # Initialize the Mistral model
    model = initialize_mistral(api_key=MISTRAL_API_KEY)
    
    # Perform retrieval and answer generation
    response = generate_response(query, model)
    
    # Process the response and append source links if necessary
    return response

Key Steps:

  • Retrieve Relevant Documents: The function first uses Milvus to retrieve relevant documents based on the user's query.
  • Generate Response: After retrieving the documents, the generate_response() function passes the query and documents to the Mistral model for response generation.
  • Source Links: If relevant documents are used, the response is appended with source links to the documents for user reference (see the sketch below).
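The retrieval and source-link steps are abbreviated in the snippet above; a hedged sketch of how they might be filled in, assuming the retrieve_documents() helper described later returns LangChain documents whose metadata carries a "source" entry:

def query_rag(query):
    """Retrieve context from Milvus, generate an answer with Mistral, and append sources."""
    model = initialize_mistral(api_key=MISTRAL_API_KEY)

    # Fetch the most relevant documents for the query
    retrieved_docs = retrieve_documents(query)

    # Build a context block from the retrieved documents and ask the model
    context = "\n\n".join(doc.page_content for doc in retrieved_docs)
    prompt = f"Answer the question using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
    answer = model.invoke(prompt).content

    # Append unique source links, if any document carries one in its metadata
    sources = {doc.metadata.get("source") for doc in retrieved_docs if doc.metadata.get("source")}
    if sources:
        answer += "\n\nSources: " + ", ".join(sorted(sources))
    return answer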

Figures 5 and 6: These figures show how the query_rag() function uses Mistral to take in queries and generate responses.

Figure: Fallback context implementation.

4. Usage

Running the Application with Mistral Integration

To run the application locally with Mistral integrated, follow these steps:


Tabnine Chat Overview

Tabnine Chat is an AI-powered coding assistant integrated into IDEs like Visual Studio Code. It enhances productivity by offering features such as:

  • Explain Code: Understand selected code snippets.
  • Generate Tests: Automate unit test creation.
  • Write Docstrings: Create detailed code documentation.
  • Fix Code: Debug and optimize code.
  • Onboard Projects: Assist in setting up new coding environments.

The image shows Tabnine's interface, with features on the left panel and a coding workspace on the right (run.py). Tabnine simplifies code reviews, documentation, debugging, and onboarding processes for developers.

Clone the Repository:

git clone https://github.com/DrAlzahraniProjects/csusb_fall2024_cse6550_team4.git  
cd csusb_fall2024_cse6550_team4  

Build the Docker Image: Build the Docker image as you normally would for the project:

docker build -t team4_chatbot .  

Run the Docker Container: Start the container:

docker run -d -p 19530:19530 -p 5004:5004 -p 6004:6004 --name chatbot team4_chatbot  

Access the Streamlit Interface: Open your browser and navigate to http://localhost:5004 to interact with the chatbot interface.

Interacting with the Chatbot

Once the application is running, type your query into the input box. Mistral will be used to generate responses based on the query and retrieved documents from Milvus.

Example Query:

"What is software engineering?"

Bot Response:

"Software engineering is the application of engineering principles to software development..."

Example of Query and Response Generation Here's an example of how the chatbot processes a query and generates a response:

User Query: "What is modular training?"

Bot Response: "Modular training is a form of learning that breaks content into smaller, manageable units or modules."

Source Links: If relevant documents were used to generate the answer, source links (e.g., links to PDFs) would be appended.

The run_query() function is responsible for orchestrating the entire process of handling a user's query, retrieving the relevant documents, and generating a response using Mistral. This is the function called when a user interacts with the chatbot interface.

Figure: The run_query() function.

Parameters:

  • prompt: The query provided by the user.

Return Value: This function returns a tuple containing:

  • The generated response (response).
  • The retrieved documents (retrieved_docs), which are used as context for generating the answer.

Retrieve Documents: The first step in RAG is to fetch the most relevant documents from the knowledge base, which is stored in Milvus (a vector database). The retrieve_documents() function is responsible for performing the document retrieval.
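As a minimal sketch, assuming a vector_store object created during Milvus initialization (see below) and LangChain's similarity_search API, the helper could look like:

def retrieve_documents(query, k=4):
    """Return the k most relevant documents for the query from the Milvus vector store."""
    # similarity_search embeds the query and returns the closest document chunks
    return vector_store.similarity_search(query, k=k)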

Integrating Mistral into RAG Workflow

Mistral is used in the Retrieval-Augmented Generation (RAG) workflow, where relevant documents are retrieved based on the user query, and Mistral is used to generate responses using those documents.

Milvus Initialization: If the vector store hasn't been initialized, it proceeds with the setup. This involves connecting to Milvus and loading the embeddings of the knowledge base into memory, so that the chatbot can use it for retrieving relevant documents during query processing.

Initialization Function:

Figure: The Milvus initialization function.
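A hedged sketch of such an initialization function using LangChain's Milvus integration; the embedding model, collection name, and connection parameters below are illustrative assumptions, not the project's exact values:

from langchain_community.vectorstores import Milvus
from langchain_huggingface import HuggingFaceEmbeddings

vector_store = None

def initialize_milvus(documents):
    """Connect to Milvus and load document embeddings so queries can retrieve them later."""
    global vector_store
    if vector_store is None:
        # Embed the knowledge-base documents and store the vectors in a Milvus collection
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        vector_store = Milvus.from_documents(
            documents,
            embeddings,
            connection_args={"host": "localhost", "port": "19530"},
            collection_name="team4_docs",  # hypothetical collection name
        )
    return vector_store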

Document Retrieval: The first step is retrieving the most relevant documents from a vector store (such as Milvus) based on the query.

Generate Response: Once relevant context is retrieved, the generate_response() function generates a response using Mistral. The response can be enhanced by providing context from these documents.

5. Troubleshooting

Common Issues:

While working with Mistral, you may encounter several common issues. Here’s how to address them:

  • Missing API Key: If you forget to add your API key to the .env file, you will encounter an error when trying to run your application.

ValueError: "MISTRAL_API_KEY not found in .env"

Ensure that your .env file is correctly set up and contains the necessary API key for authentication.

  • Document Retrieval Fails: If the FAISS index is not created properly or the documents are not loaded as expected, check the following:

    • Verify that the document_path is correct and points to a valid directory containing your documents.


Figure 7: Highlighted line shows the document path.

  • Make sure that all required packages, including FAISS, are listed in your requirements.txt file, as mentioned earlier.


Figure 8: This image shows the requirements.txt file including FAISS.

  • API Key Authentication Fails: If your API key is not working, double-check its validity by logging into the Mistral platform. Ensure that the key you’re using matches the one provided in your Mistral account settings.


Figure 9: API Key Authentication Fail - This figure shows an authentication failure due to an invalid API key, stressing the need for accurate credentials.

Debugging:

When working with the Mistral framework, you may encounter various issues that can impede the functionality of your application. Implementing effective debugging strategies can help identify and resolve these problems efficiently. Here are some recommended approaches:

Logging Output: One of the most effective ways to debug your application is by incorporating logging or print statements within the chat_completion function. This allows you to capture real-time interactions between different components, specifically during the retrieval and generation processes. By logging key variables and outputs at various stages of execution, you can gain insight into the flow of data and pinpoint where issues may arise. For instance, logging the retrieved documents and the generated responses can help you identify if the issue lies in the document retrieval step or in the model’s response generation. Additionally, consider using Python’s built-in logging module instead of print statements, as it offers more flexibility and control over logging levels and outputs.
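As an illustrative sketch (the function names mirror the ones used earlier on this page, but the logging placement is an assumption):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def answer_query(query, model):
    # Log what was retrieved so retrieval problems can be separated from generation problems
    retrieved_docs = retrieve_documents(query)
    logger.info("Retrieved %d documents for query: %s", len(retrieved_docs), query)

    response = generate_response(query, model)
    logger.info("Generated response of %d characters", len(response))
    return response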

Token Limits: The max_tokens parameter plays a crucial role in determining the length of the generated responses. If this parameter is set too low, it can result in incomplete or truncated answers, which may lead to user dissatisfaction. Therefore, it is essential to adjust this parameter based on the expected length of the responses for different queries. For example, if your application frequently deals with complex questions requiring detailed answers, consider increasing the max_tokens limit to accommodate longer outputs. Monitor the responses closely; if you notice frequent truncation, increasing this limit is advisable. Additionally, keep in mind that excessively high token limits may lead to longer processing times, so finding a balance based on your application’s needs is key.
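As a sketch, assuming the langchain-mistralai client's max_tokens parameter, the limit can be raised when the model is initialized (the value 1024 is only an example):

from langchain_mistralai import ChatMistralAI

# Allow longer answers; very high limits increase latency, so tune to your queries
model = ChatMistralAI(
    model='open-mistral-7b',
    api_key=MISTRAL_API_KEY,
    temperature=0.2,
    max_tokens=1024,
)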

Environment Verification: It is also beneficial to verify that your Python environment is set up correctly and that all required dependencies are installed without conflicts. Use tools like pip list to check installed packages and their versions. Incompatible library versions can lead to unexpected behaviors, so ensuring that your environment is clean and properly configured is critical for smooth operation.

By implementing these debugging strategies, you can effectively troubleshoot issues within the Mistral framework, leading to more reliable and robust application performance.


Figure 10: Token Limit Adjustment - This figure highlights how to adjust the max_tokens parameter to prevent truncated responses, ensuring that the AI generates complete answers.
