Vertex RAG Engine - gunpal5/Google_GenerativeAI GitHub Wiki

Introduction

This page explains how to use the Vertex AI RAG Engine with the Google_GenerativeAI SDK. The engine lets you create knowledge bases (corpora), import data into them, and then use Google's generative AI models (such as Gemini 2.0 Flash) to produce responses grounded in that data. This process, known as Retrieval-Augmented Generation (RAG), improves the model's responses by supplying relevant context retrieved from your own data sources.

Details

Here's a step-by-step guide on how to utilize the Vertex AI RAG Engine:

1. Initialize Vertex AI:

When initializing the Vertex AI client, make sure to provide the appropriate authentication using one of the following:

GoogleOAuthAuthenticator
GoogleServiceAccountAuthenticator
ADCAuthentication

Your credentials should have sufficient IAM permissions or roles, such as:

Vertex AI RAG Data Service Agent
Vertex AI User
Secret Manager Secret Accessor
AI Platform Developer

var vertexAi = new VertexAI(projectId, region,
    authenticator:
    new GoogleServiceAccountAuthenticator("path/to/your/service/account.json")
    // or another authenticator that suits your credentials
);
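Hard-coding the project ID, region, and key path is rarely convenient outside of a quick test. As a minimal sketch, these values can be read from environment variables using plain .NET calls. The helper names RequireEnv and OptionalEnv, the variable names VERTEX_PROJECT_ID and VERTEX_REGION, and the fallback region are assumptions made for this example; none of them come from the SDK.

```csharp
using System;

// Hypothetical helper: reads a required setting and fails fast with a clear
// message when it is missing.
static string RequireEnv(string name) =>
    Environment.GetEnvironmentVariable(name)
    ?? throw new InvalidOperationException($"Environment variable '{name}' is not set.");

// Hypothetical helper: reads an optional setting, falling back to a default.
static string OptionalEnv(string name, string fallback) =>
    Environment.GetEnvironmentVariable(name) ?? fallback;

// "VERTEX_REGION" and the fallback value are example choices for this sketch.
string region = OptionalEnv("VERTEX_REGION", "us-central1");
Console.WriteLine($"Using region: {region}");
```

The resulting strings can then be passed as projectId and region to the VertexAI constructor shown above. A call such as RequireEnv("VERTEX_PROJECT_ID") throws immediately if the variable is unset, which surfaces configuration mistakes before any request is made.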

2. Create a RAG Manager:

Create an instance of the RagManager to handle corpus operations.

var ragManager = vertexAi.CreateRagManager();

3. Create a Corpus:

A corpus serves as your knowledge base. Create a new corpus with the CreateCorpusAsync method. To back the corpus with a specific vector database (Pinecone, Weaviate, etc.), use one of the method's overloads. If you do not specify a vector database, a default one is used.

var corpus = await ragManager.CreateCorpusAsync("My New Corpus", "My description");
// Alternatively, use one of the other overloads to create a corpus backed by a different vector database, for example:
// CreateCorpusAsync(string displayName, string? description, RagVectorDbConfigPinecone pineconeConfig, string apiKeyResourceName, string? embeddingModelName = null, CancellationToken cancellationToken = default)
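The Pinecone overload above takes an apiKeyResourceName string. The sketch below assumes this parameter expects a Secret Manager secret version resource name in the form projects/{project-number}/secrets/{secret-id}/versions/{version}, which would match the Secret Manager Secret Accessor role listed earlier; check the SDK and Vertex AI documentation before relying on this. The helper name SecretVersionName and all values passed to it are hypothetical placeholders.

```csharp
using System;

// Hypothetical helper: builds a Secret Manager secret-version resource name.
static string SecretVersionName(string projectNumber, string secretId, string version)
{
    if (string.IsNullOrWhiteSpace(projectNumber))
        throw new ArgumentException("Project number is required.", nameof(projectNumber));
    if (string.IsNullOrWhiteSpace(secretId))
        throw new ArgumentException("Secret ID is required.", nameof(secretId));
    if (string.IsNullOrWhiteSpace(version))
        throw new ArgumentException("Version is required.", nameof(version));

    return $"projects/{projectNumber}/secrets/{secretId}/versions/{version}";
}

// Placeholder values for illustration only.
string apiKeyResourceName = SecretVersionName("123456789012", "pinecone-api-key", "1");
Console.WriteLine(apiKeyResourceName);
```

Storing the vector database API key in Secret Manager and passing only its resource name keeps the key itself out of source code and configuration files.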

4. Import Data into the Corpus:

Import data from a source into your corpus. The example below uses GcsSource (Google Cloud Storage); replace it with the source type that matches your data (Jira, Slack, SharePoint, etc.) and configure it accordingly.

var fileSource = new GcsSource() 
{ 
    // Configure your GcsSource here 
};

await ragManager.ImportFilesAsync(corpus.Name, fileSource);
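A misconfigured source is an easy way for an import to fail, so it can help to check Cloud Storage locations before building the GcsSource. The following is a minimal sketch, assuming the locations are plain gs:// URI strings; the helper names IsGcsUri and FindInvalidGcsUris and the bucket and file names are hypothetical, and how GcsSource accepts its URIs is not shown in this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: true when the string looks like gs://bucket/... .
static bool IsGcsUri(string uri)
{
    const string prefix = "gs://";
    return !string.IsNullOrEmpty(uri)
        && uri.StartsWith(prefix, StringComparison.Ordinal)
        && uri.Length > prefix.Length
        && uri[prefix.Length] != '/'; // a bucket name must follow the scheme
}

// Hypothetical helper: returns the entries that do not look like Cloud Storage URIs.
static List<string> FindInvalidGcsUris(IEnumerable<string> uris) =>
    uris.Where(u => !IsGcsUri(u)).ToList();

// Placeholder locations for illustration only.
var candidates = new[] { "gs://my-bucket/docs/handbook.pdf", "https://example.com/file.pdf" };
foreach (var invalid in FindInvalidGcsUris(candidates))
{
    Console.WriteLine($"Not a Cloud Storage URI: {invalid}");
}
```

This check only validates the shape of each string. It does not confirm that the bucket exists or that your credentials can read it.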

5. Create a Gemini Generative Model with RAG Configuration:

Create a Gemini generative model configured to use the created corpus for RAG. The corpusIdForRag parameter links the model to your knowledge base.

var model = vertexAi.CreateGenerativeModel(
    VertexAIModels.Gemini.Gemini2Flash, 
    corpusIdForRag: corpus.Name
);

6. Generate Content:

Generate content by querying the model. The model will retrieve relevant information from the corpus to provide a grounded response.

var result = await model.GenerateContentAsync("query related to the corpus");

Important Considerations

Data Source Configuration: Ensure that the data source (e.g., GcsSource) is correctly configured to access your data.
Vector Database Choice: If you have specific performance or scalability requirements, consider using a supported external vector database.
Corpus Maintenance: Regularly update your corpus with new information to keep the model's responses accurate and relevant.
Query Formulation: Craft clear and specific queries to get the most relevant responses from the model.
Corpus Size and Latency: A larger corpus can increase retrieval latency. Measure response times as your corpus grows rather than assuming they stay constant.
Cost: Importing large datasets and performing frequent queries can incur significant costs. Monitor your usage and optimize accordingly.
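To see how latency changes as the corpus grows, or to compare differently worded queries, it can be useful to time each call. The sketch below uses only standard .NET types; the helper name TimeAsync is hypothetical, and a delayed stub stands in for the model call so the example runs on its own.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper: runs an async operation and returns its result
// together with the elapsed wall-clock time.
static async Task<(T Result, TimeSpan Elapsed)> TimeAsync<T>(Func<Task<T>> operation)
{
    var stopwatch = Stopwatch.StartNew();
    T result = await operation();
    stopwatch.Stop();
    return (result, stopwatch.Elapsed);
}

// Stand-in for a model call; the delay and the returned text are placeholders.
var (answer, elapsed) = await TimeAsync(async () =>
{
    await Task.Delay(50);
    return "stub answer";
});

Console.WriteLine($"Got '{answer}' in {elapsed.TotalMilliseconds:F0} ms");
```

In real use, the stub lambda would be replaced with a call such as () => model.GenerateContentAsync("query related to the corpus"), using the model created in step 5. Logging the elapsed time per query also gives a rough picture of usage when monitoring cost.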

For more detailed information about the RAG Engine, including supported databases and data sources, please refer to the official documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview