Corpora Client - gunpal5/Google_GenerativeAI GitHub Wiki

Introduction

The CorporaClient provides methods for interacting with the Gemini API's Corpora endpoint. It lets you create, manage, and query corpora, which are collections of documents used for semantic search. Semantic search retrieves information based on meaning rather than exact keyword matches.

Details

The CorporaClient supports the following operations:

Creating a Corpus

The CreateCorpusAsync method creates a new corpus.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var corporaClient = new CorporaClient(platform, httpClient, logger); // Initialize CorporaClient

var corpus = new Corpus
{
    DisplayName = "My Sample Corpus", // Display name for the corpus
    // ... other corpus properties ...
};

var createdCorpus = await corporaClient.CreateCorpusAsync(corpus);

if (createdCorpus != null)
{
    Console.WriteLine($"Corpus created: {createdCorpus.Name}");
}
else
{
    Console.WriteLine("Failed to create corpus.");
}

Querying a Corpus

The QueryCorpusAsync method performs semantic search over a corpus.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var corporaClient = new CorporaClient(platform, httpClient, logger); // Initialize CorporaClient

var corpusName = "corpora/my-corpus-id"; // Replace with the actual corpus name

var queryCorpusRequest = new QueryCorpusRequest
{
    Query = "What are the key concepts in this document?",
    // ... other query parameters ...
};

var queryCorpusResponse = await corporaClient.QueryCorpusAsync(corpusName, queryCorpusRequest);

if (queryCorpusResponse != null && queryCorpusResponse.RelevantChunks != null)
{
    foreach (var chunk in queryCorpusResponse.RelevantChunks)
    {
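        // The REST API returns the chunk text as chunk.data.stringValue;
        // verify this property path against the RelevantChunk type in your library version.
        Console.WriteLine($"Relevant Chunk: {chunk.ChunkData.Text}");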
    }
}
else
{
    Console.WriteLine("No relevant chunks found.");
}

Listing Corpora

The ListCorporaAsync method retrieves a list of corpora.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var corporaClient = new CorporaClient(platform, httpClient, logger); // Initialize CorporaClient

var listCorporaResponse = await corporaClient.ListCorporaAsync(); // You can provide pageSize and pageToken

if (listCorporaResponse != null && listCorporaResponse.Corpora != null)
{
    foreach (var corpus in listCorporaResponse.Corpora)
    {
        Console.WriteLine($"Corpus Name: {corpus.Name}");
    }
}
else
{
    Console.WriteLine("No corpora found.");
}
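
When an account holds more corpora than fit on one page, the list call has to be repeated with the token returned by the previous page. The following is a minimal sketch of that loop, written as a generic helper so that it does not depend on exact library signatures. The helper name CollectAllPagesAsync, the named arguments pageSize and pageToken, and the NextPageToken property shown in the usage comment are assumptions, not confirmed parts of the library.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PagingSketch
{
    // Collects items from every page by following the continuation token.
    // fetchPage receives the current page token (null for the first page) and
    // returns that page's items plus the token for the next page, if any.
    public static async Task<List<T>> CollectAllPagesAsync<T>(
        Func<string?, Task<(IReadOnlyList<T> Items, string? NextPageToken)>> fetchPage)
    {
        var all = new List<T>();
        string? pageToken = null;

        do
        {
            var (items, next) = await fetchPage(pageToken);
            all.AddRange(items);
            pageToken = next; // a null or empty token means there are no more pages
        } while (!string.IsNullOrEmpty(pageToken));

        return all;
    }
}

// Hypothetical usage with CorporaClient (argument and property names are assumed):
//
// var allCorpora = await PagingSketch.CollectAllPagesAsync<Corpus>(async token =>
// {
//     var page = await corporaClient.ListCorporaAsync(pageSize: 10, pageToken: token);
//     return (page?.Corpora ?? new List<Corpus>(), page?.NextPageToken);
// });
```

Passing the page fetch in as a delegate keeps the loop independent of the client, so it can be exercised with a fake page source before being pointed at the real API.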

Getting a Corpus

The GetCorpusAsync method retrieves a specific corpus by name.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var corporaClient = new CorporaClient(platform, httpClient, logger); // Initialize CorporaClient

var corpusName = "corpora/my-corpus-id"; // Replace with the actual corpus name

var corpus = await corporaClient.GetCorpusAsync(corpusName);

if (corpus != null)
{
    Console.WriteLine($"Corpus Display Name: {corpus.DisplayName}");
}
else
{
    Console.WriteLine("Corpus not found.");
}

Updating a Corpus

The UpdateCorpusAsync method updates an existing corpus.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var corporaClient = new CorporaClient(platform, httpClient, logger); // Initialize CorporaClient

var corpusName = "corpora/my-corpus-id"; // Replace with the actual corpus name

var updatedCorpus = new Corpus
{
    Name = corpusName, // Important: Include the name in the updated corpus object.
    DisplayName = "My Updated Corpus Name",
    // ... other updated properties ...
};

string updateMask = "displayName"; // Specify the fields to update

var resultCorpus = await corporaClient.UpdateCorpusAsync(corpusName, updatedCorpus, updateMask);

if (resultCorpus != null)
{
    Console.WriteLine($"Corpus updated: {resultCorpus.DisplayName}");
}
else
{
    Console.WriteLine("Failed to update corpus.");
}

Deleting a Corpus

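The DeleteCorpusAsync method deletes a corpus. By default, the API rejects the request if the corpus still contains documents; setting force to true also deletes those documents and their related objects.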

using GenerativeAI.Clients;

// ... other code ...

var corporaClient = new CorporaClient(platform, httpClient, logger); // Initialize CorporaClient

var corpusName = "corpora/my-corpus-id"; // Replace with the actual corpus name

await corporaClient.DeleteCorpusAsync(corpusName); // You can optionally set force to true

Console.WriteLine($"Corpus deleted: {corpusName}");

Important Considerations

  • Ensure proper authorization is configured before using the CorporaClient. See the Authentication page.
  • Replace placeholder corpus names and IDs with actual values.
  • Wrap API calls in try/catch blocks; network failures and API errors can surface as exceptions rather than as null results.
  • Be mindful of rate limits when making frequent requests. See the official documentation for details.
  • The updateMask parameter in UpdateCorpusAsync is crucial. It specifies which fields of the Corpus object should be updated; only the fields listed in the updateMask are modified. The Gemini API reference currently lists displayName as the only field that can be updated.
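
The points about exceptions and rate limits can be combined into a small retry wrapper. This is a sketch under stated assumptions, not part of the library: the RetrySketch class and WithRetryAsync method are hypothetical, and catching HttpRequestException assumes that transport failures from the underlying HttpClient reach the caller unchanged. If the library wraps errors in its own exception type, catch that type instead.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Runs the given async action, retrying with exponential backoff when it
    // throws HttpRequestException. The last failure is rethrown to the caller.
    public static async Task<T> WithRetryAsync<T>(
        Func<Task<T>> action,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = baseDelay ?? TimeSpan.FromSeconds(1);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Wait before the next attempt, then double the delay.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}

// Hypothetical usage with the client from the examples above:
//
// var corpus = await RetrySketch.WithRetryAsync(
//     () => corporaClient.GetCorpusAsync(corpusName));
```

The exception filter stops retrying on the final attempt, so the original exception propagates with its stack trace intact. Retrying is only appropriate for transient failures; errors such as an invalid corpus name will fail the same way on every attempt.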

API Reference