Document Client - gunpal5/Google_GenerativeAI GitHub Wiki
Introduction
The DocumentsClient provides methods for interacting with the Gemini API's Documents endpoint. It lets you create, manage, and query documents within a corpus. Documents are the individual units of content within a corpus, containing the text that will be used for semantic search.
Details
The DocumentsClient supports the following operations:
Creating a Document
The CreateDocumentAsync method creates a new document within a specified corpus.
using GenerativeAI.Clients;
using GenerativeAI.Types;
// ... other code ...
var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient
var parentCorpus = "corpora/my-corpus-id"; // Replace with the parent corpus name
var document = new Document
{
    DisplayName = "My Sample Document",
    CustomMetadata = new List<CustomMetadata>
    {
        new CustomMetadata { Key = "my key", StringValue = "This is a test document" }
    }
    // ... other document properties ...
};
var createdDocument = await documentsClient.CreateDocumentAsync(parentCorpus, document);
if (createdDocument != null)
{
    Console.WriteLine($"Document created: {createdDocument.Name}");
}
else
{
    Console.WriteLine("Failed to create document.");
}
Querying a Document
The QueryDocumentAsync method performs semantic search within a specific document.
using GenerativeAI.Clients;
using GenerativeAI.Types;
// ... other code ...
var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient
var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name
var queryDocumentRequest = new QueryDocumentRequest
{
    Query = "What is mentioned about topic X in this document?",
    // ... other query parameters ...
};
var queryDocumentResponse = await documentsClient.QueryDocumentAsync(documentName, queryDocumentRequest);
if (queryDocumentResponse != null && queryDocumentResponse.RelevantChunks != null)
{
    foreach (var chunk in queryDocumentResponse.RelevantChunks)
    {
        Console.WriteLine($"Relevant Chunk: {chunk.ChunkData.Text}");
    }
}
else
{
    Console.WriteLine("No relevant chunks found.");
}
Listing Documents
The ListDocumentsAsync method retrieves a list of documents within a corpus.
using GenerativeAI.Clients;
using GenerativeAI.Types;
// ... other code ...
var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient
var parentCorpus = "corpora/my-corpus-id"; // Replace with the parent corpus name
var listDocumentsResponse = await documentsClient.ListDocumentsAsync(parentCorpus); // You can provide pageSize and pageToken
if (listDocumentsResponse != null && listDocumentsResponse.Documents != null)
{
    foreach (var document in listDocumentsResponse.Documents)
    {
        Console.WriteLine($"Document Name: {document.Name}");
    }
}
else
{
    Console.WriteLine("No documents found.");
}
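The listing call above accepts a pageSize and pageToken, so a corpus with many documents has to be read page by page. The following is a minimal sketch of that loop, written against a delegate so it does not depend on the exact SDK signature. The Paging class and CollectAllAsync method are hypothetical names, not part of the library, and the parameter order and the NextPageToken property shown in the usage comment are assumptions to verify against the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the GenerativeAI SDK.
public static class Paging
{
    // fetchPage receives the current page token (null for the first page) and
    // returns that page's items plus the token for the next page.
    // The loop stops when the returned token is null or empty.
    public static async Task<List<T>> CollectAllAsync<T>(
        Func<string, Task<(IReadOnlyList<T> Items, string NextPageToken)>> fetchPage)
    {
        var all = new List<T>();
        string pageToken = null;
        do
        {
            var (items, next) = await fetchPage(pageToken);
            if (items != null) all.AddRange(items);
            pageToken = next;
        } while (!string.IsNullOrEmpty(pageToken));
        return all;
    }
}

// Assumed usage (argument order and NextPageToken are assumptions):
// var docs = await Paging.CollectAllAsync<Document>(async token =>
// {
//     var page = await documentsClient.ListDocumentsAsync(parentCorpus, 20, token);
//     return (page?.Documents, page?.NextPageToken);
// });
```

Passing the fetch step as a delegate keeps the loop testable without network access and leaves the real call in one place.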
Getting a Document
The GetDocumentAsync method retrieves a specific document by name.
using GenerativeAI.Clients;
using GenerativeAI.Types;
// ... other code ...
var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient
var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name
var document = await documentsClient.GetDocumentAsync(documentName);
if (document != null)
{
    Console.WriteLine($"Document Display Name: {document.DisplayName}");
    Console.WriteLine($"Document Content: {document.Content?.Text}");
}
else
{
    Console.WriteLine("Document not found.");
}
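The examples on this page pass resource names such as corpora/my-corpus-id and corpora/my-corpus-id/documents/my-document-id as plain strings. As a sketch, a small helper can build and check names in that shape so that a malformed name fails before any request is sent. The ResourceNames class is a hypothetical name, not part of the SDK, and the rules it enforces (non-empty IDs without a slash) are assumptions; the service may apply stricter ones.

```csharp
using System;

// Hypothetical helper, not part of the GenerativeAI SDK.
public static class ResourceNames
{
    // Returns "corpora/{corpusId}".
    public static string Corpus(string corpusId)
    {
        RequireSegment(corpusId, nameof(corpusId));
        return $"corpora/{corpusId}";
    }

    // Returns "corpora/{corpusId}/documents/{documentId}".
    public static string Document(string corpusId, string documentId)
    {
        RequireSegment(documentId, nameof(documentId));
        return $"{Corpus(corpusId)}/documents/{documentId}";
    }

    // True when the name has exactly the four segments corpora/x/documents/y.
    public static bool IsDocumentName(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;
        var parts = name.Split('/');
        return parts.Length == 4
            && parts[0] == "corpora" && parts[1].Length > 0
            && parts[2] == "documents" && parts[3].Length > 0;
    }

    // Rejects empty IDs and IDs that would add extra path segments.
    private static void RequireSegment(string value, string paramName)
    {
        if (string.IsNullOrWhiteSpace(value) || value.IndexOf('/') >= 0)
            throw new ArgumentException("ID must be non-empty and must not contain '/'.", paramName);
    }
}
```

For instance, ResourceNames.Document("my-corpus-id", "my-document-id") produces the documentName string used in the examples on this page.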
Updating a Document
The UpdateDocumentAsync method updates an existing document.
using GenerativeAI.Clients;
using GenerativeAI.Types;
// ... other code ...
var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient
var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name
var updatedDocument = new Document
{
    Name = documentName, // Important: Include the name in the updated document object.
    DisplayName = "My Updated Document Name",
    CustomMetadata = new List<CustomMetadata>
    {
        new CustomMetadata { Key = "my key", StringValue = "This is a test document updated" }
    }
    // ... other updated properties ...
};
string updateMask = "displayName,customMetadata"; // List exactly the fields set above
var resultDocument = await documentsClient.UpdateDocumentAsync(documentName, updatedDocument, updateMask);
if (resultDocument != null)
{
    Console.WriteLine($"Document updated: {resultDocument.DisplayName}");
}
else
{
    Console.WriteLine("Failed to update document.");
}
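Because the update mask is a comma-separated string, it is easy to mistype or to let it drift from the fields actually set on the document. The following sketch assembles the mask from individual field paths. UpdateMask.From is a hypothetical helper, not part of the SDK, and it does not check that the service accepts a given path.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the GenerativeAI SDK.
public static class UpdateMask
{
    // Joins field paths into a comma-separated mask: trims whitespace,
    // skips blanks and duplicates, and keeps first-seen order.
    public static string From(params string[] fieldPaths)
    {
        if (fieldPaths == null) throw new ArgumentNullException(nameof(fieldPaths));

        var seen = new HashSet<string>(StringComparer.Ordinal);
        var paths = new List<string>();
        foreach (var path in fieldPaths)
        {
            var trimmed = path?.Trim();
            if (!string.IsNullOrEmpty(trimmed) && seen.Add(trimmed))
                paths.Add(trimmed);
        }

        if (paths.Count == 0)
            throw new ArgumentException("At least one field path is required.", nameof(fieldPaths));
        return string.Join(",", paths);
    }
}
```

The result of UpdateMask.From("displayName", "customMetadata") can be passed as the updateMask argument in the update call above.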
Deleting a Document
The DeleteDocumentAsync method deletes a document.
using GenerativeAI.Clients;
// ... other code ...
var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient
var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name
await documentsClient.DeleteDocumentAsync(documentName); // You can optionally set force to true
Console.WriteLine($"Document deleted: {documentName}");
Important Considerations
- Ensure proper authorization is configured before using the DocumentsClient. See the Authentication page.
- Replace placeholder document names, IDs, and corpus names with actual values.
- Handle potential exceptions during API calls.
- Be mindful of rate limits when making frequent requests. See the official documentation for details.
- The updateMask parameter in UpdateDocumentAsync is crucial. It specifies which fields of the Document object should be updated. Only the fields listed in the updateMask will be modified.
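The advice above on handling exceptions and respecting rate limits can be combined in a retry wrapper with exponential backoff. This is a sketch under stated assumptions, not the library's own mechanism: Retry.WithBackoffAsync is a hypothetical helper, and because this page does not say which exception types the client throws, the caller supplies the isTransient predicate that decides what is worth retrying.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the GenerativeAI SDK.
public static class Retry
{
    // Runs operation up to maxAttempts times. After a failure that isTransient
    // accepts, waits and doubles the delay before the next attempt. Any other
    // exception, or a failure on the last attempt, propagates to the caller.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 4,
        TimeSpan? initialDelay = null)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = initialDelay ?? TimeSpan.FromSeconds(1);
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call such as documentsClient.GetDocumentAsync(documentName) could be wrapped as the operation, with a predicate like ex => ex is TimeoutException; which exception types actually signal a rate limit or a transient failure is an assumption to confirm against the SDK.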