components llm_rag_data_import_acs - Azure/azureml-assets GitHub Wiki

LLM - Import Data from ACS

llm_rag_data_import_acs

Overview

Collects documents from an Azure Cognitive Search (ACS) index, extracts their contents, saves them to a uri folder, and creates an MLIndex YAML file that represents the search index.

Other components can then use the collected documents without querying the ACS index again, which keeps the dataset consistent for chunking, data generation, etc.

Version: 0.0.71

Tags

Preview

View in Studio: https://ml.azure.com/registries/azureml/components/llm_rag_data_import_acs/version/0.0.71

Inputs

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| num_docs | Number of documents to import from ACS instance | integer | 50 | | |
| acs_config | Values for connecting to ACS instance. Required keys: 'endpoint', 'endpoint_key_name', 'index_name', 'content_key', 'title_key'. 'content_key' defaults to 'content' and 'title_key' defaults to 'title' | string | | | |
| use_existing | Use an existing ACS which is already embedded - directly output MLIndex config | string | False | | ['True', 'False'] |
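
Because `acs_config` is a single string input that carries several keys, it can help to build it programmatically. The following is a minimal sketch that assumes the component expects a JSON-encoded string (the table only states the type as string); the helper name `build_acs_config` and the placeholder values are hypothetical.

```python
import json


def build_acs_config(endpoint, endpoint_key_name, index_name,
                     content_key="content", title_key="title"):
    """Serialize the five documented acs_config keys into one string.

    Assumption: the component parses acs_config as JSON.
    The defaults for content_key and title_key follow the Inputs table.
    """
    config = {
        "endpoint": endpoint,
        "endpoint_key_name": endpoint_key_name,
        "index_name": index_name,
        "content_key": content_key,
        "title_key": title_key,
    }
    # Fail early if any key was passed as an empty value.
    empty = [key for key, value in config.items() if not value]
    if empty:
        raise ValueError("acs_config is missing values for: " + ", ".join(empty))
    return json.dumps(config)


# Placeholder values; substitute your own search service, key name, and index.
acs_config = build_acs_config(
    endpoint="https://<your-search-service>.search.windows.net",
    endpoint_key_name="<your-key-name>",
    index_name="<your-index>",
)
print(acs_config)
```

The resulting string can then be passed as the `acs_config` input when the component is wired into a pipeline.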

Outputs

| Name | Description | Type |
| --- | --- | --- |
| output_data | Uri folder containing the documents' content saved as md files | uri_folder |
| ml_index | Uri folder containing an MLIndex yaml representing the ACS Index. Can be used with azureml-rag package | uri_folder |
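
Since `output_data` is a folder of md files, a downstream step can read the documents straight from disk instead of querying ACS again. The sketch below assumes the folder has been downloaded or mounted to a local path; the helper `load_markdown_docs` and the file name `doc_0.md` are hypothetical, and the temporary directory only stands in for the real output.

```python
import tempfile
from pathlib import Path


def load_markdown_docs(folder):
    """Return {relative path: text} for every .md file under folder."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): path.read_text(encoding="utf-8")
        for path in sorted(root.rglob("*.md"))
    }


# Stand-in for a local copy of the output_data folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "doc_0.md").write_text("# Title\n\nBody text", encoding="utf-8")
    docs = load_markdown_docs(tmp)

print(len(docs))
```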

Environment

azureml:llm-rag-embeddings@latest
