components llm_rag_data_import_acs - Azure/azureml-assets GitHub Wiki
Collects documents from an Azure Cognitive Search (ACS) index, extracts their contents, saves them to a uri folder, and creates an MLIndex yaml file to represent the search index.
The collected documents can then be used in other components without querying the ACS index again, which gives a consistent dataset for chunking, data generation, etc.
Version: 0.0.68
Preview
View in Studio: https://ml.azure.com/registries/azureml/components/llm_rag_data_import_acs/version/0.0.68
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
num_docs | Number of documents to import from the ACS instance | integer | 50 | | |
acs_config | Values for connecting to ACS instance. Required keys: 'endpoint', 'endpoint_key_name', 'index_name', 'content_key', 'title_key'. 'content_key' defaults to 'content' and 'title_key' defaults to 'title' | string | | | |
use_existing | Use an existing ACS index that already contains embeddings; the component then outputs the MLIndex config directly | string | False | | ['True', 'False'] |
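
The `acs_config` input is typed as a string but carries several keys, so it can help to build it programmatically. The sketch below is one possible way to do that, and it rests on assumptions: the table does not state the serialization format, so JSON is assumed here, and the helper `build_acs_config` and all values (endpoint URL, key name, index name) are hypothetical placeholders.

```python
import json

# Keys with documented defaults, per the acs_config description.
DEFAULTS = {"content_key": "content", "title_key": "title"}


def build_acs_config(endpoint, endpoint_key_name, index_name, **overrides):
    """Return acs_config as a JSON string (the JSON encoding is an assumption)."""
    # Only the two keys with documented defaults may be overridden.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unexpected acs_config keys: {sorted(unknown)}")

    config = {
        "endpoint": endpoint,
        "endpoint_key_name": endpoint_key_name,
        "index_name": index_name,
        **DEFAULTS,
        **overrides,
    }

    # Reject empty values so a misconfigured input fails before submission.
    missing = [key for key, value in config.items() if not value]
    if missing:
        raise ValueError(f"missing acs_config values: {missing}")
    return json.dumps(config)


# Placeholder values for illustration only.
acs_config = build_acs_config(
    endpoint="https://example-search.search.windows.net",
    endpoint_key_name="example-acs-key",
    index_name="example-index",
)
print(acs_config)
```

Passing `content_key="body"` or `title_key="name"` overrides the documented defaults when the index stores content or titles under different field names.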
Name | Description | Type |
---|---|---|
output_data | Uri folder containing the documents' content saved as md files | uri_folder |
ml_index | Uri folder containing an MLIndex yaml representing the ACS Index. Can be used with azureml-rag package | uri_folder |
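
Because `output_data` is a folder of md files, downstream code can read it with the standard library alone once the folder is available locally. The following is a minimal sketch under that assumption; `load_markdown_docs` is a hypothetical helper, and a temporary folder stands in for a downloaded `output_data` folder, whose real file names and layout are not specified here.

```python
import tempfile
from pathlib import Path


def load_markdown_docs(folder):
    """Map each .md file's path (relative to folder) to its text content."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): path.read_text(encoding="utf-8")
        for path in sorted(root.rglob("*.md"))
    }


# Demo: a throwaway folder standing in for a local copy of output_data.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "doc0.md").write_text("# Title\n\nBody text.", encoding="utf-8")
    (Path(tmp) / "notes.txt").write_text("not a document", encoding="utf-8")
    docs = load_markdown_docs(tmp)

print(sorted(docs))
```

Files without the `.md` extension are skipped, and subfolders are searched recursively in case the documents are nested.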
Environment: azureml:llm-rag-embeddings@latest