llm_ingest_db_to_acs component (Azure/azureml-assets)
Single-job pipeline that chunks data from an AzureML SQL datastore and creates an ACS embeddings index.
Version: 0.0.97
Preview
View in Studio: https://ml.azure.com/registries/azureml/components/llm_ingest_db_to_acs/version/0.0.97
Inputs
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
db_datastore | Database datastore URI in the format 'azureml://datastores/{datastore_name}' | string | | | |
embeddings_model | The model used to generate embeddings, in the format 'azure_open_ai://endpoint/{endpoint_name}/deployment/{deployment_name}/model/{model_name}' | string | | | |
chat_aoai_deployment_name | The name of the chat AOAI deployment | string | | True | |
embedding_aoai_deployment_name | The name of the embedding AOAI deployment | string | | | |
embeddings_dataset_name | The name of the ACS index | string | | | |
max_tables | | integer | | True | |
max_columns | | integer | | True | |
max_rows | | integer | | True | |
max_sampling_rows | | integer | | True | |
max_text_length | | integer | | True | |
max_knowledge_pieces | | integer | | True | |
selected_tables | | string | | True | |
column_settings | | string | | True | |
llm_config | The name of the LLM config | string | | True | |
runtime | The name of the runtime | string | | False | |
serverless_instance_count | | integer | 1 | True | |
serverless_instance_type | | string | Standard_DS3_v2 | True | |
embedding_connection | Azure OpenAI workspace connection ARM ID for embeddings | string | | True | |
llm_connection | Azure OpenAI workspace connection ARM ID for LLM | string | | True | |
acs_connection | Azure Cognitive Search workspace connection ARM ID | string | | True | |
acs_config | JSON describing the ACS index to create or update for embeddings | string | | | |
sample_data | Sample data to be used for data ingestion. Format: 'azureml:samples-test:1' | uri_folder | | True | |
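The `db_datastore` and `embeddings_model` inputs are URI strings in the formats given in their descriptions. A minimal Python sketch of assembling them follows; the helper names and the sample datastore, endpoint, deployment, and model names are hypothetical and not part of the component.

```python
def db_datastore_uri(datastore_name: str) -> str:
    """Build a `db_datastore` value: azureml://datastores/{datastore_name}."""
    if not datastore_name or "/" in datastore_name:
        raise ValueError(f"invalid datastore name: {datastore_name!r}")
    return f"azureml://datastores/{datastore_name}"


def embeddings_model_uri(endpoint_name: str, deployment_name: str, model_name: str) -> str:
    """Build an `embeddings_model` value from its three documented parts."""
    for part in (endpoint_name, deployment_name, model_name):
        if not part or "/" in part:
            raise ValueError(f"invalid URI segment: {part!r}")
    return (
        f"azure_open_ai://endpoint/{endpoint_name}"
        f"/deployment/{deployment_name}/model/{model_name}"
    )


# Hypothetical names, for illustration only.
inputs = {
    "db_datastore": db_datastore_uri("my_sql_datastore"),
    "embeddings_model": embeddings_model_uri(
        "my-aoai", "my-embedding-deployment", "text-embedding-ada-002"
    ),
}
```

Rejecting empty segments and segments containing `/` is a choice made for this sketch, so that a malformed name fails early instead of producing a URI with the wrong number of path components.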
sample_data example path: "azureml:samples-test:1"
Data ingest settings:
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
sample_acs_config | JSON describing the ACS index to create or update for samples | string | | | |
include_builtin_examples | | boolean | True | True | |
tools | The names of the tools for dbcopilot. Supported tools: "tsql", "python". Format: ["tsql", "python"] | string | | True | |
knowledge_pieces | The list of knowledge pieces to be used for grounding. | string | | True | |
include_views | Whether to turn on views. | boolean | | True | |
instruct_template | The instruct template for the LLM. | string | | True | |
managed_identity_enabled | Whether to connect using managed identity. | boolean | False | True | |
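The `tools` input is typed as a string but documented with a list-like format, `["tsql", "python"]`. One way to produce such a string is sketched below in Python, assuming the component parses the value as JSON (the source gives only the format, not the parser). The helper name `encode_tools` is hypothetical.

```python
import json

# Supported tool names, taken from the `tools` description.
SUPPORTED_TOOLS = ("tsql", "python")


def encode_tools(tools):
    """Return the `tools` input as a JSON-encoded list string."""
    unknown = [t for t in tools if t not in SUPPORTED_TOOLS]
    if unknown:
        raise ValueError(f"unsupported tools: {unknown}")
    # Assumption: the component accepts a JSON array of strings.
    return json.dumps(list(tools))


tools_value = encode_tools(["tsql", "python"])
```

Encoding with `json.dumps` instead of writing the string by hand avoids quoting mistakes such as single quotes, which are not valid JSON.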
Outputs
Name | Description | Type |
---|---|---|
grounding_index | | uri_folder |
db_context | | uri_folder |
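Both `grounding_index` and `db_context` are `uri_folder` outputs, so a later step in an AzureML pipeline job can consume them through a data-binding expression of the form `${{parent.jobs.<step>.outputs.<output>}}`. A small Python sketch that builds such expressions follows; the helper name and the step name `ingest_db` are hypothetical, since the step name is whatever the enclosing pipeline assigns to this component.

```python
# Output names of this component, from the table above.
OUTPUT_NAMES = ("grounding_index", "db_context")


def output_binding(step_name: str, output_name: str) -> str:
    """Build a pipeline binding expression for one of the component's outputs."""
    if output_name not in OUTPUT_NAMES:
        raise ValueError(f"unknown output: {output_name!r}")
    # Plain concatenation keeps the literal double braces readable.
    return "${{parent.jobs." + step_name + ".outputs." + output_name + "}}"


# "ingest_db" is a hypothetical step name chosen for illustration.
bindings = {name: output_binding("ingest_db", name) for name in OUTPUT_NAMES}
```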