LLM - Generate Embeddings Parallel

llm_rag_generate_embeddings_parallel

Overview

Generates embedding vectors for the data chunks read from chunks_source.

chunks_source is expected to contain CSV files with two columns:

"Chunk" - Chunk of text to be embedded
"Metadata" - JSON object containing metadata for the chunk
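As a rough illustration of the expected input, the sketch below writes a two-column CSV of this shape using only the Python standard library. The helper name `write_chunks_csv`, the sample text, and the metadata keys (`source`, `page`) are assumptions made for this example; the component itself does not prescribe them.

```python
import csv
import io
import json


def write_chunks_csv(rows, fileobj):
    """Write (chunk_text, metadata_dict) pairs as a 'Chunk'/'Metadata' CSV.

    Hypothetical helper for illustration; not part of the component.
    """
    writer = csv.DictWriter(fileobj, fieldnames=["Chunk", "Metadata"])
    writer.writeheader()
    for text, metadata in rows:
        # Metadata is stored as a JSON object serialized into one CSV cell.
        writer.writerow({"Chunk": text, "Metadata": json.dumps(metadata)})


# Example rows; the metadata keys here are made up for the sketch.
sample_rows = [
    ("Components are reusable pipeline steps.", {"source": "intro.md", "page": 1}),
    ("Embeddings map text to vectors.", {"source": "intro.md", "page": 2}),
]

buffer = io.StringIO()
write_chunks_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
```

In practice the same function could be pointed at a real file inside the folder passed as chunks_source (opened with `newline=""`, as the `csv` module recommends).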

If embeddings_container is supplied, input chunks are compared with the existing chunks in the Embeddings Container, and only new or changed chunks are embedded; unchanged chunks are reused.
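The reuse behavior can be pictured with a minimal sketch. It assumes chunks are matched by a SHA-256 hash of their text, which is a stand-in chosen for this example: the page does not say how the component actually compares chunks. The names `chunk_key`, `embed_incrementally`, and `fake_embed` are hypothetical.

```python
import hashlib


def chunk_key(text):
    """Stable key for a chunk (assumption: identity is the text's hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_incrementally(chunks, previous, embed_fn):
    """Return {key: vector}, calling embed_fn only for chunks not in previous.

    previous maps chunk keys to vectors from an earlier run.
    """
    result = {}
    for text in chunks:
        key = chunk_key(text)
        if key in previous:
            result[key] = previous[key]   # unchanged chunk: reuse the vector
        else:
            result[key] = embed_fn(text)  # new or changed chunk: embed it
    return result


# Stand-in embedding function that records which chunks it was asked to embed.
calls = []


def fake_embed(text):
    calls.append(text)
    return [float(len(text))]


previous = {chunk_key("unchanged chunk"): [15.0]}
current = embed_incrementally(["unchanged chunk", "new chunk"], previous, fake_embed)
```

Only "new chunk" reaches `fake_embed` here; the vector for "unchanged chunk" is copied from `previous`.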

Version: 0.0.84

Inputs

Name	Description	Type	Default	Optional	Enum
chunks_source	Folder containing chunks to be embedded.	uri_folder

If adding to previously generated Embeddings

Name	Description	Type	Default	Optional	Enum
embeddings_container	Folder containing previously generated embeddings. Should be the parent folder of the 'embeddings' output path used for this component. Input data is compared to the existing embeddings and only changed/new data is embedded; existing chunks are reused.	uri_folder		True

Embeddings settings

Name	Description	Type	Default	Optional	Enum
embeddings_model	The model to use to embed data. E.g. 'hugging_face://model/sentence-transformers/all-mpnet-base-v2' or 'azure_open_ai://deployment/{deployment_name}/model/{model_name}'	string	hugging_face://model/sentence-transformers/all-mpnet-base-v2
deployment_validation	URI file recording whether the Azure OpenAI deployments, if used, have been validated.	uri_file		True
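To make the two embeddings_model formats concrete, here is a small sketch that splits such a string into its parts. The parser is hypothetical and only mirrors the two example patterns shown in the table; it is not the component's own parsing code, and the deployment name used below is a placeholder.

```python
def parse_embeddings_model(uri):
    """Split an embeddings_model string into its parts (illustrative only)."""
    kind, sep, rest = uri.partition("://")
    if not sep:
        raise ValueError(f"missing '://' in {uri!r}")

    if kind == "hugging_face":
        # Pattern: hugging_face://model/{model_id}
        prefix = "model/"
        if not rest.startswith(prefix) or len(rest) == len(prefix):
            raise ValueError(f"expected 'model/<id>' in {uri!r}")
        return {"kind": kind, "model": rest[len(prefix):]}

    if kind == "azure_open_ai":
        # Pattern: azure_open_ai://deployment/{deployment_name}/model/{model_name}
        parts = rest.split("/")
        if len(parts) != 4 or parts[0] != "deployment" or parts[2] != "model":
            raise ValueError(f"expected 'deployment/<name>/model/<name>' in {uri!r}")
        return {"kind": kind, "deployment": parts[1], "model": parts[3]}

    raise ValueError(f"unsupported model kind {kind!r}")


default_model = parse_embeddings_model(
    "hugging_face://model/sentence-transformers/all-mpnet-base-v2"
)
# 'my-deployment' is a placeholder deployment name for this example.
aoai_model = parse_embeddings_model(
    "azure_open_ai://deployment/my-deployment/model/text-embedding-ada-002"
)
```

Splitting on `://` by hand is a deliberate choice: the kinds contain underscores, which are not valid in URL schemes, so `urllib.parse` would not recognize them as schemes.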

Outputs

Name	Description	Type
embeddings	Where to save data with embeddings. If previous embeddings are supplied, this should be a subfolder of that folder, typically named using '${name}', e.g. /my/prev/embeddings/${name}	uri_folder
processed_file_names	Text file containing the names of the files that were processed	uri_file

components llm_rag_generate_embeddings_parallel - Azure/azureml-assets GitHub Wiki
