OSS Distillation Pipeline

oss_distillation_pipeline

Overview

Component that generates data from a teacher model endpoint and fine-tunes a student model on the generated dataset.

Version: 0.0.10

View in Studio: https://ml.azure.com/registries/azureml/components/oss_distillation_pipeline/version/0.0.10

Inputs

Compute parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| instance_type_pipeline_validation | Instance type to be used for the validation component. The parameter compute_pipeline_validation must be set to 'serverless' for instance_type to be used. | string | | True | |
| instance_type_data_generation | Instance type to be used for the data generation component in case of virtual cluster compute, e.g. Singularity.ND40_v2. The parameter compute_data_generation must be set to 'serverless' for instance_type to be used. | string | Standard_D4as_v4 | True | |
| instance_type_data_import | Instance type to be used for the data_import component in case of virtual cluster compute, e.g. Singularity.D8_v3. The parameter compute_data_import must be set to 'serverless' for instance_type to be used. | string | Singularity.ND96amrs_A100_v4 | True | |
| instance_type_finetune | Instance type to be used for the finetune component in case of virtual cluster compute, e.g. Singularity.ND40_v2. The parameter compute_finetune must be set to 'serverless' for instance_type to be used. | string | Singularity.ND96amrs_A100_v4 | True | |
| compute_pipeline_validation | Compute to be used for the validation component. | string | serverless | True | |
| compute_data_generation | Compute to be used for data generation, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the respective cluster is used. | string | serverless | True | |
| compute_data_import | Compute to be used for data_import, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the respective cluster is used. | string | serverless | True | |
| compute_finetune | Compute to be used for finetune, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the respective cluster is used. | string | serverless | True | |
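
The interaction between each compute_* parameter and its instance_type_* counterpart can be sketched as follows. This is a minimal illustration of the rule described in the table, not the component's actual code; the helper name `resolve_compute` and its return shape are hypothetical.

```python
from typing import Optional

SERVERLESS = "serverless"


def resolve_compute(compute: str = SERVERLESS,
                    instance_type: Optional[str] = None) -> dict:
    """Hypothetical helper mirroring the documented compute/instance_type rule."""
    # The parameter descriptions state that \ and ' are invalid in the value.
    if "\\" in compute or "'" in compute:
        raise ValueError("compute name must not contain \\ or '")
    if compute == SERVERLESS:
        # instance_type is only honoured on serverless compute.
        return {"compute": SERVERLESS, "instance_type": instance_type}
    # A named cluster wins; instance_type is ignored.
    return {"compute": compute, "instance_type": None}


serverless_cfg = resolve_compute("serverless", "Standard_D4as_v4")
cluster_cfg = resolve_compute("FT-Cluster", "Standard_D4as_v4")
```

Here `serverless_cfg` keeps the instance type, while `cluster_cfg` drops it because a cluster name was given.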

Data Generator Component

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| train_file_path | Path to the registered training data asset. The supported data formats are jsonl, json, csv, tsv and parquet. | uri_file | | | |
| validation_file_path | Path to the registered validation data asset. The supported data formats are jsonl, json, csv, tsv and parquet. | uri_file | | True | |
| teacher_model_endpoint_name | Teacher model endpoint name. | string | | True | |
| teacher_model_endpoint_url | Teacher model endpoint URL. | string | | True | |
| teacher_model_endpoint_key | Teacher model endpoint key. | string | | True | |
| teacher_model_max_new_tokens | Teacher model max_new_tokens inference parameter. | integer | 128 | | |
| teacher_model_temperature | Teacher model temperature inference parameter. | number | 0.2 | | |
| teacher_model_top_p | Teacher model top_p inference parameter. | number | 0.1 | | |
| teacher_model_frequency_penalty | Teacher model frequency penalty inference parameter. | number | 0.0 | | |
| teacher_model_presence_penalty | Teacher model presence penalty inference parameter. | number | 0.0 | | |
| teacher_model_stop | Teacher model stop inference parameter. | string | | True | |
| request_batch_size | Number of data records sent to the teacher model endpoint in one go. | integer | 10 | | |
| min_endpoint_success_ratio | The minimum value of (successful_requests / total_requests) required for classifying inference as successful. If (successful_requests / total_requests) < min_endpoint_success_ratio, the experiment is marked as failed. 0 means all requests are allowed to fail, while 1 means no request may fail. | number | 0.7 | | |
| enable_chain_of_thought | Enable chain of thought for data generation. | string | false | True | ['true', 'false'] |
| enable_chain_of_density | Enable chain of density for text summarization. | string | false | True | ['true', 'false'] |
| max_len_summary | Maximum summary length for text summarization. | integer | 80 | True | |
| data_generation_task_type | Data generation task type. Supported values: NLI (generate Natural Language Inference data); CONVERSATION (generate conversational data, multi/single turn); NLU_QA (generate Natural Language Understanding data for question answering); MATH (generate math data for numerical responses); SUMMARIZATION (generate a key summary for an article). | string | | | ['NLI', 'CONVERSATION', 'NLU_QA', 'MATH', 'SUMMARIZATION'] |
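
The request_batch_size and min_endpoint_success_ratio semantics can be illustrated with a short sketch. The function names `batches` and `inference_succeeded` are hypothetical, and how the component actually groups records or counts requests is an assumption; only the threshold comparison comes from the parameter description.

```python
from typing import Iterator, List, Sequence


def batches(records: Sequence, request_batch_size: int = 10) -> Iterator[List]:
    """Yield consecutive slices of at most request_batch_size records."""
    for start in range(0, len(records), request_batch_size):
        yield list(records[start:start + request_batch_size])


def inference_succeeded(successful_requests: int,
                        total_requests: int,
                        min_endpoint_success_ratio: float = 0.7) -> bool:
    """Return False when the success ratio falls below the threshold."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    ratio = successful_requests / total_requests
    # The run is marked failed only if ratio < min_endpoint_success_ratio.
    return ratio >= min_endpoint_success_ratio


batch_sizes = [len(b) for b in batches(list(range(25)))]
ok_at_threshold = inference_succeeded(7, 10)
ok_below = inference_succeeded(69, 100)
```

With 25 records and the default batch size of 10, the sketch produces batches of 10, 10 and 5 records. A ratio exactly equal to the threshold still counts as success, since the description only fails the run on a strict "less than".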

Batch Score Component

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| authentication_type | Authentication type for endpoint. Either azureml_workspace_connection or managed_identity. | string | azureml_workspace_connection | False | ['azureml_workspace_connection', 'managed_identity'] |
| additional_headers | JSON serialized string expressing additional headers to be added to each request. | string | | True | |
| debug_mode | Enable debug mode to print all the debug logs in the score step. | boolean | False | False | |
| ensure_ascii | If set to true, the output is guaranteed to have all incoming non-ASCII characters escaped. If set to false, these characters will be output as-is. More detailed information can be found at https://docs.python.org/3/library/json.html | boolean | False | False | |
| max_retry_time_interval | The maximum time (in seconds) spent retrying a payload. If unspecified, payloads are retried for unlimited time. | integer | | True | |
| initial_worker_count | The initial number of workers to use for scoring. | integer | 5 | False | |
| max_worker_count | Overrides initial_worker_count if necessary. | integer | 200 | False | |
| instance_count | Number of nodes in a compute cluster we will run the batch score step on. | integer | 1 | | |
| max_concurrency_per_instance | Number of processes that will be run concurrently on any given node. This number should not be larger than 1/2 of the number of cores in an individual node in the specified cluster. | integer | 1 | | |
| mini_batch_size | The mini batch size for parallel run. | string | 100KB | True | |
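
Two of the parameters above map directly onto Python's standard json module, which the ensure_ascii description links to. The sketch below shows how an additional_headers value could be produced and what ensure_ascii changes in serialized output. The header name is a made-up placeholder, and whether the component calls json.dumps in exactly this way is an assumption.

```python
import json

# Hypothetical header; any JSON object of string keys and values would do.
headers = {"x-example-header": "demo"}
# additional_headers expects the headers as one JSON serialized string.
additional_headers = json.dumps(headers)

record = {"text": "café"}
# ensure_ascii=True escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(record, ensure_ascii=True)
# ensure_ascii=False writes the characters as-is.
raw = json.dumps(record, ensure_ascii=False)

print(additional_headers)
print(escaped)
print(raw)
```

Both `escaped` and `raw` decode back to the same record with json.loads, so the flag affects only how the output is encoded, not its content.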

Finetuning Component

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| number_of_gpu_to_use_finetuning | Number of GPUs to be used per node for finetuning. This should equal the number of GPUs per node in the compute SKU used for finetune. | integer | 1 | True | |

Continual-Finetuning model path

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| mlflow_model_path | MLflow model asset path. Special characters like \ and ' are invalid in the parameter value. | mlflow_model | | True | |
| pytorch_model_path | PyTorch model asset path. Special characters like \ and ' are invalid in the parameter value. | custom_model | | True | |

Training parameters

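| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| num_train_epochs | Number of training epochs. | integer | 1 | True | |
| per_device_train_batch_size | Train batch size per device. | integer | 1 | True | |
| learning_rate | Start learning rate. | number | 0.0003 | True | |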

Validation parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| system_properties | Validation parameters propagated from pipeline. | string | | True | |

Model parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| model_asset_id | Asset id of model. | string | | False | |

Model registration

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| registered_model_name | Name of the registered model. | string | | True | |
| validation_info | Validation status. | uri_file | | True | |
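
To show how the inputs fit together, here is a rough sketch that assembles a plain dictionary of pipeline inputs using the defaults listed in the tables above. It assumes that train_file_path, data_generation_task_type and model_asset_id are the inputs a caller must always supply, which is an inference from the Default and Optional columns. The helper `build_inputs` and the placeholder values are hypothetical, and a real submission would pass data assets through the Azure ML SDK or CLI rather than as bare strings.

```python
TASK_TYPES = {"NLI", "CONVERSATION", "NLU_QA", "MATH", "SUMMARIZATION"}


def build_inputs(train_file_path: str,
                 data_generation_task_type: str,
                 model_asset_id: str,
                 **overrides) -> dict:
    """Hypothetical helper: merge caller values over the documented defaults."""
    if data_generation_task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task type: {data_generation_task_type}")
    inputs = {
        "train_file_path": train_file_path,
        "data_generation_task_type": data_generation_task_type,
        "model_asset_id": model_asset_id,
        # Defaults copied from the parameter tables.
        "teacher_model_max_new_tokens": 128,
        "teacher_model_temperature": 0.2,
        "teacher_model_top_p": 0.1,
        "request_batch_size": 10,
        "min_endpoint_success_ratio": 0.7,
        "num_train_epochs": 1,
        "per_device_train_batch_size": 1,
        "learning_rate": 0.0003,
    }
    inputs.update(overrides)
    return inputs


# Placeholder values only; substitute real asset references.
inputs = build_inputs("<train-data-asset>", "MATH", "<student-model-asset-id>",
                      num_train_epochs=3)
```

Any keyword argument overrides the matching default, so `inputs["num_train_epochs"]` is 3 here while the other values stay at their documented defaults.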

Outputs

| Name | Description | Type |
| --- | --- | --- |
| output_model | Output directory in which to save the finetuned LoRA weights. | uri_folder |