OSS Distillation Pipeline

oss_distillation_pipeline

Overview

Component that generates data from a teacher model endpoint and finetunes a student model on the generated dataset.

Version: 0.0.10

View in Studio: https://ml.azure.com/registries/azureml/components/oss_distillation_pipeline/version/0.0.10

Inputs

Compute parameters

Name	Description	Type	Default	Optional
instance_type_pipeline_validation	Instance type to be used for validation component. The parameter compute_pipeline_validation must be set to 'serverless' for instance_type to be used.	string		True
instance_type_data_generation	Instance type to be used for data_generation component in case of virtual cluster compute, e.g. Singularity.ND40_v2. The parameter compute_data_generation must be set to 'serverless' for instance_type to be used	string	Standard_D4as_v4	True
instance_type_data_import	Instance type to be used for data_import component in case of virtual cluster compute, eg. Singularity.D8_v3. The parameter compute_data_import must be set to 'serverless' for instance_type to be used	string	Singularity.ND96amrs_A100_v4	True
instance_type_finetune	Instance type to be used for finetune component in case of virtual cluster compute, eg. Singularity.ND40_v2. The parameter compute_finetune must be set to 'serverless' for instance_type to be used	string	Singularity.ND96amrs_A100_v4	True
compute_pipeline_validation	compute to be used for validation component	string	serverless	True
compute_data_generation	Compute to be used for data_generation, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field will be ignored and the respective cluster will be used	string	serverless	True
compute_data_import	Compute to be used for data_import, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field will be ignored and the respective cluster will be used	string	serverless	True
compute_finetune	compute to be used for finetune eg. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If compute cluster name is provided, instance_type field will be ignored and the respective cluster will be used	string	serverless	True
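
The interaction between each `compute_*` parameter and its matching `instance_type_*` parameter can be sketched as follows. This is a minimal illustration of the rules stated in the table above, not the component's actual code; the function name `resolve_compute` and the shape of the returned dictionary are hypothetical.

```python
from typing import Optional


def resolve_compute(compute: str = "serverless",
                    instance_type: Optional[str] = None) -> dict:
    """Hypothetical helper mirroring the rules described in the table."""
    # The table states that \ and ' are invalid in compute parameter values.
    if "\\" in compute or "'" in compute:
        raise ValueError("compute name must not contain \\ or '")
    if compute == "serverless":
        # instance_type is only honoured when compute is 'serverless'.
        return {"compute": "serverless", "instance_type": instance_type}
    # A named cluster was provided, so instance_type is ignored.
    return {"compute": compute, "instance_type": None}
```

For example, `resolve_compute("FT-Cluster", "Standard_D4as_v4")` drops the instance type, because a named cluster takes precedence.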

Data Generator Component

Name	Description	Type	Default	Optional	Enum
train_file_path	Path to the registered training data asset. The supported data formats are `jsonl`, `json`, `csv`, `tsv` and `parquet`.	uri_file
validation_file_path	Path to the registered validation data asset. The supported data formats are `jsonl`, `json`, `csv`, `tsv` and `parquet`.	uri_file		True
teacher_model_endpoint_name	Teacher model endpoint name	string		True
teacher_model_endpoint_url	Teacher model endpoint URL	string		True
teacher_model_endpoint_key	Teacher model endpoint key	string		True
teacher_model_max_new_tokens	Teacher model max_new_tokens inference parameter	integer	128
teacher_model_temperature	Teacher model temperature inference parameter	number	0.2
teacher_model_top_p	Teacher model top_p inference parameter	number	0.1
teacher_model_frequency_penalty	Teacher model frequency penalty inference parameter	number	0.0
teacher_model_presence_penalty	Teacher model presence penalty inference parameter	number	0.0
teacher_model_stop	Teacher model stop inference parameter	string		True
request_batch_size	Number of data records sent to the teacher model endpoint at a time	integer	10
min_endpoint_success_ratio	The minimum value of (successful_requests / total_requests) required for classifying inference as successful. If (successful_requests / total_requests) < min_endpoint_success_ratio, the experiment will be marked as failed. By default it is 0.7 (0 means all requests are allowed to fail while 1 means no request should fail.)	number	0.7
enable_chain_of_thought	Enable Chain of thought for data generation	string	false	True	['true', 'false']
enable_chain_of_density	Enable Chain of density for text summarization	string	false	True	['true', 'false']
max_len_summary	Maximum summary length for text summarization	integer	80	True
data_generation_task_type	Data generation task type. Supported values are: 1. NLI: Generate Natural Language Inference data 2. CONVERSATION: Generate conversational data (multi/single turn) 3. NLU_QA: Generate Natural Language Understanding data for Question Answering 4. MATH: Generate Math data for numerical responses 5. SUMMARIZATION: Generate a key summary for an article	string			['NLI', 'CONVERSATION', 'NLU_QA', 'MATH', 'SUMMARIZATION']
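
The `request_batch_size` and `min_endpoint_success_ratio` descriptions above can be made concrete with a short sketch. It assumes that records are grouped into consecutive batches and that the ratio check is a plain comparison, as the table describes; the function names `batched` and `inference_succeeded` are hypothetical and are not part of the component.

```python
from typing import Iterator, List, Sequence


def batched(records: Sequence[dict],
            request_batch_size: int = 10) -> Iterator[List[dict]]:
    """Yield consecutive groups of at most request_batch_size records."""
    if request_batch_size < 1:
        raise ValueError("request_batch_size must be at least 1")
    for start in range(0, len(records), request_batch_size):
        yield list(records[start:start + request_batch_size])


def inference_succeeded(successful_requests: int,
                        total_requests: int,
                        min_endpoint_success_ratio: float = 0.7) -> bool:
    """Return False when the success ratio is below the threshold."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    ratio = successful_requests / total_requests
    # Per the table: ratio < min_endpoint_success_ratio marks the run as failed.
    return ratio >= min_endpoint_success_ratio
```

With the default threshold of 0.7, a run in which 6 of 10 requests succeed would be marked as failed, while 7 of 10 would pass.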

Batch Score Component

Name	Description	Type	Default	Optional	Enum
authentication_type	Authentication type for endpoint. Either `azureml_workspace_connection` or `managed_identity`.	string	azureml_workspace_connection	False	['azureml_workspace_connection', 'managed_identity']
additional_headers	JSON serialized string expressing additional headers to be added to each request.	string		True
debug_mode	Enable debug mode to print all the debug logs in the score step.	boolean	False	False
ensure_ascii	If set to true, the output is guaranteed to have all incoming non-ASCII characters escaped. If set to false, these characters will be output as-is. More detailed information can be found at https://docs.python.org/3/library/json.html	boolean	False	False
max_retry_time_interval	The maximum time (in seconds) spent retrying a payload. If unspecified, payloads are retried for unlimited time.	integer		True
initial_worker_count	The initial number of workers to use for scoring.	integer	5	False
max_worker_count	Overrides `initial_worker_count` if necessary.	integer	200	False
instance_count	Number of nodes in a compute cluster we will run the batch score step on.	integer	1
max_concurrency_per_instance	Number of processes that will be run concurrently on any given node. This number should not be larger than 1/2 of the number of cores in an individual node in the specified cluster.	integer	1
mini_batch_size	The mini batch size for parallel run.	string	100KB	True
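
Two of the parameters above lean on standard JSON behaviour, which the following sketch illustrates with Python's `json` module. The header name, header value, and sample payload are hypothetical, and `concurrency_within_limit` is an illustrative helper for the half-the-cores guidance rather than a check the component is known to perform.

```python
import json

# additional_headers expects a JSON serialized string.
# The header name and value here are placeholders for illustration only.
additional_headers = json.dumps({"x-example-header": "distillation"})

payload = {"prompt": "café"}
# ensure_ascii=True escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(payload, ensure_ascii=True)
# ensure_ascii=False writes the characters as-is.
verbatim = json.dumps(payload, ensure_ascii=False)


def concurrency_within_limit(max_concurrency_per_instance: int,
                             cores_per_node: int) -> bool:
    """Check the guidance that concurrency should not exceed half the cores."""
    return max_concurrency_per_instance <= cores_per_node / 2
```

Both `escaped` and `verbatim` decode back to the same payload; they differ only in how the non-ASCII character is written in the serialized string.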

Finetuning Component

Name	Description	Type	Default	Optional	Enum
number_of_gpu_to_use_finetuning	Number of GPUs to use per node for finetuning. Should equal the number of GPUs per node in the compute SKU used for finetuning	integer	1	True

Continual-Finetuning model path

Name	Description	Type	Default	Optional	Enum
mlflow_model_path	MLflow model asset path. Special characters like \ and ' are invalid in the parameter value.	mlflow_model		True
pytorch_model_path	Pytorch model asset path. Special characters like \ and ' are invalid in the parameter value.	custom_model		True

Training parameters

Name	Description	Type	Default	Optional
num_train_epochs	Number of training epochs	integer	1	True
per_device_train_batch_size	Training batch size per device	integer	1	True
learning_rate	Initial learning rate	number	0.0003	True

Validation parameters

Name	Description	Type	Default	Optional	Enum
system_properties	Validation parameters propagated from pipeline.	string		True

Student Model parameters

Name	Description	Type	Default	Optional	Enum
model_asset_id	Asset id of the student model	string		False
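
As a rough aid for assembling inputs before submitting a job, the sketch below checks a plain dictionary against the tables above. It assumes that inputs with no default and no `True` in the Optional column (`train_file_path`, `data_generation_task_type`, `model_asset_id`) are required; that reading, the helper name `check_inputs`, and the placeholder values are assumptions, not behaviour confirmed by the component.

```python
TASK_TYPES = {"NLI", "CONVERSATION", "NLU_QA", "MATH", "SUMMARIZATION"}

# Assumption: inputs without a default that are not marked optional are required.
REQUIRED = ("train_file_path", "data_generation_task_type", "model_asset_id")


def check_inputs(inputs: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing required input: {name}"
                for name in REQUIRED if not inputs.get(name)]
    task = inputs.get("data_generation_task_type")
    if task and task not in TASK_TYPES:
        problems.append(f"unsupported data_generation_task_type: {task}")
    return problems


# Placeholder values for illustration; replace them with real asset references.
inputs = {
    "train_file_path": "<path-to-registered-training-data>",
    "data_generation_task_type": "NLI",
    "model_asset_id": "<student-model-asset-id>",
    "teacher_model_endpoint_name": "<teacher-endpoint-name>",
    "num_train_epochs": 1,
    "learning_rate": 0.0003,
}

problems = check_inputs(inputs)
```

Running a check like this locally catches a misspelled task type or a missing asset reference before the pipeline's own validation component runs.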

Model registration

Name	Description	Type	Default	Optional	Enum
registered_model_name	Name of the registered model	string		True
validation_info	Validation status.	uri_file		True

Outputs

Name	Description	Type
output_model	Output directory where the finetuned LoRA weights are saved	uri_folder

components oss_distillation_pipeline - Azure/azureml-assets GitHub Wiki
