components finetune_common_validation - Azure/azureml-assets GitHub Wiki

Common Validation Component

finetune_common_validation

Overview

Component to validate the finetune job against the Validation Service.

Version: 0.0.8

View in Studio: https://ml.azure.com/registries/azureml/components/finetune_common_validation/version/0.0.8

Inputs

component input: mlflow model path

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| mlflow_model_path | MLflow model asset path. Special characters like \ and ' are invalid in the parameter value. | mlflow_model | | True | |
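
The restriction on special characters can be checked locally before a job is submitted. The following is a minimal sketch, not the component's own validation logic; the helper name `has_invalid_chars` and the sample values are hypothetical.

```python
# Hypothetical pre-submission check; the component's real validation may differ.
INVALID_CHARS = ("\\", "'")  # backslash and single quote, per the description above


def has_invalid_chars(value: str) -> bool:
    """Return True if the parameter value contains a backslash or a single quote."""
    return any(ch in value for ch in INVALID_CHARS)


print(has_invalid_chars("my-model-path"))     # False
print(has_invalid_chars("models\\my-model"))  # True
```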

Data validation component input: training mltable

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| train_mltable_path | Path to the mltable of the training dataset. | mltable | | False | |

optional component input: validation mltable

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| validation_mltable_path | Path to the mltable of the validation dataset. | mltable | | True | |

component input: test mltable

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| test_mltable_path | Path to the mltable of the test dataset. | mltable | | True | |
| user_column_names | Comma separated list of column names to be used for training. | string | | True | |
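
As an illustration of the comma-separated format expected by `user_column_names`, the sketch below splits such a string into individual names. The function name and the column names are hypothetical, and trimming whitespace around each name is an assumption rather than documented behavior.

```python
def split_column_names(raw: str) -> list:
    """Split a comma-separated string into column names, dropping empty entries."""
    # Stripping whitespace is an assumption; the component may treat spaces differently.
    return [name.strip() for name in raw.split(",") if name.strip()]


print(split_column_names("age, income,label"))  # ['age', 'income', 'label']
```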

Compute validation

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| compute_preprocess | Compute to be used for preprocess, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the respective cluster is used. | string | | True | |
| instance_type_preprocess | Instance type to be used for the preprocess component in case of serverless compute, e.g., standard_d12_v2. The parameter compute_preprocess must be set to 'serverless' for instance_type to be used. | string | | True | |
| compute_model_import | Compute to be used for model_import, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. | string | | True | |
| instance_type_model_import | Instance type to be used for the model_import component in case of serverless compute, e.g., standard_d12_v2. The parameter compute_model_import must be set to 'serverless' for instance_type to be used. | string | | True | |
| compute_finetune | Compute to be used for finetuning, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the respective cluster is used. | string | | True | |
| instance_type_finetune | Instance type to be used for the finetune component in case of serverless compute, e.g., standard_nc24rs_v3. The parameter compute_finetune must be set to 'serverless' for instance_type to be used. | string | | True | |
| instance_count | Number of nodes to be used for finetuning (used for distributed training). | integer | 1 | True | |
| process_count_per_instance | Number of GPUs to be used per node for finetuning; should equal the number of GPUs per node in the compute SKU used for finetune. | integer | 1 | True | |
| compute_model_evaluation | Compute to be used for model evaluation, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. | string | | True | |
| instance_type_model_evaluation | Instance type to be used for the model_evaluation components in case of serverless compute, e.g., standard_nc24rs_v3. The parameter compute_model_evaluation must be set to 'serverless' for instance_type to be used. | string | | True | |
| task_name | Which task the model is solving. | string | | | ['tabular-classification', 'tabular-classification-multilabel', 'tabular-regression', 'text-classification', 'text-classification-multilabel', 'text-named-entity-recognition', 'text-summarization', 'question-answering', 'text-translation', 'text-generation', 'fill-mask', 'image-classification', 'image-classification-multilabel', 'image-object-detection', 'image-instance-segmentation', 'video-multi-object-tracking'] |
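
The interaction between each `compute_*` parameter and its `instance_type_*` counterpart can be summarized in a few lines. This is a sketch of the documented rule only, not the component's implementation: the function names are hypothetical, the exact string comparison with `'serverless'` is an assumption, and the total process count is the usual nodes-times-processes product for distributed training rather than something this page states.

```python
from typing import Optional


def resolve_compute(compute: Optional[str], instance_type: Optional[str]) -> dict:
    """Apply the documented rule: instance_type only matters for serverless compute."""
    if compute == "serverless":  # exact match is an assumption
        return {"compute": "serverless", "instance_type": instance_type}
    # A named cluster (or no compute at all) was given, so instance_type is ignored.
    return {"compute": compute, "instance_type": None}


def total_processes(instance_count: int = 1, process_count_per_instance: int = 1) -> int:
    """Total training processes across all nodes (assumed: nodes x processes per node)."""
    return instance_count * process_count_per_instance


print(resolve_compute("FT-Cluster", "standard_nc24rs_v3"))
# {'compute': 'FT-Cluster', 'instance_type': None}
print(resolve_compute("serverless", "standard_nc24rs_v3"))
# {'compute': 'serverless', 'instance_type': 'standard_nc24rs_v3'}
print(total_processes(instance_count=2, process_count_per_instance=4))  # 8
```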

ME (model evaluation) validation

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| test_batch_size | Test batch size. | integer | 1 | True | |
| label_column_name | Label column name in provided test dataset, for example "label". | string | label | True | |
| device | | string | auto | False | ['auto', 'cpu', 'gpu'] |
| evaluation_config | Additional parameters for Computing Metrics. | uri_file | | True | |
| evaluation_config_params | Additional parameters as JSON serialized string. | string | | True | |
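
Because `evaluation_config_params` takes a JSON serialized string, one way to build it is with Python's standard `json` module. The key and value below are placeholders only; this page does not list which keys the evaluation step accepts.

```python
import json

# Placeholder settings; the accepted keys are not documented on this page.
params = {"example_key": "example_value"}

# Serialize to the string form expected by evaluation_config_params.
evaluation_config_params = json.dumps(params)

print(evaluation_config_params)  # {"example_key": "example_value"}
```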

Task-specific params validation

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| task_specific_extra_params | All extra params. The values should be key-value pairs separated by semicolons, for example "param1=value1;param2=value2". | string | | True | |
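
The `key=value;key=value` format described above can be parsed as follows. This is a minimal sketch under the stated format, not the component's own parser; the function name `parse_extra_params` is hypothetical, and rejecting malformed pairs with `ValueError` is a design choice made for this example.

```python
def parse_extra_params(raw: str) -> dict:
    """Parse 'param1=value1;param2=value2' into a dict of strings."""
    params = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing semicolon
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"Expected key=value, got: {pair!r}")
        params[key.strip()] = value.strip()
    return params


print(parse_extra_params("param1=value1;param2=value2"))
# {'param1': 'value1', 'param2': 'value2'}
```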

Outputs

| Name | Description | Type |
| --- | --- | --- |
| validation_info | Validation status. | uri_file |

Environment

azureml://registries/azureml/environments/acpt-pytorch-2.2-cuda12.1/labels/latest
