components mmdetection_image_objectdetection_instancesegmentation_pipeline - Azure/azureml-assets GitHub Wiki
Pipeline component for image object detection and instance segmentation using MMDetection models.
Version: 0.0.23
View in Studio: https://ml.azure.com/registries/azureml/components/mmdetection_image_objectdetection_instancesegmentation_pipeline/version/0.0.23
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
compute_model_import | Compute to be used for model_import, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. | string | False | ||
compute_finetune | Compute to be used for finetune, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. | string | False | ||
instance_count | Number of nodes to be used for finetuning (used for distributed training). | integer | 1 | True | |
process_count_per_instance | Number of GPUs to be used per node for finetuning. Should equal the number of GPUs per node in the compute SKU used for finetune. | integer | 1 | True | |
compute_model_evaluation | Compute to be used for model evaluation, e.g., provide 'FT-Cluster' if your compute is named 'FT-Cluster'. | string | True |
Model Selector Component Model family
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
model_family | Which framework the model belongs to. | string | MmDetectionImage | True | ['MmDetectionImage'] |
model_name | Please select models from AzureML Model Assets for all supported models. For MMDetection, provide the model's config name here, as it is specified in the MMDetection Model Zoo. To find the correct model name, go to https://github.com/open-mmlab/mmdetection/tree/v3.1.0/configs, click on the model type, and look up the model name in the metafile.yml file located at configs/<MODEL_TYPE>/metafile.yml. It is the user's responsibility to comply with the model's license terms. | string | True | ||
pytorch_model | Pytorch Model registered in AzureML Asset. | custom_model | True | ||
mlflow_model | Mlflow Model registered in AzureML Asset. | mlflow_model | True | ||
download_from_source | Download model directly from MMDetection instead of system registry | boolean | False | True |
Finetuning Component component input: training mltable
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
training_data | Path to the mltable of the training dataset. | mltable | False |
optional component input: validation mltable
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
validation_data | Path to the mltable of the validation dataset. | mltable | True | ||
image_min_size | Minimum image size after augmentation that is input to the network. If left empty, it would either be overwritten by image_scale in model config or would be chosen based on the task type and model selected. The image will be rescaled as large as possible within the range [image_min_size, image_max_size]. The image size will be constrained so that the max edge is no longer than image_max_size and the short edge is no longer than image_min_size. | integer | True | ||
image_max_size | Maximum image size after augmentation that is input to the network. If left empty, it would either be overwritten by image_scale in model config or would be chosen based on the task type and model selected. The image will be rescaled as large as possible within the range [image_min_size, image_max_size]. The image size will be constrained so that the max edge is no longer than image_max_size and the short edge is no longer than image_min_size. | integer | True | ||
task_name | Which task the model is solving. | string | ['image-object-detection', 'image-instance-segmentation'] |
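
The rescaling rule described for image_min_size and image_max_size can be illustrated with a small sketch. This is only one reading of the description (keep the aspect ratio, pick the largest scale that satisfies both edge limits); the function name `rescale_size` is hypothetical and is not part of the component or of MMDetection.

```python
def rescale_size(width, height, image_min_size, image_max_size):
    """Return (new_width, new_height) under the two edge limits.

    Assumption: aspect ratio is preserved, and the largest scale is chosen
    such that short edge <= image_min_size and long edge <= image_max_size.
    """
    short_edge = min(width, height)
    long_edge = max(width, height)
    scale = min(image_min_size / short_edge, image_max_size / long_edge)
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    # A 1920x1080 image is limited by its long edge here.
    print(rescale_size(1920, 1080, 800, 1333))
    # A 640x480 image is limited by its short edge here.
    print(rescale_size(640, 480, 800, 1333))
```

In the first call the long edge hits image_max_size before the short edge reaches image_min_size; in the second call the short edge is the binding limit.
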
primary metric
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
metric_for_best_model | Specify the metric to use to compare two different models. If left empty, will be chosen automatically based on the task type and model selected. | string | True | ['mean_average_precision', 'precision', 'recall'] |
Augmentation parameters
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
apply_augmentations | If set to true, will enable data augmentations for training. | boolean | True | True | |
number_of_workers | Number of subprocesses to use for data loading (PyTorch only). 0 means that the data will be loaded in the main process. | integer | 8 | True |
Deepspeed Parameters
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
apply_deepspeed | If set to true, will enable deepspeed for training. Please note that deepspeed is not yet supported for MMDetection; it will be enabled in the future. | boolean | False | True |
optional component input: deepspeed config
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
deepspeed_config | Deepspeed config to be used for finetuning. Please note that deepspeed is not yet supported for MMDetection; it will be enabled in the future. | uri_file | True | ||
apply_ort | If set to true, will use ONNX Runtime training. Please note that ONNX Runtime is not yet supported for MMDetection; it will be enabled in the future. | boolean | False | True |
Training parameters
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
number_of_epochs | Number of training epochs. If left empty, will be chosen automatically based on the task type and model selected. | integer | True | ||
max_steps | If set to a positive number, the total number of training steps to perform. Overrides 'number_of_epochs'. In case of using a finite iterable dataset the training may stop before reaching the set number of steps when all data is exhausted. If left empty, will be chosen automatically based on the task type and model selected. | integer | True | ||
training_batch_size | Train batch size. If left empty, will be chosen automatically based on the task type and model selected. | integer | True | ||
validation_batch_size | Validation batch size. If left empty, will be chosen automatically based on the task type and model selected. | integer | True | ||
auto_find_batch_size | Flag to enable automatic batch size selection. If the provided 'per_device_train_batch_size' runs out of memory (OOM), enabling auto_find_batch_size finds a workable batch size by iteratively reducing 'per_device_train_batch_size' by a factor of 2 until the OOM error is resolved. | boolean | False | True |
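
The halving behaviour described for auto_find_batch_size can be sketched as follows. This is an illustration only, not the component's implementation: `find_batch_size` and the `fits_in_memory` callback are hypothetical names, and a real run would detect OOM from the training framework rather than from a callback.

```python
def find_batch_size(start_batch_size, fits_in_memory):
    """Halve the batch size until fits_in_memory(batch_size) returns True.

    fits_in_memory is a hypothetical stand-in for "one training step ran
    without an out-of-memory error".
    """
    batch_size = start_batch_size
    while batch_size >= 1:
        if fits_in_memory(batch_size):
            return batch_size
        batch_size //= 2  # reduce by a factor of 2 and retry
    raise RuntimeError("Even a batch size of 1 does not fit in memory.")


if __name__ == "__main__":
    # Pretend the GPU can hold at most 5 images per step: 16 -> 8 -> 4.
    print(find_batch_size(16, lambda b: b <= 5))
```
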
learning rate and learning rate scheduler
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
learning_rate | Start learning rate. Defaults to linear scheduler. If left empty, will be chosen automatically based on the task type and model selected. | number | True | ||
learning_rate_scheduler | The scheduler type to use. If left empty, will be chosen automatically based on the task type and model selected. | string | True | ['warmup_linear', 'warmup_cosine', 'warmup_cosine_with_restarts', 'warmup_polynomial', 'constant', 'warmup_constant'] | |
warmup_steps | Number of steps used for a linear warmup from 0 to learning_rate. If left empty, will be chosen automatically based on the task type and model selected. | integer | True |
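
As a rough illustration of warmup_steps, the sketch below computes the learning rate during a linear warmup from 0 to learning_rate. It covers only the warmup phase; what happens afterwards depends on the chosen learning_rate_scheduler and is not modelled here. The function name `warmup_lr` is hypothetical.

```python
def warmup_lr(step, learning_rate, warmup_steps):
    """Learning rate at a given step of a linear warmup from 0.

    After warmup_steps the target learning_rate is returned unchanged;
    any later decay is left to the scheduler and is out of scope here.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return learning_rate
    return learning_rate * step / warmup_steps


if __name__ == "__main__":
    for step in (0, 25, 50, 100, 200):
        print(step, warmup_lr(step, learning_rate=0.01, warmup_steps=100))
```
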
optimizer
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
optimizer | Optimizer to be used during training. If left empty, will be chosen automatically based on the task type and model selected. | string | True | ['adamw_hf', 'adamw', 'sgd', 'adafactor', 'adagrad'] | |
weight_decay | The weight decay to apply (if not zero) to all layers except all bias and LayerNorm weights in Adam, AdamW & SGD optimizer. If left empty, will be chosen automatically based on the task type and model selected. | number | True | ||
extra_optim_args | Optional additional arguments that are supplied to the SGD optimizer. The arguments should be semicolon-separated key-value pairs enclosed in double quotes. For example, "momentum=0.5; nesterov=True" for sgd. Please make sure to use valid parameter names for the chosen optimizer. For exact parameter names, please refer to https://pytorch.org/docs/1.13/generated/torch.optim.SGD.html#torch.optim.SGD for SGD. Parameters supplied in extra_optim_args take precedence over parameters supplied via other arguments such as weight_decay. If weight_decay is provided both via the "weight_decay" parameter and via extra_optim_args, the value specified in extra_optim_args is used. | string | True |
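
To make the extra_optim_args format concrete, here is a minimal sketch of how such a string could be parsed and merged with weight_decay so that extra_optim_args wins on conflict. The helper names `parse_extra_optim_args` and `merge_optimizer_args` are hypothetical; the component's actual parsing code is not shown in this page and may differ.

```python
import ast


def parse_extra_optim_args(raw):
    """Parse 'key=value; key=value' into a dict of Python values."""
    args = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"Expected key=value, got {pair!r}")
        try:
            # Turns '0.5' into 0.5 and 'True' into True.
            args[key.strip()] = ast.literal_eval(value.strip())
        except (ValueError, SyntaxError):
            args[key.strip()] = value.strip()  # keep as plain string
    return args


def merge_optimizer_args(weight_decay, extra_optim_args):
    """Values from extra_optim_args override the weight_decay argument."""
    merged = {}
    if weight_decay is not None:
        merged["weight_decay"] = weight_decay
    merged.update(parse_extra_optim_args(extra_optim_args))
    return merged


if __name__ == "__main__":
    print(parse_extra_optim_args("momentum=0.5; nesterov=True"))
    print(merge_optimizer_args(0.01, "momentum=0.9; weight_decay=0.1"))
```

The resulting dictionary has the shape expected by keyword arguments of `torch.optim.SGD`, for example `momentum` and `nesterov`.
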
gradient accumulation
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
gradient_accumulation_step | Number of update steps to accumulate the gradients for, before performing a backward/update pass. If left empty, will be chosen automatically based on the task type and model selected. | integer | True |
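
Gradient accumulation changes how many samples contribute to each optimizer update. The arithmetic can be sketched as below, under the assumption (not stated on this page) that training_batch_size is a per-GPU value and that every process accumulates the same number of steps. The function name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(training_batch_size, gradient_accumulation_step,
                         instance_count=1, process_count_per_instance=1):
    """Samples per optimizer update.

    Assumption: training_batch_size is per GPU, and there is one training
    process per GPU on each node.
    """
    total_processes = instance_count * process_count_per_instance
    return training_batch_size * gradient_accumulation_step * total_processes


if __name__ == "__main__":
    # 2 nodes x 4 GPUs, batch size 4 per GPU, accumulating over 2 steps.
    print(effective_batch_size(4, 2, instance_count=2,
                               process_count_per_instance=4))
```
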
mixed precision training
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
precision | Apply mixed precision training. This can reduce memory footprint by performing operations in half-precision. | string | 32 | True | ['32', '16'] |
metric thresholds
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
iou_threshold | IOU threshold used during inference in non-maximum suppression post processing. | number | True | ||
box_score_threshold | During inference, only return proposals with a score greater than box_score_threshold. The score is the multiplication of the objectness score and classification probability. | number | True |
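
The two thresholds can be illustrated with a generic post-processing sketch: detections at or below box_score_threshold are dropped, then a greedy non-maximum suppression removes boxes that overlap a higher-scoring box by more than iou_threshold. This is a textbook greedy NMS written for illustration, not MMDetection's implementation; the function names and the dictionary layout of a detection are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(detections, box_score_threshold, iou_threshold):
    """Score filtering followed by greedy NMS.

    Each detection is assumed to be {"box": (x1, y1, x2, y2), "score": float}.
    """
    candidates = [d for d in detections if d["score"] > box_score_threshold]
    candidates.sort(key=lambda d: d["score"], reverse=True)
    kept = []
    for det in candidates:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(det["box"], k["box"]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    dets = [
        {"box": (0, 0, 10, 10), "score": 0.9},
        {"box": (1, 1, 11, 11), "score": 0.8},    # overlaps the first box
        {"box": (20, 20, 30, 30), "score": 0.7},
        {"box": (0, 0, 5, 5), "score": 0.1},      # below the score threshold
    ]
    for d in postprocess(dets, box_score_threshold=0.3, iou_threshold=0.5):
        print(d)
```
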
random seed
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
random_seed | Random seed that will be set at the beginning of training. | integer | 42 | True |
evaluation strategy parameters
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
evaluation_strategy | The evaluation strategy to adopt during training. Please note that the save_strategy and evaluation_strategy should match. | string | epoch | True | ['epoch', 'steps'] |
evaluation_steps | Number of update steps between two evals if evaluation_strategy='steps'. Please note that the saving steps should be a multiple of the evaluation steps. | integer | 500 | True |
logging strategy parameters
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
logging_strategy | The logging strategy to adopt during training. | string | epoch | True | ['epoch', 'steps'] |
logging_steps | Number of update steps between two logs if logging_strategy='steps'. | integer | 500 | True |
Save strategy
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
save_strategy | The checkpoint save strategy to adopt during training. Please note that the save_strategy and evaluation_strategy should match. | string | epoch | True | ['epoch', 'steps'] |
save_steps | Number of update steps between two checkpoint saves if save_strategy="steps". Please note that the saving steps should be a multiple of the evaluation steps. | integer | 500 | True |
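
The two constraints mentioned above (save_strategy should match evaluation_strategy, and save_steps should be a multiple of evaluation_steps) can be checked before submitting a job. The following is a small sketch of such a pre-flight check; `check_strategies` is a hypothetical helper and not part of the component.

```python
def check_strategies(evaluation_strategy, save_strategy,
                     evaluation_steps=500, save_steps=500):
    """Return a list of human-readable problems (empty if consistent)."""
    problems = []
    if save_strategy != evaluation_strategy:
        problems.append(
            f"save_strategy ({save_strategy!r}) should match "
            f"evaluation_strategy ({evaluation_strategy!r})."
        )
    # Step counts only matter when checkpoints are saved by steps.
    if save_strategy == "steps" and save_steps % evaluation_steps != 0:
        problems.append(
            f"save_steps ({save_steps}) should be a multiple of "
            f"evaluation_steps ({evaluation_steps})."
        )
    return problems


if __name__ == "__main__":
    print(check_strategies("epoch", "epoch"))
    print(check_strategies("steps", "steps", evaluation_steps=500,
                           save_steps=750))
```
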
model checkpointing limit
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
save_total_limit | If a value is passed, will limit the total number of checkpoints. Deletes the older checkpoints in output_dir. If the value is -1, saves all checkpoints. | integer | 5 | True |
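
As an illustration of save_total_limit, the sketch below selects which checkpoints would be deleted when only the newest ones are kept. It is a simplified model: the checkpoint names are made up, the list is assumed to be ordered oldest first, and a real trainer may apply extra rules (for example, protecting the best checkpoint) that are not modelled here.

```python
def checkpoints_to_delete(checkpoints, save_total_limit):
    """Return the oldest checkpoints that exceed save_total_limit.

    checkpoints must be ordered oldest first. A limit of -1 (or None)
    means every checkpoint is kept.
    """
    if save_total_limit is None or save_total_limit == -1:
        return []
    excess = max(0, len(checkpoints) - save_total_limit)
    return checkpoints[:excess]


if __name__ == "__main__":
    # Hypothetical checkpoint folder names, oldest first.
    saved = ["checkpoint-500", "checkpoint-1000", "checkpoint-1500"]
    print(checkpoints_to_delete(saved, save_total_limit=2))
    print(checkpoints_to_delete(saved, save_total_limit=-1))
```
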
Early Stopping Parameters
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
early_stopping | Enable early stopping. | boolean | False | True | |
early_stopping_patience | Stop training when the specified metric worsens for early_stopping_patience evaluation calls. | integer | 1 | True |
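
Early stopping with a patience value can be sketched as follows. The sketch assumes that "worsens" means "does not improve on the best value seen so far" and that a higher metric is better (as for mean_average_precision); both are assumptions, and `stopping_index` is a hypothetical name.

```python
def stopping_index(metric_values, patience, higher_is_better=True):
    """Index of the evaluation call at which training would stop.

    Returns None if training runs through all evaluations.
    """
    best = None
    bad_evals = 0
    for index, value in enumerate(metric_values):
        if best is None:
            improved = True
        elif higher_is_better:
            improved = value > best
        else:
            improved = value < best
        if improved:
            best = value
            bad_evals = 0  # reset the counter on every improvement
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return index
    return None


if __name__ == "__main__":
    history = [0.50, 0.60, 0.55, 0.58]
    print(stopping_index(history, patience=1))
    print(stopping_index(history, patience=2))
```

With a patience of 1 the run stops at the first evaluation that fails to improve; a larger patience tolerates temporary dips.
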
Grad Norm
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
max_grad_norm | Maximum gradient norm (for gradient clipping). If left empty, will be chosen automatically based on the task type and model selected. | number | True |
resume from the input model
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
resume_from_checkpoint | Loads the optimizer, scheduler, and trainer state for finetuning if true. | boolean | False | True | |
save_as_mlflow_model | Save as mlflow model with pyfunc as flavour. | boolean | True | True |
Model prediction Component component input: test mltable
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
test_data | Path to the mltable of the test dataset. | mltable | False | ||
test_batch_size | Test batch size. | integer | 4 | True | |
label_column_name | Label column name to be ignored by model for prediction purposes, for example "label". | string | label | True | |
input_column_names | Input column names provided to model for prediction, for example column1. Add comma delimited values in case of multiple input columns, for example column1,column2. | string | image,image_meta_info,text_prompt | True | |
evaluation_config | Additional parameters for Computing Metrics. | uri_file | True | ||
evaluation_config_params | Additional parameters as JSON serialized string. | string | True |
Finetuning Component
Name | Description | Type |
---|---|---|
mlflow_model_folder | Output dir to save the finetune model as mlflow model. | mlflow_model |
pytorch_model_folder | Output dir to save the finetune model as torch model. | custom_model |
Compute metrics Component
Name | Description | Type |
---|---|---|
evaluation_result | Test Data Evaluation Results | uri_folder |