components image_classification_pipeline - Azure/azureml-assets GitHub Wiki

Image Classification Pipeline

image_classification_pipeline

Overview

Pipeline component for image classification.

Version: 0.0.21

View in Studio: https://ml.azure.com/registries/azureml/components/image_classification_pipeline/version/0.0.21

Inputs

------------------- Computes -------------------

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| compute_model_import | Compute to be used for framework_selector, e.g. provide 'cpu-cluster' if your compute is named 'cpu-cluster'. | string | | False | |
| compute_finetune | Compute to be used for running the selected framework, e.g. provide 'gpu-cluster' if your compute is named 'gpu-cluster'. | string | | False | |
| instance_count | Number of nodes to be used for finetuning (used for distributed training). | integer | 1 | True | |
| process_count_per_instance | Number of GPUs to be used per node for finetuning. Should equal the number of GPUs per node in the compute SKU used for finetuning. | integer | 1 | True | |
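
As a rough illustration of how `instance_count` and `process_count_per_instance` combine, the sketch below computes the total number of training processes. It assumes one process per GPU, which is the usual convention for distributed data-parallel training but is not stated on this page. The helper name `total_training_processes` is hypothetical and not part of the component.

```python
def total_training_processes(instance_count: int = 1,
                             process_count_per_instance: int = 1) -> int:
    """Return the number of worker processes across all nodes.

    Hypothetical helper, not part of the component. Assumes one process
    per GPU; the defaults mirror the defaults documented for both inputs.
    """
    if instance_count < 1 or process_count_per_instance < 1:
        raise ValueError("both counts must be positive integers")
    return instance_count * process_count_per_instance


# Example: 2 nodes with 4 GPUs each.
print(total_training_processes(instance_count=2, process_count_per_instance=4))  # 8
```

With the documented defaults (1 node, 1 process per node) the result is 1, i.e. single-GPU training.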

------------------- Model Framework Selector -------------------

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| model_name | Name of the model. Based on this model name, a framework will be selected (Hugging Face, MM Detection). | string | | True | |
| download_from_source | Download model directly from HuggingFace instead of system registry | boolean | False | True | |

------------------- Data Inputs ------------------

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| training_data | Path to MLTable for training data. | mltable | | False | |
| validation_data | Path to MLTable for validation data. | mltable | | True | |

------------------- Classification Type ------------------

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| task_type | Whether a single image can have multiple labels. | string | | | ['image-classification', 'image-classification-multilabel'] |
| primary_metric | Primary metric for the task. | string | | True | ['accuracy', 'iou'] |
| ams_gradient | Enable ams_gradient when optimizer is adam or adamw. | boolean | | True | |
| beta1 | Value of beta1 when optimizer is adam or adamw. Must be a float in the range [0, 1]. | number | | True | |
| beta2 | Value of beta2 when optimizer is adam or adamw. Must be a float in the range [0, 1]. | number | | True | |
| checkpoint_frequency | Frequency to store model checkpoints. Must be a positive integer. | integer | | True | |
| checkpoint_run_id | The run ID of the experiment that has a pretrained checkpoint for incremental training. | string | | True | |
| early_stopping | Enable early stopping logic during training. | boolean | | True | |
| early_stopping_patience | Minimum number of epochs or validation evaluations with no primary metric improvement before the run is stopped. Must be a positive integer. | integer | | True | |
| early_stopping_delay | Minimum number of epochs or validation evaluations to wait before primary metric improvement is tracked for early stopping. Must be a positive integer. | integer | | True | |
| evaluation_frequency | Frequency to evaluate validation dataset to get metric scores. Must be a positive integer. | integer | | True | |
| gradient_accumulation_step | Number of forward passes to run without updating the model weights, accumulating their gradients, before the accumulated gradients are used to compute a weight update. Must be a positive integer. | integer | | True | |
| layers_to_freeze | How many layers to freeze for your model. For instance, passing 2 for seresnext freezes layer0 and layer1 (see the supported model layer information). Must be a positive integer. | integer | | True | |
| learning_rate | Initial learning rate. | number | | True | |
| learning_rate_scheduler | Type of learning rate scheduler. Must be warmup_cosine or step. | string | | True | ['warmup_cosine', 'step'] |
| momentum | Value of momentum when optimizer is sgd. Must be a float in the range [0, 1]. | number | | True | |
| nesterov | Enable nesterov when optimizer is sgd. | boolean | | True | |
| number_of_epochs | Number of training epochs. | integer | | True | |
| number_of_workers | Number of subprocesses to use for data loading (PyTorch only). 0 means that the data will be loaded in the main process. | integer | | True | |
| optimizer | Type of optimizer. | string | | True | ['sgd', 'adam', 'adamw'] |
| random_seed | Random seed that will be set at the beginning of training. | integer | | True | |
| step_lr_gamma | Value of gamma when learning rate scheduler is step. See https://learn.microsoft.com/azure/machine-learning/reference-automl-images-hyperparameters for more information. | number | | True | |
| step_lr_step_size | Value of step size when learning rate scheduler is step. See https://learn.microsoft.com/azure/machine-learning/reference-automl-images-hyperparameters for more information. | integer | | True | |
| training_batch_size | Training batch size. | integer | | True | |
| training_crop_size | Image crop size that's input to your neural network for training dataset. Note: seresnext doesn't take an arbitrary size. ViT-variants should have the same validation_crop_size and training_crop_size. | integer | | True | |
| validation_batch_size | Validation batch size. | integer | | True | |
| validation_crop_size | Image crop size that's input to your neural network for validation dataset. Note: seresnext doesn't take an arbitrary size. ViT-variants should have the same validation_crop_size and training_crop_size. | integer | | True | |
| validation_resize_size | Image size to which to resize before cropping for validation dataset. Note: seresnext doesn't take an arbitrary size. | integer | | True | |
| warmup_cosine_lr_cycles | Value of cosine cycle when learning rate scheduler is warmup_cosine. See https://learn.microsoft.com/azure/machine-learning/reference-automl-images-hyperparameters for more information. | number | | True | |
| warmup_cosine_lr_warmup_epochs | Value of warmup epochs when learning rate scheduler is warmup_cosine. See https://learn.microsoft.com/azure/machine-learning/reference-automl-images-hyperparameters for more information. | integer | | True | |
| weight_decay | Value of weight decay used by the optimizer. | number | | True | |
| weighted_loss | Value of weighted loss. | integer | | True | |
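
Several of the inputs above carry explicit constraints (enum values, the range [0, 1], positive integers). The sketch below shows one way to check a dictionary of inputs against those documented constraints before submitting a job. It is a minimal illustration, not the component's own validation logic: the function `validate_inputs` and the constant names are hypothetical, and only the constraints stated in the table are covered.

```python
from typing import Any, Dict, List

# Allowed values copied from the Enum column of the table above.
ENUMS = {
    "task_type": ["image-classification", "image-classification-multilabel"],
    "primary_metric": ["accuracy", "iou"],
    "learning_rate_scheduler": ["warmup_cosine", "step"],
    "optimizer": ["sgd", "adam", "adamw"],
}
# Inputs documented as floats in the range [0, 1].
UNIT_RANGE = ["beta1", "beta2", "momentum"]
# Inputs documented as positive integers.
POSITIVE_INT = [
    "checkpoint_frequency", "early_stopping_patience", "early_stopping_delay",
    "evaluation_frequency", "gradient_accumulation_step", "layers_to_freeze",
]


def validate_inputs(inputs: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper: inputs that are absent are skipped, since most
    of them are optional.
    """
    problems = []
    for name, allowed in ENUMS.items():
        if name in inputs and inputs[name] not in allowed:
            problems.append(f"{name} must be one of {allowed}")
    for name in UNIT_RANGE:
        if name in inputs and not 0 <= inputs[name] <= 1:
            problems.append(f"{name} must be in the range [0, 1]")
    for name in POSITIVE_INT:
        if name in inputs:
            value = inputs[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                problems.append(f"{name} must be a positive integer")
    return problems


# Usage: the first call finds no problems, the second reports two.
print(validate_inputs({"optimizer": "sgd", "momentum": 0.9}))
print(validate_inputs({"optimizer": "rmsprop", "beta1": 1.5}))
```

Returning a list of messages rather than raising on the first failure lets a caller report every invalid input in one pass.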

Outputs

| Name | Description | Type |
| --- | --- | --- |
| pytorch_model_folder | The trained PyTorch model. | custom_model |
| mlflow_model_folder | The trained MLflow model. | mlflow_model |