
OpenAI Completions Finetune Pipeline

openai_completions_finetune

Overview

Finetune your own OpenAI (OAI) completions model. Visit https://learn.microsoft.com/en-us/azure/cognitive-services/openai/ for more information.

Version: 0.2.2

View in Studio: https://ml.azure.com/registries/azureml/components/openai_completions_finetune/version/0.2.2
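
The snippet below is a minimal sketch (not taken from this wiki) of fetching this component from the shared `azureml` registry and wiring it into a pipeline with the `azure-ai-ml` Python SDK. The subscription, resource group, workspace, experiment name, registered model name, and datastore paths are placeholders you would replace with your own values.

```python
from azure.ai.ml import MLClient, Input
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

# Client scoped to the shared "azureml" registry that hosts the component.
registry_client = MLClient(credential=DefaultAzureCredential(), registry_name="azureml")
openai_completions_finetune = registry_client.components.get(
    name="openai_completions_finetune", version="0.2.2"
)

# Client scoped to your own workspace, used to submit the pipeline job
# (placeholder identifiers below).
ws_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

@pipeline(description="OpenAI completions finetune")
def finetune_pipeline(train_data, validation_data):
    # Only a subset of the inputs listed below is set; the rest keep their defaults.
    finetune_step = openai_completions_finetune(
        model="davinci",
        registered_model_name="my-davinci-finetuned",
        train_dataset=train_data,
        validation_dataset=validation_data,
        n_epochs=4,
        learning_rate_multiplier=0.1,
    )
    return {"output_model": finetune_step.outputs.output_model}

pipeline_job = finetune_pipeline(
    train_data=Input(type="uri_folder", path="azureml://datastores/workspaceblobstore/paths/train/"),
    validation_data=Input(type="uri_folder", path="azureml://datastores/workspaceblobstore/paths/valid/"),
)
ws_client.jobs.create_or_update(pipeline_job, experiment_name="oai-completions-finetune")
```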

Inputs

| Name | Description | Type | Default | Optional | Enum |
| ---- | ----------- | ---- | ------- | -------- | ---- |
| model | OAI model engine | string | davinci | False | ['ada', 'babbage', 'curie', 'davinci', 'text-davinci-fine-tune-002'] |
| registered_model_name | User-defined registered model name | string | | False | |
| train_dataset | Input dataset (file or folder). If a folder dataset is passed, all nested files are included. | uri_folder | | False | |
| validation_dataset | Input dataset (file or folder). If a folder dataset is passed, all nested files are included. | uri_folder | | True | |
| lora_weights | LoRA weights for continual finetuning. Optional; see the continual-finetuning sketch after the Outputs table. | uri_folder | | True | |
| n_epochs | Number of epochs for training. | integer | 4 | True | |
| batch_size | The batch size to use for training. When set to -1, the batch size is calculated as 0.2% of the examples in the training set, capped at 256 (see the sketch after this table). | integer | -1 | True | |
| learning_rate_multiplier | The learning rate multiplier to use for training. Must be between 0.0 and 5.0. | number | 0.1 | True | |
| prompt_loss_weight | The prompt loss weight to use for training. | number | 0.1 | True | |
| compute_classification_metrics | If set, classification-specific metrics such as accuracy and F-1 score are calculated on the validation set at the end of every epoch. Computing classification metrics requires a validation dataset; additionally, you must specify classification_n_classes for multiclass classification or classification_positive_class for binary classification. | boolean | | True | |
| classification_n_classes | The number of classes in a classification task. Required for multiclass classification. | integer | | True | |
| classification_positive_class | The positive class in binary classification. Needed to generate precision, recall, and F-1 metrics for binary classification. | string | | True | |
| classification_betas | If provided, F-beta scores are calculated at the specified beta values. The F-beta score is a generalization of the F-1 score and is only used for binary classification. With a beta of 1 (i.e. the F-1 score), precision and recall are given the same weight; a larger beta puts more weight on recall, a smaller beta puts more weight on precision. The value should be a comma-separated list of doubles. | string | | True | |
| quota_enforcement_resource_id | Owner subscription ID. | string | | True | |
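
The -1 default for batch_size resolves, per the description above, to 0.2% of the training examples with a cap of 256. The helper below is a small illustration of that arithmetic; the exact rounding and lower bound used by the component are assumptions.

```python
def resolve_batch_size(batch_size: int, n_train_examples: int) -> int:
    """Illustrate the documented -1 default: 0.2% of training examples, capped at 256."""
    if batch_size != -1:
        return batch_size
    # Rounding and the floor of 1 are assumptions, not taken from the component source.
    return min(max(1, round(n_train_examples * 0.002)), 256)

# 50,000 training examples -> 0.2% = 100, below the cap of 256
assert resolve_batch_size(-1, 50_000) == 100
# 1,000,000 training examples -> 0.2% = 2,000, capped at 256
assert resolve_batch_size(-1, 1_000_000) == 256
```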

Outputs

| Name | Description | Type |
| ---- | ----------- | ---- |
| output_model | Dataset with the output model weights (LoRA weights) | uri_folder |
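
Following the sketch in the Overview section, the snippet below is an assumed wiring for continual finetuning (inferred only from the lora_weights and output_model descriptions above): the output_model of one finetune step is passed to the optional lora_weights input of a later step. It reuses the `openai_completions_finetune` component handle fetched from the `azureml` registry in the earlier sketch.

```python
from azure.ai.ml.dsl import pipeline

# "openai_completions_finetune" is the component handle obtained earlier via
# registry_client.components.get(name="openai_completions_finetune", version="0.2.2").
@pipeline(description="Two-stage OpenAI completions finetune (continual finetuning)")
def continual_finetune_pipeline(stage1_data, stage2_data):
    # Stage 1: finetune from the base engine.
    stage1 = openai_completions_finetune(
        model="davinci",
        registered_model_name="my-davinci-stage1",
        train_dataset=stage1_data,
    )
    # Stage 2: continue training from the stage-1 LoRA weights.
    stage2 = openai_completions_finetune(
        model="davinci",
        registered_model_name="my-davinci-stage2",
        train_dataset=stage2_data,
        lora_weights=stage1.outputs.output_model,
    )
    return {"output_model": stage2.outputs.output_model}
```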