
OpenAI Completions Finetune Pipeline

openai_completions_finetune

Overview

Finetune your own OpenAI (OAI) completions model. Visit https://learn.microsoft.com/en-us/azure/cognitive-services/openai/ for more information.

Version: 0.2.2

View in Studio: https://ml.azure.com/registries/azureml/components/openai_completions_finetune/version/0.2.2
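
The snippet below is a minimal sketch (not taken from this wiki) of fetching this component from the shared `azureml` registry and wiring it into a pipeline with the `azure-ai-ml` Python SDK. The subscription, resource group, workspace, experiment name, registered model name, and datastore paths are placeholders you would replace with your own values.

```python
from azure.ai.ml import MLClient, Input
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

# Client scoped to the shared "azureml" registry that hosts the component.
registry_client = MLClient(credential=DefaultAzureCredential(), registry_name="azureml")
openai_completions_finetune = registry_client.components.get(
    name="openai_completions_finetune", version="0.2.2"
)

# Client scoped to your own workspace, used to submit the pipeline job
# (placeholder identifiers below).
ws_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

@pipeline(description="OpenAI completions finetune")
def finetune_pipeline(train_data, validation_data):
    # Only a subset of the inputs listed below is set; the rest keep their defaults.
    finetune_step = openai_completions_finetune(
        model="davinci",
        registered_model_name="my-davinci-finetuned",
        train_dataset=train_data,
        validation_dataset=validation_data,
        n_epochs=4,
        learning_rate_multiplier=0.1,
    )
    return {"output_model": finetune_step.outputs.output_model}

pipeline_job = finetune_pipeline(
    train_data=Input(type="uri_folder", path="azureml://datastores/workspaceblobstore/paths/train/"),
    validation_data=Input(type="uri_folder", path="azureml://datastores/workspaceblobstore/paths/valid/"),
)
ws_client.jobs.create_or_update(pipeline_job, experiment_name="oai-completions-finetune")
```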

Inputs

| Name | Description | Type | Default | Optional | Enum |
| ---- | ----------- | ---- | ------- | -------- | ---- |
| model | OAI model engine | string | davinci | False | ['ada', 'babbage', 'curie', 'davinci', 'text-davinci-fine-tune-002'] |
| registered_model_name | User-defined registered model name | string | | False | |
| train_dataset | Input dataset (file or folder). If a folder dataset is passed, all nested files are included. | uri_folder | | False | |
| validation_dataset | Input dataset (file or folder). If a folder dataset is passed, all nested files are included. | uri_folder | | True | |
| lora_weights | LoRA weights for continual finetuning. Optional; see the continual-finetuning sketch after the Outputs table. | uri_folder | | True | |
| n_epochs | Number of epochs for training. | integer | 4 | True | |
| batch_size | The batch size to use for training. When set to -1, the batch size is calculated as 0.2% of the examples in the training set, capped at 256 (see the sketch after this table). | integer | -1 | True | |
| learning_rate_multiplier | The learning rate multiplier to use for training. Must be between 0.0 and 5.0. | number | 0.1 | True | |
| prompt_loss_weight | The prompt loss weight to use for training. | number | 0.1 | True | |
| compute_classification_metrics | If set, classification-specific metrics such as accuracy and F-1 score are calculated on the validation set at the end of every epoch. Computing classification metrics requires a validation dataset; additionally, you must specify classification_n_classes for multiclass classification or classification_positive_class for binary classification. | boolean | | True | |
| classification_n_classes | The number of classes in a classification task. Required for multiclass classification. | integer | | True | |
| classification_positive_class | The positive class in binary classification. Needed to generate precision, recall, and F-1 metrics for binary classification. | string | | True | |
| classification_betas | If provided, F-beta scores are calculated at the specified beta values. The F-beta score is a generalization of the F-1 score and is only used for binary classification. With a beta of 1 (i.e. the F-1 score), precision and recall are given the same weight; a larger beta puts more weight on recall, a smaller beta puts more weight on precision. The value should be a comma-separated list of doubles. | string | | True | |
| quota_enforcement_resource_id | Owner subscription ID. | string | | True | |
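
The -1 default for batch_size resolves, per the description above, to 0.2% of the training examples with a cap of 256. The helper below is a small illustration of that arithmetic; the exact rounding and lower bound used by the component are assumptions.

```python
def resolve_batch_size(batch_size: int, n_train_examples: int) -> int:
    """Illustrate the documented -1 default: 0.2% of training examples, capped at 256."""
    if batch_size != -1:
        return batch_size
    # Rounding and the floor of 1 are assumptions, not taken from the component source.
    return min(max(1, round(n_train_examples * 0.002)), 256)

# 50,000 training examples -> 0.2% = 100, below the cap of 256
assert resolve_batch_size(-1, 50_000) == 100
# 1,000,000 training examples -> 0.2% = 2,000, capped at 256
assert resolve_batch_size(-1, 1_000_000) == 256
```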

Outputs

| Name | Description | Type |
| ---- | ----------- | ---- |
| output_model | Dataset with the output model weights (LoRA weights) | uri_folder |
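
Following the sketch in the Overview section, the snippet below is an assumed wiring for continual finetuning (inferred only from the lora_weights and output_model descriptions above): the output_model of one finetune step is passed to the optional lora_weights input of a later step. It reuses the `openai_completions_finetune` component handle fetched from the `azureml` registry in the earlier sketch.

```python
from azure.ai.ml.dsl import pipeline

# "openai_completions_finetune" is the component handle obtained earlier via
# registry_client.components.get(name="openai_completions_finetune", version="0.2.2").
@pipeline(description="Two-stage OpenAI completions finetune (continual finetuning)")
def continual_finetune_pipeline(stage1_data, stage2_data):
    # Stage 1: finetune from the base engine.
    stage1 = openai_completions_finetune(
        model="davinci",
        registered_model_name="my-davinci-stage1",
        train_dataset=stage1_data,
    )
    # Stage 2: continue training from the stage-1 LoRA weights.
    stage2 = openai_completions_finetune(
        model="davinci",
        registered_model_name="my-davinci-stage2",
        train_dataset=stage2_data,
        lora_weights=stage1.outputs.output_model,
    )
    return {"output_model": stage2.outputs.output_model}
```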