components multimodal_classification_finetune

github-actions[bot] edited this page Nov 15, 2023 · 4 revisions

Multimodal Classification using MMEFT

multimodal_classification_finetune

Overview

Component that fine-tunes multimodal models for classification using MMEFT.

Version: 0.0.1

View in Studio: https://ml.azure.com/registries/azureml/components/multimodal_classification_finetune/version/0.0.1

Inputs

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| problem_type | Specifies whether the task is single-label or multi-label multimodal classification. | string | multimodal-classification-singlelabel | False | ['multimodal-classification-singlelabel', 'multimodal-classification-multilabel'] |

Training parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| number_of_epochs | Number of training epochs. | integer | 1 | True | |
| max_steps | If set to a positive number, the total number of training steps to perform. Overrides number_of_epochs. With a finite iterable dataset, training may stop before reaching the set number of steps once all data is exhausted. | integer | -1 | True | |
| training_batch_size | Training batch size. | integer | 8 | True | |
| validation_batch_size | Validation batch size. | integer | 64 | True | |
| auto_find_batch_size | Flag to enable automatic batch-size search. If the provided training_batch_size causes an out-of-memory (OOM) error, enabling auto_find_batch_size repeatedly reduces training_batch_size by a factor of 2 until the OOM error no longer occurs. | string | false | True | ['true', 'false'] |
| optimizer | Optimizer to use during training. | string | adamw_hf | True | ['adamw_hf', 'adamw_torch', 'adafactor'] |
| learning_rate | Initial learning rate. The scheduler defaults to linear. | number | 0.001 | True | |
| warmup_steps | Number of steps used for a linear warmup from 0 to learning_rate. | integer | 0 | True | |
| weight_decay | The weight decay to apply (if not zero) to all layers except bias and LayerNorm weights in the AdamW optimizer. | number | 0.0 | True | |
| adam_beta1 | The beta1 hyperparameter for the AdamW optimizer. | number | 0.9 | True | |
| adam_beta2 | The beta2 hyperparameter for the AdamW optimizer. | number | 0.999 | True | |
| adam_epsilon | The epsilon hyperparameter for the AdamW optimizer. | number | 1e-08 | True | |
| gradient_accumulation_steps | Number of update steps over which to accumulate gradients before performing a backward/update pass. | integer | 64 | True | |
| learning_rate_scheduler | The scheduler type to use. | string | linear | True | ['linear', 'cosine', 'cosine_with_restarts', 'polynomial', 'constant', 'constant_with_warmup'] |
| precision | Apply mixed-precision training. This can reduce the memory footprint by performing operations in half precision. | string | 32 | True | ['32', '16'] |
| random_seed | Random seed that is set at the beginning of training. | integer | 42 | True | |
| enable_full_determinism | Ensure reproducible behavior during distributed training. | string | false | True | ['true', 'false'] |
| dataloader_num_workers | Number of subprocesses to use for data loading. 0 means that the data is loaded in the main process. | integer | 0 | True | |
| ignore_mismatched_sizes | Whether or not to raise an error if some of the weights from the checkpoint do not have the same size as the weights of the model. | string | true | True | ['true', 'false'] |
| max_grad_norm | Maximum gradient norm (for gradient clipping). | number | 1.0 | True | |
| evaluation_strategy | The evaluation strategy to adopt during training. | string | epoch | True | ['epoch', 'steps'] |
| evaluation_steps_interval | Evaluation interval expressed as a fraction of the steps in one epoch. Overrides evaluation_steps if not 0. | number | 0.0 | True | |
| evaluation_steps | Number of update steps between two evaluations if evaluation_strategy='steps'. | integer | 500 | True | |
| logging_strategy | The logging strategy to adopt during training. | string | epoch | True | ['epoch', 'steps'] |
| logging_steps | Number of update steps between two logs if logging_strategy='steps'. | integer | 500 | True | |
| primary_metric | The metric used to compare two different models. | string | loss | True | ['loss', 'f1_macro', 'mcc', 'accuracy', 'precision_macro', 'recall_macro'] |
| resume_from_checkpoint | If true, loads the optimizer, scheduler, and trainer state for fine-tuning. | string | false | True | ['true', 'false'] |
| save_total_limit | If a value is passed, limits the total number of checkpoints and deletes the older checkpoints in output_dir. If the value is -1, all checkpoints are saved. | integer | -1 | True | |
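
The interaction between training_batch_size, gradient_accumulation_steps, and auto_find_batch_size can be illustrated with a little arithmetic. The sketch below is not the component's code: the function names and the `num_devices` argument are hypothetical, and it assumes that training_batch_size is applied per device and that the halving search continues down to a batch size of 1.

```python
def effective_batch_size(training_batch_size, gradient_accumulation_steps, num_devices=1):
    """Samples contributing to one optimizer update (assumes a per-device batch size)."""
    return training_batch_size * gradient_accumulation_steps * num_devices


def halving_candidates(training_batch_size):
    """Batch sizes that a halve-on-OOM search would try, largest first."""
    candidates = []
    size = training_batch_size
    while size >= 1:
        candidates.append(size)
        size //= 2  # reduce by a factor of 2, as the auto_find_batch_size description states
    return candidates


# With the documented defaults (training_batch_size=8, gradient_accumulation_steps=64):
print(effective_batch_size(8, 64))
print(halving_candidates(8))
```

Under these assumptions, the defaults give a large effective batch on a single device, so lowering gradient_accumulation_steps may be worth considering for small datasets.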

Early Stopping Parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| apply_early_stopping | Enable early stopping. | string | false | True | ['true', 'false'] |
| early_stopping_patience | Stop training when the specified metric worsens for early_stopping_patience evaluation calls. | integer | 1 | True | |
| early_stopping_threshold | Denotes how much the specified metric must improve to satisfy early stopping conditions. | number | 0.0 | True | |
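
How early_stopping_patience and early_stopping_threshold combine can be sketched in plain Python. This is an illustration of the general patience/threshold technique, not the component's implementation: `stopping_step` and `lower_is_better` are hypothetical names, and the sketch assumes that an evaluation counts as an improvement only when it beats the best value so far by more than the threshold.

```python
def stopping_step(metric_history, patience=1, threshold=0.0, lower_is_better=True):
    """Return the 1-based evaluation index at which training would stop, or None."""
    best = None
    stale_evals = 0  # consecutive evaluations without sufficient improvement
    for step, value in enumerate(metric_history, start=1):
        if best is None:
            improved = True  # the first evaluation always sets the baseline
        elif lower_is_better:
            improved = (best - value) > threshold
        else:
            improved = (value - best) > threshold
        if improved:
            best = value
            stale_evals = 0
        else:
            stale_evals += 1
            if stale_evals >= patience:
                return step
    return None


losses = [0.9, 0.7, 0.72, 0.6]
print(stopping_step(losses, patience=1))  # stops at the first non-improving evaluation
print(stopping_step(losses, patience=2))  # tolerates one bad evaluation, so never stops here
```

With the default patience of 1, a single noisy evaluation ends training, so a larger value may suit metrics that fluctuate between evaluations.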

DeepSpeed Parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| apply_deepspeed | If set to true, enables DeepSpeed for training. | string | false | True | ['true', 'false'] |
| deepspeed_config | DeepSpeed config to be used for fine-tuning. | uri_file | | True | |

ORT Parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| apply_ort | If set to true, uses ONNX Runtime for training. | string | false | True | ['true', 'false'] |

MLflow Parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| save_as_mlflow_model | If set to true, saves the model as an MLflow model with the pyfunc flavor. | string | true | True | ['true', 'false'] |

Dataset parameters

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| preprocess_output | Output folder of the preprocessor, containing preprocessed metadata information. | uri_folder | | False | |
| model_selector_output | Output folder of the model selector, containing model metadata such as config, checkpoints, and tokenizer config. | uri_folder | | False | |
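
Several inputs above are typed as string even though they look like booleans or numbers (for example apply_ort and precision), so passing a Python `True` or `16` would not match the listed enum values. The following is a minimal sketch of a pre-submission check built only from the enum columns in the tables above. The `ALLOWED` mapping and `invalid_inputs` helper are hypothetical and not part of the component or of any Azure ML SDK; only a subset of the enum-typed inputs is listed.

```python
# Allowed values copied from the Enum columns of the input tables (subset).
ALLOWED = {
    "problem_type": {
        "multimodal-classification-singlelabel",
        "multimodal-classification-multilabel",
    },
    "optimizer": {"adamw_hf", "adamw_torch", "adafactor"},
    "precision": {"32", "16"},
    "evaluation_strategy": {"epoch", "steps"},
    "logging_strategy": {"epoch", "steps"},
    "apply_early_stopping": {"true", "false"},
    "apply_deepspeed": {"true", "false"},
    "apply_ort": {"true", "false"},
    "save_as_mlflow_model": {"true", "false"},
}


def invalid_inputs(inputs):
    """Return a message for each input whose value is not one of its enum strings."""
    problems = []
    for name, value in inputs.items():
        allowed = ALLOWED.get(name)
        # Inputs without an enum (integers, numbers, folders) are not checked here.
        if allowed is not None and value not in allowed:
            problems.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    return problems


# Strings pass; a Python bool or int does not match the string enums.
print(invalid_inputs({"precision": "16", "apply_deepspeed": "true"}))
print(invalid_inputs({"precision": 16, "apply_ort": True}))
```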

Outputs

| Name | Description | Type |
| --- | --- | --- |
| pytorch_model_folder | Output directory for the fine-tuned model and other metadata. | uri_folder |
| mlflow_model_folder | Output directory for the fine-tuned model saved as an MLflow model. | mlflow_model |

Environment

azureml://registries/azureml/environments/acft-multimodal-gpu/versions/1
