Training a Model - WHOIGit/ifcb_classifier GitHub Wiki

Model training for this project is initiated with the neuston_net.py TRAIN command.

A number of decisions are involved in the setup and training of a new image-classification NN model.

  • Neural-net architecture
  • Labeled dataset
  • Training limits
  • Data augmentation options
  • Training results/statistics

This page gives an overview of the options available to the user. Additional details, examples, use cases, commands, and tools can be found on their respective wiki pages.

Model Architecture

PyTorch, the underlying NN library this project uses, comes bundled with a number of well-known CNN architectures for classification, listed below. These CNNs can be initialized with random weights or with pre-trained weights (trained on the ImageNet dataset).

See Model Parameters for details.

  • inception_v3
  • alexnet
  • squeezenet
  • vgg: vgg11 vgg13 vgg16 vgg19
  • resnet: resnet18 resnet34 resnet50 resnet101 resnet152
  • densenet: densenet121 densenet161 densenet169 densenet201
  • inception_v4 (recently added as a custom model; not built in to PyTorch, so no pre-trained-weights option)

Labeled Datasets

This is the body of labeled images used during training (and training validation). A Dataset Directory is a directory whose sub-directories are named for the dataset's class labels; the images in those sub-directories are used as the training and validation data. The ratio of images used for training vs. validation can be specified at runtime. Specific class-label sub-directories can be combined or excluded at runtime using a Class Config CSV, and multiple Dataset Directories can be amalgamated dynamically using a Dataset Config CSV.
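A Dataset Directory might look like the following sketch; the class names and file names here are hypothetical examples, not required values.

```text
training-data/            <- Dataset Directory passed to neuston_net.py TRAIN
├── Diatom/               <- each sub-directory name is a class label
│   ├── img_001.png
│   └── img_002.png
├── Ciliate/
│   └── img_003.png
└── Detritus/
    └── img_004.png
```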

See Dataset Parameters for further details.
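The runtime train/validation ratio amounts to partitioning the labeled images into two sets. A minimal sketch of an 80/20 split (not the project's actual splitting code, which is configured via Dataset Parameters):

```python
# Sketch: shuffle a dataset's image list and split it into
# training and validation subsets by a given fraction.
import random

def split_dataset(image_paths, train_fraction=0.8, seed=42):
    """Shuffle and partition image paths into (training, validation) lists."""
    rng = random.Random(seed)           # fixed seed for a reproducible split
    paths = sorted(image_paths)         # sort first so shuffling is deterministic
    rng.shuffle(paths)
    cut = int(len(paths) * train_fraction)
    return paths[:cut], paths[cut:]

train, val = split_dataset([f"img_{i:03d}.png" for i in range(100)])
```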

Training Limits

Often we want to limit the number of training epochs so as not to waste processing effort. It is possible to set a minimum number of training epochs, a maximum number of training epochs, and an early-stopping criterion that halts training when over-fitting is detected.
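The interaction between these three limits can be sketched as a simple stopping rule. This is an illustrative sketch, not the project's implementation; the function and parameter names are hypothetical, and early stopping is shown here as the common "no improvement for `patience` epochs" heuristic.

```python
# Sketch: decide whether training should halt, given a minimum and
# maximum epoch count and an early-stopping patience window.
def should_stop(epoch, best_epoch, min_epochs, max_epochs, patience):
    """Return True when training should halt.

    epoch      -- the epoch just completed
    best_epoch -- the epoch with the best validation score so far
    """
    if epoch < min_epochs:                 # always train at least this long
        return False
    if epoch >= max_epochs:                # never train longer than this
        return True
    return (epoch - best_epoch) >= patience  # no improvement for `patience` epochs
```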

See Epoch Parameters for details.

Data augmentation

Data augmentation is a technique in which input images are modified to provide additional variability during training, which typically improves learning and reduces over-fitting.

Currently, neuston_net supports only one augmentation option:

  • --flip - Reflection along the horizontal and/or vertical axis

See Data Augmentation for details.
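The two reflections behind the --flip option can be sketched on a toy "image" represented as a list of pixel rows (an illustration of the operations, not the project's augmentation code):

```python
# Sketch: horizontal and vertical reflection of a 2-D pixel grid.
def flip_horizontal(image):
    """Mirror left-to-right: reverse each row."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror top-to-bottom: reverse the order of the rows."""
    return image[::-1]

img = [[1, 2],
       [3, 4]]
```

During training such flips are applied randomly, so the model sees reflected variants of each image across epochs.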

Output Options

Besides the model itself, neuston_net TRAIN can output a variety of files that aid in reviewing training performance, as well as auxiliary files. The Output Options let a user specify the content and format of some of these files, and change the default directory results are saved to. These include:

  • a csv of model performance as it trains through epochs
  • a copy of the model saved as a portable onnx format file
  • the list of classification scores for the validation dataset for the best epoch
  • aggregate score values such as f1, recall, and precision
  • the list of classes, ordered according to a chosen training-performance metric
  • a confusion matrix of the classes
  • a list of images in the same order as the results, allowing for downstream comparison of mistakes

See Output Options for details.
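The aggregate scores mentioned above (precision, recall, F1) can each be derived from the per-class counts of true positives, false positives, and false negatives. A minimal sketch of that calculation (illustrative only; the project's output files are produced by its own tooling):

```python
# Sketch: per-class precision, recall, and F1 from parallel lists of
# true labels and predicted labels.
def class_scores(truths, preds, label):
    """Return (precision, recall, f1) for one class label."""
    tp = sum(1 for t, p in zip(truths, preds) if t == label and p == label)
    fp = sum(1 for t, p in zip(truths, preds) if t != label and p == label)
    fn = sum(1 for t, p in zip(truths, preds) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

truths = ["A", "A", "B", "B"]
preds  = ["A", "B", "B", "B"]
```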