Training a Model - WHOIGit/ifcb_classifier GitHub Wiki

Model training for this project is initiated with the neuston_net.py TRAIN command.

A number of decisions are involved in the setup and training of a new image-classification NN model.

  • Neural-net architecture
  • Labeled dataset
  • Training limits
  • Data augmentation options
  • Training results/statistics

This page gives an overview of the options available to the user. Additional details, examples, use cases, commands, and tools can be found on their respective wiki pages.

Model Architecture

PyTorch, the underlying NN library this project uses, comes bundled with a number of well-known CNN architectures for classification, listed below. These CNNs can be initialized with random weights or with pre-trained weights (trained on the ImageNet dataset).

See Model Parameters for details.

  • inception_v3
  • alexnet
  • squeezenet
  • vgg: vgg11 vgg13 vgg16 vgg19
  • resnet: resnet18 resnet34 resnet50 resnet101 resnet152
  • densenet: densenet121 densenet161 densenet169 densenet201
  • inception_v4 (recently added as a custom model; not built in to PyTorch, so no pre-trained-weights option)

Labeled Datasets

This is the body of labeled images used during training (and training validation). A Dataset Directory is a directory whose sub-directories are named for the dataset's class labels; the images in those sub-directories are used as the training and validation data. The ratio of images used for training vs. validation can be specified at runtime. Specific class-label sub-directories can be combined or excluded at runtime using a Class Config CSV, and multiple Dataset Directories can be amalgamated dynamically using a Dataset Config CSV.
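A Dataset Directory might look like the following sketch; the class names and file names here are hypothetical examples, not required values.

```text
training-data/            <- Dataset Directory passed to neuston_net.py TRAIN
├── Diatom/               <- each sub-directory name is a class label
│   ├── img_001.png
│   └── img_002.png
├── Ciliate/
│   └── img_003.png
└── Detritus/
    └── img_004.png
```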

See Dataset Parameters for further details.
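The runtime train/validation ratio amounts to partitioning the labeled images into two sets. A minimal sketch of an 80/20 split (not the project's actual splitting code, which is configured via Dataset Parameters):

```python
# Sketch: shuffle a dataset's image list and split it into
# training and validation subsets by a given fraction.
import random

def split_dataset(image_paths, train_fraction=0.8, seed=42):
    """Shuffle and partition image paths into (training, validation) lists."""
    rng = random.Random(seed)           # fixed seed for a reproducible split
    paths = sorted(image_paths)         # sort first so shuffling is deterministic
    rng.shuffle(paths)
    cut = int(len(paths) * train_fraction)
    return paths[:cut], paths[cut:]

train, val = split_dataset([f"img_{i:03d}.png" for i in range(100)])
```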

Training Limits

Often we want to limit the number of training epochs so as not to waste processing effort. It is possible to set a minimum number of training epochs, a maximum number of training epochs, and an early-stopping criterion that halts training when over-fitting is detected.
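The interaction between these three limits can be sketched as a simple stopping rule. This is an illustrative sketch, not the project's implementation; the function and parameter names are hypothetical, and early stopping is shown here as the common "no improvement for `patience` epochs" heuristic.

```python
# Sketch: decide whether training should halt, given a minimum and
# maximum epoch count and an early-stopping patience window.
def should_stop(epoch, best_epoch, min_epochs, max_epochs, patience):
    """Return True when training should halt.

    epoch      -- the epoch just completed
    best_epoch -- the epoch with the best validation score so far
    """
    if epoch < min_epochs:                 # always train at least this long
        return False
    if epoch >= max_epochs:                # never train longer than this
        return True
    return (epoch - best_epoch) >= patience  # no improvement for `patience` epochs
```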

See Epoch Parameters for details.

Data augmentation

Data augmentation is a technique in which input images are modified to provide additional variability during training, which typically improves learning and reduces over-fitting.

Currently, neuston_net supports only one augmentation option:

  • --flip - Reflection along the horizontal and/or vertical axis

See Data Augmentation for details.
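The two reflections behind the --flip option can be sketched on a toy "image" represented as a list of pixel rows (an illustration of the operations, not the project's augmentation code):

```python
# Sketch: horizontal and vertical reflection of a 2-D pixel grid.
def flip_horizontal(image):
    """Mirror left-to-right: reverse each row."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror top-to-bottom: reverse the order of the rows."""
    return image[::-1]

img = [[1, 2],
       [3, 4]]
```

During training such flips are applied randomly, so the model sees reflected variants of each image across epochs.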

Output Options

Besides the model itself, neuston_net TRAIN can output a variety of files that aid in reviewing training performance, as well as auxiliary files. The Output Options let a user specify the content and format of some of these files, and change the default directory results are saved to. These include:

  • a csv of model performance as it trains through epochs
  • a copy of the model saved as a portable onnx format file
  • the list of classification scores for the validation dataset for the best epoch
  • aggregate score values such as f1, recall, and precision
  • the list of classes, ordered according to a chosen training-performance metric
  • a confusion matrix of the classes
  • a list of images in the same order as the results, allowing for downstream comparison of mistakes

See Output Options for details.
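The aggregate scores mentioned above (precision, recall, F1) can each be derived from the per-class counts of true positives, false positives, and false negatives. A minimal sketch of that calculation (illustrative only; the project's output files are produced by its own tooling):

```python
# Sketch: per-class precision, recall, and F1 from parallel lists of
# true labels and predicted labels.
def class_scores(truths, preds, label):
    """Return (precision, recall, f1) for one class label."""
    tp = sum(1 for t, p in zip(truths, preds) if t == label and p == label)
    fp = sum(1 for t, p in zip(truths, preds) if t != label and p == label)
    fn = sum(1 for t, p in zip(truths, preds) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

truths = ["A", "A", "B", "B"]
preds  = ["A", "B", "B", "B"]
```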