
Data Augmentation

Data augmentation is a technique for improving learning in which input images are modified to provide more variability during training. This is particularly important for classes with low instance counts. Note that transformations performed for augmentation are not applied during inference.
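
For concreteness, below is a minimal sketch of this train-only pattern, assuming a torchvision-style transform pipeline; it is not the repository's actual code, and the resize target is a placeholder:

  # Illustrative sketch only; not ifcb_classifier's implementation.
  from torchvision import transforms

  train_transforms = transforms.Compose([
      transforms.Resize((299, 299)),           # placeholder input size
      transforms.RandomHorizontalFlip(p=0.5),  # augmentation applied during training only
      transforms.ToTensor(),
  ])

  eval_transforms = transforms.Compose([
      transforms.Resize((299, 299)),           # same preprocessing, but
      transforms.ToTensor(),                   # no augmentation at validation/inference time
  ])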

IFCB imagery is very consistent in terms of luminosity, clarity, and object/ROI centering; as such, we have not explored many of the data augmentation techniques that are typical in other image classification tasks.

Below is the usage text pertaining to Data Augmentation for Neuston Net TRAIN:

Augmentation Options:
  Data Augmentation is a technique by which training results may be improved by simulating novel input

  --flip {x,y,xy,x+V,y+V,xy+V}
                        Training images have 50% chance of being flipped along the designated axis:
                        (x) vertically, (y) horizontally, (xy) either/both. 
                        May optionally specify "+V" to include Validation dataset

Reflection (--flip)

This augmentation gives an image a 50:50 chance of being reflected, or "flipped", along the designated axis each time it is loaded into memory during training. This option is appropriate for IFCB imagery because left-right/up-down symmetry is not significant for classification, even for long specimens. If chirality becomes significant for the identification of certain species, this option should not be used. --flip supports six distinct settings built from three components (at least one of x or y must be given; +V is optional); a code sketch at the end of this section illustrates the resulting behavior:

  • x - allows for vertical reflection (around the x axis)
  • y - allows for horizontal reflection (around the y axis)
  • +V - augmentation additionally gets applied to the Validation dataset, not just the Training dataset.

Opinion: we recommend using --flip xy for training on IFCB imagery.
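
For illustration, here is a minimal sketch, assuming a torchvision-style pipeline, of how the x, y, and +V components could translate into 50% random flips. It is not the repository's actual implementation, and the helper name flip_transform is hypothetical:

  # Illustrative sketch only; flip_transform is a hypothetical helper, not ifcb_classifier's API.
  import torchvision.transforms as T

  def flip_transform(mode: str) -> T.Compose:
      """Build a flip augmentation matching the --flip semantics described above.
      'x'  -> 50% chance of a vertical flip (reflection around the x axis)
      'y'  -> 50% chance of a horizontal flip (reflection around the y axis)
      'xy' -> both flips, each applied independently with 50% probability
      """
      axes, _, _ = mode.partition('+')  # strip an optional '+V' suffix
      ops = []
      if 'x' in axes:
          ops.append(T.RandomVerticalFlip(p=0.5))
      if 'y' in axes:
          ops.append(T.RandomHorizontalFlip(p=0.5))
      return T.Compose(ops)

  # The recommended setting for IFCB imagery:
  train_flip = flip_transform('xy')

The +V component is not a different transform; it only determines whether the same flip augmentation is also attached to the Validation dataset rather than the Training dataset alone, which is why the sketch simply strips that suffix.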