neuston_net TRAIN - WHOIGit/ifcb_classifier GitHub Wiki
python neuston_net.py TRAIN path/to/DATASET BASE_MODEL TRAIN_ID
usage: neuston_net.py TRAIN [-h] [--model-id MODEL_ID] [--img-norm MEAN STD] [--seed SEED] [--split T:V]
[--untrain] [--class-min MIN] [--class-max MAX] [--emax MAX] [--emin MIN] [--estop STOP]
[--class-config CSV COL] [--flip {x,y,xy,x+V,y+V,xy+V}]
[--outdir OUTDIR] [--epochs-log EPOCHS_LOG] [--args-log ARGS_LOG]
[--results FNAME [SERIES ...]]
SRC MODEL TRAINING_ID
positional arguments:
SRC Directory with class-labeled subfolders. May also be a dataset-configuration csv.
MODEL Select a base model. Eg: "inception_v3"
TRAIN_ID Training ID. This value is the default value used by --outdir and --model-id.
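When a run is launched from a script rather than typed by hand, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of neuston_net.py: the helper name `build_train_command` is hypothetical, and it only builds the command (it does not run it). It uses flags listed on this page and places them before the positional arguments, as in the usage line.

```python
import shlex


def build_train_command(src, model, train_id, **options):
    """Assemble an argv list for `neuston_net.py TRAIN` (hypothetical helper).

    Keyword names map to flags by replacing "_" with "-",
    e.g. img_norm=[0.667, 0.161] becomes "--img-norm 0.667 0.161".
    """
    argv = ["python", "neuston_net.py", "TRAIN"]
    for name, value in options.items():
        if value is None or value is False:
            continue  # option not requested
        flag = "--" + name.replace("_", "-")
        argv.append(flag)
        if value is True:
            continue  # switch without a value, e.g. --untrain
        values = value if isinstance(value, (list, tuple)) else [value]
        argv.extend(str(v) for v in values)
    # Positional arguments go last: SRC MODEL TRAIN_ID
    argv += [src, model, train_id]
    return argv


cmd = build_train_command(
    "path/to/DATASET", "inception_v3", "TRAIN_ID",
    split="80:20", flip="xy", img_norm=[0.667, 0.161],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` if desired; whether the chosen option values suit a given dataset is outside the scope of this sketch.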
optional arguments:
-h, --help show this help message and exit
Model Adjustments:
--untrain If set, initializes MODEL ~without~ pretrained neurons.
Default (unset) is to start with a model pretrained on ImageNet.
--img-norm MEAN STD Normalize images by MEAN and STD.
eg1: "0.667 0.161", eg2: "0.056,0.058,0.051 0.067,0.071,0.057"
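To illustrate the two --img-norm forms shown above (one value for all channels, or comma-separated per-channel values), here is a minimal sketch of how such arguments could be parsed and applied. The function names are hypothetical, and the assumptions that pixel values lie in [0, 1] and that a single MEAN/STD pair is reused for every channel are mine, not confirmed by the tool.

```python
def parse_img_norm(mean_arg, std_arg):
    """Parse MEAN and STD strings into lists of floats (one entry per channel)."""
    mean = [float(v) for v in mean_arg.split(",")]
    std = [float(v) for v in std_arg.split(",")]
    if len(mean) != len(std):
        raise ValueError("MEAN and STD must have the same number of values")
    return mean, std


def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one pixel."""
    if len(mean) == 1:
        # Assumption: a single MEAN/STD pair applies to every channel.
        mean, std = mean * len(pixel), std * len(pixel)
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


# eg1: one value shared by all channels
mean, std = parse_img_norm("0.667", "0.161")
print(normalize_pixel([0.667, 0.828, 0.506], mean, std))

# eg2: per-channel values
mean, std = parse_img_norm("0.056,0.058,0.051", "0.067,0.071,0.057")
print(normalize_pixel([0.056, 0.129, 0.051], mean, std))
```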
Dataset Adjustments:
--seed SEED Set a specific seed for deterministic output & dataset-splitting reproducibility.
--split T:V Ratio of images per-class to split randomly into Training and Validation datasets.
Randomness affected by SEED. Default is "80:20"
--class-config CSV COL Skip and combine classes as defined by column COL of a CSV configuration file.
--class-min MIN Exclude classes with fewer than MIN instances. Default is 2.
--class-max MAX Limit classes to a MAX number of instances.
If multiple datasets are specified with a dataset-configuration csv,
classes from lower-priority datasets are truncated first.
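The interaction of --seed, --split, --class-min and --class-max can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the function name `split_dataset` is hypothetical, the order of operations (filter by minimum, shuffle, truncate to maximum, then split) and the rounding rule are guesses, and dataset priorities from a dataset-configuration csv are not modeled.

```python
import random


def split_dataset(images_by_class, split="80:20", seed=None,
                  class_min=2, class_max=None):
    """Split each class into training and validation lists (illustrative only)."""
    rng = random.Random(seed)  # the same seed reproduces the same split
    t, v = (int(part) for part in split.split(":"))
    training, validation = {}, {}
    for label in sorted(images_by_class):
        images = sorted(images_by_class[label])
        if len(images) < class_min:
            continue  # class excluded: fewer than MIN instances
        rng.shuffle(images)
        if class_max is not None:
            images = images[:class_max]  # limit class to MAX instances
        n_train = round(len(images) * t / (t + v))
        if len(images) >= 2:
            # Assumption: keep at least one image on each side of the split.
            n_train = min(max(n_train, 1), len(images) - 1)
        training[label] = images[:n_train]
        validation[label] = images[n_train:]
    return training, validation


dataset = {
    "class_a": [f"a{i}.png" for i in range(10)],
    "class_b": ["b0.png"],                        # dropped by class_min=2
    "class_c": [f"c{i}.png" for i in range(50)],  # truncated by class_max=20
}
train, val = split_dataset(dataset, seed=42, class_max=20)
for label in train:
    print(label, len(train[label]), len(val[label]))
```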
Epoch Parameters:
--emax MAX Maximum number of training epochs. Default is 60.
--emin MIN Minimum number of training epochs. Default is 10.
--estop STOP Number of epochs following a best-epoch after which to stop training.
AKA Early Stopping. Set STOP=0 to disable. Default is 10.
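The three epoch parameters can be read as a single stopping rule. The sketch below is one plausible interpretation, not the tool's confirmed logic: it assumes the "best epoch" is the one with the lowest validation loss, and that early stopping is only checked once at least MIN epochs have run. The function name `epochs_trained` is hypothetical.

```python
def epochs_trained(val_losses, emax=60, emin=10, estop=10):
    """Return (last_epoch_run, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    epoch = 0
    for epoch, loss in enumerate(val_losses[:emax], start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        # estop=0 disables early stopping; training then runs to emax.
        if estop > 0 and epoch >= emin and epoch - best_epoch >= estop:
            break
    return epoch, best_epoch


# Loss improves until epoch 2, then plateaus.
losses = [1.0, 0.5] + [0.6] * 100
print(epochs_trained(losses))            # defaults: emax=60, emin=10, estop=10
print(epochs_trained(losses, estop=0))   # early stopping disabled
```

With the defaults, the plateau that starts after epoch 2 ends training at epoch 12, ten epochs after the best one; with STOP=0 the loop runs to the maximum.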
Augmentation Options:
Data Augmentation is a technique by which training results may be improved by simulating novel input.
--flip {x,y,xy,x+V,y+V,xy+V}
Training images have 50% chance of being flipped along the designated axis:
(x) vertically, (y) horizontally, (xy) either/both.
May optionally specify "+V" to also apply flipping to the Validation dataset.
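The --flip choices can be illustrated with a small sketch that treats an image as a list of pixel rows. The helper names are hypothetical, and treating "xy" as two independent 50% draws (so that either, both, or neither flip may occur) is an assumption about what "either/both" means.

```python
import random


def flip_vertical(image):
    """Flip along the x axis: the row order is reversed (top becomes bottom)."""
    return image[::-1]


def flip_horizontal(image):
    """Flip along the y axis: each row is reversed (left becomes right)."""
    return [row[::-1] for row in image]


def parse_flip(option):
    """Split e.g. "xy+V" into ("xy", True); True means validation is included."""
    axes, _, suffix = option.partition("+")
    return axes, suffix == "V"


def random_flip(image, axes, rng):
    """Apply each requested flip with a 50% chance."""
    if "x" in axes and rng.random() < 0.5:
        image = flip_vertical(image)
    if "y" in axes and rng.random() < 0.5:
        image = flip_horizontal(image)
    return image


image = [[1, 2],
         [3, 4]]
axes, include_validation = parse_flip("xy+V")
rng = random.Random(0)  # seeded so repeated runs give the same draws
print(axes, include_validation)
print(random_flip(image, axes, rng))
```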
Output Options:
--outdir OUTDIR Default is "training-output/{TRAINING_ID}"
--model-id ID Default is "{date}__{TRAINING_ID}"
--epochs-log ELOG Specify a csv filename. Includes epoch, loss, validation loss, and f1 scores.
Default is "epochs.csv".
--args-log ALOG Specify a human-readable yml output filename containing all user-specified
and default training parameters. Default is "args.yml".
--results FNAME [SERIES ...]
FNAME: Specify a validation-results filename or pattern.
Valid patterns are: "{epoch}". Accepts .json .h5 and .mat file formats.
SERIES: Data to include in FNAME file.
The following are always included and need not be specified:
model_id, timestamp, class_labels, input_classes, output_classes.
Options are: image_basenames image_fullpaths
output_scores output_winscores
confusion_matrix
classes_by_{count|f1|recall|precision}
{f1|recall|precision}_{macro|weighted|perclass}
{counts|val_counts|train_counts}_perclass
--results may be specified multiple times in order to create different files.
If not invoked, the default options are:
FNAME = results.mat
SERIES = image_basenames output_scores counts_perclass
confusion_matrix f1_perclass f1_weighted f1_macro
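The SERIES options above use a brace shorthand such as {f1|recall|precision}_{macro|weighted|perclass}. The sketch below expands that shorthand into the concrete names it stands for and fills an "{epoch}" filename pattern. Both helper names are hypothetical, the filename "results_{epoch}.json" is only an example, and plain substitution of the epoch number (without zero padding) is an assumption about how the tool formats it.

```python
import itertools
import re


def expand_series(pattern):
    """Expand "{a|b}_{c|d}" shorthand into every concrete series name."""
    # re.split with one capture group alternates literal text and brace contents.
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [part.split("|") if i % 2 else [part]
               for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


def results_filename(pattern, epoch):
    """Fill the "{epoch}" placeholder in a results filename pattern."""
    return pattern.replace("{epoch}", str(epoch))


for shorthand in ("classes_by_{count|f1|recall|precision}",
                  "{f1|recall|precision}_{macro|weighted|perclass}",
                  "{counts|val_counts|train_counts}_perclass"):
    print(expand_series(shorthand))

print(results_filename("results_{epoch}.json", 12))
```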