Model discussion

Model notes

Outcomes

Consider two roles a model like this could play: as a search/rescue aid, or inside an insurance mobile app. In the first case, the model could be used to aid a search effort for a particular car, for example after it has been stolen. Here we would favor recall, since a human operator would supervise the model output and we would rather review many incorrectly classified images than lose the target because precision was favored. In the second case, where an app generates a report about a car to send to an insurance company, we would want the car model prediction to be precise - the prediction will strongly affect the insurance quote. For now we will favor precision, as it is easier to evaluate on test images.
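As a concrete illustration of this trade-off, the sketch below (scikit-learn, with made-up labels and scores purely for illustration) shows how raising the decision threshold trades recall for precision.

```python
# Minimal sketch of the precision/recall trade-off.
# The labels and scores are hypothetical, not taken from this project.
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                        # hypothetical ground truth
y_scores = np.array([0.9, 0.4, 0.65, 0.35, 0.45, 0.2, 0.55, 0.3])  # hypothetical model scores

for threshold in (0.3, 0.5, 0.7):
    y_pred = (y_scores >= threshold).astype(int)  # higher threshold -> fewer positives
    print(
        f"threshold={threshold:.1f}  "
        f"precision={precision_score(y_true, y_pred):.2f}  "
        f"recall={recall_score(y_true, y_pred):.2f}"
    )
```

A low threshold catches every true positive (the search/rescue setting), while a high threshold keeps only confident predictions (the insurance setting) at the cost of missed targets.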

Simple model

A simple model was used as an initial trial. It overfitted the training set (accuracy of 1.0) and generalized poorly to the validation/dev set (accuracy of ~55%). This model was deemed too simple to learn the intricacies of the dataset, so a more capable architecture was investigated.
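For reference, a baseline of this kind could look like the following sketch (Keras, TensorFlow 2.x). The layer sizes, input resolution, and number of classes are assumptions for illustration, not the exact architecture used in this repo.

```python
# Hypothetical small baseline CNN; sizes and class count are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 2              # assumed, e.g. target car model vs. everything else
INPUT_SHAPE = (224, 224, 3)  # assumed input resolution

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=INPUT_SHAPE),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```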

Complex model - ResNet50

ResNet50 was chosen because its depth lets it represent complex functions, and its skip connections allow such a deep network to be trained effectively. Training ResNet50 with weights pre-trained on ImageNet as initialization, we achieve 98% on the training set, 90% on the dev set, and 89% on the test set. We are thus overfitting and have a variance problem. Transfer learning ("freezing" the ResNet50 feature-extractor backbone) was also investigated as a way to mitigate overfitting, since far fewer parameters need to be learnt (mostly those in our FC layer), but it was unsuccessful. A simpler model was therefore investigated.
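The two ResNet50 setups described above could be built roughly as follows (Keras applications API). The class count, input size, and classification head are assumptions for illustration; ImageNet preprocessing is assumed to be applied upstream.

```python
# Sketch of the two ResNet50 setups: full fine-tuning from ImageNet weights,
# and transfer learning with the backbone frozen. Head layout is an assumption.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

NUM_CLASSES = 2              # assumed
INPUT_SHAPE = (224, 224, 3)  # assumed

def build_resnet(freeze_backbone: bool) -> tf.keras.Model:
    base = ResNet50(weights="imagenet", include_top=False, input_shape=INPUT_SHAPE)
    base.trainable = not freeze_backbone  # frozen = transfer learning, trainable = fine-tuning
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Fine-tuning from the ImageNet initialization (all weights trainable):
finetuned = build_resnet(freeze_backbone=False)

# Transfer learning (only the new FC head is trained):
frozen = build_resnet(freeze_backbone=True)
```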

Complex model - VGG16

A simpler CNN architecture was investigated so that fewer parameters needed to be learnt. Transfer learning with VGG16 yielded 85% on the validation/dev set and 84% on the test set - somewhat short of ResNet50. Fine-tuning VGG16 with weights learnt on ImageNet as initialization yielded 93% on the training set and 91% on the validation/dev set. This looks promising, as 90.5% was achieved on the test set (1.5% higher than ResNet50). Development will continue with VGG16.
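A rough sketch of the VGG16 fine-tuning setup is shown below (Keras). The FC head size and learning rate are assumptions, not the exact values used here.

```python
# Sketch of VGG16 fine-tuning: ImageNet-initialized backbone plus a new FC head,
# trained end to end. Head size and hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 2              # assumed
INPUT_SHAPE = (224, 224, 3)  # assumed

base = VGG16(weights="imagenet", include_top=False, input_shape=INPUT_SHAPE)
base.trainable = True        # fine-tune the whole network from the ImageNet weights

x = layers.Flatten()(base.output)
x = layers.Dense(256, activation="relu")(x)   # assumed FC head
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs=base.input, outputs=outputs)
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),  # a small LR is typical when fine-tuning
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```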

TODO

  1. Increase accuracy on the training set to ~95%.
  2. Add more regularization to VGG16 to reduce the variance/overfitting issue (candidate techniques are sketched below).
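The sketch below shows candidate regularization techniques for the VGG16 setup: dropout, L2 weight decay on the FC head, and on-the-fly data augmentation (Keras, TensorFlow 2.x, which provides the Random* preprocessing layers). These are options to try, not regularizers already used in this repo.

```python
# Sketch of regularization options for the VGG16 head; values are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 2              # assumed
INPUT_SHAPE = (224, 224, 3)  # assumed

augment = tf.keras.Sequential([      # data augmentation applied on the fly during training
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

base = VGG16(weights="imagenet", include_top=False, input_shape=INPUT_SHAPE)

inputs = tf.keras.Input(shape=INPUT_SHAPE)
x = augment(inputs)
x = base(x)
x = layers.Flatten()(x)
x = layers.Dense(
    256,
    activation="relu",
    kernel_regularizer=regularizers.l2(1e-4),  # L2 weight decay on the FC head
)(x)
x = layers.Dropout(0.5)(x)                     # dropout before the classifier
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```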