Classification training v2

Analysis of the previous model

After analysing the previous classification model, four pain points were identified:

  • Inference time too long: End-to-end load testing of the application in the production environment revealed that the inference time exceeds the previously defined specification.
  • Rotation of the input image affects predictions: Real-world testing showed that rotating the picture could, in some cases, alter the typology prediction.
  • Unbalanced performance: While the overall precision of the model is around 75%, it varies significantly across typologies, with some reaching 97% precision and others only 50%.
  • Weapon detection: Early user feedback highlighted the absence of a weapon detection module. Currently, the application assumes the presence of a weapon in the picture to determine the typology (classification).
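The imbalance described in the third point only becomes visible when precision is computed per typology rather than over the whole test set. The sketch below is a minimal illustration, not the project's evaluation code: the labels and predictions are toy values, and only `autre_pistolet` is a typology named on this page (`revolver` is a placeholder).

```python
from collections import Counter


def precision_per_class(y_true, y_pred):
    """Return {label: TP / (TP + FP)} for every label that was predicted."""
    predicted = Counter(y_pred)  # TP + FP per label
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)  # TP per label
    return {label: correct[label] / n for label, n in predicted.items()}


def overall_precision(y_true, y_pred):
    """Micro-averaged precision: for single-label classification,
    this is the share of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical): 4 images of each class
y_true = ["revolver"] * 4 + ["autre_pistolet"] * 4
y_pred = ["revolver"] * 2 + ["autre_pistolet"] * 6

per_class = precision_per_class(y_true, y_pred)
overall = overall_precision(y_true, y_pred)
```

In this toy case a single overall figure hides the fact that one class is predicted far less reliably than the other, which is the same effect reported for the previous model.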

    Training strategy

    To address these pain points, several strategies have been considered, prioritizing quick wins with positive impact. Let’s delve into the main approaches:

    Change model architecture

    The first step is to move from an EfficientNet architecture to YOLOv8, which offers significantly faster inference. To isolate the impact of the model change, we used the same dataset and data augmentation parameters.

    The results showed that using YOLOv8 for classification reduces classification errors by a factor of four and significantly improves inference time. This new model will serve as our baseline for further enhancements.
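The training code itself is not shown on this page. As a rough sketch, and assuming the Ultralytics Python package is used, fine-tuning a YOLOv8 classifier could look like the following. The dataset directory, number of epochs and image size are placeholder values, and `weights_name` and `train_classifier` are hypothetical helper names.

```python
def weights_name(size="n"):
    """Name of the pretrained YOLOv8 classification checkpoint for a size letter."""
    if size not in ("n", "s", "m", "l", "x"):
        raise ValueError(f"unknown YOLOv8 size: {size!r}")
    return f"yolov8{size}-cls.pt"


def train_classifier(data_dir, size="n", epochs=50, imgsz=224):
    """Fine-tune a YOLOv8 classifier on an image-folder dataset.

    data_dir is assumed to contain train/ and val/ sub-folders,
    each with one sub-folder per typology.
    """
    # Imported lazily so that weights_name() works without the package installed
    from ultralytics import YOLO

    model = YOLO(weights_name(size))
    model.train(data=data_dir, epochs=epochs, imgsz=imgsz)
    return model
```

A call such as `train_classifier("path/to/dataset")` would then train the nano variant. Keeping the dataset and augmentation settings fixed while only the architecture changes is what makes the comparison with EfficientNet meaningful.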

    Data augmentation

    To mitigate the dependency on image rotation, we introduced data augmentation techniques. Specifically, we applied the following geometric transformations:

  • Flip left/right (with a 0.2 probability, accounting for left-handed individuals)
  • Flip up/down (also with a 0.2 probability)
  • A 180° random rotation
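The three transformations listed above can be sketched in plain Python on an image stored as a list of rows. This is an illustration only, not the pipeline used for training. It reads "a 180° random rotation" as a rotation by an angle drawn uniformly from [-180°, 180°], which is an assumption, and the function names are hypothetical.

```python
import math
import random


def flip_lr(img):
    """Mirror each row (left/right flip)."""
    return [row[::-1] for row in img]


def flip_ud(img):
    """Reverse the row order (up/down flip)."""
    return [row[:] for row in img[::-1]]


def rotate(img, angle_deg, fill=0):
    """Rotate about the image centre with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel that lands on (x, y)
            dx, dy = x - cx, y - cy
            sx = round(cx + cos_a * dx + sin_a * dy)
            sy = round(cy - sin_a * dx + cos_a * dy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


def augment(img, rng, p_flip_lr=0.2, p_flip_ud=0.2, max_angle=180.0):
    """Apply each flip with its probability, then a random rotation."""
    if rng.random() < p_flip_lr:
        img = flip_lr(img)
    if rng.random() < p_flip_ud:
        img = flip_ud(img)
    return rotate(img, rng.uniform(-max_angle, max_angle))


augmented = augment([[1, 2], [3, 4]], random.Random(0))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which helps when the same transformations must also be applied to a test set.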

    To assess the impact of these transformations, the same transformations were applied to the test dataset. On this transformed dataset, the previous YOLOv8 model drops from 93.5% to 91% precision. Let's look at the effect of retraining with the new data augmentation.

    The new data augmentation makes the model more precise and less rotation dependent.
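The measurement principle, scoring the same classifier on the original and on the transformed test set, can be illustrated with a toy example. Everything below is hypothetical: the classifier is deliberately orientation sensitive, the images are 2x2 grids, and only a fixed 180° rotation is applied.

```python
def rotate_180(img):
    """Rotate a 2D image (list of rows) by 180 degrees."""
    return [row[::-1] for row in img[::-1]]


def share_correct(predict, samples):
    """Fraction of (image, label) pairs the classifier gets right."""
    return sum(predict(img) == label for img, label in samples) / len(samples)


def toy_predict(img):
    """Orientation-sensitive toy classifier: looks only at the top-left pixel."""
    return "bright" if img[0][0] > 0.5 else "dark"


test_set = [
    ([[0.9, 0.1], [0.1, 0.1]], "bright"),
    ([[0.8, 0.2], [0.2, 0.9]], "bright"),
    ([[0.1, 0.2], [0.3, 0.2]], "dark"),
    ([[0.2, 0.1], [0.1, 0.3]], "dark"),
]
rotated_set = [(rotate_180(img), label) for img, label in test_set]

baseline = share_correct(toy_predict, test_set)
after_rotation = share_correct(toy_predict, rotated_set)
```

The gap between `baseline` and `after_rotation` plays the same role as the drop from 93.5% to 91% reported above: the smaller it is, the less the model depends on image orientation.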

    Typologies remodeling

    Some experiments were carried out to determine whether new typologies would improve the model's performance:

  • Merge the semi-automatic long gun typologies (as in dataset V0)
  • Split the autre_pistolet typology, because it was the least precise.

    The conclusions were as follows:

  • Merging the semi-automatic long gun typologies has no significant impact on performance.
  • Splitting autre_pistolet is a complex task because this typology contains few images and is very diverse. Moreover, law enforcement forces rarely encounter this typology in the field, so it is not a priority.

    As a result, we decided to retain the existing typologies.

    Model size

    YOLOv8 offers various model sizes, from nano (our current choice) to XL. Larger models generally yield better performance but come with longer inference times. YOLOv8 nano performs well and is quite fast (~30 ms in a production-like "isoprod" environment). Since this inference time was below the defined threshold, we decided to test YOLOv8 small, the next size up.
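The selection logic described here, picking the largest model that still fits the latency budget, can be sketched as follows. The helper names are hypothetical. In the example values, only the ~30 ms figure for nano comes from this page; the other latencies and the 50 ms budget are invented for illustration, since the actual threshold is not stated.

```python
import time


def mean_latency_ms(predict, sample, runs=20, warmup=3):
    """Average wall-clock time of predict(sample) in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed
        predict(sample)
    start = time.perf_counter()
    for _ in range(runs):
        predict(sample)
    return (time.perf_counter() - start) * 1000 / runs


def largest_within_budget(latencies_ms, budget_ms, order=("n", "s", "m", "l", "x")):
    """Largest model size whose latency fits the budget, or None if none fits."""
    chosen = None
    for size in order:  # sizes are visited from smallest to largest
        if size in latencies_ms and latencies_ms[size] <= budget_ms:
            chosen = size
    return chosen


# Hypothetical measurements in milliseconds (only "n" reflects the page)
latencies = {"n": 30.0, "s": 45.0, "m": 90.0}
choice = largest_within_budget(latencies, budget_ms=50.0)
```

In practice `mean_latency_ms` would be called with each model's prediction function on a representative image, in an environment as close to production as possible.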

    Here is the comparison.

    While the precision gain was not substantial, we opted to use the small model for now. However, we’ve kept the nano model as a backup in case new constraints arise.
