Quantisation - ofithcheallaigh/masters_project GitHub Wiki
# Introduction

The aim of quantisation is to reduce the model size.
Post-training quantisation does not require any changes to the designed network; it allows a previously trained network to be converted into a quantised network, for example from 32-bit floating point down to 16-bit floating point.
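The storage saving comes directly from the narrower weight type. A minimal illustration (the weight array here is hypothetical, standing in for a trained model's parameters):

```python
# Illustrative only: float16 quantisation stores each 32-bit weight
# in 16 bits, halving the storage needed for the parameters.
import numpy as np

weights_fp32 = np.random.randn(1024).astype(np.float32)  # hypothetical trained weights
weights_fp16 = weights_fp32.astype(np.float16)           # quantised copy

print(weights_fp32.nbytes, weights_fp16.nbytes)
```

Float16 halves storage at the cost of reduced numeric precision, which is usually acceptable for inference.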
The image below shows the model size with no quantisation:
The Float16 quantisation results can be seen below:
While this is a reduction in size, further work could be done to improve this optimisation.