Framework test and benchmark - aifoundry-org/erbium GitHub Wiki
Testing approach
All frameworks have a set of simple basic operations in common. These shared operations are enough to build minimal neural network architectures for testing the quality and performance of these frameworks.
The following section is mostly computer-vision-based due to yours truly’s obvious bias. That being said, extending this list with some audio-based or sensor-processing networks is also of some interest.
Models
The basic example is a multi-layer perceptron (MLP), which consists of:
- Linear (Fully-connected) layer
- Dropout operation
- ReLU activation function
The implementation is similar to the one presented here
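To make the three operations above concrete, here is a minimal NumPy sketch of an MLP forward pass. It is not the linked implementation: the layer order (Linear, ReLU, Dropout, Linear), the hidden size of 128, and all function names are assumptions chosen for illustration. The input size of 784 and the 10 output classes match flattened 28x28 MNIST digits.

```python
import numpy as np


def linear(x, w, b):
    """Fully-connected layer: x is (batch, n_in), w is (n_in, n_out)."""
    return x @ w + b


def relu(x):
    """ReLU activation: clamp negative values to zero."""
    return np.maximum(x, 0.0)


def dropout(x, p, rng, training):
    """Inverted dropout: zero each element with probability p while training.

    Surviving elements are scaled by 1 / (1 - p) so the expected value is
    unchanged. In inference mode the input is returned untouched.
    """
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)


def init_params(rng, n_in=784, n_hidden=128, n_out=10):
    """Small random weights; n_hidden=128 is an arbitrary illustrative choice."""
    w1 = rng.standard_normal((n_in, n_hidden)) * 0.01
    b1 = np.zeros(n_hidden)
    w2 = rng.standard_normal((n_hidden, n_out)) * 0.01
    b2 = np.zeros(n_out)
    return w1, b1, w2, b2


def mlp_forward(x, params, p=0.5, rng=None, training=False):
    """Assumed layer order: Linear -> ReLU -> Dropout -> Linear (logits)."""
    w1, b1, w2, b2 = params
    h = relu(linear(x, w1, b1))
    h = dropout(h, p, rng, training)
    return linear(h, w2, b2)


rng = np.random.default_rng(0)
params = init_params(rng)
batch = rng.random((4, 784))          # stand-in for 4 flattened MNIST images
logits = mlp_forward(batch, params)   # inference mode, dropout disabled
predicted_classes = logits.argmax(axis=1)
```

A benchmark would run the same small graph in each framework under test and compare the outputs against a reference such as this one.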
A slightly more advanced example is a CNN-based classifier network, such as ResNet-20. This architecture introduces additional operations commonly supported by most frameworks, including:
- Convolution
- Batch Normalization
- Residual connections
- Pooling layers
Implementation is presented here
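As a rough illustration of how the four operations listed above fit together, the following NumPy sketch builds one basic residual block followed by global average pooling. It is not the linked implementation: the function names, the parameter layout, the inference-mode batch normalization, and the 16-channel 32x32 input are assumptions made for the example. A full ResNet-20 stacks several such blocks and also needs strided or projection shortcuts, which are left out here.

```python
import numpy as np


def conv3x3(x, w):
    """Naive 3x3 convolution (cross-correlation), stride 1, zero padding 1.

    x: (C_in, H, W), w: (C_out, C_in, 3, 3) -> (C_out, H, W).
    """
    _, h, wd = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            # Contract over input channels for this kernel offset.
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out


def batch_norm(x, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization with fixed per-channel statistics."""
    scale = gamma / np.sqrt(var + eps)
    return x * scale[:, None, None] + (beta - mean * scale)[:, None, None]


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, params):
    """conv-BN-ReLU-conv-BN, then add the input (identity shortcut) and ReLU."""
    y = relu(batch_norm(conv3x3(x, params["w1"]), *params["bn1"]))
    y = batch_norm(conv3x3(y, params["w2"]), *params["bn2"])
    return relu(y + x)  # shapes must match for the identity shortcut


def global_avg_pool(x):
    """Average each channel over its spatial dimensions: (C, H, W) -> (C,)."""
    return x.mean(axis=(1, 2))


def make_block_params(rng, channels):
    """Random weights and identity batch-norm parameters (illustrative only)."""
    identity_bn = (np.ones(channels), np.zeros(channels),
                   np.zeros(channels), np.ones(channels))
    shape = (channels, channels, 3, 3)
    return {
        "w1": rng.standard_normal(shape) * 0.1, "bn1": identity_bn,
        "w2": rng.standard_normal(shape) * 0.1, "bn2": identity_bn,
    }


rng = np.random.default_rng(0)
x = rng.standard_normal((16, 32, 32))   # assumed 16-channel 32x32 feature map
features = residual_block(x, make_block_params(rng, 16))
pooled = global_avg_pool(features)      # one value per channel
```

The loop-based convolution is deliberately slow and simple; its purpose is to serve as a readable reference against which a framework's output can be compared.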
Datasets
For the MLP network, the MNIST dataset is sufficient.
For the ResNet-20 network, the CIFAR-10 dataset is well suited.
Metrics
Several metrics are proposed:
- Accuracy - A measure of prediction quality, needed because some frameworks introduce rather extreme optimizations that may affect the overall quality.
- Latency - The amount of time it takes for a neural network to produce a prediction for a single input sample.
- Throughput - The number of predictions produced by a neural network in a given amount of time.
- Memory footprint - The amount of memory consumed by the model and the framework itself (since we have limited resources).
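
The four metrics above could be collected with a small harness along these lines. This is a sketch using only the Python standard library; the function names, warm-up and run counts, and the use of the median are assumptions, not an agreed protocol. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory allocated natively by a framework would need an OS-level measurement instead.

```python
import statistics
import time
import tracemalloc


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


def measure_latency(predict, sample, warmup=10, runs=100):
    """Median wall-clock time in seconds for one prediction on one sample."""
    for _ in range(warmup):          # warm caches before timing
        predict(sample)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def measure_throughput(predict_batch, batch, runs=20):
    """Predictions per second when processing whole batches."""
    start = time.perf_counter()
    for _ in range(runs):
        predict_batch(batch)
    elapsed = time.perf_counter() - start
    return runs * len(batch) / elapsed


def measure_peak_memory(fn, *args):
    """Peak bytes allocated through Python's allocator while fn runs."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Hypothetical stand-in for a real model: "predicts" the index of the largest input.
def toy_predict(sample):
    return max(range(len(sample)), key=sample.__getitem__)


def toy_predict_batch(batch):
    return [toy_predict(sample) for sample in batch]


toy_batch = [[float((i * j) % 7) for j in range(64)] for i in range(32)]
latency_s = measure_latency(toy_predict, toy_batch[0])
throughput = measure_throughput(toy_predict_batch, toy_batch)
peak_bytes = measure_peak_memory(toy_predict_batch, toy_batch)
```

Replacing `toy_predict` and `toy_predict_batch` with calls into the framework under test keeps the measurement procedure identical across frameworks.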