Framework test and benchmark - aifoundry-org/erbium GitHub Wiki

Testing approach

All frameworks share a small set of basic operations. These operations are enough to build a minimal example of a neural network architecture for testing the quality and performance of each framework.

The following section is mostly computer-vision-based due to yours truly’s obvious bias. That being said, extending this list with audio-based or sensor-processing networks is also of interest.

Models

The basic example is a Multi-layer Perceptron (MLP), which consists of:

Linear (Fully-connected) layer
Dropout operation
ReLU activation function

The implementation is similar to the one presented here
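
To make the three MLP operations concrete, here is a minimal sketch in plain Python (standard library only). It is not the referenced implementation: the function names, the layer format (a list of `(weights, bias)` pairs), and the use of inverted dropout are assumptions made for illustration.

```python
import random


def linear(x, weights, bias):
    """Fully-connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """ReLU activation: negative values are clamped to zero."""
    return [max(0.0, v) for v in x]


def dropout(x, p=0.5, training=False, rng=random):
    """Inverted dropout (an assumption): active only in training mode."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    # Kept values are rescaled so the expected activation stays the same.
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def mlp_forward(x, layers, p=0.5, training=False):
    """Run Linear -> ReLU -> Dropout for hidden layers, then a final Linear."""
    for weights, bias in layers[:-1]:
        x = dropout(relu(linear(x, weights, bias)), p, training)
    weights, bias = layers[-1]
    return linear(x, weights, bias)


# Hypothetical toy weights: 2 inputs -> 2 hidden units -> 1 output.
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.1]),
]
toy_output = mlp_forward([2.0, 1.0], toy_layers)
```

At inference time dropout is a no-op, so a benchmark run only exercises the linear layers and the activation.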

A slightly more advanced example is a CNN-based classifier network, such as ResNet-20. This architecture introduces additional operations commonly supported by most frameworks, including:

Convolution
Batch Normalization
Residual connections
Pooling layers

The implementation is presented here
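
The operations listed above can be sketched on a single-channel feature map in plain Python. This is a simplified illustration, not the referenced ResNet-20 implementation: the helper names, the single-channel input, the scalar batch-norm statistics, and the use of global average pooling are all assumptions.

```python
def conv2d_same(x, kernel):
    """Single-channel 2D convolution (cross-correlation) with zero padding."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    ii, jj = i + di - ph, j + dj - pw
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += x[ii][jj] * kernel[di][dj]
            out[i][j] = acc
    return out


def batch_norm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Inference-mode batch normalization with precomputed statistics."""
    scale = gamma / (var + eps) ** 0.5
    return [[(v - mean) * scale + beta for v in row] for row in x]


def relu(x):
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x, k1, k2, mean=0.0, var=1.0):
    """conv -> BN -> ReLU -> conv -> BN, then add the input back (skip)."""
    y = relu(batch_norm(conv2d_same(x, k1), mean, var))
    y = batch_norm(conv2d_same(y, k2), mean, var)
    y = [[a + b for a, b in zip(ry, rx)] for ry, rx in zip(y, x)]
    return relu(y)


def global_avg_pool(x):
    """Pooling: collapse the whole feature map into its mean value."""
    return sum(sum(row) for row in x) / (len(x) * len(x[0]))


# With all-zero kernels the block reduces to the skip path, so the
# input passes through unchanged.
zero_kernel = [[0.0] * 3 for _ in range(3)]
feature_map = [[1.0, 2.0], [3.0, 4.0]]
block_output = residual_block(feature_map, zero_kernel, zero_kernel)
pooled = global_avg_pool(block_output)
```

The zero-kernel case shows why residual connections matter for a framework test: the skip path must be preserved exactly, whatever optimizations are applied to the convolution branch.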

Datasets

For the MLP network, the MNIST dataset is sufficient.
For the ResNet-20 network, the CIFAR-10 dataset is a suitable choice.

Metrics

The following metrics are proposed:

Accuracy - A measure of prediction quality. It is needed because some frameworks introduce rather extreme optimizations that may affect overall quality.
Latency - The amount of time it takes for a neural network to produce a prediction for a single input sample.
Throughput - The number of predictions produced by a neural network in a given amount of time.
Memory footprint - The amount of memory consumed by the model and the framework itself (since we have limited resources).
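
The four metrics can be collected with a small harness like the sketch below, which uses only the Python standard library. The `predict` callable, the warm-up and run counts, and the choice of the median as the latency statistic are assumptions. Note that `tracemalloc` only tracks allocations made through Python's allocator, so memory used by a native framework backend would need a process-level measurement instead.

```python
import time
import tracemalloc


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def measure_latency(predict, sample, warmup=3, runs=20):
    """Median wall-clock time (seconds) for one prediction on one sample."""
    for _ in range(warmup):
        predict(sample)  # warm-up runs are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append(time.perf_counter() - start)
    timings.sort()
    return timings[len(timings) // 2]


def measure_throughput(predict, samples):
    """Predictions per second over a sequence of samples."""
    start = time.perf_counter()
    for sample in samples:
        predict(sample)
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed if elapsed > 0 else float("inf")


def measure_peak_memory(predict, sample):
    """Peak Python-level allocation (bytes) during one prediction."""
    tracemalloc.start()
    predict(sample)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


# Hypothetical stand-in for a model: classifies by sign of the input sum.
def toy_predict(sample):
    return 1 if sum(sample) > 0 else 0


toy_samples = [[1.0, 2.0], [-1.0, -2.0], [0.5, -0.1], [-3.0, 1.0]]
toy_labels = [1, 0, 1, 1]
toy_accuracy = accuracy([toy_predict(s) for s in toy_samples], toy_labels)
```

Latency and throughput are measured separately because a framework that batches inputs can improve throughput without improving single-sample latency.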