MLPerf Inference v0.5
About
- MLPerf Inference Benchmark v0.5, published in November 2019
Components
| Load | Ref Model | Params | GOPS/Input | Data Set | Quality Target | Metric |
|---|---|---|---|---|---|---|
| Heavy | ResNet-50 v1.5 | 25.6M | 7.8 | ImageNet (224x224) | 99% of FP32 (76.456%) | Top-1 Accuracy |
| Light | MobileNet-v1 224 | 4.2M | 1.138 | ImageNet (224x224) | 98% of FP32 (71.676%) | Top-1 Accuracy |

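The quality targets in the table above are fractions of the FP32 reference accuracy, so the minimum accuracy a submission has to reach can be worked out directly. The following is a minimal sketch of that arithmetic; the function name `quality_threshold` is hypothetical and is not part of any MLPerf tooling.

```python
def quality_threshold(fp32_accuracy: float, fraction: float) -> float:
    """Return the minimum accuracy: a fixed fraction of the FP32 reference."""
    return fp32_accuracy * fraction


# Reference Top-1 accuracies (in percent) and fractions taken from the table.
resnet50_min = quality_threshold(76.456, 0.99)
mobilenet_min = quality_threshold(71.676, 0.98)

print(f"ResNet-50 v1.5 minimum Top-1:   {resnet50_min:.3f}%")
print(f"MobileNet-v1 224 minimum Top-1: {mobilenet_min:.3f}%")
```

Under these numbers, ResNet-50 v1.5 needs roughly 75.69% Top-1 and MobileNet-v1 224 roughly 70.24% Top-1.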
| Load | Ref Model | Params | GOPS/Input | Data Set | Quality Target | Metric |
|---|---|---|---|---|---|---|
| Heavy | SSD-ResNet34 | 36.3M | 433 | COCO (1200x1200) | 99% of FP32 | 0.20 mAP |
| Light | SSD-MobileNet-V1 | 6.91M | 2.47 | COCO (300x300) | 99% of FP32 | 0.22 mAP |

| Load | Ref Model | Params | GOPS/Input | Data Set | Quality Target | Metric |
|---|---|---|---|---|---|---|
| ? | GNMT | 210M | | WMT16 EN-DE | 99% of FP32 | 23.9 SacreBLEU |

Scenarios and Metrics
| Scenario | Query Generation | Metric | Samples/Query | Examples |
|---|---|---|---|---|
| Single-Stream (SS) | Sequential | 90th-Percentile Latency | 1 | Typing Autocomplete, Real-Time AR |
| Multistream (MS) | Arrival Interval With Dropping | Number Of Streams, Subject To Latency Bound | N | Multicamera Driver Assistance, Large-Scale Automation |
| Server (S) | Poisson Distribution | Queries Per Second Subject To Latency Bound | 1 | Translation Website |
| Offline (O) | Batch | Throughput | At Least 24,576 | Photo Categorization |

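To make the scenario metrics more concrete, here is a rough Python sketch of three of them: a 90th-percentile latency for Single-Stream, Poisson-distributed query arrivals for Server, and samples-per-second throughput for Offline. All function names are hypothetical, and the nearest-rank percentile method is an assumption; this page does not say how the official load generator computes percentiles.

```python
import math
import random


def percentile_latency(latencies, pct=90.0):
    """Nearest-rank percentile (an assumed method): the smallest sample
    such that at least pct% of all samples are less than or equal to it."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100.0)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def poisson_arrival_times(target_qps, num_queries, seed=0):
    """Arrival times in seconds for a Poisson process, built from
    exponentially distributed gaps with mean 1 / target_qps."""
    rng = random.Random(seed)
    now = 0.0
    arrivals = []
    for _ in range(num_queries):
        now += rng.expovariate(target_qps)
        arrivals.append(now)
    return arrivals


def offline_throughput(num_samples, elapsed_seconds):
    """Samples processed per second for one large batch query."""
    return num_samples / elapsed_seconds


# Illustrative, made-up latencies in milliseconds.
p90 = percentile_latency([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
arrivals = poisson_arrival_times(target_qps=100.0, num_queries=5)
throughput = offline_throughput(24576, 2.0)
```

In the Server scenario the reported number would be the highest `target_qps` at which the latency bound still holds, so a real harness would repeat a run like this at different rates.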
Closed vs Open systems
Closed division - For comparison of different systems
- Same models, data sets, and quality targets to ensure comparability across wildly different architectures.
Open division - For fostering innovation
- Encourages new work in ML systems, algorithms, optimization, and hardware/software co-design.
- Submissions address the same ML task but may change the model architecture and the quality targets.