Intel ML Benchmarks - AshokBhat/ml GitHub Wiki

Benchmarks

BERT-Large SQuAD: 1.45x higher INT8 real-time inference throughput and 1.74x higher INT8 batch inference throughput on Ice Lake (Intel Xeon Platinum 8380) vs. prior-generation Cascade Lake (Intel Xeon Platinum 8280):
New: Platinum 8380: 1-node, 2x Intel Xeon Platinum 8380 processor on Coyote Pass with 512 GB (16 slots/ 32GB/ 3200) total DDR4 memory, ucode X261, HT on, Turbo on, Ubuntu 20.04 LTS, 5.4.0-65-generic, 1x Intel_SSDSC2KG96, Intel SSDPE2KX010T8, BERT-Large SQuAD, gcc-9.3.0, oneDNN 1.6.4, BS=1, 128 INT8, TensorFlow 2.4.1 with Intel optimizations for 3rd Gen Intel Xeon Scalable processor, upstreamed to TensorFlow 2.5 (container: intel/intel-optimized-tensorflow:tf-r2.5-icx-b631821f), Model zoo: https://github.com/IntelAI/models/tree/icx-launch-public/quickstart/, tested by Intel on 3/12/2021.
Baseline: Platinum 8280: 1-node, 2x Intel Xeon Platinum 8280 processor on Wolf Pass with 384 GB (12 slots/ 32GB/ 2933) total DDR4 memory, ucode 0x5003003, HT on, Turbo on, Ubuntu 20.04 LTS, 5.4.0-48-generic, 1x Samsung_SSD_860, Intel SSDPE2KX040T8, BERT-Large SQuAD, gcc-9.3.0, oneDNN 1.6.4, BS=1, 128 INT8, TensorFlow 2.4.1 with Intel optimizations for 3rd Gen Intel Xeon Scalable processor, upstreamed to TensorFlow 2.5 (container: intel/intel-optimized-tensorflow:tf-r2.5-icx-b631821f), Model zoo: https://github.com/IntelAI/models/tree/icx-launch-public/quickstart/, tested by Intel on 2/17/2021.
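
A figure such as "1.45x higher throughput" is normally read as the ratio of the new system's throughput to the baseline's. The sketch below shows that arithmetic only. The function name `speedup` and the throughput values are hypothetical placeholders, not numbers from the tests above, which report only the resulting ratios.

```python
def speedup(new_throughput: float, baseline_throughput: float) -> float:
    """Return how many times higher the new throughput is than the baseline."""
    if baseline_throughput <= 0:
        raise ValueError("baseline throughput must be positive")
    return new_throughput / baseline_throughput


# Placeholder values for illustration only; these are NOT measured results.
baseline_samples_per_sec = 100.0  # hypothetical baseline throughput
new_samples_per_sec = 145.0       # hypothetical new-platform throughput

print(f"{speedup(new_samples_per_sec, baseline_samples_per_sec):.2f}x")
```

To reproduce a comparison in this way, both throughputs would need to come from the same model, precision, and batch size on each platform.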