ResNet 50 Performance - AshokBhat/ml GitHub Wiki

Inference performance

Single Stream (Latency in ms)

| Submitter | System | Latency (ms) | Processor | Num | Accelerator | Num | Software |
|---|---|---|---|---|---|---|---|
| Habana Labs | HL-102-Goya PCI-board | 0.24 | Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz | 1 | Goya | 1 | Synapse-V0.2.0 |
| Intel | Intel Xeon Platinum 9200 processors | 1.37 | Intel Xeon Platinum 9282 Processor | 2 | None | None | PyTorch Caffe2 |
| NVIDIA | NVIDIA Jetson AGX Xavier (Xavier) | 2.04 | NVIDIA Carmel (ARMv8.2) | 1 | NVIDIA Xavier | 1 | TensorRT 6.0, Jetpack 4.3-DP, CUDA 10.0, cuDNN 7.6.3 |
| Qualcomm | SDM855 QRD | 8.95 | Qualcomm Kryo485 | 1 | Qualcomm Hexagon 690 Processor: Hexagon Vector Extensions (HVX), Hexagon Tensor Accelerator (HTA) | 1 | Snapdragon Neural Processing Engine (SNPE) V1.30 |
| Intel | DELL ICL i3 1005G1 | 13.58 | Intel Core i3-1005G1 Processor | 1 | Intel UHD Graphics | 1 | OpenVINO |
| dividiti | Linaro HiKey960 (hikey960) | 203.99 | HiSilicon Kirin960 | 1 | Arm Mali-G71 MP8 | 1 | ArmNN v19.08 (OpenCL) |
| dividiti | Huawei Mate 10 Pro (mate10pro) | 354.13 | HiSilicon Kirin970 | 1 | Arm Mali-G72 MP12 | 1 | ArmNN v19.08 (OpenCL) |
| dividiti | Firefly-RK3399 (firefly) | 391.02 | Rockchip RK3399 | 1 | None | None | ArmNN v19.08 (Neon) |
| dividiti | Firefly-RK3399 (firefly) | 447.9 | Rockchip RK3399 | 1 | Arm Mali-T860 MP4 | 1 | ArmNN v19.08 (OpenCL) |
| dividiti | Raspberry Pi 4 (rpi4) | 448.31 | Broadcom BCM2711B0 | 1 | None | None | ArmNN v19.08 (Neon) |
| dividiti | Linaro HiKey960 (hikey960) | 494.9 | HiSilicon Kirin960 | 1 | None | None | ArmNN v19.08 (Neon) |
| dividiti | Huawei Mate 10 Pro (mate10pro) | 494.92 | HiSilicon Kirin970 | 1 | None | None | ArmNN v19.08 (Neon) |
| dividiti | Linaro HiKey960 (hikey960) | 518.07 | HiSilicon Kirin960 | 1 | None | None | TFLite v1.15.0-rc2 |
| dividiti | Firefly-RK3399 (firefly) | 695.11 | Rockchip RK3399 | 1 | None | None | TFLite v1.15.0-rc2 |
| dividiti | Raspberry Pi 4 (rpi4) | 1,916.65 | Broadcom BCM2711B0 | 1 | None | None | TFLite v1.15.0-rc2 |
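
The latencies above can be turned into two rough derived figures: an implied single-query rate (1000 ms divided by the latency) and the speedup of each software stack over the slowest one on the same board. The sketch below is illustrative only. The function names are hypothetical, only a subset of rows is copied in, and the reciprocal is an estimate: if the reported latency is a tail percentile rather than a mean, the sustained rate would differ.

```python
# Single-stream latencies in milliseconds, copied from the table above.
# Keys are (system short name, software); only the dividiti boards that
# appear with more than one software stack are included.
latency_ms = {
    ("hikey960", "ArmNN v19.08 (OpenCL)"): 203.99,
    ("hikey960", "ArmNN v19.08 (Neon)"): 494.9,
    ("hikey960", "TFLite v1.15.0-rc2"): 518.07,
    ("firefly", "ArmNN v19.08 (Neon)"): 391.02,
    ("firefly", "ArmNN v19.08 (OpenCL)"): 447.9,
    ("firefly", "TFLite v1.15.0-rc2"): 695.11,
    ("rpi4", "ArmNN v19.08 (Neon)"): 448.31,
    ("rpi4", "TFLite v1.15.0-rc2"): 1916.65,
}


def inferences_per_second(ms):
    """Implied rate when queries are issued one at a time (hypothetical helper)."""
    return 1000.0 / ms


def speedup_vs_slowest(system):
    """Map each software stack on `system` to its speedup over the slowest stack."""
    rows = {sw: ms for (name, sw), ms in latency_ms.items() if name == system}
    slowest = max(rows.values())
    return {sw: slowest / ms for sw, ms in rows.items()}


if __name__ == "__main__":
    for system in ("hikey960", "firefly", "rpi4"):
        for software, factor in speedup_vs_slowest(system).items():
            print(f"{system:9s} {software:24s} {factor:.2f}x")
```

Grouping by board keeps the comparison to the same hardware, so the ratio reflects the software stack and the choice of CPU (Neon) versus GPU (OpenCL) rather than differences between devices.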

Server

| Server QPS | Processor | Sockets | Accelerator | Num | Software |
|---|---|---|---|---|---|
| 4850.62 | Intel® Xeon® Platinum 9282 Processor | 2 | None | None | PyTorch Caffe2 |
| 16014.29 | Intel Skylake | 2 | TPU v3 | 4 | TensorFlow, TPU 1.15.dev |
| 20742.83 | Intel(R) Xeon(R) Gold 6154 | 2 | Nvidia T4 | 4 | NGC19.09 TensorRT |
| 41546.64 | Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz | 2 | NVIDIA Tesla T4 | 8 | TensorRT 6.0, CUDA 10.1, cuDNN 7.6.3 |
| 60030.57 | Intel(R) Xeon(R) 8268 | 2 | NVIDIA TITAN RTX | 4 | |
| 103532.10 | Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz | 2 | NVIDIA Tesla T4 | 20 | TensorRT 6.0, CUDA 10.1, cuDNN 7.6.3 |
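
Because the accelerator count varies from 4 to 20 across these rows, the raw QPS figures are hard to compare directly. A minimal sketch of one way to normalize them, dividing QPS by the number of accelerators, is given here. The helper name is hypothetical, the CPU-only row is left out because it has no accelerator count, and the result ignores host CPU and interconnect differences.

```python
# (Server QPS, accelerator, accelerator count), copied from the table above.
# The CPU-only Xeon Platinum 9282 row is omitted: it has no accelerators.
server_rows = [
    (16014.29, "TPU v3", 4),
    (20742.83, "Nvidia T4", 4),
    (41546.64, "NVIDIA Tesla T4", 8),
    (60030.57, "NVIDIA TITAN RTX", 4),
    (103532.10, "NVIDIA Tesla T4", 20),
]


def qps_per_accelerator(qps, count):
    """Average queries per second contributed by one accelerator (hypothetical helper)."""
    return qps / count


if __name__ == "__main__":
    for qps, name, count in server_rows:
        per_unit = qps_per_accelerator(qps, count)
        print(f"{name} x{count}: {per_unit:.2f} QPS per accelerator")
```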

Offline

| Submitter | Offline | Processor | Num | Accelerator | Num | Software |
|---|---|---|---|---|---|---|
| NVIDIA | 2,158.93 | NVIDIA Carmel (ARMv8.2) | 1 | NVIDIA Xavier | 1 | TensorRT 6.0, Jetpack 4.3-DP, CUDA 10.0, cuDNN 7.6.3 |
| Alibaba Cloud | 5,540.10 | Intel Xeon Platinum 8163 | 1 | Nvidia Tesla T4 | 1 | TensorRT 6.0, CUDA 10.1, cuDNN 7.6.3 |
| Intel | 5,965.62 | Intel® Xeon® Platinum 9282 Processor | 2 | None | None | PyTorch Caffe2 |
| Dell EMC | 22,438.00 | Intel(R) Xeon(R) Gold 6154 | 2 | Nvidia T4 | 4 | NGC19.09 TensorRT |
| NVIDIA | 66,250.40 | Intel(R) Xeon(R) 8268 | 2 | NVIDIA TITAN RTX | 4 | TensorRT 6.0, CUDA 10.1, cuDNN 7.6.3 |
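
Three hardware configurations appear in both the Server and Offline tables, so the two figures can be set side by side. The sketch below computes the Server-to-Offline ratio for each. Pairing the rows by processor, accelerator and count is an assumption (the Server table lists no submitter), and the helper name and the short configuration labels are hypothetical.

```python
# (configuration label, Server QPS, Offline figure), copied from the two tables.
# ASSUMPTION: rows with the same processor, accelerator and counts describe
# the same system in both tables.
paired_rows = [
    ("Xeon Platinum 9282 x2, no accelerator", 4850.62, 5965.62),
    ("Xeon Gold 6154 x2 + T4 x4", 20742.83, 22438.00),
    ("Xeon 8268 x2 + TITAN RTX x4", 60030.57, 66250.40),
]


def server_to_offline_ratio(server_qps, offline):
    """Fraction of the Offline figure that is reached in the Server scenario."""
    return server_qps / offline


if __name__ == "__main__":
    for label, server_qps, offline in paired_rows:
        ratio = server_to_offline_ratio(server_qps, offline)
        print(f"{label}: {ratio:.1%} of Offline")
```

A ratio below 1 is consistent with each paired row here: the Server figure is lower than the Offline figure for all three configurations.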

See also