GeforceGTX1080Ti SSDLiteMobilenetV2 - wom-ai/inference_results_v0.5 GitHub Wiki

2020-06-02

INT8 CHW4 Performance mode only (C++)

[2020-06-02 10:47:40,520 main.py:302 INFO] Using config files: measurements/GeforceGTX1080Ti/ssd-small/SingleStream/config.json
[2020-06-02 10:47:40,520 __init__.py:142 INFO] Parsing config file measurements/GeforceGTX1080Ti/ssd-small/SingleStream/config.json ...
[2020-06-02 10:47:40,520 main.py:306 INFO] Processing config "GeforceGTX1080Ti_ssd-small_SingleStream"
[2020-06-02 10:47:40,520 main.py:116 INFO] Running harness for ssd-small benchmark in SingleStream scenario...
BenchmarkHarness (
{'gpu_batch_size': 1, 'gpu_single_stream_expected_latency_ns': 1621000, 'input_dtype': 'int8', 'input_format': 'chw4', 'map_path': 'data_maps/coco/val_map.txt', 'precision': 'int8', 'tensor_path': '${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4', 'use_graphs': False, 'system_id': 'GeforceGTX1080Ti', 'scenario': 'SingleStream', 'benchmark': 'ssd-small', 'config_name': 'GeforceGTX1080Ti_ssd-small_SingleStream', 'test_mode': 'PerformanceOnly', 'warmup_duration': 20.0, 'log_dir': '/work/mlperf/inference_results_v0.5/closed/NVIDIA/build/logs/2020.06.02-10.47.40'}
BenchmarkHarness )
[2020-06-02 10:47:40,523 __init__.py:42 INFO] Running command: ./build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so" --logfile_outdir="/work/mlperf/inference_results_v0.5/closed/NVIDIA/build/logs/2020.06.02-10.47.40/GeforceGTX1080Ti/ssd-small/SingleStream" --logfile_prefix="mlperf_log_" --test_mode="PerformanceOnly" --warmup_duration=20.0 --use_graphs=false --gpu_batch_size=1 --map_path="data_maps/coco/val_map.txt" --tensor_path="${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4" --gpu_engines="./build/engines/GeforceGTX1080Ti/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan" --performance_sample_count=256 --max_dlas=0 --single_stream_expected_latency_ns=1621000 --mlperf_conf_path="measurements/GeforceGTX1080Ti/ssd-small/SingleStream/mlperf.conf" --user_conf_path="measurements/GeforceGTX1080Ti/ssd-small/SingleStream/user.conf" --scenario SingleStream --model ssd-small --response_postprocess coco
&&&& RUNNING Default_Harness # ./build/bin/harness_default
[I] mlperf.conf path: measurements/GeforceGTX1080Ti/ssd-small/SingleStream/mlperf.conf
[I] user.conf path: measurements/GeforceGTX1080Ti/ssd-small/SingleStream/user.conf
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[I] Device:0: ./build/engines/GeforceGTX1080Ti/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan has been successfully loaded.
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[I] Creating batcher thread: 0 EnableBatcherThreadPerDevice: false
Starting warmup. Running for a minimum of 20 seconds.
Finished warmup. Ran for 20.0096s.
================================================
MLPerf Results Summary
================================================
SUT name : LWIS_Server
Scenario : Single Stream
Mode     : Performance
90th percentile latency (ns) : 1337869
Result is : VALID
  Min duration satisfied : Yes
  Min queries satisfied : Yes

================================================
Additional Stats
================================================
QPS w/ loadgen overhead         : 733.56
QPS w/o loadgen overhead        : 750.99

Min latency (ns)                : 1137298
Max latency (ns)                : 10948253
Mean latency (ns)               : 1331582
50.00 percentile latency (ns)   : 1305119
90.00 percentile latency (ns)   : 1337869
95.00 percentile latency (ns)   : 1461381
97.00 percentile latency (ns)   : 1697531
99.00 percentile latency (ns)   : 2182374
99.90 percentile latency (ns)   : 2852728

================================================
Test Parameters Used
================================================
samples_per_query : 1
target_qps : 616.903
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 60000
max_duration (ms): 0
min_query_count : 1024
max_query_count : 0
qsl_rng_seed : 3133965575612453542
sample_index_rng_seed : 665484352860916858
schedule_rng_seed : 3622009729038561421
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
print_timestamps : false
performance_issue_unique : false
performance_issue_same : false
performance_issue_same_index : 0
performance_sample_count : 256

No warnings encountered during test.

No errors encountered during test.
Device Device:0 processed:
  44015 batches of size 1
  Memcpy Calls: 0
  PerSampleCudaMemcpy Calls: 0
  BatchedCudaMemcpy Calls: 44015
&&&& PASSED Default_Harness # ./build/bin/harness_default
[2020-06-02 10:49:01,682 main.py:153 INFO] Result: 90th percentile latency (ns) : 1337869 and Result is : VALID

======================= Perf harness results: =======================

GeforceGTX1080Ti-SingleStream:
    ssd-small: 90th percentile latency (ns) : 1337869 and Result is : VALID


======================= Accuracy results: =======================

GeforceGTX1080Ti-SingleStream:
    ssd-small: No accuracy results in PerformanceOnly mode.
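
The figures in the performance run above can be cross-checked with a little arithmetic. The sketch below assumes that, in the SingleStream scenario, "QPS w/o loadgen overhead" is the reciprocal of the mean latency and that `target_qps` is the reciprocal of `--single_stream_expected_latency_ns`. The logged numbers are consistent with both assumptions, but the log itself does not state how they are derived. The helper `latency_to_qps` is a hypothetical name used only for illustration.

```python
# Values copied from the performance run log above.
mean_latency_ns = 1331582       # "Mean latency (ns)"
p90_latency_ns = 1337869        # "90th percentile latency (ns)"
expected_latency_ns = 1621000   # --single_stream_expected_latency_ns


def latency_to_qps(latency_ns: float) -> float:
    """Convert a per-query latency in nanoseconds to queries per second."""
    return 1e9 / latency_ns


# Assumption: with one query in flight, throughput is 1 / mean latency.
qps_from_mean = latency_to_qps(mean_latency_ns)

# Assumption: target_qps is derived from the expected latency setting.
target_qps = latency_to_qps(expected_latency_ns)

print(f"QPS from mean latency: {qps_from_mean:.2f}")   # log reports 750.99
print(f"Target QPS:            {target_qps:.3f}")      # log reports 616.903
print(f"p90 below expected latency: {p90_latency_ns < expected_latency_ns}")
```

The measured 90th percentile latency (about 1.34 ms) is below the configured expected latency (1.621 ms), so the run was faster than the configuration anticipated.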

[2020-06-02 10:49:02,236 main.py:302 INFO] Using config files: measurements/GeforceGTX1080Ti/ssd-small/SingleStream/config.json
[2020-06-02 10:49:02,236 __init__.py:142 INFO] Parsing config file measurements/GeforceGTX1080Ti/ssd-small/SingleStream/config.json ...
[2020-06-02 10:49:02,236 main.py:306 INFO] Processing config "GeforceGTX1080Ti_ssd-small_SingleStream"
[2020-06-02 10:49:02,236 main.py:116 INFO] Running harness for ssd-small benchmark in SingleStream scenario...
BenchmarkHarness (
{'gpu_batch_size': 1, 'gpu_single_stream_expected_latency_ns': 1621000, 'input_dtype': 'int8', 'input_format': 'chw4', 'map_path': 'data_maps/coco/val_map.txt', 'precision': 'int8', 'tensor_path': '${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4', 'use_graphs': False, 'system_id': 'GeforceGTX1080Ti', 'scenario': 'SingleStream', 'benchmark': 'ssd-small', 'config_name': 'GeforceGTX1080Ti_ssd-small_SingleStream', 'test_mode': 'AccuracyOnly', 'log_dir': '/work/mlperf/inference_results_v0.5/closed/NVIDIA/build/logs/2020.06.02-10.49.01'}
BenchmarkHarness )
[2020-06-02 10:49:02,238 __init__.py:42 INFO] Running command: ./build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so" --logfile_outdir="/work/mlperf/inference_results_v0.5/closed/NVIDIA/build/logs/2020.06.02-10.49.01/GeforceGTX1080Ti/ssd-small/SingleStream" --logfile_prefix="mlperf_log_" --test_mode="AccuracyOnly" --use_graphs=false --gpu_batch_size=1 --map_path="data_maps/coco/val_map.txt" --tensor_path="${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4" --gpu_engines="./build/engines/GeforceGTX1080Ti/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan" --performance_sample_count=256 --max_dlas=0 --single_stream_expected_latency_ns=1621000 --mlperf_conf_path="measurements/GeforceGTX1080Ti/ssd-small/SingleStream/mlperf.conf" --user_conf_path="measurements/GeforceGTX1080Ti/ssd-small/SingleStream/user.conf" --scenario SingleStream --model ssd-small --response_postprocess coco
&&&& RUNNING Default_Harness # ./build/bin/harness_default
[I] mlperf.conf path: measurements/GeforceGTX1080Ti/ssd-small/SingleStream/mlperf.conf
[I] user.conf path: measurements/GeforceGTX1080Ti/ssd-small/SingleStream/user.conf
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[I] Device:0: ./build/engines/GeforceGTX1080Ti/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan has been successfully loaded.
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[W] [TRT] TensorRT was linked against cuBLAS 10.2.0 but loaded cuBLAS 10.1.0
[I] Creating batcher thread: 0 EnableBatcherThreadPerDevice: false
Starting warmup. Running for a minimum of 5 seconds.
Finished warmup. Ran for 5.01101s.

No warnings encountered during test.

No errors encountered during test.
Device Device:0 processed:
  5000 batches of size 1
  Memcpy Calls: 0
  PerSampleCudaMemcpy Calls: 0
  BatchedCudaMemcpy Calls: 5000
&&&& PASSED Default_Harness # ./build/bin/harness_default
[2020-06-02 10:49:15,585 main.py:153 INFO] Result: Cannot find performance result. Maybe you are running in AccuracyOnly mode.
[2020-06-02 10:49:15,591 __init__.py:42 INFO] Running command: python3 build/inference/v0.5/classification_and_detection/tools/accuracy-coco.py --mlperf-accuracy-file /work/mlperf/inference_results_v0.5/closed/NVIDIA/build/logs/2020.06.02-10.49.01/GeforceGTX1080Ti/ssd-small/SingleStream/mlperf_log_accuracy.json             --coco-dir /work/mlperf/inference_results_v0.5/closed/NVIDIA/build/preprocessed_data/coco --output-file build/ssd-small-results.json
loading annotations into memory...
Done (t=0.40s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.11s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=12.97s).
Accumulating evaluation results...
DONE (t=2.18s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.237
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.352
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.263
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.161
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.561
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.214
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.266
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.267
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.184
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.631
mAP=23.725%

======================= Perf harness results: =======================

GeforceGTX1080Ti-SingleStream:
    ssd-small: Cannot find performance result. Maybe you are running in AccuracyOnly mode.


======================= Accuracy results: =======================

GeforceGTX1080Ti-SingleStream:
    ssd-small: Accuracy = 23.725, Threshold = 21.780. Accuracy test PASSED.
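
When collecting results from several runs, the accuracy summary line can be parsed rather than read by eye. The following is a minimal sketch, not the harness's own logic: the regular expression is written against the single line shown above, and the pass rule (`accuracy >= threshold`) is an assumption that happens to agree with the logged verdict.

```python
import re

# Summary line copied from the accuracy results above.
summary_line = "ssd-small: Accuracy = 23.725, Threshold = 21.780. Accuracy test PASSED."

# Hypothetical pattern matching the format of the line above.
pattern = re.compile(
    r"(?P<benchmark>[\w-]+): Accuracy = (?P<accuracy>\d+(?:\.\d+)?), "
    r"Threshold = (?P<threshold>\d+(?:\.\d+)?)\. "
    r"Accuracy test (?P<verdict>\w+)\."
)

match = pattern.search(summary_line)
if match is None:
    raise ValueError("summary line does not match the expected format")

accuracy = float(match["accuracy"])
threshold = float(match["threshold"])

# Assumed pass rule: accuracy must meet or exceed the threshold.
passed = accuracy >= threshold
margin = round(accuracy - threshold, 3)

print(f"{match['benchmark']}: passed={passed}, margin={margin} mAP points")
```

For this run the measured mAP exceeds the threshold by roughly 1.9 points.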