Xavier - wom-ai/inference_results_v0.5 GitHub Wiki

!!Notice!!

Check the current power mode status (the trailing 0 in the output below is the index of the currently active mode):
$ sudo nvpmodel -q
NV Fan Mode:quiet
NV Power Mode: MAXN
0

Contents

Overview

  • QPS (Queries Per Second)

  • Power Mode

    • Modes

      | Mode index | 0    | 1        | 2        | 3            | 4              | 5              | 6              |
      |------------|------|----------|----------|--------------|----------------|----------------|----------------|
      | Mode name  | MAXN | MODE_10W | MODE_15W | MODE_30W_ALL | MODE_30W_6CORE | MODE_30W_4CORE | MODE_30W_2CORE |
    • Command (the script sketched at the end of this section sweeps over these modes)

      sudo nvpmodel -m 0

      |                                | MAXN (8-core) | MODE_10W (2-core) | MODE_15W (4-core) | MODE_30W_6CORE | MODE_30W_2CORE |
      |--------------------------------|---------------|-------------------|-------------------|----------------|----------------|
      | QPS w/ loadgen overhead        | 662.56        | 211.59            | 368.99            | 473.68         | 473.93         |
      | QPS w/o loadgen overhead       | 669.99        | 212.80            | 371.46            | 477.80         | 477.82         |
      | Memory Maximal Frequency (MHz) | 2133          | 1066              | 1333              | 1600           | 1600           |
  • The graphical environment (e.g. the GNOME desktop) does not affect the performance.

    $ systemctl set-default multi-user.target   # default to text (console) mode on next boot
    $ systemctl set-default graphical.target    # default to graphical mode on next boot


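The per-mode QPS figures in the table above can be reproduced by switching power modes and re-running the harness. Below is a minimal sketch of such a sweep; it is not the script actually used for these measurements, and it assumes `nvpmodel` and the repository's `run_harness.sh` are available as shown elsewhere on this page.

```python
#!/usr/bin/env python3
"""Sketch: re-run the ssd-small SingleStream harness under each measured power mode."""
import subprocess

# Mode indices taken from the table above:
# 0=MAXN, 1=MODE_10W, 2=MODE_15W, 4=MODE_30W_6CORE, 6=MODE_30W_2CORE
MODES = [0, 1, 2, 4, 6]

for mode in MODES:
    subprocess.run(["sudo", "nvpmodel", "-m", str(mode)], check=True)  # switch power mode
    subprocess.run(["sudo", "nvpmodel", "-q"], check=True)             # print the active mode for the record
    subprocess.run(["sh", "run_harness.sh"], check=True)               # run the harness under this mode
```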
2020-01-07

Performance Only (C++)

nvidia@nvidia:~/data/inference_results_v0.5/closed/NVIDIA (stereoboy)$ sh run_harness.sh 
---------------------
{'action': 'run_harness', 'benchmarks': 'ssd-small', 'configs': '', 'scenarios': 'SingleStream', 'no_gpu': False, 'gpu_only': False}
---------------------
[2020-01-07 21:43:39,689 main.py:294 INFO] Using config files: measurements/Xavier/ssd-small/SingleStream/config.json
[2020-01-07 21:43:39,690 __init__.py:144 INFO] Parsing config file measurements/Xavier/ssd-small/SingleStream/config.json ...
-------------------------
[{'benchmark': 'ssd-small', 'config_name': 'Xavier_ssd-small_SingleStream', 'scenario': 'SingleStream', 'ssd-small': {'gpu_batch_size': 1, 'gpu_single_stream_expected_latency_ns': 1621000, 'input_dtype': 'int8', 'input_format': 'chw4', 'map_path': 'data_maps/coco/val_map.txt', 'precision': 'int8', 'tensor_path': '${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4', 'use_graphs': False}, 'system_id': 'Xavier'}]
-------------------------
[2020-01-07 21:43:39,692 main.py:301 INFO] Processing config "Xavier_ssd-small_SingleStream"
[2020-01-07 21:43:39,692 main.py:111 INFO] Running harness for ssd-small benchmark in SingleStream scenario...
{'gpu_batch_size': 1, 'gpu_single_stream_expected_latency_ns': 1621000, 'input_dtype': 'int8', 'input_format': 'chw4', 'map_path': 'data_maps/coco/val_map.txt', 'precision': 'int8', 'tensor_path': '${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4', 'use_graphs': False, 'system_id': 'Xavier', 'scenario': 'SingleStream', 'benchmark': 'ssd-small', 'config_name': 'Xavier_ssd-small_SingleStream', 'test_mode': 'PerformanceOnly', 'log_dir': '/home/nvidia/data/inference_results_v0.5/closed/NVIDIA/build/logs/2020.01.07-21.43.39'}
[2020-01-07 21:43:39,701 __init__.py:42 INFO] Running command: ./build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so" --logfile_outdir="/home/nvidia/data/inference_results_v0.5/closed/NVIDIA/build/logs/2020.01.07-21.43.39/Xavier/ssd-small/SingleStream" --logfile_prefix="mlperf_log_" --test_mode="PerformanceOnly" --use_graphs=false --gpu_batch_size=1 --map_path="data_maps/coco/val_map.txt" --tensor_path="${PREPROCESSED_DATA_DIR}/coco/val2017/SSDMobileNet/int8_chw4" --gpu_engines="./build/engines/Xavier/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan" --performance_sample_count=256 --max_dlas=0 --single_stream_expected_latency_ns=1621000 --mlperf_conf_path="measurements/Xavier/ssd-small/SingleStream/mlperf.conf" --user_conf_path="measurements/Xavier/ssd-small/SingleStream/user.conf" --scenario SingleStream --model ssd-small --response_postprocess coco
&&&& RUNNING Default_Harness # ./build/bin/harness_default
[I] mlperf.conf path: measurements/Xavier/ssd-small/SingleStream/mlperf.conf
[I] user.conf path: measurements/Xavier/ssd-small/SingleStream/user.conf
[I] Device:0: ./build/engines/Xavier/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan has been successfully loaded.
[I] Creating batcher thread: 0 EnableBatcherThreadPerDevice: false
Starting warmup. Running for a minimum of 5 seconds.
Finished warmup. Ran for 5.01146s.
================================================
MLPerf Results Summary
================================================
SUT name : LWIS_Server
Scenario : Single Stream
Mode     : Performance
90th percentile latency (ns) : 1520229
Result is : VALID
  Min duration satisfied : Yes
  Min queries satisfied : Yes

================================================
Additional Stats
================================================
QPS w/ loadgen overhead         : 656.75
QPS w/o loadgen overhead        : 664.01

Min latency (ns)                : 1404448
Max latency (ns)                : 3408283
Mean latency (ns)               : 1506004
50.00 percentile latency (ns)   : 1478979
90.00 percentile latency (ns)   : 1520229
95.00 percentile latency (ns)   : 1789393
97.00 percentile latency (ns)   : 1825459
99.00 percentile latency (ns)   : 1877173
99.90 percentile latency (ns)   : 2119872

================================================
Test Parameters Used
================================================
samples_per_query : 1
target_qps : 616.903
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 60000
max_duration (ms): 0
min_query_count : 1024
max_query_count : 0
qsl_rng_seed : 3133965575612453542
sample_index_rng_seed : 665484352860916858
schedule_rng_seed : 3622009729038561421
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
print_timestamps : false
performance_issue_unique : false
performance_issue_same : false
performance_issue_same_index : 0
performance_sample_count : 256

No warnings encountered during test.

No errors encountered during test.
Device Device:0 processed:
  39406 batches of size 1
  Memcpy Calls: 0
  PerSampleCudaMemcpy Calls: 0
  BatchedCudaMemcpy Calls: 39406
&&&& PASSED Default_Harness # ./build/bin/harness_default
[2020-01-07 21:44:48,460 main.py:142 INFO] Result: 90th percentile latency (ns) : 1520229 and Result is : VALID

======================= Perf harness results: =======================

Xavier-SingleStream:
    ssd-small: 90th percentile latency (ns) : 1520229 and Result is : VALID


======================= Accuracy results: =======================

Xavier-SingleStream:
    ssd-small: No accuracy results in PerformanceOnly mode.
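
As a sanity check, the "QPS w/o loadgen overhead" figure above appears to be simply the reciprocal of the mean per-query latency, while the "w/ loadgen overhead" figure (656.75) also includes LoadGen's own bookkeeping between queries. A quick check with the numbers from the summary:

```python
# Sketch: relate the latency stats above to the reported QPS figure.
mean_latency_ns = 1506004                 # "Mean latency (ns)" from the summary above

qps_without_overhead = 1e9 / mean_latency_ns
print(f"{qps_without_overhead:.2f}")      # prints 664.01, matching "QPS w/o loadgen overhead"
```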

Inference Only (Python)

nvidia@nvidia:~/data/inference_results_v0.5/closed/NVIDIA (stereoboy)$ sh run_infer_xavier.sh 
Unable to init server: Could not connect: Connection refused
Unable to init server: Could not connect: Connection refused

(infer.py:22645): Gdk-CRITICAL **: 21:45:03.706: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed

(infer.py:22645): Gdk-CRITICAL **: 21:45:03.708: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed
[2020-01-07 21:45:03,723 infer.py:144 INFO] Running accuracy test...
[2020-01-07 21:45:03,724 infer.py:58 INFO] Running SSDMobileNet functionality test for engine [ ./build/engines/Xavier/ssd-small/SingleStream/ssd-small-SingleStream-gpu-b1-int8.plan ] with batch size 1
[TensorRT] VERBOSE: Plugin Creator registration succeeded - GridAnchor_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - NMS_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - Reorg_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - Region_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - Clip_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - LReLU_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - PriorBox_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - Normalize_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - RPROI_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - BatchedNMS_TRT
[TensorRT] VERBOSE: Plugin Creator registration succeeded - FlattenConcat_TRT
[TensorRT] VERBOSE: Deserialize required 2202132 microseconds.
[2020-01-07 21:45:06,713 runner.py:38 INFO] Binding Input
[2020-01-07 21:45:06,714 runner.py:38 INFO] Binding Postprocessor
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
[2020-01-07 21:45:07,661 infer.py:85 INFO] Running validation on 100 images. Please wait...
[2020-01-07 21:45:07,698 infer.py:95 INFO] Batch 0 >> Inference time:  0.020011
[2020-01-07 21:45:07,705 infer.py:95 INFO] Batch 1 >> Inference time:  0.002683
[2020-01-07 21:45:07,709 infer.py:95 INFO] Batch 2 >> Inference time:  0.002053
[2020-01-07 21:45:07,714 infer.py:95 INFO] Batch 3 >> Inference time:  0.001992
[2020-01-07 21:45:07,718 infer.py:95 INFO] Batch 4 >> Inference time:  0.001901
[2020-01-07 21:45:07,723 infer.py:95 INFO] Batch 5 >> Inference time:  0.001845
[2020-01-07 21:45:07,726 infer.py:95 INFO] Batch 6 >> Inference time:  0.001838
[2020-01-07 21:45:07,730 infer.py:95 INFO] Batch 7 >> Inference time:  0.001794
[2020-01-07 21:45:07,734 infer.py:95 INFO] Batch 8 >> Inference time:  0.001740
[2020-01-07 21:45:07,738 infer.py:95 INFO] Batch 9 >> Inference time:  0.001643

...

[2020-01-07 21:45:08,030 infer.py:95 INFO] Batch 90 >> Inference time:  0.001641
[2020-01-07 21:45:08,034 infer.py:95 INFO] Batch 91 >> Inference time:  0.001617
[2020-01-07 21:45:08,037 infer.py:95 INFO] Batch 92 >> Inference time:  0.001610
[2020-01-07 21:45:08,040 infer.py:95 INFO] Batch 93 >> Inference time:  0.001613
[2020-01-07 21:45:08,044 infer.py:95 INFO] Batch 94 >> Inference time:  0.001621
[2020-01-07 21:45:08,047 infer.py:95 INFO] Batch 95 >> Inference time:  0.001633
[2020-01-07 21:45:08,051 infer.py:95 INFO] Batch 96 >> Inference time:  0.001622
[2020-01-07 21:45:08,054 infer.py:95 INFO] Batch 97 >> Inference time:  0.001613
[2020-01-07 21:45:08,057 infer.py:95 INFO] Batch 98 >> Inference time:  0.001590
[2020-01-07 21:45:08,060 infer.py:95 INFO] Batch 99 >> Inference time:  0.001588
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.49s).
Accumulating evaluation results...
DONE (t=0.67s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.295
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.421
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.327
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.024
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.209
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.627
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.252
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.312
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.313
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.027
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.212
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.656
[2020-01-07 21:45:09,266 infer.py:139 INFO] Get mAP score = 0.295327 Target = 0.223860
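
The AP/AR table above is the standard pycocotools summary, and the reported mAP (0.295327) is its first entry (AP @ IoU=0.50:0.95). Below is a minimal sketch of that evaluation step, assuming the detections have already been written out in COCO results format; the file paths are placeholders, not the ones infer.py actually uses.

```python
# Sketch: COCO bbox evaluation with pycocotools; paths are placeholders.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_val2017.json")  # ground-truth annotations (placeholder path)
coco_dt = coco_gt.loadRes("detections.json")          # detections in COCO results format (placeholder path)

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()                                 # prints the AP/AR table shown above

map_score = evaluator.stats[0]                        # AP @ IoU=0.50:0.95 -- the headline mAP
print(f"mAP = {map_score:.6f}")
```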