Performance of GCC 11 vs GCC 10 - laurynas-biveinis/unodb GitHub Wiki

commit, GCC 10 vs 11.

  • micro_benchmark_key_prefix: 3% slowdown (unpredictable_prepend_key_prefix<unodb::db>) to 2% speedup (unpredictable_cut_key_prefix<unodb::db>)

--benchmark_filter=unodb::db:

  • micro_benchmark_n4: 3% slowdown (minimal_n4_sequential_insert<unodb::db>/16) to 3% speedup (full_n4_sequential_insert<unodb::db>/4096)
  • micro_benchmark_n16: 23% slowdown (n16_sequential_add<unodb::db>/64) to 3% speedup (full_n16_tree_full_scan<unodb::db>/64)
  • micro_benchmark_n48: 7% slowdown (full_n48_tree_random_delete<unodb::db>/192) to 3% speedup (full_n48_tree_full_scan<unodb::db>/4096)
  • micro_benchmark_n256: 2% slowdown (full_n256_tree_sequential_delete<unodb::db>/512) to 5% speedup (full_n256_tree_full_scan<unodb::db>/32768)

micro_benchmark_n16 24% slowdown, perf stat under GCC 10:

 Performance counter stats for './micro_benchmark_n16 --benchmark_filter=n16_sequential_add<unodb::db>/64 --benchmark_repetitions=9':

         13,337.62 msec task-clock                #    0.999 CPUs utilized          
                20      context-switches          #    0.001 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
             7,424      page-faults               #    0.557 K/sec                  
    33,382,560,983      cycles                    #    2.503 GHz                      (83.34%)
     7,738,722,831      stalled-cycles-frontend   #   23.18% frontend cycles idle     (83.33%)
     2,726,154,605      stalled-cycles-backend    #    8.17% backend cycles idle      (66.65%)
    80,896,266,696      instructions              #    2.42  insn per cycle         
                                                  #    0.10  stalled cycles per insn  (83.33%)
    13,921,969,896      branches                  # 1043.813 M/sec                    (83.35%)
        26,729,009      branch-misses             #    0.19% of all branches          (83.34%)

GCC 11:

         13,439.69 msec task-clock                #    0.999 CPUs utilized          
                29      context-switches          #    0.002 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
             7,334      page-faults               #    0.546 K/sec                  
    33,638,383,411      cycles                    #    2.503 GHz                      (83.34%)
     7,446,332,576      stalled-cycles-frontend   #   22.14% frontend cycles idle     (83.33%)
     2,691,713,094      stalled-cycles-backend    #    8.00% backend cycles idle      (66.67%)
    80,470,158,602      instructions              #    2.39  insn per cycle         
                                                  #    0.09  stalled cycles per insn  (83.33%)
    13,736,250,817      branches                  # 1022.066 M/sec                    (83.33%)
        32,313,073      branch-misses             #    0.24% of all branches          (83.32%)
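
The two counter dumps are easier to compare as relative changes. The following is a minimal sketch, not part of the original measurements: it copies a subset of the counters from the perf stat output above and computes the GCC 10 to GCC 11 change for each one. The dictionary keys and the helper name `relative_change_pct` are illustrative choices.

```python
# Counter values copied from the two perf stat dumps above
# (n16_sequential_add<unodb::db>/64, 9 repetitions).
gcc10 = {
    "task-clock-msec": 13337.62,
    "cycles": 33_382_560_983,
    "instructions": 80_896_266_696,
    "branches": 13_921_969_896,
    "branch-misses": 26_729_009,
}
gcc11 = {
    "task-clock-msec": 13439.69,
    "cycles": 33_638_383_411,
    "instructions": 80_470_158_602,
    "branches": 13_736_250_817,
    "branch-misses": 32_313_073,
}


def relative_change_pct(before, after):
    """Return the percentage change of each counter from `before` to `after`.

    A positive value means the counter is higher in `after`.
    """
    return {
        name: (after[name] - before[name]) / before[name] * 100.0
        for name in before
    }


deltas = relative_change_pct(gcc10, gcc11)

for name, pct in deltas.items():
    print(f"{name:>16}: {pct:+.2f}%")
```

On these numbers, task-clock, cycles, instructions and branches all differ by under 2%, while branch misses rise by roughly 21%. perf stat counts the whole process run, so these totals need not match the per-benchmark percentage reported by the benchmark itself.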

--benchmark_filter=unodb::olc_db:

  • micro_benchmark_n4: 3% slowdown (shrink_node16_to_n4_sequentially<unodb::olc_db>/25) to 4% speedup (full_n4_sequential_insert<unodb::olc_db>/100)
  • micro_benchmark_n16: 2% slowdown (minimal_n16_tree_full_scan<unodb::olc_db>/512) to 9% speedup (n16_sequential_add<unodb::olc_db>/64)
  • micro_benchmark_n48: 12% slowdown (n48_random_add<unodb::olc_db>/8) to 1% speedup (grow_n16_to_n48_sequentially<unodb::olc_db>/8)
  • micro_benchmark_n256: 3% slowdown (grow_n48_to_n256_randomly<unodb::olc_db>/8) to 6% speedup (full_n256_tree_random_delete<unodb::olc_db>/32768)