Effect of removing redundant nullptr checks in deleters - laurynas-biveinis/unodb GitHub Wiki

Baseline commit vs. patch commit, with benchmarks filtered with --benchmark_filter="(shrink|delete).*unodb::db":

  • micro_benchmark_node4: 2% slowdown (full_node4_to_minimal_sequential_delete<unodb::db>/65532) to 11% speedup (shrink_node16_to_node4_randomly<unodb::db>/64)
  • micro_benchmark_node16: 3% slowdown (full_node16_tree_random_delete<unodb::db>/32768) to 7% speedup (shrink_node48_to_node16_randomly<unodb::db>/16383)
  • micro_benchmark_node48: 3% slowdown (full_node48_tree_sequential_delete<unodb::db>/192) to 12% speedup (shrink_node256_to_node48_randomly<unodb::db>/64)
  • micro_benchmark_node256: 1% slowdown

perf stat for shrink_node256_to_node48_randomly<unodb::db>/64:

baseline:

$ perf stat ./micro_benchmark_node48 --benchmark_filter="shrink_node256_to_node48_randomly<unodb::db>/64"
2021-05-06T05:51:49+02:00
Running ./micro_benchmark_node48
Run on (8 X 3800 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 8192 KiB (x1)
Load Average: 0.25, 0.49, 0.51
----------------------------------------------------------------------------------------------------------
Benchmark                                                Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------------------
shrink_node256_to_node48_randomly<unodb::db>/64       14.2 us         14.0 us        49952 items_per_second=4.56595M/s size=475.906k

 Performance counter stats for './micro_benchmark_node48 --benchmark_filter=shrink_node256_to_node48_randomly<unodb::db>/64':

         18,349.19 msec task-clock                #    0.999 CPUs utilized          
               170      context-switches          #    0.009 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               344      page-faults               #    0.019 K/sec                  
    69,793,580,619      cycles                    #    3.804 GHz                      (83.33%)
    12,757,622,496      stalled-cycles-frontend   #   18.28% frontend cycles idle     (83.32%)
     3,666,650,080      stalled-cycles-backend    #    5.25% backend cycles idle      (66.68%)
   182,994,791,626      instructions              #    2.62  insn per cycle         
                                                  #    0.07  stalled cycles per insn  (83.35%)
    32,695,626,969      branches                  # 1781.856 M/sec                    (83.35%)
        71,736,753      branch-misses             #    0.22% of all branches          (83.32%)

      18.368685049 seconds time elapsed

      18.254159000 seconds user
       0.096011000 seconds sys

patch:

$ perf stat ./micro_benchmark_node48 --benchmark_filter="shrink_node256_to_node48_randomly<unodb::db>/64"
2021-05-06T05:52:20+02:00
Running ./micro_benchmark_node48
Run on (8 X 3800 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 8192 KiB (x1)
Load Average: 0.39, 0.50, 0.51
----------------------------------------------------------------------------------------------------------
Benchmark                                                Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------------------
shrink_node256_to_node48_randomly<unodb::db>/64       12.6 us         12.4 us        56269 items_per_second=5.14701M/s size=475.906k

 Performance counter stats for './micro_benchmark_node48 --benchmark_filter=shrink_node256_to_node48_randomly<unodb::db>/64':

         20,263.39 msec task-clock                #    0.999 CPUs utilized          
                42      context-switches          #    0.002 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               344      page-faults               #    0.017 K/sec                  
    77,066,687,284      cycles                    #    3.803 GHz                      (83.32%)
    13,001,846,782      stalled-cycles-frontend   #   16.87% frontend cycles idle     (83.34%)
     3,967,510,685      stalled-cycles-backend    #    5.15% backend cycles idle      (66.68%)
   201,972,140,130      instructions              #    2.62  insn per cycle         
                                                  #    0.06  stalled cycles per insn  (83.34%)
    36,100,564,556      branches                  # 1781.566 M/sec                    (83.34%)
        64,862,577      branch-misses             #    0.18% of all branches          (83.32%)

      20.281543370 seconds time elapsed

      20.187859000 seconds user
       0.076014000 seconds sys