Concurrent ART flavor overhead in single-threaded workloads - laurynas-biveinis/unodb GitHub Wiki

How much overhead does concurrency management add to single-threaded workloads?

Optimistic lock coupling performs extra work at every step. How expensive is that work when the workload is single-threaded?
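Even uncontended, each of those extra actions is an atomic load or read-modify-write: a single-threaded reader still pays for two version loads per node visit (snapshot plus re-check), and a writer pays for a CAS plus a version bump. A hypothetical minimal sketch of such a version lock, for illustration only (not unodb's actual implementation, which carries more state such as obsolescence marking and restart handling):

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical minimal optimistic lock sketch. The version word's low bit
// means "write in progress"; the remaining bits form a counter that
// write_unlock() bumps, invalidating outstanding reader snapshots.
class optimistic_lock final {
 public:
  // A reader snapshots the version before touching the protected data.
  // Fails if a writer currently holds the lock.
  [[nodiscard]] bool try_read_lock(std::uint64_t &version) const noexcept {
    version = word.load(std::memory_order_acquire);
    return (version & 1U) == 0;
  }

  // After the speculative read, the reader re-checks the version. If it
  // changed, the data may have been torn and the read must be retried.
  [[nodiscard]] bool check(std::uint64_t version) const noexcept {
    return word.load(std::memory_order_acquire) == version;
  }

  void write_lock() noexcept {
    auto v = word.load(std::memory_order_relaxed);
    // Spin until the lock bit is clear and we manage to set it.
    while ((v & 1U) != 0 ||
           !word.compare_exchange_weak(v, v | 1U,
                                       std::memory_order_acquire))
      v = word.load(std::memory_order_relaxed);
  }

  void write_unlock() noexcept {
    // Adding one to an odd value clears the lock bit and carries into the
    // version counter in a single atomic operation.
    word.fetch_add(1, std::memory_order_release);
  }

 private:
  std::atomic<std::uint64_t> word{0};
};
```

In a single-threaded run try_read_lock() never fails and check() never observes a change, but the atomic loads and version bookkeeping still execute on every node traversal, and that is the cost the benchmarks below measure.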

2021-10-27, after an optimization round:

  • micro_benchmark_key_prefix: 246% (unpredictable_prepend_key_prefix) to 155% (unpredictable_get_shared_length)
  • micro_benchmark_n4: 226% (full_n4_sequential_delete/100) to 9% (n4_random_gets/65535)
  • micro_benchmark_n16: 378% (full_n16_tree_full_scan/64) to 9% (minimal_n16_tree_random_gets/16383)
  • micro_benchmark_n48: 260% (full_n48_tree_random_delete/512) to 11% (full_n48_tree_random_gets/131064)
  • micro_benchmark_n256: 668% (full_n256_tree_full_scan/128) to 11% (full_n256_tree_random_gets/131064)

2021-03-08, after the initial OLC commit:

```
Comparing unodb::db to unodb::olc_db (from ./micro_benchmark_key_prefix)
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
unpredictable_get_shared_length<[unodb::db vs. unodb::olc_db]>                    +1.5010         +1.4950             1             2             1             2
unpredictable_leaf_key_prefix_split<[unodb::db vs. unodb::olc_db]>                +1.8351         +1.8390            17            49            17            49
unpredictable_cut_key_prefix<[unodb::db vs. unodb::olc_db]>                       +1.8779         +1.8815            18            53            18            53
unpredictable_prepend_key_prefix<[unodb::db vs. unodb::olc_db]>                   +2.3361         +2.3415            19            63            19            63
```

What about the mutex version?

```
unpredictable_get_shared_length<[unodb::db vs. unodb::mutex_db]>_mean                      +0.3133         +0.3153             1             1             1             1
unpredictable_leaf_key_prefix_split<[unodb::db vs. unodb::mutex_db]>_mean                  +0.2164         +0.2167            18            22            18            22
unpredictable_cut_key_prefix<[unodb::db vs. unodb::mutex_db]>_mean                         +0.2127         +0.2124            19            23            19            23
unpredictable_prepend_key_prefix<[unodb::db vs. unodb::mutex_db]>_mean                     +0.1494         +0.1496            19            22            19            22
```
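For reference, compare.py's relative-difference column maps directly onto the overhead percentages quoted below: a reported +1.5010 means the new time is about 2.5x the old, i.e. roughly 150% overhead. As a one-liner (the function name is mine, not part of the benchmark tooling):

```cpp
// Overhead of the new (concurrent) variant relative to the old (plain)
// one, in percent. Google Benchmark's compare.py reports the same
// quantity as a fraction, e.g. +1.5010 for 150.1% overhead.
constexpr double overhead_pct(double time_old, double time_new) {
  return (time_new - time_old) / time_old * 100.0;
}
```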

So, for this benchmark the mutex version overhead is 15%-31% and the OLC overhead is 150%-230%. Across the node-type benchmarks:

  • Node4 ops: mutex 2%-35%; OLC 10% (large Node4 tree full scan) to 220% (small Node4 sequential delete).
  • Node16 ops: mutex 2%-80%; OLC 13% (small node random gets) to 440% (small full tree full scan).
  • Node48 ops: mutex ~0% (small full tree random gets) to 45% (small full tree full scan); OLC 10% (large full tree random gets) to 250% (small full tree full scan).
  • Node256 ops: mutex 2% (small minimal tree random gets) to 145% (small full tree full scan); OLC 10% (large minimal tree random gets) to 700% (small full tree full scan).