WiredTiger Performance Benchmarks
Introduction
This page shows results from running a scaled-down version of the RocksDB benchmarks described here: https://github.com/facebook/rocksdb/wiki/Performance-Benchmarks
The benchmark is based on the original LevelDB benchmark, but the operation counts are extended significantly.
The runs here use 50 million operations (and 200 million for a second set of results), instead of the billion used in the original RocksDB version (I am not patient, and my hardware is more constrained). I'm also using snappy compression instead of zlib.
Running on an AWS c3.8xlarge instance:
- 32 cores of Xeon E5-2680 CPU
- 60GB RAM
- 2x320GB SSD drives
General settings:
The benchmarks were run with:
- 50 and 200 million operations (unless otherwise stated)
- 1 GB cache size (RocksDB has several other caches as well)
- 800 byte values (quite large)
- 16 byte keys
- WiredTiger develop branch at Git commit: 6881a66651755ed0a46560fee9e49fd886d82edb
- RocksDB master branch at Git commit: 930cb0b9ee12c18eb461ef78748ed5b9bcf80d98
- WiredTiger RocksDB fork on wiredtiger branch at Git commit: 5a064e377f983e9b3335b4401fc848b0b23730b1
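For reproducibility, here is a minimal sketch of how the exact revisions above could be checked out. The repository URLs are assumed from the project names; build steps are omitted since they differ between the two engines.
# Assumed repository locations; checks out the exact commits listed above.
git clone https://github.com/wiredtiger/wiredtiger.git
(cd wiredtiger && git checkout 6881a66651755ed0a46560fee9e49fd886d82edb)
git clone https://github.com/facebook/rocksdb.git
(cd rocksdb && git checkout 930cb0b9ee12c18eb461ef78748ed5b9bcf80d98)
git clone https://github.com/wiredtiger/rocksdb.git rocksdb-wiredtiger
(cd rocksdb-wiredtiger && git checkout 5a064e377f983e9b3335b4401fc848b0b23730b1)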
Fill Sequential
50 million operations
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 386842 | 228285 |
micros/op | 2.73 | 4.38 |
Throughput (MB/s) | 285 | 177 |
99% Latency (micros) | 3.99 | 6.82 |
99.99% Latency (micros) | 38.67 | 30.98 |
Max Latency (micros) | 162095 | 213925 |
DB size (GB) | 29 | 40 |
200 million operations
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 349930 | 226305 |
micros/op | 2.86 | 4.41 |
Throughput (MB/s) | 272 | 176 |
99% Latency (micros) | 4.35 | 6.75 |
99.99% Latency (micros) | 39.4 | 34 |
Max Latency (micros) | 392138 | 214074 |
DB size (GB) | 115 | 159 |
Notes:
- This test artificially loads all the items into the final level of the RocksDB database, skipping all of the "normal" LSM merge overhead. WiredTiger, by contrast, is loading into the LSM tree and doing background merges.
- The test is also single threaded. WiredTiger should be able to increase the throughput with multiple threads if there is I/O capacity; a multi-threaded variant is sketched after these notes.
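A rough sketch of that multi-threaded variant, assuming the same flags as the WiredTiger fill sequential script in the Scripts section with only the thread count raised; this configuration was not measured here.
#!/bin/bash
# Untested sketch: WiredTiger fillseq with 4 threads instead of 1; all other flags match the fill sequential script later on this page.
r=50000000; t=4; vs=800; bs=4096; cs=1000000000; si=1000000
./db_bench_wiredtiger --benchmarks=fillseq --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --verify_checksum=1 --db=results --sync=0 --disable_wal=1 --compression_type=snappy --stats_interval=$si --disable_data_sync=0 --target_file_size_base=67108864 --stats_per_interval=1 --use_existing_db=0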
Fill Random
50 million operations
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 296399 | 229644 |
micros/op | 3.37 | 4.355 |
Throughput (MB/s) | 230 | 178 |
99% Latency (micros) | 10.6 | 6.77 |
99.99% Latency (micros) | 88 | 19.74 |
Max Latency (micros) | 34444 | 213805 |
DB size (GB) | 18 | 25 |
200 million operations
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 295439 | 227928 |
micros/op | 3.38 | 4.38 |
Throughput (MB/s) | 229 | 177 |
99% Latency (micros) | 10.6 | 6.5 |
99.99% Latency (micros) | 96 | 22 |
Max Latency (micros) | 204659 | 214027 |
DB size (GB) | 115 | 100 |
Notes:
- The database size reported for fill random is measured after allowing a compact operation to finish.
- The database populated by fill random is smaller than one populated by fill sequential because not all keys in the range are inserted (and some keys are inserted multiple times). For this reason, benchmarks that follow a populate phase should be based on a database generated via fill sequential followed by an overwrite (the overwrite avoids any tricks related to loading sequentially); a sketch of chaining those phases follows these notes.
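A minimal sketch of that suggested populate sequence, chaining the two phases in one run; db_bench accepts a comma-separated benchmark list, and the flags here are abbreviated from the WiredTiger scripts below.
#!/bin/bash
# Sketch only: sequential load followed immediately by a random overwrite pass.
r=50000000; t=1; vs=800; bs=4096; cs=1000000000
./db_bench_wiredtiger --benchmarks=fillseq,overwrite --mmap_read=0 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --db=results --sync=0 --disable_wal=1 --compression_type=snappy --use_existing_db=0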
Overwrite
50 million operations
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 301910 | 41720 |
micros/op | 3.31 | 23.97 |
Throughput (MB/s) | 234 | 32 |
99% Latency (micros) | 10.6 | 1047 |
99.99% Latency (micros) | 88 | 1199 |
Max Latency (micros) | 32596 | 3208136 |
DB size (GB) | 47 | 51 |
200 million operations
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 292151 | 17954 |
micros/op | 3.42 | 55.7 |
Throughput (MB/s) | 227 | 14 |
99% Latency (micros) | 10.7 | 1142 |
99.99% Latency (micros) | 108 | 1199 |
Max Latency (micros) | 1294245 | 6528779 |
DB size (GB) | 221 | 180 |
Notes:
- This is a more realistic test than the fill random and fill sequential tests: it equates to loading into an LSM tree that already contains some data.
Read Random
50 million items in the database; 32 threads, 500k ops per thread
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 151604 | 450157 |
micros/op | 6.59 | 2.22 |
99% Latency (micros) | 1302 | 190 |
99.99% Latency (micros) | 2852 | 1962 |
Max Latency (micros) | 19560 | 19970 |
DB size (GB) | 44 | 40 |
200 million items in the database; 32 threads, 500k ops per thread
Measurement | WiredTiger | RocksDB |
---|---|---|
Ops/sec | 69864 | 21365 |
micros/op | 14.31 | 46.8 |
99% Latency (micros) | 3023 | 3241 |
99.99% Latency (micros) | 18404 | 19930 |
Max Latency (micros) | 4960972 | 251701 |
DB size (GB) | 196 | 162 |
Notes:
- In the smaller test WiredTiger is slower than RocksDB because it is operating on a database with multiple levels, so reads often require more lookups to satisfy. Performance improves over time as the data set is read into the file system cache, so running the test for longer improves the WiredTiger numbers.
- The RocksDB version of this test is based on an LSM tree generated with a sequential load. All data has therefore been loaded into a single level of the LSM tree, and each search should only ever need a single lookup (plus a bloom filter lookup). WiredTiger is searching in a tree that has been allowed to settle, but still contains several levels (chunks).
- WiredTiger matches the RocksDB configuration of disabling mmap support. If mmap support is not disabled, WiredTiger is able to make much more effective use of the OS file system cache, especially when the working set is smaller than available memory; a variant invocation is sketched after these notes.
- The readrandom test produces larger databases than the other tests because compression is disabled in its scripts (compression_type=none).
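The mmap comparison mentioned above can be sketched by re-running the read phase of the WiredTiger Read Random script with --mmap_read=1; this is not the configuration that produced the numbers in the tables.
#!/bin/bash
# Untested sketch: identical to the WiredTiger readrandom invocation below, except --mmap_read=1.
r=50000000; reads=500000; t=32; vs=800; bs=4096; cs=1000000000; si=100000
./db_bench_wiredtiger --benchmarks=readrandom --mmap_read=1 --statistics=1 --histogram=1 --num=$r --reads=$reads --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --db=results --sync=0 --disable_wal=1 --compression_type=none --stats_interval=$si --disable_data_sync=0 --target_file_size_base=67108864 --stats_per_interval=1 --use_existing_db=1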
Scripts
The scripts used to generate the results follow.
WiredTiger
Fill Sequential:
#!/bin/bash
# RocksDB uses block_size 65536, WiredTiger 4096
# RocksDB cache size 10485760, WiredTiger 1GB (RocksDB has other caches)
RESULT_DIR=results
echo "Load 1B keys sequentially into database....."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=50000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench_wiredtiger --benchmarks=fillseq --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=0
du -s -k $RESULT_DIR
Fill Random:
#!/bin/bash
# NOTES: RocksDB disables compactions during fillrandom.
# RocksDB uses target_file_size_base 1073741824, WiredTiger 67108864
# RocksDB uses block_size 65536, WiredTiger 4096
# RocksDB cache size 10485760, WiredTiger 1GB (RocksDB has other caches)
RESULT_DIR=results
echo "Bulk load database ...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=2;ctrig=10000000; delay=10000000; stop=10000000; wbn=30; mbc=20; mb=67108864;wbs=268435456; dds=1; sync=0; r=50000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench_wiredtiger --benchmarks=fillrandom --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --cache_size=$cs --bloom_bits=10 --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=0
echo "Running manual compaction...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=2;ctrig=10000000; delay=10000000; stop=10000000; wbn=30; mbc=20; mb=67108864;wbs=268435456; dds=1; sync=0; r=50000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench_wiredtiger --benchmarks=compact --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=1
du -s -k $RESULT_DIR
Overwrite:
#!/bin/bash
# RocksDB uses block_size 65536, WiredTiger 8192
# RocksDB cache size 10485760, WiredTiger 1GB (RocksDB has other caches)
RESULT_DIR=results
echo "Overwriting the 1B keys in database in random order...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=50000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench_wiredtiger --benchmarks=overwrite --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=1
du -s -k $RESULT_DIR
Read Random:
#!/bin/bash
# RocksDB uses block_size 65536, WiredTiger 4096
# RocksDB cache size 10485760, WiredTiger 1GB (RocksDB has other caches)
RESULT_DIR=results
echo "Load 1B keys sequentially into database....."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=1; sync=0; r=50000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench_wiredtiger --benchmarks=fillseq --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=none --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=0
echo "Allowing populated database to settle."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=1000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench_wiredtiger --benchmarks=compact --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=none --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=1
echo "Reading 1B keys in database in random order...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=50000000; reads=500000; t=32; vs=800; bs=4096; cs=1000000000; of=500000; si=100000; ./db_bench_wiredtiger --benchmarks=readrandom --mmap_read=0 --statistics=1 --histogram=1 --num=$r --reads=$reads --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=none --stats_interval=$si --disable_data_sync=$dds --target_file_size_base=$mb --stats_per_interval=1 --use_existing_db=1
du -s -k $RESULT_DIR
RocksDB
These scripts are copied directly from the RocksDB benchmark page referenced above; the only changes are the operation count and the cache size (the original configured a 100MB cache).
Fill Sequential:
#!/bin/bash
RESULT_DIR=results
echo "Load 1B keys sequentially into database....."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=50000000; t=1; vs=800; bs=65536; cs=1000000000; of=500000; si=1000000; ./db_bench --benchmarks=fillseq --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --compression_ratio=50 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --use_existing_db=0
du -s -k $RESULT_DIR
Fill Random:
#!/bin/bash
RESULT_DIR=results
echo "Bulk load database into L0...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=2;ctrig=10000000; delay=10000000; stop=10000000; wbn=30; mbc=20; mb=1073741824;wbs=268435456; dds=1; sync=0; r=50000000; t=1; vs=800; bs=65536; cs=1000000000; of=500000; si=1000000; ./db_bench --benchmarks=fillrandom --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --compression_ratio=50 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --memtablerep=vector --use_existing_db=0 --disable_auto_compactions=1 --source_compaction_factor=10000000
echo "Running manual compaction to do a global sort map-reduce style...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=2;ctrig=10000000; delay=10000000; stop=10000000; wbn=30; mbc=20; mb=1073741824;wbs=268435456; dds=1; sync=0; r=50000000; t=1; vs=800; bs=65536; cs=1000000000; of=500000; si=1000000; ./db_bench --benchmarks=compact --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --compression_ratio=50 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --memtablerep=vector --use_existing_db=1 --disable_auto_compactions=1 --source_compaction_factor=10000000
du -s -k $RESULT_DIR
Overwrite:
#!/bin/bash
RESULT_DIR=results
echo "Overwriting the 1B keys in database in random order...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=50000000; t=1; vs=800; bs=65536; cs=1000000000; of=500000; si=1000000; ./db_bench --benchmarks=overwrite --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --compression_ratio=50 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --use_existing_db=1
du -s -k $RESULT_DIR
Read Random:
#!/bin/bash
RESULT_DIR=results
echo "Load 1B keys sequentially into database....."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=1; sync=0; r=50000000; t=1; vs=800; bs=4096; cs=1000000000; of=500000; si=1000000; ./db_bench --benchmarks=fillseq --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 --open_files=$of --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=none --stats_interval=$si --compression_ratio=50 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --use_existing_db=0
echo "Reading 1B keys in database in random order...."
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; dds=0; sync=0; r=50000000; reads=500000; t=32; vs=800; bs=4096; cs=1000000000; of=500000; si=100000; ./db_bench --benchmarks=readrandom --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --reads=$reads --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 --open_files=$of --verify_checksum=1 --db=$RESULT_DIR --sync=$sync --disable_wal=1 --compression_type=none --stats_interval=$si --compression_ratio=50 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --use_existing_db=1
du -s -k $RESULT_DIR
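The figures in the tables above come from db_bench's own output: the per-benchmark summary line reports micros/op and ops/sec, --histogram=1 adds latency percentiles, and the database size comes from the du at the end of each script. A rough post-processing sketch follows; the grep patterns assume db_bench's usual summary and histogram formatting and may need adjusting.
#!/bin/bash
# Hypothetical helper for pulling headline numbers out of a captured db_bench log.
run_log=${1:-fillseq.log}      # log captured by redirecting one of the scripts above
grep 'ops/sec' "$run_log"      # per-benchmark summary: micros/op, ops/sec, throughput
grep 'P99' "$run_log"          # percentile lines emitted by --histogram=1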