Selecting Datetime MessagePack serialization schema - tsafin/tarantool GitHub Wiki
FMT_RAW_FULL is the fastest, but the least space-efficient, averaging 18 bytes per record. FMT_TNT_EPOCH_DATE is the most compact, averaging 5.9 bytes per record, but the slowest. FMT_RAW_NONZERO looks like a good combination: close to FMT_RAW_FULL in speed, but much more compact at an average of 8 bytes per record.
Code used for benchmarking has been committed to these branches:
- Tarantool - https://github.com/tsafin/tarantool/tree/tsafin/gh-6504-mp-datetime-format-benchmark
- TPC-H - https://github.com/tarantool/tpch/tree/tsafin/gen-data-of-tarantool-epoch
We will test several schemas for serializing datetime/timestamp data to a MessagePack stream:
| encode type | details |
|---|---|
| FMT_MP_FULL | All members are msgpack-encoded separately in MP_EXT data. |
| FMT_MP_NONZERO | Only non-zero members are msgpack-encoded in MP_EXT data. |
| FMT_RAW_FULL | The whole structure is copied directly to MP_EXT data.** |
| FMT_RAW_NONZERO | The non-zero part (at least the epoch field) is copied to MP_EXT data.** |
| FMT_TNT_EPOCH | Shift the epoch closer to the Tarantool epoch. |
| FMT_MP_DATE | Save the date and seconds parts separately. |
| FMT_TNT_EPOCH_DATE | Save the date separately, with a shift to the Tarantool epoch. |
Intuitively, we would expect any RAW method (which uses memcpy instead of MessagePack encoding) to be much faster than any MessagePack encoding. The question is: how big is the difference between them?
It is also expected that FMT_RAW_FULL (which memcpys the whole 128-bit structure) transfers the largest chunks of memory, because every other encoding tries to reduce the amount of transferred data. FMT_RAW_NONZERO uses the same memcpy approach, but does not save the fields beyond the epoch if they are all zero (i.e. nanoseconds, timezone offset, and timezone index are all undefined and equal to 0). If any of those fields is non-zero, FMT_RAW_NONZERO is equivalent to FMT_RAW_FULL.
** After internal discussion we agreed not to save the floating-point .epoch field as is, but rather to save it converted to an integer value. As the benchmark timings below show, this extra conversion is negligible and does not slow down encoding/decoding for the FMT_RAW_* schemas.
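The RAW approach above can be sketched roughly as follows. Field names, sizes, and the layout are assumptions for illustration (the real struct datetime lives in the Tarantool sources), and the MP_EXT header that frames the payload is omitted:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout only; not the committed Tarantool definition. */
struct datetime {
	double epoch;     /* seconds since Unix epoch */
	int32_t nsec;     /* nanoseconds */
	int16_t tzoffset; /* timezone offset, minutes */
	int16_t tzindex;  /* timezone index */
};

/*
 * FMT_RAW_NONZERO sketch: always copy the epoch (converted from double
 * to int64_t, per the footnote above), and copy the trailing fields
 * only when at least one of them is non-zero.
 * Returns the number of payload bytes written.
 */
static size_t
raw_nonzero_encode(char *buf, const struct datetime *d)
{
	int64_t epoch = (int64_t)d->epoch;
	memcpy(buf, &epoch, sizeof(epoch));
	if (d->nsec == 0 && d->tzoffset == 0 && d->tzindex == 0)
		return sizeof(epoch);              /* 8-byte payload */
	/* nsec + tzoffset + tzindex are contiguous: 4 + 2 + 2 bytes. */
	memcpy(buf + sizeof(epoch), &d->nsec, 8);
	return 16;                                 /* full payload */
}
```

FMT_RAW_FULL would simply skip the all-zero check and always emit the full 16-byte payload.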
Among the MessagePack encoding schemas, FMT_MP_FULL is the biggest and most straightforward (it saves all fields regardless of their values). FMT_MP_NONZERO tries to avoid saving unnecessary zeros, reducing the workload size.
FMT_TNT_EPOCH is a variant of FMT_MP_NONZERO with the trick of shifting the range of saved data from the Unix epoch to a newly introduced "Tarantool epoch", i.e. counting days not since 1970-01-01, but since 2011-01-01. In the ideal case this reduces the magnitude of the values saved to MessagePack, potentially producing a more compact dataset.
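The shift itself is trivial. A minimal sketch, assuming the shift constant of 1293840000 seconds (14975 days between 1970-01-01 and 2011-01-01), which is consistent with the worked example below:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between 1970-01-01 and 2011-01-01 (14975 days). */
#define TNT_EPOCH_SHIFT 1293840000LL

/* Shift a Unix epoch into the Tarantool-epoch range. */
static int64_t
to_tnt_epoch(int64_t unix_epoch)
{
	return unix_epoch - TNT_EPOCH_SHIFT;
}

/* Shift back on decode. */
static int64_t
to_unix_epoch(int64_t tnt_epoch)
{
	return tnt_epoch + TNT_EPOCH_SHIFT;
}
```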
FMT_MP_DATE is a variant of the FMT_MP_NONZERO schema where we split the epoch (number of seconds since the Unix epoch) into 2 separate fields: days since the epoch, and seconds since the start of the day. When no seconds are defined (i.e. the SQL DATE type instead of TIMESTAMP) the seconds field is 0, which allows us to compress the saved data further.
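The split can be sketched like this (floor semantics are an assumption here, chosen so that pre-1970 dates split into a valid non-negative seconds-of-day):

```c
#include <assert.h>
#include <stdint.h>

#define SECS_PER_DAY 86400

/*
 * Split a Unix epoch into (days since epoch, seconds within the day).
 * C integer division truncates toward zero, so negative epochs need a
 * correction step to keep the seconds part in [0, 86400).
 */
static void
split_epoch(int64_t epoch, int64_t *days, int64_t *secs)
{
	int64_t d = epoch / SECS_PER_DAY;
	int64_t s = epoch % SECS_PER_DAY;
	if (s < 0) {
		d -= 1;
		s += SECS_PER_DAY;
	}
	*days = d;
	*secs = s;
}
```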
FMT_TNT_EPOCH_DATE is a variant which combines both FMT_MP_DATE (separate days/seconds) and FMT_TNT_EPOCH (shifting dates to the Tarantool epoch). Algorithmically we expect its speed to be similar to FMT_MP_DATE, while further reducing the size of the saved workload.
Depending on the circumstances (e.g. for modern dates from the current or previous decade), this schema may significantly reduce the storage size.
As an example, here are the epoch values which need to be saved for the date 2021-09-30T00:00:00Z:

| encode type | saved values |
|---|---|
| FMT_MP_FULL, FMT_MP_NONZERO | epoch=1632960000 |
| FMT_TNT_EPOCH | epoch=339120000 |
| FMT_MP_DATE | day=18900, seconds=0 |
| FMT_TNT_EPOCH_DATE | day=3925, seconds=0 |
We used different methods to populate the data for the benchmark workloads:
| workload name | details |
|---|---|
| FULL_DATE | All members are non-zero. |
| EPOCH_ONLY | The epoch is non-zero, the rest of the members are zero. |
| MIXED_LOAD | 50/50 mix of the two above. |
| TPCH_1COLUMN | Use a single column of the lineitem table from the TPC-H dataset. |
| TPCH_ALLCOLUMNS | Use the whole lineitem table for estimations (not implemented). |
FULL_DATE, EPOCH_ONLY and MIXED_LOAD all generate data randomly spread around the base date 2021-10-15T08:26:51Z. EPOCH_ONLY records (and half of MIXED_LOAD records) fill only the epoch field, leaving the nanoseconds and timezone fields unset. FULL_DATE records (and the other half of MIXED_LOAD records) fill all known fields.
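A rough sketch of such a generator (the structure, field names, and ranges are assumptions for illustration; the actual benchmark code lives in the branches linked above):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct datetime_rec {    /* field names are assumptions */
	int64_t epoch;
	int32_t nsec;
	int16_t tzoffset;
	int16_t tzindex;
};

enum workload { FULL_DATE, EPOCH_ONLY, MIXED_LOAD };

/*
 * Spread epochs randomly within a year around the base date
 * 2021-10-15T08:26:51Z (epoch 1634286411) and fill the remaining
 * fields only for FULL_DATE records; MIXED_LOAD picks one of the
 * two modes with a 50/50 coin flip.
 */
static void
gen_record(struct datetime_rec *r, enum workload w)
{
	const int64_t base = 1634286411; /* 2021-10-15T08:26:51Z */
	int full = (w == FULL_DATE) ||
		   (w == MIXED_LOAD && rand() % 2 == 0);
	r->epoch = base + rand() % (365 * 86400) - 182 * 86400;
	r->nsec = full ? rand() % 1000000000 : 0;
	r->tzoffset = full ? (int16_t)(rand() % 1440 - 720) : 0;
	r->tzindex = full ? (int16_t)(rand() % 100) : 0;
}
```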
The TPCH_* modes are special: they use separately generated data from the TPC-H benchmark.
We modified the TPC-H dbgen generator so that it produces date values not from the
original range of 1992-01-01 .. 1998-12-31, but closer to current times, in the range
2011-01-01 .. 2021-12-31. tpch/dbgen generates data for a set of tables, the largest
of which is lineitem.tbl; it has several fields of the standard SQL type DATE
(date only, without seconds information). For the TPCH_1COLUMN mode
we selected the lineitem.l_receiptdate field, though any of the other date fields could
be used similarly.
The generated lineitem.tbl has a special format very similar to CSV. To preload
this table data into the benchmark (and to avoid a stage of parsing an external file) we
converted lineitem.tbl into a pair of C files, lineitem.h/lineitem.c, which
allow the constant data to be compiled directly into the benchmark executable. The command line
used to generate the C files is `luajit tbl2c.lua -N $(echo 8*1024|bc)`.
The idea was to implement an even more realistic mode, TPCH_ALLCOLUMNS, which would
use the same TPC-H generated data to estimate the size of the whole workload once it is
serialized to MessagePack (i.e. the full size of the lineitem table, with all columns
serialized to their MessagePack representation). This mode has not been implemented
at the moment.
The benchmark can check 3 different layouts for storing struct datetime:

| layout | details |
|---|---|
| dbl_epoch | The epoch field is a double. |
| int_epoch | The epoch field is an int64_t. |
| reordered | Like int_epoch, but the fields are laid out in reverse order (from the smallest 16-bit field to the 64-bit field). |
It is worth noting that dbl_epoch is the layout currently used in master, introduced
with the recent commit of the datetime subsystem. It was selected after prior performance
evaluations described here: https://gist.github.com/tsafin/618fbf847d258f6e7f5a75fdf9ea945b
In the current code in the branch we left only the dbl_epoch and int_epoch layouts active, because the reordered layout makes little sense if we want a proper comparison of universal time. After normalization to UTC, times should stay comparable without any timezone considerations, i.e. 2021-09-01T03:00:00+0300 is equal to 2021-09-01T00:00:00Z, because they both point to the same moment in time.
Evaluations were done on the dev4 server machine with an 80-thread Intel Xeon(R) Gold 6230 CPU running at a 2.10GHz base frequency. https://ark.intel.com/content/www/us/en/ark/products/192437/intel-xeon-gold-6230-processor-27-5m-cache-2-10-ghz.html
The final results were gathered using a clang-11 executable recompiled with AVX2 enabled. To make this mode possible we extended AVX2 support in the Tarantool build infrastructure. It is worth noting, though, that the timings for AVX- and AVX2-generated code did not differ much, and we consider them equivalent from this point of view.
The full results, compiled without AVX2 and without any filtering, are collapsed here; click if you want to see them all...
[t.safin@dev4 build]$ ./perf/mp_datetime.perftest
setting up benchmark data
2021-11-06T00:57:31+03:00
Running ./perf/mp_datetime.perftest
Run on (80 X 3900 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x40)
L1 Instruction 32 KiB (x40)
L2 Unified 1024 KiB (x40)
L3 Unified 28160 KiB (x2)
Load Average: 0.40, 0.26, 0.15
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
---------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------------
bench_encode<dbl_epoch, FMT_MP_FULL, FULL_DATE> 15.7 ns 15.7 ns 44476716 avg_size=17.6
bench_decode_search<dbl_epoch, FMT_MP_FULL, FULL_DATE> 253 ns 253 ns 2761097 items_per_second=45.4364M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, FULL_DATE> 17.7 ns 17.7 ns 39315711 avg_size=17.6
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, FULL_DATE> 374 ns 374 ns 1870821 items_per_second=30.7233M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, FULL_DATE> 17.9 ns 17.9 ns 39254930 avg_size=17.6
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, FULL_DATE> 374 ns 374 ns 1872966 items_per_second=30.7336M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE> 22.3 ns 22.3 ns 31412096 avg_size=18.7
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE> 426 ns 426 ns 1641223 items_per_second=26.9942M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, FULL_DATE> 2.29 ns 2.29 ns 306932966 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, FULL_DATE> 67.9 ns 67.9 ns 10277182 items_per_second=169.352M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, FULL_DATE> 1.98 ns 1.98 ns 355023497 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, FULL_DATE> 111 ns 111 ns 6317831 items_per_second=103.882M/s
bench_encode<dbl_epoch, FMT_MP_DATE, FULL_DATE> 22.0 ns 22.0 ns 31512234 avg_size=18.7
bench_decode_search<dbl_epoch, FMT_MP_DATE, FULL_DATE> 419 ns 419 ns 1671252 items_per_second=27.4724M/s
bench_encode<dbl_epoch, FMT_MP_FULL, EPOCH_ONLY> 8.21 ns 8.21 ns 85634451 avg_size=10
bench_decode_search<dbl_epoch, FMT_MP_FULL, EPOCH_ONLY> 130 ns 130 ns 5357802 items_per_second=61.4584M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, EPOCH_ONLY> 8.42 ns 8.42 ns 83035716 avg_size=8
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, EPOCH_ONLY> 101 ns 101 ns 7015101 items_per_second=79.7139M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, EPOCH_ONLY> 8.16 ns 8.16 ns 86286738 avg_size=8
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, EPOCH_ONLY> 103 ns 103 ns 6786081 items_per_second=77.6622M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY> 11.2 ns 11.2 ns 62778069 avg_size=9
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY> 154 ns 154 ns 4533742 items_per_second=51.9143M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, EPOCH_ONLY> 2.31 ns 2.31 ns 303906922 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, EPOCH_ONLY> 37.2 ns 37.2 ns 18823338 items_per_second=215.326M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, EPOCH_ONLY> 1.79 ns 1.79 ns 391059971 avg_size=10
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, EPOCH_ONLY> 61.0 ns 61.0 ns 11500758 items_per_second=131.3M/s
bench_encode<dbl_epoch, FMT_MP_DATE, EPOCH_ONLY> 11.2 ns 11.2 ns 62483403 avg_size=9
bench_decode_search<dbl_epoch, FMT_MP_DATE, EPOCH_ONLY> 150 ns 150 ns 4645250 items_per_second=53.2943M/s
bench_encode<dbl_epoch, FMT_MP_FULL, MIXED_LOAD> 15.2 ns 15.2 ns 45936540 avg_size=13.8
bench_decode_search<dbl_epoch, FMT_MP_FULL, MIXED_LOAD> 208 ns 208 ns 3441915 items_per_second=49.0672M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, MIXED_LOAD> 16.3 ns 16.3 ns 41445541 avg_size=12.8
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, MIXED_LOAD> 229 ns 229 ns 3065369 items_per_second=44.6993M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, MIXED_LOAD> 16.1 ns 16.1 ns 43448816 avg_size=12.8
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, MIXED_LOAD> 231 ns 231 ns 3039471 items_per_second=44.339M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD> 20.5 ns 20.5 ns 34189905 avg_size=13.9
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD> 290 ns 290 ns 2414425 items_per_second=35.259M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, MIXED_LOAD> 2.03 ns 2.03 ns 345211466 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, MIXED_LOAD> 56.0 ns 56.0 ns 12523866 items_per_second=182.607M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, MIXED_LOAD> 3.53 ns 3.53 ns 197709309 avg_size=14
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, MIXED_LOAD> 97.7 ns 97.7 ns 7166848 items_per_second=104.667M/s
bench_encode<dbl_epoch, FMT_MP_DATE, MIXED_LOAD> 20.1 ns 20.1 ns 34571177 avg_size=13.9
bench_decode_search<dbl_epoch, FMT_MP_DATE, MIXED_LOAD> 298 ns 298 ns 2333376 items_per_second=34.2835M/s
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN> 8.33 ns 8.33 ns 85155523 avg_size=10
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN> 187 ns 187 ns 3738862 items_per_second=56.2882M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN> 8.26 ns 8.26 ns 84936235 avg_size=8
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN> 149 ns 149 ns 4778930 items_per_second=70.4746M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN> 7.50 ns 7.50 ns 88137561 avg_size=8
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN> 151 ns 151 ns 4648573 items_per_second=69.9514M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN> 8.38 ns 8.38 ns 84103405 avg_size=5.9
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN> 158 ns 158 ns 4436016 items_per_second=66.7102M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN> 2.03 ns 2.03 ns 344212514 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN> 55.6 ns 55.6 ns 12598979 items_per_second=189.521M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN> 1.81 ns 1.81 ns 388446226 avg_size=10
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN> 94.4 ns 94.4 ns 7418838 items_per_second=111.533M/s
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN> 7.29 ns 7.28 ns 95793763 avg_size=6
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN> 152 ns 152 ns 4619792 items_per_second=69.3505M/s
bench_encode<int_epoch, FMT_MP_FULL, FULL_DATE> 15.0 ns 15.0 ns 46764944 avg_size=17.6
bench_decode_search<int_epoch, FMT_MP_FULL, FULL_DATE> 253 ns 253 ns 2789979 items_per_second=45.5047M/s
bench_encode<int_epoch, FMT_MP_NONZERO, FULL_DATE> 16.9 ns 16.9 ns 41567984 avg_size=17.6
bench_decode_search<int_epoch, FMT_MP_NONZERO, FULL_DATE> 366 ns 365 ns 1918266 items_per_second=31.4699M/s
bench_encode<int_epoch, FMT_TNT_EPOCH, FULL_DATE> 18.3 ns 18.3 ns 38164868 avg_size=17.6
bench_decode_search<int_epoch, FMT_TNT_EPOCH, FULL_DATE> 350 ns 350 ns 1998179 items_per_second=32.8376M/s
bench_encode<int_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE> 21.0 ns 21.0 ns 33465827 avg_size=18.7
bench_decode_search<int_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE> 411 ns 411 ns 1703374 items_per_second=27.9582M/s
bench_encode<int_epoch, FMT_RAW_FULL, FULL_DATE> 2.02 ns 2.02 ns 345645352 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_FULL, FULL_DATE> 71.7 ns 71.6 ns 9765716 items_per_second=160.532M/s
bench_encode<int_epoch, FMT_RAW_NONZERO, FULL_DATE> 2.26 ns 2.25 ns 310268221 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_NONZERO, FULL_DATE> 104 ns 104 ns 6791739 items_per_second=111.048M/s
bench_encode<int_epoch, FMT_MP_DATE, FULL_DATE> 20.6 ns 20.6 ns 33933192 avg_size=18.7
bench_decode_search<int_epoch, FMT_MP_DATE, FULL_DATE> 404 ns 404 ns 1733167 items_per_second=28.4836M/s
bench_encode<int_epoch, FMT_MP_FULL, EPOCH_ONLY> 6.58 ns 6.58 ns 106018986 avg_size=10
bench_decode_search<int_epoch, FMT_MP_FULL, EPOCH_ONLY> 121 ns 121 ns 5808449 items_per_second=66.7467M/s
bench_encode<int_epoch, FMT_MP_NONZERO, EPOCH_ONLY> 7.22 ns 7.22 ns 96959124 avg_size=8
bench_decode_search<int_epoch, FMT_MP_NONZERO, EPOCH_ONLY> 94.4 ns 94.4 ns 7417165 items_per_second=85.2356M/s
bench_encode<int_epoch, FMT_TNT_EPOCH, EPOCH_ONLY> 8.15 ns 8.15 ns 85794147 avg_size=8
bench_decode_search<int_epoch, FMT_TNT_EPOCH, EPOCH_ONLY> 96.0 ns 96.0 ns 7266161 items_per_second=83.812M/s
bench_encode<int_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY> 10.7 ns 10.7 ns 65522497 avg_size=9
bench_decode_search<int_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY> 138 ns 138 ns 5075261 items_per_second=58.2439M/s
bench_encode<int_epoch, FMT_RAW_FULL, EPOCH_ONLY> 2.03 ns 2.03 ns 345321282 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_FULL, EPOCH_ONLY> 31.6 ns 31.6 ns 22191770 items_per_second=255.107M/s
bench_encode<int_epoch, FMT_RAW_NONZERO, EPOCH_ONLY> 1.78 ns 1.78 ns 392548606 avg_size=10
bench_decode_search<int_epoch, FMT_RAW_NONZERO, EPOCH_ONLY> 62.4 ns 62.4 ns 11222765 items_per_second=128.924M/s
bench_encode<int_epoch, FMT_MP_DATE, EPOCH_ONLY> 10.8 ns 10.8 ns 63654828 avg_size=9
bench_decode_search<int_epoch, FMT_MP_DATE, EPOCH_ONLY> 143 ns 143 ns 4922721 items_per_second=56.399M/s
bench_encode<int_epoch, FMT_MP_FULL, MIXED_LOAD> 14.3 ns 14.3 ns 48881627 avg_size=13.8
bench_decode_search<int_epoch, FMT_MP_FULL, MIXED_LOAD> 201 ns 201 ns 3490607 items_per_second=50.886M/s
bench_encode<int_epoch, FMT_MP_NONZERO, MIXED_LOAD> 15.8 ns 15.8 ns 44390237 avg_size=12.7
bench_decode_search<int_epoch, FMT_MP_NONZERO, MIXED_LOAD> 227 ns 227 ns 3079141 items_per_second=44.9125M/s
bench_encode<int_epoch, FMT_TNT_EPOCH, MIXED_LOAD> 16.1 ns 16.1 ns 43416830 avg_size=12.7
bench_decode_search<int_epoch, FMT_TNT_EPOCH, MIXED_LOAD> 217 ns 217 ns 3226935 items_per_second=46.9933M/s
bench_encode<int_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD> 19.6 ns 19.6 ns 35903225 avg_size=13.8
bench_decode_search<int_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD> 279 ns 279 ns 2504486 items_per_second=36.5657M/s
bench_encode<int_epoch, FMT_RAW_FULL, MIXED_LOAD> 2.03 ns 2.03 ns 345363495 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_FULL, MIXED_LOAD> 54.3 ns 54.3 ns 12928423 items_per_second=188.013M/s
bench_encode<int_epoch, FMT_RAW_NONZERO, MIXED_LOAD> 3.61 ns 3.61 ns 193174040 avg_size=14
bench_decode_search<int_epoch, FMT_RAW_NONZERO, MIXED_LOAD> 90.7 ns 90.6 ns 7730499 items_per_second=112.563M/s
bench_encode<int_epoch, FMT_MP_DATE, MIXED_LOAD> 18.9 ns 18.9 ns 37032935 avg_size=13.8
bench_decode_search<int_epoch, FMT_MP_DATE, MIXED_LOAD> 281 ns 281 ns 2498602 items_per_second=36.3592M/s
But... taking into account that most of the workloads used are purely synthetic and
have nothing in common with real-life data, and that among them only the TPC-H generated
workload can claim any relevance to real-life data, we use only the benchmarks
matching the TPCH_1COLUMN pattern for the final selection:
`./perf/mp_datetime.perftest --benchmark_filter=TPCH_1COLUMN --benchmark_repetitions=5 --benchmark_report_aggregates_only=true`
Full log collapsed here...
[t.safin@dev4 build]$ ./perf/mp_datetime.perftest --benchmark_filter=TPCH_1COLUMN --benchmark_repetitions=5 --benchmark_report_aggregates_only=true --benchmark_out=mp_datetime_perf.csv --benchmark_out_format=csv
setting up benchmark data
2021-11-08T01:43:27+03:00
Running ./perf/mp_datetime.perftest
Run on (80 X 3900 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x40)
L1 Instruction 32 KiB (x40)
L2 Unified 1024 KiB (x40)
L3 Unified 28160 KiB (x2)
Load Average: 0.62, 0.41, 0.20
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------------------------------------
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean 6.98 ns 6.98 ns 5 avg_size=10
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_median 6.97 ns 6.97 ns 5 avg_size=10
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_stddev 0.016 ns 0.016 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean 130 ns 130 ns 5 items_per_second=81.2925M/s
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_median 130 ns 130 ns 5 items_per_second=81.2974M/s
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_stddev 0.160 ns 0.160 ns 5 items_per_second=100.724k/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean 5.68 ns 5.68 ns 5 avg_size=8
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_median 5.67 ns 5.67 ns 5 avg_size=8
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_stddev 0.011 ns 0.011 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean 125 ns 125 ns 5 items_per_second=84.4458M/s
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_median 125 ns 125 ns 5 items_per_second=84.3785M/s
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_stddev 0.238 ns 0.237 ns 5 items_per_second=160.792k/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean 7.42 ns 7.42 ns 5 avg_size=8
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_median 7.42 ns 7.42 ns 5 avg_size=8
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_stddev 0.017 ns 0.017 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean 131 ns 131 ns 5 items_per_second=80.2188M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_median 131 ns 131 ns 5 items_per_second=80.2567M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_stddev 0.155 ns 0.155 ns 5 items_per_second=94.5911k/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean 6.41 ns 6.41 ns 5 avg_size=5.9
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_median 6.41 ns 6.41 ns 5 avg_size=5.9
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_stddev 0.002 ns 0.002 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean 146 ns 146 ns 5 items_per_second=72.017M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_median 146 ns 146 ns 5 items_per_second=72.0416M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_stddev 0.091 ns 0.091 ns 5 items_per_second=44.8115k/s
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean 2.50 ns 2.50 ns 5 avg_size=18
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_median 2.50 ns 2.50 ns 5 avg_size=18
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_stddev 0.035 ns 0.035 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean 52.0 ns 52.0 ns 5 items_per_second=202.401M/s
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_median 52.5 ns 52.5 ns 5 items_per_second=200.556M/s
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_stddev 1.06 ns 1.06 ns 5 items_per_second=4.21401M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean 2.49 ns 2.49 ns 5 avg_size=10
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_median 2.49 ns 2.49 ns 5 avg_size=10
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_stddev 0.006 ns 0.006 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean 70.9 ns 70.9 ns 5 items_per_second=169.287M/s
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_median 70.9 ns 70.9 ns 5 items_per_second=169.269M/s
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_stddev 0.341 ns 0.341 ns 5 items_per_second=816.492k/s
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean 5.99 ns 5.99 ns 5 avg_size=6
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_median 5.96 ns 5.96 ns 5 avg_size=6
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_stddev 0.057 ns 0.057 ns 5 avg_size=0
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean 135 ns 135 ns 5 items_per_second=78.0833M/s
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_median 135 ns 135 ns 5 items_per_second=78.0751M/s
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_stddev 0.091 ns 0.091 ns 5 items_per_second=52.6062k/s
Putting this together in a table, we see:
| name | iterations | real_time | cpu_time | time_unit | items_per_second | avg_size |
|---|---|---|---|---|---|---|
| bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean | 5 | 6.97 | 6.98 | ns | | 10.0 |
| bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean | 5 | 129.55 | 129.54 | ns | 81 292 500 | |
| bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean | 5 | 5.68 | 5.68 | ns | | 8.0 |
| bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean | 5 | 124.71 | 124.70 | ns | 84 445 800 | |
| bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean | 5 | 7.42 | 7.42 | ns | | 8.0 |
| bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean | 5 | 131.28 | 131.27 | ns | 80 218 800 | |
| bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean | 5 | 6.41 | 6.41 | ns | | 5.9 |
| bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean | 5 | 146.23 | 146.22 | ns | 72 017 000 | |
| bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean | 5 | 2.50 | 2.50 | ns | | 18.0 |
| bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean | 5 | 52.05 | 52.05 | ns | 202 401 000 | |
| bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean | 5 | 2.49 | 2.49 | ns | | 10.0 |
| bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean | 5 | 70.89 | 70.89 | ns | 169 287 000 | |
| bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean | 5 | 5.99 | 5.99 | ns | | 6.0 |
| bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean | 5 | 134.87 | 134.86 | ns | 78 083 300 | |
So, if we want to select the fastest methods to decode datetime:
| method | decode speed (items/sec) | avg size |
|---|---|---|
| FMT_RAW_FULL | 202M | 18.0 |
| FMT_RAW_NONZERO | 169M | 10.0 |
But if we want the most compact storage (which, for larger workloads, may eventually become faster):
| method | decode speed (items/sec) | avg size |
|---|---|---|
| FMT_TNT_EPOCH_DATE | 72M | 5.9 |
| FMT_MP_DATE | 78M | 6.0 |