Selecting Datetime MessagePack serialization schema - tsafin/tarantool GitHub Wiki

TL;DR

FMT_RAW_FULL is the fastest schema but the least space-efficient, at 18 bytes per record on average. FMT_TNT_EPOCH_DATE is the most compact, at 5.9 bytes per record on average, but is the slowest to decode. FMT_RAW_NONZERO looks like a good compromise: its speed is close to FMT_RAW_FULL, while it is much more compact, at 10 bytes per record on average on the TPC-H workload.

Introduction

Code used for benchmarking has been committed to these branches:

Encoding schemas

We will test several schemas for serializing datetime/timestamp data into a MessagePack stream:

| encode type | details |
|---|---|
| FMT_MP_FULL | All members are msgpack encoded separately in MP_EXT data. |
| FMT_MP_NONZERO | Some, non-zero members are mp encoded in MP_EXT data. |
| FMT_RAW_FULL | Whole structure is directly copied to MP_EXT data.** |
| FMT_RAW_NONZERO | Partial, nonzero part (epoch field) is copied to MP_EXT data.** |
| FMT_TNT_EPOCH | Shift epoch closer to Tarantool Epoch |
| FMT_MP_DATE | Save separately date and seconds parts |
| FMT_TNT_EPOCH_DATE | Save date separately, with shift to Tarantool epoch |

Intuitively, we would expect any *RAW* method (which uses memcpy instead of MessagePack encoding) to be much faster than any MessagePack encoding. The question is how large the difference between them is.

We also expect FMT_RAW_FULL (which does a memcpy of the whole 128-bit structure) to transfer the biggest chunks of memory, because every other encoding tries to reduce the amount of transferred data. FMT_RAW_NONZERO uses the same memcpy approach, but does not save the fields beyond epoch if they are all zero (i.e. nanoseconds, timezone offset and timezone index are all undefined and equal to 0). If any of those fields is non-zero, FMT_RAW_NONZERO is equivalent to FMT_RAW_FULL.
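The two raw schemas can be sketched as follows. This is only an illustration, not the benchmark code: the field widths (64-bit epoch, 32-bit nanoseconds, two 16-bit timezone fields) are inferred from the 128-bit structure described above, while the little-endian byte order, the `MP_DATETIME` type id and the function names are assumptions made for the example. The epoch is packed as an integer, in line with the ** note below.

```python
import struct

MP_DATETIME = 4  # assumed MP_EXT type id, used here for illustration only


def encode_raw_full(epoch, nsec=0, tzoffset=0, tzindex=0):
    """Copy the whole 16-byte structure into a MessagePack fixext 16 value."""
    # "<qihh": little-endian int64 epoch, int32 nsec, int16 tzoffset, int16 tzindex
    payload = struct.pack("<qihh", epoch, nsec, tzoffset, tzindex)
    return bytes([0xD8, MP_DATETIME]) + payload  # 0xd8 marks fixext 16


def encode_raw_nonzero(epoch, nsec=0, tzoffset=0, tzindex=0):
    """Copy only the epoch (fixext 8) when every other field is zero."""
    if nsec == 0 and tzoffset == 0 and tzindex == 0:
        return bytes([0xD7, MP_DATETIME]) + struct.pack("<q", epoch)  # 0xd7 marks fixext 8
    # Otherwise fall back to the full copy, as described in the text.
    return encode_raw_full(epoch, nsec, tzoffset, tzindex)
```

Under these assumptions a full copy takes 2 + 16 = 18 bytes and an epoch-only copy takes 2 + 8 = 10 bytes, which agrees with the avg_size=18 and avg_size=10 counters reported for these schemas on the EPOCH_ONLY workload.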

** After internal discussion we agreed not to save the floating-point .epoch field as is, but to save it converted to an integer value. As the benchmark timings below show, this extra conversion has a negligible effect on encoding/decoding speed for the FMT_RAW_* schemas.

Among all MessagePack encoding schemas, FMT_MP_FULL is the biggest and the most straightforward (it saves all fields, regardless of their values). FMT_MP_NONZERO avoids saving unnecessary zeros, reducing the workload size.

FMT_TNT_EPOCH is a variant of FMT_MP_NONZERO with the trick of moving the range of saved data from the Unix epoch to the newly introduced "Tarantool epoch", i.e. time is counted not from 1970-01-01 but from 2011-01-01. In the ideal case this reduces the values saved to MessagePack, potentially making the dataset more compact.

FMT_MP_DATE is a variant of the FMT_MP_NONZERO schema in which we split epoch (the number of seconds since the Unix epoch) into 2 separate fields: days since the epoch, and seconds since the start of the day. When seconds are not defined (i.e. SQL DATE type instead of TIMESTAMP), the seconds field is 0, which allows us to compress the saved data further.

FMT_TNT_EPOCH_DATE combines FMT_MP_DATE (separate days/seconds) and FMT_TNT_EPOCH (shifting dates to the Tarantool epoch). Algorithmically we expect its speed to be similar to FMT_MP_DATE, with a further reduction in the size of the saved workload. Depending on circumstances (e.g. for dates from the current or the last decade), this schema may significantly reduce the storage used.
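The three transformations described above come down to simple integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the only constant used is the Unix timestamp of 2011-01-01T00:00:00Z.

```python
TNT_EPOCH_SECS = 1293840000  # 2011-01-01T00:00:00Z expressed as Unix time
SECS_PER_DAY = 86400


def to_tnt_epoch(unix_secs):
    """FMT_TNT_EPOCH idea: count seconds from 2011-01-01 instead of 1970-01-01."""
    return unix_secs - TNT_EPOCH_SECS


def split_date(secs):
    """FMT_MP_DATE idea: split seconds into (days, seconds since start of day)."""
    return divmod(secs, SECS_PER_DAY)


def to_tnt_epoch_date(unix_secs):
    """FMT_TNT_EPOCH_DATE idea: shift to the Tarantool epoch, then split."""
    return split_date(to_tnt_epoch(unix_secs))
```

For the base date of the synthetic workloads, 2021-10-15T08:26:51Z (Unix time 1634286411), `split_date` gives day 18915 and 30411 seconds, while `to_tnt_epoch_date` gives day 3940 with the same seconds part.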

As an example, for the date 2021-09-30T00:00:00Z the values that need to be saved are:

| schema | saved values |
|---|---|
| FMT_MP_FULL, FMT_MP_NONZERO | epoch=1632960000 |
| FMT_TNT_EPOCH | epoch=339120000 |
| FMT_MP_DATE | day=18900, seconds=0 |
| FMT_TNT_EPOCH_DATE | day=3925, seconds=0 |
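To see what these values cost on the wire, the sketch below computes the size of a non-negative integer in the MessagePack int format family. It assumes the values are written as plain MessagePack unsigned integers (the exact integer type used inside MP_EXT is not stated here), and `mp_sizeof_uint` is a name chosen for this example.

```python
def mp_sizeof_uint(value):
    """Bytes needed to encode a non-negative integer per the MessagePack spec."""
    if value < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if value <= 0x7F:
        return 1  # positive fixint
    if value <= 0xFF:
        return 2  # uint 8: marker + 1 byte
    if value <= 0xFFFF:
        return 3  # uint 16: marker + 2 bytes
    if value <= 0xFFFFFFFF:
        return 5  # uint 32: marker + 4 bytes
    return 9      # uint 64: marker + 8 bytes


sizes = {v: mp_sizeof_uint(v) for v in (1632960000, 339120000, 18900, 3925, 0)}
```

Under that assumption both epoch values still need 5 bytes, so for this date the Tarantool-epoch shift alone does not shrink the encoding, whereas each day number fits in 3 bytes and a zero seconds field, if written at all, takes 1 byte.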

Data workloads

We used several methods to populate the data for the benchmark workloads.

| workload name | details |
|---|---|
| FULL_DATE | All members are non-zero. |
| EPOCH_ONLY | Epoch is non-zero, the rest members are zero. |
| MIXED_LOAD | 50/50 one of the above. |
| TPCH_1COLUMN | Use single column of lineitem table from TPC-H dataset |
| TPCH_ALLCOLUMNS | Use whole lineitem table for estimations (not implemented) |

FULL_DATE, EPOCH_ONLY and MIXED_LOAD all generate data randomly spread around the base date of 2021-10-15T08:26:51Z. EPOCH_ONLY and part of MIXED_LOAD fill (as expected) only the epoch field and do not populate the nanoseconds and timezone fields. FULL_DATE and the other part of MIXED_LOAD fill all known fields.
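A generator for the three synthetic workloads might look like the sketch below. It is not the benchmark's generator: the spread around the base date, the value ranges of the non-epoch fields and the `make_record` name are all assumptions made for illustration.

```python
import random
from datetime import datetime, timezone

# Base date named in the text, as Unix time.
BASE_EPOCH = int(datetime(2021, 10, 15, 8, 26, 51, tzinfo=timezone.utc).timestamp())
SPREAD_SECS = 365 * 86400  # assumed spread; the real range is not stated
# Assumed set of non-zero timezone offsets, in minutes.
TZ_OFFSETS = [m for m in range(-720, 841, 15) if m != 0]


def make_record(workload, rng):
    """Return (epoch, nsec, tzoffset, tzindex) for one synthetic record."""
    if workload not in ("FULL_DATE", "EPOCH_ONLY", "MIXED_LOAD"):
        raise ValueError(f"unknown workload: {workload}")
    epoch = BASE_EPOCH + rng.randint(-SPREAD_SECS, SPREAD_SECS)
    # MIXED_LOAD picks one of the two other modes with equal probability.
    full = workload == "FULL_DATE" or (workload == "MIXED_LOAD" and rng.random() < 0.5)
    if not full:
        return (epoch, 0, 0, 0)
    nsec = rng.randint(1, 999_999_999)
    tzoffset = rng.choice(TZ_OFFSETS)
    tzindex = rng.randint(1, 100)  # assumed range of timezone indexes
    return (epoch, nsec, tzoffset, tzindex)
```

Passing a seeded `random.Random` instance as `rng` keeps the generated data reproducible between runs.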

TPCH_* modes are special: they use separately generated data for the TPC-H benchmark. We modified the TPC-H dbgen generator so that it generates date values not from the original range of 1992-01-01 .. 1998-12-31, but from a range closer to the current time, 2011-01-01 .. 2021-12-31. tpch/dbgen generates data for a set of tables, and the largest one is lineitem.tbl, which has several fields of the standard SQL type DATE (only date, without seconds information). For the TPCH_1COLUMN mode we selected the lineitem.l_receiptdate field, though any of the other date fields could be used similarly.

The generated lineitem.tbl has a special format very similar to CSV. To preload this table into the benchmark (and to avoid the stage of parsing an external file) we converted lineitem.tbl to a pair of C files, lineitem.h/lineitem.c, which allows the constant data to be compiled directly into the benchmark executable. The command line used for generating the C files is luajit tbl2c.lua -N $(echo 8*1024|bc)
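As a rough illustration of the data involved (this is not the tbl2c.lua script), the sketch below extracts l_receiptdate from one line of a `.tbl` file and converts it to a Unix epoch. dbgen writes `|`-separated fields, and l_receiptdate is the 13th column of lineitem; the sample line and the `receiptdate_epoch` name are made up for the example.

```python
from datetime import date

L_RECEIPTDATE = 12  # zero-based position of l_receiptdate in a lineitem row
UNIX_EPOCH = date(1970, 1, 1)


def receiptdate_epoch(tbl_line):
    """Return l_receiptdate of one lineitem.tbl line as seconds since 1970-01-01."""
    fields = tbl_line.rstrip("\n").split("|")
    day = date.fromisoformat(fields[L_RECEIPTDATE])
    # A DATE carries no time of day, so the seconds part is always zero.
    return (day - UNIX_EPOCH).days * 86400


# Made-up sample row in the dbgen layout, with dates in the modified range.
sample = ("1|155190|7706|1|17|21168.23|0.04|0.02|N|O|"
          "2021-09-21|2021-08-23|2021-09-30|DELIVER IN PERSON|TRUCK|sample comment|")
epoch = receiptdate_epoch(sample)
```

Because every value is a whole number of days, a day/seconds split always leaves the seconds part at zero on this workload.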

The idea was to implement a mode even closer to reality, TPCH_ALLCOLUMNS, which would use the same TPC-H generated data to estimate the whole workload size once serialized to MessagePack (i.e. the full size of the lineitem table, including all columns in their MessagePack representation). We have not implemented this mode yet.

Different data structures used

The benchmark can check 3 different layouts of struct datetime:

| layout | details |
|---|---|
| dbl_epoch | epoch field is double |
| int_epoch | epoch field is int64_t |
| reordered | like int_epoch but layout is in reverse order (from smallest 16-bit field to 64-bit field) |

Note that dbl_epoch is the layout currently used in master, introduced with the recent commit of the datetime subsystem. It was selected after the prior performance evaluations described here - https://gist.github.com/tsafin/618fbf847d258f6e7f5a75fdf9ea945b

In the current code in the branch we left only the dbl_epoch and int_epoch layouts active, because the reordered layout does not make much sense if we want proper comparison of universal time. After normalization to UTC, times should stay comparable without any timezone considerations, i.e. 2021-09-01T03:00:00+0300 is equal to 2021-09-01T00:00:00Z, because they both point to the same moment in time.
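The equality of the two timestamps above can be checked with a few lines of Python. The `utc_epoch` helper is a name chosen for this example; it only relies on the standard `datetime` module.

```python
from datetime import datetime, timedelta, timezone


def utc_epoch(dt):
    """Seconds since the Unix epoch for a timezone-aware datetime."""
    return int(dt.timestamp())


# 2021-09-01T03:00:00+0300 and 2021-09-01T00:00:00Z
local = datetime(2021, 9, 1, 3, 0, 0, tzinfo=timezone(timedelta(hours=3)))
utc = datetime(2021, 9, 1, 0, 0, 0, tzinfo=timezone.utc)

same_moment = utc_epoch(local) == utc_epoch(utc)
```

Both values normalize to the same epoch, so comparing the epoch field first is enough to order two datetimes regardless of their timezone offsets.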

Benchmark results

Evaluations were done on the server machine dev4, which has an 80-thread Intel Xeon(R) Gold 6230 CPU running at a base frequency of 2.10GHz. https://ark.intel.com/content/www/us/en/ark/products/192437/intel-xeon-gold-6230-processor-27-5m-cache-2-10-ghz.html

Final results were gathered using an executable compiled with clang-11 with AVX2 enabled. To make this mode possible we extended AVX2 support in the Tarantool build infrastructure. Note, however, that the timings for the AVX and AVX2 code were not very different, and we consider them equivalent from this point of view.

Full results, compiled without AVX2 and without any filtering:
[t.safin@dev4 build]$ ./perf/mp_datetime.perftest
setting up benchmark data
2021-11-06T00:57:31+03:00
Running ./perf/mp_datetime.perftest
Run on (80 X 3900 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x40)
  L1 Instruction 32 KiB (x40)
  L2 Unified 1024 KiB (x40)
  L3 Unified 28160 KiB (x2)
Load Average: 0.40, 0.26, 0.15
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
---------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                 Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------------
bench_encode<dbl_epoch, FMT_MP_FULL, FULL_DATE>                        15.7 ns         15.7 ns     44476716 avg_size=17.6
bench_decode_search<dbl_epoch, FMT_MP_FULL, FULL_DATE>                  253 ns          253 ns      2761097 items_per_second=45.4364M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, FULL_DATE>                     17.7 ns         17.7 ns     39315711 avg_size=17.6
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, FULL_DATE>               374 ns          374 ns      1870821 items_per_second=30.7233M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, FULL_DATE>                      17.9 ns         17.9 ns     39254930 avg_size=17.6
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, FULL_DATE>                374 ns          374 ns      1872966 items_per_second=30.7336M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE>                 22.3 ns         22.3 ns     31412096 avg_size=18.7
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE>           426 ns          426 ns      1641223 items_per_second=26.9942M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, FULL_DATE>                       2.29 ns         2.29 ns    306932966 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, FULL_DATE>                67.9 ns         67.9 ns     10277182 items_per_second=169.352M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, FULL_DATE>                    1.98 ns         1.98 ns    355023497 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, FULL_DATE>              111 ns          111 ns      6317831 items_per_second=103.882M/s
bench_encode<dbl_epoch, FMT_MP_DATE, FULL_DATE>                        22.0 ns         22.0 ns     31512234 avg_size=18.7
bench_decode_search<dbl_epoch, FMT_MP_DATE, FULL_DATE>                  419 ns          419 ns      1671252 items_per_second=27.4724M/s
bench_encode<dbl_epoch, FMT_MP_FULL, EPOCH_ONLY>                       8.21 ns         8.21 ns     85634451 avg_size=10
bench_decode_search<dbl_epoch, FMT_MP_FULL, EPOCH_ONLY>                 130 ns          130 ns      5357802 items_per_second=61.4584M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, EPOCH_ONLY>                    8.42 ns         8.42 ns     83035716 avg_size=8
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, EPOCH_ONLY>              101 ns          101 ns      7015101 items_per_second=79.7139M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, EPOCH_ONLY>                     8.16 ns         8.16 ns     86286738 avg_size=8
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, EPOCH_ONLY>               103 ns          103 ns      6786081 items_per_second=77.6622M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY>                11.2 ns         11.2 ns     62778069 avg_size=9
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY>          154 ns          154 ns      4533742 items_per_second=51.9143M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, EPOCH_ONLY>                      2.31 ns         2.31 ns    303906922 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, EPOCH_ONLY>               37.2 ns         37.2 ns     18823338 items_per_second=215.326M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, EPOCH_ONLY>                   1.79 ns         1.79 ns    391059971 avg_size=10
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, EPOCH_ONLY>            61.0 ns         61.0 ns     11500758 items_per_second=131.3M/s
bench_encode<dbl_epoch, FMT_MP_DATE, EPOCH_ONLY>                       11.2 ns         11.2 ns     62483403 avg_size=9
bench_decode_search<dbl_epoch, FMT_MP_DATE, EPOCH_ONLY>                 150 ns          150 ns      4645250 items_per_second=53.2943M/s
bench_encode<dbl_epoch, FMT_MP_FULL, MIXED_LOAD>                       15.2 ns         15.2 ns     45936540 avg_size=13.8
bench_decode_search<dbl_epoch, FMT_MP_FULL, MIXED_LOAD>                 208 ns          208 ns      3441915 items_per_second=49.0672M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, MIXED_LOAD>                    16.3 ns         16.3 ns     41445541 avg_size=12.8
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, MIXED_LOAD>              229 ns          229 ns      3065369 items_per_second=44.6993M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, MIXED_LOAD>                     16.1 ns         16.1 ns     43448816 avg_size=12.8
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, MIXED_LOAD>               231 ns          231 ns      3039471 items_per_second=44.339M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD>                20.5 ns         20.5 ns     34189905 avg_size=13.9
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD>          290 ns          290 ns      2414425 items_per_second=35.259M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, MIXED_LOAD>                      2.03 ns         2.03 ns    345211466 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, MIXED_LOAD>               56.0 ns         56.0 ns     12523866 items_per_second=182.607M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, MIXED_LOAD>                   3.53 ns         3.53 ns    197709309 avg_size=14
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, MIXED_LOAD>            97.7 ns         97.7 ns      7166848 items_per_second=104.667M/s
bench_encode<dbl_epoch, FMT_MP_DATE, MIXED_LOAD>                       20.1 ns         20.1 ns     34571177 avg_size=13.9
bench_decode_search<dbl_epoch, FMT_MP_DATE, MIXED_LOAD>                 298 ns          298 ns      2333376 items_per_second=34.2835M/s
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>                     8.33 ns         8.33 ns     85155523 avg_size=10
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>               187 ns          187 ns      3738862 items_per_second=56.2882M/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>                  8.26 ns         8.26 ns     84936235 avg_size=8
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>            149 ns          149 ns      4778930 items_per_second=70.4746M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>                   7.50 ns         7.50 ns     88137561 avg_size=8
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>             151 ns          151 ns      4648573 items_per_second=69.9514M/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>              8.38 ns         8.38 ns     84103405 avg_size=5.9
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>        158 ns          158 ns      4436016 items_per_second=66.7102M/s
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>                    2.03 ns         2.03 ns    344212514 avg_size=18
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>             55.6 ns         55.6 ns     12598979 items_per_second=189.521M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>                 1.81 ns         1.81 ns    388446226 avg_size=10
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>          94.4 ns         94.4 ns      7418838 items_per_second=111.533M/s
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>                     7.29 ns         7.28 ns     95793763 avg_size=6
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>               152 ns          152 ns      4619792 items_per_second=69.3505M/s
bench_encode<int_epoch, FMT_MP_FULL, FULL_DATE>                        15.0 ns         15.0 ns     46764944 avg_size=17.6
bench_decode_search<int_epoch, FMT_MP_FULL, FULL_DATE>                  253 ns          253 ns      2789979 items_per_second=45.5047M/s
bench_encode<int_epoch, FMT_MP_NONZERO, FULL_DATE>                     16.9 ns         16.9 ns     41567984 avg_size=17.6
bench_decode_search<int_epoch, FMT_MP_NONZERO, FULL_DATE>               366 ns          365 ns      1918266 items_per_second=31.4699M/s
bench_encode<int_epoch, FMT_TNT_EPOCH, FULL_DATE>                      18.3 ns         18.3 ns     38164868 avg_size=17.6
bench_decode_search<int_epoch, FMT_TNT_EPOCH, FULL_DATE>                350 ns          350 ns      1998179 items_per_second=32.8376M/s
bench_encode<int_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE>                 21.0 ns         21.0 ns     33465827 avg_size=18.7
bench_decode_search<int_epoch, FMT_TNT_EPOCH_DATE, FULL_DATE>           411 ns          411 ns      1703374 items_per_second=27.9582M/s
bench_encode<int_epoch, FMT_RAW_FULL, FULL_DATE>                       2.02 ns         2.02 ns    345645352 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_FULL, FULL_DATE>                71.7 ns         71.6 ns      9765716 items_per_second=160.532M/s
bench_encode<int_epoch, FMT_RAW_NONZERO, FULL_DATE>                    2.26 ns         2.25 ns    310268221 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_NONZERO, FULL_DATE>              104 ns          104 ns      6791739 items_per_second=111.048M/s
bench_encode<int_epoch, FMT_MP_DATE, FULL_DATE>                        20.6 ns         20.6 ns     33933192 avg_size=18.7
bench_decode_search<int_epoch, FMT_MP_DATE, FULL_DATE>                  404 ns          404 ns      1733167 items_per_second=28.4836M/s
bench_encode<int_epoch, FMT_MP_FULL, EPOCH_ONLY>                       6.58 ns         6.58 ns    106018986 avg_size=10
bench_decode_search<int_epoch, FMT_MP_FULL, EPOCH_ONLY>                 121 ns          121 ns      5808449 items_per_second=66.7467M/s
bench_encode<int_epoch, FMT_MP_NONZERO, EPOCH_ONLY>                    7.22 ns         7.22 ns     96959124 avg_size=8
bench_decode_search<int_epoch, FMT_MP_NONZERO, EPOCH_ONLY>             94.4 ns         94.4 ns      7417165 items_per_second=85.2356M/s
bench_encode<int_epoch, FMT_TNT_EPOCH, EPOCH_ONLY>                     8.15 ns         8.15 ns     85794147 avg_size=8
bench_decode_search<int_epoch, FMT_TNT_EPOCH, EPOCH_ONLY>              96.0 ns         96.0 ns      7266161 items_per_second=83.812M/s
bench_encode<int_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY>                10.7 ns         10.7 ns     65522497 avg_size=9
bench_decode_search<int_epoch, FMT_TNT_EPOCH_DATE, EPOCH_ONLY>          138 ns          138 ns      5075261 items_per_second=58.2439M/s
bench_encode<int_epoch, FMT_RAW_FULL, EPOCH_ONLY>                      2.03 ns         2.03 ns    345321282 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_FULL, EPOCH_ONLY>               31.6 ns         31.6 ns     22191770 items_per_second=255.107M/s
bench_encode<int_epoch, FMT_RAW_NONZERO, EPOCH_ONLY>                   1.78 ns         1.78 ns    392548606 avg_size=10
bench_decode_search<int_epoch, FMT_RAW_NONZERO, EPOCH_ONLY>            62.4 ns         62.4 ns     11222765 items_per_second=128.924M/s
bench_encode<int_epoch, FMT_MP_DATE, EPOCH_ONLY>                       10.8 ns         10.8 ns     63654828 avg_size=9
bench_decode_search<int_epoch, FMT_MP_DATE, EPOCH_ONLY>                 143 ns          143 ns      4922721 items_per_second=56.399M/s
bench_encode<int_epoch, FMT_MP_FULL, MIXED_LOAD>                       14.3 ns         14.3 ns     48881627 avg_size=13.8
bench_decode_search<int_epoch, FMT_MP_FULL, MIXED_LOAD>                 201 ns          201 ns      3490607 items_per_second=50.886M/s
bench_encode<int_epoch, FMT_MP_NONZERO, MIXED_LOAD>                    15.8 ns         15.8 ns     44390237 avg_size=12.7
bench_decode_search<int_epoch, FMT_MP_NONZERO, MIXED_LOAD>              227 ns          227 ns      3079141 items_per_second=44.9125M/s
bench_encode<int_epoch, FMT_TNT_EPOCH, MIXED_LOAD>                     16.1 ns         16.1 ns     43416830 avg_size=12.7
bench_decode_search<int_epoch, FMT_TNT_EPOCH, MIXED_LOAD>               217 ns          217 ns      3226935 items_per_second=46.9933M/s
bench_encode<int_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD>                19.6 ns         19.6 ns     35903225 avg_size=13.8
bench_decode_search<int_epoch, FMT_TNT_EPOCH_DATE, MIXED_LOAD>          279 ns          279 ns      2504486 items_per_second=36.5657M/s
bench_encode<int_epoch, FMT_RAW_FULL, MIXED_LOAD>                      2.03 ns         2.03 ns    345363495 avg_size=18
bench_decode_search<int_epoch, FMT_RAW_FULL, MIXED_LOAD>               54.3 ns         54.3 ns     12928423 items_per_second=188.013M/s
bench_encode<int_epoch, FMT_RAW_NONZERO, MIXED_LOAD>                   3.61 ns         3.61 ns    193174040 avg_size=14
bench_decode_search<int_epoch, FMT_RAW_NONZERO, MIXED_LOAD>            90.7 ns         90.6 ns      7730499 items_per_second=112.563M/s
bench_encode<int_epoch, FMT_MP_DATE, MIXED_LOAD>                       18.9 ns         18.9 ns     37032935 avg_size=13.8
bench_decode_search<int_epoch, FMT_MP_DATE, MIXED_LOAD>                 281 ns          281 ns      2498602 items_per_second=36.3592M/s

However, most of the workloads used are purely synthetic and have nothing in common with real-life data. Among the workloads, only the TPC-H generated one can claim any relevance to real-life data, so we use only the benchmarks matching the TPCH_1COLUMN pattern for the final selection.

./perf/mp_datetime.perftest --benchmark_filter=TPCH_1COLUMN --benchmark_repetitions=5 --benchmark_report_aggregates_only=true
Full log:
[t.safin@dev4 build]$ ./perf/mp_datetime.perftest --benchmark_filter=TPCH_1COLUMN --benchmark_repetitions=5 --benchmark_report_aggregates_only=true --benchmark_out=mp_datetime_perf.csv --benchmark_out_format=csv
setting up benchmark data
2021-11-08T01:43:27+03:00
Running ./perf/mp_datetime.perftest
Run on (80 X 3900 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x40)
  L1 Instruction 32 KiB (x40)
  L2 Unified 1024 KiB (x40)
  L3 Unified 28160 KiB (x2)
Load Average: 0.62, 0.41, 0.20
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                        Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------------------------------------------
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean                       6.98 ns         6.98 ns            5 avg_size=10
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_median                     6.97 ns         6.97 ns            5 avg_size=10
bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_stddev                    0.016 ns        0.016 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean                 130 ns          130 ns            5 items_per_second=81.2925M/s
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_median               130 ns          130 ns            5 items_per_second=81.2974M/s
bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_stddev             0.160 ns        0.160 ns            5 items_per_second=100.724k/s
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean                    5.68 ns         5.68 ns            5 avg_size=8
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_median                  5.67 ns         5.67 ns            5 avg_size=8
bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_stddev                 0.011 ns        0.011 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean              125 ns          125 ns            5 items_per_second=84.4458M/s
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_median            125 ns          125 ns            5 items_per_second=84.3785M/s
bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_stddev          0.238 ns        0.237 ns            5 items_per_second=160.792k/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean                     7.42 ns         7.42 ns            5 avg_size=8
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_median                   7.42 ns         7.42 ns            5 avg_size=8
bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_stddev                  0.017 ns        0.017 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean               131 ns          131 ns            5 items_per_second=80.2188M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_median             131 ns          131 ns            5 items_per_second=80.2567M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_stddev           0.155 ns        0.155 ns            5 items_per_second=94.5911k/s
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean                6.41 ns         6.41 ns            5 avg_size=5.9
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_median              6.41 ns         6.41 ns            5 avg_size=5.9
bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_stddev             0.002 ns        0.002 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean          146 ns          146 ns            5 items_per_second=72.017M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_median        146 ns          146 ns            5 items_per_second=72.0416M/s
bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_stddev      0.091 ns        0.091 ns            5 items_per_second=44.8115k/s
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean                      2.50 ns         2.50 ns            5 avg_size=18
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_median                    2.50 ns         2.50 ns            5 avg_size=18
bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_stddev                   0.035 ns        0.035 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean               52.0 ns         52.0 ns            5 items_per_second=202.401M/s
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_median             52.5 ns         52.5 ns            5 items_per_second=200.556M/s
bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_stddev             1.06 ns         1.06 ns            5 items_per_second=4.21401M/s
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean                   2.49 ns         2.49 ns            5 avg_size=10
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_median                 2.49 ns         2.49 ns            5 avg_size=10
bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_stddev                0.006 ns        0.006 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean            70.9 ns         70.9 ns            5 items_per_second=169.287M/s
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_median          70.9 ns         70.9 ns            5 items_per_second=169.269M/s
bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_stddev         0.341 ns        0.341 ns            5 items_per_second=816.492k/s
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean                       5.99 ns         5.99 ns            5 avg_size=6
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_median                     5.96 ns         5.96 ns            5 avg_size=6
bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_stddev                    0.057 ns        0.057 ns            5 avg_size=0
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean                 135 ns          135 ns            5 items_per_second=78.0833M/s
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_median               135 ns          135 ns            5 items_per_second=78.0751M/s
bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_stddev             0.091 ns        0.091 ns            5 items_per_second=52.6062k/s

Putting this together in a table, we see:

| name | iterations | real_time | cpu_time | time_unit | items_per_second | avg_size |
|---|---|---|---|---|---|---|
| bench_encode<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean | 5 | 6.97 | 6.98 | ns | | 10.0 |
| bench_decode_search<dbl_epoch, FMT_MP_FULL, TPCH_1COLUMN>_mean | 5 | 129.55 | 129.54 | ns | 81 292 500 | |
| bench_encode<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean | 5 | 5.68 | 5.68 | ns | | 8.0 |
| bench_decode_search<dbl_epoch, FMT_MP_NONZERO, TPCH_1COLUMN>_mean | 5 | 124.71 | 124.70 | ns | 84 445 800 | |
| bench_encode<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean | 5 | 7.42 | 7.42 | ns | | 8.0 |
| bench_decode_search<dbl_epoch, FMT_TNT_EPOCH, TPCH_1COLUMN>_mean | 5 | 131.28 | 131.27 | ns | 80 218 800 | |
| bench_encode<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean | 5 | 6.41 | 6.41 | ns | | 5.9 |
| bench_decode_search<dbl_epoch, FMT_TNT_EPOCH_DATE, TPCH_1COLUMN>_mean | 5 | 146.23 | 146.22 | ns | 72 017 000 | |
| bench_encode<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean | 5 | 2.50 | 2.50 | ns | | 18.0 |
| bench_decode_search<dbl_epoch, FMT_RAW_FULL, TPCH_1COLUMN>_mean | 5 | 52.05 | 52.05 | ns | 202 401 000 | |
| bench_encode<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean | 5 | 2.49 | 2.49 | ns | | 10.0 |
| bench_decode_search<dbl_epoch, FMT_RAW_NONZERO, TPCH_1COLUMN>_mean | 5 | 70.89 | 70.89 | ns | 169 287 000 | |
| bench_encode<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean | 5 | 5.99 | 5.99 | ns | | 6.0 |
| bench_decode_search<dbl_epoch, FMT_MP_DATE, TPCH_1COLUMN>_mean | 5 | 134.87 | 134.86 | ns | 78 083 300 | |

Conclusions

So if we want to select the fastest methods to decode datetime:

| method | decode speed (items/sec) | avg size |
|---|---|---|
| FMT_RAW_FULL | 202M | 18.0 |
| FMT_RAW_NONZERO | 169M | 10.0 |

But if we want the most compact storage (which may eventually become faster for larger workload sizes):

| method | decode speed (items/sec) | avg size |
|---|---|---|
| FMT_TNT_EPOCH_DATE | 72M | 5.9 |
| FMT_MP_DATE | 78M | 6.0 |
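To make the trade-off easier to compare, the short sketch below expresses each method's decode throughput and average size relative to FMT_RAW_FULL. The numbers are the mean TPCH_1COLUMN results reported above; the `relative_to` helper is just a convenience written for this example.

```python
# (decode items per second, average encoded size in bytes), TPCH_1COLUMN means
RESULTS = {
    "FMT_RAW_FULL": (202_401_000, 18.0),
    "FMT_RAW_NONZERO": (169_287_000, 10.0),
    "FMT_MP_DATE": (78_083_300, 6.0),
    "FMT_TNT_EPOCH_DATE": (72_017_000, 5.9),
}


def relative_to(baseline, results=RESULTS):
    """Return {method: (speed ratio, size ratio)} against the baseline method."""
    base_speed, base_size = results[baseline]
    return {
        name: (speed / base_speed, size / base_size)
        for name, (speed, size) in results.items()
    }


ratios = relative_to("FMT_RAW_FULL")
```

With these figures FMT_RAW_NONZERO keeps about 84% of the FMT_RAW_FULL decode throughput at about 56% of its size, while FMT_TNT_EPOCH_DATE decodes at about 36% of that throughput and takes about 33% of the size.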