TickTockDB vs. OpenTSDB: max cardinality comparison on X86
Table of Contents
1. Introduction
2. IoTDB-benchmark Introduction
3. Experiment settings
4. Max cardinality: resource consumption comparison
5. Response time comparison
6. Conclusion
1. Introduction
In our previous wikis, we showed that TickTockDB is 14x better than InfluxDB in terms of max cardinality on Single Board Computers such as Raspberry Pi and Orange Pi. In this experiment, we compare TickTockDB with OpenTSDB, the original time series database that motivated us to develop TickTockDB. OpenTSDB is one of the very first TSDBs and is still extensively used in production environments. One of its advantages is scalability, since it is built on top of HBase. However, HBase also brings OpenTSDB lots of problems, e.g., maintenance overhead due to its complex architecture, and performance issues. Please refer to the TickTockDB README for our original motivations.
In our tests, we let clients send data points to a TSDB periodically (one data point every 10 seconds per time series), and we measure the maximum number of time series the TSDB can handle. We believe this is close to normal monitoring scenarios, in which there are certain intervals between two consecutive operations from a client. For example, CPU data are collected once every 10 seconds.
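For concreteness, the write pattern is roughly the following loop (a minimal sketch, not the benchmark's actual client: the endpoint is OpenTSDB's standard /api/put, which our TickTockDB setup also accepts, and the metric/tag names are made up):
```
# one data point per time series every 10 seconds (names illustrative)
while true; do
  curl -s -X POST http://localhost:4242/api/put \
    -d '[{"metric":"cpu.usage","timestamp":'"$(date +%s)"',"value":12.3,"tags":{"host":"h1"}}]'
  sleep 10
done
```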
2. IoTDB-benchmark Introduction
We selected IoTDB-benchmark for performance evaluation. Please refer to its README and the introduction in our previous wiki for details.
3. Experiment settings
3.1. Hardware
We ran all tests on an Ubuntu laptop with an AMD Ryzen 5 5600H CPU (12 vCPUs), 24GB of memory (DDR4 3200 MHz), and a 1TB 5400rpm HDD. We also allocated 2GB of swap space. OpenTSDB and TickTockDB ran in an Ubuntu Docker container (X86, 2 vCPUs, 4GB memory). IoTDB-benchmark ran on the laptop host.
3.2. Software
- TickTockDB
- Version: 0.11.7
- Command:
```
./bin/tt -c conf/tt.conf --tsdb.timestamp.resolution millisecond --http.server.port 6182,6183 --http.listener.count 2,2 --http.request.format json &
```
- Please update the open-file limit (nofile) to a very high number. See this instruction.
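- For example (a sketch; the exact value should comfortably exceed the number of files your cardinality requires):
```
# raise the per-process open-file limit before starting TickTockDB
ulimit -n 1048576
# or persist it in /etc/security/limits.conf:
#   <user>  soft  nofile  1048576
#   <user>  hard  nofile  1048576
```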
- OpenTSDB
- Version: opentsdb-2.4.0. We use a docker image, petergrace/opentsdb-docker, which runs HBase on the local filesystem instead of HDFS/Hadoop.
- Docker command:
```
[yi-IdeaPad ~]$ docker run -d --name opentsdb --cpuset-cpus 1-2 -m 4g -h opentsdb -p 4242:4242 -v /opt/opentsdb:/etc/opentsdb petergrace/opentsdb-docker
```
- Config: default
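- Sanity check (a quick check we'd suggest; /api/version is a standard OpenTSDB endpoint):
```
[yi-IdeaPad ~]$ curl http://localhost:4242/api/version
```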
- IoTDB-benchmark
- Version: main
- Sample config for TickTockDB (similar for OpenTSDB except port 4242): config.properties
- Important settings in the config:
- Read-write ratio: 10% reads and 90% writes.
- Loop: 2160, with a 10-second interval (each test thus runs for 2160 × 10s = 21600s = 6 hours).
- Number of sensors per device: 10
- We scale up loads by increasing the number of devices from 5k to 500k.
- We bind each client to 10 devices; for example, 5k devices require 500 clients. We don't assign more than 10 devices per client at these scales because OpenTSDB's response times are on the order of seconds, so a client with more devices would not finish within the 10-second interval. We also cap the number of clients at 1000, since otherwise there would be too many clients (500k devices would mean 50k clients) overwhelming the single test host. We update CLIENT_NUMBER and DEVICE_NUMBER in config.properties for each test; see the sketch after the summary below.
- Write requests are in JSON format, the default format of OpenTSDB.
In summary, the above config simulates a list of clients collecting a list of metrics (DEVICE_NUMBER × 10 sensors per device) every 10 seconds and sending the metrics to TickTockDB/OpenTSDB in JSON format.
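As a reference, these settings map onto IoTDB-benchmark's config.properties roughly as follows. This is a sketch rather than our exact file: the key names follow IoTDB-benchmark conventions, but the DB_SWITCH value and the exact OPERATION_PROPORTION split across query types are assumptions.
```
DB_SWITCH=OpenTSDB          # or the TickTockDB adapter, depending on the test
HOST=127.0.0.1
PORT=6182                   # 4242 for OpenTSDB
DEVICE_NUMBER=400000        # scaled from 5k to 500k across tests
SENSOR_NUMBER=10            # 10 sensors (time series) per device
CLIENT_NUMBER=1000          # capped at 1000; 10 devices/client below 10k devices
LOOP=2160                   # 2160 loops x 10s interval = 6 hours
POINT_STEP=10000            # 10-second interval, in milliseconds
OPERATION_PROPORTION=9:0:1:0:0:0:0:0:0:0:0  # ~90% writes, ~10% reads
```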
4. Max cardinality: resource consumption comparison
In our tests, clients send a device's sensor data in one operation every 10 seconds. Each sensor's data is a unique time series, so the cardinality is the number of devices × 10 sensors/device. Note that this is not a backfill case, in which data points are sent back to back as soon as the previous write finishes. The write throughput is fixed by our test setup, so it makes no sense to compare throughput between TickTockDB and OpenTSDB. Instead, we compare how many OS resources TickTockDB and OpenTSDB consume at the same load. The fewer OS resources a TSDB consumes, the better.
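A single write operation therefore carries one device's 10 sensor readings in one OpenTSDB-style JSON array, along these lines (field layout assumed from the OpenTSDB put format; only 2 of the 10 sensors shown):
```
[
  {"metric":"s_0","timestamp":1620000000000,"value":1.23,"tags":{"device":"d_42"}},
  {"metric":"s_1","timestamp":1620000000000,"value":4.56,"tags":{"device":"d_42"}}
]
```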
4.1. CPU
The figures above show the CPU usage of the OpenTSDB (blue) and TickTockDB (red) dockers during the tests.
We used 5k, 6k, and 7k devices (corresponding to 50k, 60k, and 70k time series, respectively) for OpenTSDB, and 400k and 500k devices (4M and 5M time series, respectively) for TickTockDB. Each test is supposed to finish within 6 hours (21600 seconds). OpenTSDB finished in time with 50k and 60k time series, but took 22314 seconds with 70k time series. So we consider 60k the max cardinality of OpenTSDB.
TickTockDB finished in time with 4M time series, but took 25587 seconds with 5M time series. So we consider 4M the max cardinality of TickTockDB, 66x better than OpenTSDB.
Looking at the CPU metric, we note that OpenTSDB used a lot of CPU during the first 1-2 hours, up to 180% (i.e., 1.8 of the 2 vCPUs) in the 60k and 70k time series tests. After that, its CPU usage stayed below 50%. We think OpenTSDB was busy preparing metadata, such as UIDs for new time series. Once the metadata is in place, it requires much less CPU (still 0.5 vCPU for 70k time series).
TickTockDB's CPU usage was stable at around 100%-130% with 4M time series, a much larger cardinality than 70k. With 5M time series, TickTockDB's CPU usage reached 170% and then started to drop, because TickTockDB couldn't keep up with 5M time series and slowed down (i.e., thrashing). Let's look at the other metrics below.
4.2. IO Util
The figure above shows IO utilization. With 5M time series, TickTockDB's IO util kept growing, reaching 55% by the end of the test. This shows that TickTockDB couldn't keep up with 5M time series because of IO. With 4M time series, TickTockDB's IO util was at most 13%.
OpenTSDB's IO util was low with 50k, 60k, and 70k time series, apart from a spike up to 20%. IO does not appear to be the bottleneck for OpenTSDB.
4.3. Write and read byte rates
The figure above shows the read and write rates (bytes per second). OpenTSDB's write rate zigzagged between 500KB/sec and 3MB/sec with 60k time series. With 4M time series, TickTockDB's write rate was about 1.5MB/sec to 3MB/sec. The max write rates of the two cases were very close, even though 4M is much larger than 60k. This indicates that TickTockDB is more efficient in write IO.
We also compared the data storage sizes of OpenTSDB and TickTockDB. With 60k time series running for 6 hours at a 10-second interval, and about 90% of operations being writes, there are 116,622,400 data points (≈ 60k series × 2160 loops × 90%). The corresponding OpenTSDB storage size is 1.5GB, so on average a data point is compressed to 12.86 bytes. Similarly, there are 7,776,720,000 data points for 4M time series. The corresponding TickTockDB storage size is 8GB, so on average a data point is compressed to 1.03 bytes in TickTockDB.
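The back-of-envelope math checks out:
```
# data points ≈ series x loops x write fraction
python3 -c "print(60_000 * 2160 * 0.9)"    # ≈ 1.166e8 points (OpenTSDB, 60k series)
python3 -c "print(1.5e9 / 116_622_400)"    # ≈ 12.86 bytes/point (OpenTSDB)
python3 -c "print(8e9 / 7_776_720_000)"    # ≈ 1.03 bytes/point (TickTockDB)
```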
We tested our 5400rpm HDD's sequential write and read rates using dd; they are 89.3MB/s and 90.4MB/s, respectively. So the write rates in the OpenTSDB and TickTockDB test cases are still far from saturating the disk:
```
[yi-IdeaPad ~]$ dd if=/dev/zero of=./test bs=512k count=40960
40960+0 records in
40960+0 records out
21474836480 bytes (21 GB, 20 GiB) copied, 240.49 s, 89.3 MB/s
[yi-IdeaPad ~]$ ls -hl ./test
-rw-rw-r-- 1 ylin30 ylin30 20G Apr 28 20:42 ./test
[yi-IdeaPad ~]$ sudo sysctl -w vm.drop_caches=3
vm.drop_caches = 3
[yi-IdeaPad ~]$ dd if=./test of=/dev/zero bs=512k count=40960
40960+0 records in
40960+0 records out
21474836480 bytes (21 GB, 20 GiB) copied, 237.679 s, 90.4 MB/s
```
Both TickTockDB's and OpenTSDB's read byte rates were very small.
4.4. Memory
Let's look at memory usage. We only care about RSS memory, not cache. The OpenTSDB docker has two Java processes running, opentsdb and hbase, so its memory usage is the sum of the two. With 60k time series, the HBase process used a stable 1.2GB of memory and the opentsdb process used up to 600MB, for a total of 1.8GB of RSS memory. With 70k time series, the two together used 2.2GB (HBase 1.2GB, opentsdb 1GB). Note that we didn't limit the Java heap sizes with -Xmx, but left them at the JVM defaults. HBase might have used more memory, and OpenTSDB's performance might have been better, if -Xmx had been tuned properly.
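If one wanted to pin the heaps instead of relying on defaults, it would look something like this (a sketch: HBASE_HEAPSIZE and JVMARGS are the knobs exposed by the usual HBase and OpenTSDB startup scripts; the values are illustrative, we did not tune them):
```
# hbase-env.sh: cap the HBase heap
export HBASE_HEAPSIZE=2G
# the OpenTSDB tsdb startup script honors JVMARGS
export JVMARGS="-Xmx1g -Xms1g"
```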
With 4M time series, TickTockDB's RSS memory kept growing, up to 3.2GB. Note that the total memory is just 4GB, so memory was already 80% full. With 5M time series, RSS spiked to 3.4GB (even higher if we plot the figure with finer downsampling). We believe memory was the actual bottleneck for TickTockDB: with not enough room left for the OS cache, IO efficiency suffers badly. That's why we saw IO util keep growing to 55% even though the write rate was not very high yet.
5. Response time comparison
How fast a TSDB can write and read data points is important. Let's look at the response times of both OpenTSDB and TickTockDB. Note that these are response times per operation (not per data point)[^1].
[^1]: In IoTDB-benchmark, a write operation is for one device and contains 10 data points (i.e., 10 sensors' data). In normal DevOps scenarios with a collector framework like TCollector or Telegraf, an operation may contain more than 10 data points, since collector frameworks combine data points collected by different collectors (like the CPU collector and the memory collector) into one request. Generally, the more data points in a request, the lower the average write response time per data point.
5.1. Average write response time
At its max cardinality (i.e., 60k time series), OpenTSDB's average write response time per operation was 15.01ms, versus 8.36ms for TickTockDB at 4M time series. TickTockDB is faster than OpenTSDB even at a much higher cardinality.
OpenTSDB's average write response time doesn't go up with cardinality, and neither does TickTockDB's (its averages at different cardinalities are very close). We think this is because write response times are very volatile, especially in OpenTSDB. We will look at p999 response times later.
5.2. Average read response time
We used 5 kinds of reads in IoTDB-benchmark. Their response time patterns are similar, so we only present AGG_RANGE here:
Aggregation query with time filter: select func(v1)... from data where device in ? and time > ? and time < ?.
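Against OpenTSDB's HTTP API, an AGG_RANGE-style query looks roughly like the following (a sketch; the metric/tag names are illustrative, and our TickTockDB setup accepts the same OpenTSDB-style query shape):
```
[yi-IdeaPad ~]$ curl -s -X POST http://localhost:4242/api/query -d '
{"start": 1620000000, "end": 1620003600,
 "queries": [{"aggregator": "sum", "metric": "s_0",
              "tags": {"device": "d_42"}}]}'
```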
Average read response time went up as load increased in both OpenTSDB and TickTockDB: the higher the load, the larger the response time.
At OpenTSDB's max cardinality (60k time series), its average read response time was 945ms. At TickTockDB's max cardinality (4M time series), the average read response time was 22.2ms.
5.3. P999 write response time
Now let's look at the p999 write response time. It is much larger than the average in both OpenTSDB and TickTockDB.
At its max cardinality of 60k, OpenTSDB's p999 write took 264.67ms. At its max cardinality of 4M, TickTockDB's p999 write took 62.16ms. Even though the two cases are at different cardinalities, TickTockDB writes are clearly much faster than OpenTSDB's.
5.4. P999 read response time
At its max cardinality of 60k, OpenTSDB's p999 read took 23952.48ms. At its max cardinality of 4M, TickTockDB's p999 read took 1516.71ms, still much faster than OpenTSDB.
6. Conclusion
- We compared TickTockDB with OpenTSDB on X86 (2 vCPUs, 4GB memory, 5400rpm HDD) in terms of max cardinality. Instead of a backfill scenario, we simulated normal scenarios in which a list of clients sends a list of time series (10 or more devices per client, 10 sensors per device) every 10 seconds.
- OpenTSDB's max cardinality was 60K (i.e., 6k devices and 10 sensors/device). TickTockDB's max cardinality was 4M (i.e., 400k devices and 10 sensors/device).
- OpenTSDB used almost all of the CPU at the beginning of each test, then dropped to about 0.5 vCPU. Its IO util stayed low. Memory was under pressure for OpenTSDB due to HBase and Java.
- Memory was the bottleneck for TickTockDB. It led to very high IO util, since little memory was left for the OS cache. CPU usage was also high, indicating that TickTockDB balances memory and CPU resources well.
- On average, a data point was compressed to 12.86 bytes in OpenTSDB and 1.03 bytes in TickTockDB.
- OpenTSDB's read and write response times were much higher than TickTockDB's. For example, the p999 write response times of OpenTSDB at 60k cardinality and TickTockDB at 4M cardinality were 264.67ms and 62.16ms, and the p999 reads were 23952.48ms and 1516.71ms, respectively.