Performance Evaluation of TickTock on RaspberryPI (ARM 32bit) - ytyou/ticktock GitHub Wiki

Table of Contents

1. RaspberryPI Introduction

2. IoTDB-benchmark Introduction

3. Experiment Settings

4. Test scenario 1: Write-only Mode

4.1 Throughput

4.2 Latency

4.3 CPU and Memory consumption

5. Test scenario 2: Read-Write Mixed Mode

5.1 Write operation (INGESTION)

5.2 Read operation 1 (PRECISE_POINT)

5.3 Read operation 2 (TIME_RANGE)

5.4 Read operation 3 (AGG_RANGE)

5.5 Read operation 4 (GROUP_BY)

5.6 Read operation 5 (LATEST_POINT)

6. Test scenario 3: Compared With InfluxDB

6.1 Throughput comparison

6.2 Latency comparison

6.3 CPU and Memory comparison

7. Conclusion

1. RaspberryPI Introduction

Here we would like to present TickTock's performance on RaspberryPI, one of the most popular IoT devices. RaspberryPI is a series of tiny and affordable Single Board Computers (SBCs) developed in the UK, with over 40 million units sold since 2012. It has been used extensively in hobby and industrial projects such as robotics and environment monitoring.

  • It is tiny, as shown in the figure below.
  • It is affordable, priced from $4 to $75.
  • It has a CPU, memory, network, and disk, and runs a full OS.

The figure shows a PI-zero-W, a Single Board Computer (SBC) with

  • 1GHz single-core CPU (ARMv6),
  • 512MB memory,
  • 802.11 b/g/n wireless LAN,
  • Raspberry PI OS (RaspBian), a Debian-based 32-bit Linux OS,
  • and a price of only $10.

PI-zero-W is the smallest and cheapest PI model with a full OS and wireless LAN. PI pico is cheaper ($4), but it is a Microcontroller Unit (MCU) and is not designed to run general applications. In this performance evaluation, we used a PI-zero-W to show how TickTock performs on such a small IoT device.

2. IoTDB-benchmark Introduction

We selected IoTDB-benchmark for performance evaluation. IoTDB-benchmark was developed by THULAB at Tsinghua University, Beijing, to compare the performance of different TSDBs with flexible test settings and IoT industrial scenarios. It was published in CoRR (Computing Research Repository) in 2019. You can download the PDF paper here.

@article{DBLP:journals/corr/abs-1901-08304,
  author    = {Rui Liu and Jun Yuan},
  title     = {Benchmarking Time Series Databases with IoTDB-benchmark for IoT Scenarios},
  journal   = {CoRR},
  volume    = {abs/1901.08304},
  year      = {2019},
  url       = {http://arxiv.org/abs/1901.08304},
  timestamp = {Sat, 02 Feb 2019 16:56:00 +0100},
}

IoTDB-benchmark simulates a wind power company operating several wind farms. There are many wind turbines (i.e., devices) in a wind farm, and each device has many sensors that periodically collect different metrics, such as wind speed and temperature, and send them to TSDBs.
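The data model above can be pictured with a small sketch. The metric and device names and the uniform value distribution are illustrative only; the benchmark's actual generators are configurable and more elaborate.

```python
import random
import time

# Tiny sketch of the benchmark's data model: each device (wind turbine)
# has several sensors, and each collection cycle emits one reading per
# sensor. Names ("s_0", "d_0") and the value range are hypothetical.

def readings(device: str, sensors: int, ts_sec: int):
    """One collection cycle: a (metric, timestamp, value, tags) tuple per sensor."""
    return [(f"s_{i}", ts_sec, round(random.uniform(0.0, 50.0), 2),
             {"device": device})
            for i in range(sensors)]

cycle = readings("d_0", 10, int(time.time()))
print(len(cycle))  # 10 sensors -> 10 data points per device per cycle
```

With 10 sensors per device (the setting used in this evaluation), every loop iteration of a client produces one such batch of readings for its device.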

IoTDB-benchmark is a great benchmark for the following reasons:

  • It provides detailed measurement metrics such as throughput, latency (average, p10, p25, median, p75, p95, p99, p999).
  • It provides adaptors to various TSDBs such as InfluxDB, OpenTSDB, and TimescaleDB; TickTock reuses the adaptor for OpenTSDB. Incidentally, THULAB at Tsinghua University also developed its own TSDB, IoTDB, an Apache project, so the benchmark supports IoTDB as well.
  • It supports out-of-order writes, a common scenario that is not supported by many other TSDB benchmarks or TSDBs.
  • It supports different test scenarios, e.g., write-only and read-write mixed.
  • It provides various data distributions to simulate industrial scenarios.

IoTDB-benchmark is open-sourced on GitHub. We used a forked version at github.com/ytyou/iotdb-benchmark. Please read the user guide.

3. Experiment settings

3.1. Hardware

We ran TickTock on a PI-zero-W and IoTDB-benchmark on an Ubuntu laptop (spec: xxx). The laptop connects to a wireless router (spec: xxx) through a wired connection; the PI-zero-W connects to the router over a 2.4GHz wireless connection.

3.2. Software

  • TickTock
    • Version: 0.3.9
    • Config: tt.conf
    • Important settings:
      • Use the TCP protocol for writes and HTTP for reads.
      • Page count: 1024
      • Raise the open-file limit to a very high number. See this instruction.
  • IoTDB-benchmark
    • Write-only: evaluates TickTock in a scenario of 100% writes.
    • Read-Write: evaluates TickTock in a scenario of mixed reads (50%) and writes (50%).
    • Important settings:
      • Loop: 1M (which keeps TickTock running for more than 2 hours in each test with a specific client number and device number)
      • Number of sensors per device: 10
      • We scale test loads by the number of clients (i.e., 1, 3, 5, 7, 9).
      • We bind each client to one device, so we update CLIENT_NUMBER and DEVICE_NUMBER in config.properties for each test.
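The open-file limit change mentioned in the TickTock settings above can be made persistent via limits.conf. A minimal sketch, assuming TickTock runs as user pi; the exact value is illustrative and should be sized to your page count and metric cardinality:

```
# /etc/security/limits.conf (values illustrative)
pi  soft  nofile  65535
pi  hard  nofile  65535
```

After logging in again, `ulimit -n` should report the new limit.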

As mentioned above, we chose 1M loops in all tests. A 9-client write-only test with 1M loops inserts 900M data points and runs for more than 2 hours. We could have chosen more loops, but the whole evaluation would then take much longer to finish. We did compare the results of 1M loops with 10M loops in one case, and they are very close (within a 5% difference).
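The data-point totals above can be checked with a short sketch. It assumes a batch size of 10 records per loop iteration (an assumption, not stated on this page; together with 10 sensors per device it reproduces the totals quoted in this evaluation).

```python
# Sketch: how the loop count translates into inserted data points.
# ASSUMPTION: BATCH_SIZE_PER_WRITE = 10 (not stated in this wiki page);
# combined with 10 sensors per device it reproduces the totals above.
SENSORS_PER_DEVICE = 10
BATCH_SIZE_PER_WRITE = 10  # assumed

def points_inserted(clients: int, loops: int, write_ratio: float = 1.0) -> int:
    """Data points a test inserts across all clients (one device per client)."""
    return int(clients * loops * BATCH_SIZE_PER_WRITE
               * SENSORS_PER_DEVICE * write_ratio)

print(points_inserted(1, 1_000_000))  # write-only, 1 client  -> 100000000
print(points_inserted(9, 1_000_000))  # write-only, 9 clients -> 900000000
```

The same formula with write_ratio=0.5 gives the 50M/450M figures quoted for the read-write mixed tests.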

For comparison purposes, we picked InfluxDB, since it is the most popular TSDB and one of the very few TSDBs that can run on a very tiny ARM 32-bit SBC like the PI-zero-W. If you look up TSDBs in RaspberryPI forums, InfluxDB is the de facto option.

  • InfluxDB

4. Test scenario 1: Write-only mode

  • We will increase number of client and number of device in each test.
  • Example Config for 5 clients: config.properties
  • Note that:
  • It contains out-of-order writes (OUT_OF_ORDER_RATIO=0.5)
  • A test with 1M loops for 1 client will insert 100M data points, and 900M data points for a 9 clients test.

4.1. Throughput (DataPoint/second)

Based on our observation, the maximum throughput TickTock achieves is over 58K data points/second, with 5 clients. The throughput with 1 client is lower since the CPU is not yet saturated. The throughput with 9 clients degrades due to higher concurrency.

4.2. Latency

On average, write operations were answered very quickly, in less than 1 millisecond. Latency increases as the client number increases; we attribute this to more concurrency in processing requests.

The above figure shows the latency of the 1-client test; the other client tests are similar. The percentile latencies grow as the percentile grows, as expected, with a noticeable jump from P95 to P99.

4.3. CPU & Memory consumption

Based on our observation, CPUs were saturated in most cases. The 1-client test (leftmost) still had some CPU idle throughout the test, so its throughput is lower than the others. The 7-client case also didn't saturate the CPU in the early stage, but fully saturated it for the rest of the test; its throughput is better than 1 client but worse than 3 and 5 clients.

There was still plenty of memory available during all tests, over 300MB (note that the total memory of the PI-zero-W is 512MB). This demonstrates that TickTock is very memory efficient, though CPU intensive.

5. Test scenario 2: Read-Write mixed mode

  • We increase the number of clients and the number of devices in each test.
  • Example config for 5 clients: config.properties
  • The mixed ratio:
    • write (INGESTION): 50%
      • The workload contains out-of-order writes (OUT_OF_ORDER_RATIO=0.5).
      • A test with 1M loops inserts 50M data points for 1 client, and 450M for 9 clients.
    • read: 50%, evenly distributed across the 5 read types below. Other read types in IoTDB-benchmark are not yet supported by TickTock.
      • 10% PRECISE_POINT (i.e., select v1... from data where time = ? and device in ?)
      • 10% TIME_RANGE (i.e., select v1... from data where time > ? and time < ? and device in ?)
      • 10% AGG_RANGE (i.e., select func(v1)... from data where device in ? and time > ? and time < ?, where func() is an aggregation function such as avg, min, etc.)
      • 10% GROUP_BY (group-by-time-range query)
      • 10% LATEST_POINT (i.e., select time, v1... where device = ? and time = max(time))
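Since TickTock reuses the benchmark's OpenTSDB adaptor, the first four read types roughly map to OpenTSDB-style /api/query JSON bodies. The sketch below builds such bodies; the metric name ("s_0"), tag key ("device"), and the use of aggregator "none" for raw reads are illustrative assumptions, not taken from the benchmark's source.

```python
import json

# Sketch: OpenTSDB-style /api/query JSON bodies roughly matching the
# read types above. Metric "s_0" and tag key "device" are hypothetical.

def query_body(metric, device, start_ms, end_ms,
               aggregator="none", downsample=None):
    sub = {"metric": metric, "aggregator": aggregator,
           "tags": {"device": device}}
    if downsample is not None:
        sub["downsample"] = downsample  # e.g. "60s-avg" for GROUP_BY-style reads
    return {"start": start_ms, "end": end_ms, "queries": [sub]}

t0, t1 = 1_600_000_000_000, 1_600_000_600_000
precise_point = query_body("s_0", "d_0", t0, t0)                    # single timestamp
time_range    = query_body("s_0", "d_0", t0, t1)                    # raw range scan
agg_range     = query_body("s_0", "d_0", t0, t1, aggregator="avg")  # aggregate over range
group_by      = query_body("s_0", "d_0", t0, t1, aggregator="avg",
                           downsample="60s-avg")                    # per-bucket average
# LATEST_POINT maps to a separate "last data point" style request instead.

print(json.dumps(group_by))
```

Each body is POSTed to the server's query endpoint over HTTP, which is why reads (unlike TCP writes) always receive an explicit response.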

5.1. Write operations (INGESTION)

5.1.1. Throughput (DataPoint/second)

Compared with the write throughput in write-only mode (over 50K data points/second), the write throughput in read-write mixed mode is less than half of that, about 22K data points/second. The write throughput of the 1-client test is lower than the other tests, but throughput does not degrade much as we increase the client number.

5.1.2. Latency

The write latency is still very low: less than 1 millisecond in most cases, even at P99. This is much better than write-only mode, where P99 latency is over 100 milliseconds.

5.2. Read operation 1 (PRECISE_POINT)

5.2.1. Throughput (DataPoint/second)

Read throughput was stable as we increased the client number.

5.2.2. Latency

Read latency at different percentiles grows as the percentile grows, as expected. Unlike write operations, there is no big jump or cliff from one percentile to the next.

The other read operations below show very similar patterns to the PRECISE_POINT read, so we omit their explanations for simplicity.

5.3. Read operation 2 (TIME_RANGE)

5.3.1. Throughput (DataPoint/second)

5.3.2. Latency

5.4. Read operation 3 (AGG_RANGE)

5.4.1. Throughput (DataPoint/second)

5.4.2. Latency

5.5. Read operation 4 (GROUP_BY)

5.5.1. Throughput (DataPoint/second)

5.5.2. Latency

5.6. Read operation 5 (LATEST_POINT)

5.6.1. Throughput (DataPoint/second)

5.6.2. Latency

6. Test scenario 3: Compared with InfluxDB

We compare TickTock with InfluxDB in read-write mixed mode. We use only the case of 5 clients and 5 devices; other scenarios are similar, so we skip those results.

6.1 Throughput comparison (the higher the better)

| (DataPoint/sec) | Write | Read 1 | Read 2 | Read 3 | Read 4 | Read 5 |
|---|---|---|---|---|---|---|
| InfluxDB | 1429.57 | 2.86 | 145.65 | 2.87 | 38.56 | 2.86 |
| TickTock | 21631.48 | 43.31 | 2203.79 | 43.37 | 551.05 | 43.21 |
| TickTock/InfluxDB | 15.13 | 15.14 | 15.13 | 15.11 | 14.29 | 15.10 |

The test indicates that TickTock consistently achieves about 15x higher throughput than InfluxDB, in writes and in all reads.
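The ratio row follows directly from the two throughput rows; a quick sanity check (the results match the table's ratio row to rounding):

```python
# Recompute the TickTock/InfluxDB ratio row from the two throughput rows
# of the table above (columns: Write, Read 1..Read 5).
influxdb = [1429.57, 2.86, 145.65, 2.87, 38.56, 2.86]
ticktock = [21631.48, 43.31, 2203.79, 43.37, 551.05, 43.21]
ratios = [round(t / i, 2) for t, i in zip(ticktock, influxdb)]
print(ratios)
```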

6.2 Latency comparison (the lower the better)

| Write latency (ms) | avg | min | p10 | p25 | median | p75 | p90 | p95 | p99 | p999 | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| InfluxDB | 160.37 | 19.51 | 75.35 | 100.43 | 135.46 | 182.13 | 240.99 | 293.6 | 777.67 | 1846.86 | 9224.01 |
| TickTock | 4.35 | 0.05 | 0.07 | 0.09 | 0.13 | 0.18 | 0.22 | 0.24 | 0.44 | 1602.04 | 14069.94 |
| InfluxDB/TickTock | 36.86 | 390.2 | 1076.43 | 1115.89 | 1042 | 1011.83 | 1095.41 | 1223.33 | 1767.43 | 1.15 | 0.66 |

| Read 1 latency (ms) | avg | min | p10 | p25 | median | p75 | p90 | p95 | p99 | p999 | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| InfluxDB | 111.95 | 10.58 | 45.06 | 63.25 | 90.69 | 136.63 | 191.93 | 234.39 | 383.2 | 1694.2 | 5996.93 |
| TickTock | 15.9 | 2.78 | 4.46 | 6.37 | 10.16 | 19.97 | 34.28 | 44.47 | 65.68 | 102.34 | 1874.64 |
| InfluxDB/TickTock | 7.04 | 3.81 | 10.10 | 9.93 | 8.93 | 6.84 | 5.60 | 5.27 | 5.83 | 16.55 | 3.20 |

| Read 2 latency (ms) | avg | min | p10 | p25 | median | p75 | p90 | p95 | p99 | p999 | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| InfluxDB | 136.33 | 15.07 | 57.76 | 79.78 | 117.95 | 167.69 | 223.22 | 226.21 | 424.42 | 1714.44 | 8672.53 |
| TickTock | 20.75 | 3.85 | 6.01 | 8.26 | 13.62 | 27.68 | 44.41 | 55.42 | 78.89 | 119.73 | 1877.42 |
| InfluxDB/TickTock | 6.57 | 3.91 | 9.61 | 9.66 | 8.66 | 6.06 | 5.03 | 4.08 | 5.38 | 14.32 | 4.62 |

| Read 3 latency (ms) | avg | min | p10 | p25 | median | p75 | p90 | p95 | p99 | p999 | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| InfluxDB | 121.49 | 11.62 | 48.54 | 67.52 | 98.6 | 149.29 | 210.18 | 256.79 | 421.59 | 1704.09 | 13753.23 |
| TickTock | 16.29 | 2.93 | 4.57 | 6.53 | 10.44 | 20.64 | 35.25 | 45.49 | 66.85 | 102.15 | 1865.23 |
| InfluxDB/TickTock | 7.46 | 3.97 | 10.62 | 10.34 | 9.44 | 7.23 | 5.96 | 5.64 | 6.31 | 16.68 | 7.37 |

| Read 4 latency (ms) | avg | min | p10 | p25 | median | p75 | p90 | p95 | p99 | p999 | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| InfluxDB | 140.39 | 14.6 | 56.48 | 78.97 | 120.19 | 174.91 | 235.33 | 281.79 | 456.09 | 1705.75 | 13099.8 |
| TickTock | 17.75 | 3.2 | 5.05 | 7.13 | 11.47 | 22.9 | 38.26 | 48.71 | 71.15 | 110.13 | 1873.88 |
| InfluxDB/TickTock | 7.91 | 4.56 | 11.18 | 11.08 | 10.48 | 7.64 | 6.15 | 5.79 | 6.41 | 15.49 | 6.99 |

| Read 5 latency (ms) | avg | min | p10 | p25 | median | p75 | p90 | p95 | p99 | p999 | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| InfluxDB | 435.77 | 19.37 | 176.43 | 251.84 | 370.98 | 535.64 | 778.53 | 953.86 | 1343.1 | 2485.26 | 24918.16 |
| TickTock | 16.2 | 2.85 | 4.53 | 6.48 | 10.36 | 20.56 | 35.07 | 45.33 | 66.6 | 101.95 | 1872.89 |
| InfluxDB/TickTock | 26.9 | 6.80 | 38.95 | 38.86 | 35.81 | 26.05 | 22.20 | 21.04 | 20.17 | 24.38 | 13.30 |

The tests indicate that TickTock's latency is much lower than InfluxDB's. TickTock's read latency is about 3.20x to 38.95x lower.

The write latency gap is even larger. However, it is worth noting that TickTock's TCP writes are asynchronous: the server responds after receiving the write data but before applying it. We suggest readers review the difference between TCP and HTTP writes in our document here. For a fully fair comparison we could have used HTTP for writes. But we argue that IoT devices (e.g., the PI-zero-W) are likely to have unstable network connections, so TCP writes should outperform HTTP writes and are the better default option; hence we selected TCP writes for this comparison. Notably, across all tests, with hundreds of millions of data points inserted, there was not a single failed operation or data point.
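To illustrate the difference between the two write paths, here is a sketch assuming TickTock's OpenTSDB-compatible interfaces: the TCP path sends plain-text put lines without waiting for a per-request response, while the HTTP path POSTs JSON to an OpenTSDB-style put endpoint and receives an HTTP status back. The metric and tag names are illustrative, not taken from the tests above.

```python
import json

# Sketch of the two write paths (OpenTSDB-compatible formats; metric
# "s_0" and tag key "device" are illustrative examples).

def tcp_put_line(metric, ts_sec, value, tags):
    # Plain-text line sent over a raw TCP socket. The client does not
    # wait for a per-request acknowledgement, which is why TCP writes
    # are asynchronous and their measured latency is so low.
    tag_str = " ".join(f"{k}={v}" for k, v in tags.items())
    return f"put {metric} {ts_sec} {value} {tag_str}\n"

def http_put_body(metric, ts_sec, value, tags):
    # JSON body for an OpenTSDB-style HTTP put request. The server
    # replies with an HTTP status, so the client learns whether the
    # write was accepted, at the cost of a round trip per request.
    return json.dumps({"metric": metric, "timestamp": ts_sec,
                       "value": value, "tags": tags})

line = tcp_put_line("s_0", 1600000000, 25.3, {"device": "d_0"})
body = http_put_body("s_0", 1600000000, 25.3, {"device": "d_0"})
print(line, body)
```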

We may provide HTTP write results in the future if we have the time and resources.

6.3 CPU and Memory comparison

The above figure shows the CPU and memory consumption while testing TickTock and InfluxDB. The TickTock test took only about 3.2 hours (11556 seconds) to finish 1M loops (inserting about 250M data points), while the InfluxDB test took about 2 days, because of their throughput difference.

Both TickTock and InfluxDB used almost 100% of the CPU. Note that TickTock uses less CPU.usr but more CPU.sys than InfluxDB, due to differences in their design and implementation.

TickTock used about 208MB of memory (the lowest MemAvailable was 304MB). InfluxDB used much more (the lowest MemAvailable was 55MB), and its memory usage kept growing until the test finished. We also observed that the benchmark's progress became slower and slower with InfluxDB; we didn't observe the same behavior with TickTock, even when running 10M loops (not shown in the above figure).

7. Conclusion

  • TickTock can run on a very tiny IoT device, the PI-zero-W, with an ARM 32-bit OS.
  • We evaluated TickTock in both write-only mode and read-write mixed (50%/50%) mode.
    • TickTock achieves over 58K data points/second write throughput in write-only mode, and 22K in read-write mixed mode.
    • Read and write latencies of TickTock are very low: P95 of writes in write-only mode and P99 in mixed mode complete in less than 1 millisecond.
  • We compared TickTock with InfluxDB in read-write mixed mode, with 5 clients.
    • TickTock achieves about 15x higher throughput in writes and all reads.
    • TickTock is also faster: its average write latency is 36x lower than InfluxDB's, and its average read latency is 6x to 26x lower.
    • Both TickTock and InfluxDB saturated the CPU, but TickTock consumed much less memory than InfluxDB.