# PQC Computational Performance Metrics

## Liboqs Computational Performance Metrics

The Liboqs performance tests collect detailed CPU and memory usage metrics for PQC digital signature and KEM algorithms. Using the Liboqs library, the automated testing tool performs each cryptographic operation and outputs the results, which are separated into two categories: CPU benchmarking and memory benchmarking.

### CPU Benchmarking

The CPU benchmarking results measure the execution time and efficiency of various cryptographic operations for each PQC algorithm.

Using the Liboqs benchmarking tools, each operation is run repeatedly within a fixed time window (3 seconds by default). The tool performs as many iterations as possible in that time frame and records detailed performance metrics.

The table below describes the metrics included in the CPU benchmarking results:

| Metric | Description |
|--------|-------------|
| Iterations | Number of times the operation was executed during the test window. |
| Total Time (s) | Total duration of the test run (typically fixed at 3 seconds). |
| Time (us): mean | Average time per operation, in microseconds. |
| pop. stdev | Population standard deviation of the operation time, indicating variance. |
| CPU cycles: mean | Average number of CPU cycles required per operation. |
| pop. stdev (cycles) | Standard deviation of CPU cycles per operation, indicating consistency. |
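
For illustration, the following minimal Python sketch shows one way the underlying liboqs speed binaries could be driven. The binary path is an assumption about a typical liboqs build tree, the `--duration` flag mirrors the fixed time window described above, and algorithm names vary with the liboqs version; the project's own benchmarking script handles all of this automatically.

```python
import subprocess

# Assumed location of the liboqs speed binaries in a local build tree;
# adjust to match your own build. full-liboqs-test.sh wraps this step,
# so this snippet is purely illustrative.
SPEED_KEM = "liboqs/build/tests/speed_kem"

def run_speed_test(binary: str, algorithm: str, duration_s: int = 3) -> str:
    """Run one fixed-window speed benchmark and return its raw output,
    which contains the iterations, mean time, and CPU-cycle columns
    described in the table above."""
    result = subprocess.run(
        [binary, "--duration", str(duration_s), algorithm],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Algorithm name depends on the liboqs version in use (assumed here).
print(run_speed_test(SPEED_KEM, "ML-KEM-768"))
```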

### Memory Benchmarking

The memory benchmarking tool evaluates how much memory individual PQC cryptographic operations consume when executed on the system. This is accomplished by running the Liboqs test-kem-mem and test-sig-mem tools for each PQC algorithm and its respective operations under the Valgrind Massif profiler. Each operation is performed once per profiling run to gather its peak memory usage, and the test can be repeated across multiple runs to check consistency.
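
A minimal sketch of this profiling step, assuming the test-kem-mem binary sits in the current directory and takes the algorithm plus an operation selector as arguments (the argument convention here is an assumption; the real binaries bundled with this project may differ). The Valgrind flags shown are standard Massif options:

```python
import subprocess

# Standard Valgrind Massif options; --stacks=yes is needed for the
# maxStack metric, since Massif does not profile the stack by default.
VALGRIND = [
    "valgrind",
    "--tool=massif",
    "--stacks=yes",
    "--massif-out-file=massif-out.txt",
]

def profile_operation(binary: str, *args: str) -> None:
    """Run one cryptographic operation once under Massif, producing a
    raw massif-out.txt file for later parsing."""
    subprocess.run(VALGRIND + [binary, *args], check=True)

# Hypothetical invocation; the operation-selector argument is assumed.
profile_operation("./test-kem-mem", "ML-KEM-768", "0")
```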

The following table describes the memory-related metrics produced once the result-parsing process has completed:

| Metric | Description |
|--------|-------------|
| inits | Number of memory snapshots (or samples) collected by Valgrind during profiling. |
| maxBytes | Peak total memory usage across all memory segments (heap + stack + others). |
| maxHeap | Maximum memory allocated on the heap during the execution of the operation. |
| extHeap | Heap memory allocated externally (e.g., through system libraries). |
| maxStack | Maximum stack memory usage recorded during the test. |
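
To make these definitions concrete, here is a simplified Python sketch that extracts the metrics from a raw Massif output file. The mem_heap_B, mem_heap_extra_B, and mem_stacks_B keys are standard Massif output fields, but note that the project's actual parsing script may compute the values differently (for instance, reading all fields at the single peak snapshot rather than taking per-field maxima):

```python
import re

FIELD = re.compile(r"(mem_heap_B|mem_heap_extra_B|mem_stacks_B)=(\d+)")

def parse_massif(path: str) -> dict:
    """Reduce a raw Massif output file to the metrics listed above."""
    snapshots = []  # one dict of memory fields per Massif snapshot
    with open(path) as fh:
        for line in fh:
            if line.startswith("snapshot="):
                snapshots.append({})
            else:
                m = FIELD.match(line)
                if m and snapshots:
                    snapshots[-1][m.group(1)] = int(m.group(2))

    heap = [s.get("mem_heap_B", 0) for s in snapshots]
    ext = [s.get("mem_heap_extra_B", 0) for s in snapshots]
    stack = [s.get("mem_stacks_B", 0) for s in snapshots]
    total = [h + e + st for h, e, st in zip(heap, ext, stack)]

    return {
        "inits": len(snapshots),
        "maxBytes": max(total, default=0),
        "maxHeap": max(heap, default=0),
        "extHeap": max(ext, default=0),
        "maxStack": max(stack, default=0),
    }

print(parse_massif("massif-out.txt"))
```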

## Liboqs Result Data Storage Structure

All performance data is initially stored as un-parsed output when using the Liboqs benchmarking script (full-liboqs-test.sh). This raw data is then processed using the Python parsing script to generate structured CSV files for analysis, including averages across test runs.
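
As a sketch of the averaging step, the snippet below combines one metric across several per-run CSVs. The column names ("Algorithm", the metric name) and file names are illustrative assumptions, not the parsing script's actual schema:

```python
import csv
from collections import defaultdict
from statistics import mean

def average_metric(csv_paths: list[str], metric: str) -> dict[str, float]:
    """Average one numeric column per algorithm across several run CSVs."""
    values = defaultdict(list)
    for path in csv_paths:
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                values[row["Algorithm"]].append(float(row[metric]))
    return {alg: mean(vals) for alg, vals in values.items()}

# e.g. average peak memory over three runs (file names are hypothetical)
print(average_metric(["run-1.csv", "run-2.csv", "run-3.csv"], "maxBytes"))
```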

The table below outlines where this data is stored and how it's organised in the project's directory structure:

| Data Type | State | Description | Location |
|-----------|-------|-------------|----------|
| CPU Speed | Un-parsed | Raw .csv output directly from the speed_kem and speed_sig binaries. | test-data/up-results/liboqs/machine-X/raw-speed-results/ |
| CPU Speed | Parsed | Cleaned CSV files with per-algorithm speed metrics and averages. | test-data/results/liboqs/machine-X/speed-results/ |
| Memory Usage | Un-parsed | Raw .txt output from Valgrind Massif profiling of digital signature and KEM operations using the Liboqs test-kem-mem and test-sig-mem binaries. | test-data/up-results/liboqs/machine-X/mem-results/ |
| Memory Usage | Parsed | CSV summaries of peak memory usage for each algorithm-operation pair. | test-data/results/liboqs/machine-X/mem-results/ |
| Performance Averages | Parsed | Average results for the performance metrics across test runs. | Alongside the parsed CSV files in test-data/results/liboqs/machine-X/ |
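
Taken together, the locations above form a tree like the following (annotations paraphrase the table):

```
test-data/
├── up-results/liboqs/machine-X/
│   ├── raw-speed-results/   # raw speed_kem / speed_sig .csv output
│   └── mem-results/         # raw Valgrind Massif .txt output
└── results/liboqs/machine-X/  # performance averages sit alongside the parsed CSVs below
    ├── speed-results/         # parsed CPU speed CSVs
    └── mem-results/           # parsed peak-memory CSVs
```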