
Performance Data Parsing Scripts Overview

The project includes several Python scripts that handle automatic parsing of benchmarking results:

  • parse_results.py
  • performance_data_parse.py
  • tls_performance_data_parse.py
  • results_averager.py

These scripts support automated invocation (triggered by the automated test scripts) and manual execution via terminal input or command-line flags. Parsing is currently supported only on Linux systems; Windows environments are not supported because the environment required to parse the raw performance results cannot be created on Windows.

By default, parsing is triggered automatically at the end of each test run. The test scripts directly pass the necessary parameters (Machine-ID, number of runs, and test type) to the parsing system.

While several scripts are utilised in the result parsing process, only parse_results.py is intended to be run directly. The main parsing script calls the remaining scripts depending on the parameters supplied by the user. The main parsing script is stored in the scripts/parsing_scripts directory, whilst the internal scripts are stored in the scripts/parsing_scripts/internal_scripts directory.
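
Based on the paths described above, the script layout is as follows:

scripts/
└── parsing_scripts/
    ├── parse_results.py
    └── internal_scripts/
        ├── performance_data_parse.py
        ├── tls_performance_data_parse.py
        └── results_averager.py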

For full documentation on how the parsing system works, including usage instructions and a breakdown of the performance metrics collected, please refer to the following documentation:

parse_results.py

This script acts as the main controller for the result-parsing processes. It supports two modes of operation:

  • Interactive Mode: Prompts the user to select a result type (computational performance, TLS performance, or both) and to enter parsing parameters such as Machine-ID and number of test runs.
  • Command-Line Mode: Accepts the same parameters via command-line flags. The automated test scripts use this mode, and it can also be invoked manually for scripting purposes.

In both modes, the script identifies the relevant raw test results in the test_data/up_results directory and invokes the appropriate parsing routines to generate structured CSV output. The results are then saved to the test_data/results directory, organised by test type and Machine-ID.
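
As a rough illustration of the path handling described above, the sketch below shows how the input and output locations could be resolved. Only the test_data/up_results and test_data/results paths come from this page; the function name and the per-machine subfolder naming are assumptions, not the script's actual internals.

from pathlib import Path

# Illustrative sketch only: resolve the input/output locations described above.
# The per-machine subfolder name ("machine-<id>") is an assumption; the real
# parse_results.py may organise its output differently.
def resolve_result_paths(parse_mode: str, machine_id: int, root: Path = Path(".")):
    if parse_mode not in ("computational", "tls"):
        raise ValueError("parse_mode must be 'computational' or 'tls'")
    raw_dir = root / "test_data" / "up_results"                    # raw test results
    out_dir = root / "test_data" / "results" / parse_mode / f"machine-{machine_id}"
    if not raw_dir.is_dir():
        raise FileNotFoundError(f"No raw results found under {raw_dir}")
    out_dir.mkdir(parents=True, exist_ok=True)                     # parsed CSV output
    return raw_dir, out_dir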

Usage Examples:

Interactive Mode:

python3 parse_results.py

Command-Line Mode:

python3 parse_results.py --parse-mode=computational --machine-id=2 --total-runs=10

The table below outlines each of the accepted command-line arguments:

| Argument | Description | Required |
| --- | --- | --- |
| --parse-mode=<str> | Type of results to parse; must be computational or tls. Parsing both in one call is not supported (see note below). | Yes |
| --machine-id=<int> | Machine-ID used during testing (positive integer). | Yes |
| --total-runs=<int> | Number of test runs (must be greater than 0). | Yes |
| --replace-old-results | Forces overwriting of any existing parsed results for the specified Machine-ID. | No |

Note: The command-line mode does not support parsing both result types in one call. Use interactive mode to combine the parsing of computational performance and TLS performance data in a single session.
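
For example, re-parsing TLS results for Machine-ID 1 across 5 test runs while overwriting any previously parsed output:

python3 parse_results.py --parse-mode=tls --machine-id=1 --total-runs=5 --replace-old-results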

performance_data_parse.py

This script contains functions for parsing raw computational benchmarking data, transforming unstructured speed and memory test data into clean, structured CSV files. It processes CPU performance results and memory usage metrics gathered from Liboqs for each algorithm and operation across multiple test runs and machines. This script is not to be called manually and is only invoked by the parse_results.py script.
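
As a rough sketch of the kind of transformation performed here (not the script's actual code), per-run measurements can be written out as structured CSV rows. The column names and the (algorithm, operation, value) input shape below are assumptions; the real Liboqs raw output format is not reproduced here.

import csv
from pathlib import Path

# Illustrative sketch: write one structured CSV of speed results for a single
# test run. Column names and the shape of the input rows are assumptions.
def write_speed_csv(rows, out_file: Path) -> None:
    """rows: iterable of (algorithm, operation, mean_time_us) tuples."""
    with out_file.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Algorithm", "Operation", "Mean time (us)"])
        for algorithm, operation, mean_time_us in rows:
            writer.writerow([algorithm, operation, mean_time_us])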

tls_performance_data_parse.py

This script processes TLS performance data collected from handshake and OpenSSL speed benchmarking using PQC, Hybrid-PQC, and classical algorithms. It extracts timing and cycle count metrics from both TLS communication and cryptographic operations, outputting the results into clean CSV files for analysis. This script is not to be called manually and is only invoked by the parse_results.py script.
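
For illustration only, extracting a timing metric from a raw text log generally follows the pattern sketched below. The log line format shown is hypothetical and does not reflect the actual handshake or OpenSSL speed output that this script consumes.

import re

# Hypothetical log line format, e.g. "handshake: <algorithm> <duration> ms",
# used only to illustrate the extraction step; the real raw files differ.
HANDSHAKE_LINE = re.compile(r"handshake:\s+(?P<alg>\S+)\s+(?P<ms>[\d.]+)\s+ms")

def extract_handshake_times(log_text: str) -> dict:
    """Collect every handshake duration (in ms) per algorithm from a log string."""
    times = {}
    for match in HANDSHAKE_LINE.finditer(log_text):
        times.setdefault(match.group("alg"), []).append(float(match.group("ms")))
    return times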

results_averager.py

This script provides the internal classes used to generate average benchmarking results across multiple test runs for the two testing types. It is used by both performance_data_parse.py and tls_performance_data_parse.py to generate per-algorithm averages across multiple test runs. For computational performance tests, it handles the collection of CPU speed and memory profiling metrics collected using Liboqs. For TLS performance tests, it calculates average handshake durations and cryptographic operation timings gathered from OpenSSL and OQS-Provider. This script is not to be called manually and is only executed internally by the result parsing scripts.
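
The averaging step itself reduces to grouping per-run values by algorithm and taking the mean, roughly as sketched below; the class and method names are illustrative and are not those used in results_averager.py.

from collections import defaultdict
from statistics import mean

# Illustrative sketch of per-algorithm averaging across test runs; the real
# results_averager.py classes track many more metrics per testing type.
class AverageCalculator:
    def __init__(self):
        self._values = defaultdict(list)   # algorithm -> list of per-run values

    def add_run_value(self, algorithm: str, value: float) -> None:
        self._values[algorithm].append(value)

    def averages(self) -> dict:
        return {alg: mean(vals) for alg, vals in self._values.items()}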
