ProvBench Traces - ashish-gehani/SPADE GitHub Wiki
The provenance traces below were collected from three applications (Apache, BLAST, PostMark), each run on three operating systems (Linux, Mac OS X, and Windows). Each trace was collected using SPADE with a Linux / Mac OS X / Windows reporter and H2 SQL storage.
For more information about the workloads, please see:
- Ashish Gehani and Dawood Tariq, Cross-Platform Provenance, 1st Provenance Benchmark Challenge (ProvBench) affiliated with the 16th International Conference on Extending Database Technology (EDBT) and 16th International Conference on Database Theory (ICDT), 2013. [PDF]
Each provenance trace is provided both as a compressed SQL script, which can be imported into most SQL databases, and as an H2 database.
| | Linux (Linux Audit) | Mac OS X (OpenBSM) | Windows (Process Monitor) |
|---|---|---|---|
| Apache | SQL script, H2 database | SQL script, H2 database | SQL script, H2 database |
| BLAST | SQL script, H2 database | SQL script, H2 database | SQL script, H2 database |
| PostMark | SQL script, H2 database | SQL script, H2 database | SQL script, H2 database |
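To import a provenance trace into MySQL, use the following command on Linux or Mac OS X, replacing <user> and <password> with your MySQL credentials. The script is a gzipped tar archive, so it is extracted to standard output with tar rather than gzip alone. There must be no space between -p and the password; otherwise mysql prompts for the password and treats the next argument as a database name:
tar -xzOf script.tar.gz | mysql -u <user> -p<password>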
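Before piping a trace into a database, it can help to look at the SQL text first. The following is a minimal sketch, assuming the download is a gzipped tar archive (as the `.tar.gz` name suggests) that contains the SQL script. The function name `trace_sql` is hypothetical and not part of SPADE.

```shell
#!/bin/sh
# Sketch only: stream the SQL text out of a gzipped tar archive.
# The function name and its single argument are hypothetical.
trace_sql() {
    archive="$1"
    # Fail early if the archive is missing or unreadable.
    [ -r "$archive" ] || { echo "cannot read $archive" >&2; return 1; }
    # -x extract, -z gunzip, -O write member contents to stdout, -f archive file.
    tar -xzOf "$archive"
}
```

For example, `trace_sql script.tar.gz | head -n 20` previews the start of the script, and the same output can be piped into `mysql` once it looks as expected.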
To browse the H2 SQL database without importing the SQL script into an existing database, follow the instructions in [Viewing provenance in a relational database](Viewing\ provenance\ in\ a\ relational\ database).
For Apache, the following commands were used to build the server on Linux and Mac OS X:
./configure
make
The following command was used on Windows:
nmake /f Makefile.win _apached
To install PostMark on Ubuntu, use:
apt-get install postmark
The following configuration settings were used when running the benchmark:
set number 500
set transactions 500
set size 500 10000
set read 512
set write 512
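The settings above can be kept in a command file so that a run is repeatable. This is a minimal sketch, not the procedure used to collect the traces. The function name and file name are hypothetical, and the trailing `run` and `quit` commands are an assumption about how the benchmark was started.

```shell
#!/bin/sh
# Sketch only: write the settings listed above to a PostMark command file.
# The function name and default file name are hypothetical. The "run" and
# "quit" lines are an assumption; the source lists only the "set" commands.
write_postmark_config() {
    config="${1:-postmark.conf}"
    cat > "$config" <<'EOF'
set number 500
set transactions 500
set size 500 10000
set read 512
set write 512
run
quit
EOF
}
```

Assuming PostMark accepts a command file as its first argument, a run would then look like `write_postmark_config postmark.conf && postmark postmark.conf`.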
For BLAST, the input data set was downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/INFLUENZA/influenza.faa and the following command was used:
makeblastdb -in influenza.faa -parse_seqids -hash_index -out outputdb
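The download and database build can be combined into one step. The sketch below assumes `curl` is available and was built with FTP support. The function name `build_influenza_db` and the `DRY_RUN` variable are hypothetical helpers for illustration; only the URL and the `makeblastdb` invocation come from the text above.

```shell
#!/bin/sh
# Sketch only: fetch the input file if it is absent, then build the database.
# build_influenza_db and DRY_RUN are hypothetical names, not part of BLAST.
build_influenza_db() {
    url="ftp://ftp.ncbi.nlm.nih.gov/genomes/INFLUENZA/influenza.faa"
    input="influenza.faa"
    # With DRY_RUN set, print each command instead of executing it.
    run() { if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi; }
    # Download only when the input file is not already in the current directory.
    [ -f "$input" ] || run curl -o "$input" "$url"
    run makeblastdb -in "$input" -parse_seqids -hash_index -out outputdb
}
```

Setting `DRY_RUN=1` before calling `build_influenza_db` prints the commands without running them, which is a cheap way to check the invocation before starting a traced run.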