Download and Start PDQ - ProofDrivenQuerying/pdq GitHub Wiki

This guide explains step-by-step how to download and run PDQ.

First of all, remember that PDQ is composed of 8 sub-projects that can be used independently.

Downloading

Easy!

Go to our Releases page on GitHub and download the JAR file.

You can download the following executable jar files that contain all dependencies:

  • pdq-planner-x.y.z-executable.jar
  • pdq-reasoning-x.y.z-executable.jar
  • pdq-runtime-x.y.z-executable.jar
  • pdq-main-x.y.z-executable.jar

We recommend downloading the full project (pdq-main-x.y.z-executable.jar), which packages the whole of PDQ and lets you run each component.

You can also download the following libraries if you wish to include PDQ functionality in your project:

  • pdq-common-x.y.z.jar
  • pdq-cost-x.y.z.jar
  • pdq-datasources-x.y.z.jar
  • pdq-planner-x.y.z.jar
  • pdq-reasoning-x.y.z.jar
  • pdq-runtime-x.y.z.jar

Database set-up

The only additional set-up after downloading the software is needed when you use PDQ with a dataset large enough to require external storage. Internally, PDQ uses a database to store the results of its reasoning process. By default PDQ keeps all data in memory; if you use PDQ only for planning, the in-memory DBMS will probably suffice. Alternatively, PDQ can use Postgres as an external DBMS, which you will probably want when reasoning over large amounts of data.

Whether the in-memory database is used is controlled by the parameter useInternalDatabaseManager, which defaults to true (i.e. the in-memory database is used by default). See https://github.com/ProofDrivenQuerying/pdq/wiki/Parameter-files

If you do want the external database, you will need to set it up in Postgres. PDQ works with all supported Postgres versions (9.5, 9.6, 10, 11, 12), and we currently test PDQ against Postgres 9.5 and 12. Postgres 11 or newer is required if you are installing on Windows 10. You can use these instructions for getting Postgres for your system.

The database can be created by running the following steps in a shell (Win32 console, Terminal (macOS), Linux console, ...):

  1. Set the environment variable PGDATA to the path where the data should be stored, for instance pdq (the exact command depends on your OS)
  2. Run initdb -U postgres -A trust to initialize the DB
  3. Run pg_ctl -l logfile start to start a PostgreSQL server (if it is not already running)
  4. Run createdb -U postgres pdq
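
The four steps above can be collected into a small script. This is only a sketch for a POSIX shell: the run helper and the DRY_RUN switch are illustrative additions, not part of PDQ or Postgres, and the data directory defaults to a pdq folder under the current directory purely as an example. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the database set-up steps. DRY_RUN and run() are hypothetical
# helpers: with DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_pdq_db() {
    # Step 1: where Postgres stores its data (example location, pick your own).
    PGDATA="${PGDATA:-$PWD/pdq}"
    export PGDATA
    # Step 2: initialize the database cluster.
    run initdb -U postgres -A trust
    # Step 3: start the server (skip if one is already running).
    run pg_ctl -l logfile start
    # Step 4: create the database named pdq.
    run createdb -U postgres pdq
}

setup_pdq_db
```

On Windows the same commands apply, but the variable is set with the console's own syntax rather than export.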

Running

You can access PDQ functionality in two different ways: via the command line or via the GUI.

Note that in the following we refer to the JAR of the full project (pdq-main-x.y.z-executable.jar), although each command can also be executed with the JAR of the specific sub-project it refers to.

Accessing any PDQ functionality via command-line

To run via command-line, open your favourite shell and run the following command:

java -jar pdq-main-x.y.z-executable.jar [<MODULE> <OPTIONS>]

Note that if you do not specify a <MODULE>, the GUI will start.

Where <MODULE> is one of:

  • cost
    • To estimate the cost
  • planner
    • To perform the planning
  • runtime
    • To run the plan
  • regression
    • To perform the planning and/or run the plan
  • reasoning
    • To perform the chase
  • gui
    • To run the GUI

The following <OPTIONS> are common to every <MODULE>:

| Option | Required | Description |
| --- | --- | --- |
| -h, --help | False | Displays a help message (lists all parameters available to a given <MODULE>). |
| -D | False | Dynamic parameters. Override values defined in the configuration files. |

The following <OPTIONS> are common to every <MODULE> except regression:

| Option | Required | Description |
| --- | --- | --- |
| -c, --config | False | Directory where to look for configuration files (default is the current directory). |
| -v, --verbose | False | Activates verbose mode. |
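
As an illustration of the common options, the sketch below wraps the java -jar call in a small shell function. The pdq function, the ./config directory and the schema.xml and query.xml file names are hypothetical. The -Dname=value form for dynamic parameters is an assumption rather than something this page confirms, as is the ability to override useInternalDatabaseManager this way; check the -h output for the exact syntax. The function prints the command rather than running it, so it is safe to try without the JAR present.

```shell
PDQ_JAR="pdq-main-x.y.z-executable.jar"   # substitute the real version

# Hypothetical wrapper: prints the command instead of running it.
# Drop the leading "echo" to execute.
pdq() {
    echo java -jar "$PDQ_JAR" "$@"
}

# List the parameters the planner module accepts.
pdq planner -h

# Verbose planner run that reads configuration files from ./config and
# overrides one parameter (the -Dname=value syntax is an assumption).
pdq planner -v -c ./config -DuseInternalDatabaseManager=false -s schema.xml -q query.xml
```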

The remaining <OPTIONS> are specific to each <MODULE>:

Cost

| Option | Required | Description |
| --- | --- | --- |
| -s, --schema | True | Path to the input schema definition file. |
| -p, --plan | True | Path to the input plan definition file. |

The output cost will be sent to the console and to any other logging destinations that are enabled.

Planner

| Option | Required | Description |
| --- | --- | --- |
| -s, --schema | True | Path to the input schema definition file. |
| -q, --query | True | Path to the input query definition file. |
| -o, --output | False | Path to the file name to store the output plan. |

Runtime

| Option | Required | Description |
| --- | --- | --- |
| -s, --schema | True | Path to the input schema definition file. |
| -p, --plan | True | The plan (in XML) to execute. |
| -a, --accesses | True | Directory where to look for executable access method descriptor XML files. |
| -o, --output | False | Path to the output CSV file. |
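
The cost, planner and runtime modules described so far can be chained: plan a query, estimate the cost of the resulting plan, then execute it. The sketch below assumes that the plan file written by the planner's -o option is accepted as the -p input of cost and runtime. All file and directory names are hypothetical placeholders, and the pdq wrapper only prints each command.

```shell
PDQ_JAR="pdq-main-x.y.z-executable.jar"   # substitute the real version

# Hypothetical wrapper: prints the command instead of running it.
# Drop the leading "echo" to execute.
pdq() {
    echo java -jar "$PDQ_JAR" "$@"
}

# Placeholder input and output names, not files shipped with PDQ.
SCHEMA=schema.xml
QUERY=query.xml
PLAN=plan.xml

# 1. Find a plan for the query and store it.
# 2. Estimate the cost of the stored plan.
# 3. Execute the plan and write the results as CSV.
pdq planner -s "$SCHEMA" -q "$QUERY" -o "$PLAN" &&
pdq cost    -s "$SCHEMA" -p "$PLAN" &&
pdq runtime -s "$SCHEMA" -p "$PLAN" -a ./accesses -o results.csv
```

Joining the steps with && stops the chain as soon as one module exits with an error, so the runtime never runs against a plan that was not produced.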

Reasoning

| Option | Required | Description |
| --- | --- | --- |
| -s, --schema | True | Path to the input schema definition file. |
| -f, --facts | False | Path to the folder containing files with data for the given relation (the files must be named "[RelationName].csv"). |
| -q, --query | False | Path to the input query definition file. |
| -o, --output | False | Path to the file name to store the reasoner output (chase or certain answers). |

Note that you must include either facts or a query.
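
The facts-or-query rule can be checked before Java is even started. The helper below is a sketch: pdq_reason is a hypothetical function, the file names are placeholders, and it reads the rule as "at least one of the two", which is an interpretation of the note above. It prints the resulting command rather than running it.

```shell
PDQ_JAR="pdq-main-x.y.z-executable.jar"   # substitute the real version

# Hypothetical helper: builds and prints the reasoning command.
# Usage: pdq_reason SCHEMA FACTS_DIR QUERY OUTPUT
# Pass "" for FACTS_DIR, QUERY or OUTPUT to leave that option out.
pdq_reason() {
    schema=$1
    facts=$2
    query=$3
    output=$4
    # Refuse to build a command that has neither facts nor a query.
    if [ -z "$facts" ] && [ -z "$query" ]; then
        echo "pdq_reason: give a facts directory, a query, or both" >&2
        return 1
    fi
    set -- reasoning -s "$schema"
    if [ -n "$facts" ]; then set -- "$@" -f "$facts"; fi
    if [ -n "$query" ]; then set -- "$@" -q "$query"; fi
    if [ -n "$output" ]; then set -- "$@" -o "$output"; fi
    # Drop the leading "echo" to execute.
    echo java -jar "$PDQ_JAR" "$@"
}

# Reason over the CSV files in ./facts (one "[RelationName].csv" per relation).
pdq_reason schema.xml ./facts "" reasoner-output.txt

# Reason with a query only.
pdq_reason schema.xml "" query.xml reasoner-output.txt
```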

Regression

| Option | Required | Description |
| --- | --- | --- |
| -i, --input | True | Path to the regression test case directories. |
| -m, --mode | True | Run the planner, the runtime or an end-to-end test. |

Regression has no command-line parameter for the output. Any generated outputs are written under predefined names (e.g. generated-plan.xml) in the same directory as the inputs.

The format of the input files mentioned above is described here, while the output file format is described here

Running via the GUIs

Check the specific page about the GUIs.
