Instructions for using a database in a WRES standalone execution - NOAA-OWP/wres GitHub Wiki

To allow for evaluations larger than memory, the WRES supports the use of a relational database to store ingested data and facilitate pairing predictions with observations. This wiki explains how to direct the WRES to use a database established in a database server that has already been setup. The database server must employ SQL. While the WRES has been tested with both Postgres and H2 databases, the instructions below use Postgres as an example. However, the instructions should be applicable to any SQL-based relational database.

How do I setup the database for use?

To begin, make sure that the database server can allow a sufficient number of connections to support the WRES; 30 (TODO: DOUBLE CHECK!!!) should be the minimum for a single instance of the WRES running with default settings. For Postgres, it is the @max_connections@, specified in the Postgres configuration file, that must be set to at least that number.

Also, it is recommended, not required, that the database server is setup for TLS/SSL. If that is done, then, the WRES must be provided the correct certificate authority .pem file to trust that can be used to authenticate the server. In Postgres, there are various configuration options involved with setting up TLS/SSL. The configuration option that identifies the CA .pem is ssl_ca_file. However, other options determine if SSL is employed (e.g., ssl = on), what ciphers are supported (ssl_ciphers), the minimum TLS protocol version (ssl_min_protocol_version; WRES uses TLSv1.2), and other aspects of TLS/SSL that can impact the ability of the WRES to connect.

The general steps to setup a database within an identified server are as follows:

  1. Create a user that will "own" the database.
  2. Create the database within the identified server owned by that user.
  3. Create the "wres" schema within that database with the user authorized.
  4. Establish the user's password and a reasonably high connection limit.

For example, using Postgres, the following commands should be executed to create the user wres_user8 and the database wres8:

postgres=# CREATE USER wres_user8;
CREATE ROLE
postgres=# create database wres8 owner wres_user8;
CREATE DATABASE
postgres=# \c wres8 
postgres=#  create schema wres authorization wres_user8;
CREATE SCHEMA
postgres=# ALTER USER wres_user8 WITH PASSWORD '[a suitably long password]';
ALTER ROLE
postgres=# ALTER ROLE wres_user8 CONNECTION LIMIT 500;

How do I direct the WRES to use that database?

The WRES must be provided the necessary information to connect to a database through these system properties:

  • -Dwres.useDatabase=true: true if a database is to be used. By default, this is false indicating an in-memory execution.
  • -Dwres.databaseHost=[hostname of database sever]: Specifies the hostname of the database server including the domain.
  • -Dwres.databaseName=[database name]: The name of the database within the database server.
  • -Dwres.username=[database user name]: The name of the user that "owns" the database (see instructions, above).
  • -Dwres.certificateFileToTrust=[CA .pem file]: The name of the CA .pem file to employ in authenticating the database server. If not specified, no TLS/SSL is performed when communicating with the database server.
  • -Dwres.password=[database password]: The password to use when accessing the database.

NOTE: Alternatively to specifying the property wres.password, if a user is employing a Postgres database, the user's .pgpass file in their home directory can be used to indicate the user password. For more information, see https://www.postgresql.org/docs/current/libpq-pgpass.html.

Where do I specify those properties?

The recommended approach to providing the above system properties to the WRES is by specifying JAVA_OPTS directly in the command line before executing bin/wres, as explained in this section of Obtaining and using the WRES as a standalone application. For example,

JAVA_OPTS="-Dwres.useDatabase=true -Dwres.databaseHost=[hostname] -Dwres.databaseName=[db name] -Dwres.username=[user] -Dwres.certificateFileToTrust=[.pem file] -Dwres.password=[db password]" ./bin/wres [declaration .yml file]

However, if you are going to run evaluations repeatedly and would rather not specify the above options with each execution, then there are two alternatives, each of which involves editing a delivered file (meaning that, when a new release of the WRES is obtained, these files will need to be modified again):

  • Edit the delivered bin/wres (or, in Windows, bin/wres.bat) script to include the value of JAVA_OPTS shown above, but exported near the top of the script. For example:

export JAVA_OPTS="$JAVA_OPTS -Dwres.useDatabase=true -Dwres.databaseHost=[hostname] -Dwres.databaseName=[db name] -Dwres.username=[user] -Dwres.certificateFileToTrust=[.pem file] -Dwres.password=[db password]"

  • Edit the file, lib/conf/wresconfig.xml to indicate the system properties within the database element. The default options are shown here:
    <database>
        <!-- When a jdbcUrl is specified, it overrides host, type, name, port.
             Use either jdbcUrl or host. If you're using url, change to host
             instead. To try h2 in-memory db, uncomment the following line. -->
        <!--  <jdbcUrl>jdbc:h2:mem:test;MODE=PostgreSQL;TRACE_LEVEL_FILE=4;DB_CLOSE_DELAY=-1;INIT=create schema if not exists wres\;</jdbcUrl> -->
        <host>localhost</host>
        <username>wres_user</username>
        <name>wres</name>
        <port>5432</port>
        <database_type>postgresql</database_type>
        <max_idle_time>80</max_idle_time>
        <!-- Important: the sum of the max_pool_size settings of WRES instances
                 that connect to a shared database instance must be equal to or
                 less than the database instance's maximum connection settings 
                 else the WRES instances will see gnarly-looking exceptions 
                 such as "org.postgresql.util.PSQLException: FATAL: remaining 
                 connection slots are reserved for non-replication superuser 
                 connections" -->
        <max_pool_size>13</max_pool_size>
        <use_ssl>true</use_ssl>
        <certificate_file_to_trust>classpath:DODSWCA_60.pem</certificate_file_to_trust>
    </database>

Just modify the host, username, name (database name)`, or other arguments appropriately.

What happens when the WRES uses that database for the first time?

Through Liquibase, with the first evaluation using the WRES, the database will be setup for use with all necessary tables created and other aspects initialized. That first evaluation should, therefore, be expected to be slower than later evaluations that employ the same database.

What if I want to clean the database?

The data stored in the database will not be removed when an evaluation completes. Rather, the data is kept in the database, which will build up over time. That could cause the database to become less performant or even grow larger than the diskspace allocated. Therefore, to facilitate removing all data from earlier evaluations, the WRES should be executed with the cleandatabase option. For example,

bin/wres cleandatabase

where the bin/wres script has been updated to point to the database (see above). In Windows, be sure to use bin/wres.bat.

⚠️ **GitHub.com Fallback** ⚠️