WRES Local Server - NOAA-OWP/wres GitHub Wiki

About

The WRES evaluation engine can be run as a standalone, command-line application, as described in the top-level wiki, or as a local-server: a long-running instance with which the user interacts via web-service requests. A WRES web-service employs instances of the local-server, referred to as worker-servers, to process the evaluations that users request. Hence, understanding the local-server is important for understanding and maintaining a web-service instance.

This wiki describes how to start the local-server instance and how to interact with that instance.

Architecture

Design decisions:

Developers and users should be aware of the following design decisions, which were made while implementing this server mode:

  • Worker servers handle jobs synchronously.
  • The standalone server uses a cache to store the output of, at most, 100 jobs.
  • The WRES implementation of the server does not store output paths. After they are returned by the request, the server "forgets" about the evaluation output.
  • A WRES web-service implementation requires that the worker-server be started on port 8010 for Docker health checks.
  • Since the current implementation of the server handles jobs synchronously, we use atomic variables to track status and current evaluation ID.
  • If a job is not closed in the WRES implementation then the local-server will hang for a maximum of 5 minutes until the stale job thread closes the opened job.

Evaluation Flow for a WRES web-service implementation:

A web-service instance employs the following work-flow for an evaluation:

Client                                            Server

Heartbeats the server before sending request ---------->
                                                       |
                                                       |   Server sets the current status to AWAITING
                                                       v
<----------------------- Returns good response to client
|
|
v    
Requests that an evaluation be started ---------------->
                                                       |   Server creates the eval ID, starts redirecting standard streams, and starts an eval 
                                                       |   Server kicks off a stale job thread to close completed jobs (checks every 5 minutes)
                                                       v
<----------------- evaluation job created, the id is foo



give me the standard out stream for job id, foo ------->
                                                       |   Server sends the ChunkedOutput object to which this stream has been redirected
                                                       |   The thread continues until the atomic status is CLOSED 
                                                       |   Retrievable after the job finishes as well
                                                       v
<------------ here is the standard stream for job id foo 



give me the standard error stream for job id, foo------>
                                                       |   Server sends the ChunkedOutput object to which this stream has been redirected
                                                       |   The thread continues until the atomic status is CLOSED 
                                                       |   Retrievable after the job finishes as well
                                                       v
<------ here is the standard error stream for job id foo



Get evaluation with this ID --------------------------->
                                                       |  Server returns the Future evaluation response. Upon finishing sets status to COMPLETED
                                                       |  When the eval completes returns the paths the output was written to
                                                       v
<-- here are the output paths created by this evaluation
|
|
v  
please close the evaluation job with id foo------------>
                                                       |
                                                       |   Server sets status to CLOSED and we reset the standard streams to their original source
                                                       v
<------------------- evaluation job foo has been closed.
                                                       |
                                                       |   IF the server does not receive a close request, it will be unavailable till the stale job thread closes it
                                                       v
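
The client side of this flow can be sketched as a short shell function that uses only the endpoints documented later on this page. This is a minimal sketch, not the web-service's own client: the function name wres_run and the CURL override variable are hypothetical, --data-binary is swapped in for -d "$(cat ...)", and the heartbeat and stream requests are omitted because they are not needed to obtain the output paths.

```shell
# Hypothetical helper: run one evaluation against a local-server, end to end.
# Usage: wres_run PORT DECLARATION_FILE
# Set CURL to a stub command for a dry run; it defaults to curl.
wres_run() {
    base="localhost:$1/evaluation"

    # Close any previous job so that the server is ready for a new one.
    ${CURL:-curl} -s -X POST "$base/close" || return 1

    # Start the evaluation; the response body is taken to be the evaluation ID.
    id=$(${CURL:-curl} -s -X POST --data-binary "@$2" \
        -H "Content-Type:text/xml" "$base/startEvaluation" | tr -d '[:space:]')
    [ -n "$id" ] || return 1

    # Ask for the output paths of the evaluation.
    ${CURL:-curl} -s -X GET "$base/getEvaluation/$id"

    # Close the job so that the server does not wait for the stale job thread.
    ${CURL:-curl} -s -X POST "$base/close"
}
```

For example, wres_run 8010 Example_Meal_for_6hour_AHPS_vs_USGS.yml would close any earlier job, start the evaluation, print the output paths, and close the job again.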

How do I start my own standalone server?

To run a local-server, do the following:

  1. Ensure that your machine has at least Java JDK 17 installed. The WRES local server must be run with at least that version of Java.

  2. Obtain the WRES release package .zip file. Release packages are made available via GitHub at https://github.com/NOAA-OWP/wres/releases. Download the latest core zip from the assets of the most recent release; it should be named wres-DATE-VERSION.zip.

  3. Unzip the release package. It is a .zip file, so use the appropriate tool for your system.

  4. Start the server. From the directory created by unzipping the package, cd into bin and run this command:

./wres server [PORT]

where [PORT] is the number of the port through which you will be able to communicate with the server. For example, to use port 8010, run the following:

./wres server 8010
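
Because the server requires Java 17 or later, it can be convenient to check the installed version before starting it. The following is a minimal sketch, assuming that java -version prints a line containing version "X.Y.Z", as OpenJDK and Oracle builds typically do. The function names java_major_version and require_java_17 are hypothetical.

```shell
# Hypothetical helper: print the major Java version found in `java -version` output on stdin.
java_major_version() {
    # Take the quoted version string from the first matching line, e.g. 17.0.6 or 1.8.0_292.
    v=$(sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    # Legacy scheme: 1.8.0_292 means Java 8, so drop the leading "1.".
    case $v in
        1.*) v=${v#1.} ;;
    esac
    # Keep only the digits before the first ".", "_" or "-".
    printf '%s\n' "${v%%[._-]*}"
}

# Hypothetical helper: succeed only if the java on the PATH is version 17 or later.
require_java_17() {
    # java -version writes to standard error, hence the redirection.
    major=$(java -version 2>&1 | java_major_version)
    [ -n "$major" ] && [ "$major" -ge 17 ]
}
```

With these definitions, require_java_17 && ./wres server 8010 starts the server only when the version check passes.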

Additional server parameters

If you want this local-server to attach to a database or you need a local cache (required to download files if you don’t have access to the output directory they are being written to), then you will need to add additional JAVA_OPTS:

JAVA_OPTS="-Dwres.enableServerCache=[true/false] -Dwres.useDatabase=[true/false] -Dwres.databaseHost=[host] -Dwres.databaseName=[databaseName] -Dwres.username=[databaseUsername]" ./wres.bat server [PORT]
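
To avoid retyping the options, they can be assembled by a small function. This is a sketch only: the function name wres_java_opts is hypothetical, it uses only the system properties listed above, and it assumes that both the cache and the database are wanted.

```shell
# Hypothetical helper: print JAVA_OPTS for a server that uses a local cache and a database.
# Usage: wres_java_opts HOST DATABASE_NAME DATABASE_USERNAME
wres_java_opts() {
    printf '%s' "-Dwres.enableServerCache=true -Dwres.useDatabase=true"
    printf ' -Dwres.databaseHost=%s -Dwres.databaseName=%s -Dwres.username=%s\n' "$1" "$2" "$3"
}
```

For example, with placeholder database values: JAVA_OPTS="$(wres_java_opts localhost wrestesting wres_user)" ./wres server 8010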

After running the command above, log messages written to the terminal will indicate that the server has started. For example:

sh-4.1$ ./wres server 8010
2023-12-01T12:11:29.792+0000 INFO Main WRES version 20231127-2bbe97c
2023-12-01T12:11:29.823+0000 INFO Main Processors: 12; Max Memory: 2560MiB; Free Memory: 2494MiB; Total Memory: 2560MiB; WRES System Settings: [omitted]
2023-12-01T12:11:29.839+0000 INFO Functions Executing: server 8010
2023-12-01T12:11:29.917+0000 INFO BrokerUtilities Discovered a binding URL of failover:(amqp://localhost:0?transport.tcpKeepAlive=true&transport.connectTimeout=300000&amqp.idleTimeout=3000000)?failover.maxReconnectAttempts=20&failover.reconnectDelay=100&failover.useReconnectBackOff=true&failover.warnAfterReconnectAttempts=1, which includes the reserved TCP port of 0. An embedded broker will be created at this URL and the broker allowed to assign a port dynamically.
2023-12-01T12:11:29.917+0000 INFO EmbeddedBroker  Discovered the following binding URL for the embedded broker: failover:(amqp://localhost:0?transport.tcpKeepAlive=true&transport.connectTimeout=300000&amqp.idleTimeout=3000000)?failover.maxReconnectAttempts=20&failover.reconnectDelay=100&failover.useReconnectBackOff=true&failover.warnAfterReconnectAttempts=1
2023-12-01T12:11:30.432+0000 INFO EmbeddedBroker Started an embedded broker with AMQP transport on port: 52782. The short binding URL is: tcp://127.0.0.1?port=52782.
2023-12-01T12:11:30.495+0000 INFO BrokerConnectionFactory Established a connection to an AMQP message broker at binding URL: failover:(amqp://localhost:52782?transport.tcpKeepAlive=true&transport.connectTimeout=300000&amqp.idleTimeout=3000000)?failover.maxReconnectAttempts=20&failover.reconnectDelay=100&failover.useReconnectBackOff=true&failover.warnAfterReconnectAttempts=1.
2023-12-01T12:11:30.631+0000 INFO BrokerConnectionFactory Created a broker connection factory wres.events.broker.BrokerConnectionFactory@58e92c23 with name connectionFactory.statisticsFactory and binding URL failover:(amqp://localhost:52782?transport.tcpKeepAlive=true&transport.connectTimeout=300000&amqp.idleTimeout=3000000)?failover.maxReconnectAttempts=20&failover.reconnectDelay=100&failover.useReconnectBackOff=true&failover.warnAfterReconnectAttempts=1.
2023-12-01T12:11:30.694+0000 INFO Server jetty-11.0.18; built: 2023-10-27T02:14:36.036Z; git: 5a9a771a9fbcb9d36993630850f612581b78c13f; jvm 17.0.6+9-LTS-190
2023-12-01T12:11:30.835+0000 WARN WadlFeature JAXBContext implementation could not be found. WADL feature is disabled.
2023-12-01T12:11:30.882+0000 WARN Providers A provider wres.server.EvaluationService registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider wres.server.EvaluationService will be ignored. 
2023-12-01T12:11:30.944+0000 INFO ContextHandler Started o.e.j.s.ServletContextHandler@4fb56bea{/,null,AVAILABLE}
2023-12-01T12:11:30.944+0000 INFO AbstractConnector Started ServerConnector@3cf7298d{HTTP/1.1, (http/1.1, h2, alpn)}{127.0.0.1:8010}
2023-12-01T12:11:30.944+0000 INFO Server Started Server@8ee0c23{STARTING}[11.0.18,sto=0] @1736ms
Server@8ee0c23{STARTED}[11.0.18,sto=0] - STARTED
...

How do I interact with my server?

How do I execute an evaluation?

To execute an evaluation, first make sure that the server is ready to receive one by closing any previous run. This is not needed for the first run on a server, but it does no harm to always run it:

curl -X POST localhost:[PORT]/evaluation/close

Then run this command to start a new evaluation:

curl -X POST -d "$(cat [DECLARATION_FILE])" -H "Content-Type:text/xml" localhost:[PORT]/evaluation/startEvaluation

where [DECLARATION_FILE] is the location of the YAML file declaring the evaluation and [PORT] is the port number used when starting the server. If the evaluation is successfully started, the response from the server will contain the evaluation PROJECT_ID, which is used in later commands to obtain information about the evaluation. For example:

sh-4.1$ curl -X POST -d "$(cat Example_Meal_for_6hour_AHPS_vs_USGS.yml)" -H "Content-Type:text/xml" localhost:8010/evaluation/startEvaluation
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1355  100   109  100  1246     24    277  0:00:04  0:00:04 --:--:--   301
5991318394783415109 

In this case, the evaluation PROJECT_ID is 5991318394783415109.
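
When scripting, it helps to capture the PROJECT_ID rather than copy it from the terminal. The following is a minimal sketch under stated assumptions: the function name wres_start and the CURL override variable are hypothetical, the ID is assumed to be a plain number as in the example above, and --data-binary is swapped in for -d "$(cat ...)" so that the file is sent unchanged.

```shell
# Hypothetical helper: start an evaluation and print its PROJECT_ID.
# Usage: wres_start PORT DECLARATION_FILE
wres_start() {
    # -s hides the progress meter so that only the response body is captured.
    response=$(${CURL:-curl} -s -X POST --data-binary "@$2" \
        -H "Content-Type:text/xml" "localhost:$1/evaluation/startEvaluation")
    # Remove surrounding whitespace from the response.
    id=$(printf '%s' "$response" | tr -d '[:space:]')
    # Assumption: a valid ID consists of digits only; anything else is reported as an error.
    case $id in
        ''|*[!0-9]*) echo "Unexpected response: $response" >&2; return 1 ;;
        *) echo "$id" ;;
    esac
}
```

For example, PROJECT_ID=$(wres_start 8010 Example_Meal_for_6hour_AHPS_vs_USGS.yml) stores the ID for use in the commands that follow.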

How do I obtain a list of the outputs generated?

To see a list of the output files generated by the server for the evaluation, execute the following command using the PROJECT_ID described above:

curl -X GET localhost:[PORT]/evaluation/getEvaluation/[PROJECT_ID]

For example:

sh-4.1$ curl -X GET localhost:8010/evaluation/getEvaluation/5991318394783415109
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2662  100  2662    0     0  11495      0 --:--:-- --:--:-- --:--:-- 11523
C:\cygwin64\tmp\wres_evaluation_r6Yxc31J1oPuN48Wpt-yDp-kg5g\evaluation.csv.gz
C:\cygwin64\tmp\wres_evaluation_r6Yxc31J1oPuN48Wpt-yDp-kg5g\evaluation.csvt

How do I obtain an output file?

To obtain an output file, visit the path or run the following command:

curl -o [FILE_NAME] -X GET localhost:[PORT]/evaluation/[PROJECT_ID]/[FILE_NAME]

where [PORT] is the port number for the server, [PROJECT_ID] is the evaluation PROJECT_ID, and [FILE_NAME] is the name of the file. With the -o option as shown, the file is written to the working directory under the same name as that generated by the WRES server. For example, the following command obtains the output file WRDS_AHPS_Streamflow_Forecasts_20231103T150645Z_102_HOURS.nc:

curl -o WRDS_AHPS_Streamflow_Forecasts_20231103T150645Z_102_HOURS.nc -X GET localhost:8010/evaluation/5991318394783415109/WRDS_AHPS_Streamflow_Forecasts_20231103T150645Z_102_HOURS.nc
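
To obtain every output file of an evaluation, the listing and download commands can be combined. This is a sketch, assuming that the getEvaluation response contains one path per line, as in the example listing above. The function names wres_output_names and wres_fetch_outputs and the CURL override variable are hypothetical.

```shell
# Hypothetical helper: print the file name at the end of each path read from stdin.
# Both "/" and "\" are treated as separators, because the server may run on Windows.
wres_output_names() {
    sed 's#.*[/\\]##'
}

# Hypothetical helper: download every output file of an evaluation to the working directory.
# Usage: wres_fetch_outputs PORT PROJECT_ID
wres_fetch_outputs() {
    ${CURL:-curl} -s -X GET "localhost:$1/evaluation/getEvaluation/$2" |
        tr -d '\r' | wres_output_names |
        while IFS= read -r name; do
            # Skip blank lines in the listing.
            [ -n "$name" ] || continue
            ${CURL:-curl} -s -o "$name" -X GET "localhost:$1/evaluation/$2/$name"
        done
}
```

For example, wres_fetch_outputs 8010 5991318394783415109 requests each listed file in turn and writes it under its own name.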

How do I view evaluation logs?

To obtain the evaluation log, run the following command:

curl -o [FILE_NAME] -X GET localhost:[PORT]/evaluation/stdout/[PROJECT_ID]

where [FILE_NAME] is the name of the local file to which the log is written. The log will resemble the following:

2023-12-01T12:19:39.586+0000 INFO DeclarationFactory Encountered the following declaration string:
---
label: Example Meal for 6-hour AHPS vs USGS
observed:
  label: USGS NWIS Instantaneous Streamflow Observations
  sources:
  - interface: usgs nwis
    uri: https://nwis.waterservices.usgs.gov/nwis/iv
  variable:
    name: '00060'

... SNIP ...

2023-12-01T12:19:43.485+0000 INFO EvaluationUtilities Wrote the following output: [... LOCATIONS OF GENERATED OUTPUT FILES ...]
2023-12-01T12:19:43.485+0000 INFO EvaluationUtilities Closing the messager for evaluation S08LGCQAeIxn8wMTCQ3V1FnjyPY...
2023-12-01T12:19:43.494+0000 INFO EvaluationMessager Closed evaluation S08LGCQAeIxn8wMTCQ3V1FnjyPY with status EVALUATION_COMPLETE_REPORTED_SUCCESS. This evaluation contained 1 evaluation description message, 24 statistics messages, 0 pairs messages and 6 evaluation status messages. The exit code was 0.
2023-12-01T12:19:43.494+0000 INFO EvaluationUtilities The messager for evaluation S08LGCQAeIxn8wMTCQ3V1FnjyPY has been closed.

NOTE: Obtaining a list of output files and obtaining a file do not produce log output.
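
A script can inspect the log to decide whether an evaluation succeeded. The following sketch assumes that the closing message has the same form as in the example log above; the function names wres_log_succeeded and wres_log_exit_code are hypothetical.

```shell
# Hypothetical helper: succeed only if the log on stdin reports a successful evaluation.
wres_log_succeeded() {
    grep -q 'EVALUATION_COMPLETE_REPORTED_SUCCESS'
}

# Hypothetical helper: print the exit code reported in the log on stdin, if any.
wres_log_exit_code() {
    sed -n 's/.*The exit code was \([0-9][0-9]*\)\..*/\1/p' | tail -n 1
}
```

For example, curl -s -X GET localhost:8010/evaluation/stdout/5991318394783415109 | wres_log_succeeded returns a zero status only when the success message is present in the log.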

How do I clean a database?

To clean the database being used by the server (if one is attached), run one of the following commands:

Clean the default database currently in use:
curl -X POST localhost:[PORT]/evaluation/cleanDatabase

Clean a specific database:
curl -X POST localhost:[PORT]/evaluation/cleanDatabase -d "dbHost=[HOST_NAME]" -d "dbName=[DATABASE_NAME]" -d "dbPort=[DATABASE_PORT]"

where [PORT] is the port number for the server, [HOST_NAME] is the host on which the database runs (most likely localhost if you started it yourself), [DATABASE_NAME] is the name of the database, and [DATABASE_PORT] is the port used to access the database.

Example:

$ curl -X POST localhost:8010/evaluation/cleanDatabase -d "dbHost=localhost" -d "dbName=wrestesting" -d "dbPort=5432"
Database cleaned

Helpful info

  • If you need to find the job ID of an evaluation, you can use the following command: find {worker_or_Tasker_dir} -type f -mtime -{bottom_range_days_ago} -mtime +{top_range_days_ago} -print0 | xargs -0 grep -A 2 {search_term}

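An example for an evaluation that happened 2 days ago (find ignores fractional days, so -mtime +1 matches files last modified at least 2 days ago and -mtime -3 matches files last modified less than 3 days ago):

find worker/ -type f -mtime -3 -mtime +1 -print0 | xargs -0 grep -A 2 ZFvFQepxvrtpj8VOSDAQ6zi9Zi4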

Example output:

worker/56a0c3c82f7c/wres-worker.2023-11-03.log:2023-11-03T07:00:10.007+0000  [main] INFO wres.worker.WresEvaluationProcessor - Sending output uri file:///mnt/wres_share/evaluations/wres_evaluation_ZFvFQepxvrtpj8VOSDAQ6zi9Zi4/pairs.csv.gz to broker.
worker/56a0c3c82f7c/wres-worker.2023-11-03.log-2023-11-03T07:00:10.009+0000  [main] INFO wres.worker.WresEvaluationProcessor - Request exited with http code: 200 for job: 8604167945922344971
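
The job ID can be pulled out of such output automatically. This is a minimal sketch, assuming that the worker log reports the ID in the form "for job: NUMBER", as in the example output above; the function name wres_job_ids is hypothetical.

```shell
# Hypothetical helper: print the distinct job IDs found in worker log lines read from stdin.
wres_job_ids() {
    sed -n 's/.*for job: \([0-9][0-9]*\).*/\1/p' | sort -u
}
```

Piping the output of the find command into wres_job_ids prints each matching job ID once.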