Quick start guide for RHEAS Virtual Box - SERVIR/RHEAS GitHub Wiki

Quick start guide for RHEAS

This documentation is intended to be used with the VirtualBox image that has the RHEAS model and its dependencies pre-installed. More comprehensive RHEAS model documentation can be found at: https://rheas.readthedocs.io/en/latest/

Requirements:

Packages/software pre-installed on the VirtualBox image

  • OS: Ubuntu 16 (RHEAS has not been tested on latest versions 18 or 19)
  • RHEAS (latest develop branch)
  • QGIS
  • Psycopg
  • Python

Importing ubuntu image (.ova) file:

  1. In VirtualBox, go to File > Import Appliance
  2. Select the path to the provided .ova file and import it with the default settings (NOTE: make sure to have at least 40 GB of free storage for the VirtualBox installation and for making some runs)
  3. This process may take 10-30 min depending on system speed
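
Before importing, it can help to confirm that the host has enough free disk space. The sketch below uses only Python's standard library. The 40 GB threshold comes from the note above, while the function names and the path being checked are illustrative assumptions.

```python
import shutil

REQUIRED_GB = 40  # free space recommended in the import note


def free_space_gb(path="."):
    """Return the free space, in GB (10**9 bytes), on the disk that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """True if the disk holding `path` has at least `required_gb` GB free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    # Check the current directory; pass the VirtualBox VM folder instead if it differs.
    print(f"Free space: {free_space_gb():.1f} GB")
    print("Enough for the RHEAS VM:", has_enough_space())
```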

You may run into an Intel virtualization (VT-x) error. To fix it, set the virtualization option to ‘enabled’ in the BIOS settings.

The default credentials for the machine are:

user: rheas password: pass


RHEAS model Steps:

Test model installation:

To open a terminal, press
Ctrl+Alt+T

Navigate to the RHEAS folder in the terminal using
$ cd RHEAS

After navigating to the RHEAS directory, type the following command to see the RHEAS help. In addition to showing the RHEAS options, the command also confirms that the model is available and properly installed:

Running RHEAS with the help switch

./bin/rheas -h

produces the usage message:

usage: rheas [-h] [-d db] [-u] [-v] [-l logfile] config 

Runs RHEAS simulation.

positional arguments:
config      configuration file

optional arguments:
  -h   ‘help’      show this help message and exit
  -d   ‘db’        name of database to connect
  -u   ‘update’    update database
  -v   ‘verbose’   increase verbosity
  -l   ‘logfile’   name of log file
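
For readers curious how such a usage message comes about, here is a minimal sketch of a Python argparse parser that accepts the same options. It is an illustration only and not the RHEAS source code; the attribute names (`db`, `update`, `verbose`, `logfile`) are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in the help text above."""
    parser = argparse.ArgumentParser(prog="rheas", description="Runs RHEAS simulation.")
    parser.add_argument("config", help="configuration file")
    parser.add_argument("-d", metavar="db", dest="db", help="name of database to connect")
    parser.add_argument("-u", dest="update", action="store_true", help="update database")
    parser.add_argument("-v", dest="verbose", action="store_true", help="increase verbosity")
    parser.add_argument("-l", metavar="logfile", dest="logfile", help="name of log file")
    return parser


if __name__ == "__main__":
    # Example: the arguments that would be used for a data ingestion run.
    args = build_parser().parse_args(["-d", "rheas", "-u", "ingest.conf"])
    print(args.config, args.db, args.update)
```

The positional `config` argument is always required, which is why every RHEAS command in this guide ends with the path to a configuration file.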

Test simulation

An example simulation is configured to run as is. The nowcast.conf file is populated as follows to run the VIC model from Jan 1 to Sept 30, 2019 over Kenya. See the nowcast documentation for more information on populating a nowcast file:

[nowcast]
model: vic
startdate: 2019-1-1
enddate: 2019-9-30
basin: /home/rheas/rheas/data/kenya/KEN_adm0.shp
name: Kenya_vic_test
resolution: 0.25 

[vic]
precip: chirps
temperature: ncep
wind: ncep
initialize: no
save to: db
save: tmax, runoff, evap, baseflow, soil_moist, rainf, tmin
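
A quick way to catch typos in a configuration file before a run is to parse it yourself. The sketch below assumes the file follows the standard INI syntax that Python's configparser can read (the `key: value` form shown above); the `summarize` helper and the checks it performs are illustrative and not part of RHEAS.

```python
import configparser
from datetime import datetime

# The example configuration from this guide, embedded so the sketch is self-contained.
NOWCAST_CONF = """
[nowcast]
model: vic
startdate: 2019-1-1
enddate: 2019-9-30
basin: /home/rheas/rheas/data/kenya/KEN_adm0.shp
name: Kenya_vic_test
resolution: 0.25

[vic]
precip: chirps
temperature: ncep
wind: ncep
initialize: no
save to: db
save: tmax, runoff, evap, baseflow, soil_moist, rainf, tmin
"""


def parse_date(text):
    """Parse a date written as year-month-day, for example 2019-1-1."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def summarize(conf_text):
    """Read the configuration text and return a few checked values."""
    conf = configparser.ConfigParser()
    conf.read_string(conf_text)
    start = parse_date(conf["nowcast"]["startdate"])
    end = parse_date(conf["nowcast"]["enddate"])
    if end < start:
        raise ValueError("enddate is earlier than startdate")
    models = [m.strip() for m in conf["nowcast"]["model"].split(",")]
    variables = [v.strip() for v in conf["vic"]["save"].split(",")]
    return {"models": models, "start": start, "end": end, "variables": variables}


if __name__ == "__main__":
    print(summarize(NOWCAST_CONF))
```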

Available Resources on the provided VM

The image comes loaded with several Kenya-related shapefiles, available under ~/RHEAS/data/Ken/

Example configuration files for data ingestion as well as nowcast mode are stored under the home directory

Several months' worth of data are provided with the current installation for a bounding box over Kenya, to test and begin early model simulations

As you generate configuration files, it is advised to store them in a dedicated confs directory. Do not delete any file or directory from the main directory, as they may be linked to model executables.

To create a confs directory type:

mkdir confs

from your home directory.

Data Ingestion

Before running any customized configurations of the model, data needs to be ingested into the database

Ingesting datasets:

We need to populate a data ingestion configuration file. First, define the domain (min/max latitude and longitude) of the study area. Make sure to leave some buffer around the actual study area; this is particularly helpful with very coarse resolution datasets and avoids missing inputs. Next, provide the names of the datasets along with their start and end dates. If the end date is not provided, the model ingests all data from the start date to the latest available date. Example of a data ingestion file:

[domain]
minlat: -20.0
maxlat: 7.0
minlon: 17.0
maxlon: 60.0

[chirps]
startdate: 2015-1-1
enddate: 2017-12-31

[ncep]
startdate: 2010-1-1

#[smos]
#startdate: 2010-1-1
#enddate: 2017-12-31

In this example, the model will ingest precipitation data from ‘CHIRPS’, whereas temperature and wind data are obtained from ‘NCEP’. CHIRPS data are downloaded for 3 years (2015-2017), while NCEP data are downloaded from 2010 to the latest date available. The section for soil moisture data from SMOS is commented out. The file can be saved as ‘ingest.conf’. More detail on populating the data ingestion file, and a comprehensive list of the datasets with their availability and resolution, can be found in the RHEAS documentation (https://rheas.readthedocs.io/en/latest/).
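
The ingestion file can also be generated programmatically, which makes it easy to add a buffer around a study area. The following is a minimal sketch: the helper names, the 1-degree buffer and the example bounding box are all assumptions chosen for illustration, and the output simply imitates the layout of the example file above.

```python
def buffered_domain(minlat, maxlat, minlon, maxlon, buffer_deg=1.0):
    """Expand a bounding box by buffer_deg on every side, clamped to valid ranges."""
    return {
        "minlat": max(minlat - buffer_deg, -90.0),
        "maxlat": min(maxlat + buffer_deg, 90.0),
        "minlon": max(minlon - buffer_deg, -180.0),
        "maxlon": min(maxlon + buffer_deg, 180.0),
    }


def ingest_conf(domain, datasets):
    """Render an ingestion configuration as text.

    `datasets` maps a dataset name to (startdate, enddate). If enddate is None
    the enddate line is left out, so ingestion runs to the latest available date.
    """
    lines = ["[domain]"]
    lines += [f"{key}: {domain[key]}" for key in ("minlat", "maxlat", "minlon", "maxlon")]
    for name, (start, end) in datasets.items():
        lines += ["", f"[{name}]", f"startdate: {start}"]
        if end is not None:
            lines.append(f"enddate: {end}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical study area; the 1-degree buffer is an arbitrary choice.
    domain = buffered_domain(-5.0, 5.0, 34.0, 42.0, buffer_deg=1.0)
    datasets = {"chirps": ("2015-1-1", "2017-12-31"), "ncep": ("2010-1-1", None)}
    print(ingest_conf(domain, datasets))
```

A larger buffer may be worth considering when one of the datasets is much coarser than the model resolution.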

Once the ingestion configuration file is created, ingestion scripts can be called from the Linux command line:

Navigate to the rheas (home) folder and type:

./bin/rheas -u path/to/your/ingest/config/file

NOTE: Ingestion may take some time depending on the dataset resolution, connection/transfer speed, etc. The best approach is to let the ingestion scripts run overnight or over a weekend. Further, the above command ingests input data into the default 'rheas' database. To ingest data into a specific database, use the '-d' option.
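
If you prefer to launch ingestion from a script (for example, to schedule it overnight), the command can be assembled in Python. This is a sketch under the assumption that it is run from the RHEAS directory so that `./bin/rheas` resolves; the function names and the `confs/ingest.conf` path are illustrative.

```python
import subprocess


def ingest_command(config_path, database=None, rheas_bin="./bin/rheas"):
    """Build the argument list for an ingestion run (-u), optionally with -d."""
    cmd = [rheas_bin]
    if database is not None:
        cmd += ["-d", database]
    cmd += ["-u", config_path]
    return cmd


def run_ingestion(config_path, database=None, cwd=None):
    """Run the ingestion (from `cwd`, if given) and return the exit code."""
    result = subprocess.run(ingest_command(config_path, database), cwd=cwd)
    return result.returncode


if __name__ == "__main__":
    # Print the command rather than running it, since ingestion can take hours.
    print(" ".join(ingest_command("confs/ingest.conf", database="rheas")))
```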

NOWCAST Configuration and simulation

We use another configuration file to tell the model which mode to run the simulations in and with which set of options. RHEAS can be run in either nowcast or forecast mode, and each mode has multiple options to select. A full list of configuration options is available in the RHEAS documentation.

[nowcast]
model: vic, dssat
startdate: 2019-1-1
enddate: 2019-9-30
basin: /home/rheas/rheas/data/kenya/KEN_adm0.shp
name: Kenya_dssat_test
resolution: 0.25

The above section defines the mode by declaring that the configuration file is for a ‘nowcast’ mode simulation. Next, the file tells RHEAS to run both the ‘VIC’ and ‘DSSAT’ models under this configuration. Model selection is followed by the start and end dates (these dates must fall within the range of dates used when ingesting the data). The next line points the model to the shapefile over which the ‘VIC’ simulation will be made. Some Kenya shapefiles (administrative boundaries) are provided with the VM, but additional ones can be created using either QGIS (provided with the virtual machine) or ArcGIS (outside of this VM). The ‘name’ is the simulation name; the results of this particular run are saved in the database under this name. The resolution defines the spatial resolution at which the VIC simulations are made. Currently the model can handle resolutions of 0.25 degrees (~25 km) and 0.05 degrees (~5 km).
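
Because the simulation dates must fall inside the ingested period, it is worth checking this before starting a run. The helper below is a hypothetical sketch, not a RHEAS function; it compares the date strings exactly as they are written in the configuration files.

```python
from datetime import datetime


def parse_date(text):
    """Parse a date written as year-month-day, for example 2019-1-1."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def covers(ingest_start, ingest_end, run_start, run_end):
    """True if the run period lies within the ingested period.

    ingest_end may be None, meaning the dataset was ingested up to the latest
    available date. That date is not known here, so only the start is checked.
    """
    if parse_date(run_start) < parse_date(ingest_start):
        return False
    if ingest_end is not None and parse_date(run_end) > parse_date(ingest_end):
        return False
    return True


if __name__ == "__main__":
    # Dates taken from the example files in this guide.
    print("chirps:", covers("2015-1-1", "2017-12-31", "2019-1-1", "2019-9-30"))
    print("ncep:  ", covers("2010-1-1", None, "2019-1-1", "2019-9-30"))
```

With the example files in this guide, the CHIRPS check returns False: the sample ingestion file ends in 2017 while the sample nowcast runs in 2019, so a custom ingestion for a 2019 run would need a later end date.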

After defining the main simulation options, we select the model-specific options. For instance, a VIC configuration is shown below where the model is told to use precipitation data from CHIRPS, and temperature and wind data from NCEP. Next, we tell the model to save the results to the database (this could instead be a path to a directory where the raw VIC files would be dumped). For DSSAT, only three options are required. The first is the shapefile that DSSAT will be run over (if a multi-polygon dataset is provided, DSSAT will be run separately for each feature). Next is the number of ensembles to be simulated for each feature provided. DSSAT will perform a random selection of weather, soil, and cultivar information to construct each ensemble. Lastly, the crop type that DSSAT will simulate is specified. Currently maize is the only crop available for Kenya; wheat and rice are being integrated. All meteorological variables needed are either passed through or derived by VIC.

[vic]
precip: chirps
temperature: ncep
wind: ncep
initialize: no
save to: db
save: tmax, runoff, evap, baseflow, soil_moist, rainf, tmin

[dssat]
shapefile: /home/rheas/rheas/data/kenya/KEN_adm1.shp
ensemble size: 20
crop: maize

Once these simulations are complete, the data can be explored in the Postgres database and visualized in QGIS


Useful PSQL commands and example queries

To enter the database, type the following in a terminal from within the main RHEAS directory:

$ ./bin/psql -d rheas

Here -d is the option for the database name. In this case our database is called rheas (if -d is omitted, psql defaults to a database named after the current user)

Common psql commands and queries

\dt             : list all tables
\dn             : list all schemas
\d tablename    : describe a table
\d+ tablename   : list the columns of a table with extra detail
\q              : exit psql
\l              : list all databases
\c dbname       : switch to another database

To start/restart postgres database

./bin/pg_ctl -D data/postgres restart

Inspect a single raster from within QGIS

First connect to the database

Create table newname as (select rid, rast from schema.variable where fdate='2017-06-05');

A few examples:

Create table cdiCheck625 as (select rid, rast from ken_25km_test.cdi where fdate='2017-06-05');

Create table SM1Check625 as (select rid, rast from ken_25km_test.soil_moist where fdate='2017-06-05' and layer='1');
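
When many such tables are needed, the statement can be generated rather than typed. The sketch below only builds the SQL text; the function name is hypothetical, and the validation rules (plain identifiers, a real calendar date, an integer layer) are assumptions added to keep mistyped values out of the query.

```python
import re
from datetime import datetime


def raster_snapshot_sql(new_table, schema, variable, fdate, layer=None):
    """Return a CREATE TABLE statement that copies one date of a raster variable."""
    for ident in (new_table, schema, variable):
        # Accept only simple names, so nothing unexpected ends up in the SQL.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    # strptime raises ValueError for impossible dates such as day 55.
    day = datetime.strptime(fdate, "%Y-%m-%d").date().isoformat()
    sql = (
        f"CREATE TABLE {new_table} AS "
        f"(SELECT rid, rast FROM {schema}.{variable} WHERE fdate='{day}'"
    )
    if layer is not None:
        sql += f" AND layer='{int(layer)}'"
    return sql + ");"


if __name__ == "__main__":
    print(raster_snapshot_sql("cdiCheck625", "ken_25km_test", "cdi", "2017-06-05"))
    print(raster_snapshot_sql("SM1Check625", "ken_25km_test", "soil_moist", "2017-06-05", layer=1))
```

The resulting text can be pasted into psql, and the new table can then be loaded as a raster layer in QGIS.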

NOTE: avoid using the DROP command, as it may damage the database structure.

To delete all entries from a table (add a WHERE clause to delete only specific rows):

DELETE FROM tablename;

To view DSSAT yields (kg/ha) for each ensemble (replace schemaname with the schema of your simulation):

SELECT gid, ensemble, max(gwad) FROM schemaname.dssat GROUP BY ensemble, gid;

To view DSSAT yields for each ensemble sorted by polygon order (for multi-polygon simulations):

SELECT gid, ensemble, max(gwad) FROM schemaname.dssat GROUP BY ensemble, gid ORDER BY gid;
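
To see what the grouped query returns without touching the RHEAS database, the same GROUP BY can be tried on a throwaway in-memory SQLite table. Everything in this sketch is an assumption made for illustration: the rows are made-up numbers rather than RHEAS output, the table has only the three columns used by the query, and it assumes that each ensemble has several gwad values, which is why max() is taken.

```python
import sqlite3

# Made-up rows of (gid, ensemble, gwad) for illustration only; not RHEAS output.
ROWS = [
    (1, 1, 0.0), (1, 1, 2100.0), (1, 1, 3500.0),
    (1, 2, 0.0), (1, 2, 2900.0),
    (2, 1, 0.0), (2, 1, 4100.0),
]


def max_yield_per_ensemble(rows):
    """Return (gid, ensemble, max gwad) tuples sorted by polygon and ensemble."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE dssat (gid INTEGER, ensemble INTEGER, gwad REAL)")
    con.executemany("INSERT INTO dssat VALUES (?, ?, ?)", rows)
    result = con.execute(
        "SELECT gid, ensemble, max(gwad) FROM dssat "
        "GROUP BY ensemble, gid ORDER BY gid, ensemble"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for gid, ensemble, gwad in max_yield_per_ensemble(ROWS):
        print(gid, ensemble, gwad)
```

Each output row corresponds to one polygon (gid) and one ensemble, which is the same shape of result the psql queries above produce.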

To view cultivar entries:

SELECT p1,p2,p5,g2,g3,phint from dssat.cultivars;

Useful linux commands

ls      list files and directories within the current directory
pwd     print the path to the current directory
cd      change directory
cd ..   move up to the parent directory
mkdir   make a new directory
mv      move a file to a new location, or rename it
cp      copy file(s) from one location to another
chmod   change the access mode of files
tail    show the last few lines of a file
rm      delete a file
kill    terminate a running process
find    search the system for a filename
grep    search a file for a text pattern
tar     bundle files/directories into an archive (tape archive), optionally compressed

All of these commands accept multiple options; the built-in help (for example, man followed by the command name) and Google will be your best guides in this regard.

If something doesn’t work, be sure to check that the database is started.
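
One simple way to check whether a database server is listening is to try opening a TCP connection to it. The sketch below assumes the server accepts connections on localhost at PostgreSQL's default port, 5432; this guide does not state which port the bundled database uses, so adjust the value if your setup differs. The function name is illustrative.

```python
import socket


def port_is_open(host="localhost", port=5432, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("A server is listening on port 5432.")
    else:
        print("Nothing is listening on port 5432; try restarting the database.")
```

An open port only shows that some server is listening there, so a successful psql connection remains the definitive test.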