access_gsd_RF3_Scripts - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki


Processing Rainfields-3 data: Scripts and Running (Jan 2017)

The methodology and principles behind the processing of Rainfields3 data are described here. This page deals only with the associated scripts and how to run them.

Regridding the precip file

You can define the target grid either with a template file or, for regular grids only, with an XML template. Mark isn't sure whether the regridder will work with a template file that has a variable grid. Try it and see.

This step can only be run on Embery, and the rainfields module must be loaded first.


anc_regrid -v precipitation 71_....nc   31....nc   output.nc

               ^ variable to regrid
                             ^ file that looks like output
                                         ^ input file
                                                    ^ output file

anc_regrid -v precipitation grid.xml   31....nc   output.nc

For a regular grid, the xml file is as follows:

<domain
    source="RAD:AU2,PLC:Melb"
    rows="916" cols="848"
    row0_coord="-25.66" col0_coord="142.05"
    row_delta="-0.1" col_delta="0.1">
  <projection
    type="latitude_longitude"
    spheroid="grs80"
    units="degrees" />
</domain>

The output file is used as input to the r2acobs program.
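
As a quick sanity check on an XML template of this kind, the sketch below parses the `<domain>` element and computes the coordinate of the last row and column. It assumes (this is not confirmed against the anc_regrid documentation) that `row0_coord`/`col0_coord` are the coordinates of the first row/column and that the deltas are per-cell steps. `domain_extent` is a hypothetical helper name, and the sample domain is invented for illustration.

```python
import xml.etree.ElementTree as ET


def domain_extent(xml_text):
    """Return ((first_row, last_row), (first_col, last_col)) coordinates.

    Assumes row0_coord/col0_coord locate the first row/column and that
    row_delta/col_delta are the step between adjacent rows/columns.
    """
    dom = ET.fromstring(xml_text)
    rows, cols = int(dom.get("rows")), int(dom.get("cols"))
    row0, col0 = float(dom.get("row0_coord")), float(dom.get("col0_coord"))
    drow, dcol = float(dom.get("row_delta")), float(dom.get("col_delta"))
    last_row = row0 + (rows - 1) * drow
    last_col = col0 + (cols - 1) * dcol
    return (row0, last_row), (col0, last_col)


# Hypothetical small domain, for illustration only.
example = """<domain rows="11" cols="21"
    row0_coord="-30.0" col0_coord="140.0"
    row_delta="-0.5" col_delta="0.5">
  <projection type="latitude_longitude" spheroid="grs80" units="degrees" />
</domain>"""

lat_range, lon_range = domain_extent(example)
print("latitude range: {}".format(lat_range))
print("longitude range: {}".format(lon_range))
```

Under these assumptions, checking that the last-row latitude stays within -90 to 90 is a cheap way to catch a mistyped `rows` or `row_delta` value before running the regridder.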

Update Log

  • (201706XX) Added option to unpack fields.
  • (201706XX) Added option to pre-calculate and use interpolation weights.
  • (20170407) Moved the code to version control under git. (Finally!)
  • (20170406) Original default radar-data-type retrieved was "prcp-c60". This is reflected in documentation below. Have switched to 'prcp-m30' (merged, 30 minute accumulation) on advice from Mark Curtis:
    • The m30 data is kept for much longer.
    • It is the closest to the "truth" of the rain-gauge data.

To-do List

  • Add an option to 'zero-out' the zero-result mask.
    • 20170627 UPDATE: Mask-removal is now on by default, see the "-m" parameter below.
  • Save computation, by using cdo's ability to generate and use pre-calculated grid-weights.
    • 2017 mid-July UPDATE: Done, and active by default. Requires mask-removal however.
  • Now that RF3 has gone operational, need to be able to point RF3_getRawData.py at either the operational or developmental server.
    • 20170814 UPDATE: Done, but via the simplest route possible: added the relevant "RF3_SERVER" value for the operational server to the "RF3_Config.py" file. For now, just comment out whichever server is not in use. Hopefully we can just use the operational server going forward, and not have to chop and change too much.
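
The pre-calculated grid-weights item in the list above can be sketched as two CDO calls: one that generates the weights once, and one that reuses them for each file. The sketch only builds the command lines and does not run them. It assumes bilinear interpolation (CDO's `genbil` operator), which may not be the method the RF3 scripts actually use, and all file names are placeholders.

```python
def weights_command(grid_desc, sample_file, weights_file):
    """CDO command that generates bilinear remapping weights once."""
    return ["cdo", "genbil,{}".format(grid_desc), sample_file, weights_file]


def remap_command(grid_desc, weights_file, infile, outfile):
    """CDO command that remaps one file using pre-generated weights."""
    return ["cdo", "remap,{},{}".format(grid_desc, weights_file),
            infile, outfile]


# Placeholder file names, for illustration only.
print(" ".join(weights_command("grid_desc.txt", "first_curv.nc", "weights.nc")))
print(" ".join(remap_command("grid_desc.txt", "weights.nc",
                             "next_curv.nc", "next_reg.nc")))
```

Reusing weights is only valid while the source grid and its mask stay the same from file to file, which is why this option goes together with mask removal.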

Modules

The scripts are pure Python. Whether they're run interactively (typical GSD usage) or in batch, they assume the user has already loaded the required modules. They depend on the NCO suite (including ncks) and on CDO, and for CDO they need an up-to-date version.

GSD Module-loads, as at 2018-08-09
On the Python side, I'm using the Anaconda-based, CoE-supported python:
For more details see http://logan:8011/nwp-wg/wiki/ToolsPortal_PythonSetup

module use ~access/modules
module load nco/4.3.8
module load netcdf
module load nmoc-utils
module load cawcr-utils

module use /g/data3/hh5/public/modules
module load conda/analysis27
module unload cdo
module load cdo
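
Because the scripts assume the modules are already loaded, a small pre-flight check can save a failed run. The sketch below looks for the `ncks` and `cdo` executables on `PATH`. `find_on_path` and `missing_tools` are hypothetical helpers, not part of the RF3 scripts, and the check does not verify the CDO version.

```python
import os


def find_on_path(program):
    """Return the full path of an executable found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(tools=("ncks", "cdo")):
    """List the required command-line tools that cannot be found."""
    return [tool for tool in tools if find_on_path(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH (module not loaded?): " + ", ".join(missing))
else:
    print("ncks and cdo are both available")
```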

Overview and Script location

Scripts are available under git (last updated 20170727):

https://gitlab.bom.gov.au/gsd/Rainfields3_Post_Processing

On-disk locations:

Gadi:  /g/data/dp9/slc548/Rainfields3_scripts/

The scripts are:

RF3_Config.py, RF3_Utility.py

RF3_getRawData.py
RF3_toCurvLatLon.py
RF3_toRegLatLon.py

RF3_Utility.py contains utility functions as the name suggests - a general user would not need to edit or run this.

RF3_Config.py contains configuration information - where the RF3 data is, etc. It is short, and extensively documented. The only likely change a user would make is to the TOPDIR variable.

Run-able script one: RF3_getRawData.py

This script copies raw RF3 datafiles from the RF3 server to a location on disk. It "understands" that there are multiple RF3 data-directories that store files. It can be run anywhere that can "see" the RF3 web-server.

Running and invoking help yields:

python RF3_getRawData.py -h
usage: RF3_getRawData.py [-h] [-t TYPE] [-d DOMAIN] [-s START] [-e END]
                         [-i INCR] [-v {on,off}]

Copy raw Rainfields-3 data across from its server (Rainfields Product Portal)

optional arguments:
  -h, --help   show this help message and exit
  -t TYPE      Field-type, eg., prcp-c60 (default)
  -d DOMAIN    Radar-domain-code, eg., 310 (default) for the Australian Mosaic
  -s START     Start date-time, in format: YYYYMMDDHHMM
  -e END       End date-time, in format: YYYYMMDDHHMM
  -i INCR      Increment between data-sets, in integer-minutes (default is
               60).
  -v {on,off}  Use verbose logging (default is off).

The default values for TYPE and DOMAIN are sensible: 60-minute calibrated precip, and the Australian Mosaic domain respectively. To see the range of allowable values, check the Rainfields Product Portal GUI.
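
The START, END and INCR arguments together describe a sequence of date-times. A minimal sketch of that expansion follows. `timestamps` is a hypothetical helper rather than a function from the RF3 scripts, and treating END as inclusive is inferred from the example run on this page, which retrieves the 0700 file when END is 201609280700.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y%m%d%H%M"  # YYYYMMDDHHMM, as used by -s and -e


def timestamps(start, end, incr_minutes=60):
    """Yield each date-time from start to end (inclusive) in steps of INCR."""
    current = datetime.strptime(start, STAMP_FORMAT)
    last = datetime.strptime(end, STAMP_FORMAT)
    while current <= last:
        yield current
        current += timedelta(minutes=incr_minutes)


for when in timestamps("201609272300", "201609280700"):
    print(when.strftime(STAMP_FORMAT))
```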

A simple invocation to grab some data:

python RF3_getRawData.py -s 201609272300 -e 201609280700 -v on

This creates files such as:

> ls
310_20160928_000000.prcp-c60.nc  310_20160928_040000.prcp-c60.nc
310_20160928_010000.prcp-c60.nc  310_20160928_050000.prcp-c60.nc
310_20160928_020000.prcp-c60.nc  310_20160928_060000.prcp-c60.nc
310_20160928_030000.prcp-c60.nc  310_20160928_070000.prcp-c60.nc

Search the log-output for "ERROR" to spot any retrievals that have failed.
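
If the verbose output has been captured to a file or string, the search can be done in a few lines of Python. This is only a sketch: `error_lines` is a hypothetical helper, and the sample log lines are invented, since the real log format is not shown here.

```python
def error_lines(log_text):
    """Return every line of the log that mentions ERROR."""
    return [line for line in log_text.splitlines() if "ERROR" in line]


# Invented log text, for illustration only.
sample_log = "\n".join([
    "INFO retrieving 310_20160928_000000.prcp-c60.nc",
    "ERROR could not retrieve 310_20160928_010000.prcp-c60.nc",
    "INFO retrieving 310_20160928_020000.prcp-c60.nc",
])

for line in error_lines(sample_log):
    print(line)
```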

The top-level output directory is controlled by the TOPDIR variable discussed earlier. The "Raw" level in the path reflects the fact that all we have done is retrieve the data, with absolutely no post-processing. "310" is the radar-domain-code, and the rest of the path is the usual date-tree-structure.

Run-able script two: RF3_toCurvLatLon.py

At this point we have retrieved RF3 data to disk.

The first step in post-processing it is to add lat/lon coords to each grid-point, via Mark Curtis' anc_grid_add_geodetic_coords program. Note that the resultant grid is still on the original radar grid-points (and hence is curvilinear).

Obviously, RF3_toCurvLatLon.py can only be run where anc_grid_add_geodetic_coords is available - for GSD at the moment, this means Embery.

Running RF3_toCurvLatLon.py is similar to RF3_getRawData.py:

python RF3_toCurvLatLon.py -h
usage: RF3_toCurvLatLon.py [-h] [-t TYPE] [-d DOMAIN] [-s START] [-e END]
                           [-i INCR] [-n VARNAME] [-v {on,off}]

Add (curvilinear / radar-native) lat/lon coords to RF3 data-files.

optional arguments:
  -h, --help   show this help message and exit
  -t TYPE      Field-type, eg., prcp-c60 (default)
  -d DOMAIN    Radar-domain-code, eg., 310 (default) for the Australian Mosaic
  -s START     Start date-time, in format: YYYYMMDDHHMM
  -e END       End date-time, in format: YYYYMMDDHHMM
  -i INCR      Increment between data-sets, in integer-minutes (default is
               60).
  -n VARNAME   Name of the variable to add lat/lon to. Default is
               "precipitation"
  -v {on,off}  Use verbose logging (default is off).

The only major difference is the addition of a VARNAME argument. It defaults to precipitation, which is what we normally want for verification, but other fields, such as reflectivity, are available.

We process the raw data we generated previously, by running:

python RF3_toCurvLatLon.py -s 201609272300 -e 201609280700 -v on

This creates files such as:

> ls
310_20160928_000000.prcp-c60_curv.nc  310_20160928_040000.prcp-c60_curv.nc
310_20160928_010000.prcp-c60_curv.nc  310_20160928_050000.prcp-c60_curv.nc
310_20160928_020000.prcp-c60_curv.nc  310_20160928_060000.prcp-c60_curv.nc
310_20160928_030000.prcp-c60_curv.nc  310_20160928_070000.prcp-c60_curv.nc

This is a similar directory-structure and filenaming to the raw-retrieval step, just with data stored under "Curv" rather than "Raw", and a "_curv" added to the end of the filename stem.
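
The naming rule described above (swap "Raw" for "Curv" in the path and append "_curv" to the filename stem) can be sketched as follows. `curv_path` is a hypothetical helper, and the example path is made up apart from the file name pattern shown on this page.

```python
import posixpath


def curv_path(raw_path):
    """Map a raw-data path to the corresponding curvilinear-data path."""
    directory, name = posixpath.split(raw_path)
    stem, ext = posixpath.splitext(name)  # ext is ".nc"
    # Replace the "Raw" directory level with "Curv", leave the rest alone.
    parts = ["Curv" if part == "Raw" else part for part in directory.split("/")]
    return posixpath.join("/".join(parts), stem + "_curv" + ext)


# Hypothetical top-level directory, for illustration only.
print(curv_path("/some/topdir/Raw/310/2016/310_20160928_000000.prcp-c60.nc"))
```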

Run-able script three: RF3_toRegLatLon.py

This script is responsible for the final step in the post-processing: interpolating from the curvilinear grid generated above to an arbitrary regular lat/lon grid, which can then be used for model verification, evaluation, etc.

Because it doesn't use Mark Curtis' software to do the interpolation, it can be run on any platform that has cdo (almost anywhere). This is important, because the interpolation task is not trivial computationally.

Script invocation is similar to the preceding two:

python RF3_toRegLatLon.py -h
usage: RF3_toRegLatLon.py [-h] [-t TYPE] [-d DOMAIN] [-s START] [-e END]
                          [-i INCR] [-r REGDOMAIN] [-v {on,off}] [-m {on,off}]

Interpolate RF3 data from curvilinear/radar-native to regular lat/lon grid.

optional arguments:
  -h, --help    show this help message and exit
  -t TYPE       Field-type, eg., prcp-c60 (default)
  -d DOMAIN     Radar-domain-code, eg., 310 (default) for the Australian
                Mosaic
  -s START      Start date-time, in format: YYYYMMDDHHMM
  -e END        End date-time, in format: YYYYMMDDHHMM
  -i INCR       Increment between data-sets, in integer-minutes (default is
                60).
  -r REGDOMAIN  Name of the regular domain to interpolate to.
  -v {on,off}   Use verbose logging (default is off).
  -m {on,off}   Remove masks from the data (default is on).
  -u {on,off}   Unpack the data (default is on).
  -w {on,off}   Use only the first set of grid-weights generated (default is
                on).

Note in particular the REGDOMAIN argument. For the curvilinear data we have directory structures like ....../RF3_Data/Curv/310/2016/....., but to support the potentially unlimited number of regular lat/lon grids we might want to interpolate to, we introduce another layer in the directory structure, eg., ...../RF3_Data/Reg_LatLon/310/Test_Buckland/2016/..... This means we've taken the data from the Australian Mosaic (code 310) and interpolated it to a (test) domain that we've called Test_Buckland (based on the Buckland radar domain).

It's up to the user to create that directory manually. Once it exists, we point the script at it via the -r argument. The Test_Buckland directory has to contain a grid_desc.txt file which defines the interpolated-to grid in terms of domain, resolution, etc. This can be created manually, or generated automatically from other netCDF files using the cdo griddes command described here.
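
For a grid_desc.txt written by hand, a regular lat/lon grid in CDO's grid description format needs only a handful of keys. The sketch below generates such a description. `lonlat_grid_desc` is a hypothetical helper, and the grid values are invented and do not describe the real Test_Buckland domain.

```python
def lonlat_grid_desc(xsize, ysize, xfirst, xinc, yfirst, yinc):
    """Return a CDO grid description for a regular lon/lat grid."""
    lines = [
        "gridtype = lonlat",
        "xsize    = {}".format(xsize),    # number of longitudes
        "ysize    = {}".format(ysize),    # number of latitudes
        "xfirst   = {}".format(xfirst),   # first longitude (degrees east)
        "xinc     = {}".format(xinc),     # longitude step (degrees)
        "yfirst   = {}".format(yfirst),   # first latitude (degrees north)
        "yinc     = {}".format(yinc),     # latitude step (degrees)
    ]
    return "\n".join(lines) + "\n"


# Invented values, for illustration only.
desc = lonlat_grid_desc(xsize=200, ysize=150, xfirst=144.0, xinc=0.02,
                        yfirst=-39.0, yinc=0.02)
print(desc)
```

The returned text would then be saved as grid_desc.txt inside the regular-domain directory.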

The above approach supports as many regular lat/lon domains as we want: potentially one for ACCESS-R, one for each of the ACCESS-C's, etc. If the script is invoked with a -r option for which the associated directory or grid_desc.txt file is missing, it flags that as an error to the user and aborts.
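
A check of this kind might look like the sketch below. It is not the code from RF3_toRegLatLon.py: `check_regdomain` is a hypothetical helper, and `reg_root` stands for the ".../RF3_Data/Reg_LatLon" directory.

```python
import os


def check_regdomain(reg_root, domain, regdomain):
    """Return the path to grid_desc.txt, or raise if the setup is incomplete."""
    domain_dir = os.path.join(reg_root, domain, regdomain)
    if not os.path.isdir(domain_dir):
        raise IOError("Regular-domain directory is missing: " + domain_dir)
    grid_desc = os.path.join(domain_dir, "grid_desc.txt")
    if not os.path.isfile(grid_desc):
        raise IOError("Grid description file is missing: " + grid_desc)
    return grid_desc
```

A caller would wrap this in a try/except, report the message to the user and exit, for example `check_regdomain("/some/topdir/Reg_LatLon", "310", "Test_Buckland")` with a made-up top-level directory.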

Note that, by default, masks are removed from the data in this process (see the -m parameter above). There are pros and cons to this, but if masks are left in place it becomes difficult to form accumulated fields later - a mask in any one of 24 one-hour fields prevents the accumulation of the other 23 values to produce daily rain. Another advantage of removing the mask is that interpolation weights can be pre-generated and then used repeatedly - this is not the case when masks vary in time, as they usually do.
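
The accumulation problem is easy to see with a toy example. Here NaN stands in for a masked value at a single grid-point (real masked netCDF data behaves analogously when fields are added together), and the rainfall values are invented.

```python
import math

# 24 hourly values at one grid-point; hour 7 is "masked" (NaN stand-in).
hourly = [0.5] * 24
hourly[7] = float("nan")

# A straightforward accumulation is poisoned by the single missing hour.
daily = sum(hourly)
print(math.isnan(daily))

# The other 23 hours are perfectly good, but their total is lost above.
valid_total = sum(value for value in hourly if not math.isnan(value))
print(valid_total)
```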

RF3 fields are generally stored as packed fields. By default, we unpack them here because it's useful later if you want to form accumulations (which otherwise can easily grow too large to "fit" within the packing chosen for the initial time-period).
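
To illustrate why accumulations may not fit, the sketch below works out the range of values a packed field can represent. It assumes 16-bit signed integer packing with the usual netCDF scale_factor/add_offset convention (unpacked = packed * scale_factor + add_offset) and a positive scale_factor. The scale, offset and rainfall values are hypothetical, not taken from real RF3 files.

```python
def packed_range(scale_factor, add_offset, bits=16):
    """Lowest and highest unpacked values a signed packed integer can hold."""
    lowest = -(2 ** (bits - 1))
    highest = 2 ** (bits - 1) - 1
    return (lowest * scale_factor + add_offset,
            highest * scale_factor + add_offset)


def fits(value, scale_factor, add_offset, bits=16):
    """True if value can be represented with the given packing."""
    low, high = packed_range(scale_factor, add_offset, bits)
    return low <= value <= high


SCALE, OFFSET = 1.0 / 128, 0.0  # hypothetical packing parameters
hourly_value = 20.0             # hypothetical hourly rainfall (mm)

print(fits(hourly_value, SCALE, OFFSET))       # one hour fits
print(fits(24 * hourly_value, SCALE, OFFSET))  # 24-hour total does not
```

With these made-up parameters the largest representable value is just under 256, so a single hourly value fits while a 24-hour total of 480 does not. Unpacking to floating point before accumulating avoids the problem.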

To process the curvilinear data generated previously, we run

python RF3_toRegLatLon.py -s 201609272300 -e 201609280700 -r Test_Buckland -v on

which produces terminal log output of

An example of the generated files is:

> ls
310_20160928_000000.prcp-c60_Test_Buckland.nc
310_20160928_010000.prcp-c60_Test_Buckland.nc
310_20160928_020000.prcp-c60_Test_Buckland.nc
310_20160928_030000.prcp-c60_Test_Buckland.nc
310_20160928_040000.prcp-c60_Test_Buckland.nc
310_20160928_050000.prcp-c60_Test_Buckland.nc
310_20160928_060000.prcp-c60_Test_Buckland.nc
310_20160928_070000.prcp-c60_Test_Buckland.nc

and, being (finally!) on a regular lat/lon grid, this data can be visualised in any number of ways, eg., using UMexplore:

RF3_regular_two.png

Further testing, June 2017

Because I'd made a number of changes around masking and the use of pre-generated interpolation weights, I thought it best to test the chain again. Following is a series of comparisons of RF3 merged hourly-precip against an archive of BoM Radar images from the WeatherChaser site.

RF3_processing_test_20170329_13Z.png RF3_processing_test_20170329_16Z.png RF3_processing_test_20170329_18Z.png

Further testing

Spinning this off to a separate page to avoid clutter.

Links

Back to Rainfields3 Data on Gadi
