
Running ACCESS-C aps1_access_c

The operational ACCESS-C standard experiment aps1_access_c is provided on NCI using Rose-Cylc technology for simplicity. Rose-Cylc is the new technology for running UM models in a user-friendly manner. In this section, we cover the preparation, installation, and running of the standard experiment. Finally, we provide some useful information about the operational ACCESS-C standard experiment, such as the CPU hours and disk space required to run each domain, sample forecast data for comparison, and sample plots for reference.

Note: The UM version of APS1 ACCESS-C is 7.6, which is not compatible with Rose. Therefore, you cannot actually configure aps1_access_c using Rose. However, we have managed to run aps1_access_c within the Rose-Cylc framework to make running the model an easy task for users. The next section, Model in Details, covers the modification of settings and other advanced topics.

Preparing the Environment

raijin is the HPC system and accessdev is a cloud machine, both hosted at NCI. In general, the aps1_access_c suite control is located on accessdev, and all other files, including executables, configurations, namelists, ancillaries, shell scripts and so on, are located on raijin. Each task is fired up on accessdev and runs remotely on raijin via the PBS queuing system.

Before carrying out the experiment, your environment on raijin needs to be set up as follows:

  1. Update .rashrc on raijin as follows,

setenv PROJECT dp9

setenv SHELL /bin/bash

Please change dp9 to your own default/valid project.

  2. Update .bashrc on raijin by adding the following lines to the script,

module use ~access/modules

module load rose

module load cylc

Note: a Rose/Cylc wrapper has been provided to run suites. If you have problems starting the suite on accessdev, please remove those three lines from .bashrc.

  3. Delete or comment out on raijin all other module load statements in .bashrc and other login scripts such as .login, .profile, .bash_profile, etc.

We do not need to do anything on accessdev for running the standard experiments.

  4. Make sure you have set up passwordless ssh communication for raijin->accessdev, accessdev->raijin, raijin->raijin, and accessdev->accessdev. Please refer to https://trac.nci.org.au/trac/access/wiki/sshUserGuide for help; a quick sanity check follows this list.
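As a quick sanity check of the ssh and module setup above (a minimal sketch; run each command from the machine noted in its comment):

ssh accessdev true   # from raijin: should return without a password prompt
ssh raijin true      # from accessdev: should return without a password prompt
rose --version       # on raijin: confirms the rose module is loaded
cylc --version       # on raijin: confirms the cylc module is loaded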

APS1 ACCESS-C Installation

On raijin, please run the installation script,

~access/AccessModelExperimentLibrary/aps1_access_c/install_aps1_access_c.ksh

The script installs the following on raijin and accessdev (a quick verification sketch follows this list),

  1. Install raijin:$HOME/roses/aps1_access_c; inside this directory,
  • beans: Containing the UM and reconfiguration executables, archiving scripts and some utility scripts
  • conf: Containing ancillary, STASHMASTER and other configuration files
  • bin: Containing all the Cylc scripts and the model settings file env.aps1_access_c

The installation on raijin takes up around 70M of disk space and should complete in a few seconds, depending on how busy the system is.

  2. Install accessdev:$HOME/roses/aps1_access_c
  • Containing all the Rose suite definition, info and conf files

This does not take much space and should complete fairly quickly.

  3. Install raijin:/short/$PROJECT/$USER/aps1_access_c_src
  • Containing the UM and reconfiguration source code of APS1 ACCESS-C

The source code takes around 200M of space, and copying the complete code to the default /short/$PROJECT/$USER location may take 5-10 minutes.
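After the script finishes, a quick listing confirms everything landed where expected (a minimal sketch; the directory names follow the list above):

ls ~/roses/aps1_access_c                      # on raijin: beans, conf, bin
ls ~/roses/aps1_access_c                      # on accessdev: suite definition, info and conf files
ls /short/$PROJECT/$USER/aps1_access_c_src    # on raijin: UM and reconfiguration source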

The structure of aps1_access_c will be analysed in detail in the next section, Model in Details.

Running APS1 ACCESS-C

Go to accessdev:$HOME/roses/aps1_access_c and type,

rose suite-run

Note1: For simplicity, the default suite.rc only contains one domain, sydney, for testing. If you want to test all 6 domains, please rename suite.rc.full to suite.rc. You may also compare the two files to understand how to choose domains of interest.
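For example, to switch to the full six-domain configuration while keeping a backup of the single-domain suite (the backup name is just a suggestion):

cd $HOME/roses/aps1_access_c        # on accessdev
cp suite.rc suite.rc.sydney         # back up the single-domain suite
cp suite.rc.full suite.rc           # activate all 6 domains
diff suite.rc.sydney suite.rc.full  # see how domains of interest are selected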

Note2: Please note that the job AC_cleanup is deliberately put at held by AC_setup (there is a statement in AC_setup.sh to hold the task AC_cleanup; you may comment this out in future for long-period experiments). AC_cleanup is held because otherwise all the tasks disappear from gcylc after the cycle completes, which some users new to Cylc may find a bit confusing at their very first test.

After rose prints many lines of useful messages on screen, a gcylc window will pop up (this may take from a few seconds up to 20 seconds, depending on the load of the machines and the communication between raijin and accessdev) and the suite runs ACCESS-C automatically for 2014011800Z (which is specified in suite.rc); see the following screenshot for an example,

[Image: gcylc1.png]

You may switch the gcylc view to monitor the running suite from different aspects, as illustrated in the following screenshot,

[Image: gcylc2.png]

For each job, it takes a few seconds for the job to be sent to the PBS queuing system (if the job runs on compute nodes), where it is queued (light green); depending on the resources the job requests, it may take seconds, minutes or even longer in the queue before the job starts on the compute nodes, at which point it shows as running (green) in the Cylc window. All the preliminary tasks are quite quick, except for the make_lbc step, which takes 30-60 min. We have listed the running time for each of the 6 domains in the section APS1 ACCESS-C Resources Required For Running below. Please note that the table shows the running times using 100 and 196 cores, respectively; however, the standard experiment uses 64 cores by default, so users should be aware that the running time will be proportionately longer than with 100 or 196 cores.

Please check the next chapter Model in Details and the section ACCESS-C Cylc Suite in Details below for approximate running times of other jobs.

The output forecast data will be in raijin:/short/$PROJECT/$USER/aps1_access_c_S.

You may compare the forecast data with the reference data located in

~access/AccessModelExperimentLibrary/aps1_access_c/data/2014011800

Inside this folder, the forecast data for each domain is located in the AD, BN, DN, PH, SY and VT subdirectories. In addition to the forecast data, the running outputs from jobs such as makebc, recon and um are also kept there for reference. For example, users may compare um.fort6.pe0 with the standard one to investigate whether the UM is running correctly: usually, when running with default settings, the two um.fort6.pe0 files should be exactly the same except for some path and timing differences.
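For instance, a comparison for the Sydney domain might look like the following (a sketch only; <run_dir> stands for wherever your um.fort6.pe0 ended up under /short/$PROJECT/$USER/aps1_access_c_S):

diff <run_dir>/um.fort6.pe0 \
     ~access/AccessModelExperimentLibrary/aps1_access_c/data/2014011800/SY/um.fort6.pe0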

The forecast data in netCDF format is located in raijin:/short/$PROJECT/$USER/ncdata/aps1_access_c_S.

Note that _S is appended to aps1_access_c: S is the Cylc suite ID, defined as EXPTID in raijin:$HOME/roses/aps1_access_c/bin/env.aps1_access_c. When running, the suite reads env.aps1_access_c to set variables on the fly.

Check Job Output

It is helpful to check the output of a completed job regardless of its completion status. Users may right-click a task in gcylc and choose to view the job script or the err/out log files. Additionally, users can go to

https://accessdev.nci.org.au/rose-bush/

to check those files online. Simply enter your login in the user-ID box and click Suites List; you should then see a list of the suites you have been running over the last months.

[Image: rose-bush_index.PNG] [Image: rose-bush_suites.PNG]

Note: Unfortunately, PBS on raijin holds real-time output in a temporary place, so it is not possible to view those files before the job completes.

Specify Running Dates

The initial cycle time and final cycle time are specified in accessdev:$HOME/roses/aps1_access_c/suite.rc. Both default to 2014011800 because the test case only does one run, 20140118 00Z (UTC). Please update these two variables for different test cases. The initial cycle time and final cycle time do not need to be the same, but the final cycle time must not be earlier than the initial cycle time. For example,

initial cycle time = 2014011800

final cycle time = 2014013112

will set the suite to run from 20140118 to 20140131 for both the 00Z and 12Z runs.
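In suite.rc these two settings live in the scheduling section; a minimal sketch in pre-Cylc-6 syntax (check your copy of suite.rc for the exact layout):

[scheduling]
    initial cycle time = 2014011800
    final cycle time   = 2014013112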

Note: By default, the suite gets the ACCESS-R dump and pi files from raijin:/g/data/rr4/samnmc/access-r to generate the LBC and IC. You may check data availability in that location before specifying the cycle times. Please note the data is organised in the structure raijin:/g/data/rr4/samnmc/access-r/YYYY/MM. If the data is not available there yet, you may contact Tan Le ([email protected]) or Wenming ([email protected]) to ask for the data to be copied over from the BoM data repository.
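For example, to check whether the input data for the default January 2014 test period is available (following the YYYY/MM layout above):

ls /g/data/rr4/samnmc/access-r/2014/01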

Also, you can try your own data set as the input for the dump and pi files. You need to edit the file raijin:$HOME/roses/aps1_access_c/bin/env.aps1_access_c, in the section starting with # initial, pi data etc.

Environment Setting

You may modify the settings by updating raijin:$HOME/roses/aps1_access_c/bin/env.aps1_access_c. Most of the system variables are defined in this file. A list of variables you may need to modify is as follows (an illustrative sketch follows the list),

  • EXPTID: an ID for the experiment, default S.
  • DATADIR0: path where all the forecast data goes.
  • INITIAL_DIR: path where the dump file is.
  • PI_DIR: path where the pi files are.
  • INITIAL_DATA: naming convention of the dump file.
  • PIPRE: pi files prefix.
  • PI_DATA: naming convention of pi files.
  • PI_MAX: the number of pi files.
  • ARCV_CLEAN: pp files to be removed from disk.
  • ARCV_GRB: pp files to be converted to grib format.
  • ARCV_NCDF: pp files to be converted to netCDF format.
  • LOG: path to hold log files.
  • LOG_TASKS: path to the central log files.
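For illustration, the relevant part of env.aps1_access_c might look like this (a sketch only; the values shown are placeholders rather than the shipped defaults, so check the installed file for the real syntax and values):

# experiment ID, appended to output directory names, e.g. aps1_access_c_S
EXPTID=S
# where all the forecast data goes
DATADIR0=/short/$PROJECT/$USER
# where the ACCESS-R dump and pi files are read from (default ACCESS-R archive)
INITIAL_DIR=/g/data/rr4/samnmc/access-r
PI_DIR=/g/data/rr4/samnmc/access-r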

You can also change the location where the netCDF data is placed. Go to raijin:$HOME/roses/aps1_access_c/beans/mars/um2nc and update the variable NC_ARCV_DIR="/short/$PROJECT/$USER/ncdata/" to point to a new place.

APS1 ACCESS-C Resources Required For Running

We list the resources required for running ACCESS-C, including wall time, CPU time and data size, in the following table. Users may use it as a reference to estimate the cost of experiments on NCI.

| City | Domain | Num. of Cores | Wall Time (min) | CPU Time (min) | Input Data Size (IC/LBC) | Output Data Size (GB) |
| --- | --- | --- | --- | --- | --- | --- |
| Brisbane | BN | 100 | 28 (22*) | 2756 | 294M/3.9G | 16.0 |
| Sydney | SY | 100 | 25 (20*) | 2420 | 272M/3.5G | 15.0 |
| VicTas | VT | 100 | 52 (38*) | 4760 | 620M/5.5G | 33.0 |
| Adelaide | AD | 100 | 38 (28*) | 3375 | 438M/4.9G | 23.0 |
| Perth | PH | 100 | 37 (27*) | 3605 | 396M/4.7G | 16.0 |
| Darwin | DN | 100 | 28 (25*) | 2480 | 309M/4.1G | 18.0 |
  • Note*: Wall time when 196 cores are used.
  • The column Input Data Size (IC/LBC) lists the sizes of the initial condition and lateral boundary condition files (IC/LBC), respectively.
  • Output Data Size (GB) only includes the forecast data in pp format. The usage is larger if you also keep the forecast data in grib and netCDF format.
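As a rough worked example of cost (assuming service units scale with core-hours; actual charge rates depend on the queue): one Sydney run uses about 2420 CPU-minutes ≈ 40 core-hours, so running SY twice daily over the 14-day period 20140118-20140131 costs roughly 28 × 40 ≈ 1130 core-hours.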

Sample Plots

In this section, we list a number of plots for reference.

BN 20130204 00Z time+2 Forecast: Mslp_Precip(1 Hour); time+1 Forecast: Sfc Temp, Sfc Dewpoint, Sfc Wind

  • Note: For all plots, please click the image to display it at full size.
[Image: mslp-precipsfcACCESS_Cr20130204T00000020130204T020000~BN.gif] [Image: TScreensfcACCESS_Cr20130204T00000020130204T010000~BN.gif]
[Image: TdScreensfcACCESS_Cr20130204T00000020130204T010000~BN.gif] [Image: windbarb10mACCESS_Cr20130204T00000020130204T010000~BN.gif]

Sample Plots For Extreme Weather Cases

VT 20130321 00Z time+1 to time+6 Forecast: Mslp_Precip(1 Hour)

[Image: mslp-precip1.gif] [Image: mslp-precip2.gif] [Image: mslp-precip3.gif]
[Image: mslp-precip4.gif] [Image: mslp-precip5.gif] [Image: mslp-precip6.gif]

Oswald BN 20130125 00Z time+2 Forecast: Mslp_Precip(1 Hour); time+1 Forecast: Sfc Rel. Humidity, 850hPa Wind, Geopotential Height Wind

[Image: mslp-precipsfcACCESS_Cr20130125T00000020130125T020000~BN.gif] [Image: RHScreensfcACCESS_Cr20130125T00000020130125T010000~BN.gif]
[Image: windbarb850hPaACCESS_Cr20130125T00000020130125T010000~BN.gif] [Image: HtWind850hPaACCESS_Cr20130125T00000020130125T010000~BN.gif]
