Running regression test using rt.sh - ufs-community/ufs-weather-model GitHub Wiki

NOTE: THIS PAGE IS UNDER CONSTRUCTION

Update 05/31/2023: Major updates were made to this wiki on May 31, 2023. The most recent RT changes are: the argument -a was added so users can provide the HPC account used to access the queues; Intel and GNU tests are now in the same rt.conf; the opnReqTests script runs only supported cases if -c is not provided; logs now provide hash information.

Update 01/06/2021: On January 6, 2021, the argument -f was removed. Adding it will force rt.sh to exit immediately. The default for rt.sh is to run the full regression tests in rt.conf unless -l xyz.conf is provided.

Shell script-based Regression Test: rt.sh

Set of simple shell script files and input files
Build + Run
Run only (workaround)
Supports Rocoto and ecFlow workflow managers
./rt.sh to run full regression tests
./rt.sh -c to create new baselines (in your own directory)
./rt.sh -m to verify your own baselines that you created beforehand with -c
Regression test root directory: ufs-weather-model/tests

rt.sh related Files

rt.sh calls:

detect_machine.sh
compile.sh
run_test.sh which calls rt_utils.sh
rt.sh uses the following input files:
- rt.conf
- default_vars.sh
- <test-name>
- <run-setup-name>

rt.sh related Files

detect_machine.sh: detect and assign machine name, set account (nems is default)
compile.sh: calls build.sh which in turn invokes CMake build
run_test.sh: sets env variables, run directory, etc., prepares a canned case in the run directory, and calls rt_utils functions
bl_date.conf: Set date for where to place RT baselines (YYYYMMDD)
rt_utils.sh: contains utility functions, e.g.,
- submit_and_wait
- check_results
- rocoto_create_compile_task, rocoto_create_run_task, rocoto_run, rocoto_kill
- ecflow_create_compile_task, ecflow_create_run_task, ecflow_run, ecflow_kill

Input Files: 1) rt.conf

COMPILE line (items separated by |)
- Item 1: COMPILE - Tells rt.sh that the following information is to be used in setting up a compile job
- Item 2: Compile number - must be sequential in rt.conf; used as a reference for compile failures
- Item 3: Compiler to use in the build (intel or gnu)
- Item 4: CMake options - provide all CMake options for the build
- Item 5: Machines to run on (- excludes the listed machines, + restricts the job to the listed machines)
  - Example: + hera orion gaea = compile will run only on the hera, orion, and gaea machines
  - Example: - wcoss2 acorn = compile will NOT run on wcoss2 or acorn
- Item 6: Set to fv3. Controls the compile job only if FV3 is present; previously used to run a test without compiling code
RUN line (items separated by |). NOTE: The build resulting from the COMPILE line above the RUN line is used to run the test
- Item 1: RUN - Tells rt.sh that the following information is to be used in setting up a model run
- Item 2: Test name (which test in the tests/tests directory should be sourced)
- Item 3: Machines to run on (same - and + syntax as Item 5 of the COMPILE line)
- Item 4: Controls whether the run creates its own baseline or uses the baseline from a different (control) test
- Item 5: Test name to compare baselines with, if not itself
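
The field layout above can be illustrated with a small, self-contained sketch. It writes a two-line sample file, splits each line on |, and applies the + / - machine rule. The CMake options, the "baseline" value in Item 4, the helper names (trim, machine_selected), and the MACHINE value are assumptions made for illustration; this is not the parsing code used by rt.sh.

```shell
#!/usr/bin/env bash
# Illustrative sample only: the CMake options and the "baseline" keyword
# are assumptions, not copied from the real rt.conf.
cat > my_test.conf <<'EOF'
COMPILE | 1 | intel | -DAPP=S2SWA -DCCPP_SUITES=FV3_GFS_v17_coupled_p8 | - wcoss2 acorn | fv3 |
RUN | cpld_control_p8 | + hera orion gaea | baseline |
EOF

# Strip leading and trailing whitespace from a field.
trim() {
  local s="$1"
  s="${s#"${s%%[![:space:]]*}"}"
  s="${s%"${s##*[![:space:]]}"}"
  printf '%s' "$s"
}

# Return 0 if a machine spec ("+ a b", "- a b", or empty) selects the machine.
machine_selected() {
  local spec="$1" machine="$2" mode listed=false m
  [[ -z "$spec" ]] && return 0          # empty spec: run everywhere
  mode="${spec:0:1}"
  for m in ${spec:1}; do                # unquoted on purpose: split on spaces
    [[ "$m" == "$machine" ]] && listed=true
  done
  if [[ "$mode" == "+" ]]; then
    [[ "$listed" == true ]]
  else
    [[ "$listed" == false ]]
  fi
}

MACHINE=hera   # hypothetical; rt.sh obtains this via detect_machine.sh

while IFS='|' read -r kind f2 f3 f4 f5 _; do
  kind="$(trim "$kind")"
  case "$kind" in
    COMPILE)
      if machine_selected "$(trim "$f5")" "$MACHINE"; then
        echo "compile $(trim "$f2") [$(trim "$f3")]: $(trim "$f4")"
      fi ;;
    RUN)
      if machine_selected "$(trim "$f3")" "$MACHINE"; then
        echo "run $(trim "$f2")"
      fi ;;
  esac
done < my_test.conf
```

With MACHINE=hera both lines are selected; setting it to wcoss2 would skip both, the COMPILE line through the - list and the RUN line through the + list.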

Input Files: 2) tests/<test-name>

Two levels to set simulation parameters
- default_vars.sh sets default values
- <test-name> overrides default values, adds test-specific parameters, e.g.,
  - SYEAR=2013, FHMAX=24, FDIAG=6, WLCLK=30
Set environment variables that are passed onto various template files in parm/
- input.*.nml.IN
- nems.configure.*.IN
- model_configure.IN
Specify configuration templates to use, e.g.,
- INPUT_NML="input.mom6_ccpp.nml.IN"
- NEMS_CONFIGURE="nems.configure.med_atm_ocn_ice_wav.IN"
- FV3_RUN="cpld_fv3_mom6_cice_atm_flux_run.IN"
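
The two-level scheme (defaults first, then test-specific overrides, then substitution into a template) can be sketched as follows. The default values, the file names my_test_vars and model_configure.sample, and the @[NAME] placeholder with the nhours_fcst key are assumptions for illustration; check default_vars.sh and the files in parm/ for the actual contents.

```shell
#!/usr/bin/env bash
# Stand-in for default_vars.sh: these values are placeholders, not the real defaults.
set_defaults() {
  export SYEAR=2016 FHMAX=48 FDIAG=3 WLCLK=15
}

# Stand-in for a tests/<test-name> file, using the example values listed above.
cat > my_test_vars <<'EOF'
export SYEAR=2013
export FHMAX=24
export FDIAG=6
export WLCLK=30
export INPUT_NML="input.mom6_ccpp.nml.IN"
EOF

set_defaults            # level 1: defaults
source ./my_test_vars   # level 2: test-specific values override the defaults

echo "SYEAR=$SYEAR FHMAX=$FHMAX FDIAG=$FDIAG WLCLK=$WLCLK INPUT_NML=$INPUT_NML"

# Hypothetical one-line template; the @[FHMAX] placeholder syntax is an assumption.
printf 'nhours_fcst: @[FHMAX]\n' > model_configure.IN.sample
sed -e "s/@\[FHMAX]/${FHMAX}/g" model_configure.IN.sample > model_configure.sample
cat model_configure.sample
```

Because the test file is sourced after the defaults, any variable it sets wins, and variables it leaves alone keep their default values.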

Input Files: 3) fv3_conf/<run-setup-name>

Set up input data, grid data, etc. by copying files from baseline directory to run directory
Baseline directory contains
- Subdirectories for input data (e.g., CICE_IC, MOM6_IC, FV3_input_data, MEDIATOR_ccpp)
- Subdirectories for previous run results (e.g., cpld_control_p8_intel or cpld_control_p8_gnu)
Make sure directories and files exist in RTPWD

Default Directories specified in rt.sh

Baseline directory (RTPWD)
- Hera: /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-YYYYMMDD
- Orion: /work/noaa/nems/emc.nemspara/RT/NEMSfv3gfs/develop-YYYYMMDD
Run directory root (RUNDIR_ROOT)
- Hera: /scratch1/NCEPDEV/stmp2/${USER}/FV3_RT/rt_$$
- Orion: /work/noaa/stmp/${USER}/stmp/${USER}/FV3_RT/rt_$$
- RUNDIR=${RUNDIR_ROOT}/${TEST_NAME}
New baseline directory (NEW_BASELINE)
- Hera: /scratch1/NCEPDEV/stmp4/${USER}/FV3_RT/REGRESSION_TEST
- Orion: /work/noaa/stmp/${USER}/stmp/${USER}/FV3_RT/REGRESSION_TEST
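
As a rough illustration of how these paths fit together, the sketch below assembles the Hera values for one test. The variable name BL_DATE, its value, and the chosen TEST_NAME are assumptions; the date normally comes from bl_date.conf.

```shell
#!/usr/bin/env bash
# Hera paths from the list above; BL_DATE and TEST_NAME are example values.
USER="${USER:-$(id -un)}"
BL_DATE=20230531              # hypothetical date, normally taken from bl_date.conf
TEST_NAME=cpld_control_p8     # example test name

RTPWD="/scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/develop-${BL_DATE}"
RUNDIR_ROOT="/scratch1/NCEPDEV/stmp2/${USER}/FV3_RT/rt_$$"   # $$ = PID of this shell
RUNDIR="${RUNDIR_ROOT}/${TEST_NAME}"
NEW_BASELINE="/scratch1/NCEPDEV/stmp4/${USER}/FV3_RT/REGRESSION_TEST"

# printf reuses the format string for each name/value pair.
printf '%-13s %s\n' \
  RTPWD "$RTPWD" \
  RUNDIR_ROOT "$RUNDIR_ROOT" \
  RUNDIR "$RUNDIR" \
  NEW_BASELINE "$NEW_BASELINE"
```

Since RUNDIR_ROOT embeds the process ID, each invocation gets its own run directory root, which is why the -k option matters if you want to inspect it afterward.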

Build

Triggered by COMPILE row in rt.conf with specified <MAKE_OPT>
Build is done using CMakeLists.txt in ..
compile.sh is a simple wrapper to interface with rt.sh
If you build the executable separately (i.e., without rt.sh), copy it to ufs-weather-model/tests/fv3_0.exe. Also,
- $ cp ../NEMS/src/conf/modules.nems modules.fcst_0
If you want to reuse your executable, keep a copy under a different name

rt.sh Usage

./rt.sh: display usage information
./rt.sh -a | -c | -e | -h | -k | -w | -d | -l | -m | -n | -r
- -a HPC account to use for the queue (takes the account name as its argument)
- -c create new baseline results
- -e use ecFlow workflow manager
- -h display this help
- -k keep run directory after rt.sh is completed
- -l run the tests specified in the given file (takes a file name, e.g., -l my_test.conf)
- -m compare against new baseline results
- -n run a single test (takes a test name, e.g., -n cpld_control_p8)
- -r use Rocoto workflow manager
- -w for weekly_test, skip comparing baseline results
- -d delete run directories that are not used by other tests

Run Full Regression Tests

If you make code changes that are not expected to change simulation results, you can run full regression tests afterward to demonstrate your changes do not break anything
Currently, there are over 150 standard regression tests built and run on supported TIER-1 platforms.
In ufs-weather-model/tests/ directory, use any one of the following:
- $ ./rt.sh -a <account> >output 2>&1 & (run rt.sh using HPC account <account>)
- $ ./rt.sh -e >output 2>&1 & (use ecFlow)
- $ ./rt.sh -r >output 2>&1 & (use Rocoto)
- $ ./rt.sh -ek >output 2>&1 & (use ecFlow, keep run directory for post-run diagnosis)

Run a Single Regression Test

Create a file, say my_test.conf, with a single COMPILE and a single RUN
- $ cp rt.conf my_test.conf
- $ vi my_test.conf
- $ ./rt.sh -l my_test.conf
Or use -n option
- $ ./rt.sh -n cpld_control_p8 >output 2>&1 &
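
Instead of editing the copy by hand, the single COMPILE and RUN pair can be pulled out with a short script. The sketch below relies on the rule that a RUN line uses the COMPILE line above it; the sample file contents and the helper name extract_test are illustrative assumptions, and the sample is written to rt.conf.sample so that the real rt.conf is not touched.

```shell
#!/usr/bin/env bash
# Sample input standing in for rt.conf; test names and options are illustrative.
cat > rt.conf.sample <<'EOF'
COMPILE | 1 | intel | -DAPP=S2SWA | | fv3 |
RUN | cpld_control_p8 | | baseline |
RUN | cpld_restart_p8 | | | cpld_control_p8
COMPILE | 2 | intel | -DAPP=ATM | | fv3 |
RUN | control_p8 | | baseline |
EOF

# Print the COMPILE line that precedes the requested RUN line, then the RUN line.
# Usage: extract_test <conf-file> <test-name>
extract_test() {
  awk -F'|' -v want="$2" '
    {
      kind = $1; name = $2
      gsub(/^[ \t]+|[ \t]+$/, "", kind)   # trim whitespace around the fields
      gsub(/^[ \t]+|[ \t]+$/, "", name)
    }
    kind == "COMPILE"             { compile = $0 }          # remember latest build
    kind == "RUN" && name == want { print compile; print $0 }
  ' "$1"
}

extract_test rt.conf.sample control_p8 > my_test.conf
cat my_test.conf
```

The resulting my_test.conf holds the second COMPILE line and the control_p8 RUN line, and can then be passed to rt.sh with -l.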

Create a New Baseline of Existing Test

Your code changes are expected to change simulation results (e.g., physics change), and thus cannot be compared against existing baseline results
You still need RTPWD as it contains the simulation input data
./rt.sh -c (full rt.conf; the -f argument has been removed) OR ./rt.sh -c -l my_test.conf
- rt.sh will copy input data from RTPWD to NEW_BASELINE
- Simulation results will be copied from RUNDIR to NEW_BASELINE
Manually move your NEW_BASELINE to emc.nemspara

Run Regression Test against New Baseline

You have generated a new baseline
You want to compare all your subsequent tests against the new baseline
./rt.sh -m (full rt.conf; the -f argument has been removed) OR ./rt.sh -m -l my_test.conf
Internally, -m flag sets RTPWD=${NEW_BASELINE}
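
The effect of -m can be pictured with a minimal option-handling sketch. This is not the code in rt.sh: the function name resolve_baseline, the variables CREATE_BASELINE and TESTS_FILE, and the placeholder paths are assumptions. Only the behavior it mimics (-m points RTPWD at NEW_BASELINE, and the removed -f argument is rejected) comes from this page.

```shell
#!/usr/bin/env bash
# Hypothetical option handling; names other than RTPWD and NEW_BASELINE are assumptions.
resolve_baseline() {
  local OPTIND=1 opt
  CREATE_BASELINE=false
  RTPWD=/path/to/existing/baseline       # placeholder path
  NEW_BASELINE=/path/to/new/baseline     # placeholder path
  while getopts ":cml:" opt; do
    case "$opt" in
      c) CREATE_BASELINE=true ;;          # write results to NEW_BASELINE
      m) RTPWD="$NEW_BASELINE" ;;         # compare against your own baseline
      l) TESTS_FILE="$OPTARG" ;;          # alternative to the full rt.conf
      *) echo "unsupported option: -$OPTARG" >&2; return 1 ;;
    esac
  done
}

resolve_baseline -m -l my_test.conf
echo "comparing against: $RTPWD (tests from $TESTS_FILE)"
```

In this sketch, calling resolve_baseline -c -f returns a non-zero status, analogous to rt.sh exiting when -f is supplied.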

Add a New Test

Configuration files (select, or copy and modify):
- rt.conf
- tests/<test-name>
- tests/fv3_conf/
- parm/input.*.nml.IN
- parm/nems.configure.*.IN
- parm/model_configure.IN
- parm/ice_in_template
- parm/MOM_input_template
./rt.sh -c -l my_test.conf
- Will not compare with existing baselines
If your case requires new input data not in RTPWD, set RTPWD to your local directory

Already have an Executable File (needs update)

Remove COMPILE row in rt.conf
$ cp ../NEMS/exe/NEMS.x fcst_0.exe
$ cp ../NEMS/src/conf/modules.nems modules.fcst_0
- This module file needs to be identical to the one you used for build
$ ./rt.sh
This approach does not work with workflow managers because RUN depends on COMPILE

Output Files and Log Files for Diagnosis

Summary files
- Hera: RegressionTests_hera.log
- Orion: RegressionTests_orion.log
- MISSING file, MISSING baseline, OK, NOT OK...
output (when started as ./rt.sh >output 2>&1 &): standard output and error of rt.sh
Log files in log_hera/ and log_orion/
- compile_*.log: output of compile.sh
- run_*.log: output of run_test.sh
Run directory RUNDIR_ROOT/
- subdir: contains all files necessary for simulation, e.g., sbatch job_card
- QUEUE is set in rt.sh
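
A quick way to triage a summary file is to search it for the failure keywords. The sketch below builds a small sample log and counts the problem lines; the line layout and file names in the sample are hypothetical, and only the status words (OK, NOT OK, MISSING) are taken from the list above.

```shell
#!/usr/bin/env bash
# Sample summary log. The line layout is hypothetical; only the status words
# (OK, NOT OK, MISSING ...) come from the description above.
cat > RegressionTests_sample.log <<'EOF'
Comparing atmf024.nc .........OK
Comparing sfcf024.nc .........NOT OK
Comparing RESTART/fv_core.res.nc .........MISSING baseline
Comparing phyf024.nc .........OK
EOF

# Count lines that report a failed or missing comparison.
# A plain search for "OK" would also match "NOT OK", so match the failures explicitly.
count_problems() {
  grep -c -E 'NOT OK|MISSING' "$1"
}

echo "problems: $(count_problems RegressionTests_sample.log)"

# Show the offending lines with their line numbers.
grep -n -E 'NOT OK|MISSING' RegressionTests_sample.log
```

On a real run, point the same search at RegressionTests_hera.log or RegressionTests_orion.log, then open the matching compile_*.log or run_*.log for details.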
