OptClim-UKESM more realistic example

Overview

It is assumed that the demonstration case has already been run, both to confirm that OptClim works (see the wiki page for this) and to gain some familiarity.

If you wish to replicate a study based on an existing base job, note the differences between:

  • A model run comprising a single job (an NRUN without CRUNs), e.g. base suite ~mjmn02/roses/base-db167. This has the optclim tasks included in its initial cycle, "R1".
  • An NRUN with CRUNs, e.g. base suite obase-db898, which currently runs for 15 months at 1 month per SLURM job, so each instance uses 15 cycles on Cylc. This has the optclim_reuse and optclim_prerun tasks in the R1 initial cycle that runs in the NRUN. The optclim_postrun task is defined a bit further down, in a new graph for a cycle that executes once after the end of the model's iterations. OptClim does not "know" about the CRUNs; it is simply told, as usual, when the model has completed all its jobs. A study example is in /work/n02/shared/mjmn02/OptClim/optclim3/studies/UKESM_P2/898. The last few cycles of a running instance can be viewed using gcylc (from cylc gscan, on Puma). [image: gcylc view of the last few cycles] (Note: housekeeping is probably not needed in the final cycle.)

Below:

  1. How to replicate a run
  2. How to generate a new base suite

To replicate a study

On Puma, copy the base suite

Copy the suite directory; the name of the destination directory defines the new suite name. For example:

cp -rl ~mjmn02/roses/obase-db898 . # if you rename it, also amend the JSON on ARCHER2 as described below
Edit ARCHER2_USERNAME on line 8 of rose-suite.conf.
To shorten the 15-month run, edit EXPT_RUNLEN='P1Y3M0D'.
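
The two edits to rose-suite.conf can also be scripted. The following is a minimal sketch, not part of OptClim: the helper name set_conf_value is hypothetical, the sample text is a cut-down illustration rather than a real rose-suite.conf, and the replacement values ('myuser', 'P0Y3M0D') are assumptions chosen only to show the pattern.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with the first 'key=...' line replaced by 'key=value'.

    Raises KeyError if the key is absent, so a misspelt key is not
    silently ignored.
    """
    pattern = re.compile(rf"^{re.escape(key)}=.*$", flags=re.MULTILINE)
    # A function is used as the replacement so that backslashes in the
    # value are never interpreted as regex escapes.
    new_text, count = pattern.subn(lambda m: f"{key}={value}", conf_text, count=1)
    if count == 0:
        raise KeyError(f"{key} not found in configuration text")
    return new_text


# Illustrative fragment only; a real rose-suite.conf has many more settings.
sample = "ARCHER2_USERNAME='mjmn02'\nEXPT_RUNLEN='P1Y3M0D'\n"
edited = set_conf_value(sample, "ARCHER2_USERNAME", "'myuser'")
# Assumed value: a 3-month run written in the same duration style.
edited = set_conf_value(edited, "EXPT_RUNLEN", "'P0Y3M0D'")
print(edited)
```

To apply this to a real file, read rose-suite.conf into a string, call the helper, and write the result back; keep a backup copy until the suite has been checked.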

Note the following settings in rose-suite.conf, but do not amend them for an initial run:

...
TASK_BUILD_UM=false
...
TASK_OPTCLIM=true
OPT_SOURCE_SHARE='/work/n02/n02/mjmn02/cylc-run/st-db898/share/'
OPTCLIM_RUNDIR='xxxx'
OPTCLIM_PARAM_EDIT="/work/n02/shared/mjmn02/sw/conda/opt_4/bin/modeloptimisation2-create"

On ARCHER2

Create a directory in which the study will be started, cd into it, and copy the example JSON file:
cp /work/n02/shared/mjmn02/OptClim/optclim3/studies/UKESM_P2/898/UKESM_2.json  newfilejson

Edit the study name (the directory for the study) on line 6, "Name": "on". It can be longer than that and need not be the same as the prefix below.
Edit the prefix for the interface directory (line 8, "on", so that model interface directories take the form on001).
Check the maxfun you want: this is the maximum number of model instances, an emergency stop so that a study does not run forever.
Edit the parameter start values.
If you renamed the base job, also edit the referenceModelDirectory line in the study section:
 "study": {
    "comment": "Parameters that specify the study. Used by framework and not by optimisation routines",
    "referenceModelDirectory": "obase-db898",

replacing obase-db898 with the name of the base suite you defined on Puma.
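
The JSON edits above can be made by hand or with a short script. The sketch below is an illustration under stated assumptions: set_key is a hypothetical helper, not an OptClim function, and the sample text reproduces only the "Name" and "study" entries quoted on this page, so its nesting may not match the real UKESM_2.json. The helper searches by key name precisely so that it does not depend on where a key sits in the file.

```python
import json


def set_key(node, key, value):
    """Set every occurrence of `key` in a nested JSON structure.

    Returns the number of entries changed, so the caller can confirm
    that exactly the expected number of keys was found.
    """
    count = 0
    if isinstance(node, dict):
        for k in list(node):
            if k == key:
                node[k] = value
                count += 1
            else:
                count += set_key(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            count += set_key(item, key, value)
    return count


# Cut-down illustration; the real study file contains much more.
sample_text = """
{
  "Name": "on",
  "study": {
    "comment": "Parameters that specify the study.",
    "referenceModelDirectory": "obase-db898"
  }
}
"""
config = json.loads(sample_text)
# 'obase-mysuite' and 'my_study' are placeholder names.
n_ref = set_key(config, "referenceModelDirectory", "obase-mysuite")
n_name = set_key(config, "Name", "my_study")
print(n_ref, n_name)
print(json.dumps(config, indent=2))
```

Note that writing the file back with json.dumps changes its layout, so references to "line 6" or "line 8" no longer apply afterwards.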

To use a new suite not already set up for OptClim

To create a new base model, first gain familiarity by using one of the above, chosen according to whether yours is an NRUN or an NRUN with CRUNs. See the demo example for the anatomy of the study. The steps are:

  1. Run a routine model without OptClim. (st-db898 for example below)
  2. Copy that suite on PUMA to make a new base suite: the one with optclim additions (obase-db898)
  3. Add the OptClim environment variables to the base suite's rose-suite.conf (see the diff output below).

It is recommended that you compare a standard model suite with an OptClim base suite. st-db898 is a routine model suite, a copy of u-db898. obase-db898 is a base suite for this model, with the OptClim additions.

diff ~mjmn02/st-db898/rose-suite.conf ~mjmn02/obase-db898/

81c81
< TASK_BUILD_UM=true
---
> TASK_BUILD_UM=false
94a95,98
> TASK_OPTCLIM=true
> OPT_SOURCE_SHARE='/work/n02/n02/mjmn02/cylc-run/st-db898/share/'
> OPTCLIM_RUNDIR='xxxx'
> OPTCLIM_PARAM_EDIT="/work/n02/shared/mjmn02/sw/conda/opt_4/bin/modeloptimisation2-create"
Note that TASK_BUILD_UM is now false because we are reusing executables.
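
For a longer rose-suite.conf it can help to list changed and added settings by key rather than by line number. This is a minimal sketch: parse_conf and conf_changes are hypothetical helpers, and the parser handles only plain KEY=value lines, ignoring section headers and comments. The sample text uses a subset of the settings from the diff above.

```python
def parse_conf(text):
    """Parse simple KEY=value lines, skipping blanks, comments and [section] headers."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "[")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def conf_changes(standard_text, optclim_text):
    """Return (changed, added) settings going from the standard to the OptClim suite."""
    old, new = parse_conf(standard_text), parse_conf(optclim_text)
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}
    added = {k: new[k] for k in new.keys() - old.keys()}
    return changed, added


# Illustrative fragments built from the diff output shown above.
standard = "TASK_BUILD_UM=true\n"
optclim = "TASK_BUILD_UM=false\nTASK_OPTCLIM=true\nOPTCLIM_RUNDIR='xxxx'\n"
changed, added = conf_changes(standard, optclim)
print("changed:", changed)
print("added:", sorted(added))
```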


Also compare the suite.rc files.

  4. In suite.rc, add these variables. For example, under

[runtime]
    [[root]]
        script = {{TASK_RUN_COMMAND}}
        env-script = "eval $(rose task-env)"
        [[[environment]]]
....
add the following:
            DATASHR=$CYLC_TASK_WORK_PATH/../../../share
            SRCSHR={{ OPT_SOURCE_SHARE }}
            OPTCLIM_RUNDIR={{ OPTCLIM_RUNDIR }}
            OPTCLIM_PARAM_EDIT={{ OPTCLIM_PARAM_EDIT }}

DATASHR is the destination for the reused executables.

  5. In suite.rc, following obase-db898, add the OptClim tasks to the INIT_GRAPH under dependencies:
        {% set INIT_GRAPH = ' ' %}

        {% if TASK_OPTCLIM %}
        {% set INIT_GRAPH = 'optclim_reuse => optclim_prerun => ' %}
        {% endif %}
  6. In suite.rc, follow the relevant base example, according to whether you have an NRUN or an NRUN+CRUN suite.
In the CRUN case, add the R1/$ graph. After
        [[[ {{EXPT_RESUB}} ]]]
            graph = {{ RESUB_GRAPH }}

add this:
        [[[ R1/$ ]]]
            graph = "housekeeping => optclim_postrun"

In the NRUN case, add optclim_postrun to the INIT_GRAPH, as in the demonstration suite.

  7. Unless the suite.rc you copied already contains explicit code for the optclim_* tasks, copy the task definitions into your suite directory:
cp ~mjmn02/roses/obase-db898/optclim_tasks.rc .

Then, in the tasks list in suite.rc, add the include line after housekeeping, as shown:

    [[housekeeping]]
        inherit = RUN_MAIN, HOUSEKEEP_RESOURCE
%include optclim_tasks.rc

You can now check the base suite by:

cd ~/roses/your-suite-name
rose suite-run -l
cylc graph your-suite-name

Note that this will only show the initial cycle. Do not expect to see a final cycle with optclim_postrun in a CRUN case.
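
Because the graph does not show the final cycle, a plain text scan of suite.rc can serve as an extra check that the OptClim additions are present. The sketch below is only a textual check under stated assumptions: missing_markers is a hypothetical helper, the marker list is taken from the additions described on this page, and finding a marker does not prove that it sits in the right section.

```python
# Strings that the steps on this page add to suite.rc.
REQUIRED_MARKERS = (
    "SRCSHR",
    "OPTCLIM_RUNDIR",
    "OPTCLIM_PARAM_EDIT",
    "optclim_reuse => optclim_prerun",
    "optclim_postrun",
)


def missing_markers(suite_rc_text, markers=REQUIRED_MARKERS):
    """Return the markers that do not appear anywhere in the suite.rc text."""
    return [m for m in markers if m not in suite_rc_text]


# Illustrative fragment that lacks the optclim_postrun graph entry.
fragment = """
            SRCSHR={{ OPT_SOURCE_SHARE }}
            OPTCLIM_RUNDIR={{ OPTCLIM_RUNDIR }}
            OPTCLIM_PARAM_EDIT={{ OPTCLIM_PARAM_EDIT }}
        {% set INIT_GRAPH = 'optclim_reuse => optclim_prerun => ' %}
"""
print(missing_markers(fragment))  # ['optclim_postrun']
```

For a real suite, pass the contents of suite.rc instead of the fragment, for example pathlib.Path("suite.rc").read_text().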

Once it is set up and the JSON file exists (again, base yours on an existing one where possible; this is not fully documented here), run the study with runAlgorithm.py as described elsewhere.

Again, refer to the explanations on the demonstration wiki page.

When the model has run to completion, as with a study based on obase-db898, you will see log directories in the cylc-run directory such as:

cd /home/n02/n02/mjmn02/cylc-run/on011_obase-db898/log/job
ls -ltr
total 16K
drwxr-sr-x 4 mjmn02 n02 4.0K Apr 17 13:27 20110901T0000Z
drwxr-sr-x 4 mjmn02 n02 4.0K Apr 17 14:07 20111001T0000Z
drwxr-sr-x 4 mjmn02 n02 4.0K Apr 17 14:52 20111101T0000Z
drwxr-sr-x 3 mjmn02 n02 4.0K Apr 17 14:58 20111130T2359Z

The first three are the final cycles of the model. The fourth and last is the final cycle, which runs the optclim_postrun task.
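
The check for the final cycle can be scripted. This is a minimal sketch that assumes the usual Cylc job log layout of log/job/<cycle>/<task>/, so that the final cycle directory contains an optclim_postrun subdirectory; postrun_cycle is a hypothetical helper name. Cycle names are sorted as strings, which gives chronological order for fixed-width names like those listed above without having to parse the dates.

```python
from pathlib import Path


def postrun_cycle(log_job_dir):
    """Return the name of the cycle directory holding an optclim_postrun log.

    Returns None if no cycle has run optclim_postrun yet, which suggests
    that the model instance has not finished.
    """
    for cycle_dir in sorted(Path(log_job_dir).iterdir()):
        if (cycle_dir / "optclim_postrun").is_dir():
            return cycle_dir.name
    return None


# Example call, using the directory from the listing above:
# print(postrun_cycle("/home/n02/n02/mjmn02/cylc-run/on011_obase-db898/log/job"))
```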

Adding parameters for use by OptClim

At present this entails setting up your own Python environment so that you can make amendments for the parameters you are using, by:

  1. adding to the study's JSON file, for each parameter, its range, its default value (the value typically used), and its initial value for OptClim
  2. amending $OPTCLIMTOP/OptClimVn2/UKESM.py, the OptClim2 code that manages parameter definitions, and also
  3. amending ModelOptimisation2/UKESM.py (a different file from the one just above; this is the new way to manage parameters used for UKESM).
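
Before starting a study it is worth checking that each parameter's default and initial values lie inside its range. The sketch below uses a hypothetical layout, a mapping from parameter name to min, max, default and initial; this is not the OptClim JSON schema, and param_a and param_b are made-up names, so adapt the field access to the structure of your own study file.

```python
def check_parameters(params):
    """Return a list of problems found in a {name: {min, max, default, initial}} mapping."""
    problems = []
    for name, spec in params.items():
        lo, hi = spec["min"], spec["max"]
        if lo >= hi:
            problems.append(f"{name}: min {lo} is not below max {hi}")
        for field in ("default", "initial"):
            value = spec[field]
            if not lo <= value <= hi:
                problems.append(f"{name}: {field} {value} outside [{lo}, {hi}]")
    return problems


# Hypothetical parameters; param_b has an initial value above its range.
params = {
    "param_a": {"min": 0.0, "max": 1.0, "default": 0.5, "initial": 0.7},
    "param_b": {"min": 1.0, "max": 5.0, "default": 2.0, "initial": 6.0},
}
print(check_parameters(params))  # ['param_b: initial 6.0 outside [1.0, 5.0]']
```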