CLM - Ivanderkelen/wiki_tryout GitHub Wiki

The Community Land Model (CLM) is the land module of the Community Earth System Model (CESM) and is maintained at NCAR. CLM is embedded within the Community Terrestrial Systems Model (CTSM) and maintained in the same GitHub repository.

Contents

  1. A mini tutorial to run CLM5.0 on Piz Daint
  2. Installing and running CTSM with the NUOPC driver
  3. The ctsm_python_gallery from NCAR on Daint
  4. Template notebooks for postprocessing
  5. Running regional CTSM simulations
  6. Noteworthy CTSM updates

A mini tutorial to run CLM5.0 on Piz Daint

❗ Note that this tutorial covers the OLD version of CLM.

In this tutorial, you run a sandbox model of the CLM5.0 release version. This is a good starting point for learning the model. If you want to run newer (CTSM) versions, you need an ESMF installation, as it is required by the new NUOPC driver. The steps to obtain these are described in the section Installing and running CTSM with the NUOPC driver below.

One-time setup

1. Getting the machine settings

Copy the .cime folder into your $HOME. This provides the configuration files for our machine (pizdaint), compilers (gnu, gnu-oasis) and batch system (slurm) and is originally cloned from https://github.com/pesieber/.cime.

cp -r /users/ivanderk/.cime $HOME

2. Creating a base directory for CCLM2 and cloning CLM

The CCLM2 directory will hold the code base of CLM, next to COSMO and OASIS. In the scripts, this is the $CCLM2ROOT directory.

mkdir $HOME/CCLM2/
cd $HOME/CCLM2

In this directory, clone the CLM source code (the CLM base directory will be $CLMROOT):

git clone -b release-clm5.0 https://github.com/pesieber/CTSM.git clm5.0

Note: here we use the clone of Petra Sieber, as she is testing the coupling with COSMO. The only difference from the NCAR repo is the addition of the jobscript. For more info, see her wiki.

Navigate into the clm5.0 directory and check out the externals (this fetches the other model components at the versions pinned for this release):

cd clm5.0
./manage_externals/checkout_externals

The last setup step is to copy the input data needed by CLM to your $SCRATCH directory. This is necessary because the compute nodes cannot read from or write to the project folder. To avoid overloading the login node, we submit the rsync command to the xfer queue using the following script:

sbatch transfer_clm_inputdata.sh
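The transfer script is essentially a small SLURM jobscript for the CSCS data-transfer (xfer) queue. A minimal sketch, with illustrative paths and walltime (the actual values are defined in transfer_clm_inputdata.sh):

```shell
#!/bin/bash -l
#SBATCH --job-name=transfer_clm_inputdata
#SBATCH --partition=xfer        # CSCS data-mover queue; keeps the login node free
#SBATCH --time=12:00:00

# Copy the CLM input data to $SCRATCH (source path is illustrative)
rsync -av /project/s1207/CCLM2_inputdata/ $SCRATCH/CCLM2_inputdata/
```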

Step-by-step guide on running CLM

To understand how the model works and where options are altered, it is advisable to go through a manual case creation, setup, build and run of the model.

1. Creating a case

From the clm5.0 source code directory, go to the cime/scripts directory:

cd cime/scripts/

Invoke the create_newcase script:

./create_newcase --case $SCRATCH/CCLM2_cases/testcase --res hcru_hcru --compset I2000Clm50SpGs --mach pizdaint --proj sm62 --compiler gnu --driver mct --run-unsupported

Here we create a 0.5° global case, using CLM version 5.0 in land-only mode (data atmosphere, indicated by the 'I'), with the climatology of the year 2000, using satellite phenology and a stub glacier component. A list with all available component sets can be found here.

All available, supported resolutions can be found here. It is, however, possible to add new, user-defined resolutions by creating mapping, domain and surface data files.

2. Setting up a case

Go to the cases directory and view the different files in your newly created case

cd $SCRATCH/CCLM2_cases/testcase

The case settings are made in the .xml files.

First, change the directory structure for consistency with Petra's setup (change where the model is built and run). By default, RUNDIR is $SCRATCH/$CASENAME/run.

./xmlchange RUNDIR="$SCRATCH/CCLM2_cases/testcase/run"
./xmlchange EXEROOT="$SCRATCH/CCLM2_cases/testcase/bld"

Then, point the code to the input data

./xmlchange DIN_LOC_ROOT="$SCRATCH/CCLM2_inputdata/cesm_inputdata"
./xmlchange DIN_LOC_ROOT_CLMFORC="$SCRATCH/CCLM2_inputdata/cesm_inputdata/atm/datm7"

and the mapping files for the global case

./xmlchange LND2ROF_FMAPNAME="$SCRATCH/CCLM2_inputdata/CCLM2_EUR_inputdata/mapping/map_360x720_nomask_to_0.5x0.5_nomask_aave_da_c130103.nc"
./xmlchange ROF2LND_FMAPNAME="$SCRATCH/CCLM2_inputdata/CCLM2_EUR_inputdata/mapping/map_0.5x0.5_nomask_to_360x720_nomask_aave_da_c120830.nc"

Next, we need to alter the PES setup (these changes are done in the env_mach_pes.xml file):

./xmlchange COST_PES=288
./xmlchange NTASKS_CPL=-24
./xmlchange NTASKS_ATM=-24
./xmlchange NTASKS_OCN=-24
./xmlchange NTASKS_WAV=-24
./xmlchange NTASKS_GLC=-24
./xmlchange NTASKS_ICE=-24
./xmlchange NTASKS_ROF=-24
./xmlchange NTASKS_LND=-24

Then, invoke case.setup to create the namelist files

./case.setup

3. Building the case

Before building the case, we will make some more basic model modifications.

First, turn off the short-term archiver. This way the output is not moved to the archive directory, but remains in the run directory for closer inspection.

./xmlchange DOUT_S=FALSE

Then, set the stop option and the number of stop intervals; for now, we will run 1 year. Also set the walltime.

./xmlchange STOP_OPTION=nyears
./xmlchange STOP_N=1
./xmlchange JOB_WALLCLOCK_TIME=01:00:00

Next, open the user_nl_clm namelist file and add the following line pointing to the surface dataset:

fsurdat = "$SCRATCH/CCLM2_inputdata/CCLM2_EUR_inputdata/surfdata/surfdata_360x720cru_16pfts_simyr2000_c170428.nc"

Other options, such as output frequency, averaging interval and output fields, are defined in the namelists. The full list of parameters and their available options can be found here.
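For example, a few commonly used history options in user_nl_clm (the variable names are standard CLM history namelist options; the values here are only illustrative):

```
! Illustrative history settings for user_nl_clm
hist_nhtfrq = 0          ! 0 = monthly averages (negative values mean hours)
hist_mfilt  = 12         ! number of time samples per history file
hist_fincl1 = 'TSA'      ! add 2-m air temperature to the first history stream
```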

Finally, build the case

./case.build

4. Running the case

To check what the run will look like, issue

./preview_run

Finally, submit the case and check whether it is running in the queue

./case.submit

5. Inspecting the output

To inspect output with ncview, you need to have the ncview module loaded. Note that for compiling and running the model, you cannot have these modules loaded, so opening a second window in MobaXTerm to connect to daint might be advisable.

module load daint-gpu
module load ncview

Move to the run directory, and inspect output from the land model.

cd run/
ncview testcase.clm2.h0.0001-01.nc

The files produced in our test run are monthly output files for one year, while CLM itself runs at a 30-minute time step.

Now, the analysis fun can start!

Running CLM using a script

All these steps and settings can also be done in one compile-and-run script.

See the script at $HOME/CCLM2/clm5.0/compile_clm.sh for more details. For the online version, click here: compile_clm.sh. It includes the different options for running on a European and a South American domain.

Tips

A good resource for postprocessing scripts is https://github.com/NCAR/ctsm_python_gallery/tree/master. How to set this up on Piz Daint is described here.

Installing and running CTSM with the NUOPC driver

In the newer versions of CTSM (from ctsmXXX onwards), the MCT driver is replaced by the NUOPC driver. This requires an ESMF installation.

Below are the steps to set up this version.

1. Installing ESMF

First, use Spack to install ESMF. CTSM requires ESMF version 8.4.1 or later. If not yet done, please follow the steps to install Spack listed here.

module load daint-gpu spack-config
. spack/share/spack/setup-env.sh

Install all necessary packages for the gcc-9.3.0 compiler. For PIO (parallelio), you need to build with a newer netcdf-c than the one provided on Piz Daint.

spack install esmf@8.4.1%gcc@9.3.0 ^ netcdf-c@<version>
spack install parallelio@<version>%gcc@9.3.0 ~pnetcdf

Replace <version> with the versions you want to use; the netcdf-c version must be newer than the system one.

Then, find the path where ESMF is installed by issuing

spack location -i esmf

The resulting path should look like /project/s1207/ivanderk/spack-install/cray-cnl7-haswell/gcc-9.3.0/esmf-8.4.1-4n356am7enhj6lwdzjbkzjzpbbu674ug/. Add the ESMF path to your .bashrc as follows: export ESMF_PATH=<insert ESMF path>, replacing <insert ESMF path> with the output of the previous command.
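For instance, a small sketch; the path shown is only the example from above, so substitute your own spack location output:

```shell
# Append the ESMF path to ~/.bashrc so it is available in future sessions.
# The path below is only an example; use the output of `spack location -i esmf`.
ESMF_PATH="/project/s1207/ivanderk/spack-install/cray-cnl7-haswell/gcc-9.3.0/esmf-8.4.1-4n356am7enhj6lwdzjbkzjzpbbu674ug"
echo "export ESMF_PATH=${ESMF_PATH}" >> "$HOME/.bashrc"
```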

2. Cloning CTSM and one time setup

Here we clone from the official CTSM repo.

git clone git@github.com:ESCOMP/CTSM.git CTSM (or git clone https://github.com/ESCOMP/CTSM.git CTSM)
cd CTSM
./manage_externals/checkout_externals

We check out the master branch to obtain the latest version (and not the clm5.0 release, which still runs with the MCT driver).

Then, you need to clone the helper scripts from this repository and copy both the compile and input data transfer scripts into the CTSM folder.

cd $HOME
git clone https://github.com/Ivanderkelen/wiki_tryout.git
cp wiki_tryout/running_clm_helpscripts/compile_ctsm.sh CTSM/compile_ctsm.sh
cp wiki_tryout/running_clm_helpscripts/transfer_cesm_inputdata.sh CTSM/transfer_cesm_inputdata.sh

Next, clone the .cime repository. This contains the machine settings for CTSM (presumably a fork of the settings by Petra Sieber).

cd $HOME
git clone https://github.com/Ivanderkelen/.cime

3. Running the compile script

Then, go back to the CTSM directory and run the compile script. This script includes all of the steps explained above: creates a case, sets it up, compiles it and submits it to the compute nodes.

cd CTSM
./compile_ctsm.sh

Happy CLM'ing! 😄

The ctsm_python_gallery from NCAR on Daint

NCAR maintains a repository that collects sample workflows and scripts to analyse CTSM model output. It contains a lot of handy functions that make your life with CTSM output a lot easier.

To install and use this repository, you need a working python environment set up as described in Using Python environment for JupyterHub on Daint.

First, make sure that your python environment is activated. If you installed the python environment as described before, this can be done through:

source ~/env_ctsm_py/env_ctsm_py/bin/activate

Then, clone the repository containing the package:

git clone https://github.com/NCAR/ctsm_py.git

Then, install the utilities:

cd ctsm_py
pip install -e .

Now, the package can be used in your Jupyter notebooks.

To do so, load the package by including from ctsm_py.utils import * at the top of your notebook.

Template notebooks for postprocessing

See here for a Jupyter notebook template for analysing CLM output.

Running regional CTSM simulations

Here, we describe how to prepare all the input (mesh files, surface data files and domain files) necessary for regional CTSM simulations, as described in Regional CTSM simulations.

❗ At the time of writing, the tools used are still part of an open pull request (https://github.com/ESCOMP/CTSM/pull/1892). If you see this has changed, please update this wiki.

Setting up the Python environment needed by the CTSM tools

This is a different environment than the one we installed for analyzing CTSM data.

First, we start by creating a spack environment.

Load daint-mc (or daint-gpu) and the corresponding spack-config module:

module load daint-mc spack-config
. spack/share/spack/setup-env.sh

Create a folder to work in

mkdir ctsm_pylib
cd ctsm_pylib

Create a file called spack.yaml with the following content:

# spack.yaml
spack:
  specs: [python, geos, miniconda3]
  view: true
  concretizer:
    unify: true
  modules:
    prefix_inspections:
      lib64: [LD_LIBRARY_PATH]
  packages:
    python:
      require: '@3.8:'

Install and activate the spack environment using the spack.yaml file (spack recognizes this file automatically). Note that the installation itself can take a couple of minutes.

spack -e . concretize -f
spack -e . install
spack env activate .

For conda to take effect, issue

conda init bash

Next, log out and in again on daint.

Then, clone the CTSM fork from negin513, check out the correct branch and check out the externals (at the time of writing the tool is still part of a pull request, but this should be replaced with the official source code once merged):

git clone https://github.com/negin513/ctsm.git ctsm_mesh
cd ctsm_mesh

git checkout subset_mesh_dask
./manage_externals/checkout_externals

Then, install all necessary Python packages using the script provided by NCAR:

./py_env_create
conda activate ctsm_pylib

Run ./subset_data for the region of interest to create the surface dataset, mesh files and all the user mods necessary to run the regional case. In this example, the region of Kenya is used. The output will be saved on $SCRATCH; set $CESMDATAROOT to point to /project/s1207/CCLM2_inputdata (as defined in CTSM/compile_ctsm.sh).

cd tools/site_and_regional
./subset_data region --create-surface --create-mesh --create-user-mods --create-domain --lat1 20 --lat2 55 --lon1 230 --lon2 300 --overwrite --verbose --outdir $SCRATCH/mesh_kenya --inputdata-dir $CESMDATAROOT/cesm_inputdata

If you want to see all options and how to use the script, issue

./subset_data -h

In the output directory (here $SCRATCH/mesh_kenya), you can find the generated input files: surface data and domain, as well as a directory called 'user_mods'. The scripts in user_mods can help set up the new case in CTSM/compile_ctsm.sh (see section )

Further information can be found here: https://gist.github.com/negin513/d71268419f808b91a82cde5530d390b1#file-regional_ctsm-md

To deactivate the conda environment:

conda deactivate

Noteworthy CTSM updates