Reading and Writing Checkpoints in GEOS - GEOS-ESM/MAPL GitHub Wiki

Overview

MAPL has TWO distinct code paths to WRITE checkpoints. The first is the "classic" path, which uses MPI and in theory allows parallelization using parallel NetCDF or allows one to break the checkpoint up into multiple files. This was the original checkpoint code. The second and more recent option uses the output server (the same one used by History) to write to a single file. Both have limitations and situations in which they should not be used.

MAPL has ONE and only one mode to READ checkpoint files, and it uses parallel NetCDF.

Writing Checkpoints

The first decision when writing checkpoints is which of the two code paths to use. This is controlled by the WRITE_RESTART_BY_OSERVER keyword in AGCM.rc:

# options are YES or NO (default NO)
WRITE_RESTART_BY_OSERVER: YES 

There are specific times when you would or would not want to use the server, which I will go over.

OServer

To use the OServer to write the checkpoints set WRITE_RESTART_BY_OSERVER: YES in your AGCM.rc. That is the only thing to do. Any other parameters that are discussed for the "classic" path have NO effect here.

Why would you use the OServer? The main reason is to allow asynchronous writing of checkpoints while the model is running. Note that if you are only writing checkpoints at the end of the run, there is no compute left to overlap with, so you get no asynchronous benefit. Also note that to get asynchronous writing you must have configured GEOS to run with extra oserver nodes, but if you are running c180 or higher you probably are anyway. If you don't have extra nodes, you can still write with the oserver, but it will be the simple server and will not be asynchronous.

When can't you use the OServer? The oserver gathers the entire contents of a file to a single process. Therefore, if the file contents are bigger than the memory available on a node, you can't use it. In practical terms this means it can't be used at very high resolutions, say c2880 and above.
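A rough way to judge whether a checkpoint is too large for the OServer is to estimate its size and compare it with the memory on one node. The sketch below assumes a cN cubed-sphere grid has 6 faces of N x N cells and that the file is dominated by 3D fields stored as 4-byte reals. The function names, the level count, the number of variables, and the node memory are all illustrative assumptions, not values taken from MAPL or GEOS.

```python
def cubed_sphere_cells(n):
    """Horizontal cell count of a cN cubed-sphere grid: 6 faces of n x n cells."""
    return 6 * n * n


def checkpoint_bytes(n, levels, num_3d_vars, bytes_per_value=4):
    """Approximate checkpoint size, counting only 3D fields (an assumption)."""
    return cubed_sphere_cells(n) * levels * num_3d_vars * bytes_per_value


def fits_on_node(n, levels, num_3d_vars, node_memory_gib, bytes_per_value=4):
    """True if the whole file could be held in the memory of a single node.

    This ignores everything else that competes for that memory, so treat a
    True result as a necessary condition rather than a guarantee.
    """
    size = checkpoint_bytes(n, levels, num_3d_vars, bytes_per_value)
    return size <= node_memory_gib * 2**30


if __name__ == "__main__":
    # Hypothetical inputs: 72 levels, 20 3D variables, 128 GiB per node.
    for n in (180, 720, 2880):
        size_gib = checkpoint_bytes(n, 72, 20) / 2**30
        print(f"c{n}: ~{size_gib:.1f} GiB, fits: {fits_on_node(n, 72, 20, 128)}")
```

Because the cell count grows with the square of N, going from c180 to c2880 multiplies the file size by 256, which is why a limit that is invisible at moderate resolution becomes a hard stop at very high resolution.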

"Classic" Path

The "classic" path uses MPI to partially gather each variable to a subset of MPI processes. Each of these processes then writes its "piece" either to a single shared file using parallel NetCDF or to its own individual file containing only that piece.
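The partial gather can be pictured as splitting the MPI ranks into groups, with one writer per group collecting that group's data. The sketch below is a plain-Python illustration of one possible grouping (contiguous blocks of ranks) and does not use MPI. The function names and the block layout are assumptions made for illustration; MAPL's actual assignment of ranks to writers may differ.

```python
def writer_for_rank(rank, num_ranks, num_writers):
    """Index of the writer group that a rank sends its data to.

    Ranks are split into num_writers contiguous blocks of near-equal size.
    """
    if not 0 <= rank < num_ranks:
        raise ValueError("rank out of range")
    if not 1 <= num_writers <= num_ranks:
        raise ValueError("num_writers must be between 1 and num_ranks")
    return rank * num_writers // num_ranks


def group_ranks_by_writer(num_ranks, num_writers):
    """List the ranks gathered by each writer, in writer order."""
    groups = [[] for _ in range(num_writers)]
    for rank in range(num_ranks):
        groups[writer_for_rank(rank, num_ranks, num_writers)].append(rank)
    return groups


if __name__ == "__main__":
    # Hypothetical layout: 8 ranks gathered onto 2 writers.
    for writer, ranks in enumerate(group_ranks_by_writer(8, 2)):
        print(f"writer {writer} gathers ranks {ranks}")
```

With a single writer this reduces to a full gather onto one process, which is the memory pattern that limits the OServer; more writers mean each one holds a smaller piece.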