DataCuration_CoupledModelInputDirectoryStructure - ACCESS-NRI/CMIP7-Input GitHub Wiki

Data Curation: Coupled Model Input Directory Structure

From Spencer Wong:

The directory /g/data/vk83/configurations/inputs contains inputs for released configurations of several models:

ls /g/data/vk83/configurations/inputs
access-am3  access-esm1p5  access-esm1p6  access-om2  access-om3  JRA-55  LICENSE  README.md

In access-esm1p5, for example, the top level groups together configurations that generally share many inputs:

ls /g/data/vk83/configurations/inputs/access-esm1p5
CHANGELOG  modern  paleo  share

Here, the configurations are grouped by broad time period, within which the same land-sea mask is used.

ls /g/data/vk83/configurations/inputs/access-esm1p5/modern
amip  historical  pre-industrial  share  unused

Within each category, the inputs for different configurations are further grouped together.

Individual input files are then versioned using dated directory names (2020.05.19 in the listing below):

├── aerosol
│   └── global.N96
│       └── 2020.05.19
│           ├── BC_hi_1850_ESM1.anc
│           ├── Bio_1850_ESM1.anc
│           ├── .manifest.yaml
│           ├── OCFF_1850_ESM1.anc
│           └── scycl_1850_ESM1_v4.anc
├── forcing
│   ├── global.N96
│   │   └── 2020.05.19
│   │       ├── .manifest.yaml
│   │       └── ozone_1850_ESM1.anc
│   └── resolution_independent
│       └── 2020.05.19
│           ├── .manifest.yaml
│           └── volcts_18502000ave.dat
└── land
    └── biogeochemistry
        └── global.N96
            └── 2020.05.19
                ├── .manifest.yaml
                └── Ndep_1850_ESM1.anc
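
Because each version is a dated directory, the most recent version of an input can be found by sorting the directory names. The following is a minimal sketch, not part of the inputs infrastructure: the function name latest_version is hypothetical, and it assumes every version directory is named YYYY.MM.DD with zero padding, as in the listing above, so that alphabetical order matches chronological order.

```shell
# Hypothetical helper: print the newest dated version directory under $1.
# Assumes version directories are named YYYY.MM.DD (zero-padded).
# The body runs in a subshell so its variables do not leak to the caller.
latest_version() (
    latest=""
    for dir in "$1"/*/; do
        # An unmatched glob is left as literal text, so check it is a directory.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        case "$name" in
            # Globs expand in sorted order, so the last match is the newest.
            [0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]) latest=$name ;;
        esac
    done
    # Print nothing and return non-zero if no dated directory was found.
    [ -n "$latest" ] && printf '%s\n' "$latest"
)
```

Run against the aerosol/global.N96 directory in the listing above, this would print 2020.05.19, since that is the only version present.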

Each directory containing input files also has a .manifest.yaml file, containing the md5 hash of each file.
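
To spot-check a directory by hand, the md5 hashes of its input files can be recomputed and compared by eye with the values recorded in the manifest. The sketch below is an illustration only: the function name list_md5s is hypothetical, it assumes the GNU md5sum utility is available, and it deliberately makes no assumption about the layout of the manifest file itself.

```shell
# Hypothetical helper: print "<md5>  <filename>" for each input file in $1.
# Hidden files (such as the manifest itself) are skipped, because the
# shell glob * does not match names beginning with a dot.
list_md5s() (
    cd "$1" || exit 1
    for f in *; do
        # Skip subdirectories and the literal "*" left by an empty directory.
        [ -f "$f" ] || continue
        md5sum "$f"
    done
)
```

Called on one of the dated version directories, this prints one line per input file, which can then be checked against the recorded hashes.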

The second part of the inputs infrastructure is the model-config-inputs repository created by Tommy. This mirrors the directory structure on /g/data/vk83/configurations/inputs, generates the .manifest.yaml files, and backs up all the data to tape.

It also includes a workflow for copying data into the /g/data/vk83/configurations/inputs directory. The intention is for people to add data via the workflow rather than directly modifying the directory.

Finally, there is also a less tightly controlled prerelease inputs directory, /g/data/vk83/prerelease/configurations/inputs, which is being used during development of ESM1.6 and CM3.