access_nwp_suite_input_data - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki
NWP Suites input data on NCI
Available projects at NCI and their data policy
- lb4 and wr45
  - Funded by NCI for data useful for others (CSIRO/universities)
  - We provide the data as part of BoM contribution to NCI
  - Not for general use in running NWP experiments
- ig2
  - Data for Bureau's access only; it is not visible to others
  - 50 TB in total storage space (45 TB used as of 21 Dec 2020)
  - As of late 2020 it is used for archiving data from operational NWP models
  - Newer data are copied and as of late 2020 there is no auto-purge
  - Data holdings:
    - bufr/ - BUFR files from operational ACCESS-G2 and ACCESS-R
    - cmcgem/, jmagsm/, ops_aps3/, ukgc/, usavm/ - copies of MARS
    - ACCESS_prod/
    - dev_aps4/
    - ei/
    - mnth/
    - UKDAILY/, UKDAILY_PS/
Data inventory
NWP input data (currently under nas_data)
Data | From | Size per day | Size per year | C3 | CE3 | NAS | G3 | ADEPT | obsmon |
---|---|---|---|---|---|---|---|---|---|
bgerr | G3 | 5.7MB | 2GB | x | x | ||||
glu_smc.gz | G3 | 159MB | 58GB | x | x | c | |||
glu_varbc.gz | G3 | 1.5MB | 0.55GB | x | x | c | |||
obs (includes RAMMSA SST) | C3 getobs | 23GB | 8400GB | x | x | ||||
obs15 | NAS getobs | 17GB | 6200GB | x | |||||
gl t+3 dump | G3 | 160GB | c | c | c | x | |||
odb2s | C3 | 3.5GB | 1300GB | x | |||||
Frames (C3, NAS, ADEPT) | G3 | 11GB | 4000GB | x | x | x | |||
Total | |||||||||
*x means used daily or each cycle. c means used for cold starting |
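
The per-year figures in the table appear to be the per-day sizes multiplied by 365. A minimal sketch of that conversion follows; the decimal units (1 GB = 1000 MB), the helper names `to_gb` and `per_year_gb`, and the choice of rows are assumptions made for illustration only.

```python
# Per-day sizes copied from the table above.
# Assumption: decimal units, i.e. 1 GB = 1000 MB.
UNIT_IN_GB = {"MB": 1e-3, "GB": 1.0}

per_day = {
    "bgerr": "5.7MB",
    "glu_smc.gz": "159MB",
    "glu_varbc.gz": "1.5MB",
    "obs": "23GB",
    "obs15": "17GB",
}


def to_gb(size: str) -> float:
    """Convert a string such as '159MB' to gigabytes."""
    value, unit = size[:-2], size[-2:]
    return float(value) * UNIT_IN_GB[unit]


def per_year_gb(size_per_day: str, days: int = 365) -> float:
    """Annualise a per-day size, returning gigabytes."""
    return to_gb(size_per_day) * days


for name, size in per_day.items():
    print(f"{name:14s} {per_year_gb(size):9.2f} GB/year")
```

The results agree with the rounded values in the table, for example 23GB per day for obs comes to about 8400GB per year.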
Alternate plan for G3 dumps:
- 6 months of 4/day = 29TB
- 30 months of 1/day = 36TB
- Total = 65TB
3 years is then 100TB
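
The two retention tiers above can be checked with a short calculation. This is only a sketch: the 40GB size of a single dump is inferred from 160GB per day at 4 dumps per day, and the average month length of 365/12 days is an assumption.

```python
DUMP_GB = 40               # assumed size of one G3 dump (160 GB/day at 4 dumps/day)
DAYS_PER_MONTH = 365 / 12  # assumption: average month length


def tier_tb(months: float, dumps_per_day: int) -> float:
    """Storage in TB for keeping `dumps_per_day` dumps over `months` months."""
    return months * DAYS_PER_MONTH * dumps_per_day * DUMP_GB / 1000


recent = tier_tb(6, 4)   # most recent 6 months, all 4 dumps per day
older = tier_tb(30, 1)   # preceding 30 months, 1 dump per day

print(f"6 months of 4/day : {recent:.1f} TB")
print(f"30 months of 1/day: {older:.1f} TB")
print(f"Total             : {recent + older:.1f} TB")
```

Under these assumptions the two tiers come to roughly 29TB and 36TB, in line with the figures listed above.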
Alternate (non-GetObs) obs archive
- /g/data/dp9/da/access-c/obs/radar ~130GB per month
- /g/data/dp9/da/access-c/obs/rainfall ~ 250MB per month
- /g/data/dp9/da/access-c/surf/ascat ~ 250MB per month
The radar data archived here is now static, since the C3 getobs (above) now includes radar data. It should be excluded from ongoing estimates and left where it is. The ascat data might need to be added to the estimates if we start to do soil moisture analysis for high-res systems (likely). Allow about 9GB for 3 years (36 months at ~250MB per month).
Static data
- Ancillaries (/g/data/access/ANCIL/APS3 or APS4) 12TB
- Local control files
  - (/g/data/access/OPS/[control|Data]) 2.5GB
  - (/g/data/access/VAR/[data_64/CovStats|ext/VAR]) 6GB
These should be excluded from the estimates because they are small and/or static: they are not part of a "3 year archive", but they must be kept permanently.
- /g/data/access/VAR/ (2.3T)
- /g/data/access/OPS/ (330G)
- /g/data/access/ANCIL (12T)
- /projects/access/umdir/ancil/data/ (48M)
- /projects/access/umdir/vn10.8/ctldata (289M)
- /g/data/dp9/reana/anc_data/ukmo/ (permission denied)
- /scratch/dp9/ttl548/nas_data/2020/02/ (9.4T)
- /g/data/dp9/da/access-c/obs/radar/202002/ (131G)
- /g/data/dp9/da/access-c/obs/rainfall/202002/ (300M)
Total: ~24.54T
City Suite (u-bn286)
Including everything in the table above, the inputs come to 15.7TB per year, excluding cold start dumps.
ADEPT
- For 3 ADEPT domains MCM / SCS / ADS
- External Inputs (per domain per cycle)
  - G3 start dump: 40GB used by all 3 domains
  - G3 Frames: ~6GB total for 3 domains
  - Ancils: ~1TB total for 3 domains
Global Trials
The following is an estimate of the input data needed for global suites (assuming 3 years' worth of data is stored on disk):
- BUFR observational data: ~2TB
- OSTIA SST and sea ice files: ~1TB
- Error modes for global hybrid non-coupled trialling: 31GB/cycle, 124GB/day; total of ~136TB
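
The error-mode total can be reproduced from the per-cycle size. The sketch below assumes 4 cycles per day (inferred from 31G per cycle against 124G per day) and 365-day years; neither is stated explicitly in the source.

```python
GB_PER_CYCLE = 31     # error modes per cycle, from the estimate above
CYCLES_PER_DAY = 4    # assumption inferred from 31G/cycle vs 124G/day
DAYS_PER_YEAR = 365   # assumption: leap days ignored
YEARS = 3

gb_per_day = GB_PER_CYCLE * CYCLES_PER_DAY
error_modes_tb = gb_per_day * DAYS_PER_YEAR * YEARS / 1000

print(f"{gb_per_day} GB/day, {error_modes_tb:.1f} TB over {YEARS} years")
```

This gives a little under 136TB, which dwarfs the other global inputs and is why the error modes are treated separately.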
UKMO start dumps
3 months' worth of UKMO IC files takes up approximately 9 TB (Wenming).
Others
NAS, CE, BARRA and BARPA are outside the scope of the current data consolidation work, but their data requirements are listed here.
Consolidated and final estimate of all input data required for NWP trials
The ig2 allocation, currently 50 TB, is insufficient for the long trial periods needed by the various NWP models. Because of this constraint we decided to store only a limited amount of input data under ig2. Here is a summary of how we will manage ig2:
- For each model we will store enough input data to allow the completion of 2 standard trial periods: one summer and one winter with each lasting 2 1/2 to 3 months
- Any other data outside the standard trial periods will need to be stored elsewhere - e.g. MDSS
- At any given development cycle (e.g. APS4) the standard trial periods will be fixed but can be changed by consensus. When the standard trial periods change the older input data will migrate to MDSS and the required newer data will be stored in ig2
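
To see how the trial periods quoted in this section compare with the "2 1/2 to 3 months" guideline, their lengths can be computed from the cycle times. This is a sketch only: the format string assumes the `YYYYMMDDThh` cycle notation used on this page, and the summer/winter labels are assigned here for illustration.

```python
from datetime import datetime

CYCLE_FORMAT = "%Y%m%dT%H"  # e.g. 20171201T06


def period_days(start: str, end: str) -> float:
    """Length of a trial period in days, from first to last cycle time."""
    delta = datetime.strptime(end, CYCLE_FORMAT) - datetime.strptime(start, CYCLE_FORMAT)
    return delta.total_seconds() / 86400


# Cycle times as quoted on this page; the labels are illustrative.
periods = {
    "Global summer": ("20171201T06", "20180228T00"),
    "Global winter": ("20170620T06", "20170930T12"),
    "ACCESS-C summer": ("20200201T06", "20200415T21"),
}

for name, (start, end) in periods.items():
    print(f"{name:16s} {period_days(start, end):7.2f} days")
```

The ACCESS-C period comes out just under 75 days, matching the 75 day periods used for the frames estimate.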
Here is the final estimate of the data requirements for the various trials:
Model | Trial period | File type | Input data requirement (in TB) | Comment |
---|---|---|---|---|
Global | 20171201T06 - 20180228T00 and 20170620T06 - 20170930T12 | | 0.5 | This estimate does not include error modes, which are needed for uncoupled hybrid 4DVar |
Global | 1 year | glu_smc.gz | 0.058 | From Susan's table (above) |
Global | 1 year | glu_varbc.gz | 0.00055 | From Susan's table (above) |
| | 1 year | _sst.um | 0.0051 | Once daily at T0600Z; files are from ACCESS_prod/access_r_update/$y/$m/input_$ymd$hh/$ymd${hh}_sst.um on SAM |
ACCESS-C | 1 year | obs | 8.4 | From Susan's table (above). Includes RAMMSA SST. From C3 getobs |
ACCESS-C | 20200201T06 - 20200415T21 and 75 day winter period | frames | 0.9 | This estimate is for the frames files for two (2) city regions, for two (2) sets of 75 day periods (one summer, one winter) |
ADEPT | 28 day winter period in August | frames | 0.544 | Three regions, SCS, MCM, ADS, 4 times/day. This period should be within the Global model winter trial period so that the Global files can be used here too |
ADEPT | 28 day winter period in August | t+3 | 1.12 | |
Total | | | 11.53 | |
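
As a cross-check, the per-row requirements in the table can be summed and compared with the 50 TB ig2 allocation. The row labels below are shorthand introduced for this sketch, and the row without a model name is marked as unspecified.

```python
# (model, file type, requirement in TB), copied from the table above.
rows_tb = [
    ("Global", "trial-period inputs", 0.5),
    ("Global", "glu_smc.gz", 0.058),
    ("Global", "glu_varbc.gz", 0.00055),
    ("(unspecified)", "_sst.um", 0.0051),
    ("ACCESS-C", "obs", 8.4),
    ("ACCESS-C", "frames", 0.9),
    ("ADEPT", "frames", 0.544),
    ("ADEPT", "t+3", 1.12),
]

IG2_ALLOCATION_TB = 50  # total ig2 storage quoted earlier on this page

total_tb = sum(size for _, _, size in rows_tb)
share = total_tb / IG2_ALLOCATION_TB

print(f"Total: {total_tb:.3f} TB ({share:.0%} of the ig2 allocation)")
```

The rows sum to roughly 11.5 TB, under a quarter of the allocation, with ACCESS-C obs being by far the largest single item.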
Directory structure
The directory tree structure used for ig2 is identical to that on sam, which makes data files on ig2 easy to locate.
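
Because the layout mirrors sam, a file location can be built from the same relative path on either system. The sketch below does this for the `_sst.um` path pattern quoted in the estimate table. The root `/g/data/ig2`, the function name `sst_path`, and the reading of `$y`, `$m`, `$ymd` and `$hh` as 4-digit year, 2-digit month, `YYYYMMDD` and 2-digit hour are all assumptions, not confirmed by this page.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Hypothetical root for the ig2 project; adjust to the real mount point.
IG2_ROOT = PurePosixPath("/g/data/ig2")


def sst_path(cycle: datetime, root: PurePosixPath = IG2_ROOT) -> PurePosixPath:
    """Build the _sst.um path for a cycle, following
    ACCESS_prod/access_r_update/$y/$m/input_$ymd$hh/$ymd${hh}_sst.um
    (variable meanings are assumed, see the text above)."""
    ymd = cycle.strftime("%Y%m%d")
    hh = cycle.strftime("%H")
    return (
        root
        / "ACCESS_prod"
        / "access_r_update"
        / cycle.strftime("%Y")
        / cycle.strftime("%m")
        / f"input_{ymd}{hh}"
        / f"{ymd}{hh}_sst.um"
    )


print(sst_path(datetime(2020, 2, 1, 6)))
```

Passing a different `root` gives the corresponding location on another system with the same tree.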
Scripts used in transferring NWP input data from Bureau to NCI
Tan's scripts for transferring NWP input data to NCI are under logan:/home/ttl/cron (some are copied to gadi:/scratch/dp9/ttl548/DL), nas_obs.sh and nas_frame.sh
Some of Milton's scripts are under logan:/home/mwoods/australis-mirror (some are copied to gadi:/scratch/dp9/ttl548/DL),
- logan:/home/mwoods/australis-mirror/common/lftp.sh - for start-up options to set up lftp; mainly used in transferring data from NCI to Bureau
- mirror_nas.sh - to transfer NAS data to Bureau machines