access_STASHpage - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki

STASH and NWP

Implementation and STASH Working Group

Philosophy

  • STASH needs to be modular to allow sets of requests (packages) to be inserted and removed.

    • This means that the package label becomes paramount; if a request belongs to an important suite function (e.g. etkf, background errors etc) it MUST BE LABELLED as such. I don't really care much whether different functional aspects of user STASH are labelled as different packages (although it would aid extraction/manipulation of user stash if they had 'user' in the package name).
    • The hash IDs need to be de-emphasised; the user should not need to know them or have to change them deliberately (that should be a background task done by the modifying script when saving changes).
    • When saving packages the profiles need to be saved as well - preferably with modified names so that conflicts don't arise when reading them back in.
  • Although it is possible to turn off stash requests individually, it is probably better to remove unused stash requests and profiles to avoid extra clutter in the rose files.

  • STASH should be as similar as possible across the model suites to streamline downstream processing, facilitate modification, insertion and deletion of fields across models and make it easier to keep users informed.

  • Where possible it is better to leverage existing stash/rose macros to process stash. (This is complicated by the fact that I didn't find the relevant macros until I was well into the process and by the fact that they seem to be based on a different philosophy to mine.)

  • We need a STASH editor which allows group edits where collections of requests can be moved from package to package, have common properties edited etc.

  • It would also be useful if essential non-user STASH used a Model Output Stream file name which did not use any of the conventional streams pp0-pp9. Different centres could have specific streams that they use for user STASH and it would be easier if they didn't have to adjust the stream names between suites (e.g. some essential reanalysis fields are written to pp0 and pp4 which overlap with some of my user STASH streams - they could be written to base names using 'errmod' or 'snowsfc' instead).
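To make the package-label point concrete, here is a minimal sketch of how requests could be grouped by package and the unlabelled ones flagged. The section name `namelist:streq(...)`, the `package` key and the sample IDs are assumptions modelled on a typical rose-app.conf layout, not taken from a real suite, and `group_by_package` is a hypothetical helper rather than an existing rose macro.

```python
import configparser
from collections import defaultdict

# Hypothetical fragment of a rose-app.conf; IDs and labels are invented.
SAMPLE = """
[namelist:streq(00001_aaaa1111)]
isec=0
item=1
package='ETKF'

[namelist:streq(00024_bbbb2222)]
isec=0
item=24
package='user BoM surface'

[namelist:streq(03236_cccc3333)]
isec=3
item=236
package=''
"""


def group_by_package(conf_text):
    """Return {package label: [section names]} for STASH request sections."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case exactly as written
    parser.read_string(conf_text)
    groups = defaultdict(list)
    for section in parser.sections():
        if not section.startswith("namelist:streq("):
            continue  # profiles and other namelists are ignored here
        label = parser[section].get("package", "").strip("'\" ")
        groups[label or "<unlabelled>"].append(section)
    return dict(groups)


groups = group_by_package(SAMPLE)
for label, sections in sorted(groups.items()):
    kind = "user" if "user" in label.lower() else "other"
    print(f"{label!r:22} {kind:5} {len(sections)} request(s)")
```

Anything that ends up under `<unlabelled>` is exactly the sort of request that has to be relabelled by hand before the user STASH can be pulled out safely.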

My Current Strategy for Moving Forward

Some of this will need some modification when I move to UM 10.4 but the gist of it and the implications for suite design should still be valid.

  A. All non-user STASH essential to the full functioning of the suite has to be in a labelled package. Currently I follow a trial-and-error process to identify the essential STASH groups which are not labelled and relabel them before I extract the proper user STASH.

  B. Real user STASH is extracted to a separate file (opt/rose-app-[STASH_ID].conf, where STASH_ID is a suite variable set for the site, e.g. UKMO, BOM etc).

  C. The option for STASH_ID has to be added to the options for the sub-suites which use the um_fcst STASH.

  D. STASH settings for other options in opt need to be labelled in a similar fashion if they contain interactions with the site-specific user STASH (e.g. opt/rose-app-[key_name][STASH_ID].conf), with appropriate modifications to the options statements in the suite task configuration files.

  E. An effort is required to keep the STASH hash-based names in sync across the opt files and the basic rose-app.conf file; the site-specific split suggested above might make that easier to deal with.
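The extraction of user STASH into opt/rose-app-[STASH_ID].conf could look roughly like the sketch below. It assumes that user packages can be recognised by 'user' in the package label (my preference above, not an existing convention), and the section names, keys and helpers (`parse_sections`, `split_site_stash`, `render`) are all hypothetical. The parser is deliberately simplified and ignores rose's `!` ignored-state markers.

```python
SAMPLE = """\
[namelist:streq(00001_aaaa1111)]
isec=0
item=1
package='ETKF'

[namelist:streq(00024_bbbb2222)]
isec=0
item=24
package='user surface fields'

[namelist:time(t6h_cccc3333)]
tim_name='T6H'
"""


def parse_sections(text):
    """Parse INI-like text into {section name: {key: value}}."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return sections


def split_site_stash(sections, stash_id):
    """Split sections into (base, site, opt file path) by package label."""
    base, site = {}, {}
    for name, body in sections.items():
        is_request = name.startswith("namelist:streq(")
        package = body.get("package", "").strip("'\"")
        # Assumption: only requests whose label contains 'user' are site STASH.
        target = site if is_request and "user" in package.lower() else base
        target[name] = body
    return base, site, f"opt/rose-app-{stash_id}.conf"


def render(sections):
    """Write sections back out as INI-like text, sorted for stable diffs."""
    lines = []
    for name in sorted(sections):
        lines.append(f"[{name}]")
        lines.extend(f"{k}={v}" for k, v in sorted(sections[name].items()))
        lines.append("")
    return "\n".join(lines)


base, site, opt_path = split_site_stash(parse_sections(SAMPLE), "BOM")
print(opt_path)
print(render(site))
```

This sketch leaves all profiles in the base file. A real version would also have to copy the profiles used by the extracted requests, as noted in the philosophy section.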

The original plan to move from the UMUI style files to the rose/cylc era.

The pre-rose operational models use stash settings generated to reproduce (as much as possible) the output from the legacy models, to minimise disruptions to downstream processes etc. The same philosophy is the basis for the original plan to translate the information in the STASHC file into a rose-app.conf style file. Hence the process was supposed to be:

  1. Translate the STASHC (and some auxiliary information) into namelist blocks in the rose-app.conf standard saved as a specifically named file in the opt sub-directory.

  2. Remove the user stash requests from the rose development suite (u-aa670) and save them to a file in the opt sub-directory, thereby leaving the requests required for other components of the suite (e.g. background errors etc) in a stripped-down rose-app.conf file.

  3. Insert the BoM stash from the translated file in opt into the rose-app.conf file and clean up any errors so that the validation macros are happy.
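The insertion in the last step is where ID clashes show up, so a sketch of that merge may help. It works on plain dictionaries of the form {section name: {key: value}}; the section names, keys, labels and the `merge_stash` helper are hypothetical, and this is not how the rose macros do it.

```python
def merge_stash(base, site):
    """Insert site sections into base and return (merged, conflicts).

    A conflict is a section ID present in both inputs with different
    settings. Conflicting site sections are not inserted, so the base
    copy survives until the clash is resolved by hand.
    """
    merged = {name: dict(body) for name, body in base.items()}
    conflicts = []
    for name, body in site.items():
        if name in merged and merged[name] != body:
            conflicts.append(name)
            continue
        merged[name] = dict(body)
    return merged, sorted(conflicts)


# Invented example data: one clashing ID and one new request.
base = {
    "namelist:streq(00001_aaaa1111)": {"isec": "0", "item": "1", "package": "'ETKF'"},
    "namelist:streq(00024_bbbb2222)": {"isec": "0", "item": "24", "package": "'Background errors'"},
}
site = {
    "namelist:streq(00024_bbbb2222)": {"isec": "0", "item": "24", "package": "'user BoM'"},
    "namelist:streq(03236_cccc3333)": {"isec": "3", "item": "236", "package": "'user BoM'"},
}

merged, conflicts = merge_stash(base, site)
print(f"{len(merged)} sections after merge")
for name in conflicts:
    print("needs attention:", name)
```

Reporting the clashes instead of silently overwriting keeps the essential suite requests intact, which matters more than getting the user STASH in at the first attempt.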

In the process I discovered all sorts of interesting quirks which required extra processing to solve, and the conversion/translation script now makes a few extra changes to improve the layout of the output for our current plans (e.g. removal of the 4v fields, changes to frequency of output, additional fields, removal of some fields etc).

Thoughts on the process

The rose/cylc version of stash has some improvements over the old UMUI version but also adds some frustrating complications. Some of these undoubtedly stem from the conflict between my philosophy and that at the UKMO about how the output should be laid out, but they need to be solved if the process is to be portable between institutions.

  • Requests needed by other components in the suite are unlabelled and are therefore identified as user stash at a first cut. For gl_um_fcst this was easier to solve because the aberrant packages did not write to the standard user model output streams (e.g. pa, pb, pc etc), but this was not the case for engl_um_fcst, where there are packages which write to pb (ETKF) and pc (??). Some of these problems would be ameliorated if a consistent package labelling policy was put in place!

  • Setting identification (ID) for the profiles and stash requests with a hash code based on the content is good for finding changes, conflicts etc but it is a nightmare to use when you are initially setting up a large stash set as every change requires a rehash (admittedly easy enough to do with the rose validateTransform macros but still...)

  • The profile name is not part of the hashing process, but it is the profile name and not the ID which is used by the stash requests. This means that there is no solid link between the requests and the associated profiles - you can unwittingly change the output profile by introducing another profile with the same name but a different ID. (Since the IDs are used to identify identical profiles this should get picked up, but it does require special care when inserting and deleting package sets.)
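The name-versus-ID problem in the last point can be checked for mechanically. The sketch below is only an illustration: `content_id` is a stand-in hash over the profile settings with the name left out, not the algorithm rose actually uses, and the key names (`tim_name`, `ifre`, `unt3`) and values are assumed for the example.

```python
import hashlib
from collections import defaultdict


def content_id(settings, name_key):
    """Illustrative 8-hex-digit ID from the settings, ignoring the name."""
    payload = "\n".join(
        f"{key}={value}"
        for key, value in sorted(settings.items())
        if key != name_key
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:8]


def ambiguous_names(profiles, name_key):
    """Return {profile name: sorted IDs} for names shared by several IDs."""
    ids_by_name = defaultdict(set)
    for settings in profiles:
        ids_by_name[settings[name_key]].add(content_id(settings, name_key))
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


# Invented time profiles: two different definitions both called 'T6H'.
time_profiles = [
    {"tim_name": "'T6H'", "ifre": "6", "unt3": "3"},
    {"tim_name": "'T6H'", "ifre": "3", "unt3": "3"},
    {"tim_name": "'T1H'", "ifre": "1", "unt3": "3"},
]

clashes = ambiguous_names(time_profiles, "tim_name")
for name, ids in clashes.items():
    print(f"profile name {name} is carried by {len(ids)} different IDs: {ids}")
```

Running a check like this after every package insertion or deletion would catch the case where a request silently starts using a different profile with the same name.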