Rose Database Proposal
Scott Wales
To:
 [email protected] 
Cc:
 Michael Rezny [[email protected]]; Michael Naughton [[email protected]]; Asri Sulaiman [[email protected]]
Thursday, 19 April 2012 5:48 PM

Hi Dave,

I'm part of the UM support group for the Australian universities and am
currently maintaining our test installation of Rose. The suite database
proposal on the collab wiki has recently been brought to my attention, and I
wanted to share some thoughts on it with you.

The current proposal is to continue to use a fixed-width alphanumeric
identifier for jobs. Is there a particular reason for choosing this as
opposed to simply using a number incremented for each job, e.g. 1, 2, 3? A
drawback of the current fixed-width system is the limited number of runs that
can be grouped into a folder, a limit we've hit when creating nested runs.
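
As a rough illustration of the ceiling a fixed-width scheme imposes, here is a
minimal Python sketch. The two-letter, three-digit layout is only an
assumption based on the "ab012" style of ID used as an example in this
message, and the function names are hypothetical.

```python
import itertools
import string


def fixed_width_ids(letters=2, digits=3):
    """Yield IDs such as 'aa000', 'aa001', ... (the layout is an assumption)."""
    for head in itertools.product(string.ascii_lowercase, repeat=letters):
        for n in range(10 ** digits):
            yield "".join(head) + f"{n:0{digits}d}"


def fixed_width_capacity(letters=2, digits=3):
    """Total number of distinct IDs the fixed-width layout can ever hold."""
    return 26 ** letters * 10 ** digits


def sequential_ids(start=1):
    """Yield 1, 2, 3, ... with no upper bound and no layout to outgrow."""
    return itertools.count(start)
```

With this layout the whole ID space holds 26^2 * 10^3 = 676,000 jobs, and any
grouping that fixes part of the ID holds far fewer, whereas a plain counter
has no such limit.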

I personally believe it would be better to have the job names in the repository
more free-form, with users able to group jobs as they wish under a username
folder. While ID numbers can be convenient for quick reference, this could be
handled by the metadata database. You also avoid the fcm magic involved in
creating a new folder with the correct ID number in the repository.

Having branches within jobs seems to me a bit excessive. Perhaps it would be
better to think of the jobs themselves as branches from a set of standard runs
for NWP, climate &c., with the ability to merge jobs provided.

Storing jobs in a subdirectory for each character of the ID also might be a
bit much. Perhaps this could be condensed into something like ab/012 rather
than a/b/0/1/2.
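
To make the two layouts concrete, here is a small Python sketch mapping an ID
to each directory structure. The function names and the default prefix length
of two are assumptions for illustration only.

```python
def nested_path(suite_id: str) -> str:
    """One directory per character, e.g. 'ab012' -> 'a/b/0/1/2'."""
    return "/".join(suite_id)


def condensed_path(suite_id: str, prefix_len: int = 2) -> str:
    """Split once after the prefix, e.g. 'ab012' -> 'ab/012'."""
    if not 0 < prefix_len < len(suite_id):
        raise ValueError("prefix_len must fall inside the ID")
    return f"{suite_id[:prefix_len]}/{suite_id[prefix_len:]}"
```

The condensed form needs two directory levels instead of five, at the cost of
more entries in each top-level folder.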

Have you considered using svn properties to store the metadata rather than a
config file? I believe there is an API to access properties that a discovery
tool could be built around. This would avoid having to synchronise to an
external database, although it would be less convenient to modify than a text
file.
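
As a sketch of what property-based metadata might look like, the Python below
wraps the standard `svn propset` and `svn propget` subcommands. The property
name and URL used in the usage note are hypothetical, and a real discovery
tool would more likely use the Subversion bindings than shell out.

```python
import subprocess


def propset_cmd(name, value, path):
    """Build the command that sets one property on a working-copy path."""
    return ["svn", "propset", name, value, path]


def propget_cmd(name, target):
    """Build the command that reads one property from a path or URL."""
    return ["svn", "propget", name, target]


def read_property(name, target):
    """Run svn propget and return the value (requires svn on the PATH)."""
    result = subprocess.run(
        propget_cmd(name, target), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

For example, `propset_cmd("rose:owner", "scott", "my-job")` followed by a
commit would attach the owner to the job directory itself, with no separate
file to keep in sync.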

I think it would be worthwhile to limit the amount of magic done by fcm;
preferably it would just be a thin layer on top of subversion adding support
for path shortening. If the metadata handling is done using commit hooks, that
shouldn't be a problem.
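
If the metadata were maintained by a commit hook, the hook would mostly need
to know which job directories a commit touched. Here is a minimal Python
sketch, assuming a post-commit hook that reads the output of `svnlook
changed`; the paths in the usage note are hypothetical.

```python
import subprocess


def parse_changed(output):
    """Turn `svnlook changed` output into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Status flags occupy the first columns; the path starts at column 4.
        entries.append((line[:4].strip(), line[4:]))
    return entries


def changed_paths(repos, rev):
    """Run svnlook for one revision (requires svnlook and a local repository)."""
    result = subprocess.run(
        ["svnlook", "changed", "-r", str(rev), repos],
        capture_output=True, text=True, check=True,
    )
    return parse_changed(result.stdout)
```

Given output such as `A   scott/test-job/`, the hook could then update
whatever metadata store is chosen for each affected job, leaving fcm itself
untouched.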

Suite-create could be handled as creating a branch off of a standard empty job,
avoiding needing to add logic for creating the directory structure to fcm.
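
A sketch of suite-create as a plain branch operation, using the standard
`svn copy` subcommand between two repository URLs. The repository URL, the
template name and the per-user layout are all assumptions for illustration.

```python
def suite_create_cmd(repo_url, template, user, new_job):
    """Build an svn copy command that branches a new job from a template job."""
    src = f"{repo_url}/{template}"
    dst = f"{repo_url}/{user}/{new_job}"
    message = f"Create {new_job} from {template}"
    # --parents creates the user folder in the same commit if it is missing.
    return ["svn", "copy", "--parents", src, dst, "-m", message]
```

Because the copy happens entirely on the server, the new job inherits the
directory structure of the template and keeps a history link back to it.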

What is the purpose of having a standard location for checking out a job,
rather than just adding it to the current directory? Users should be encouraged
to think of working copies as temporary things; the permanent storage is the
repository itself.

In order to properly share jobs between collaborators you need to share not
just the UM configuration, but also the ancillary and IC files. The best way to
do that might require some thought.

I think I'll leave it there for the moment as this has gotten rather
long-winded; hopefully at least some of this is useful to you. I'm very
interested in the direction that Rose is heading, and definitely think that
it's a positive one.

Cheers,

Scott Wales

Scott Wales, Computational Modelling Support
School of Earth Sciences, The University of Melbourne, Australia 3010
[email protected] / P +61 3 8344 6907 / M 0450 012 907