access_RaijinAdmin - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki

NCI's Raijin User Guide

Groups

The ACCESS ecosystem uses a number of UNIX groups to provide access control to our systems. These include:

access

All users of the ACCESS systems need to be members of the access group. It provides access to the UM code, the accessdev & accesscollab servers and the ~access directory. You can request membership at https://my.nci.org.au/mancini/project/access/join.

access.admin

ACCESS technical support members are part of this group. It allows write access to shared directories such as ~access.

access.dev

The accessdev administrators are members of this group.

NCI projects

NCI accounting is done on a per-project basis, and each project has a corresponding UNIX group. These groups are used for quotas and other resource allocation.

Filesystems

/home

User home directories are under the control of their owners. By default only the owning user can access files here, including things like job output logs. /home storage is limited (default quota is ?? GB; more can be requested from NCI via [email protected] with sufficient justification). /home has rolling backups.

/short/$PROJECT

Project-specific scratch storage. Quotas are managed on a per-project basis; to change which project a file is accounted under, use chgrp. Storage quotas are more flexible than on /home, and reasonable quota extensions are usually easy to get. Recommended practice is for users to store files under /short/$PROJECT/$USER. Only members of a project can access files under the related /short/$PROJECT directory. There is also a /short/public directory without that restriction.
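
As a rough sketch of the chgrp step, the helpers below hand a directory tree over to another project group and list anything still accounted elsewhere. The function names are hypothetical and not part of any NCI tooling.

```shell
# reassign_project GROUP PATH
# Recursively give PATH to GROUP so its usage counts against that project.
# (Hypothetical helper name; it only wraps chgrp -R.)
reassign_project() {
    chgrp -R "$1" "$2"
}

# list_other_groups GROUP PATH
# Print every file under PATH whose group is not GROUP.
list_other_groups() {
    find "$2" ! -group "$1"
}
```

For example, reassign_project "$PROJECT" "/short/$PROJECT/$USER" followed by list_other_groups "$PROJECT" "/short/$PROJECT/$USER" should print nothing once every file is accounted to $PROJECT. You must be a member of the target group for chgrp to succeed.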

/short is not backed up; do not use it for long-term storage.

/projects/access

Shared storage for special projects like ACCESS to store common scripts & libraries. /projects/access is also mounted at ~access. It is not meant for large files; use /g/data for those instead.

/projects/access/data

Local storage for long-term files (e.g. ancillaries)

/g/data/access

/g/data/access is also mounted on accessdev, so it is used for prebuilds and rose-metadata, which need to be visible there. It is not backed up, so it should only hold files that can be recreated.

Other

MDSS

Long-term archive; refer to NCI's user guides.

/apps

NCI applications, maintained using modules

jobfs

Temporary storage for running jobs, located on the compute nodes

Permissions for shared storage

Shared storage like ~access and /g/data/access must be writable by members of the access.admin group, readable by members of the access group and hidden from everyone else (due to UM licensing). This is mostly achievable using ACLs, but there are limitations.

The ACLs for shared storage look like:

$ getfacl /projects/access
# file: projects/access
# owner: access
# group: access.admin
# flags: -s-
user::rwx
group::rwx
group:access:r-x
mask::rwx
other::---
default:user::rwx
default:group::rwx
default:group:access:r-x
default:mask::rwx
default:other::---

This does several things:

  • Firstly the set-group-id bit is set, meaning files created under this directory have the 'access.admin' group, regardless of the current group of the creator.
  • The user and the group have full rwx permissions
  • Additionally, the access group has rx permissions
  • The default: entries mean that any new files created in this directory will have the same permissions as this one

This is what we want, and it mostly works, with one exception: the mask. The mask places an extra restriction on the owning group and on every named user or group entry. The effective permissions of such an entry are the bitwise AND of the entry's permissions and the mask. A mask of r-- means that, regardless of the ACL settings, group members will not be able to write to a file (the owner is not affected).
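
The AND between an entry and the mask can be illustrated with a small shell function. This is only a sketch for reasoning about getfacl output; effective_perms is a hypothetical name and the function does not read or change any real ACL.

```shell
# effective_perms ENTRY MASK
# Print the permissions left after ANDing an ACL entry with the mask.
# Both arguments are three-character strings such as "rwx" or "r-x".
effective_perms() {
    ep_entry=$1
    ep_mask=$2
    ep_result=""
    for ep_bit in r w x; do
        # A permission survives only if it appears in both the entry and the mask.
        case "$ep_entry" in *"$ep_bit"*) ep_in_entry=1 ;; *) ep_in_entry=0 ;; esac
        case "$ep_mask" in *"$ep_bit"*) ep_in_mask=1 ;; *) ep_in_mask=0 ;; esac
        if [ "$ep_in_entry" -eq 1 ] && [ "$ep_in_mask" -eq 1 ]; then
            ep_result="$ep_result$ep_bit"
        else
            ep_result="$ep_result-"
        fi
    done
    printf '%s\n' "$ep_result"
}

effective_perms rwx r--   # a group::rwx entry under mask::r-- prints r--
effective_perms r-x rwx   # a group:access:r-x entry under mask::rwx prints r-x
```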

The mask of a newly created file is affected both by the default mask in the ACL and by the current umask of your shell. The umask lists the permissions that new files do not get; the NCI default value of 0022 means that new files cannot be written by group members or others.

In other words, a newly created directory in the shared space has the permissions:

# owner: saw562
# group: access.admin
# flags: -s-
user::rwx
group::rwx			#effective:r-x
group:access:r-x
mask::r-x
other::---

Note that group writes have been masked out, despite the default settings.

There are two possible ways to get around this:

  • 'chmod g+w' newly created files
  • 'umask 0002' before working in the shared filespace (be sure to go back to umask 0022 once you're done)
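
One way to avoid forgetting to restore the umask is to change it inside a subshell, so the setting is discarded automatically. The sketch below assumes the umask behaves on the shared filesystem as described above; mkdir_group_writable is a hypothetical helper name.

```shell
# mkdir_group_writable DIR
# Create DIR with a temporary umask of 0002. The parentheses run the commands
# in a subshell, so the caller's umask is left untouched.
mkdir_group_writable() {
    (umask 0002 && mkdir -p "$1")
}
```

After calling mkdir_group_writable on a new directory under the shared space, running umask in the same shell should still report the previous value.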

Setting FACLs

To set the permissions on an existing directory:

chgrp access.admin dir
chmod g+s dir
setfacl -R -d -m g:access:rx dir
setfacl -R -d -m g:access.admin:rwx dir
setfacl -R -m g:access:rx dir
setfacl -R -m g:access.admin:rwx dir

If the directory is empty, the last two lines aren't required because no existing files need permissions changed.
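
After running the commands above, a quick sanity check can confirm the group and the set-group-id bit. The function below is a minimal sketch with a hypothetical name; it does not inspect the ACL entries themselves, so use getfacl dir for those.

```shell
# check_shared_dir DIR GROUP
# Return 0 if DIR belongs to GROUP and has the set-group-id bit set,
# otherwise print what is wrong and return 1.
check_shared_dir() {
    cs_dir=$1
    cs_want=$2
    cs_status=0
    if [ ! -g "$cs_dir" ]; then
        echo "$cs_dir: set-group-id bit is not set"
        cs_status=1
    fi
    # The fourth column of ls -ld is the owning group.
    cs_have=$(ls -ld "$cs_dir" | awk '{print $4}')
    if [ "$cs_have" != "$cs_want" ]; then
        echo "$cs_dir: group is $cs_have, expected $cs_want"
        cs_status=1
    fi
    return "$cs_status"
}
```

For example, check_shared_dir dir access.admin should print nothing and return 0 once the chgrp and chmod steps have been applied.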

Applications

The ACCESS system makes use of a number of applications & libraries beyond what is available in /apps:

  • UM Small execs
  • GCOM
  • FCM
  • Oasis3
  • Oasis-MCT
  • Rose
  • Cylc
  • Drhook

These are available to users using module files, as is the case for all other libraries at NCI. To access the modules users run commands like:

module use ~access/modules
module load gcom/4.2

Installing new applications

New applications should be installed under the path /projects/access/apps/$PROGRAM/$VERSION, using the usual UNIX layout of bin/, include/, lib/ etc. subfolders. These subfolders get added to PATH, CPATH, RPATH etc. by the module file, so users don't need to set paths explicitly. If any special steps are required to install the program, please note them in /projects/access/apps/$PROGRAM/README so that the installation can be upgraded in the future.
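
The layout can be sketched with a small helper that creates the skeleton before the program itself is installed into it. The function name and the README stub text are hypothetical, and the apps root is passed as an argument so the sketch can be tried outside /projects/access.

```shell
# make_app_skeleton APPS_ROOT PROGRAM VERSION
# Create APPS_ROOT/PROGRAM/VERSION with bin/, include/ and lib/ subfolders,
# add a README stub for installation notes, and print the install prefix.
make_app_skeleton() {
    ma_prefix="$1/$2/$3"
    ma_readme="$1/$2/README"
    mkdir -p "$ma_prefix/bin" "$ma_prefix/include" "$ma_prefix/lib" || return 1
    # Keep an existing README; only create the stub when none is present.
    if [ ! -e "$ma_readme" ]; then
        printf 'Installation notes for %s\n' "$2" > "$ma_readme"
    fi
    printf '%s\n' "$ma_prefix"
}
```

For example, make_app_skeleton /projects/access/apps gcom 4.4 would print /projects/access/apps/gcom/4.4, which can then be used as the install prefix when building the program.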

Module files are simple to create. If you're using the standard directory structure, all you need is e.g.:

#%Module

set help            "Met Office comms library"
set prefix          /projects/access/apps/gcom/4.4
set install-contact "Scott Wales <[email protected]>"
set install-date    "21-06-13"
set url             collab.metoffice.gov.uk

conflict gcom
source /projects/access/modules/common

The conflict command stops the module from being loaded while another gcom module is already loaded. The common file automatically sets up PATH, CPATH etc. for you.

If you wish to load a dependency (e.g. python) add to the module file:

if { ![ is-loaded python ] } {
    module load python
}

The module gets put into a file named /projects/access/modules/$PROGRAM/$VERSION.

To install a python library from PyPI you can use the script ~access/admin/install-python-lib.sh, which will automatically install to the correct path & set up a module for you.

UM Prebuilds

UM Prebuilds are a way for multiple UM jobs to share the same files when building. If a job is using a prebuild, the build job checks each file being compiled (& its build settings) against the prebuild. If they differ it compiles the file; if they are the same it uses the precompiled file from the prebuild. This can greatly speed up builds when only a few settings have been changed.

A prebuild is a normal UM build that lives in a shared directory. Prebuilds are named systematically like $VERSION_$CONFIG_$BUILDLEVEL, e.g. vn7.3_access1.3_safe. There are also generic prebuilds named like $VERSION_$BUILDLEVEL, which can be used if there is no prebuild for the configuration being used.
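
The naming scheme and the fallback to a generic prebuild can be sketched as follows. Both function names are hypothetical, and the prebuild directory is passed in as an argument rather than assumed.

```shell
# prebuild_name VERSION BUILDLEVEL [CONFIG]
# Print VERSION_CONFIG_BUILDLEVEL, or the generic VERSION_BUILDLEVEL
# when no configuration is given.
prebuild_name() {
    if [ -n "${3:-}" ]; then
        printf '%s_%s_%s\n' "$1" "$3" "$2"
    else
        printf '%s_%s\n' "$1" "$2"
    fi
}

# pick_prebuild PREBUILD_DIR VERSION CONFIG BUILDLEVEL
# Prefer the configuration-specific prebuild, fall back to the generic one,
# and return 1 if neither directory exists under PREBUILD_DIR.
pick_prebuild() {
    pp_specific=$(prebuild_name "$2" "$4" "$3")
    pp_generic=$(prebuild_name "$2" "$4")
    if [ -d "$1/$pp_specific" ]; then
        printf '%s\n' "$pp_specific"
    elif [ -d "$1/$pp_generic" ]; then
        printf '%s\n' "$pp_generic"
    else
        return 1
    fi
}

prebuild_name vn7.3 safe access1.3   # prints vn7.3_access1.3_safe
prebuild_name vn7.3 safe             # prints vn7.3_safe
```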

Prebuilds should be stored under ~access/prebuilds on accesscollab, and under ~access/umdir/prebuilds on Raijin.

To ensure that debug information is viewable to everyone you should build the model in the same location where it will eventually reside. This may require some manipulation of the build scripts & fcm config.

To build, process the UMUI job then edit the files FCM_EXTR_SCRIPT & FCM_BLD_COMMAND, changing the values in the top declarations to:

  • UM_MAINDIR -> /projects/access/prebuilds/PREBUILDNAME
  • UM_RMAINDIR -> /projects/access/prebuilds/prebuilds/PREBUILDNAME
  • UM_OUTDIR -> The same as UM_MAINDIR
  • UM_ROUTDIR -> The same as UM_RMAINDIR

The hand edit ~access/raijin/create-7.3-prebuild.sh will do this for you, creating a prebuild with the same name as the job.

See RoseSuitePrebuilds for information on using prebuilds from a rose suite.

Admin tools

There are some useful tools for administration in the ~access/admin directory:

  • install-python-lib.sh: Download a python library from PyPI to the ~access/apps directory & set up a module for it
  • verify-access.sh: Verify ACLs for all the files in ~access, printing out a list of paths with incorrect permissions