access_NewSun_003 - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki


NOTES - updated after 3rd porting Meeting, Wed 11am, 24th July 9E Meeting room.

**Changes/Updates to previous notes are in bold.**

Present: Joerg, mjn, Joan, ttl, ilia, Wen Ming, Zhihong, Xiao, Mdix, ciwt, azs, rab, Scott Wales

Configuration Management of systems/suites that go into operation

  • There was a discussion on rationalising the directory structure of operational systems and using SVN to keep track of changes as well as thorough documentation of all components.
  • -- This aim was generally agreed by all present, although there are differences in priorities and on the best way to proceed.
  • -- It was noted that this subject extends beyond the scope of "porting", although there are arguments that its implementation (at least some of it, e.g. using SVN) may aid the porting work itself.
  • -- This subject is also potentially vast.
  • -- Failure to implement some kind of source control at the very beginning will result in a hard-to-maintain system, which is detrimental in the long term.
  • -- Need to be able to trace (and rebuild) source of executables.
  • -- REF, see UKMO ParallelSuites as example
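
One way to picture the "trace (and rebuild) source of executables" point is a small manifest written next to each executable at build time. The following is a minimal sketch only, assuming a JSON manifest; the function names, field names and the repository URL used in any call are hypothetical and are not an agreed convention of the group, SVN or fcm.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(executable: Path, repo_url: str, revision: int,
                   compiler: str, flags: list) -> Path:
    """Write <executable>.manifest.json beside the executable.

    All field names below are hypothetical, chosen for illustration.
    """
    manifest = {
        "executable": executable.name,
        "sha256": sha256_of(executable),
        "repo_url": repo_url,
        "revision": revision,
        "compiler": compiler,
        "flags": flags,
    }
    out = executable.with_name(executable.name + ".manifest.json")
    out.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return out


def matches_manifest(executable: Path, manifest_path: Path) -> bool:
    """True if the executable still has the checksum recorded at build time."""
    recorded = json.loads(manifest_path.read_text())
    return recorded["sha256"] == sha256_of(executable)
```

With something like this, an executable found in an operational directory could be checked against its manifest, and the recorded repository revision and compile options would say what to check out in order to rebuild it.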

A round table discussion in Meeting 2 was initiated to give everyone present the chance to present their issues. Source/Config management issues above were raised. Other issues raised are recorded in the table below.

Issues and Task List table

This table contains the main current items relevant to the working group. Issues no longer active are moved to the inactive list at the bottom. Previous meeting notes are also available in links at the bottom of the page.

| No. | ITEM | STATUS/COMMENTS | Contact Person |
|---|---|---|---|
| 2 | 30 Days Ngamai Acceptance test | Commenced from handover date (end June), running an SSP mix of programs (including the UM benchmark from solar). Acceptance depends mainly on 30 days of running without error. No acceptance test document. Due to complete in the 4th week of July. So far there have been no problems. Side note: so far Raijin has been found to be slower than solar for UM using an equivalent number of cores/config. Acceptance test going well. | rab |
| 3 | Setup and rebuild of /apps | So far less than 50% have been done, but the main ones crucial for porting, such as compilers, have been done. Ongoing work. **Still in the process of installing additional apps. sms is almost ready and will be available via module loads. python with matplotlib is also almost ready.** /apps will be added to user's module path by default upon login by the system. | /apps group, rab |
| --3a | Logins for BoM users | General user access to ngamai is now available, starting from pm Wed 17th July 2013. Solar users are not automatically given ngamai accounts. Notifications given, and those with need but still w/o accounts can apply. | rab |
| 6 | Intel compiler 11.0.83 | Strong need, but not straightforward to provide. This may be a difficult issue. A two-pronged approach is advised. Attempts to use LD_LIBRARY_PATH to tackle this have so far failed due to what appear to be hardcoded paths within the Intel 11.0.83 setup. Consultation with various local experts and Intel staff has so far been unsuccessful. In the meantime, recompilation with v12 has shown very small differences, which may be acceptable (Ilia & Xiao). Do not use the -xHost option. Do not use Intel v13 compilers. **Progress made with v12 compilers. Re-download and rebuild of v11 will be attempted by Tim Pugh and Justin Freeman. If newly built, it will be compatible with openMPI and there will be no need for MPI/Sun.** | ilia, rab, group |
| 7 | Source Code for UM | Migrate SVN repository mirror from solar - cut over during 1st weekend of October. access-svn was down for the last few days since Fri 19/7 *[due to disk & vmware issues, back up on Thu 25/7]*. This raises the issue of its robustness as well as the resources allocated to it, especially disk space. OPS/VAR/SURF repositories to be migrated to ngamai first, with move to NCI delayed until stable accessdev svn repositories are fully available. | azs |
| 8 | Source Code for VAR | Migrate SVN repository from solar - cut over during 1st weekend of October. | azs |
| 9 | Source Code for OPS | Migrate SVN repository from solar - cut over during 1st weekend of October. | azs |
| 10 | Source Code for SURF | Migrate SVN repository from solar - cut over during 1st weekend of October. In general, all svn repositories will be moved to NCI to allow access from raijin. | azs |
| 11 | ~access directory | Copy solar ~access to "from_solar" on ngamai. Move to proper place as we go along. Improve where suitable. Interoperability with raijin is an important consideration and should be maintained where possible. Using scp to transfer files caused symbolic links to be copied in full, resulting in substantial duplicates; transfer redone using rsync. | access.admin |
| 12 | GCOM Libraries | Decide naming strategy, identify required versions/compile options, synchronise with Raijin. E.g. proposed structure ~access/apps/gcom/GCOM4.4/bld_mpp_12.1.8.273_1.6.3 submitted to a few access.admins. | access.admin, azs, ilia, Martin, Scott Wales |
| 13 | Migrate Trac databases | Cutover from solar to ngamai 1st weekend of Oct. | azs |
| 14 | UM Small execs | Do for vn7.3, 7.5, 7.6, 8.2 and 8.4. **May be able to use copy from raijin.** | Martin |
| 15 | Migrate UMUI, VARUI, OPSUI and SCSUI | Initially to operate UMUI from solar. Big font issue encountered on ngamai. | Zhihong, azs, ilia, xiao, Say |
| 16 | CAP program on ngamai | vn8.1 now available and set up on Raijin. Sufficient for the time being; not urgent to port to ngamai, as we can create ancils on raijin and copy to ngamai. | Martin |
| 17 | Re-compile/build UM 7.5/7.6 Executables for APS1 - Global, Regional, Access-C ... | Documentation of build procedure, UI job id, wiki page of build or similar to UKMO Parallel suite page. Sample of build documentation to be created for discussion. | Ilia, Xiao, azs, Wenming, Martin |
| --17a | ACCESS-G | VN7.6 (PS25) | Ilia, Xiao, azs |
| --17b | ACCESS-R | VN7.6 (PS25) | Ilia, Xiao, azs |
| --17c | ACCESS-C | Wenming & azs trying out build job from vn7.6, PS25 UK4. UKMO qazga, Xiao's xbdec. | Wenming, Ilia, Xiao, azs |
| --17d | ACCESS-TC | VN7.6 (PS25) | Ilia, Xiao, azs |
| --17e | UKV | VN7.6 (PS25) | Ilia, Xiao, azs |
| --17f | VAR | Compile/build of VAR not done through UI. **Recompiled VAR gives different result. -model_precise compile option is important.** | Ilia, Xiao |
| --17g | OPS | Compile/build of OPS not done through UI. | Ilia, Xiao |
| --17h | SURF | Compile/build of SURF not done through UI. | Ilia, Xiao |
| --17i | Martin's Worm | Diff plots, ensemble plots etc. will be useful for verification. | Martin |
| 18 | MARS scripts and exes | Archiving scripts, converters etc. | ttl |
| 19 | Porting/Migration website | Use NCI access trac. Need NCI userid for everyone. | Porting Group |
| 20 | Web serving and documentation from ngamai | Web serving from ngamai is active. Nothing set up yet. | rab/Porting Group |
| 21 | fcm utility | Only fcm2.3 should be required as fcm is backward compatible. fcm 1.5 has been used successfully on yambuk. Needs certain Perl libraries, which should already be available with standard perl on ngamai. | Martin |
| 22 | Miscellaneous utilities used by ACCESS systems | [None highlighted so far] | UM Porting group, azs |
| 23 | profile.gen.access | Should be in /access/scripts, not in /access. | Xiao |
| 24 | UM Ancillaries input data | Since UM ancillaries, in particular for N512, can be quite large, propose that they go into /access/ancils instead of into UMDIR (or into /access/data/ancils). | access.admin, azs |
| 25 | Misc UM parameter files | A few smallish radiation param, verts, ancil and STASHmasters etc. | UM porting group, azs |
| 26 | APS1 suite | Xiao working on ngamai ACCESS-NWP suite(s) based on NMOC SCS components, paths, etc. | Joan, ciwt, Xiao |
| 27 | APS2 suite | APS2 is of secondary importance since it is not operational yet. | ciwt, Xiao, Sergei, Joan |
| 28 | ACCESS systems/suites not synchronous with operational system | Will not be able to address this as part of porting project. Will need to wait until APS3 ROSE based systems. In the meantime, better documentation will help, as will a better source code tracking/building procedure - see discussion at start of this note. | Porting Group |
| 30 | NCI Training course on 22nd July | There are sessions for new users as well as experienced users. | Porting Group |
| 31 | Backup copy of solar files prior to its decommissioning | Easy to do. | Porting Group |
| 32 | $CWSHARE | Will still be required, though most shared scripts and utilities will be in /access/bin or /access/scripts. | Porting Group |
| 34 | Higher management's Porting plan | Will be of help for our reference. There is also a call for detailed estimates of resources required. Staff doing porting are asked to provide estimates of the resources needed, particularly in terms of time and %FTE. Any risks need to be highlighted. | rab, mjn |
| 35 | SMS | To be installed as a "module", as part of /apps. - Will be modularised. - Use new NMOC include files from Milton Woods, but CAWCR research may stay with their own set of include files. - Put "SMSOUT" into a different directory. | /apps group, Wen Ming |
| 36 | Veri Py | Part of some SMS suites. New version to be developed for Raijin. | mjn, Wen Ming |
| 37 | Plotting Systems | Part of some SMS suites. New version to be developed for Raijin. | mjn, Wen Ming |
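
Item 11 records that the scp transfer copied symbolic links in full and so duplicated data, which is why the transfer was redone with rsync. The sketch below is only a Python stand-in for that behaviour: it does not run scp or rsync, and uses `shutil.copytree` with and without link preservation on a small hypothetical tree (the file names `data.bin` and `latest.bin` are made up). It assumes a POSIX filesystem where symlinks can be created.

```python
import os
import shutil
import tempfile
from pathlib import Path


def tree_size(root: Path) -> int:
    """Total bytes of regular files under root; symlinks are not counted."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def build_source(root: Path) -> None:
    """Create one 1000-byte file and a relative symlink pointing at it."""
    root.mkdir()
    (root / "data.bin").write_bytes(b"x" * 1000)
    (root / "latest.bin").symlink_to("data.bin")


def compare_copies(workdir: Path) -> tuple:
    """Copy the same tree twice and return (followed_bytes, preserved_bytes)."""
    src = workdir / "src"
    build_source(src)
    followed = workdir / "followed"
    preserved = workdir / "preserved"
    # symlinks=False copies the content each link points to (duplicates data).
    shutil.copytree(src, followed, symlinks=False)
    # symlinks=True recreates the links themselves (no duplication).
    shutil.copytree(src, preserved, symlinks=True)
    return tree_size(followed), tree_size(preserved)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        followed_bytes, preserved_bytes = compare_copies(Path(tmp))
        print("links followed:", followed_bytes, "bytes")
        print("links preserved:", preserved_bytes, "bytes")
```

The copy that follows links holds the linked file twice, while the copy that preserves links holds it once. On a directory such as ~access, with many links into large data, the same effect would scale accordingly.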

EMERGING ISSUES

  • Hyperthreading, Turboboost, thread-settings for openmp libraries. These have been noted for both ngamai and raijin and will be investigated; they are not critical to the operational porting activity.

  • The NCI trac wiki was down for a day or two. It is not clear whether this was related to the NCI vmware issue or to a full disk. A question was asked about available disk space for the wiki.

POINTS/ISSUES THAT CAN BE OMITTED FROM FUTURE NOTES

| No. | ITEM | STATUS/COMMENTS | Contact Person |
|---|---|---|---|
| 1 | Oracle Handover of Ngamai to BoM | Done on 27th June | rab |
| 4 | Ngamai Configuration | Linux 6.2. Will be identical to solar, but with newer s/w versions. Generally, for all s/w on solar, there will be equivalent on ngamai, but no older version unless necessary. File system setup is similar, different h/w namespaces. There will be filesystem quota. | rab |
| 29 | STASHMaster File Regression | Not possible. Must be due to misunderstanding/mistake. | Porting Group |
| 33 | Suites/Applications to work regardless of user's default login shell and environment | Desirable. | Porting group, all developers |

Next Meeting is on Wed 11am, 31st July 9E Meeting room.