access_NewSun_009 - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki


<h1  style="text-align: center; color: green"> CAWCR-BoM ACCESS NWP Ngamai Migration Working Group</h1>

CAWCR-BoM ACCESS-NWP Ngamai Porting Working Group Meeting Notes

Meeting 9: Wednesday 4th September 2013, 9E Meeting Room

Present: Robin Bowen, Joerg Henrichs, Ed Habjan, Ilia Bermous, Jim Fraser, Joan Fernon, Wenming Lu, Martin Dix, Chris Tingwell, Michael Naughton, Asri Sulaiman, Yi Xiao

Apologies: Zhihong Li, Robert Jukic, Ivor Blockley, Peter Steinle


Agenda

  • Follows notes from previous meeting.

Suites

AG1

  • Disk usage
    • Current disk usage is close to the maximum.
    • The suite, which has run to mid-August, can be stopped (verification already done with acceptable results).
    • Consider deleting pi files and re-running the suite to regenerate them as the need arises.
    • ACTION: Ongoing monitoring and management.
  • Plots
    • Verification essentially OK
  • Gary's Diagnostics
    • Exploratory task
    • ACTION: Gary to report.
  • NMOC AG1
    • Joan's suite has started cycling.
    • Started with input from 25/6, now up to 30/6. Running several days' worth of runs per day.
    • MARS7 is not yet ready for archiving.
    • opdata capacity is 80 TB, sufficient for 2 months' worth of output.
    • pi files will be deleted once MARS archiving is done.
    • MARS7 is expected to be available by next week. It will be a disk-only system. With 40 TB capacity and only a subset of the fields from the pi files to be archived, it should be sufficient for several months of archiving by all the suites.
    • New SAM capacity will be discussed with Richard Oxbrow.
    • LSDSS disks are already mounted on ngamai.
    • Verification of the NMOC AG1 suite requires MARS, so it will need to wait until MARS is available.
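
As a rough cross-check of the capacity figures above, the following Python sketch converts "80 TB lasts 2 months" into an implied daily output rate and estimates how long a 40 TB disk would last if only part of the output is archived. The 61-day length for "2 months" and the archived fraction are illustrative assumptions, not figures from the meeting, and the estimate covers a single output stream at that rate rather than all suites.

```python
def daily_rate_tb(capacity_tb, days):
    """Implied output rate in TB/day if capacity_tb lasts for `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return capacity_tb / days


def days_until_full(capacity_tb, rate_tb_per_day, archived_fraction=1.0):
    """Days until a disk fills when only a fraction of the output is kept."""
    if rate_tb_per_day <= 0:
        raise ValueError("rate_tb_per_day must be positive")
    if not 0.0 < archived_fraction <= 1.0:
        raise ValueError("archived_fraction must be in (0, 1]")
    return capacity_tb / (rate_tb_per_day * archived_fraction)


OPDATA_TB = 80.0         # from the notes
MARS7_TB = 40.0          # from the notes
TWO_MONTHS_DAYS = 61     # assumption: "2 months" taken as 61 days
ASSUMED_FRACTION = 0.25  # hypothetical share of pi-file fields archived

rate = daily_rate_tb(OPDATA_TB, TWO_MONTHS_DAYS)
print(f"Implied output rate: {rate:.2f} TB/day")
print(f"MARS7 would last about "
      f"{days_until_full(MARS7_TB, rate, ASSUMED_FRACTION):.0f} days "
      f"under the assumed archived fraction")
```

Changing ASSUMED_FRACTION shows how sensitive the "several months" estimate is to the size of the archived subset.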

AR1

  • Has now run from 25 Jun to 8 Aug. Will stop at 11 Aug.
  • Joan is close to starting NMOC's AR1
    • consolidating directories
    • save to svn
  • ciwt has done 1 month of preliminary verification
    • Virtually identical results to solar, apart from expected variation in wind biases.
    • MSLP SI absolutely identical
    • OPS and VAR performance also virtually identical.
    • Can declare the suite as OK.
  • No need to continue further with CAWCR's ACCESS-R on ngamai.
  • Gary's plots will also be useful on ACCESS-R results. Mike to find someone to take this up.
  • The suite has also incorporated speed-ups from Ilia and Joerg.
  • Speed of reconfiguration is still a concern, with large runtime variations
    • R12 dump files are much larger than global suite's.
    • Reconfiguration may take between 3 and 20 minutes.
    • This problem was not observed on solar.
  • Timing information on individual tasks from Joan's runs.
  • Frames are being used for LBC generation in Joan's suite
  • No benefit has been observed in using half-hourly rather than hourly LBCs in the ACCESS-C suite
    • Should update run settings to use hourly output: significant saving of disk space and run time.
    • Joan to coordinate with Chris to implement the change
    • Preferably done before start of Joan's ACCESS-R suite.

Executables

  • Joerg has been investigating run time variations
    • Initial runs have variation of up to 100%
    • Over hundreds of runs, variation is about 10%
    • There appears to be a messaging bottleneck
      • Modified messages to reduce size
        • Remove combining of several halo exchanges.
    • The problem does not occur on raijin; a difference in the IB network is suspected.
  • Ilia has tried runs with "sleep" between runs
    • No improvement observed.
    • Second run still has up to 100% variation.
    • First run can have up to 20% variation.
  • Ed said Oracle will investigate several areas
    • Source code
    • MPI library
    • System configurations
      • buffers
      • Infiniband drivers.
  • Intel MPI offers a great number of runtime settings.
  • Open MPI appears to offer fewer, but according to Joerg it also has a large number of changeable runtime settings.
  • It may be worthwhile to communicate with Paul Selwood of UKMO as well as MPI developers.
  • Joerg will also look at re-configuration issues next week.
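
The variation figures quoted above (up to 100% for individual runs, about 10% over hundreds of runs) depend on how "variation" is measured, which the notes do not state. The Python sketch below shows two possible measures, assuming wall-clock times in seconds: the peak spread relative to the fastest run, and the coefficient of variation. The function names and the sample timings are made up for illustration and are not results from ngamai.

```python
import statistics


def peak_spread_pct(times):
    """Slowest run relative to the fastest run, as a percentage excess."""
    fastest = min(times)
    return 100.0 * (max(times) - fastest) / fastest


def coeff_of_variation_pct(times):
    """Sample standard deviation as a percentage of the mean run time."""
    return 100.0 * statistics.stdev(times) / statistics.mean(times)


# Made-up wall-clock times (seconds) for repeated runs of one executable.
timings = [600.0, 630.0, 1200.0, 615.0, 640.0]

print(f"Peak spread:              {peak_spread_pct(timings):.1f}%")
print(f"Coefficient of variation: {coeff_of_variation_pct(timings):.1f}%")
```

A single slow outlier dominates the peak spread but has less effect on the coefficient of variation, which is one way a "100%" figure for a few runs and a "10%" figure over many runs could both describe the same behaviour.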

AC1

  • Now running July to present.
  • It is possible to compare to legacy runs, but this is not straightforward.
  • Differences observed in trial verifications of March/April results
  • Due to ntile settings?
  • Convective settings should be identical, different in re-runs.
  • No difference apart from machine for July.
  • Holly doing rainval plots
  • Chris Bridge & Xiaoxi Wu to run obs verification on surface fields.
  • Run Gary's plots

UIs

  • Basically OK; minor issues are still being ironed out.
  • To start planning the migration of all the UIs from solar to ngamai, together with SVN repositories and Trac databases.

OTHER BUSINESS

  • This porting meeting will now be conducted every fortnight.
    • update and revisit issues list for next meeting
    • review status of project documentation
  • It is now appropriate to broaden the scope to cover migrating everything that currently runs on solar, such as research suites.
    • Look to reconvene UM Working Group meetings.
  • Find out MARS status from Tan Le
  • Follow up on "Build Process"
  • links to preliminary documentation added to new section in https://trac.nci.org.au/trac/access/wiki/NewSun

NEXT MEETING: Wed 18th September, 11am, 9E Meeting Room.


[ 8/9/2013 ] azs, first cut.
[ 8/9/2013 ] rab, fix some typos.
[ 8/9/2013 ] mjn, few changes.
