Sync meeting 2023-11-14 (multixscale/meetings GitHub Wiki)

attending: Thomas (UiB), Kenneth (UGent), Xin (SURF), Bob (RUG), Lara (UGent), Danilo (HPCNow!), Caspar (SURF), Susana (HPCNow!), Julián (BSC), Helena (HPCNow!), Erica (HPCNow!), Satish (SURF)

excused: Alan (UB), Elisabeth (HPCNow!)

  • tasks with M12 milestone need to be done by then (so we can report on them in deliverables)
    • https://github.com/orgs/multixscale/projects/1
    • T1.1 (UGent) Providing a stable, optimized, shared scientific software stack with support for established system architectures
      • M1-M24
      • software.eessi.io repo is in place, ready to start building software layer
        • new setup with dedicated Stratum-0 server @ RUG + Stratum-1 mirror servers in AWS (eu-central) + Azure (us-east)
        • corresponding issues in planning need to be closed => Bob?
        • bot configuration needs to be updated accordingly to start building
        • initial build of compat layer is ready, but security updates should be installed before we start building software layer (Thomas?)
        • focus on recent toolchains (2022b + 2023a, based on GCC 12.x) and software versions
      • PR opened to add software.eessi.io to default CernVM-FS configuration
      • good progress on GPU support by Alan, see EESSI software-layer issue #375
        • Kenneth will follow up with Alan once he's back from SC23
      • software
      • relevant progress to mention in D1.1 (M12)
    • T1.2 (RUG) Extending support of the shared software stack to emerging system architectures
      • M10-M30
      • just getting started, e.g. building/testing on Arm; hitting problems with TensorFlow, LAMMPS, ...
      • also relevant progress to mention in D1.1 (M12)
    • T1.3 (SURF) Design and creation of a software test suite and facilitating CI for software developers
      • M1-M30
      • EESSI test suite
        • initial release v0.1.0 was made on 5 Oct'23
      • bi-weekly meetings, see notes at https://github.com/EESSI/meetings/wiki
      • OSU test WIP (PR #54)
      • GPU test is part of the effort by Alan
        • we can already develop the GPU test and run it on our own software stack
        • via CUDA samples, see easyconfig PR #18994 + PR #18998
      • can release v0.2 or v0.1.1 of test suite once currently open PRs are merged
        • OSU test (?)
        • updated CI scripts (PR #93)
        • extra tags
        • minor code refactoring
      • reached out to ReFrame developers to see how their hpctestlib effort and the EESSI test suite align
      • relevant for D1.2 (M12)
    • T5.1 (UGent) Setting up a support portal
    • T5.2 (SURF) Monitoring and testing of the central shared software stack
      • M10-M30
      • initial meeting with SURF visualisation team on creating a dashboard
      • ongoing discussion on how to collect/store performance results and define performance reference values
      • should look into generating an overview of available software installations in EESSI
    • T5.3 (UiB) Facilitating community contributions to the central software stack
      • M1-M12
      • bot
        • some minor bug fixes/changes done in develop branch, ready for v0.1.1 bugfix release
        • open PRs to add initial testing step to bot workflow
          • see also software-layer PR
          • would be nice to have this included in a bot release by end of 2023, so we can include it in deliverable
        • new Slurm cluster set up with Magic Castle to build/deploy for EESSI software layer (pilot + software.eessi.io)
        • should look into crashes that happen in bot job manager
      • contribution policy
      • relevant for D5.1 (M12)
    • WP6
      • 5th "Code of the Month" webinar for NCCs and CoEs (internal session) on Tue 28 Nov'23 (11:00 CET, ~1h): "EESSI by MultiXscale"
      • Mon 4 Dec'23 (13:30-17:00 CET): "Best Practices for CernVM-FS in HPC"
        • 100 registrations already!
        • could be nice to inform people that they can follow along on a VM if they want to
      • Tue 5 Dec'23 (14:30 - 16:30 CET): "Streaming Optimised Scientific Software: an Introduction to EESSI"
        • 27 registrations so far, should promote this more?
  • status check on deliverables M12
    • D1.1 (RUG - Bob/Pedro) Report on shared software stack prototype
      • high-level structure of deliverable (sections) is there
      • need to start writing out section contents
      • would be good to get feedback on high-level structure ASAP (Alan, Kenneth)
    • D1.2 (SURF - Caspar) Plan for the design of a portable test suite
      • initial high-level overview in place
      • intro + first part written
      • would be good to get feedback on this early draft (Alan, UGent)
    • D5.1 (UiB) Community contribution policy and GitHub App
      • structure is there and has been looked at by Alan
      • getting closer to complete draft, hopefully before 24 Nov
      • more of a guiding document, companion to bot releases + contrib policy
    • D5.2 (UGent) Support portal
      • good feedback from Alan on structure
      • sections 1+2+4 are ready for review
      • section 3 will be removed
      • sections 5+6 need more work
    • to give feedback in Overleaf project
      • use \comment{...} and \todo{...}
    • may need to set up another sync meeting focused on deliverables
  • CASTIEL2
    • (Kenneth, Alan) trying to make sure that CernVM-FS is at least mentioned as an alternative to GitLab Runner for Continuous Deployment in the CASTIEL2 deliverable due M12 (CASTIEL2 D5.8)
  • lightning talk on EESSI at SC23 @ "BoF for Scientific Software and the People who Make It Happen"
  • Susana: new "In the Media" section on website is coming
    • can be used for quarterly reports, etc.
    • anything related to MultiXscale in traditional media (radio, podcast, newspaper, ...)
    • new edition of newsletter is being prepared
      • things to mention are welcome => contact Susana via email