Sync meeting 2024 06 11

MultiXscale WP1+WP5 sync meetings


Next meetings

  • Tue 11 June 2024 10:00 CEST
  • Tue 9 July 2024 10:00 CEST
    • planning to attend: Caspar, Bob
    • on summer break: Kenneth, Lara, Thomas
  • Tue 13 Aug 2024 10:00 CEST
    • planning to attend: Kenneth, Lara, Caspar, Thomas
    • on summer break: Bob

Agenda/notes 2024-06-11

attending:

  • Neja (NIC)
  • Alan (UB)
  • Kenneth, Lara (UGent)
  • Caspar, Casper, Maxim (SURF)
  • Thomas & Richard (UiB)
  • Bob, Pedro (RUG)
  • Jean-Noël (UStutt)
  • Julián (BSC)
  • Eli, Susana, Nadia (HPCNow!)
  • problems with shared drive
    • cfr. incomplete progress reports 2024Q1 for WP1 (see drafts in drive uploaded by Satish) + WP5 (see Lara's email 6 May)
    • works for Alan when logging in through incognito browser + logging in with personal Microsoft account
      • who else has this problem?
        • Kenneth, Susana, Jean-Noël, Rudolph, Pedro
    • we can try to take a copy and create a new OneDrive
  • 2024Q2 quarterly report
    • try to get info in place end of June/early July
    • Lara & Caspar will be mostly available in July
    • if problems with OneDrive persist, send PMs info + bullet points with tasks to Caspar/Lara via email/Slack/HackMD
  • Milestone 3 (M18 - June 2024, lead: UStuttgart)
    • Milestone name: "First portable test run on two systems with different architectures (e.g. with and without accelerators)"
    • Means of validation: "Performance and scalability plots available for the application on the two architectures"
    • working on this using ESPResSo, as extra test in EESSI test suite
  • WP status updates
    • [SURF] WP1 Developing a Central Platform for Scientific Software on Emerging Exascale Technologies
      • [UGent] T1.1 Stable (EESSI) - D1.3 due M24 (Dec'24)
        • more software, incl. Espresso 4.2.2
        • dev.eessi.io => see notes + support issue #61
          • would be very interesting service for developers in scientific WPs => cross-cutting across technical & scientific PRs
          • "looser" policy compared to software.eessi.io production repo
          • Devs can trigger their own builds
          • pre-release builds accepted (specific commits)
          • initially focused on Espresso & co
          • could also be used as "dev" environment for software.eessi.io features (e.g. GPU support)
          • if we're doing this on Azure, we should do it in a new subscription
            • needs to be created by Martin @ SURF
            • if done in AWS, Alan can do it
        • GPU software => see notes + support issue #59
          • Update bot to have GPU support [Thomas]
          • Update archdetect to support CUDA compute capability [???]
          • directory structure in software.eessi.io, for example software/x86_64/amd/zen2/accel/nvidia/cc80 [???]
          • blocked by dev.eessi.io?
            • we want to use this as a playground for GPU builds
            • => can look into this during hackathon on Tue 18 June
          • needs to be planned
        • need to review description of Task 1.1, make sure all subtasks are covered
        • "we will benchmark software from the shared software stack and compare the performance against on-premise software stacks to identify potential performance limitations, ..."
          • Espresso + LAMMPS + OpenFOAM + ALL(?) (MultiXscale), GROMACS (BioExcel)
        • "increase stability of the shared software stack ... pro-actively by developing monitoring tools"
          • proper monitoring for CVMFS network (S0 + S1s)
          • for RUG?
      • [RUG] T1.2 Extending support - D1.4 due M30 (June'25)
        • Arm support fits here
        • zen4 + sapphirerapids
        • AMD ROCm
          • lower impact, should we limit our efforts here?
          • select apps, like PyTorch/TensorFlow
        • should also look into Grace Hopper (JUPITER)
      • [SURF] T1.3 Test suite - D1.5 due M30 (June'25)
        • Milestone 3 for Espresso test
      • [BSC] T1.4 RISC-V (starts M13)
        • cfr. efforts by Bob & Julián, incl. riscv.eessi.io
        • actively looking into adding more software, incl. Extrae
        • lot of interest from EUPILOT project @ BSC
      • [SURF] T1.5 Consolidation (starts M25 - Jan'25)
        • (not started yet)
    • [UGent] WP5 Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations
      • (FINISHED M12 [UGent] T5.1 Support portal)
      • [SURF] T5.2 Monitoring/testing, D5.3 due M30 (June'25)
        • discussions with SURF + initial work done on dashboard
        • working on two dashboards: one detailed, one with overview
      • (FINISHED M12 [UiB] T5.3 community contributions (bot))
      • [UGent] T5.4 support/maintenance - D5.4 due M48 (Dec'26)
        • support portal + rotation working well
        • support issues in April+May
          • Opened: 12 issues
          • Closed: 10 issues
          • total: 69 issues (26 open, 43 closed)
        • bot release
    • [UB] WP6 Community outreach, education, and training
      • [Kenneth, Lara, Pedro] EasyBuild User Meeting (EUM'24), 23-25 April 2024 @ Umeå, Sweden
      • [Kenneth, Lara, Eli] activity at ISC'24, see https://eessi.io/docs/blog/2024/05/17/isc24
      • [Eli] Teratec (29-30 May'24)
        • poster
        • demo for Sanofi, who were quite interested
      • [Thomas] presentation @ Norwegian Bioinformatics Days on making bioinformatics workflows easy (using Nextflow)
        • they use a lot of containers, but can also use different backends
          • backend for EESSI could be interesting
        • similar work was done in BioHackathon Europe (https://biohackathon-europe.org)
      • [Lara] EESSI promotion @ DH Benelux in Leuven (Belgium), 4-7 June'24
        • some people were interested, e.g. in giving students easy access to software installations
      • [Matej] presenting poster at ASHPC this week
      • [Alan] invited speaker for Nordic Industry Days (early Sept'24)
      • submit BoF proposal on EESSI for SC24 (Atlanta, US)
        • HPCNow! will be attending
        • tutorial submission done
      • CernVM-FS workshop (Sept'24, Geneva)
        • submission due this month
        • EESSI is in default CernVM-FS configuration
        • could cover work on dev.eessi.io
      • deliverable due: D6.2 (M24 - Dec'24), D6.3 (M30 - June'25)
    • [HPCNow] WP7 Dissemination, Exploitation & Communication
      • T7.1 Scientific applications provisioned on demand (lead: HPCNow)
        • ...
      • Task 7.2 - Dissemination and communication activities (lead: NIC)
        • more EESSI stickers
          • via HPCNow?
          • Neja will ask at NIC
        • new section in MultiXscale website: https://www.multixscale.eu/dissemination
          • interview with Matej being worked on by Susana
          • will try to include it in newsletter of July
      • Task 7.3 - Sustainability (lead: NIC, started M18)
        • Legal entity for EESSI needs to be looked into?
        • subcontracting money available for this
        • we should explore options ourselves a bit first
      • Task 7.4 - Industry-oriented training activities (lead: HPCNow)
        • ...
    • [NIC] WP8 (Management and Coordination)
      • reply to review report (see Word doc in shared drive, 1st periodic report | Results of the Review)
      • amendment in the works?
        • Neja will start looking into that after holiday in July
      • next General Assembly meeting
      • two deliverables due 5th of July (in response to project review)
        • one on co-design (by Alan)
          • focus on collaborating with projects like EUPILOT, EPI, EUPEX (rather than contacting vendors directly)
        • one for scientific WPs
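
Regarding the GPU directory structure discussed under T1.1 above, a minimal sketch of how such a path could be composed from a CPU target and a CUDA compute capability. This only illustrates the example path given in the notes (software/x86_64/amd/zen2/accel/nvidia/cc80); the function name `accel_subdir` and the mapping of "8.0" to "cc80" are assumptions, not the actual archdetect or bot implementation, and the layout itself is still an open item.

```python
from pathlib import PurePosixPath


def accel_subdir(cpu_target: str, vendor: str, compute_capability: str) -> str:
    """Compose a software subdirectory for an accelerator build.

    Hypothetical helper: mirrors the example layout from the notes,
    <prefix>/<cpu_target>/accel/<vendor>/cc<capability without dot>.
    """
    # Assumption: compute capability "8.0" is written as "cc80" in the path
    cc = compute_capability.replace(".", "")
    if not cc.isdigit():
        raise ValueError(f"unexpected compute capability: {compute_capability!r}")
    path = PurePosixPath("software") / cpu_target / "accel" / vendor / f"cc{cc}"
    return str(path)


print(accel_subdir("x86_64/amd/zen2", "nvidia", "8.0"))
```

Keeping the accelerator part below the CPU target means a CPU-only lookup is unaffected, and a GPU-aware lookup only needs to append the `accel/...` suffix.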

Notes

  • CI/CD call for EuroHPC
    • is 100% funded (not 50/50 EU/countries)
    • not published yet
  • request for success story by CASTIEL2
    • ideally by end of June, at the latest by end of August
    • involvement of SKA in EESSI
      • Neja is talking to Caspar on this
    • deployment of EESSI on Vega/Karolina
    • maybe something on Deucalion
      • at best by mid Aug'24
    • collaboration with AWS/Azure
      • getting EESSI in AWS ParallelCluster
  • next general MultiXscale meeting
    • Tue 25 June 2024, 10:00-11:00 CEST
    • hosted by Alan
    • agenda point: update on pairing of technical + scientific WPs
  • (Susana) suggestions for blog are welcome
    • something on leveraging EESSI on GitHub Actions to run CI
      • using GROMACS?
      • we should also have something on CD aspect
      • Alan has something that may be useful
    • something on progress in RISC-V

Notes of previous meetings


Template for sync meeting notes

TO COPY-PASTE

  • overview of MultiXscale planning
  • WP status updates
    • [SURF] WP1 Developing a Central Platform for Scientific Software on Emerging Exascale Technologies
      • [UGent] T1.1 Stable (EESSI) - due M12+M24
        • ...
      • [RUG] T1.2 Extending support (starts M9, due M30)
      • [SURF] T1.3 Test suite - due M12+M24
        • ...
      • [BSC] T1.4 RISC-V (starts M13)
      • [SURF] T1.5 Consolidation (starts M25)
    • [UGent] WP5 Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations
      • [UGent] T5.1 Support portal - due M12
        • ...
      • [SURF] T5.2 Monitoring/testing (starts M9)
      • [UiB] T5.3 community contributions (bot) - due M12
        • ...
      • [UGent] T5.4 support/maintenance (starts M13)
    • [UB] WP6 Community outreach, education, and training
      • ...
    • [HPCNow] WP7 Dissemination, Exploitation & Communication
      • ...