Sync meeting 2024 06 11 - multixscale/meetings GitHub Wiki
MultiXscale WP1+WP5 sync meetings
- Monthly, every 2nd Tuesday of the month at 10:00 CE(S)T
- Notes of previous meetings at https://github.com/multixscale/meetings/wiki
Next meetings
- Tue 11 June 2024 10:00 CEST
- Tue 9 July 2024 10:00 CEST
- planning to attend: Caspar, Bob
- on summer break: Kenneth, Lara, Thomas
- Tue 13 Aug 2024 10:00 CEST
- planning to attend: Kenneth, Lara, Caspar, Thomas
- on summer break: Bob
Agenda/notes 2024-06-11
attending:
- Neja (NIC)
- Alan (UB)
- Kenneth, Lara (UGent)
- Caspar, Casper, Maxim (SURF)
- Thomas & Richard (UiB)
- Bob, Pedro (RUG)
- Jean-Noël (UStutt)
- Julián (BSC)
- Eli, Susana, Nadia (HPCNow!)
- problems with shared drive
- cfr. incomplete progress reports 2024Q1 for WP1 (see drafts in drive uploaded by Satish) + WP5 (see Lara's email 6 May)
- works for Alan when logging in through incognito browser + logging in with personal Microsoft account
- who else has this problem?
- Kenneth, Susana, Jean-Noël, Rudolph, Pedro
- we can try to take a copy and create a new OneDrive
- 2024Q2 quarterly report
- try to get info in place end of June/early July
- Lara & Caspar will be mostly available in July
- if problems with OneDrive persist, send PMs info + bullet points with tasks to Caspar/Lara via email/Slack/HackMD
- Milestone 3 (M18 - June 2024, lead: UStuttgart)
- Milestone name: "First portable test run on two systems with different architectures (e.g. with and without accelerators)"
- Means of validation: "Performance and scalability plots available for the application on the two architectures"
- working on this using ESPResSo, as extra test in EESSI test suite
- see https://github.com/EESSI/test-suite/pull/144, using FFT test case
- working with JSC to make FFT communication 8-16x faster
- scalability for LJ test case should improve this year
- should also be added to ESPResSo test
- WP status updates
- [SURF] WP1 Developing a Central Platform for Scientific Software on Emerging Exascale Technologies
- [UGent] T1.1 Stable (EESSI) - D1.3 due M24 (Dec'24)
- more software, incl. Espresso 4.2.2
- dev.eessi.io => see notes + support issue #61
- would be a very interesting service for developers in scientific WPs => cross-cutting across technical & scientific PRs
- "looser" policy compared to software.eessi.io production repo
- devs can trigger their own builds
- pre-release builds accepted (specific commits)
- initially focused on Espresso & co
- could also be used as "dev" environment for software.eessi.io features (e.g. GPU support)
- if we're doing this on Azure, we should do it in a new subscription
- needs to be created by Martin @ SURF
- if done in AWS, Alan can do it
- GPU software => see notes + support issue #59
- Update bot to have GPU support [Thomas]
- Update archdetect to support CUDA compute capability [???]
- directory structure in software.eessi.io, for example software/x86_64/amd/zen2/accel/nvidia/cc80 [???]
- blocked by dev.eessi.io?
- we want to use this as a playground for GPU builds
- => can look into this during hackathon on Tue 18 June
- needs to be planned
- need to review description of Task 1.1, make sure all subtasks are covered
- => need to update project planning (Caspar, Kenneth)
- "we will benchmark software from the shared software stack and compare the performance against on-premise software stacks to identify potential performance limitations, ..."
- Espresso + LAMMPS + OpenFOAM + ALL(?) (MultiXscale), GROMACS (BioExcel)
- "increase stability of the shared software stack ... pro-actively by developing monitoring tools"
- proper monitoring for CVMFS network (S0 + S1s)
- for RUG?
- [RUG] T1.2 Extending support - D1.4 due M30 (June'25)
- Arm support fits here
- zen4 + sapphirerapids
- AMD ROCm
- lower impact, should we limit our efforts here?
- select apps, like PyTorch/TensorFlow
- should also look into Grace Hopper (JUPITER)
- [SURF] T1.3 Test suite - D1.5 due M30 (June'25)
- Milestone 3 for Espresso test
- [BSC] T1.4 RISC-V (starts M13)
- cfr. efforts by Bob & Julian, incl. riscv.eessi.io
- actively looking into adding more software, incl. Extrae
- lot of interest from EUPILOT project @ BSC
- [SURF] T1.5 Consolidation (starts M25 - Jan'25)
- (not started yet)
- [UGent] WP5 Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations
- (FINISHED M12 [UGent] T5.1 Support portal)
- [SURF] T5.2 Monitoring/testing, D5.3 due M30 (June'25)
- discussions with SURF + initial work done on dashboard
- working on two dashboards: one detailed, one with overview
- (FINISHED M12 [UiB] T5.3 community contributions (bot))
- [UGent] T5.4 support/maintenance - D5.4 due M48 (Dec'26)
- support portal + rotation working well
- support issues in April+May
- Opened: 12 issues
- Closed: 10 issues
- total: 69 issues (26 open, 43 closed)
- bot release
- [UB] WP6 Community outreach, education, and training
- [Kenneth, Lara, Pedro] EasyBuild User Meeting (EUM'24), 23-25 April 2024 @ Umeå, Sweden
- [Kenneth, Lara, Eli] activity at ISC'24, see https://eessi.io/docs/blog/2024/05/17/isc24
- [Eli] Teratec (29-30 May'24)
- poster
- demo for Sanofi, who were quite interested
- [Thomas] presentation @ Norwegian Bioinformatic Days on making bioinformatics workflows easy (using Nextflow)
- they use a lot of containers, but can also use different backends
- backend for EESSI could be interesting
- similar work was done in BioHackathon Europe (https://biohackathon-europe.org)
- [Lara] EESSI promotion @ DH Benelux in Leuven (Belgium), 4-7 June'24
- some people were interested, like getting students easy access to software installations
- [Matej] presenting poster at ASHPC this week
- [Alan] invited speaker for Nordic Industry Days (early Sept'24)
- submit BoF proposal on EESSI for SC24 (Atlanta, US)
- HPCNow! will be attending
- tutorial submission done
- CernVM-FS workshop (Sept'24, Geneva)
- submission due this month
- EESSI is in default CernVM-FS configuration
- could cover work on dev.eessi.io
- deliverable due: D6.2 (M24 - Dec'24), D6.3 (M30 - June'25)
- [HPCNow] WP7 Dissemination, Exploitation & Communication
- T7.1 Scientific applications provisioned on demand (lead: HPCNow)
- ...
- Task 7.2 - Dissemination and communication activities (lead: NIC)
- more EESSI stickers
- via HPCNow?
- Neja will ask at NIC
- new section in MultiXscale website: https://www.multixscale.eu/dissemination
- interview with Matej being worked on by Susana
- will try to include it in newsletter of July
- Task 7.3 - Sustainability (lead: NIC, started M18)
- Legal entity for EESSI needs to be looked into?
- subcontracting money available for this
- we should explore options ourselves a bit first
- Task 7.4 - Industry-oriented training activities (lead: HPCNow)
- ...
- [NIC] WP8 (Management and Coordination)
- reply to review report (see Word doc in shared drive, 1st periodic report | Results of the Review)
- amendment in the works?
- Neja will start looking into that after holiday in July
- next General Assembly meeting
- 23-24 Jan'25 in Barcelona/Sitges
- coupled to HiPEAC'25 (20-22 Jan 2025)
- https://www.hipeac.net/2025/barcelona
- call for workshops/tutorials at HiPEAC'25
- https://www.hipeac.net/2025/barcelona/#/call/
- deadline: 1 July
- Eli working on workshop submission for Women in HPC/CoE's
- two deliverables due 5th of July (in response to project review)
- one on co-design (by Alan)
- focus on collaborating with projects like EUPILOT, EPI, EUPEX (rather than contacting vendors directly)
- one for scientific WPs
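Sketch for the GPU directory structure discussed under T1.1 (example path software/x86_64/amd/zen2/accel/nvidia/cc80): the snippet below shows how such a path could be derived from a CPU target and a CUDA compute capability. The function name and the input format ("8.0" becomes "cc80") are assumptions made for illustration only; this is not how archdetect or the bot currently work, and the final layout was still marked [???] in the notes.

```python
from pathlib import PurePosixPath


def gpu_install_subdir(cpu_target: str, compute_capability: str) -> str:
    """Build an accelerator subdirectory for a given CPU target.

    Hypothetical helper, not part of EESSI.
    cpu_target: CPU part of the path, e.g. "x86_64/amd/zen2"
    compute_capability: CUDA compute capability as "major.minor", e.g. "8.0"
    """
    major, _, minor = compute_capability.partition(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"unexpected compute capability: {compute_capability!r}")
    # "8.0" is written as "cc80", matching the example path in the notes
    cc_dir = f"cc{major}{minor}"
    return str(PurePosixPath("software", cpu_target, "accel", "nvidia", cc_dir))


if __name__ == "__main__":
    print(gpu_install_subdir("x86_64/amd/zen2", "8.0"))
```

Keeping the accelerator part below the CPU target (rather than in a separate tree) follows the example path from the notes; whether that is the layout that will be adopted remains open.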
Notes
- CI/CD call for EuroHPC
- is 100% funded (not 50/50 EU/countries)
- not published yet
- request for success story by CASTIEL2
- ideally end of June, at the latest by end of August
- involvement of SKA in EESSI
- Neja is talking to Caspar on this
- deployment of EESSI on Vega/Karolina
- maybe something on Deucalion
- at best by mid Aug'24
- collaboration with AWS/Azure
- getting EESSI in AWS ParallelCluster
- next general MultiXscale meeting
- Tue 25 June 2024, 10:00-11:00 CEST
- hosted by Alan
- agenda point: update on pairing of technical + scientific WPs
- (Susana) suggestions for blog are welcome
- something on leveraging EESSI on GitHub Actions to run CI
- using GROMACS?
- we should also have something on CD aspect
- Alan has something that may be useful
- something on progress in RISC-V
Notes of previous meetings
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2024-05-14
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2024-04-09
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2024-03-12
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2024-02-13
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2024-01-09
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-12-12
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-11-14
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-10-10
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-09-12
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-08-08
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-07-11
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-06-13
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-05-09
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-04-11
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-03-14
- https://github.com/multixscale/meetings/wiki/Sync-meeting-2023-02-14
- https://github.com/multixscale/meetings/wiki/sync-meeting-2023-01-10
Template for sync meeting notes
TO COPY-PASTE
- overview of MultiXscale planning
- WP status updates
- [SURF] WP1 Developing a Central Platform for Scientific Software on Emerging Exascale Technologies
- [UGent] T1.1 Stable (EESSI) - due M12+M24
- ...
- [RUG] T1.2 Extending support (starts M9, due M30)
- [SURF] T1.3 Test suite - due M12+M24
- ...
- [BSC] T1.4 RISC-V (starts M13)
- [SURF] T1.5 Consolidation (starts M25)
- [UGent] WP5 Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations
- [UGent] T5.1 Support portal - due M12
- ...
- [SURF] T5.2 Monitoring/testing (starts M9)
- [UiB] T5.3 community contributions (bot) - due M12
- ...
- [UGent] T5.4 support/maintenance (starts M13)
- [UB] WP6 Community outreach, education, and training
- ...
- [HPCNow] WP7 Dissemination, Exploitation & Communication
- ...