Notes for 2024-07-04 meeting
- date & time: Thu 4 July 2024 - 14:00 CEST (12:00 UTC)
- (every first Thursday of the month)
- venue: (online, see mail for meeting link, or ask in Slack)
- agenda:
- Quick introduction by new people
- EESSI-related meetings and events in last month
- Progress update per EESSI layer
- Update on build-and-deploy bot
- Update on EESSI production repository software.eessi.io
- Update on EESSI documentation
- Update on EESSI test suite
- EESSI as backend in Ramble
- Additional EESSI repositories: dev.eessi.io, riscv.eessi.io
- AWS/Azure sponsorship update
- Q&A
Slides
Meeting notes
(by Bob/Kenneth)
Quick introduction by new people
EESSI-related meetings in last month
(see slides)
- Deucalion team is interested in working together and making EESSI available, regular calls are set up
- Brainstorm with someone from AMD about adding AMD ROCm support to EESSI
- All information is available at https://gitlab.com/eessi/support/-/issues/71
- Getting access to MI300X instances in Azure will be very difficult
- Other (smaller?) AMD GPU instances will be publicly available soon
Progress update per EESSI layer
Filesystem layer
(see slides)
- CVMFS dashboard to keep an eye on CVMFS infrastructure
- internal dashboard initially
- we also want to expose some metrics to EESSI status page (like disk usage for Stratum-1 mirror server)
Compatibility layer
(see slides)
Software layer
(see slides)
- Lots of software has been added, also for the new CPU targets x86_64/amd/zen4 and aarch64/a64fx
- CP2K has a dependency (libxsmm) that doesn't work on Arm / with specific GCC versions
- Trying to make this an optional dependency
Build-and-deploy bot
(see slides)
- For some PRs / bot instances the bot suddenly started adding lots of messages, see e.g. https://github.com/EESSI/software-layer/pull/630#issuecomment-2205679779
- Still a bit unclear what was causing this, but it seems like smee is sending out the same event multiple times?
- We should open an issue and document what we saw, and see if we can come up with a way to make the bot more robust against this (like don't re-process the same event multiple times)
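The idea of not re-processing the same event could be sketched as follows. This is a minimal illustration, not the bot's actual code: it assumes that repeated events arrive with the same `X-GitHub-Delivery` header (the delivery ID GitHub attaches to each webhook delivery), which these notes do not confirm since the cause is still unclear. The names `SeenDeliveries` and `handle_event` are hypothetical.

```python
from collections import OrderedDict


class SeenDeliveries:
    """Remember recently seen delivery IDs (bounded; oldest entries are evicted first)."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self._seen = OrderedDict()

    def is_duplicate(self, delivery_id):
        """Return True if delivery_id was seen before; otherwise record it and return False."""
        if delivery_id in self._seen:
            self._seen.move_to_end(delivery_id)  # mark as recently seen
            return True
        self._seen[delivery_id] = True
        if len(self._seen) > self.max_size:
            self._seen.popitem(last=False)  # evict the oldest entry
        return False


def handle_event(headers, payload, seen, process):
    """Call process(payload) unless this delivery was already handled.

    headers is assumed to be a plain dict keyed by the exact header name;
    events without a delivery ID are always processed.
    """
    delivery_id = headers.get("X-GitHub-Delivery")
    if delivery_id is not None and seen.is_duplicate(delivery_id):
        return False  # repeated delivery: skip it
    process(payload)
    return True
```

With this in place, a second call to `handle_event` with the same delivery ID returns `False` without calling `process`. The in-memory store is lost on restart; a bot that needs to survive restarts would have to persist the IDs somewhere.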
software.eessi.io repository
(see slides)
EESSI documentation
(see slides)
- The available software page in the docs is automatically updated by a cronjob that opens a PR in case something is added
EESSI test suite
(see slides)
- Caspar is working on documentation for getting from a job script to a ReFrame test and to a portable ReFrame test
EESSI as backend in Ramble
(see slides)
Additional EESSI repositories: dev.eessi.io, riscv.eessi.io
(see slides)
- The ultimate goal here is to merge the RISC-V stack into the production repository
- It's too early to do this now, also because older toolchains don't work on RISC-V
- Maybe this can be considered for one of the next EESSI versions
- We're only building for riscv64/generic right now, more issues may pop up when we start building for RISC-V CPUs with vector instructions
- PR 618 revealed that the CI that tests the eessi_container.sh script has a bug (due to a wrong regular expression), but this can/will be easily fixed in the same PR.
AWS/Azure sponsored credits
(see slides)
Events
(see slides)
Q&A
- The monthly meeting in August will be skipped due to summer holidays
- interest in meeting on support for AMD ROCm by Hugo & Jurij
- will probably be scheduled some time in Sept'24
- Jurij tested EESSI on CentOS Stream 10, and things worked fine
- using the CVMFS nightly build for Fedora 40; building from source doesn't work yet (linking issue)
- any other interest in Varnish as proxy for CernVM-FS?