Notes for 2024-07-04 meeting

  • date & time: Thu 4 July 2024 - 14:00 CEST (12:00 UTC)
    • (every first Thursday of the month)
  • venue: (online, see mail for meeting link, or ask in Slack)
  • agenda:
    • Quick introduction by new people
    • EESSI-related meetings and events in last month
    • Progress update per EESSI layer
    • Update on build-and-deploy bot
    • Update on EESSI production repository software.eessi.io
    • Update on EESSI documentation
    • Update on EESSI test suite
    • EESSI as backend in Ramble
    • Additional EESSI repositories: dev.eessi.io, riscv.eessi.io
    • AWS/Azure sponsorship update
    • Q&A

Slides

Meeting notes

(by Bob/Kenneth)

Quick introduction by new people

EESSI-related meetings in last month

(see slides)

  • The Deucalion team is interested in working together and making EESSI available; regular calls are set up
  • Brainstorm with someone from AMD about adding AMD ROCm support to EESSI

Progress update per EESSI layer

Filesystem layer

(see slides)

  • CVMFS dashboard to keep an eye on CVMFS infrastructure
    • internal dashboard initially
    • we also want to expose some metrics on the EESSI status page (such as disk usage of the Stratum-1 mirror servers)

Compatibility layer

(see slides)

Software layer

(see slides)

  • Lots of software has been added, also for the new CPU targets x86_64/amd/zen4 and aarch64/a64fx
  • CP2K has a dependency (libxsmm) that doesn't work on Arm / with specific GCC versions
    • Trying to make this an optional dependency

Build-and-deploy bot

(see slides)

  • For some PRs / bot instances the bot suddenly started adding lots of messages, see e.g. https://github.com/EESSI/software-layer/pull/630#issuecomment-2205679779
    • Still a bit unclear what was causing this, but it seems like smee is sending out the same event multiple times?
    • We should open an issue to document what we saw, and see if we can make the bot more robust against this (for example, by not re-processing the same event multiple times)
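
One possible way to avoid re-processing the same event is sketched below in Python. This is only an illustration, not the bot's actual code: it assumes that repeated events arrive with the same value in GitHub's `X-GitHub-Delivery` header, which has not been confirmed since the cause of the duplicates is still unclear. The names `DeliveryDeduplicator` and `handle_event` are hypothetical.

```python
from collections import OrderedDict


class DeliveryDeduplicator:
    """Remember recently seen webhook delivery IDs so repeats can be skipped."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._seen = OrderedDict()

    def is_duplicate(self, delivery_id):
        """Return True if delivery_id was seen before; otherwise record it."""
        if delivery_id in self._seen:
            return True
        self._seen[delivery_id] = None
        # Drop the oldest entry once the cache is full, to bound memory use.
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)
        return False


def handle_event(headers, payload, dedup, process):
    """Call process(payload) unless this delivery was already handled.

    Assumption: duplicates carry the same X-GitHub-Delivery header value.
    """
    delivery_id = headers.get("X-GitHub-Delivery")
    if delivery_id is not None and dedup.is_duplicate(delivery_id):
        return False  # same delivery seen before: skip it
    process(payload)
    return True
```

The cache lives in memory only, so it would be lost when a bot instance restarts. If the duplicates turn out to have different delivery IDs, a key derived from the event payload would be needed instead.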

software.eessi.io repository

(see slides)

EESSI documentation

(see slides)

  • The available software page in the docs is automatically updated by a cronjob that opens a PR in case something is added

EESSI test suite

(see slides)

  • Caspar is working on documentation for getting from a job script to a ReFrame test and to a portable ReFrame test

EESSI as backend in Ramble

(see slides)

Additional EESSI repositories: dev.eessi.io, riscv.eessi.io

(see slides)

  • The ultimate goal here is to merge the RISC-V stack into the production repository
    • It's too early to do this now, partly because older toolchains don't work on RISC-V
    • Maybe this can be considered for one of the next EESSI versions
  • We're only building for riscv64/generic right now, more issues may pop up when we start building for RISC-V CPUs with vector instructions
  • PR 618 revealed that the CI that tests the eessi_container.sh script has a bug (due to a wrong regular expression), but this can/will be easily fixed in the same PR.

AWS/Azure sponsored credits

(see slides)

Events

(see slides)

Q&A

  • The monthly meeting in August will be skipped due to summer holidays
  • interest in meeting on support for AMD ROCm by Hugo & Jurij
    • will probably be scheduled some time in Sept'24
  • Jurij tested EESSI on CentOS Stream 10, and things worked fine
    • using the CVMFS nightly build for Fedora 40; building from source doesn't work yet (linking issue)
  • any other interest in Varnish as proxy for CernVM-FS?