Notes from EasyBuild maintainer summit 2023 - easybuilders/easybuild GitHub Wiki

EasyBuild maintainer summit 2023 (Mon 27 Feb'23 + Wed 1 Mar'23)

Part I - Mon 27 Feb 2023

  • attending:
    • Adam, Åke, Alan, Alex, Bart, Caspar, Davide, Jasper, Kenneth, Lars, Sam, Sebastian, Simon

EasyBuild Docs

  • review cycle => could be added to non-easyconfigs MotW?
    • typically few incoming issues/PRs for non-easyconfigs MotW
    • how to track who updated what and assign new weekly documentation tasks?
      • source documents can be simply tracked by URL of final web page
      • timestamp of source document tells how long ago it was updated
      • single issue for all documents? or github project? > try issue per page, incl. checklist
  • shall we systematically update documentation on new PRs?
    • on framework yes, either before or after merge of PR
    • at least open an issue to update docs later
  • documentation is quite technical, shall we integrate the tutorial into it to have a less steep entry point?
    • making docs more approachable/easier to digest could be an item on the checklist for (1st) review cycle
    • the landing page of the docs already has links to the tutorial
    • note could be added to point to tutorial from reference docs
    • adding examples to reference docs could help a lot as well
    • writing-easyconfigs page could maybe be split up in smaller more detailed pages, and only cover most common usage on "landing" page
  • separate group of docs maintainers
    • ask for review in #docs
    • who could we solicit for this?
      • PRs for code and documentation are tightly linked to each other, will it be useful to have separate reviewers for those?
      • a doc reviewer can provide a different point of view to the PR than the pure code review
  • can we add comments in the source files of the documentation about TODO and desirable things to do?
    • maybe comments can be added to markdown following the HTML format
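
The idea of keeping TODO notes as HTML comments in the markdown sources could be made trackable with a small script. The following is only a minimal sketch: the function name `find_todo_comments`, the convention that a comment starts with "TODO", and the example page are all assumptions for illustration, not something agreed at the summit.

```python
import re

# Matches HTML comments such as <!-- ... -->, possibly spanning several lines.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_todo_comments(markdown_text):
    """Return the text of every HTML comment whose body starts with 'TODO'."""
    todos = []
    for match in HTML_COMMENT.finditer(markdown_text):
        body = match.group(1).strip()
        if body.upper().startswith("TODO"):
            todos.append(body)
    return todos


# Hypothetical documentation page used only to exercise the function.
example_page = """# Writing easyconfig files

<!-- TODO: add a short example for each parameter -->
Some reference text.
<!-- regular comment, not a task -->
"""

print(find_todo_comments(example_page))
```

Running something like this over all source documents would give a list of open documentation tasks per page, which could feed the per-page issues with checklists mentioned above.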

User Survey

  • volunteer to review: Åke, Adam
  • Alan: ask about EESSI
    • Should keep them simple for this year
      • Have you heard of EESSI?
      • Would you be interested in leveraging the EESSI project?

EasyBuild 5.0

  • (more on this on Wed 1 Mar)
  • making software license spec required in easyconfigs
  • moduleclass is now required, up for discussion?

MotW

  • ping contributors in open PRs
    • should also be done occasionally, and only if we expect contributor to make requested changes
  • are we "aggressive" enough when dealing with PRs?
    • should we actively make additional changes ourselves in the PR to get it merged?
    • or more quickly close if there are no updates
    • different answer for framework vs easyblocks vs easyconfigs
  • potential consequences of merging a PR are often difficult to assess
    • example: change to PythonPackage easyblock
    • reviewing PRs that touch code you've never looked into is tricky
  • what could help to make maintainers more confident in merging framework/easyblock PRs?
    • training?
    • sometimes more a question of design of framework
    • labels to assign priorities/complexity?
  • sometimes reluctant to get actively involved in PRs because they'll take a lot of time
  • some PRs should get more than 1 reviewer?
  • checklist for framework PRs?
    • should include stuff like "is this change covered by the tests" + "docs update"
    • also on scope for review (code style, logging, ...)
    • see https://github.com/easybuilders/easybuild/wiki/Review-process-for-contributions
      • this should be moved to the docs, so contributors can easily refer to it (and we can find it more easily)
    • should we limit scope of review to what's being changed in PR?
      • example: new easyblock that works for what it's intended, but maybe not for all possible combinations of easyconfig parameters
      • could also ask to split up huge PRs into smaller ones to actually make progress
    • definitely out of scope for EB: trying to keep clean commit history
    • maybe discourage use of force pushes in PRs?
  • checklist when a PR is opened
    • via PR template (or make --new-pr use the template)
  • issue template could also be helpful

Easyconfig PRs

  • try to assess the quality/interest of PRs to prioritize/focus effort (with labels?)

  • PRs that "go out of view" are not being picked up again

  • less time to tackle big PRs (e.g. TF, PyTorch)

    • chain effect as more and more stuff depends on those big packages
    • so if PyTorch PRs get stuck, so do other PRs
  • does the current MotW role work?

  • certain features (e.g. rpath) are not systematically tested, and that shows: users who rely on them run into unexpected failures

  • test matrix: define 5-6 test environments

    • some Tier-1 (must work), some Tier-2 (doesn't block PR if build fails)
    • keep track of issues in easyconfigs?
    • flag easyconfigs that are known to fail under certain "tolerable" circumstances (e.g. certain archs), but merge them anyway and make EB aware of those failures
  • easy to lose track of which PRs you were looking into

    • for example when generoso test takes a while
    • can we automate this somehow to notify maintainer who was looking into a particular PR?
    • maybe via project board?
    • alternative strategy: MotW can check PRs regardless of review status to catch those forgotten PRs that got updates
  • automate test builds in AWS/Azure using EESSI credits

    • start with testing in one test env, only test in more contexts after review
    • have a different test suite for PRs and another more extensive for stuff merged in develop
    • let bot automate merging of PR once PR was reviewed by maintainer + all test builds pass
    • new test cluster with Zen3 + A100 at JSC as well
  • label PRs by priority to help focus effort

  • introduce waiting-for-contributor label

    • bot could automatically remove it when a contributor pushes an update or adds a comment
      • Maybe also automatic tagging when waiting for generoso / jsc-zen2?
  • be more restrictive in range of toolchains allowed to get new contributions

    • focus only on the most recent one (2 years?)
    • auto-deprecate toolchains older than 3 years, archive easyconfigs using a toolchain older than 4 years
    • publish policies about rules in our repo
      • Maybe point out they can still open a PR, close it themselves, and use --from-pr. We just won't review & merge.
    • disallow backports of newer versions to older toolchains
  • we need to clarify that the central easyconfigs repo is not the one-repo-to-rule-them-all

  • why don't we currently let the bot run on our HPC systems?

  • we should try and focus on stuff that doesn't require additional manpower

    • please use "eb --review-pr"
    • bot should post review-pr diff
  • 1h/day is currently not sufficient to keep # of open PRs at same level

  • CI security needs a good review

  • add label to flag PRs that are not being updated, so maintainer can take matters into their own hands
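
The toolchain age policy discussed above (focus on roughly the last 2 years, deprecate after 3 years, archive easyconfigs after 4 years) could be expressed as a simple classification. This is a sketch under stated assumptions: it only handles versions of the form `2021a`/`2021b`, it treats "a" versions as released on 1 January and "b" versions on 1 July (an approximation, not the actual release dates), and the function names and status strings are made up for illustration.

```python
import re
from datetime import date


def toolchain_release_date(version):
    """Map a toolchain version such as '2021a' to an approximate release date.

    Assumption: 'a' versions count from 1 January, 'b' versions from 1 July.
    """
    match = re.fullmatch(r"(\d{4})([ab])", version)
    if match is None:
        raise ValueError(f"unrecognised toolchain version: {version!r}")
    year, half = int(match.group(1)), match.group(2)
    return date(year, 1 if half == "a" else 7, 1)


def toolchain_status(version, today):
    """Classify a toolchain version using the age thresholds from the notes."""
    age_years = (today - toolchain_release_date(version)).days / 365.25
    if age_years > 4:
        return "archive easyconfigs"
    if age_years > 3:
        return "deprecated"
    if age_years > 2:
        return "no new contributions"
    return "active"


today = date(2023, 3, 1)
for version in ["2022b", "2020b", "2019b", "2018b"]:
    print(version, toolchain_status(version, today))
```

A bot could apply such a check to the toolchain of each easyconfig in a new PR and post the published policy when the toolchain is too old.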

Part II - Wed 1 Mar 2023

  • attending: Adam, Åke, Alan, Alex, Bart, Caspar, Jasper, Kenneth, Lars, Mikael, Pablo, Sam, Sebastian, Simon

EasyBuild v5.0

  • use of f-strings in easyconfigs?
    • sure, why not
    • check in CI for new easyconfigs?
  • load dependencies before unpacking?
    • perhaps introduce a new step, e.g. unpack_dependencies?
  • other feature ideas
    • being able to express "load PerlBase or Perl", install PerlBase if Perl is not available
  • drop Lmod 6/7 support
    • Ubuntu 20.04 is still on Lmod 6.6
  • should also allow a list of values for import check
  • software license spec
  • more inspiration for new run cmd
    • plumbum
  • support a way of uploading EasyBuild logs for failing installations
  • make EasyBuild logs easier to navigate
    • could be "backwards-incompatible"
    • https://docs.easybuild.io/log-files
      • not found when using "logging" in search box, annoying...
    • Sam - implemented custom syntax highlighting for Vim
  • libraries.io
    • supports licensing?
    • supports dependencies? Could perhaps be used to determine order to install python packages in parallel?
  • Python / R are becoming too fat
    • Python + R should be "bare"
    • separate easyconfig for PyPI packages bundle + CRAN packages bundle
    • yet keep setuptools+pip in Python easyconfig
      • keep Cython or not?
  • also run --module-only + --sanity-check-only in test reports
    • to make sure that easyblocks are compatible with these options
    • these shouldn't block easyconfig PRs though, since fix needs to be in easyblock
    • this should also go in checklist for easyblock PRs
  • support for resuming interrupted builds
    • "checkpointing" entire build dir could be expensive
    • easyblocks would need to be implemented more carefully to support this
    • bwrap could be useful here (https://github.com/containers/bubblewrap)
    • declare in easyblock whether resuming an interrupted build is supported by it
  • Alex: feedback on more static easyconfigs
    • pull out metadata to a separate file (homepage, descr, checksums, etc.)
    • could also include list of contributors to the support of this software
    • only have stuff that actually affects the build in the easyconfig file
    • going beyond https://github.com/easybuilders/easybuild-framework/pull/3749
    • there's currently no motivation for coming up with a better description
      • having a single place where description is would make that easier
    • for PRs that add an easyconfig file for new software, a bot could make a commit/PR to pull apart the metadata into a separate file
  • native support for "bundle of components" should be implemented via extensions
    • any type of installation should be possible as extension
      • is "extensions" then still the right term?
      • should we get rid of the whole easyblock vs extension difference?
    • next step could be to have more checking on extensions (conflicting versions, etc.)
    • there should be a way to declare a conflict on component vs a particular module (like OpenSSL)
      • a way to declare conflicts in module file: conflicts_with = [...]
  • make review-pr smarter (for diff reported by bot)
    • no need to let it report changes in checksums, comments, ...
    • different implementation of review-pr that just compares easyconfig parameter values (that really matter)
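
The "compare easyconfig parameter values" idea for a smarter review-pr could look roughly as follows. This is a sketch, not how `--review-pr` works today: it evaluates the easyconfig text as plain Python, which only works for files that do not rely on constants or templates provided by the framework, and the names `parse_params`, `param_diff` and `IGNORED_PARAMS` are hypothetical.

```python
# Parameters assumed (for this sketch) not to matter in a review diff.
IGNORED_PARAMS = {"checksums"}


def parse_params(easyconfig_text):
    """Evaluate easyconfig text as plain Python and return its top-level assignments."""
    namespace = {}
    # No builtins are exposed; comments disappear naturally during evaluation.
    exec(easyconfig_text, {"__builtins__": {}}, namespace)
    return namespace


def param_diff(old_text, new_text, ignored=IGNORED_PARAMS):
    """Return {parameter: (old value, new value)} for parameters that differ."""
    old, new = parse_params(old_text), parse_params(new_text)
    diff = {}
    for key in sorted(set(old) | set(new)):
        if key in ignored:
            continue
        if old.get(key) != new.get(key):
            diff[key] = (old.get(key), new.get(key))
    return diff


# Two made-up easyconfigs that differ in version, checksum and a comment.
old_ec = """
name = 'example'
version = '1.0'
# an old comment
toolchain = {'name': 'GCC', 'version': '11.2.0'}
checksums = ['aaa']
"""
new_ec = """
name = 'example'
version = '1.1'
toolchain = {'name': 'GCC', 'version': '11.2.0'}
checksums = ['bbb']
"""

print(param_diff(old_ec, new_ec))  # → {'version': ('1.0', '1.1')}
```

Only the version change is reported; the changed checksum and the removed comment are filtered out, which is the kind of reduced diff a bot could post.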

Easyconfig PRs

  • automatically label PRs that haven't seen any activity for N months
    • section in docs with reasons to close old PRs
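
Selecting PRs without activity for N months is straightforward once the last activity date is known. The sketch below assumes the PR data has already been fetched into plain dictionaries; the keys `number`, `last_activity` and `labels`, the label name `stale`, the default of 6 months and the 30-day month are all placeholders, since the notes leave N open.

```python
from datetime import datetime, timedelta


def is_stale(last_activity, now, months=6):
    """Return True if there was no activity in the last `months` months (30-day months assumed)."""
    return now - last_activity > timedelta(days=30 * months)


def prs_to_label(prs, now, months=6, label="stale"):
    """Return the numbers of inactive PRs that do not carry the label yet."""
    return [
        pr["number"]
        for pr in prs
        if is_stale(pr["last_activity"], now, months) and label not in pr["labels"]
    ]


# Made-up PR data used only to exercise the functions.
example_prs = [
    {"number": 1, "last_activity": datetime(2022, 1, 1), "labels": []},
    {"number": 2, "last_activity": datetime(2023, 2, 1), "labels": []},
    {"number": 3, "last_activity": datetime(2022, 1, 1), "labels": ["stale"]},
]

print(prs_to_label(example_prs, now=datetime(2023, 3, 1)))  # → [1]
```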

Actions

  • (Kenneth) Training on framework
  • (NAME) Prepare issues for yearly documentation update
  • (Alex) Update checklist/guidelines for reviewers
  • (NAME) Write templates for new issues/PRs
  • (Åke+Adam) review survey