Conference call notes 20211027 - easybuilders/easybuild GitHub Wiki

Notes on the 184th EasyBuild conference call, Wednesday Oct 27th 2021 (15:00 UTC)

Attendees

Alphabetical list of attendees (10):

  • Sebastian Achilles (Jülich Supercomputing Centre, Germany)
  • Simon Branford (Univ. of Birmingham, UK)
  • Alex Domingo (Vrije Universiteit Brussel, Belgium)
  • Jorge Guerra (Universidad Politécnica de Madrid, Spain)
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Adam Huffman (Big Data Institute, Oxford, UK)
  • Kurt Lust (Univ. of Antwerp, Belgium + LUMI User Support Team)
  • Alan O'Cais (Jülich Supercomputing Centre, Germany)
  • Jurij Pečar (EMBL, Germany)
  • Jörg Saßmannshausen (NIHR Biomedical Research Centre, UK)

Agenda

  • overview of recent developments
  • update on progress towards EasyBuild v4.5.0 release
  • promoting candidates for '2021b' common toolchains
  • Q&A

Recent developments

  • release timeline
    • latest release: EasyBuild v4.4.2 (Sept 7th 2021)
    • next release
      • EasyBuild v4.5.0 (some significant enhancements already merged to develop)
      • ETA: end of this week (?)
      • project with target issues/PRs for this release: https://github.com/orgs/easybuilders/projects/14
      • TODO:
        • merge experimental support for installing R extensions in parallel
        • deprecate old toolchains (< 2019?)
        • promote foss/2021.07 to foss/2021b + intel/2021.09 to intel/2021b
        • add documentation for:
          • experimental support for installing (R) extensions in parallel
          • integration with Rich (progress bars)
  • recent changes
    • framework
      • bug fixes
        • refactor EasyBlock to decouple collecting of information on extension source/patch files from downloading them (PR #3860)
          • fixes downloading of sources for extensions with --module-only (issue #3849)
          • deprecated fetch_extension_sources (replaced with collect_exts_file_info)
        • fix library paths to add to $LDFLAGS for intel-compilers toolchain component (PR #3866)
        • remove '--depth 1' from git clone when 'commit' is specified in git_config (PR #3871 + PR #3872)
          • fixes regression introduced in EasyBuild v4.4.2 for easyconfigs that use git_config with a specific commit for downloading sources (issue #3870)
      • enhancements
        • use separate progress bars for different aspects of the installations being performed (PR #3844, PR #3864, PR #3867)
        • filter out duplicate paths added to module files (PR #3770 + PR #3874)
          • warning is printed when duplicate paths for module file are suppressed: "Suppressed adding the following path(s) to $%s of the module as they were already added"
        • add check_async_cmd function to facilitate checking on asynchronously running commands (PR #3865)
        • add support for --insecure-download configuration option (PR #3859)
        • make intelfftw toolchain component aware of imkl-FFTW module (PR #3859)
          • for intel/2021.09 (soon to be promoted to intel/2021b), imkl FFTW wrappers were separated from the imkl installation (see easyconfigs PR #14195)
      • changes
        • ...
    • easyblocks
      • bug fixes
        • restore RPATH wrappers for OpenMPI sanity check (PR #2582)
        • avoid that path to CUDA install directory is added to $PATH (PR #2593)
        • make version regex in OpenSSL wrapper easyblock less strict (version string does not always end with a letter) (PR #2597)
        • redefine collect_exts_file_info instead of now deprecated fetch_extension_sources in OCaml easyblock (PR #2603)
        • fix support for recent imkl version in numexpr easyblock (PR #2606)
      • enhancements
        • enhance GCC easyblock to add support for AMD GPU offloading (PR #2578)
        • enhance imkl easyblock to add support for installing with NVHPC (PR #2595)
        • update ELSI easyblock to support two new external dependencies (PR #2602)
      • new easyblocks
      • changes
        • (none)
    • easyconfigs
      • over 75 easyconfig PRs merged since last conf call
      • bug fixes
        • fix source URL for SCOTCH 6.1.0 (PR #14099)
        • add patch for OpenBLAS 0.3.17 + 0.3.18 to fix segfault triggered by scipy tests (PR #14178)
        • fix AmberTools v20 easyconfig using intel/2020a toolchain (PR #14028)
        • add patch to fix PMIx detection in OpenMPI v4.0.3, v4.0.5, v4.1.0 (PR #14177)
        • fix spatstat.* downloads for Seurat v4.0.1 (PR #14199)
      • enhancements
        • add compiler/parallel/tcltk R libraries included in base installation to extensions in recent R easyconfigs (PR #14189 + PR #14190 + PR #14194)
      • new software
        • ...
      • noteworthy software updates
      • changes
        • updates for foss/2021.07 (to be promoted to foss/2021b):
          • UCX(-CUDA) 1.11.2 as dependency for OpenMPI 4.1.1 + NCCL 2.10.3 (PR #14090)
          • libfabric 1.13.2 as dependency for OpenMPI 4.1.1 + add it as a dependency for PMIx (PR #14176)
          • OpenBLAS 0.3.18 as dependency for FlexiBLAS 3.0.4 (PR #14167)
        • updates for intel/2021.09
          • use imkl with system toolchain + imkl-FFTW (PR #14195)
  • to merge/fix/tackle soon
    • framework
      • reported bugs / bug fixes
        • failing GitHub tests in CI with recent git version (issue #3873)
        • FlexiBLAS module and toolchain use $EBROOTFLEXIBLAS/include and not $EBROOTFLEXIBLAS/include/flexiblas (issue #3868)
      • enhancements
        • add initial/experimental support for installing extensions in parallel (PR #3667)
          • opt-in via --parallel-extensions-install --experimental
          • two requirements for extensions:
            • must be able to determine list of required dependencies (via required_deps property method in easyblock)
            • must be able to start installation asynchronously via run_async + check for completion via async_cmd_check
          • works well with R easyconfigs: installation is several hours faster with enough cores available
          • known limitations:
            • doesn't work yet for R-bundle-Bioconductor, because all required dependencies must be listed in exts_list
            • skipping of installed extensions and sanity check for extensions is still done sequentially
        • detect download failure due to outdated certificate and print helpful warning? (issue #3863)
      • changes
        • print the hook messages only in debug mode (PR #3843)
    • easyblocks
      • reported bugs / bug fixes
        • ...
      • enhancements
        • add support for installing R extensions in parallel (PR #2408)
        • detect problem with compiling CPU detection code in configure output in GROMACS easyblock (PR #2609)
        • add custom easyconfig parameter 'blas_libs' in FlexiBLAS easyblock to specify backends (PR #2605)
          • backends would be a better name?
      • changes
        • don't use --config=mkl for TensorFlow 2.4+ (PR #2583)
      • new software
        • (nothing major?)
    • easyconfigs
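
The two requirements listed above for installing extensions in parallel (a known list of required dependencies per extension, plus a way to start an installation asynchronously and check on it later) can be illustrated with a minimal Python sketch. This is not the code from PR #3667: the function name install_in_parallel, the layout of the extensions dict, and the max_running and poll_interval parameters are all hypothetical, and plain subprocess calls stand in for the run_async and async_cmd_check mechanism mentioned in the notes.

```python
import subprocess
import sys
import time


def install_in_parallel(extensions, max_running=4, poll_interval=0.05):
    """
    Hypothetical sketch: run install commands concurrently, respecting dependencies.

    extensions: dict mapping extension name -> (list of required deps, command as argv list)
    Returns the extension names in the order in which their installation completed.
    """
    pending = dict(extensions)  # not started yet
    running = {}                # name -> subprocess.Popen
    installed = []              # completed successfully

    try:
        while pending or running:
            # check on asynchronously running installations
            for name in list(running):
                exit_code = running[name].poll()
                if exit_code is None:
                    continue  # still running
                del running[name]
                if exit_code != 0:
                    raise RuntimeError("installation of %s failed" % name)
                installed.append(name)

            # start every pending extension for which all required deps are installed
            for name in list(pending):
                if len(running) >= max_running:
                    break
                deps, cmd = pending[name]
                if all(dep in installed for dep in deps):
                    running[name] = subprocess.Popen(cmd)
                    del pending[name]

            # nothing running and nothing could be started: a required dep is not listed
            if pending and not running:
                raise RuntimeError("unresolved dependencies for: %s" % ", ".join(sorted(pending)))

            if running:
                time.sleep(poll_interval)
    finally:
        # on failure, do not leave other installations running in the background
        for proc in running.values():
            proc.kill()
            proc.wait()

    return installed


if __name__ == "__main__":
    noop = [sys.executable, "-c", "pass"]  # placeholder for a real install command
    demo = {
        "A": ([], noop),
        "B": (["A"], noop),
        "C": (["A"], noop),
        "D": (["B", "C"], noop),
    }
    print(install_in_parallel(demo))
```

In this sketch an extension whose dependency is missing from the dict can never be started, which mirrors the known limitation noted above that all required dependencies must be listed in exts_list.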

Common toolchains

2021b (WIP!)

  • ready to promote foss/2021.07 to foss/2021b and intel/2021.09 to intel/2021b?
  • first merge FFTW 3.3.10 + update in foss/2021.07

Q&A

  • Kurt: status of easystack support?
    • some issues with using multiple versionsuffix for a specific software version?
  • Kurt: how compatible is a custom repository with easyconfigs/easyblocks using GPLv3 with the central repositories using GPLv2?
  • Kurt: some interest from PDC in Sweden to work together with LUMI on using EasyBuild common toolchains on Cray Slingshot systems
  • Alex: NVIDIA only provides NCCL builds with specific CUDA versions, while we make different combinations, is that a problem?
    • recent NCCL versions are built from source in EasyBuild, so should be OK?
    • NVIDIA only provides recent NCCL builds for CUDA 11.0 and 11.4
      • probably because CUDA 11.0 is considered a long-term release?
    • similar issue with cuDNN compatibility matrix
    • can we make the NCCL easyblock run tests?
  • Alan: impact of splitting up imkl and imkl-FFTW
    • opens the door to pulling SciPy-bundle down to compiler-only level (if we split out mpi4py)
  • Sebastian: is ROCm being built with EasyBuild at LUMI?
    • Kurt: not currently, ROCm is provided as a part of the Cray software stack, but there's interest in changing this (cfr. collaboration with PDC)
    • Sebastian is asking in the context of a meeting with AMD (Michael Klemm)
      • it seems they're willing to help out if needed
    • see Jürgen's work on ROCm at https://github.com/easybuilders/easybuild-easyconfigs/pull/14156
      • Sebastian is planning to reach out to Jürgen regarding potential collaboration with AMD
    • some hardcoded assumptions (like assuming /opt/rocm or having everything installed in a single place) are slowly being removed from ROCm (partially due to pressure from Spack)
    • would be nice to figure out first what we would like to get out of it + which questions we have
    • Kurt: multiple different compilers from AMD, unclear which is best option for what, how compatible they are, etc.
    • should we ask Michael Klemm if he's up for doing an EasyBuild Tech Talk on this?
      • ROCm software stack, HIP & related tools, different AMD compilers, ...
  • Jörg: is Intel oneAPI free for use in academia?