Conference call notes 20210609 - easybuilders/easybuild GitHub Wiki

Notes on the 174th EasyBuild conference call, Wednesday June 9th 2021 (08:00 UTC)

Attendees

Alphabetical list of attendees (10):

  • Sebastian Achilles (Jülich Supercomputing Centre, Germany)
  • Miguel Dias Costa (National University of Singapore)
  • Alexander Grund (TU Dresden, Germany)
  • Jorge Guerra (Universidad Politécnica de Madrid, Spain)
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Adam Huffman (Big Data Institute, Oxford, UK)
  • Kurt Lust (Univ. of Antwerp, Belgium + LUMI User Support Team)
  • Alan O'Cais (Jülich Supercomputing Centre, Germany)
  • Mikael Öhman (Chalmers University of Technology, Sweden)
  • Jörg Saßmannshausen (NIHR Biomedical Research Centre, UK)

Agenda

  • overview of recent developments
  • Q&A

Recent developments

  • last release: EasyBuild v4.4.0 (June 2nd)
  • ETA next release: early July
  • recent changes
    • framework
      • bug fixes
        • various fixes for Fujitsu toolchain support (PR #3704, PR #3712, PR #3713, PR #3714, PR #3717, PR #3721, PR #3731)
          • Miguel: toolchain concept in EasyBuild helps a lot here
            • now looking into numpy
              • easy when you ignore the failing tests...
              • currently fighting with numpy compiler detection
        • fix support for specifying multiple PRs to --from-pr (PR #3707, PR #3708)
        • avoid overwriting pr_nr in post_pr_test_report for reports with --include-easyblocks-from-pr (PR #3724 + PR #3726)
          • results in posting test report in easyblock PR rather than easyconfigs PR, but only when running EasyBuild on top of Python 2
      • enhancements
        • add support for --skip-extensions (PR #3702)
        • support systems with more than 1024 cores (PR #3701)
      • changes
        • drop support for Python 2.6 (PR #3715)
    • easyblocks
      • bug fixes
        • correctly handle empty list of sources in PythonPackage._should_unpack_source (PR #2442)
          • bug introduced in PR that skips unpacking of *.whl files
        • make sure that self.python_cmd is set before using it in PythonPackage.sanity_check_step (PR #2447)
          • required to avoid breaking --module-only for easyconfigs using PythonBundle, now that we're also checking extensions when using --module-only (unless --skip-extensions is used)
        • only use siterc fix for NVHPC < 21.3 (PR #2453)
      • enhancements
        • enhance sanity check for Clang to verify if CUDA offload library was produced (PR #2454)
      • new software
      • changes
        • (none)
    • easyconfigs
      • close to 100 easyconfig PRs merged since last conf call
      • over 10,000 easyconfig PRs merged! \o/
        • #10,000 was easyconfig for EasyBuild v4.4.0 (PR #13012)
      • bug fixes
        • add patches for PyTorch 1.7.1 avoiding failures on POWER and A100 (PR #12753)
        • fix download URL for DB 18.1.40 (PR #12974)
        • fix test failure in TensorFlow 2.4.1 on recent CUDA drivers (PR #12979)
        • add elfutils as build dependency for Clang easyconfigs with CUDA dependency (PR #13008 + PR #13015)
        • add patch to fix buffer overflow in OpenMPI 4.1.x (PR #12983)
        • add patch to fix installation of HDF 4.2.15 on aarch64 (PR #13059)
        • add new checksum of mvabund to R v4.0.4 (PR #13020 + PR #13021)
        • fix checksum for snpEff 5.0 (PR #13062)
      • enhancements
        • add check to easyconfigs test suite to ensure OpenSSL wrapper is used in easyconfigs using a recent toolchain (PR #13079)
      • new software
      • noteworthy software updates
      • noteworthy changes
        • update easyconfigs for binutils 2.35 to use binutils 2.35.2 source tarball instead to pick up bug fixes (PR #12967 + PR #12988)
        • promote foss/2021.04 to foss/2021a (PR #12975)
        • promote intel/2021.03 to intel/2021a (PR #12976)
        • add UCX patch to allow overriding modules (PR #12980)
          • to facilitate collapse of foss and fosscuda (2021a)
        • disable debuginfod for elfutils to minimize required dependencies (PR #13034)
  • to merge/fix/tackle soon
    • framework
      • reported bugs / bug fixes
        • specified easyblock for extension is not taken into account (issue #3710)
        • fix crash in get_config_dict when copying modules that were imported in easyconfig file (like 'import os') (PR #3729, fixes issue #3727)
        • rebuilding module breaks for HMNS if there are sort-of circular build dependencies (e.g. XZ and gettext) (issue #3722)
      • enhancements
        • support additional features in easystack files
          • support for filtering via labels (PR #3620)
        • avoid using a priority in prepend_module_path (Lmod) to avoid costly module calls (PR #3636)
        • add support for installing extensions in parallel (WIP) (PR #3667)
        • add make_extension_string and _make_extension_list to EasyBlock (PR #3697)
          • related to avoiding duplicates in Perl extensions
        • enhance detection of patch files with better error messages (PR #3709)
        • add per-step timing information (PR #3716)
        • add module-write hook (PR #3728)
        • add option to ignore failing test step (--ignore-test-failure) (PR #3732)
        • finding modules with multiple modulepaths and HMNS (issue #3703)
      • changes
        • make sure that tests requiring a github token have 'github' in the test name so that they can be easily filtered (PR #3730)
        • also enable static analysis for Python 2.7 (PR #3725)
          • catches accidentally overwriting local variables in list comprehensions (like --from-pr --include-easyblocks-from-pr bug)
    • easyblocks
      • bug fixes
        • treat files/directories of unpacked sources equally in PackedBinary (PR #2306)
        • --module-only doesn't always work as expected
          • we need a better way of catching this in tests
          • problem is that you typically need an actual installation to catch these problems, so can't be done in easyconfigs or easyblocks test suite run in CI
          • test installations done on generoso via boegelbot could be enhanced to catch problems with --module-only?
        • explicitly use only OpenBLAS for PyTorch if MKL is not used (PR #2448)
        • fix CPU-only runtime for dpcpp-generated executables in custom easyblock for intel-compilers (oneAPI) (PR #2457)
      • enhancements
        • enhance test and install step of CMakePythonPackage easyblock (PR #2318)
        • add support for installing R extensions in parallel (WIP) (PR #2408)
        • allow for Perl modules being part of other, already installed Perl modules (PR #2386)
        • including FlexiBLAS as the default BLAS in foss will require easyblock changes (issue #2421)
        • should set BLA_VENDOR in CMakeMake easyblock if BLAS is in the toolchain (PR #2420)
        • enhance sitecfg to support overriding core Python packages (PR #2458)
        • enable make check and sanity check exec for OpenMPI (PR #2444)
        • add CMake support for Amber 20 (PR #2445)
        • enhance ConfigureMake generic easyblock to add support for building multiple build targets (PR #2449)
        • update custom easyblock for Boost to always build single and multi threaded versions (PR #2456)
          • included CMake modules are broken (and skipped) because we install Boost libraries multiple times
        • update CMakeMake to handle old and new Boost/Boost.Python builds using custom easyblock for Boost (PR #2461)
      • changes
        • (nothing major)
      • new software
        • new easyblock for NCCL (built from source) (PR #2337)
    • easyconfigs
      • bug fixes
        • improve check for multi-variant dependencies per generation of easyconfigs (PR #12687)
      • enhancements
        • (nothing major)
      • new software
        • (nothing major?)
      • software updates
        • SciPy-bundle with intel/2021a (PR #12964)
          • need to look into handful of failing tests...
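
To illustrate the kind of bug the Python 2.7 static analysis is meant to catch (the --from-pr / --include-easyblocks-from-pr issue listed under the framework bug fixes), here is a minimal sketch. It is not the actual EasyBuild code: the function names and the way the PR numbers are passed in are hypothetical. In Python 2, the loop variable of a list comprehension leaks into the enclosing scope; the explicit for loop below emulates that behaviour so the sketch behaves the same under Python 3.

```python
def report_target_leaky(pr_nr, easyblock_prs):
    """Hypothetical helper: returns the PR the test report would be posted to.

    The explicit loop reuses the name 'pr_nr', which rebinds the argument,
    just like '[... for pr_nr in easyblock_prs]' would under Python 2.
    """
    labels = []
    for pr_nr in easyblock_prs:  # overwrites the easyconfigs PR number
        labels.append("easyblocks PR #%s" % pr_nr)
    return pr_nr, labels


def report_target_safe(pr_nr, easyblock_prs):
    """Same helper, using a distinct loop variable so pr_nr is left untouched."""
    labels = ["easyblocks PR #%s" % eb_pr_nr for eb_pr_nr in easyblock_prs]
    return pr_nr, labels


if __name__ == "__main__":
    # illustrative numbers only: an easyconfigs PR and one included easyblocks PR
    print(report_target_leaky(13012, [2447]))
    print(report_target_safe(13012, [2447]))
```

With the leaky variant, the returned target is the easyblocks PR rather than the easyconfigs PR, which matches the symptom described in the notes. Using a distinct loop variable name avoids the problem on both Python 2 and Python 3.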

2021a update of common toolchains

  • TODO: fosscuda/2021a
    • collapsing foss and fosscuda toolchains
    • see https://github.com/easybuilders/easybuild-easyconfigs/issues/12484
    • status? (Mikael)
    • we wouldn't have fosscuda/2021a anymore, just depend on an extra dependency (UCX-CUDA) for GPU Direct RDMA support
    • CUDA bundle that depends on CUDAcore + UCX-CUDA to use as dependency?
    • CUDA support should be reflected in versionsuffix?
    • Intel MPI probably doesn't even support GPU Direct?
    • upstream UCX issue opened, developers were open to suggestion of environment variable
    • currently hardcoded in UCX-CUDA easyconfig PR, could be better to have a custom easyblock for this?
    • CUDA 11.1.3 is not supported with GCC 10.3
      • known issues with GCC 10.3
        • patches already available for a compiler error (C++ template parsing error)
      • CUDA 11 is only compatible with GCC 9.x officially on x86_64

Q&A

  • custom AMD toolchain (AOCC, AOMP, AMD BLIS, etc.)?
    • AOCC currently best compiler for CPUs, no GPU offloading support
    • AOMP comes with ROCm, behind on CPU optimizations
    • there's even a 3rd one (development only)
    • can we include both AOCC and AOMP in an AMD toolchain?
      • these compilers are not compatible for Fortran...
    • would there be a CPU and a GPU variant?
      • CPU: AOCC-based
      • GPU: AOMP-based + ROCm
  • Alexander: include GCC rather than binutils as build dependency
    • a lot easier, don't have to figure out which binutils version matches the GCCcore version
  • Mikael: should we be more careful with $CPATH when building binutils?
    • there may be some mixup there...
    • see problem reported by Mikael on having to build binutils 2.35.2 with the system toolchain before the one with GCCcore