Conference call notes 20181212


Notes on the 116th EasyBuild conference call, Wednesday Dec 12th 2018 (17:00 - 18:00 CET)

Attendees

Alphabetical list of attendees (4):

  • Damian Alvarez (JSC, Germany)
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Bart Oldeman (ComputeCanada)
  • Davide Vanzo (Vanderbilt University, US)

Agenda

  • updates on upcoming EasyBuild v3.8.0
  • an easyblock for OpenMPI?
  • Q&A

Outlook to EasyBuild v3.8.0

Easyblock for OpenMPI

  • currently handled via easyconfigs only (no dedicated easyblock), with lots of hardcoding
  • recent OpenMPI versions have better support for auto-detecting what is there
    • or even for fat builds
  • recent OpenMPI versions prefer UCX/libfabric
  • rely on auto-detection that OpenMPI configure does?
    • with dedicated easyconfig parameters for things like ibverbs, ucx, torque/slurm, ...
  • reach out to OpenMPI community about this?
  • Bart: benchmarking: UCX is overall winner + hcoll for collective communication
    • also: SHARP, but that needs a dedicated server (only beneficial with largish jobs, 20 nodes/800 cores)
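The idea of relying on OpenMPI's configure auto-detection, with dedicated easyconfig parameters to override it, could look roughly like the sketch below. This is not an existing easyblock: the parameter names (`ibverbs`, `ucx`, `libfabric`, `torque`, `slurm`) and the function `openmpi_configure_options` are hypothetical, and the `--with-*`/`--without-*` option names should be checked against `configure --help` of the targeted OpenMPI version.

```python
# Hypothetical sketch, not part of EasyBuild: map tri-state parameters
# to OpenMPI configure options.
#   None  -> say nothing, let OpenMPI's configure auto-detect
#   True  -> require the feature (--with-<name>)
#   False -> disable the feature (--without-<name>)

# Assumed mapping from (hypothetical) easyconfig parameter names
# to the name used in the corresponding configure option.
CONFIGURE_NAMES = {
    'ibverbs': 'verbs',
    'ucx': 'ucx',
    'libfabric': 'ofi',
    'torque': 'tm',
    'slurm': 'slurm',
}


def openmpi_configure_options(params):
    """Return a configure option string for the given tri-state parameters."""
    opts = []
    for name in sorted(params):
        if name not in CONFIGURE_NAMES:
            raise ValueError("unknown parameter: %s" % name)
        value = params[name]
        if value is None:
            # leave this one to auto-detection
            continue
        prefix = '--with-' if value else '--without-'
        opts.append(prefix + CONFIGURE_NAMES[name])
    return ' '.join(opts)


if __name__ == '__main__':
    # require UCX, disable ibverbs, auto-detect the resource manager
    print(openmpi_configure_options({'ucx': True, 'ibverbs': False, 'slurm': None}))
```

With this approach an easyconfig would only need to set the parameters where a site wants to deviate from what configure detects, which would remove most of the hardcoding.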

Other

  • Damian: lots of problems with collectives on latest Intel MPI with large jobs (1000 cores)
    • Bart: 2019 versions have dropped everything but libfabric (OFI)
    • Damian: also similar problems with 2018 versions (w/ default fabric, DAPL?)
    • the systems aren't very special: JUWELS (Skylake + IB), JURECA (Haswell + IB)
    • in some collectives it hangs, sometimes segfaults, ...
    • Bart: reproducible with OSU microbenchmarks?
    • Bart can look into reproducing this on upcoming new system
    • relevant for upcoming intel/2019a
  • Davide: frequent questions at SC18
    • which module naming schemes should be used?
    • how to experiment with multiple module naming schemes?
    • this currently requires multiple 'eb' runs:
      • first run using one module naming scheme and using --fixed-installdir-naming-scheme
      • subsequent run(s) using other module naming scheme(s) using --module-only
        • main issue is that --module-only is not perfect, so you may run into trouble
    • support for configuring EasyBuild to install module files with multiple different naming schemes in one go should not be too difficult (famous last words...)
  • Kenneth plans to make some (good) progress on porting EasyBuild to Python 3 during the holidays
    • plan is to ingest whatever parts of vsc-base are needed by EasyBuild into the EasyBuild framework repository
    • porting vsc-base to Python 3 is likely going to take too long
      • too much impact on system scripts in HPC-UGent, so changes to vsc-base have to be done with great care
    • there's also very little development going on in vsc-base, so ingesting it kind of makes sense
    • having a single code base to worry about when porting to Python 3 should help a lot with making actual progress
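The multi-run workflow for experimenting with several module naming schemes, as discussed under Davide's item above, can be sketched as a small helper that generates the 'eb' command lines. The helper name `eb_commands` and the easyconfig file name `example.eb` are hypothetical; `EasyBuildMNS` and `HierarchicalMNS` are used only as example scheme names. Keeping `--fixed-installdir-naming-scheme` on the later runs is an assumption, made so that the generated module files point at the same installation directory.

```python
# Hypothetical helper: build the 'eb' command lines for installing once
# and then generating module files under additional naming schemes.

def eb_commands(easyconfig, schemes):
    """Return one 'eb' command (as an argument list) per naming scheme."""
    if not schemes:
        raise ValueError("at least one module naming scheme is required")

    commands = []
    for idx, scheme in enumerate(schemes):
        cmd = [
            'eb', easyconfig,
            '--module-naming-scheme=' + scheme,
            # keep the installation directory independent of the naming scheme
            '--fixed-installdir-naming-scheme',
        ]
        if idx > 0:
            # later runs only (re)generate the module file
            cmd.append('--module-only')
        commands.append(cmd)
    return commands


if __name__ == '__main__':
    for cmd in eb_commands('example.eb', ['EasyBuildMNS', 'HierarchicalMNS']):
        print(' '.join(cmd))
```

The commands are only printed here, not executed. Since --module-only is not perfect, the output of each subsequent run should still be checked by hand.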