Conference call notes 20180228 - easybuilders/easybuild GitHub Wiki


Notes on the 96th EasyBuild conference call, Wednesday February 28th 2018 (5pm - 6pm CET)

Attendees

Alphabetical list of attendees (9):

  • Damian Alvarez (JSC, Germany)
  • Fotis Georgatos (Illumina, UK)
  • Balazs Hajgato (Free University of Brussels)
  • Victor Holanda (CSCS)
  • Kenneth Hoste (HPC-UGent)
  • Adam Huffman (Big Data Institute, University of Oxford)
  • Alan O'Cais (JSC, Germany)
  • Åke Sandgren (Umeå University, Sweden)
  • Davide Vanzo (Vanderbilt University)

Agenda

  • update on upcoming EasyBuild v3.5.2 release
  • (very) early outlook to EasyBuild v3.6.0
  • best practices on Intel Skylake systems
  • Q&A

EasyBuild v3.5.2

  • TensorFlow easyblock
    • very important to build from source on CPU-only
    • virtually no performance gain on GPUs compared to binary release (K80, P100)
      • may be a different story on Volta GPUs (Adam can test this?)
  • ETA: end of this week
  • Victor: -ftree-vectorize by default (https://github.com/easybuilders/easybuild-framework/pull/2388)
    • not in v3.5.2, maybe in 3.6.0

EasyBuild v3.6.0

Best practices on Intel Skylake systems

  • Balazs
    • building software on Skylake with different versions of toolchain (except for intel/2018a)
      • intel/2016b, intel/2017a, intel/2017b
    • heterogeneous setup
    • problems
      • sometimes the compiler gets stuck in an infinite loop when building on Skylake (cf. https://github.com/easybuilders/easybuild-easyconfigs/pull/5915)
        • can be worked around by forcing AVX2 or using -O1
      • Internal Compiler Error (ICE) when building with -O2 on Skylake
        • can be worked around by forcing AVX2 or using -O1
      • sometimes compilation works, but resulting build produces NaN values (both with foss & intel)
      • compilation issues may be fixed with intel/2018a
    • recommendation is to use */2018a on Skylake if possible
      • Victor: GCC 5.4 in foss/2016b may not support AVX-512 yet (same for icc in intel/2016b?)
        • Damian: GCC 5.3/5.4 already supports AVX-512, but maybe binutils is the problem?
  • Molpro & VASP: can't compile even with AVX2, produces bogus results
    • Åke: VASP works fine even with AVX2 & AVX512
      • ScaLAPACK provided by MKL is a problem, should use netlib ScaLAPACK or even OpenBLAS in some cases
      • only way to pick an installation that produces right results is to test...
      • code usually reports problems when it notices something 'off', sometimes OK if it doesn't complain
      • main regression suite used is from Peter Larsson (see https://github.com/egplar/vasptest)
        • good starting point to test, results are scientifically correct
    • Victor (CSCS): also fine with AVX2/AVX512, in-house regtest doesn't show problems
      • or actually no :)
    • similar issues with related software: official binary produces wrong results on Skylake
  • can be starting point for a "Best practices" document for building software on Skylake systems...
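
Since the discussion above concluded that testing is the only way to tell whether a build produces correct results, a minimal sketch of such a check is given here. It is not taken from the call or from any existing regression suite: the function name `check_results`, the quantity names, the tolerance and all numbers are hypothetical. It compares values produced by a build against reference values and flags NaN/Inf results as well as mismatches.

```python
import math


def check_results(computed, reference, rel_tol=1e-6):
    """Return a list of (name, reason) tuples for every suspicious value.

    'computed' and 'reference' map quantity names to floats.
    The tolerance is an assumption; pick one that suits the application.
    """
    failures = []
    for name, expected in reference.items():
        value = computed.get(name)
        if value is None:
            failures.append((name, "missing"))
        elif math.isnan(value) or math.isinf(value):
            # NaN values were one of the symptoms reported on Skylake
            failures.append((name, "not finite"))
        elif not math.isclose(value, expected, rel_tol=rel_tol):
            failures.append((name, "mismatch"))
    return failures


# Hypothetical numbers, for illustration only
reference = {"total_energy": -10.5, "pressure": 1.25}
good_build = {"total_energy": -10.5000001, "pressure": 1.25}
bad_build = {"total_energy": float("nan"), "pressure": 1.30}

print(check_results(good_build, reference))
print(check_results(bad_build, reference))
```

An empty list means no problems were detected for the quantities that were checked. As noted above, this only shows that the build agrees with the reference on those inputs, not that it is correct in general.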

Other