Conference call notes 20190306

(back to Conference calls)

Notes on the 120th EasyBuild conference call, Wednesday Mar 6th 2019 (17:00 - 18:00 CET)

Attendees

Alphabetical list of attendees (7):

  • Damian Alvarez (JSC, Germany)
  • Fotis Georgatos (SDSC, Switzerland)
  • Victor Holanda (CSCS, Switzerland)
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Mikael Öhman (Chalmers University of Technology, Sweden)
  • Bart Oldeman (Compute Canada)
  • Davide Vanzo (Vanderbilt University, US)

Agenda

  • update on next EasyBuild release
  • update on porting of EasyBuild to Python 3
  • update Python @ GCCcore
  • Q&A

Update on next EasyBuild release

Update on porting of EasyBuild to Python 3

Update on Python @ GCCcore

  • cfr. https://github.com/easybuilders/easybuild-easyconfigs/issues/7463
  • PR for Python 2.7.15 @ GCCcore: https://github.com/easybuilders/easybuild-easyconfigs/pull/7821
    • test reports welcome!
  • naming/versioning for bundle of scientific Python packages on top of Python @ GCCcore?
    • name:
      • SciPy?
        • Bart: clashes with scipy, could be confusing
        • SciPyBundle would be better?
    • version:
      • 2019a?
      • 2019.03? considered the better option
    • Davide: include additional extensions like wheel, Jinja2, etc.
      • Kenneth already included pytest
      • Mikael: also pkgconfig?
      • Damian: see JSC easyconfigs
  • currently: numpy, scipy, mpi4py, pandas, mpmath (see the easyconfig sketch below)
    • include more?
      • matplotlib, scikit-learn, numexpr, ...
        • Mikael: need a numexpr easyblock to build it with VML support when building with Intel
      • how do we decide what goes in and what doesn't?
    • Davide: provide 'wrappers' for direct loading of numpy/scipy/...
      • e.g. a numpy module that loads the (hidden) SciPy bundle, etc.
      • could ship these easyconfigs in the central repository as well
      • but other easyconfigs should only depend on the SciPy bundle (enforced via a Travis check)
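
    A rough sketch of what such a bundle easyconfig could look like, assuming the generic PythonBundle easyblock; the name, version and extension versions shown are illustrative assumptions, not decisions from the call:

    ```python
    # hypothetical sketch only: name, version and extension versions are placeholders
    easyblock = 'PythonBundle'

    name = 'SciPy-bundle'  # 'SciPy' vs 'SciPyBundle'/'SciPy-bundle' still under discussion
    version = '2019.03'    # date-based version rather than '2019a'

    homepage = 'https://python.org/'
    description = "Bundle of Python packages for scientific computing"

    toolchain = {'name': 'foss', 'version': '2019a'}

    dependencies = [
        ('Python', '2.7.15'),  # Python installed with GCCcore, used as a runtime dependency
    ]

    # packages mentioned on the call; versions are examples only
    exts_list = [
        ('numpy', '1.16.2'),
        ('scipy', '1.2.1'),
        ('mpi4py', '3.0.1'),
        ('pandas', '0.24.2'),
        ('mpmath', '1.1.0'),
    ]

    moduleclass = 'lang'
    ```
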
  • TODO: documentation
  • benchmarking (a sketch of the timing approach follows the results table below)
    • tested modules:

      • (classic) Python/2.7.15-intel-2019a
      • (foss) SciPy/2019a-foss-2019a-Python-2.7.15
      • (intel) SciPy/2019a-intel-2019a-Python-2.7.15
      • (intel_LD_PRELOAD) SciPy/2019a-intel-2019a-Python-2.7.15 + $LD_PRELOAD of libimf.so
    • Intel Skylake (idle nodes, Intel Xeon Gold)

      |                              | *(classic)* | *(foss)*       | *(intel)*       | *(intel_ld_preload)* |
      |------------------------------|-------------|----------------|-----------------|----------------------|
      | 5k x 5k numpy.dot            | 211.33ms    | 291.66ms (72%) | 211.33ms (100%) | 211.33ms (100%)      |
      | numpy.sin                    | 8.73s       | 36.01s (24%)   | 35.95s (24%)    | 8.77s (99.6%)        |
      | numpy.cos                    | 8.73s       | 34.87s (25%)   | 35.18s (25%)    | 8.82s (98.9%)        |
      | numpy.tan                    | 48.62s      | 48.37s (100%)  | 48.96s (99.3%)  | **10.93s (444%)**    |
      | numpy.exp                    | 165.33ms    | 382.33ms (43%) | 382.66ms (43%)  | 165ms (100%)         |
      | fft.py (5k)                  | 1.15s       | 0.96s (120%)   | 1.16s (99%)     | 1.15s (100%)         |
      | ibench fft (large)           | 26.17s      |                | 26.22s (100%)   |                      |
      | ibench blacksch (large)      | 21.14s      |                | 32.44s (100%)   |                      |
      | ibench sklearn  (large)      |             |                |                 |                      |
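
      A rough sketch (not the actual benchmark script used) of how such timings can be collected; array sizes and repeat counts are assumptions:

      ```python
      # rough sketch of the micro-benchmarks above; sizes and repeat counts are assumptions
      import timeit

      import numpy as np

      N = 5000
      a = np.random.rand(N, N)
      b = np.random.rand(N, N)
      x = np.random.rand(N * N)

      # 5k x 5k matrix product: exercises the BLAS backend (OpenBLAS for foss, MKL for intel)
      t = min(timeit.repeat(lambda: np.dot(a, b), number=1, repeat=3))
      print("numpy.dot: %.3fs" % t)

      # element-wise transcendentals: exercise libm (or libimf.so when it is preloaded)
      for fn in (np.sin, np.cos, np.tan, np.exp):
          t = min(timeit.repeat(lambda: fn(x), number=10, repeat=3))
          print("numpy.%s: %.3fs" % (fn.__name__, t))
      ```
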
      
    • observations:

      • huge speedup for numpy.tan when preloading libimf.so: ~4.5x
        • even when preloading on top of Python/2.7.15-intel-2019a
        • numpy.tan is somehow using libm.so rather than libimf.so?!
        • contradicts results for numpy.cos, numpy.sin, numpy.exp
      • FFT benchmarks slightly faster with foss
      • preloading libimf.so via $LD_PRELOAD consistently gives the same or better performance than the classic approach
        • but do we want to use $LD_PRELOAD by default in SciPy/2019a-intel-2019a-Python-2.7.15?
          • Bart: downside is that it affects everything...
          • Damian: need to be very careful also w.r.t. numeric precision
          • Damian: numpy.sin & co are just loops over scalar operations, should not be used in performance-sensitive code
            • Mikael: numpy should be using VML for this, but it doesn't
            • better performance when using numexpr (see the comparison sketch below)
          • Mikael: could consider defining an alias for "fast" python with $LD_PRELOAD set?
          • Damian: no complaints w.r.t. performance issues
            • except one case, which is probably related to numba
          • still unclear whether these performance differences are actually relevant for real-world applications
            • software that heavily depends on numpy.sin & co is doing it wrong...
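
      A minimal comparison sketch for the numexpr point above (numexpr can use Intel VML when built against it); assumes numexpr is installed, array size is arbitrary:

      ```python
      # sketch: compare numpy.sin with numexpr, which can use Intel VML if built with it
      import timeit

      import numexpr as ne
      import numpy as np

      x = np.random.rand(10000000)

      t_np = min(timeit.repeat(lambda: np.sin(x), number=5, repeat=3))
      t_ne = min(timeit.repeat(lambda: ne.evaluate("sin(x)"), number=5, repeat=3))

      print("numpy.sin:   %.3fs" % t_np)
      print("numexpr sin: %.3fs" % t_ne)
      print("numexpr built with VML: %s" % ne.use_vml)
      ```
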
    • TODO:

      • Damian: look into numeric precision when preloading libimf.so via $LD_PRELOAD
        • Kenneth will run the numpy/scipy test suites with libimf.so preloaded (see the sanity-check sketch below)
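
      As a quick sanity check here, a sketch (Linux-only, illustrative rather than the agreed procedure) that lists which math libraries a numpy-importing Python process actually has mapped, e.g. to verify that an $LD_PRELOAD'ed libimf.so is really picked up:

      ```python
      # sketch: list math libraries mapped into the current process after importing numpy
      # (Linux-only; reads /proc/self/maps)
      import numpy as np  # imported only to pull in its BLAS / libm dependencies

      libs = set()
      with open('/proc/self/maps') as maps:
          for line in maps:
              path = line.split()[-1]
              if 'libimf' in path or 'libm.so' in path or 'libm-' in path:
                  libs.add(path)

      for lib in sorted(libs):
          print(lib)
      ```
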

Other

  • Davide: requirement that easyconfigs have a -Python-* versionsuffix when Python is a runtime dep

    • worth requiring this for easyconfigs that have an indirect dependency on Python?
    • example: TopHat depends on Boost which can be built with Python support
      • Boost.Python is taking care of that in recent easyconfigs/toolchains
    • example: Mesa has a (build) dep on Mako
      • not a runtime dep :)
    • Kenneth: if (a specific version of) Python is a runtime dep, it should be reflected in the versionsuffix (see the fragment below)
    • should enhance existing Travis check to also take into account indirect runtime deps
    • should also have -Java in versionsuffix when Java is a runtime dep with dummy toolchain
      • for other toolchains, Java version is implied by toolchain version, since we stick to a single Java version per toolchain generation (e.g. the Java/1.8 wrapper)
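
    For illustration, a fragment of what this looks like in an easyconfig (hypothetical values, not taken from a specific existing easyconfig):

    ```python
    # Python as a runtime dependency: reflect the Python version in the versionsuffix
    # via the %(pyver)s template
    versionsuffix = '-Python-%(pyver)s'

    dependencies = [
        ('Python', '2.7.15'),
    ]

    # similarly, with the dummy toolchain and Java as a runtime dependency,
    # something like versionsuffix = '-Java-1.8' would be used
    ```
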
  • Mikael: Python 3.6 or 3.7 for 2019a?

    • needs more testing
    • problems with TensorFlow and Python 3.7 are resolved, need to look into other packages