02 12 2020 - openucx/ucx GitHub Wiki

Participants:

  • Alex Margolin (Huawei)
  • Akshay (NVIDIA)
  • Artem Polyakov (Mellanox)
  • Devendar Bureddy (Mellanox)
  • Dmitry Gladkov (Mellanox)
  • Evgeny Leksikov (Mellanox)
  • Gil Bloch (Mellanox)
  • James Dinan (NVIDIA)
  • Ken Raffenetti (ANL)
  • Manjunath Gorentla (Mellanox)
  • Matt Baker (ORNL)
  • Sergey Lebedev (Mellanox)
  • Tony Curtis (SBU)
  • Valentin Petrov (Mellanox)

Discussion

Hierarchical vs Reactive

  • Alex discussed the slides in great detail (thanks!)
  • The hierarchical and reactive approaches have some similarities, but they are different enough to enable different performance optimizations.
  • The collectives API should support interfaces for both hierarchical and reactive approaches to benefit various usage scenarios

Non-blocking vs Blocking

  • Blocking interfaces can provide better latency in some cases, based on experience from Mellanox and NVIDIA.
  • Request for blocking barrier (from Akshay)
  • Given the open issues around progress, and because blocking semantics can be built on top of non-blocking interfaces, the inclination is to support only non-blocking interfaces for now and add blocking interfaces to the wish list.
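
To illustrate the layering discussed above, here is a minimal sketch of a blocking barrier built from a non-blocking post plus an explicit progress call. Every name in it (`Context`, `Request`, `barrier_post`, `progress`, `barrier_blocking`) is hypothetical and chosen only for illustration; none of it is an existing or proposed API, and the "network" is simulated by a step counter.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Handle returned by a hypothetical non-blocking collective."""
    steps_left: int
    done: bool = False


@dataclass
class Context:
    """Hypothetical library context that owns outstanding requests."""
    pending: list = field(default_factory=list)

    def barrier_post(self, steps=3):
        # Non-blocking: enqueue the operation and return immediately.
        req = Request(steps_left=steps)
        self.pending.append(req)
        return req

    def progress(self):
        # Advance every outstanding request by one simulated step.
        for req in self.pending:
            req.steps_left -= 1
            if req.steps_left <= 0:
                req.done = True
        # Drop completed requests from the pending list.
        self.pending = [r for r in self.pending if not r.done]


def barrier_blocking(ctx):
    # Blocking call layered on top: post, then drive progress until done.
    req = ctx.barrier_post()
    calls = 0
    while not req.done:
        ctx.progress()
        calls += 1
    return calls


ctx = Context()
req = ctx.barrier_post()
posted_done = req.done          # False: posting alone does not complete it
progress_calls = barrier_blocking(ctx)
```

The wrapper shows why a blocking call can be deferred: it adds nothing beyond a loop over progress. It does not capture the latency advantage reported for native blocking implementations, which is the reason blocking interfaces remain on the wish list.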

Reproducibility

  • Should reproducibility be a configuration option at the library level or at the team level? Library-level configuration seems to be enough.
  • NVIDIA to provide more input on their requirements
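
As background for this item, the sketch below shows why reproducibility is a concern for floating-point reductions: addition is not associative, so the order in which contributions are combined can change the result. The function `allreduce_sum` and its `reproducible` flag are hypothetical stand-ins for a library-level option. Forcing rank order is only one possible way to get reproducible results and is not taken from the discussion.

```python
def allreduce_sum(contribs, arrival_order, reproducible):
    """Sum per-rank contributions (simulated, single process).

    contribs      -- list indexed by rank
    arrival_order -- order in which contributions happen to arrive
    reproducible  -- hypothetical library-level flag: if True, always
                     combine in rank order regardless of arrival order
    """
    order = sorted(arrival_order) if reproducible else arrival_order
    total = 0.0
    for rank in order:
        total += contribs[rank]
    return total


# Values chosen so that rounding depends on the combination order.
contribs = [1e16, 1.0, -1e16, 1.0]

run_a = allreduce_sum(contribs, [0, 1, 2, 3], reproducible=False)
run_b = allreduce_sum(contribs, [0, 2, 1, 3], reproducible=False)

fixed_a = allreduce_sum(contribs, [0, 1, 2, 3], reproducible=True)
fixed_b = allreduce_sum(contribs, [0, 2, 1, 3], reproducible=True)
```

Here `run_a` and `run_b` differ (1.0 and 2.0) although the inputs are identical, while `fixed_a` and `fixed_b` agree. A fixed order typically limits which algorithms can be used, which is one reason the level at which the option is exposed matters.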

Next Meeting: Feb 26th, 2020

  • Potential agenda: Discussion of library initialization and local resource initialization.