Meeting 2021 09 02 - openpmix/openpmix GitHub Wiki

09/02/2021 PMIx call notes

Attendees

  • Michael Karo (Altair)
  • Charles Shereda (LLNL)
  • Ralph Castain (Nanook)
  • Thomas Naughton (ORNL)
  • Austen Lauria (IBM)

Notes

  • Issues raised in latest releases

    • Hit a race condition in tests (squyres & castain tracked it down)

    • Identified the race: prrte and pmix each have a progress thread, and if a tiny window is hit the two can deadlock

    • Added pmix fix, to be tested further

    • Next, working on the prrte progress thread side

    • Found a few IOF issues related to the debugger tests; the signatures are out of date (need to figure out why the signatures changed and what is best to do)

    • One of the output features was not reporting that it is deprecated; the option was being rejected without reporting why. Fixed.

    • Once these are fixed, will want to issue a bug-fix release

    • Q: Should these be fixed before ompi v5?

    • A: Yes. Have a fix for the pmix side; hope to have the prrte fix shortly.

    • Rough plan is a prrte rc late next week
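
The notes do not record the exact mechanism of the prrte/pmix race. As a generic illustration only, one common way two threads deadlock in a small window is when each holds one lock and waits for the other's. The sketch below assumes that lock-ordering shape (it is not the actual prrte or pmix code), and all names in it are hypothetical. Timeouts are used so the demonstration terminates instead of hanging.

```python
import threading


def run_demo(timeout=0.2):
    """Show two threads blocking each other by taking two locks in opposite order."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    barrier = threading.Barrier(2)
    results = {}

    def worker(name, first, second):
        with first:
            # Wait until both threads hold their first lock: this is the
            # "tiny window" in which neither can make progress.
            barrier.wait()
            got = second.acquire(timeout=timeout)
            results[name] = got
            # Keep holding the first lock until both attempts have finished,
            # so the outcome does not depend on which timeout expires first.
            barrier.wait()
            if got:
                second.release()

    t1 = threading.Thread(target=worker, args=("thread_1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("thread_2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


results = run_demo()
print(results)
```

Without the timeouts both threads would wait forever. A usual remedy for this particular shape is to have every thread take the locks in the same order, though whether that applies to the fix discussed here is not stated in the notes.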

  • Once these items are resolved, Ralph will return to the GPU distance/topo work, so it likely will not make the next release.

  • Raised question on OpenPMIx version numbering

    • Currently, the library numbering roughly mirrors the PMIx Standard number
    • As we have subreleases for the implementation (e.g., for new features), this creates skew between implementation and Standard numbering
    • Example: the gpu feature in openpmix would normally go into openpmix-4.2.x, but since that release supports PMIx v4.1 the numbers would be inconsistent.
    • Can look at the #define in the header to see which version of the Standard is supported.
    • Since there is a defined way to get the Standard version, it seems fine to number the library as needed/appropriate
    • Idea of keeping Standard versioning separate from library versioning
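
To make the separation concrete, here is a minimal sketch of reading a library version and a Standard version independently from header-style #define lines. The header text and every macro name in it are placeholders invented for this example; they are not confirmed names from the OpenPMIx headers. The version numbers follow the 4.2 library / v4.1 Standard example above.

```python
import re

# Illustrative header text. The macro names are placeholders for this sketch,
# not names taken from the real OpenPMIx headers.
SAMPLE_HEADER = """
#define EXAMPLE_LIB_VERSION_MAJOR 4
#define EXAMPLE_LIB_VERSION_MINOR 2
#define EXAMPLE_STD_VERSION_MAJOR 4
#define EXAMPLE_STD_VERSION_MINOR 1
"""

# Matches simple integer defines such as "#define NAME 4".
DEFINE_RE = re.compile(r"^[ \t]*#[ \t]*define[ \t]+(\w+)[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def read_integer_defines(header_text):
    """Return {macro_name: int_value} for simple integer #define lines."""
    return {name: int(value) for name, value in DEFINE_RE.findall(header_text)}


def version_pair(defines, prefix):
    """Build a (major, minor) tuple from <prefix>_MAJOR and <prefix>_MINOR."""
    return (defines[prefix + "_MAJOR"], defines[prefix + "_MINOR"])


defines = read_integer_defines(SAMPLE_HEADER)
library_version = version_pair(defines, "EXAMPLE_LIB_VERSION")
standard_version = version_pair(defines, "EXAMPLE_STD_VERSION")


def supports_standard(required):
    """True if the advertised Standard version is at least `required`."""
    return standard_version >= required


print("library", library_version, "standard", standard_version)
```

With the two numbers read separately, a caller can check Standard support (for example `supports_standard((4, 1))`) without drawing any conclusion from the library number.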
  • How can we help keep efforts in the Standard moving?

    • How to keep pushing so the Standard stays active and in sync with the implementation, and avoids stagnation
    • Think about good ways to keep things vital
  • Python w/ pmix

    • mattb identified a corner case with thread locking between the python and pmix handlers. Sort of rare, but working to get it nailed down and will bring it up for discussion/feedback.
  • Next meeting in two weeks, 16-Sep