Meeting 2024 04 04 - openpmix/openpmix GitHub Wiki

April 4, 2024 OpenPMIx-devel call notes

Attendees

  • Ralph Castain
  • Rajat Bhattarai
  • Michael Karo (Altair)
  • Thomas Naughton (ORNL)
  • Aurelien Bouteiller (UTK)

Notes

  • OMPI PRRTE

    • Discussions to pull runtime environment back into OMPI
    • Part of the reason is to give OMPI closer control/ownership of the RTE, while still allowing for experimentation with PMIx-Server
    • OMPI has cloned a point in time of PRRTE, with longer term maintenance being worked out over time
    • The openpmix version of PRRTE will focus on supporting the more research-oriented, experimental aspects
  • OMPI “no-modex” option

    • Came up while looking into heterogeneous collectives: it appeared that "no-modex" had possibly been removed
    • That would be problematic because the option is helpful for workflows and can speed up startup.
    • Turns out the option is still there, but the MCA params were renamed from PMIX_xxx to MPI_xxx without linking the new names back to the old ones, so legacy MCA param files were being ignored.
    • A fix will be added to recognize the old param names
    • Question about the large-scale runtime issues (large node count)
    • It would be good to revisit these and do a scrub to see where unnecessary delays may exist
  • Group support redo

    • will likely break cross-version support for group ops; set up to return not-supported in that case
    • add bootstrap operation
    • fix data exchange
    • add operations for adding new group members when some have different-sized member info
    • Ralph has an initial cut and plans to refactor before bringing it into the repo; when done, it could be brought into OMPI's connect/accept code with a simpler function call to help clean up that group membership info
    • Docs will be included in the commit to provide better explanation
  • Conda package

    • request from Nvidia for Legate/Legion/Realm support
    • Add native PMIx support to Kubernetes
    • conda package requested to help facilitate this work
    • see pmix-feedstock for new conda repo for PMIx+Kubernetes
  • Scheduler support

    • working on getting things finalized
    • working on paper
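
The renamed-parameter problem noted under the "no-modex" item (old PMIX_xxx names no longer recognized after the rename to MPI_xxx) is commonly handled with an alias table that maps legacy names to current ones. The following is only a minimal sketch of that general idea, not the actual Open MPI fix: the parameter names `PMIX_example_flag` and `MPI_example_flag`, the table `LEGACY_ALIASES`, and the function `normalize_params` are all hypothetical placeholders.

```python
# Hypothetical alias table: legacy parameter name -> current name.
# These names are placeholders, not real Open MPI MCA parameters.
LEGACY_ALIASES = {
    "PMIX_example_flag": "MPI_example_flag",
}


def normalize_params(params, aliases=LEGACY_ALIASES):
    """Return (normalized, warnings) with legacy names mapped to current ones.

    A value set under the current name takes precedence over a value
    set under the legacy name.
    """
    normalized = {}
    warnings = []
    # First pass: copy every parameter that is not a legacy name.
    for name, value in params.items():
        if name not in aliases:
            normalized[name] = value
    # Second pass: map legacy names, unless the current name was set explicitly.
    for name, value in params.items():
        if name in aliases:
            current = aliases[name]
            warnings.append(f"{name} is deprecated; use {current}")
            normalized.setdefault(current, value)
    return normalized, warnings
```

With a table like this, a setting read from an old param file is still honored and the user gets a deprecation notice instead of having the setting silently dropped.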