Meeting 2019 12 12 - openpmix/openpmix GitHub Wiki

  • PMIx v3.1.5
    • Pick up a number of fixes, including memory-related fixes
    • https://github.com/openpmix/openpmix/pull/1554
      • Artem has concerns about whether this is completely correct
      • The resource manager (RM) is responsible for calling deregister_namespace so that the namespace actually gets cleaned up. The RM is also responsible for tracking 'connected' and for deregistering once appropriately disconnected in the session.
      • The PMIx server does not have internal security protections to isolate two users' allocations from each other. If the RM requires this isolation, then it needs to create two PMIx server instances.
        • It's the responsibility of the system to do this
    • Tools support issue?
      • Already in 3.1.4 so would be no worse in a 3.1.5
      • No known problems at this time, but the concern is that this feature needs more exercise. Open MPI will exercise it, so it may serve as a test vector.
    • Possible Blockers?
      • Double check the memory leak issue resolution (Dave/Artem to discuss the situation)
      • Multiple init/finalize - would like to see this fixed if possible
      • Bring in any tool related fixes that are ready.
        • Would like to fully exercise the tools as much as possible for this release.
      • Static 'get': is this targeted at v3.x or v4.x? Probably v4.x only.
    • Josh to roll an rc1 this week - note that further fixes may be coming.
      • Target release: Decide next week if we release before the end of the year or not.
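The division of responsibility noted under pull request 1554 (the RM tracks 'connected' and deregisters once appropriately disconnected) can be illustrated with a small model. This is only a sketch under assumptions: `NamespaceTracker`, its method names, and the `deregister` callback are hypothetical and are not part of PMIx, and the rule "defer deregistration while a finished namespace still has connected peers" is one reading of the note, not confirmed behavior.

```python
class NamespaceTracker:
    """Hypothetical RM-side bookkeeping (not a PMIx API)."""

    def __init__(self, deregister):
        # Callback standing in for whatever the RM uses to ask the
        # PMIx server to deregister a namespace (assumption).
        self._deregister = deregister
        self._peers = {}        # nspace -> set of nspaces it is connected to
        self._finished = set()  # nspaces whose job has ended

    def register(self, nspace):
        self._peers.setdefault(nspace, set())

    def connect(self, a, b):
        self._peers.setdefault(a, set()).add(b)
        self._peers.setdefault(b, set()).add(a)

    def disconnect(self, a, b):
        self._peers[a].discard(b)
        self._peers[b].discard(a)
        self._maybe_deregister(a)
        self._maybe_deregister(b)

    def job_finished(self, nspace):
        self._finished.add(nspace)
        self._maybe_deregister(nspace)

    def _maybe_deregister(self, nspace):
        # Deregister only when the job has ended AND no connections remain.
        if (nspace in self._finished
                and nspace in self._peers
                and not self._peers[nspace]):
            del self._peers[nspace]
            self._finished.discard(nspace)
            self._deregister(nspace)
```

In this model the server-side call is made exactly once per namespace, and only after the RM has seen both the end of the job and the last disconnect.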
  • Multi-server make check
  • v3.2 rebranch
    • Mellanox to check if they have a near term driver for this.
    • Not critical - can wait on the dstore optimizations in v4.0 (1H'2020)
    • For now we will hold v3.2 - if the situation changes, let us know
  • v4.0.x progress
    • Working through testing tool support (similar to discussion above for 3.1.5)
    • Python bindings coming along.
    • 2-3 months away from release
  • Standard clarification
    • Persistence option for publishing data
      • Persist 'app' and 'session' - should there be a 'job'?
        • Persist 'app' - if one app in the job dies, its data is no longer available
        • Persist 'job' - once the job dies, the data goes away (should be the default)
        • Persist 'session' - once the session goes away, the data goes away
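The three persistence scopes discussed above can be made concrete with a small model. This is a sketch of the semantics as described in these notes, not of any PMIx implementation: `Persist`, `Owner`, `Store`, and `end_scope` are hypothetical names, and using 'job' as the default reflects the proposal in the notes rather than standardized behavior.

```python
from dataclasses import dataclass, field
from enum import Enum


class Persist(Enum):
    APP = 0      # data goes away when the publishing app ends
    JOB = 1      # proposed: data goes away when the publishing job ends
    SESSION = 2  # data goes away when the session ends


@dataclass(frozen=True)
class Owner:
    session: str
    job: str
    app: str


def _scope_id(owner, level):
    # Identify the app, job, or session that an owner belongs to.
    if level is Persist.APP:
        return (owner.session, owner.job, owner.app)
    if level is Persist.JOB:
        return (owner.session, owner.job)
    return (owner.session,)


@dataclass
class Store:
    entries: dict = field(default_factory=dict)  # key -> (value, owner, persist)

    def publish(self, key, value, owner, persist=Persist.JOB):
        self.entries[key] = (value, owner, persist)

    def lookup(self, key):
        entry = self.entries.get(key)
        return None if entry is None else entry[0]

    def end_scope(self, level, ended):
        # Ending a scope drops data persisted at that level or narrower,
        # e.g. a job ending removes both 'app' and 'job' data it owns.
        doomed = [
            key for key, (_, owner, persist) in self.entries.items()
            if persist.value <= level.value
            and _scope_id(owner, level) == _scope_id(ended, level)
        ]
        for key in doomed:
            del self.entries[key]
```

With 'app' persistence, data published by one app disappears as soon as that app ends, even though the rest of the job is still running; with 'job' persistence it survives until the whole job ends, which is the motivation given for making 'job' the default.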
  • PRRTE