WeeklyTelcon_20160315 - ICLDisco/ompi GitHub Wiki
Open MPI Weekly Telcon
Dialup Info: (Do not post to public mailing list or public wiki)
Attendees
Jeff Squyres
Geoff Paulsen
Brad Benton
Howard
Josh Hursey
Nathan Hjelm
Ralph
Ryan Grant
Sylvain Jeaugey
Todd Kordenbrock
Yohann Burette
Agenda
Review 1.10
Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
PR 1004 - MPI_Ineighbor_alltoallw
PR 1002 - Memory allocation hooks
PR 1006 - Ralph will review.
PR 1008 - Jeff already reviewed.
Is anyone testing on PSM2 or Omni-Path? The Intel folks have been. Howard - Ralph to check this.
The symbol confusion fix requires 1.10.2 and a newer Omni-Path (PSM2) library/driver.
1.10.3 in late April - some smaller fixes are accumulating in the branch
Nothing critical at the moment
Review 2.0.x
Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
Issue 1425 - External PMIx server support
Ralph working on a fix - should be quick.
Issue 1418 - fix MPI process suicide code
Issue 1406 - TCP BTL THREAD_MULTIPLE deadlock
Nathan - George is working on a fix, but it is a rewrite, so it might take some time.
If the old TCP BTL and the rewrite are "compatible", then we can switch between them based on the threaded state at MPI_Init.
Otherwise, we could require two different TCP components, "tcp" / "tcpmt", and expose this issue to users.
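A minimal sketch of the "switch based on threaded state" idea, assuming the only input is the thread level granted at initialization. This is not Open MPI's component selection code: the `thread_level_t` enum and `select_tcp_component` are hypothetical names for illustration, and only the component names "tcp" and "tcpmt" come from the notes above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the MPI thread levels; real code would use
 * the MPI_THREAD_* constants from mpi.h. */
typedef enum {
    THREAD_SINGLE,
    THREAD_FUNNELED,
    THREAD_SERIALIZED,
    THREAD_MULTIPLE
} thread_level_t;

/* Pick a TCP component name from the thread level granted at init time.
 * Assumption: only full THREAD_MULTIPLE needs the rewritten component;
 * every lower level keeps using the existing one. */
const char *select_tcp_component(thread_level_t provided)
{
    assert(provided <= THREAD_MULTIPLE);
    return (provided == THREAD_MULTIPLE) ? "tcpmt" : "tcp";
}
```

With this rule, users never see the second component. The two-component alternative would instead leave the choice of "tcp" or "tcpmt" to the user.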
Issue 1353 - -host behavior
Ticket has been updated with new commits. Jeff to test.
Ralph is out of time to work on this.
Need to document behavior in different releases, then close this for v2.0.0.
Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
PR 1003 - Race condition in process matching thread
Nathan to review the patch and sign off.
PR 1000 - misc warnings and missing include files
Ralph to review and update
PR 977 - span on heterogeneous clusters
Ralph to review and update
PR 973 - Parsing of envvars in MCA
Nathan pushing update, Jeff to review after the call.
Reviewed all Pull Requests, and pinged a few for comments.
Really need more people doing testing on v2.0 branch
Decision to shoot for April 24th for RC0 of 2.0.
Jeff saw that someone is seeing OPAL-FIFO fail on 2.0.
Need more thread safety testing.
Jeff added that his thread safety tests take a two-night cycle to run in full.
Jeff is seeing SIGPIPEs ONLY on master with usNIC - Ralph wonders if it's the valgrind issue Mellanox was seeing.
Misc
MPI Forum - MPI_Info under discussion
Don't propagate infos with MPI_Comm_dup - use MPI_Comm_dup_with_info to propagate infos
MPI_Info_get/MPI_Info_set behavior is still under discussion
Open MPI Developer's Meeting
Nathan - Enabling thread_multiple all the time
PR 1397 - always enable MPI_THREAD_MULTIPLE support
Should we turn this on for everyone? The general feeling is to accept this.
Send another note to the devel list to give folks one last chance to comment before commit.
Need to do some performance testing
v2.0.0: better MPI_THREAD_MULTIPLE correctness
Focus on performance improvements in the next v2.x series release
Review Master?
https://github.com/open-mpi/ompi/pull/1417 : "RFC: change default build to always be optimized (even for developers)" If no one has any further comments, it's time to merge.
Jeff thinks we have consensus, but wants to check with developers.
Developers would then need to turn on --enable-debug, --enable-mem-debug, and --enable-picky explicitly.
Nathan - heads up that the mpool rewrite is ready to go.
It will be merged this afternoon.
MTT status:
Really need more people doing testing on v2.0 branch
Status Updates:
Status Update Rotation
Cisco, ORNL, UTK, NVIDIA
Mellanox, Sandia, Intel
LANL, Houston, IBM