Jeff's cluster had some timeouts on the 10.10 network (no route to host), maybe a cluster config problem. Ignore for now. Weird: "no route to host" should FAIL, not time out. TCP BTL. Maybe a multirail issue. Infinite loop rather than a failure.
Don't block the 1.10 release, since this is probably not a common use case.
Paul found some issues.
NAG Fortran support in configure isn't right.
NOT a regression (in 1.8). Do we care if this is a blocker?
mpirun hangs after a good run. Only on SLES. Also on Cray (which uses SLES).
proc is defunct / zombied.
IO Forwarding file descriptors may not be getting HANGUP.
Not a Regression, 1.10, 2.0, master. ORTED in event library. Pretty serious, but hoping it's just SLES issue.
Set state machine verbosity to 5.
Nathan will look at SLES 11 and SLES 12 (Different kernels even, very different)
Try to find out whether it's SIGCHLD or a file descriptor.
Hold 1.10.2 release until Nathan runs tests today.
Network atomics are not necessarily visible to / interchangeable with CPU atomics.
Progress issue. Nathan proposed adding a decay function to progress function dispatch, so that components that are not progressing anything naturally drop in priority.
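One way to picture the decay idea is a per-callback back-off: a callback that reports no events gets polled half as often on each idle call, and goes back to being polled on every pass as soon as it reports work. This is only a minimal sketch under assumptions, not the proposed implementation; the names progress_entry_t, progress_pass, MAX_INTERVAL and the demo callbacks are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap on how rarely an idle callback is polled. */
#define MAX_INTERVAL 64u

/* A progress callback returns the number of events it completed. */
typedef int (*progress_fn_t)(void);

typedef struct {
    progress_fn_t fn;
    unsigned interval;   /* poll fn once every `interval` passes */
    unsigned countdown;  /* passes remaining until the next poll */
} progress_entry_t;

static void progress_entry_init(progress_entry_t *e, progress_fn_t fn)
{
    assert(e != NULL && fn != NULL);
    e->fn = fn;
    e->interval = 1;
    e->countdown = 1;
}

/* One pass over all registered callbacks; returns total events completed. */
static int progress_pass(progress_entry_t *entries, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        progress_entry_t *e = &entries[i];
        if (--e->countdown > 0) {
            continue;                 /* decayed: skip this pass */
        }
        int events = e->fn();
        if (events > 0) {
            e->interval = 1;          /* active again: poll every pass */
            total += events;
        } else if (e->interval < MAX_INTERVAL) {
            e->interval *= 2;         /* idle: back off exponentially */
        }
        e->countdown = e->interval;
    }
    return total;
}

/* Demo callbacks (illustration only). */
static int demo_idle_calls = 0;
static int demo_idle(void) { demo_idle_calls++; return 0; }
static int demo_busy(void) { return 1; }
```

With one always-idle and one always-busy callback registered, the busy one is polled on every pass while the idle one is polled on passes 1, 3, 7, 15, and so on. The cap bounds the extra latency an idle component sees when work finally arrives for it.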
Review Master
General Discussion
Debian/Ubuntu package support
Ubuntu doesn't have a maintainer anymore for Open MPI. Package not officially "orphaned".
Then when it gets adopted, we could adopt it. Nathan has been
Old maintainer has a repo, and a bunch of patches, which no one in community has ever looked at.
Sent directions to Ralph on his directions, but quite complex.
Send request that the package get correctly orphaned.
Geoffrey Vallee is willing to take over as official maintainer.
MTT status:
Status Updates:
Cisco - nothing OMPI specific to report. Please go sign up for face to face on wiki.
ORNL - MTT - running. Announced today that they'd be picking up Debian package maintenance of Open MPI.
NVIDIA - got MTT back to normal or close to that.
Couple of things failing when enabling GPU Direct RDMA. Has something to do with atomic operations.
Can turn off atomic operations via an MCA parameter. Look at the bit flags in the ompi_info output for the openib BTL.
Turn off the fetching ops and atomic ops (find the bit values, calculate the new flags value with those bits cleared, and set the parameter to that value).
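The flag arithmetic described above can be sketched as follows. The two bit values here are HYPOTHETICAL placeholders, not the real openib BTL values; the real bits and the current flags value have to be read from the ompi_info output.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL bit positions, for illustration only.
 * Substitute the values reported by ompi_info for the openib BTL. */
#define EX_FLAG_ATOMIC_OPS   0x0800u  /* plain atomic operations */
#define EX_FLAG_ATOMIC_FOPS  0x1000u  /* fetching atomic operations */

/* Return `flags` with both atomic capability bits cleared;
 * every other bit is left exactly as it was. */
static uint32_t clear_atomic_flags(uint32_t flags)
{
    const uint32_t mask = EX_FLAG_ATOMIC_OPS | EX_FLAG_ATOMIC_FOPS;
    uint32_t cleared = flags & ~mask;
    assert((cleared & mask) == 0);
    return cleared;
}
```

For example, with these placeholder bits a current value of 0x1c0f becomes 0x040f. The result would then be supplied as the new value of the BTL's flags MCA parameter (presumably btl_openib_flags; confirm the exact name in the ompi_info output).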