RMA WG 09 27 2018 - openshmem-org/specification GitHub Wiki
Agenda
Memory model update (Anshuman)
Committee meeting follow-up
Attendees
David Ozog, Jim Dinan (Intel)
Anshuman Goswami (Nvidia)
Naveen, Bob (Cray)
Shamis, Pavel (Arm)
Gorentla Venkata, Manjunath (Mellanox)
Min Si, Huansong Fu (ANL)
Grossman, Max (Rice U)
Notes
Dave’s proposal on wait-until-some and test-some APIs
(Jim) We could do a practice reading in the next RMA WG meeting.
(Min) Sharing the PDF before the meeting would be great.
Naveen’s proposal on put-with-signal APIs
(Manju) We should not use a union for both int64_t and uint64_t.
(Pasha) There is no need to change to int64_t if we are not using increment, and a union complicates things.
(Naveen) The plan is to remove the restrict qualifier. In the future, when we want to describe how the dest and src buffers can overlap, we can remove the current description for put-signal and add a generic one.
(Jim) We could do a reading about this in the next RMA WG meeting.
(Manju) I do not see the upside of combining the two current proposals, as Naveen asks.
(Jim) We can do a special ballot or read the proposal again if the two are combined.
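As an illustration of the put-with-signal semantics discussed above, here is a minimal single-process sketch in C11. It is not the proposed OpenSHMEM interface: `model_put_signal` and `model_signal_arrived` are hypothetical names, and a local `memcpy` followed by a release store stands in for the remote transfer. It assumes a plain `uint64_t` signal word (no union) and non-overlapping dest and src buffers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local model of put-with-signal (NOT the OpenSHMEM API):
 * deliver the payload first, then publish the signal value. */
static void model_put_signal(void *dest, const void *src, size_t nbytes,
                             _Atomic uint64_t *sig, uint64_t value)
{
    assert(dest != NULL && src != NULL && sig != NULL);
    /* Assumption: dest and src do not overlap (memcpy requires this). */
    memcpy(dest, src, nbytes);
    /* The release store orders the payload copy before the signal update. */
    atomic_store_explicit(sig, value, memory_order_release);
}

/* Returns nonzero once the signal word holds the expected value.
 * The acquire load pairs with the release store above, so a reader that
 * sees the signal may then read the payload. */
static int model_signal_arrived(_Atomic uint64_t *sig, uint64_t expected)
{
    return atomic_load_explicit(sig, memory_order_acquire) == expected;
}
```

In this model the receiver polls `model_signal_arrived` and only then reads the destination buffer; the signal is a single unsigned 64-bit word, which is the simpler of the options raised in the discussion.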
Anshuman’s ticket on correct pt-to-pt synchronization:
(Anshuman introduces the background of the ticket) The goal of the ticket is to agree on the list of APIs that are allowed to signal to wait-until and test for p2p synchronization.
(Pasha) Why are MPI operations excluded?
(Pasha) We should better distinguish between atomic operations and single-copy puts.
(Pasha) Different memory fabrics have different single-copy atomicity support.
(Anshuman) Should we allow A and C (see issue #248) to conflict?
(Pasha) Agree that B should be excluded.
(Manju) A needs single-copy atomicity.
(Pasha) More homework is needed on the network spec before placing a single-copy atomicity guarantee on A.
(Jim) If we mix other operations with AMOs on the signal memory location, the behavior should be undefined.
(Jim) We do not actually want to allow A/B/C/D to be used simultaneously; the spec already says that the behavior is undefined, so there is no need to support it.
(Min) Why not let the implementation choose to optimize the current atomic_set if it wants single-put atomicity?
(Anshuman) Then atomic_set may no longer be a read-modify-write operation.
(Pasha) Single-copy atomicity is not really atomic.
(Jim) We should enumerate which combinations have well-defined behavior and see whether shmem_wait can deal with each of them.
(Jim) A primary concern with B is that the network might write to the same buffer again before the operation actually completes, as in the case of retransmission when a CRC check fails.
(Jim) An argument to keep A in the list is that A has been used for signaling for a long time.
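The point-to-point synchronization pattern under discussion can be sketched with a local C11 model. This is an analogy only, under stated assumptions: the `model_*` names and the `model_cmp_t` enum are hypothetical, C11 atomics on one process stand in for remote operations, and the signal word is updated exclusively through atomic operations (the case the notes treat as well-defined). It does not model single-copy puts or retransmission.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical comparison operators for the model. */
typedef enum { MODEL_CMP_EQ, MODEL_CMP_NE, MODEL_CMP_GE } model_cmp_t;

static int model_compare(uint64_t v, model_cmp_t cmp, uint64_t target)
{
    switch (cmp) {
    case MODEL_CMP_EQ: return v == target;
    case MODEL_CMP_NE: return v != target;
    case MODEL_CMP_GE: return v >= target;
    }
    return 0;
}

/* Non-blocking check, analogous in spirit to a "test" call. */
static int model_test(_Atomic uint64_t *ivar, model_cmp_t cmp, uint64_t target)
{
    assert(ivar != NULL);
    uint64_t v = atomic_load_explicit(ivar, memory_order_acquire);
    return model_compare(v, cmp, target);
}

/* Blocking check, analogous in spirit to "wait-until": spins until true. */
static void model_wait_until(_Atomic uint64_t *ivar, model_cmp_t cmp,
                             uint64_t target)
{
    while (!model_test(ivar, cmp, target)) {
        /* spin; a real implementation would also make network progress */
    }
}

/* Signaler side: only atomic updates touch the signal word. */
static void model_atomic_set(_Atomic uint64_t *ivar, uint64_t value)
{
    atomic_store_explicit(ivar, value, memory_order_release);
}

/* Atomically increments the signal word and returns its previous value. */
static uint64_t model_atomic_fetch_inc(_Atomic uint64_t *ivar)
{
    return atomic_fetch_add_explicit(ivar, 1, memory_order_acq_rel);
}
```

Because every update to the signal word in this sketch is atomic, the waiter never observes a partially written value. Whether a plain put can offer the same guarantee depends on the single-copy atomicity of the underlying fabric, which is the open question in the notes above.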