UCP flows draft - openucx/ucx GitHub Wiki

bootstrap flow:

worker_get_address

GOAL: find a bootstrap transport in each "domain" and use the one with the lowest latency.

  1. a bootable transport satisfies the following:

    1. supports connect_to_iface
    2. supports short active message
  2. For each device, select the bootable transport with the lowest latency. Assumption: if a remote device is reachable with one transport, it is also reachable with the other transports which run on the same local device.

  3. if a transport is selected more than once, use tl->iface_is_reachable() to eliminate devices which are on the same network.

  4. return a blob which contains records like this:

    • transport name
    • performance attributes: latency, bandwidth
    • iface address
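
The selection in steps 1, 2 and 4 above can be sketched as follows. This is only an illustration, not UCP code: the `Transport` record, its field names, and both helper functions are hypothetical, and step 3 (the `iface_is_reachable()` elimination) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transport:
    # Hypothetical record; field names are assumptions made for this sketch.
    name: str
    device: str
    latency: float           # lower is better
    bandwidth: float
    connect_to_iface: bool   # step 1.1
    short_am: bool           # step 1.2
    iface_addr: bytes


def is_bootable(tl):
    """A transport is bootable if it supports connect_to_iface and short AM."""
    return tl.connect_to_iface and tl.short_am


def select_bootstrap_transports(transports):
    """Step 2: for each device keep the bootable transport with lowest latency."""
    best = {}
    for tl in transports:
        if not is_bootable(tl):
            continue
        current = best.get(tl.device)
        if current is None or tl.latency < current.latency:
            best[tl.device] = tl
    return best


def address_records(selected):
    """Step 4: one record per selected transport (name, performance, address)."""
    return [
        {
            "transport": tl.name,
            "latency": tl.latency,
            "bandwidth": tl.bandwidth,
            "iface_addr": tl.iface_addr,
        }
        for tl in selected.values()
    ]
```

A real implementation would serialize these records into a single binary blob; the list of dictionaries here only shows which fields each record carries.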

ep_connect

GOAL: find best method to reach the destination address, for each of: short_am, long_am, rma, amo.

  1. find the bootstrap method (will not necessarily be used):
    for each local resource:
      for each remote address:
        if reachable and can-connect-to-iface:
          compute-score
    save-the-combination-which-had-best-score-amongst-which-can-connect-to-iface => "bootstrap"

  2. find methods for each class:
    for "class" in [short_am, long_am, rma, amo]:
      for each local resource:
        for each remote address:
          if reachable:
            compute score based on local and remote capabilities / performance attributes.
      use-the-combination-which-has-the-best-score
      if the selected resource can connect to iface address:
        connect directly
      else:
        bootstrap[class] = selected-resource (local-index & remote-address)
        need-bootstrap = 1

if need-bootstrap:
  start-bootstrap
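
A minimal Python sketch of the two selection passes above. All names are hypothetical, and two things the notes leave open are filled with placeholders: `reachable()` simply compares transport names, and `score()` uses only latency.

```python
from dataclasses import dataclass

CLASSES = ("short_am", "long_am", "rma", "amo")


@dataclass(frozen=True)
class Resource:
    # Hypothetical local resource; fields are assumptions for this sketch.
    name: str
    caps: frozenset
    connect_to_iface: bool
    latency: float


@dataclass(frozen=True)
class RemoteAddress:
    # Hypothetical remote address record.
    name: str
    caps: frozenset
    latency: float


def reachable(local, remote):
    # Placeholder assumption: reachable when both sides use the same transport.
    return local.name == remote.name


def score(local, remote):
    # Placeholder assumption: lower latency gives a higher score.
    return -max(local.latency, remote.latency)


def select_bootstrap_method(local_resources, remote_addresses):
    """Step 1: best-scoring pair among those that can connect to iface."""
    best = None
    for li, local in enumerate(local_resources):
        for remote in remote_addresses:
            if reachable(local, remote) and local.connect_to_iface:
                s = score(local, remote)
                if best is None or s > best[0]:
                    best = (s, li, remote)
    return None if best is None else (best[1], best[2])


def select_methods(local_resources, remote_addresses):
    """Step 2: best pair per class; split into direct and bootstrap-needed."""
    direct, bootstrap = {}, {}
    for cls in CLASSES:
        best = None
        for li, local in enumerate(local_resources):
            for remote in remote_addresses:
                if not reachable(local, remote):
                    continue
                if cls not in local.caps or cls not in remote.caps:
                    continue
                s = score(local, remote)
                if best is None or s > best[0]:
                    best = (s, li, remote)
        if best is None:
            continue  # no usable combination for this class
        _, li, remote = best
        if local_resources[li].connect_to_iface:
            direct[cls] = (li, remote)      # connect directly
        else:
            bootstrap[cls] = (li, remote)   # needs an ep address exchange
    need_bootstrap = bool(bootstrap)
    return direct, bootstrap, need_bootstrap
```

The caller would start the bootstrap process only when `need_bootstrap` is true, using the pair returned by `select_bootstrap_method()` to carry the exchange.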

bootstrap process

GOAL: exchange endpoint-specific address

  • send active message to remote-address:
    • my iface address is X
    • my ep address is Y

  • remote also sends:
    • my iface address is X
    • my ep address is Y

  • when getting a message from remote side:

    • search in hash table, key is: iface address
      • if exists: connect
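
The exchange above can be sketched with a dictionary standing in for the hash table. The `Worker` class, its methods, and the message layout are hypothetical; the notes do not say what happens when no entry is found, so the sketch only reports that case and does nothing else.

```python
class Worker:
    """Toy model of one side of the bootstrap exchange (illustrative only)."""

    def __init__(self, iface_addr):
        self.iface_addr = iface_addr
        self.pending = {}   # hash table: remote iface address -> local endpoint
        self.next_ep = 0

    def start_bootstrap(self, remote_iface_addr):
        """Create a local endpoint and build the message sent to the peer."""
        ep = {"addr": (self.iface_addr, self.next_ep), "peer": None}
        self.next_ep += 1
        self.pending[remote_iface_addr] = ep
        return {"iface_addr": self.iface_addr, "ep_addr": ep["addr"]}

    def on_message(self, msg):
        """Look up the sender's iface address; connect if an entry exists."""
        ep = self.pending.get(msg["iface_addr"])
        if ep is None:
            return False    # behaviour for this case is not covered in the notes
        ep["peer"] = msg["ep_addr"]   # "connect" to the remote ep address
        return True
```

For example, if workers `A` and `B` each call `start_bootstrap()` with the other's iface address and then feed the resulting messages to each other's `on_message()`, both endpoints end up holding the peer's ep address.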

ucp problems - tag matching

  • need to pass sender ep id from sender to receiver so the receiver could send replies:
    • rndv protocol
    • credits
    • cancel protocol
    • active message based protocols

options:

- pass ep_id as part of the protocol for relevant messages
    - will it be needed for RMA/AMO emulation as well? yes, need to know to whom to send
    - maybe could avoid TID for RMA/AMO, assuming order?
    - need TID+Sender only for: GET, FADD, SWAP, CSWAP

- pass uct_ep_h to AM callback
    - could be temporary handle
    - ud - already have it. rc - fast lookup. dc - need to build AH.
    - for dc - need to pass DCT number as immediate; AH could be created from LID.
    - could have a "send" flag indicating whether the transport should pass a "sender handle" to the AM callback.
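
As an illustration of the first option (carrying the sender's ep_id in the protocol), the sketch below packs a header that includes the sender ep_id and a TID only for the operations listed as needing a reply. The header layout, field widths, and opcode values are assumptions made for the example, and `PUT` and `ADD` are included only as examples of operations without a reply.

```python
import struct

# Operations that need TID + sender, per the notes above.
NEEDS_REPLY = {"GET", "FADD", "SWAP", "CSWAP"}

# Hypothetical opcode numbering, chosen only for this sketch.
OPCODES = {"PUT": 0, "GET": 1, "ADD": 2, "FADD": 3, "SWAP": 4, "CSWAP": 5}
NAMES = {code: name for name, code in OPCODES.items()}

_BASE = struct.Struct("<B")     # opcode only
_REPLY = struct.Struct("<BII")  # opcode, sender ep_id, TID


def pack_header(op, ep_id=None, tid=None):
    """Build the header; ep_id and tid are required only for replying ops."""
    code = OPCODES[op]
    if op in NEEDS_REPLY:
        if ep_id is None or tid is None:
            raise ValueError(f"{op} requires ep_id and tid")
        return _REPLY.pack(code, ep_id, tid)
    return _BASE.pack(code)


def unpack_header(buf):
    """Return (op, ep_id, tid); ep_id and tid are None for one-way ops."""
    (code,) = _BASE.unpack_from(buf)
    op = NAMES[code]
    if op in NEEDS_REPLY:
        _, ep_id, tid = _REPLY.unpack_from(buf)
        return op, ep_id, tid
    return op, None, None
```

With this layout a one-way operation costs a single header byte, while a replying operation costs nine, which is the trade-off behind asking whether the TID can be avoided when ordering is guaranteed.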