UCX Memory management - openucx/ucx GitHub Wiki

Inter-process memory-to-memory communication channels

|      | Host | GPU |
|------|------|-----|
| Host | mmap, cma, knem, xpmem, verbs | verbs/cudaIPC |
| GPU  | verbs/cudaIPC | verbs/cudaIPC |

Same node:
- CPU<->CPU - mmap,sysv,cma,knem,xpmem
- GPU<->GPU - cudaIPC + host-assisted pipelined cudaMemcpy
- GPU<->CPU - same as GPU<->GPU

Different nodes:
- CPU<->CPU - IB/Cray/TCP/...
- GPU<->GPU - IB/... - GPUDirect or host-assisted pipelined copy
- GPU<->CPU - IB/... - GPUDirect or host-assisted pipelined copy
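
The combinations above amount to a lookup keyed on the two memory types and on whether the peers share a node. A minimal sketch in C, assuming a hypothetical `mem_type_t` enum and `candidate_channels` helper (neither is a UCX identifier), might look like this:

```c
#include <stddef.h>

/* Hypothetical memory-type tags for illustration; not UCX identifiers. */
typedef enum { MEM_HOST, MEM_GPU } mem_type_t;

/*
 * Return the candidate channels listed on this page for a transfer between
 * two memory types, either on the same node or across nodes.
 */
static const char *candidate_channels(mem_type_t src, mem_type_t dst,
                                      int same_node)
{
    /* Any transfer touching GPU memory takes the GPU-aware path. */
    int gpu_involved = (src == MEM_GPU) || (dst == MEM_GPU);

    if (same_node) {
        return gpu_involved ? "cudaIPC + host-assisted pipelined cudaMemcpy"
                            : "mmap, sysv, cma, knem, xpmem";
    }
    return gpu_involved ? "IB/... with GPUDirect or host-assisted pipelined copy"
                        : "IB/Cray/TCP/...";
}
```

The returned strings only restate the lists above; a real selection layer would also have to probe which of these channels are actually available at run time.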

Open Question:

How do we support GPUDirect capabilities for all of the above combinations? The idea is to come up with a modular design that abstracts the differences between memory types.

Managed Memory was introduced in CUDA 6.0, and Pascal is expected to add hardware support for transparently treating malloc-ed memory as GPU memory. The fallout effects of this on the design need to be investigated.
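
As an illustration only, and not a design this page proposes, one common way to abstract over memory types in C is a per-type table of function pointers. The names `mem_ops_t`, `host_mem_ops`, and `mem_copy` below are hypothetical, and only the host variant is implemented:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-memory-type operations table; names are illustrative. */
typedef struct mem_ops {
    const char *name;
    /* Copy len bytes from src to dst; returns 0 on success. */
    int (*copy)(void *dst, const void *src, size_t len);
} mem_ops_t;

/* Host memory: a plain memcpy is sufficient. */
static int host_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Host implementation; a GPU variant would supply its own copy routine. */
static const mem_ops_t host_mem_ops = { "host", host_copy };

/* Generic code path: callers never branch on the memory type themselves. */
static int mem_copy(const mem_ops_t *ops, void *dst, const void *src,
                    size_t len)
{
    return ops->copy(dst, src, len);
}
```

With this shape, supporting another memory type means adding one more table rather than touching the callers, which is the kind of modularity the question above asks for.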