
Subsystem: RDMA memory management

Alex Forencich edited this page Jul 28, 2021 · 2 revisions

Protection domains, memory regions, and address translation are important components of the RDMA hardware stack. Since applications interfacing with the NIC operate in their own virtual address spaces, all pointers passed to the hardware must be translated to physical addresses. Additionally, all accesses, both local and remote, must be validated against the registered memory regions. See sections 10.2.3 and 10.6 of Volume 1 of the InfiniBand specification (https://cw.infinibandta.org/document/dl/8567).

The memory management implementation will process all DMA transfer requests, verifying that each request and associated lkey/rkey falls within the target memory region and performing the appropriate address translation.
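Conceptually, the per-request check and translation reduce to a key/PD/permission test plus a bounds check, followed by a base-offset computation. A minimal sketch in Verilog (the memory region record fields, signal names, and the assumption of a physically contiguous region are all illustrative, not the actual implementation):

```verilog
// Hypothetical fields of a memory region record (illustrative names):
//   mr_key     - lkey/rkey associated with the region
//   mr_pd      - protection domain the region belongs to
//   mr_base_va - start of the region in the application's virtual address space
//   mr_len     - length of the region in bytes
//   mr_access  - access flags (local write, remote read, remote write, ...)
//   mr_base_pa - physical base address (single contiguous region assumed)

// A request is allowed only if the key and PD match, the requested
// access type is permitted, and the transfer lies entirely within the region
wire key_match   = (req_key == mr_key);
wire pd_match    = (req_pd == mr_pd);
wire perm_ok     = ((req_access & mr_access) == req_access);
wire in_bounds   = (req_va >= mr_base_va) &&
                   (req_va + req_len <= mr_base_va + mr_len);
wire req_allowed = key_match && pd_match && perm_ok && in_bounds;

// Translation for a physically contiguous region; a real implementation
// would index into a page list based on the offset within the region
wire [DMA_ADDR_WIDTH-1:0] req_pa = mr_base_pa + (req_va - mr_base_va);
```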

The current read descriptor interface to the DMA engine looks like this:

```verilog
input  wire [DMA_ADDR_WIDTH-1:0]            s_axis_read_desc_dma_addr,
input  wire [RAM_SEL_WIDTH-1:0]             s_axis_read_desc_ram_sel,
input  wire [RAM_ADDR_WIDTH-1:0]            s_axis_read_desc_ram_addr,
input  wire [LEN_WIDTH-1:0]                 s_axis_read_desc_len,
input  wire [TAG_WIDTH-1:0]                 s_axis_read_desc_tag,
input  wire                                 s_axis_read_desc_valid,
output wire                                 s_axis_read_desc_ready,
```

Fields for lkey/rkey, protection domain ID, and possibly queue pair number will have to be added to facilitate validating the target memory region.
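For example, the read descriptor interface could grow along these lines (the new signal names and width parameters are assumptions for illustration, not the final design):

```verilog
input  wire [DMA_ADDR_WIDTH-1:0]            s_axis_read_desc_dma_addr,
input  wire [RAM_SEL_WIDTH-1:0]             s_axis_read_desc_ram_sel,
input  wire [RAM_ADDR_WIDTH-1:0]            s_axis_read_desc_ram_addr,
input  wire [LEN_WIDTH-1:0]                 s_axis_read_desc_len,
input  wire [KEY_WIDTH-1:0]                 s_axis_read_desc_key,   // lkey/rkey
input  wire [PD_WIDTH-1:0]                  s_axis_read_desc_pd,    // protection domain ID
input  wire [QPN_WIDTH-1:0]                 s_axis_read_desc_qpn,   // queue pair number
input  wire [TAG_WIDTH-1:0]                 s_axis_read_desc_tag,
input  wire                                 s_axis_read_desc_valid,
output wire                                 s_axis_read_desc_ready,
```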

It may make sense to split the implementation across two modules - the first to determine and validate the target memory region, the second to perform the address translation based on the target memory region. In this case, the first module will consume the lkey/rkey, protection domain ID, and QPN, and generate a memory region ID. The second module will consume the memory region ID and translate the DMA address. The first module will also enforce access permissions and return an error if the requested operation is not allowed.
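In the same style as the existing DMA interface, the boundary between the two modules might carry something like the following (all signal names and width parameters are hypothetical):

```verilog
// First module input: lkey/rkey, protection domain ID, QPN, and the
// requested access type for the permission check
input  wire [KEY_WIDTH-1:0]                 s_axis_req_key,
input  wire [PD_WIDTH-1:0]                  s_axis_req_pd,
input  wire [QPN_WIDTH-1:0]                 s_axis_req_qpn,
input  wire [ACCESS_WIDTH-1:0]              s_axis_req_access,
input  wire                                 s_axis_req_valid,
output wire                                 s_axis_req_ready,

// First module output: memory region ID, or an error flag if the
// operation is not allowed; consumed by the address translation module
output wire [MR_ID_WIDTH-1:0]               m_axis_resp_mr_id,
output wire                                 m_axis_resp_error,
output wire                                 m_axis_resp_valid,
input  wire                                 m_axis_resp_ready,
```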

It looks like it may be necessary to merge the read and write requests into a single interface; it may make sense to implement some sort of credit-based flow control scheme to reduce head-of-line blocking.

Phase 2

Phase 2 of the development process includes initial support for RoCEv2, storing state in on-FPGA memory. Limiting storage to on-FPGA SRAM limits scalability, but it simplifies the implementation, and keeping things self-contained can be useful for performance reasons (no cache misses if there is no cache) or in embedded applications (limited or no DRAM).

  • Memory region validation and enforcement
  • Address translation

Phase 3

Phase 3 of the development process includes more scalable implementations of the phase 2 components, storing state in on-host or on-card DRAM and caching state in on-FPGA memory. These changes will enable support for more and larger memory regions.

  • DRAM-backed caching of memory region information
  • DRAM-backed caching of address translation information