
Corundum Roadmap

Alex Forencich edited this page Apr 8, 2022 · 9 revisions

Corundum is a high-performance, open-source, FPGA-based NIC. The intent of the project is to provide a high-performance reference NIC design that can be used as a foundation for various aspects of networking research.

This roadmap projects the development of Corundum through 2021. Features in the roadmap are broken down into four general categories: core hardware features, core software features, device support, and management features. Features are also classified as short term (6 months), medium term (1 year), or longer term (>1 year).

This roadmap is presented in two parts. The first part describes the short-term and medium-term plans based on what can be accomplished over the next year if only I work on developing the features. The second part describes features that may be useful, but that I do not believe I will have the time and/or expertise to implement myself in the next year. The classification of various features may change depending on community interest and support.

Contributing to Corundum

We’re putting out a call for contributors! There is a lot to implement in Corundum; if you have experience working with FPGAs, Linux kernel drivers, and/or DPDK and are interested in getting involved, hop on the mailing list.

1. Core hardware features

Core hardware features are features of the main Corundum datapath and host interface. Core hardware features apply to all targets and all variants of Corundum.

1.1. Variable-length descriptors

Priority: high priority, short term

Status: preparing request for comments on descriptor format

Variable-length descriptor support opens the door for much more expressive descriptors, including improved scatter/gather support, metadata support, as well as inline headers and packets. Logic required to implement variable-length descriptor handling can also provide descriptor read batching, prefetch, and caching, which further improves PCIe link utilization.

Supporting variable-length descriptors will require a significant rewrite of the descriptor handling and queue management components. It will also require changing the descriptor and/or completion formats.
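As a rough sketch of what a variable-length format enables, consider the hypothetical layout below. All field names, widths, and the alignment rule are invented for illustration; the actual format is what the request for comments will settle.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical variable-length descriptor layout (illustration only;
 * the real Corundum format is still under RFC). A fixed header word
 * carries the segment count; each segment is an (address, length)
 * pair for scatter/gather DMA, optionally followed by inline
 * metadata bytes. */
struct desc_hdr {
    uint16_t flags;     /* e.g. inline-data / metadata-present bits */
    uint16_t seg_count; /* number of scatter/gather segments that follow */
    uint32_t meta_len;  /* bytes of inline metadata after the segments */
};

struct desc_seg {
    uint64_t addr;      /* DMA address of the buffer segment */
    uint32_t len;       /* length of the segment in bytes */
    uint32_t rsvd;
};

/* Total descriptor size in bytes for a given segment count and
 * inline-metadata length, rounded up to a 16-byte boundary so
 * descriptors stay naturally aligned in the ring. */
static size_t desc_size(uint16_t segs, uint32_t meta_len)
{
    size_t n = sizeof(struct desc_hdr)
             + (size_t)segs * sizeof(struct desc_seg)
             + meta_len;
    return (n + 15) & ~(size_t)15;
}
```

Because descriptors no longer have a fixed size, the fetch logic must read the header before it knows how many beats to pull, which is exactly why batching, prefetch, and caching fall out of the same rework.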

1.2. Metadata

Priority: high priority, medium term

Status: needs variable-length descriptors

Support for per-packet metadata enables efficient communication between hardware packet processing modules and host software. Variable-length descriptors provide the capability to exchange per-packet metadata with the host system, while metadata support on the datapath makes this information available to other components on the FPGA.

Supporting metadata requires variable-length descriptors as well as changes to the datapath modules to include metadata fields. New FIFOs may need to be developed to efficiently handle metadata, especially for designs that use narrower datapaths (10G and 25G).
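A per-packet metadata record might look something like the following. The field set here is a guess for illustration; the actual fields will be tied to the variable-length descriptor format. On the FPGA these fields would travel as sideband signals alongside the packet stream, and with variable-length descriptors the same record can be inlined into descriptors and completions for the host.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-packet metadata record (illustration only).
 * Fixed-width fields keep the sideband bus and the host-visible
 * layout identical, so no repacking is needed at the DMA boundary. */
struct pkt_meta {
    uint64_t timestamp; /* PTP hardware timestamp */
    uint32_t flow_hash; /* RSS / flow-steering hash */
    uint16_t csum;      /* partial checksum result */
    uint8_t  port;      /* ingress port index */
    uint8_t  flags;     /* e.g. checksum-valid, timestamp-valid bits */
};
```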

1.3. Application section

Priority: high priority, short term

Status: initial implementation complete

Support for custom application logic integrated alongside the core datapath, providing access to packet traffic, DMA engine, and other resources.

1.4. Shared interface datapath

Priority: high priority, short term

Status: initial implementation complete, internal flow control is TODO

Move datapath logic from the port module into the interface modules. This is necessary for implementing more complex protocols such as RDMA, and it should also serve to reduce resource consumption for large port counts.

2. Device support

Device support includes support for different FPGA architectures. Support for new devices generally requires non-trivial interface changes to enable efficient operation with different PCIe IP cores and Ethernet interfaces.

Supporting new targets that utilize an already-supported FPGA architecture is usually straightforward for the core datapath, although management features can sometimes require additional work. As such, support for additional targets is not part of this roadmap. However, direct physical access is required in order to maintain Corundum on each target, so new targets will not be considered for inclusion in the main Corundum codebase unless hardware is provided.

2.1. Intel device support (primarily Stratix 10 and Agilex)

Priority: high priority, medium term

Status: Operational at 10G on Stratix 10 MX dev kit (H-tile, -3 speed grade) at PCIe gen 3 x8. TODO: PCIe gen 3 x16 on H-tile, timing optimizations, P-tile, and Arria 10 PCIe IP core models and shims.

Intel Stratix 10 and Agilex can provide a PCIe gen 4 x16 interface to the host, double the bandwidth of the gen 3 x16 or gen 4 x8 interfaces available on Xilinx UltraScale+ devices.

Implementation requires writing simulation models for Intel FPGA PCIe IP cores, creating variants of the DMA interface components that support the Intel FPGA PCIe interface, and creating additional modules for handling PCIe TLPs (FIFOs, multiplexers, demultiplexers, etc.).

Updated on 11/7/2021: Operational at 10G on Stratix 10 MX dev kit (H-tile, -3 speed grade) at PCIe gen 3 x8

2.2. Zynq MPSoC

Priority: medium priority, medium term

Status: Done

Support running Corundum on Zynq (and possibly other SoC devices) with the internal hard CPU cores as the host, instead of connecting to a host over PCI express.

Implementation requires modifying the drivers to operate as platform device drivers in addition to PCI device drivers, writing new DMA interface components that support AXI instead of PCIe, as well as possible organizational and/or build system changes to support the IPI flow and petalinux.

Update on 4/8/2022: Driver updated to add platform device support, AXI DMA interface modules integrated, and an example design for ZCU106 added along with petalinux build automation.

3. Management Features

Management features are features that are not part of the main Corundum datapath, but ease managing scaled-up deployments. These features are generally much more target board dependent and will not necessarily be supported on the same level on all target boards.

3.1. I2C interface to transceivers

Priority: high priority, medium term

Status: Currently working on several boards, needs driver work on ExaNIC X10 and X25 and ADM-PCIE-9V3, blocked on BMC support on AU50 and AU280

I2C access to optical transceivers is a useful feature for diagnostics, and it is also required for changing certain transceiver settings, such as controlling module CDRs when running a 100G or 25G module at 10G. If possible, the driver should be able to perform arbitrary I2C operations on all optical modules. This interface is partially implemented for most supported boards, but needs both hardware and driver work. Only direct bit-bang I2C is currently supported; boards that expose this functionality via other means require additional work.
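The bit-bang approach can be sketched in software terms as follows. This is a minimal illustration, not the in-tree implementation: on real hardware the SDA/SCL levels are driven through I2C control registers, while here the sketch records the SDA level at each rising SCL edge into a trace buffer so the waveform can be checked.

```c
#include <stdint.h>

/* Minimal bit-bang I2C master sketch (illustration only). The trace
 * buffer captures the SDA level at every rising SCL edge, i.e. the
 * bits a slave on the bus would sample. */
struct bb_i2c {
    uint8_t trace[64]; /* SDA sampled at each SCL rising edge */
    int n;             /* number of sampled bits */
    int sda, scl;      /* current line levels */
};

static void bb_set(struct bb_i2c *b, int sda, int scl)
{
    if (scl && !b->scl)              /* rising SCL edge: slave samples SDA */
        b->trace[b->n++] = (uint8_t)sda;
    b->sda = sda;
    b->scl = scl;
}

/* Clock out one byte MSB-first; a real master would then release SDA
 * for a ninth clock and check that the slave pulls it low (ACK). */
static void bb_write_byte(struct bb_i2c *b, uint8_t v)
{
    for (int i = 7; i >= 0; i--) {
        bb_set(b, (v >> i) & 1, 0);  /* set data while SCL is low */
        bb_set(b, (v >> i) & 1, 1);  /* raise SCL: bit is sampled */
    }
    bb_set(b, b->sda, 0);            /* return SCL low */
}
```

Boards whose transceiver I2C buses sit behind a BMC or other controller cannot be driven this way, which is why they need the additional work noted above.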

3.2. Persistent MAC address storage

Priority: high priority, medium term

Status: Working on most boards

Persistent MAC addresses are very convenient when deploying at scale. When possible, the driver should be able to read a MAC address or set of MAC addresses from all target boards. Currently implemented for most supported boards, but some work remains: only I2C EEPROM support is implemented, and boards that expose this information via other means require additional work.
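The usual pattern is to store a single base MAC address and derive per-port addresses from it. A sketch of that derivation is below; the EEPROM offset and layout are invented for illustration, since each board's EEPROM or BMC has its own layout.

```c
#include <stdint.h>

/* Assumed (hypothetical) location of the 6-byte base MAC address in
 * the raw EEPROM image; real boards each define their own layout. */
#define MAC_OFFSET 0x20

/* Derive the MAC address for a given port by treating the base MAC
 * as a 48-bit integer and adding the port index, with carries
 * propagating through the low octets. */
static void port_mac(const uint8_t *eeprom, int port, uint8_t mac[6])
{
    uint64_t v = 0;
    for (int i = 0; i < 6; i++)
        v = (v << 8) | eeprom[MAC_OFFSET + i];
    v += (uint64_t)port;             /* port N uses base MAC + N */
    for (int i = 5; i >= 0; i--) {
        mac[i] = v & 0xff;
        v >>= 8;
    }
}
```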

Updated on 2/1/2021: Alveo BMC now supported for persistent MAC addresses.

Updated on 3/4/2021: Silicom Gecko BMC now supported for persistent MAC addresses.

3.3. In-band firmware updates

Status: Working for most Xilinx boards

Firmware updates over PCI express are very convenient when deploying at scale. If possible, the driver should be able to read and write the configuration flash on all target boards. Currently works on all supported boards that provide direct flash access.

4. Long-Term Features

Long-term features are features that are not currently a priority or features that will require additional development support beyond the one-year time horizon of this roadmap. These features can be elevated to a higher priority given sufficient community support.

4.1. Hardware virtualization (SR-IOV) (core hardware feature)

SR-IOV (single-root IO virtualization) is useful both for virtualization applications and for mixing use of the normal Linux kernel driver with other drivers, including DPDK drivers.

Supporting SR-IOV will require several changes to Corundum, including tracking the PCIe function associated with each operation, the ability to assign queues to PCIe functions, and the ability to restrict access to various control registers based on the PCIe function. Replacing or extending the AXI-lite infrastructure modules will likely be required to add the necessary sideband data.

4.2. Kernel driver optimizations (core software feature)

The current Linux kernel driver has performance issues running at 100G line rate with frames significantly smaller than 9KB jumbo frames. There are several memory-management optimizations that should be explored to improve driver performance, including receive scatter/gather and receive page reuse.

4.3. DPDK driver (core software feature)

For the highest possible performance and flexibility, a userspace driver and/or DPDK PMD should be developed. A DPDK PMD would enable the use of userspace networking frameworks, permitting access to the entire stack.

4.4. Switchable 10G/25G interfaces (management feature)

Convert all 25G designs to 10G/25G switchable designs. Enables mixing and matching 10G and 25G interfaces. Requires support for transceiver resets and reworking of the transceiver clocking and reset infrastructure, as well as driver support.

4.5. XDP (core hardware/software feature)

Support XDP in the device driver. Most likely will require some additional hardware as well, particularly some form of flow steering.

4.6. Descriptor-inline data (core hardware feature)

Support packet payload data inlined in descriptors, either whole packets or packet headers. Important for LSO and for efficient handling of small packets. Requires support for variable-length descriptors.

4.7. Large send offload (core hardware feature)

Add support to descriptor handling logic for LSO. Requires support for variable-length descriptors and descriptor-inline data.