Repository Overview

This page provides an overview of the nanoPU source code repositories. Note that the source code refers to the nanoPU as L-NIC, or Lightning NIC, which was the project's initial name. We felt that calling the design a "NIC" was misleading because it is really a NIC-CPU co-design, so we renamed it the "nanoPU" to reflect the processor design as well. The name "nanoPU" is short for nanoservice processing unit, where nanoservices are extremely fine-grained microservices that process network requests in under 1us.

The following sections provide an overview of the main repositories used to build and evaluate the nanoPU prototype.

Chipyard

Chipyard is the top-level repository which contains the other repositories listed below as git submodules. Here are a few important directories in the chipyard repo:

chipyard/
|-- tests-lnic/ # nanoPU application code
|-- tests-icenic/ # Traditional NIC (i.e., IceNIC) application code
|-- sims/
|    |-- verilator/ # Verilator simulation infrastructure
|    |-- firesim/ # FireSim repo
|-- software/ # Additional nanoPU application code as well as other useful software
|    |-- net-app-tester/ # Python unit testing framework for Verilator simulations
|    |-- firemarshal/ # FireMarshal repo
|-- generators/ # Chisel source code, mostly git submodules
     |-- lnic/ # L-NIC repo
     |-- icenet/ # IceNet repo
     |-- rocket-chip/ # Rocket-chip repo

See the Chipyard documentation for detailed information about the platform and its usage.

Rocket Chip

The Rocket Chip repo contains all of the Chisel source code for the modified RISC-V Rocket core. Chisel is a domain-specific language for constructing hardware circuits, embedded in Scala; a minimal example appears after the file listing below. Here is an overview of the relevant files:

rocket-chip/src/main/scala/rocket/ # All of the Chisel code for the Rocket core
|-- RocketCore.scala # Modified rocket core pipeline, including modified register file read/write logic
|-- CSR.scala # Modified CSR File which instantiates the nanoPU's local TX/RX network queues
|-- LNICQueues.scala # Implements the nanoPU's local TX/RX network queues and thread scheduler
|-- LNICUtils.scala # Helper utilities
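
Since Chisel is embedded in Scala, hardware modules are written as ordinary Scala classes using the chisel3 library. For reference only (this module is purely illustrative and is not part of the rocket-chip repo), a minimal Chisel module looks like this:

import chisel3._

// A parameterized adder: hardware is described with ordinary Scala classes.
class Adder(width: Int) extends Module {
  val io = IO(new Bundle {
    val a   = Input(UInt(width.W))
    val b   = Input(UInt(width.W))
    val sum = Output(UInt(width.W))
  })
  io.sum := io.a + io.b // combinational add; overflow wraps modulo 2^width
}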

See the Chipyard documentation for more information about Rocket-Chip and the Rocket core.

L-NIC

The L-NIC repo contains the Chisel source code that implements the nanoPU's NIC. A block diagram of the NIC architecture is shown below:

[Figure: nanoPU NIC architecture block diagram (images/nic-arch.png)]

On the left, the NIC connects to the external network; on the right, the global RX/TX queues exchange messages with the CPU cores. The modules in the architecture interact by passing packets, messages, and metadata (a.k.a. data plane events). The architecture is designed to provide programmable support for transport protocols: by programming the modules shaded in green (the ingress pipeline, egress pipeline, and packet generator), a developer can implement different transport protocols. Currently, these modules must be programmed in Chisel, but in the future we hope to enable P4 programmability using Xilinx SDNet. To explain the architecture in detail, we will walk through the processing on both the TX and RX paths.

On the TX path, message words transmitted by applications are loaded into the global TX queues, where they are buffered in per-application queues. Messages are then passed to the packetization module as buffer space becomes available. The packetization module is responsible for splitting application messages into data packets and for maintaining a few important state variables used for reliable delivery and congestion control: delivered, toBtx, and credit. The delivered state tracks the packets of each message that have been successfully delivered to the destination, the toBtx state tracks the packets of each message that still need to be transmitted (or retransmitted) to the receiver, and the credit state tracks the packets that are currently eligible for transmission. Inspired by Tonic, these state variables are implemented as bitmaps that efficiently store one bit of state for each packet of each message. Upon receiving the first word of a message, the packetization module allocates and initializes each of these state variables and triggers an event to initialize a timer for the message. After it has received either the full message or a maximum transmission unit (MTU) of data, the packetization module enqueues a packet descriptor into an internal scheduling module.
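
To make the bitmap representation concrete, here is a rough Chisel sketch of the per-message TX state. The names, widths, and initialization policy are assumptions made for this page, not the actual L-NIC source:

import chisel3._

// Hypothetical sketch: Tonic-style per-message TX state, one bit per packet.
class MsgTxState(val maxPktsPerMsg: Int) extends Bundle {
  val delivered = UInt(maxPktsPerMsg.W) // packets acknowledged by the receiver
  val toBtx     = UInt(maxPktsPerMsg.W) // packets that still need (re)transmission
  val credit    = UInt(maxPktsPerMsg.W) // packets currently eligible for transmission
}

object MsgTxState {
  // Initialization when the first word of a new message arrives: nothing is
  // delivered yet, every packet still needs to be transmitted, and the initial
  // credit (a protocol-defined window of initCredit packets) caps what may be
  // sent before feedback arrives from the receiver.
  def init(numPkts: UInt, initCredit: UInt, maxPktsPerMsg: Int): MsgTxState = {
    val st      = Wire(new MsgTxState(maxPktsPerMsg))
    val allPkts = ((1.U << numPkts) - 1.U)(maxPktsPerMsg - 1, 0)
    st.delivered := 0.U
    st.toBtx     := allPkts
    st.credit    := allPkts & ((1.U << initCredit) - 1.U)(maxPktsPerMsg - 1, 0)
    st
  }
}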

Upon receiving control packets from the peer, the programmable ingress pipeline triggers events containing metadata and instruction opcodes to update the delivered, toBtx, and credit state. When a message's credit increases or a packet retransmission is requested, the corresponding packet descriptors are scheduled for transmission within the packetization module. When a message timeout occurs, the packetization module will attempt to identify and schedule any packets that need to be retransmitted. Once all packets of a message have been successfully delivered to the peer, the corresponding message state in the packetization module is freed and the message timer is cancelled.
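
The sketch below (again illustrative, reusing the bitmap names from the sketch above) shows how a delivered notification and a credit grant arriving as data plane events might update the per-message state; in the real design the events also carry metadata and an instruction opcode that selects the update to perform:

import chisel3._

// Hypothetical sketch: event-driven bitmap updates for one in-flight message.
class MsgStateUpdate(maxPkts: Int) extends Module {
  val io = IO(new Bundle {
    val ackedPkts   = Input(UInt(maxPkts.W)) // bitmap of newly delivered packets
    val grantedPkts = Input(UInt(maxPkts.W)) // bitmap of newly credited packets
  })
  val delivered = RegInit(0.U(maxPkts.W))
  val toBtx     = RegInit(0.U(maxPkts.W))
  val credit    = RegInit(0.U(maxPkts.W))

  delivered := delivered | io.ackedPkts // mark packets as delivered
  toBtx     := toBtx & ~io.ackedPkts    // delivered packets need no retransmission
  credit    := credit | io.grantedPkts  // newly credited packets become eligible
}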

The arbiter schedules between data packets and control packets produced by the packetization and packet generator modules, respectively. Control packets are scheduled with higher priority in order to provide a low-latency feedback loop for the congestion control algorithm. The programmable egress pipeline consumes packet metadata and generates the appropriate Ethernet, IP, and transport headers for outgoing packets, which are then sent into the network.
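
The following sketch shows the basic idea of fixed-priority arbitration between the two packet sources using the standard chisel3.util.Arbiter (lower index wins). It is simplified: PktWord is an assumed 64-bit word format, and a real packet arbiter would also hold its grant until the last word of the current packet has been sent, which this sketch does not do:

import chisel3._
import chisel3.util._

// Assumed packet word format for this sketch.
class PktWord extends Bundle {
  val data = UInt(64.W)
  val last = Bool() // marks the final word of a packet
}

class TxArbiter extends Module {
  val io = IO(new Bundle {
    val ctrlIn = Flipped(Decoupled(new PktWord)) // from the packet generator
    val dataIn = Flipped(Decoupled(new PktWord)) // from the packetization module
    val out    = Decoupled(new PktWord)          // to the programmable egress pipeline
  })
  val arb = Module(new Arbiter(new PktWord, 2))
  arb.io.in(0) <> io.ctrlIn // index 0 has the highest priority: control packets first
  arb.io.in(1) <> io.dataIn
  io.out <> arb.io.out
}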

On the RX path, network packets are first processed by the programmable ingress pipeline. This module parses packet header fields and drives the congestion control logic by triggering data plane events that are processed by other modules in the architecture. For arriving data packets, the ingress pipeline fires the get rx msg info event, which is processed by the assembly module. If the assembly module determines that this is the first packet of a new message, it attempts to allocate sufficient buffer space for the whole message. Upon success, it returns a unique message identifier, which the ingress pipeline can use to maintain state associated with the message. If the assembly module fails to allocate a buffer for the message, the packet is dropped. The ingress pipeline can also be configured to trigger an event that causes the packet generation module to generate and transmit custom control packets. These control packets can be used, for example, to indicate successful delivery of a packet or to elicit a retransmission from the peer.
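
The request/response interface for the get rx msg info event might look roughly like the following pair of bundles. The field names and widths here are assumptions made for illustration; the actual interface is defined in the L-NIC source:

import chisel3._

// Hypothetical sketch: request sent by the ingress pipeline for each arriving
// data packet, and the response returned by the assembly module.
class GetRxMsgInfoReq extends Bundle {
  val srcIP   = UInt(32.W) // sender's IP address
  val srcPort = UInt(16.W) // sender's port / context identifier
  val txMsgId = UInt(16.W) // message ID assigned by the sender
  val msgLen  = UInt(16.W) // total message length, used to size the reassembly buffer
}

class GetRxMsgInfoResp extends Bundle {
  val fail     = Bool()     // true if no buffer could be allocated; the packet is dropped
  val rxMsgId  = UInt(16.W) // locally unique message identifier
  val isNewMsg = Bool()     // true for the first packet of a previously unseen message
}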

The assembly module reassembles data packets, which might arrive out-of-order, into application messages. Once the final packet of a message is received, a message descriptor is scheduled for delivery to the global RX queues. The global RX queues then load balance messages across cores using the Join Bounded Shortest Queue (JBSQ) policy on a per-application basis.
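
As a software-level illustration of the policy (the actual implementation is hardware inside the global RX queues), JBSQ(k) dispatches each message to the core with the fewest outstanding messages for that application, but only if that core has fewer than k outstanding messages; otherwise the message waits in the central queue:

// Plain-Scala sketch of JBSQ(k) dispatch; names are illustrative.
def jbsqDispatch(outstanding: Seq[Int], k: Int): Option[Int] = {
  // outstanding(i) = number of messages currently queued at core i
  val (minCount, core) = outstanding.zipWithIndex.minBy(_._1)
  if (minCount < k) Some(core) else None // None: hold the message for now
}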

The Chisel source code at lnic/src/main/scala/ implements this architecture and programs it to implement the NDP transport protocol.

FireSim

The FireSim repo provides all of the infrastructure required to run FPGA-accelerated, cycle-accurate simulations on AWS. Here is a brief overview of the relevant files:

firesim/
|-- deploy/workloads/ # Contains the config files used to run firesim evaluations
|-- target-design/switch/ # The cycle-accurate C++ switch model / load generator used for our evaluations.

See the FireSim documentation for detailed information about the platform and its usage.

FireMarshal

FireMarshal is a separate repo that is used by FireSim to build and configure workloads. The nanoPU workloads are defined in firemarshal/lnic-workloads.

See the FireMarshal documentation for detailed usage information.