Firesim Simulations - l-nic/chipyard GitHub Wiki
Firesim is a platform for running FPGA-accelerated, cycle-accurate simulations on AWS. The simulations run orders of magnitude faster than a software-based, cycle-accurate simulator such as Verilator.
This page describes one of our Firesim topologies as well as how to reproduce our results.
Firesim Topology
The figure below shows the topology used for most of our Firesim evaluations.
It is a simple topology that only requires one f1.2xlarge instance, which has a single VU9P FPGA.
The simulated nanoPU runs inside the Firesim infrastructure on the FPGA.
The Firesim infrastructure enables us to tune parameters such as simulated network link latency and it ensures that the simulation is cycle-accurate.
The FPGA simulation is connected to a cycle-accurate, C++ load generator running on the host CPU via a simulated network link over PCIe. The load generator generates requests for the nanoPU and measures the end-to-end response time (in terms of simulated clock cycles). Firesim allows us to simulate processing of tens of thousands of requests in just a few minutes.
The general simulation flow is as follows:
cd ~/chipyard/sims/firesim/deploy
EXP_CONFIG=workloads/lnic-evaluation/config_<experiment-name>_runtime.ini
firesim -c $EXP_CONFIG launchrunfarm
firesim -c $EXP_CONFIG infrasetup
firesim -c $EXP_CONFIG runworkload
- Wait for the simulation to complete.
firesim -c $EXP_CONFIG terminaterunfarm
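The flow above can be wrapped in a small shell function. This is a minimal sketch, not a script from the repository: the name run_experiment and the DRY_RUN variable are assumptions made for illustration. It assumes the current directory is ~/chipyard/sims/firesim/deploy and that the firesim command is on your PATH. Termination is left as a manual step so that the instance is not torn down before you have inspected the results.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Firesim): run the launch, setup, and
# workload steps for one experiment. Call it from the deploy/ directory.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_experiment() {
    local name="$1"   # e.g. lnic_mica -> config_lnic_mica_runtime.ini
    local cfg="workloads/lnic-evaluation/config_${name}_runtime.ini"
    local run="${DRY_RUN:+echo}"
    local step
    for step in launchrunfarm infrasetup runworkload; do
        # Stop at the first failing step so that errors are not masked.
        $run firesim -c "$cfg" "$step" || return 1
    done
    # Print a reminder rather than terminating automatically.
    echo "When done, terminate the run farm with: firesim -c $cfg terminaterunfarm"
}
```

For example, `DRY_RUN=1 run_experiment lnic_mica` prints the three firesim commands for the MICA config without launching anything. If a step fails, the function returns early and the F1 instance may still be running, so terminate it manually.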
While waiting for the simulation to complete, you can ssh into the F1 instance from your manager instance and attach to either the simulated nanoPU's console or the load generator's log file:
ssh F1_INSTANCE_IP
screen -r fsim0 # attach to the simulated nanoPU's console
screen -r switch0 # attach to the load generator's log file
We have created a simple script called run-nanoPU-sims.sh that runs all the firesim simulations sequentially so that all of the above commands do not need to be entered manually.
Description of Simulations
Microbenchmark Evaluations
For our microbenchmarks, we use a single RISC-V program called lnic-evaluation.c. Depending on the test, this program will configure the nanoPU with the appropriate number of threads running on each of the four cores, at the appropriate priority level. The load generator generates requests and measures the end-to-end response time as it gradually increases the offered load on the system.
As described in the paper, we perform three microbenchmarks:
- Priority Thread Scheduling - Compares the tail response time when using the nanoPU's hardware thread scheduler against a simple timer-interrupt driven thread scheduler.
- Bounded Message Processing Time - Demonstrates the benefit of bounding the message processing time of high priority applications.
- Core Selection - Demonstrates that the nanoPU's NIC is able to efficiently load balance requests across cores.
The config files used for these Firesim simulations are located here.
MICA
This artifact enables you to run the MICA key-value store on the nanoPU and evaluate its tail response time under load. The nanoPU MICA program is called lnic-multi-core-mica.cc. The paper also includes a comparison against MICA running on the traditional system (IceNIC). These results can be obtained using the standard chipyard / firesim repos and we do not include this simulation in the artifact.
The config file used for the MICA Firesim simulation is called config_lnic_mica_runtime.ini.
Set Intersection
The set intersection operation is commonly used in information retrieval systems. The lnic-intersect.cc program implements this operation on the nanoPU. The paper also includes a comparison against this application running on the traditional system (IceNIC). These results can be obtained using the standard chipyard / firesim repos and we do not include this simulation in the artifact.
The config file used for the Set Intersection Firesim simulation is called config_lnic_intersect_runtime.ini.
Running the Simulations
Setup
First, ssh into your manager instance if you have not already done so:
ssh -i firesim.pem -L 8888:localhost:8888 centos@YOUR_INSTANCE_IP
Note that we are forwarding port 8888 over the SSH connection so that we can access a Jupyter notebook server that we will run later.
We highly recommend using a tmux session for the following commands to ensure that simulation results are not lost as a result of a flaky network connection to your manager:
tmux new -s firesim
Set up the environment:
cd ~/chipyard/sims/firesim
source sourceme-f1-manager.sh
Next, compile the RISC-V test programs if you have not already done so:
cd ~/chipyard/tests-lnic
make
cd ~/chipyard/tests-icenic
make
The set intersection application is located in a different directory. Make sure that is compiled too:
cd ~/chipyard/software/set-intersection/
make
Launch the F1 instance that we will use for our firesim simulations:
cd ~/chipyard/sims/firesim/deploy/
firesim -c workloads/lnic-evaluation/config_lnic_scheduling_runtime.ini launchrunfarm
Test
To make sure that everything is set up correctly, run a simple test simulation using the run-nanoPU-test.sh script:
cd ~/chipyard/sims/firesim/deploy/
./run-nanoPU-test.sh
The simulation should take about 10 minutes to complete. Part of the reason for the long runtime is that we need to set up a fresh F1 instance. If you run the simulation a second time, you'll see that it runs much faster. The simulation will write CSV files to nanoPU-results/lnic_test/switch0/.
If you are not planning to run any more simulations, terminate the F1 instance using this command: firesim -c workloads/lnic-evaluation/config_lnic_scheduling_runtime.ini terminaterunfarm. Otherwise, we will use the same instance to run more simulations in the next section.
Run
Run all of the nanoPU Firesim simulations, which will take about one hour to complete:
cd ~/chipyard/sims/firesim/deploy/
./run-nanoPU-sims.sh
CSV files containing the simulation results will be logged in the nanoPU-results/ directory.
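Before moving on to the analysis, it can be useful to confirm that the result files were actually written. The function below is a minimal sketch for that purpose. Its name, summarize_csvs, is an assumption, and it makes no assumptions about the columns in the CSV files: it only lists each file with its line count.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every CSV file under a results directory
# together with its line count (header included), one file per line.
summarize_csvs() {
    local dir="${1:-nanoPU-results}"
    local f
    find "$dir" -type f -name '*.csv' | sort | while IFS= read -r f; do
        # Arithmetic expansion strips any padding that wc adds.
        printf '%s\t%s\n' "$(( $(wc -l < "$f") ))" "$f"
    done
}
```

Run it from the deploy/ directory as `summarize_csvs nanoPU-results`. An empty listing, or files with a line count of zero, suggests that a simulation did not complete.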
Analyze the Results
We will use a Jupyter notebook to view the simulation results.
Jupyter notebooks, together with matplotlib and pandas, are a great way to interact with and visualize data.
Run the following command to start the jupyter notebook server if it is not already running. This command will run in the foreground so we recommend running it in a separate tab:
cd ~
jupyter notebook
When you start the Jupyter notebook server, it should provide you with a link that looks like: http://localhost:8888/tree?token=<TOKEN-NUMBER>. Visit this link using your local web browser. Note that we are able to do this because we are forwarding port 8888 over the ssh connection to the manager.
Open the Firesim-Evals.ipynb notebook located in the chipyard/sims/firesim/deploy/ directory.
Select Kernel > Restart & Run All.
At this point, you should be able to view all the plots created from the Firesim simulation results.
The results should closely match the ones presented in our paper.
Important: When you are done running simulations remember to terminate the F1 instance that you were using. You can use the following command to do so:
cd ~/chipyard/sims/firesim/deploy/
firesim -c workloads/lnic-evaluation/config_lnic_scheduling_runtime.ini terminaterunfarm
It is always a good idea to verify in your EC2 console that the instance has indeed been terminated.
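The same check can be made from the command line. The sketch below assumes that the AWS CLI is installed and configured for the region that hosts your run farm. The function name list_active_f1 is hypothetical. It filters on the f1.2xlarge instance type used by the topology described on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list f1.2xlarge instances that are still pending
# or running. Assumes a configured AWS CLI. Empty output means no such
# instance was found in the current region.
list_active_f1() {
    aws ec2 describe-instances \
        --filters "Name=instance-type,Values=f1.2xlarge" \
                  "Name=instance-state-name,Values=pending,running" \
        --query 'Reservations[].Instances[].[InstanceId,State.Name]' \
        --output text
}
```

Each output line contains an instance ID and its state. Note that the query only covers the region configured for the CLI, so the EC2 console remains the authoritative check.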
Building the FPGA Image
Compiling an FPGA image from the Chisel source code takes several hours. In order to expedite this process, we have provided a pre-compiled FPGA image that runs the simulated nanoPU in Firesim. However, you may also want to modify the Chisel source code and generate your own FPGA image. Here are the instructions to do so.
Update the S3 bucket name in ~/chipyard/sims/firesim/deploy/config_build.ini. The name needs to be globally unique. A good option is s3bucketname=firesim-yournamehere.
Run the following command to build a new Amazon FPGA Image (AFI). This will take about 5-6 hours to complete.
cd ~/chipyard/sims/firesim/deploy/
firesim buildafi
After it completes you should receive an email notification that looks like the following: images/aws-notification.png
Remove the current entry in ~/chipyard/sims/firesim/deploy/config_hwdb.ini and add the new entry as instructed in the email.
At this point, you can run simulations that use your new FPGA image.
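The S3 bucket update in the first step above can also be scripted. This is a minimal sketch under two assumptions: the function name set_s3_bucket is hypothetical, and the config file contains a line that begins with s3bucketname= and has no surrounding spaces. Editing the file by hand works just as well.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the s3bucketname= line in config_build.ini
# and print the resulting line so the change can be checked.
set_s3_bucket() {
    local name="$1"   # e.g. firesim-yournamehere
    local file="${2:-$HOME/chipyard/sims/firesim/deploy/config_build.ini}"
    local tmp
    tmp=$(mktemp) || return 1
    # Write to a temporary file first so a failed sed leaves the config intact.
    if sed "s/^s3bucketname=.*/s3bucketname=${name}/" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
    grep '^s3bucketname=' "$file"
}
```

For example, `set_s3_bucket firesim-yournamehere` updates the default config path and prints the new s3bucketname line.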
TIP: It is always a good idea to make sure that there are no timing errors with your new FPGA image.
Firesim will copy logs and reports into the results-build/ directory after the build completes.
You can use these files to debug build issues as well as check for timing errors.
OSDI21 AEC: If you decide to do this step, you will not receive an email notification when the AFI is generated because the instance is currently configured to send notifications to us. Instead, you will need to periodically check the status of the buildafi command to see if it has completed.
Additional Topologies
In addition to the one described at the top of this page, we have also used other Firesim topologies in our evaluations. One of those topologies uses 81 AWS FPGAs to simulate 81 nanoPU servers. Getting access to this many AWS FPGAs requires special permission from AWS, so it is a non-trivial simulation to run. We plan to describe other nanoPU Firesim evaluations here in the future.