FPGA Emulation - StanfordVLSI/dragonphy2 GitHub Wiki
This page is about emulating the DragonPHY design on an FPGA. The first section consists of instructions to replicate the emulator that is run as part of the regression testing setup. Subsequent sections are about writing your own tests and possible areas for future work.
Instructions
- Install Vivado if you haven't already. (Instructions here)
- Switch to the `master` branch of DragonPHY (if you aren't on it already), and make sure it is up-to-date.
- Set the environment variable `TAP_CORE_LOC` to point to the EDIF file (not folder) for the TAP core. Some information on how to generate that file is in `experiments/tap_export/README.md`; it should only be necessary to do that once. Please be aware of one limitation of this flow: unlike in the real ASIC, the emulator JTAG ID will not match the `git` hash. Instead, it will always be `0x1DA8C133`.
> export TAP_CORE_LOC=PATH_TO_TAP_EDIF
- Set the environment variable `DW_TAP` to point to `/cad/synopsys/syn/L-2016.03-SP5-5/dw/sim_ver/DW_tap.v` (you may need to make a local copy of that file). This is only needed for running a sanity check simulation (`test_2`); it is not used for building the emulator bitstream. As before, the environment variable should point to the file, not folder.
> export DW_TAP=PATH_TO_DW_TAP
- Make sure that your emulation dependencies are up-to-date (run commands below in the top-level of the DragonPHY repository). There have been recent updates to all three packages, so previously installed versions are likely out-of-date.
> pip uninstall svreal msdsl anasymod
> pip install -e .
- Build FPGA models:
> python make.py --view fpga
- Change directory to the folder that corresponds to the desired emulator architecture:
  - `tests/fpga_system_tests/emu`: low-level modeling strategy. Uses the same `analog_core` as the real ASIC. Can achieve 5 Mb/s emulation rate.
  - `tests/fpga_system_tests/emu_macro`: high-level modeling strategy. Uses a synthesizable macro-model for `analog_core`. Can achieve 80 Mb/s emulation rate.
- Build the project configuration for the emulator. Valid options for `BOARD_NAME` include `ZCU106`, `ZC706`, `ZC702`, and `PYNQ_Z1`. For `EMU_CLK_FREQ`, `15e6` is recommended for the low-level modeling strategy while `30e6` is recommended for the high-level strategy. The emulation throughput is directly proportional to this value.
> pytest -s -k test_1 --board_name BOARD_NAME --emu_clk_freq EMU_CLK_FREQ
- Before building the FPGA bitstream, check the emulator architecture with a simulation. This takes 5 minutes for the low-level architecture, or 1-2 minutes for the high-level architecture.
> pytest -s -k test_2
- If that looks good, build the FPGA bitstream. This normally takes about 30 minutes for the low-level architecture or 45 minutes for the high-level architecture.
> pytest -s -k test_3
- After that completes, you may want to open the project in Vivado. This is not required, but is sometimes useful to get a sense of how things went. To do that, launch Vivado and open the project file `build/fpga/prj/prj.xpr`. If you open the implemented design, you can report timing (to make sure there are no timing violations) and report utilization (to make sure the resource utilization looks as expected).
> vivado &
- At this point it is time to plug in the FPGA board:
  - Connect power to the FPGA board (board-dependent; check the user manual for the board if you're not sure). There is often a power switch that has to be flipped, too.
  - Connect `USB JTAG` and `USB UART` to the host computer. For `ZCU106`, the connector locations are shown here.
- Now program the FPGA board and run the emulation:
> pytest -s -k test_4
- If that looks good, there are various additional arguments that can be passed to `test_4`:
  - `--prbs_test_dur`: Duration of the PRBS test, in seconds. Default value is 10 seconds, which is 50 Mb for the low-level architecture, or 800 Mb for the high-level architecture.
  - `--jitter_rms`: RMS jitter of the ADC sampling times, in seconds. Default value is `0`; our design can tolerate up to about `2.6e-12` with other settings at their defaults.
  - `--noise_rms`: RMS noise added to voltages sampled by ADC, in volts. Default value is `0`; our design can tolerate up to about `56e-3` with other settings at their defaults.
  - `--chan_tau`: Time constant of the channel, assuming a simple first-order exponential step response. Default value is `25e-12`; our design can tolerate up to about `217e-12` with other settings at their defaults. The emulator can be configured at runtime with a non-exponential step response, but this currently has to be done in Python, rather than through the command line.
  - `--chan_delay`: Time delay of the channel (i.e., time when the channel step response becomes non-zero). Default value is `31.25e-12`, which is `0.5 UI`. If you decrease `chan_delay` towards `0`, the PI control codes should decrease; if you increase it towards `62.5e-12`, then the codes should increase.
- Sometimes it is useful to look at waveforms captured from the FPGA's Integrated Logic Analyzer (ILA). To do that, launch Vivado, open `build/fpga/prj/prj.xpr`, and then open the Hardware Manager. Connect to the FPGA but do not reprogram it, since that will restart the emulator. You can then select signals for probing and set triggering options using the ILA window.
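The relationships between the `test_4` arguments above can be checked with a little arithmetic. The following is a minimal sketch, not the emulator's own code: the function names are made up for illustration, the step response is assumed to be normalized to a final value of 1, and the unit interval of `62.5e-12` is inferred from the statement that `31.25e-12` is `0.5 UI`.

```python
import math

# Unit interval inferred from the text (31.25e-12 s is described as 0.5 UI).
UI = 62.5e-12


def step_response(t, chan_tau=25e-12, chan_delay=31.25e-12):
    """First-order exponential step response, zero until chan_delay.

    Assumption: the response is normalized so that it settles to 1.
    """
    if t < chan_delay:
        return 0.0
    return 1.0 - math.exp(-(t - chan_delay) / chan_tau)


def prbs_bits(prbs_test_dur=10.0, emu_rate=5e6):
    """Number of bits covered by a PRBS test at an emulation rate in bits/s."""
    return prbs_test_dur * emu_rate


if __name__ == "__main__":
    print(step_response(0.0))                   # before the delay
    print(step_response(31.25e-12 + 25e-12))    # one time constant after the delay
    print(31.25e-12 / UI)                       # default delay expressed in UI
    print(prbs_bits(10.0, 5e6))                 # low-level architecture
    print(prbs_bits(10.0, 80e6))                # high-level architecture
```

With the default 10-second duration, the two `prbs_bits` calls reproduce the 50 Mb and 800 Mb figures quoted for the low-level and high-level architectures.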
Writing your own tests
If you want to change JTAG reads/writes, then you can copy the `test_4` function definition from `test*.py` into your own file, and edit from there. You'll notice that the interaction between the emulator and host computer takes place by sending ASCII commands over the USB UART link, mostly to read and write JTAG registers as in CPU-based tests. In writing your own tests, you don't necessarily have to use `pytest` unless you want your test to be used in the regression suite.
One of the other things you might want to do in emulation is to set the channel dynamics to something other than an exponential step response. As a first pass, you can try editing the definition of `chan_func`, which is just a regular Python function. You can see that the coefficients needed to represent `chan_func` are computed after that point and uploaded to the emulator. If you want to experiment further with the channel dynamics, you may need to update some of the parameters in `config/fpga/chan.yml` (low-level architecture) or `config/fpga/analog_slice_cfg.yml` (high-level architecture). The properties `func_*` refer to the representation of the channel step response:
- `func_order`: `0` means piecewise-constant, `1` means piecewise-linear, `2` means piecewise-quadratic, etc.
- `func_numel`: number of piecewise polynomial segments in the function
- `func_domain`: domain of the step response function. If you need a longer step response, you can increase the second number (but may want to increase `func_numel` in order to keep the step size constant)
- `func_widths`: Widths of the coefficients used in lookup tables for piecewise-polynomial coefficients. The first entry is for the offset, the second entry is for the slope, etc.
- `func_exps`: Similar to `func_widths`, but for the exponents of fixed-point formats. The resolution of the kth coefficient is `2**func_exps[k]`, while its range is:
[-(2**(func_widths[k]-1))*(2**func_exps[k]), (2**(func_widths[k]-1)-1)*(2**func_exps[k])]
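To make the fixed-point formulas concrete, here is a small sketch that evaluates the resolution and range of a coefficient, along with the segment step size implied by `func_domain` and `func_numel`. The helper names and the numeric values are illustrative assumptions; they are not taken from `chan.yml` or `analog_slice_cfg.yml`.

```python
def coeff_format(func_widths, func_exps, k):
    """Return (resolution, minimum, maximum) of the k-th coefficient."""
    width, exp = func_widths[k], func_exps[k]
    resolution = 2.0 ** exp
    minimum = -(2 ** (width - 1)) * resolution
    maximum = (2 ** (width - 1) - 1) * resolution
    return resolution, minimum, maximum


def step_size(func_domain, func_numel):
    """Width of one piecewise-polynomial segment."""
    return (func_domain[1] - func_domain[0]) / func_numel


if __name__ == "__main__":
    # Illustrative values only (assumed, not read from the repository).
    func_widths = [18, 18]
    func_exps = [-16, -16]
    print(coeff_format(func_widths, func_exps, 0))

    # Doubling the domain length and func_numel keeps the step size constant.
    print(step_size([0.0, 2e-9], 512))
    print(step_size([0.0, 4e-9], 1024))
```

For an 18-bit coefficient with exponent -16, the resolution is 2**-16 and the range runs from -2.0 up to just under 2.0.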
Creating your own test stimulus in Python is likely sufficient for many emulation use cases, but there may be times where you want to change something in the emulator hardware or firmware. To do that, it's recommended to copy the `emu` or `emu_macro` folder as a starting point (depending on whether you want to use the low-level or high-level architecture). You can then make these kinds of changes:
- Be able to probe different signals using the ILA: edit `simctrl.yaml` in the `digital_probes` section. This will require you to rebuild the emulator starting from step 3 (i.e., `test_3`, then `test_4`).
- Add functions to the ARM core firmware: edit `main.c`. You'll see that this code consists of the UART interpreter that the Python code on the host computer is interacting with. You might want to add some higher-level features to the firmware that can be invoked over UART as a method for speeding up emulation. This requires rebuilding the emulator from step 4.
- Add more signals that can be read/written by the ARM core: edit `simctrl.yaml` in the `digital_ctrl_inputs` or `digital_ctrl_outputs` sections. This requires rebuilding the emulator starting from step 3. If you want to test out the changes with sanity check simulation (`test_2`), then you'll need to wire up the additional control signals in `sim_ctrl.sv`.
Ideas for future work
- There is a bunch of duplicated code in `test_emu.py` and `test_emu_macro.py` related to UART commands sent to the ARM core. This code should probably be pulled into a common controller.
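One possible shape for such a common controller is sketched below. Everything here is hypothetical: the class names, the `WRITE_REG` and `READ_REG` command strings, and the reply format are placeholders and do not reflect the actual firmware protocol in `main.c`. The only assumption about the transport is that it offers `write(bytes)` and `readline()`, which a pyserial `Serial` object does; an in-memory stand-in is used so the sketch runs without hardware.

```python
class EmuController:
    """Hypothetical shared wrapper around the UART link to the ARM core."""

    def __init__(self, port):
        # 'port' is any object with write(bytes) and readline() -> bytes.
        self.port = port

    def command(self, *fields):
        """Send one ASCII command line and return the reply as text."""
        line = " ".join(str(f) for f in fields) + "\n"
        self.port.write(line.encode("ascii"))
        return self.port.readline().decode("ascii").strip()

    # Command names below are placeholders, not the real firmware commands.
    def write_reg(self, addr, value):
        self.command("WRITE_REG", addr, value)

    def read_reg(self, addr):
        return int(self.command("READ_REG", addr))


class FakePort:
    """In-memory stand-in for a serial port, used to exercise the sketch."""

    def __init__(self):
        self.regs = {}
        self.replies = []

    def write(self, data):
        op, *args = data.decode("ascii").split()
        if op == "WRITE_REG":
            self.regs[int(args[0])] = int(args[1])
            self.replies.append(b"OK\n")
        elif op == "READ_REG":
            value = self.regs.get(int(args[0]), 0)
            self.replies.append(f"{value}\n".encode("ascii"))

    def readline(self):
        return self.replies.pop(0)


if __name__ == "__main__":
    ctrl = EmuController(FakePort())
    ctrl.write_reg(4, 123)
    print(ctrl.read_reg(4))  # 123
```

Both test files could then import one class like this instead of each carrying its own copy of the UART helpers.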