04 Timing Closure Techniques, Timing Constraints, Multicycle Path Constraints - alex-aleyan/xilinx GitHub Wiki
- alex-aleyan/xilinx/wiki/Vivado-Design-Methodology
- Multicycle Path
- Xilinx:
- https://www.youtube.com/watch?v=L7drJ_-M-N8&list=PLrAD71IQv85d0kyUMK0I4xHVZKHQiY-Eq&index=19&t=85s
- https://www.youtube.com/watch?v=zqGI7Vmrwr8
- UG903 - Vivado: Using Constraints
- UG895 - Working with Constraints
- UG906 - Closure Techniques
- UG938 - Closure Tutorial
- UG949 - Design Constraints
- UG1292 - Timing Closure Quick Reference Guide
- UG612 - Timing Closure User Guide (Older:2012)
- UG912 (for ASYNC_REG and such)
- UG901 (RTL Attributes for synthesis/timing) See these sections:
- Running Timing Analysis.
- UG904 (Vivado Design Suite User Guide: Implementation). See these sections:
- Guiding Implementation with Design Constraints.
- https://support.xilinx.com/s/article/44651?language=en_US
- UG903 Vivado Design Suite User Guide: Using Constraints
- ASIC - fixing setup/hold violation
- Limit the nesting of control constructs (if, else, for, when, case) to a depth no greater than 5.
- When performing comparisons
- Keep the LUT input sizes in mind when implementing comparisons.
- Counters:
- Pay extra attention to counters when counting up and comparing to a particular value - counting down and comparing may yield better results.
- Instead of encapsulating a counter and its comparison within your state machine, move them out to a separate process/always block. Count and compare in this outside process and set a flag for the FSM once the counter reaches the desired value.
- FSMs
- Never use wide-bus comparisons inside the FSM. Do the comparison outside the FSM and have the FSM check the result via a flag.
- Never infer multiplexers in the next-state or output logic. Implement the MUXes outside the FSM.
- Must contain a RESET state (driven by a synchronous reset in the same clock domain as the FSM).
- Make sure to provide default values for every FF to avoid latches and the resulting undefined behaviour.
- Floor planning (Device View)
- Perform preliminary floor planning to place the primitives (DSP and such) appropriately close to the pins from which they will receive their data.
- Vivado Design Suite QuickTake Video: Advanced Timing Exceptions - False Path, Min-Max Delay and Set_Case_Analysis.
- Vivado Design Suite User Guide: Design Analysis and Closure Techniques (UG906) - information on Floorplanning, Clock Networks Report, Clock Domain Crossing Synchronizer examples, details on using report_cdc
- Vivado Design Suite User Guide: System-Level Design Entry (UG895) - information on how to create and add constraint files and constraints sets to your project.
- IO Pin Planning in the Vivado Design Suite User Guide: I/O and Clock Planning (UG899) - information on Pin Assignment.
- UltraFast Design Methodology Guide for FPGAs and SoCs (UG949) - Timing Constraints Wizard.
- Vivado Design Suite User Guide: Using the Vivado IDE (UG893) - Timing Constraints Wizard.
- Vivado Design Suite Properties Reference Guide (UG912) - CDC and ASYNC_REG property (preserves FF through optimization during synthesis).
- Vivado Design Suite User Guide: Synthesis (UG901) - RTL Attributes (Synthesis Attributes section; DONT_TOUCH, USED_IN_SYNTHESIS, USED_IN_IMPLEMENTATION), Constraints (Running Timing Analysis section).
- Vivado Design Suite User Guide: Implementation (UG904) - (Guiding Implementation with Design Constraints section).
- Vivado Design Suite User Guide: Synthesis (UG901) - stuff on how Vivado flattens hierarchy...
- Schematic Viewer - to visualize the clock topology, and whether the clocks should not be timed together (clock groups; false path).
- Clock Networks Report - same as above.
- Clock Interactions Report - to view existing constraints between two clocks and if the clocks share primary clock (known phase, known common period).
- Primary Clock (Page 29)
- A board clock that enters the design through an input port or a gigabit transceiver output pin (for example, a recovered clock a.k.a. Derived Clock).
- Needed for computing the timing slack for setup, hold, recovery, and removal checks that appear in the Recommended Constraints table.
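A minimal sketch of primary clock definitions, assuming a hypothetical 100 MHz board clock on a port named sys_clk_p and a recovered clock on a transceiver output pin (the instance path gt0/RXOUTCLK and both periods are illustrative only):

```tcl
# Board clock entering through an input port: 10 ns period, 50% duty cycle.
# The port name sys_clk_p is an assumption for this sketch.
create_clock -name sys_clk -period 10.000 -waveform {0.000 5.000} [get_ports sys_clk_p]

# Recovered clock on a gigabit transceiver output pin.
# The hierarchy gt0/RXOUTCLK is a placeholder, not taken from a real design.
create_clock -name rxclk -period 3.333 [get_pins gt0/RXOUTCLK]
```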
- Virtual Clock
- A clock not sourced from any netlist element.
- Used to specify input and output delays of the data path, for timing analysis only.
- Created with the create_clock command, but without a source object (the port or pin from which the clock would enter the design).
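As a rough illustration, a virtual clock might be declared and then used as the reference for I/O delays as follows. The clock name virt_clk, the port din, and all delay values are assumptions made up for this sketch:

```tcl
# No source object is given, so the clock exists only in the timing model.
create_clock -name virt_clk -period 10.000

# Input delays on a hypothetical port, referenced to the virtual clock.
set_input_delay -clock virt_clk -max 4.000 [get_ports din]
set_input_delay -clock virt_clk -min 1.000 [get_ports din]
```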
- Physically Exclusive Clocks (Page 42)
- Two or more clocks MUXed onto a single physical path, so only one of them is present on the path for a given MUX selection.
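One way this can look in XDC, assuming the selection happens upstream of the FPGA so that two alternative clocks arrive on the same hypothetical port clk_in (names and periods are illustrative):

```tcl
# Two alternative clocks defined on the same port; -add keeps the first one.
create_clock -name clk_mode0 -period 10.000 [get_ports clk_in]
create_clock -name clk_mode1 -period 8.000 -add [get_ports clk_in]

# The two clocks can never exist at the same time, so do not time them together.
set_clock_groups -physically_exclusive -group clk_mode0 -group clk_mode1
```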
- Generated Clock (Page 30)
- Created using create_generated_clock.
- Created on the output of a sequential cell when that cell drives the clock pins of other sequential cells (directly or through interconnect). In other words, when the output of one FF drives the clock input of another FF, a signal is moved from the data path onto a clock path, and a generated clock constraint is required.
- Driven inside the design by special cells called Clock Modifying Blocks (for example, an MMCM) or by user logic. Every generated clock is associated with a master clock.
- We generate a clock for:
- A simple frequency division
- A simple frequency multiplication
- A combination of a frequency multiplication and division in order to obtain a non-integral ratio (usually done by MMCM and PLL).
- A phase shift or a waveform inversion
- A duty cycle transformation
- A combination of all the above
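A simple frequency division, the first case in the list above, could be constrained roughly like this. The port clkin and the divider register REGA are hypothetical names:

```tcl
# Master (primary) clock on the input port.
create_clock -name clkin -period 10.000 [get_ports clkin]

# REGA toggles every clkin cycle and its Q output clocks other flip-flops,
# so a divide-by-2 generated clock is declared on REGA/Q.
create_generated_clock -name clkdiv2 -source [get_ports clkin] -divide_by 2 [get_pins REGA/Q]
```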
- Forwarded Clock (Page 32, 90):
- A subset of Generated Clock.
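A sketch of a forwarded clock, assuming the clock leaves the device through an ODDR and is used as the reference for an output delay. The instance ODDR_inst, the ports CLKOUT and dout, and the delay value are placeholders:

```tcl
# Forwarded clock defined on the output port, sourced from the ODDR clock pin.
create_generated_clock -name ck_fwd -source [get_pins ODDR_inst/C] -divide_by 1 [get_ports CLKOUT]

# Output data is then constrained relative to the forwarded clock.
set_output_delay -clock ck_fwd -max 1.500 [get_ports dout]
```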
- Master Clock - the clock that reaches the clock pin.
- A Primary Clock is the master of a Generated Clock.
- If a generated clock ClockB is derived from a previously generated clock ClockA, then ClockA is the master clock of ClockB.
- Auto-generated/Auto-derived Clock:
- Created automatically by Vivado for:
- MMCM* / PLL*
- BUFG_GT / BUFGCE_DIV
- GT*_COMMON / GT*_CHANNEL / IBUFDS_GTE3
- BITSLICE_CONTROL / RX*_BITSLICE
- ISERDESE3
- PHASER* (7 Series only) An auto-generate
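Auto-derived clocks get tool-chosen names. A common way to give one a readable name is a create_generated_clock that specifies only the new name and the source pin. The MMCM instance path and the name clk_rx below are assumptions for illustration:

```tcl
# Rename the clock that Vivado auto-derives on an MMCM output.
# Only -name and the source object are given; the waveform is left to the tool.
create_generated_clock -name clk_rx [get_pins mmcm_i/CLKOUT0]

# List all clocks to confirm the new name took effect.
report_clocks
```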
- Timing Assertions Section
- Primary clocks
- Virtual clocks
- Generated clocks
- Clock Groups
- Bus Skew constraints
- Input and output delay constraints
- Timing Exceptions Section
- False Paths
- Max Delay / Min Delay
- Multicycle Paths
- Case Analysis
- Disable Timing
- Physical Constraints Section
- located anywhere in the file, preferably before or after the timing constraints, or stored in a separate constraint file
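The ordering above might translate into an XDC file skeleton like the following. Every port, cell, pin location, and delay value here is a made-up placeholder; only the section order follows the list:

```tcl
## Timing Assertions Section
# Primary clock
create_clock -name sys_clk -period 10.000 [get_ports sys_clk]
# Virtual clock
create_clock -name virt_clk -period 10.000
# Generated clock (hypothetical divider register div_reg)
create_generated_clock -name clk_div2 -source [get_ports sys_clk] -divide_by 2 [get_pins div_reg/Q]
# Input and output delay constraints
set_input_delay  -clock virt_clk -max 3.000 [get_ports din]
set_output_delay -clock virt_clk -max 2.000 [get_ports dout]

## Timing Exceptions Section
# False path from an asynchronous reset port
set_false_path -from [get_ports rst_async]
# Multicycle path: allow 2 cycles for setup and move the hold check back by 1
set_multicycle_path 2 -setup -from [get_cells src_reg] -to [get_cells dst_reg]
set_multicycle_path 1 -hold  -from [get_cells src_reg] -to [get_cells dst_reg]

## Physical Constraints Section
# Pin location and I/O standard are illustrative values only.
set_property PACKAGE_PIN AB12 [get_ports sys_clk]
set_property IOSTANDARD LVCMOS18 [get_ports sys_clk]
```

Keeping the hold multiplier at setup minus one is the usual pairing for a same-clock multicycle path, so the hold check stays at the launch edge.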
- Clocks
- Primary clocks (page 29).
- Generated clocks (page 30).
- Forwarded clocks (page 31).
- External feedback delays (page 33).
- Input and output ports
- Input delays (page 34).
- Synchronous (Describes the nature of the clock-data relationship):
- System: Use this setting when the data is launched and captured by different clock edges that are 1 period or ½ period apart.
- Source: Use this setting when the data is launched and captured by the same clock edge.
- Alignment (Describes the data transition alignment with respect to the active clock edge)
- Edge (For System Synchronous): Use this setting when the clock and data transition at the same time.
- Center (For Source Synchronous): Use this setting when the clock transitions in the middle of the data valid window.
- Edge Direct (For Source Synchronous): Use this setting when the clock transitions at the beginning of the data valid window.
- Edge MMCM (For Source Synchronous): Use this setting when the clock transitions at the end of the data valid window.
- Data Rate and Edge
- Single Rise: Use this setting for cases where only the rising clock edges launch the data outside the FPGA.
- Single Fall: Use this setting for cases where only the falling clock edges launch the data outside the FPGA.
- Dual: Use this setting for cases where both rising and falling clock edges launch the data outside the FPGA
- Output delays (page 37).
- Combinatorial delays (page 40).
- Clock domain crossing
- Physically exclusive clock groups
- Logically exclusive clock groups with no interaction
- Logically exclusive clock groups with interaction
- Asynchronous clock domain crossings
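A hedged sketch of how a few of the categories above can be expressed. All clock names, port names, and delay values are assumptions, and the clocks are presumed to have been created earlier in the file:

```tcl
# Single-data-rate input: max/min delay relative to the rising edge of sys_clk.
set_input_delay -clock sys_clk -max 4.000 [get_ports {data_in[*]}]
set_input_delay -clock sys_clk -min 1.000 [get_ports {data_in[*]}]

# Dual-data-rate input: a second constraint for the falling edge,
# added on top of the first with -add_delay.
set_input_delay -clock rx_clk -max 1.200 [get_ports {rx_d[*]}]
set_input_delay -clock rx_clk -max 1.200 -clock_fall -add_delay [get_ports {rx_d[*]}]

# Output delay relative to sys_clk.
set_output_delay -clock sys_clk -max 2.000 [get_ports {data_out[*]}]

# Logically exclusive clocks (for example, two inputs of a clock MUX
# with no paths between them outside the MUX fanout).
set_clock_groups -logically_exclusive -group clk0 -group clk1

# Asynchronous clock domain crossing between two unrelated clocks.
set_clock_groups -asynchronous -group [get_clocks clk_a] -group [get_clocks clk_b]
```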
- Constraints summary:
check_timing
report_drc
- Reading XDC files (no such thing as Constraint Groups)
read_verilog [glob src/*.v]
read_xdc wave_gen_timing.xdc
read_xdc wave_gen_pins.xdc
synth_design -top wave_gen -part xc7k325tffg900-2
opt_design
place_design
route_design
- Out-of-Context Constraints:
read_xdc -mode out_of_context constraints_ooc.xdc
- In Non-Project Mode, the constraints are read directly between any steps of the flow (NO NEED FOR USED_IN_SYNTHESIS/USED_IN_IMPLEMENTATION properties):
read_verilog [glob src/*.v]
read_xdc wave_gen_timing.xdc
synth_design -top wave_gen -part xc7k325tffg900-2
read_xdc wave_gen_pins.xdc
opt_design
place_design
route_design
- Manually editing constraints:
- Create new constraints using the Vivado Design Suite.
- Run one of the following commands:
write_xdc -exclude_physical timing_constraints.xdc
write_xdc -type timing timing_constraints.xdc
- Manually edit timing_constraints.xdc to move the new constraints higher in the XDC file.
- Save the file.
- Run the following command:
reset_timing
- Read the edited timing constraints file by typing:
read_xdc timing_constraints.xdc
- Report Timing Summary - broader view of the timing situation (sign off). Superset of check_timing, but the check_timing report can be run independently.
report_timing_summary
- check_timing
- Checks for missing or problematic constraints, such as:
- No clock.
- Constant clock (the clock is tied to constant logic).
- Unconstrained endpoints.
- Missing input or output delay constraints.
- Multiple clocks on the same pin.
- Generated clocks.
- Combinational feedback loops.
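For reference, the check can be run on its own and written to a file, or restricted to a subset of checks. The file name is arbitrary, and the check names passed to -override_defaults are given from memory, so treat them as an assumption to verify against the command help:

```tcl
# Run all default checks and save the result.
check_timing -file check_timing.rpt

# Run only selected checks (check names assumed; see "check_timing -help").
check_timing -override_defaults {no_clock no_input_delay no_output_delay}
```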
- report_timing
- Most specific: reports a particular cell-to-cell path.
- Report Clock Networks - shows the clock tree hierarchically, from the clock pin and its sourcing buffer through the MMCM and down.
report_clock_networks
- Report Utilization - available post synthesis.
report_utilization
- Report DRC (Design Rule Check)
report_drc
- Report Methodology (UltraFast design methodology) - looks at the netlist and how the resources are utilized and compares it to the Xilinx recommendations.
report_methodology
- Report Design Analysis - looks at specific paths and reports timing information on whether they are safely timed.
report_design_analysis
- Make sure to select the congestion option which helps to analyze the relative congestion across the chip.
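From the Tcl console, the congestion option and a couple of related views might be requested like this (the report file names are arbitrary choices for the sketch):

```tcl
# Relative congestion across the device (meaningful after placement).
report_design_analysis -congestion -file rda_congestion.rpt

# Characteristics of the 10 worst timing paths.
report_design_analysis -timing -max_paths 10 -file rda_timing.rpt

# Distribution of logic levels, useful for spotting overly deep paths.
report_design_analysis -logic_level_distribution -file rda_logic_levels.rpt
```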
- Report Clock Interaction - reports interactions between the clocks, and allows you to constrain the clocks with respect to each other.
report_clock_interaction
- The tool allows the user to click on any of the presented squares and set a constraint (async clock groups, false path, max delay datapath only).
- Legend:
- green (timed): the path is timed. The tool can trace both clocks to a common source such as an MMCM and automatically determine the worst-case relationship between them (rise to rise, rise to fall, fall to fall, fall to rise).
- black (no path): there is no data path between the two clock domains.
- blue (user ignored path): the user has already provided a constraint such as set_clock_groups -asynchronous or set_false_path.
- red (timed unsafe): bad clock domain crossing.
- After expanding 1,000 clock edges, the tool has not found a repeating relationship between the clocks (think of hunting-tooth gearing).
- This can be caused by using multiple clocks from the same MMCM whose frequencies are not integer multiples of each other (non-harmonic). In this case the tool can trace both clocks back to a common source (the MMCM), but it cannot determine the clock relationship (repetition/alignment) within 1,000 clock cycles.
- cyan (partial false path):
- orange (partial false path unsafe):
- purple (max delay datapath only):
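A sketch of the constraint choices offered for a clock pair, written as Tcl. The clock and cell names are hypothetical, and the two options are alternatives: a clock-group constraint takes precedence over a max delay on the same paths, so normally only one is applied to a given crossing:

```tcl
# Generate the matrix as a text report.
report_clock_interaction -delay_type min_max -file clock_interaction.rpt

# Option 1: declare the two domains asynchronous (all paths between them are ignored).
set_clock_groups -asynchronous -group [get_clocks clk_a] -group [get_clocks clk_b]

# Option 2: keep a bound on one specific crossing while ignoring clock skew.
# set_max_delay -datapath_only -from [get_cells src_reg] -to [get_cells sync_reg0] 5.000
```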
- Report CDC - examines your clock domain crossings and grades them based on the synchronizer circuitry the tool recognizes and on whether the appropriate properties (ASYNC_REG) are set. ASYNC_REG keeps the domain-crossing flops together in the same slice, which places them close to each other and improves the clock domain crossing.
report_cdc
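As an illustration, the property can be applied from XDC to a two-stage synchronizer and the crossing then re-checked. The register names sync_reg[0] and sync_reg[1] are assumptions:

```tcl
# Mark both synchronizer stages so they are preserved and placed together.
set_property ASYNC_REG TRUE [get_cells {sync_reg[0] sync_reg[1]}]

# Detailed per-path CDC report written to a file.
report_cdc -details -file cdc.rpt
```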
- Report Exceptions - reports everything covered by your timing exceptions. Good for catching bad exception constraints that unintentionally cover more paths than intended because of wildcards.
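A small example of the wildcard problem and how the report can help, with made-up cell names. The broad pattern is shown commented out next to a narrower replacement:

```tcl
# Too broad: matches every cell with "cfg" in its name, not just the intended register.
# set_false_path -from [get_cells *cfg*]

# Narrower: name the exact start point and end point of the intended crossing.
set_false_path -from [get_cells cfg_regs/mode_reg] -to [get_cells core/mode_sync_reg0]

# Check how many paths each exception actually covers, and which ones are ignored.
report_exceptions -coverage -file exceptions_coverage.rpt
report_exceptions -ignored -file exceptions_ignored.rpt
```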
- Report Bus Skew - reports inter-bit skew in a bus (a better way of handling this situation is to use IO registers; these are no longer automatically inferred and must be explicitly instantiated by the designer). Used to evaluate both internal buses and external buses that drive off-chip devices.
report_bus_skew
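The report only has something to check once a bus skew constraint exists. A sketch for a hypothetical gray-coded pointer crossing between two domains (register names and the 2.5 ns limit are assumptions):

```tcl
# Limit the skew between the bits of the crossing to 2.5 ns.
set_bus_skew -from [get_cells {gray_src_reg[*]}] -to [get_cells {gray_sync_reg[*]}] 2.500

# Report the achieved skew against that requirement.
report_bus_skew -file bus_skew.rpt
```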
- Report Power - runs on a netlist. A SAIF (Switching Activity Interchange Format) file from simulation can be used as input to report power, which automates the “set switching activity” part of the pre-configuration of this report.
report_power
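In Tcl this flow might look as follows, assuming a SAIF file at sim/top.saif whose hierarchy starts at a testbench instance tb/dut (both are placeholders):

```tcl
# Load switching activity from simulation; -strip_path removes the testbench
# hierarchy so the remaining names match the design netlist.
read_saif -strip_path tb/dut sim/top.saif

# Power estimate using the annotated activity.
report_power -file power.rpt
```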
- Report QoR Suggestions - similar to report methodology.
report_qor_suggestions
- Reads in the design and evaluates critical paths for too many logic levels compared to the path requirement.
- Used prior to Place & Route to catch issues up front, instead of waiting for P&R to finish only to find that these issues prevented the design from meeting timing.
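A possible way to run this before place and route and keep the suggestions for a later run (file names are arbitrary, and the write/read step is an optional extra not described in the notes above):

```tcl
# Generate the suggestions report on the synthesized or optimized design.
report_qor_suggestions -file rqs.rpt

# Save the suggestions so a later run can apply them with read_qor_suggestions.
write_qor_suggestions -force rqs_suggestions.rqs
```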
- Timing Summary Report - provides a complete sign-off report and overview of the timing status.
report_timing_summary
- Available after synthesis.
- General Information section specifies target device, speed grade of the device, etc.
- Timing Setting section provides info about timing engine settings that is used to generate the timing report.
- Design Summary section provides info on whether the design met timing - WNS, TNS, WHS, THS, and failing endpoints.
- When failing timing, consider:
- Slack: large or small?
- Requirement: reasonable or not, the expected value?
- Source and destination clocks: same clock or different clocks?
- Datapath delay: large or small relative to the requirement?
- Levels of logic: large or small?
- Skew and uncertainty: large or small? One useful trick is to use a PLL instead of an MMCM: the PLL is smaller and simpler and therefore introduces less clock uncertainty. Remember that uncertainty reduces slack in both analyses: it is subtracted from the required time in setup analysis and added to it in hold analysis.
- Useful to run after optimization but before routing, to see what the block delays look like and whether enough slack is left for the routing delays that place and route will add. For example, with a 200 MHz clock the period is 5 ns. If the block delay of a given path exceeds 2.5 ns (>50% of the period), the design will be harder to route. If the block delay is 5 ns, no slack is left for routing and the path is doomed to fail timing once routed.
- The timing summary report sections:
- Intra-Clock Paths: summarizes the worst slack and total violations of timing paths with the same source and destination clocks.
- Inter-Clock Paths: summarizes the worst slack and total violations of timing paths with different source and destination clocks.
- Other Path Groups: default path groups and user-defined path groups created by the group_path command.
- Unconstrained Paths: details paths that do not have timing constraints or timing exception paths.
- Options:
- -delay_type: tells the tool which type of path delay to report; Values: max, min, and min_max
- -report_unconstrained: whether or not to report the unconstrained paths; Default: Enabled
- -report_datasheet: reports the data requirements at the package pins; useful when you want to use an IDELAY and need to know what tap values to set; Default: Disabled
- -max_paths: maximum number of paths to report per clock group or path group; Default: 1
- -nworst: lists up to N worst paths per endpoint; Default: 1
- -slack_lesser_than: displays only paths with slack below this threshold; Default: 1e+30
- -significant_digits: number of digits to display; Range: 0 to 13
- -name: output results to GUI panel: name; Default: timing_
- -input_pins: whether or not to show the input pins on the reported path (the report is more detailed).
- -unique_pins: narrows the report down to one path through each unique set of pins.
- -file: tells the tool to write the results into an output file.
- -quiet: ignore command errors
- -verbose: verbosity level.
- -interconnect (actual|estimated): estimated can be used to speed up the report, but it will be less accurate.
- Speed Grade: used for a “what if?” case study to see if a faster device resolves your timing issues.
- Multi-corner configurations: be very careful when changing this, as it configures your worst-case and best-case PVTs (process, voltage, temperature). You would be restricting how the PVTs are used and affecting the sign-off.
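Putting several of these options together, a command-line run might look like the sketch below. The option values and file name are arbitrary, and the slack check at the end is one common idiom rather than part of the report itself:

```tcl
# Setup and hold analysis, 10 paths per group, written to a file.
report_timing_summary -delay_type min_max -report_unconstrained \
    -max_paths 10 -nworst 1 -input_pins -significant_digits 3 \
    -file timing_summary.rpt

# Fetch the worst setup slack directly and flag a failure in the log.
set wns [get_property SLACK [get_timing_paths -max_paths 1 -nworst 1 -setup]]
if {$wns < 0} {
    puts "Setup timing not met: WNS = $wns ns"
}
```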
-
- Default: