00 Overview - alex-aleyan/xilinx GitHub Wiki

Links:

https://xilinx-wiki.atlassian.net/wiki/spaces/A/overview
https://store.digilentinc.com/zedboard-zynq-7000-arm-fpga-soc-development-board/
https://reference.digilentinc.com/reference/programmable-logic/zedboard/start?redirect=1
https://reference.digilentinc.com/reference/programmable-logic/zedboard/reference-manual
http://zedboard.org/product/zedboard
https://www.arm.com/resources/education/education-kits
http://www.zynqbook.com/?_ga=2.3610928.1238336875.1589467975-1278097476.1586913541
https://www.xilinx.com/products/silicon-devices/soc/zynq-7000.html?_ga=2.3610928.1238336875.1589467975-1278097476.1586913541

XILINX Docs:

Get Xilinx Document Navigator

UltraScale+:

Verification

  • Ray Salemi's book, FPGA Simulation, explains in readable, well-broken-down chunks how to improve FPGA verification, starting from manual directed testbenches. It covers the low-hanging fruit first (turning on code coverage and trying to improve it), then moves on to test planning, assertions, transactions, self-directed testbenches, automated stimulus, and functional coverage.

FPGA Review

  • Memory: DRAM (distributed RAM, 64b per LUT), BRAM (36Kb), URAM (288Kb), HBM (High Bandwidth Memory).

    • DRAM (UG574):

      • SLICEM CLB
      • 64-bit DRAM (1 LUT) or 32-bit shift register.
      • Up to 512 bits per SLICEM/CLB (8 LUTs per CLB)
      • Implemented as a synchronous RAM resource (also described as distributed RAM)
      • Multiple (adjacent) LUTs in a SLICEM can be combined in various ways to store larger amounts of data (up to 512 bits per SLICEM)
      • Ideal for small and fast memories
      • Distributed RAM configurations include:
        • Single Port: A common address port for synchronous writes and asynchronous reads
        • Dual Port: One port for synchronous writes and asynchronous reads and the other port for asynchronous reads
        • Simple dual port: One port for synchronous writes (no data out/read port from the write port) and the other port for asynchronous reads
      • Possible configurations include:
        • single-port 32 x (1 to 16)-bit RAM.
        • dual-port 32 x (1 to 8)-bit RAM.
        • simple dual-port 32 x (1 to 14)-bit RAM.
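
The synchronous-write, asynchronous-read behaviour of the distributed RAM configurations above can be sketched as a small behavioural model. This is an illustrative Python sketch, not HDL and not a Xilinx primitive; the class name `SimpleDualPortLutRam` and its methods are hypothetical. The default 64 x 1 geometry mirrors the 64-bit single-LUT RAM noted earlier.

```python
class SimpleDualPortLutRam:
    """Behavioural sketch of a simple dual-port distributed RAM.

    Writes take effect only on a clock edge (synchronous write);
    reads return the stored value immediately (asynchronous read).
    """

    def __init__(self, depth=64, width=1):
        self.depth = depth
        self.mask = (1 << width) - 1
        self.mem = [0] * depth
        self._pending = None  # (addr, data) waiting for the next clock edge

    def set_write(self, addr, data, we=True):
        """Drive the write port; nothing is stored until clock_edge()."""
        self._pending = (addr % self.depth, data & self.mask) if we else None

    def clock_edge(self):
        """Rising clock edge: commit the pending write, if any."""
        if self._pending is not None:
            addr, data = self._pending
            self.mem[addr] = data
            self._pending = None

    def read(self, addr):
        """Asynchronous read port: combinational, no clock involved."""
        return self.mem[addr % self.depth]


ram = SimpleDualPortLutRam(depth=64, width=1)
ram.set_write(addr=5, data=1)
before_edge = ram.read(5)  # write has not been clocked in yet
ram.clock_edge()
after_edge = ram.read(5)   # new value is readable right after the edge
print(before_edge, after_edge)
```

The read port never waits for a clock, which is why these memories suit small, fast lookups.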
    • BRAM (UG573):

      • 32Kx1, 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36, 512x72 (Simple Dual Port only).
      • Each BRAM block is 36Kb, usable as one 36Kb memory or as two independent 18Kb halves (memory + FIFO, or memory + memory).
      • Depth cascading of up to 12 BRAM blocks (per clock region).
      • Standard data output cascade
      • Pipelined data output cascade.
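
As a quick sanity check on the BRAM aspect ratios listed above, the sketch below multiplies depth by width for each one. The helper names `label` and `capacity_kb` are hypothetical. The x1, x2 and x4 widths come to 32Kb, while x9 and wider reach the full 36Kb; the extra bit per byte in the wider modes is generally the parity bit.

```python
# (depth, width) aspect ratios of one 36Kb block RAM, copied from the list above.
BRAM36_CONFIGS = [
    (32 * 1024, 1), (16 * 1024, 2), (8 * 1024, 4),
    (4 * 1024, 9), (2 * 1024, 18), (1024, 36), (512, 72),
]


def label(depth, width):
    """Format a (depth, width) pair the way the list writes it, e.g. '4Kx9'."""
    depth_text = f"{depth // 1024}K" if depth >= 1024 else str(depth)
    return f"{depth_text}x{width}"


def capacity_kb(depth, width):
    """Bits stored by this aspect ratio, in Kb (1 Kb = 1024 bits)."""
    return depth * width / 1024


capacities = {label(d, w): capacity_kb(d, w) for d, w in BRAM36_CONFIGS}
for name, kb in capacities.items():
    print(f"{name:>7}: {kb:g} Kb")
```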
    • URAM (UG573):

      • 4Kx72 (4K deep by 72 wide).
      • Each URAM block is 288Kb.
      • 16 UltraRAM blocks per clock region per column.
      • True Dual Port memory: single-clocked, two port, synchronous memory.
      • Target UltraRAM for memories of 144Kb or larger (if a memory would use four or more BRAMs, consider using UltraRAM instead).
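
The 144Kb rule of thumb above can be written down as a rough sizing helper. This is a sketch under a simplifying assumption: it counts total bits only and ignores aspect-ratio and port constraints, so a real mapping may need more blocks. The function name `suggest_memory` is hypothetical.

```python
import math

BRAM_BITS = 36 * 1024   # one 36Kb block RAM
URAM_BITS = 288 * 1024  # one 288Kb UltraRAM (4K x 72)


def suggest_memory(depth, width):
    """Return (resource, block_count) using the four-or-more-BRAMs rule of thumb.

    Assumption: capacity is counted in raw bits; width and port limits are ignored.
    """
    bits = depth * width
    brams = math.ceil(bits / BRAM_BITS)
    if brams >= 4:
        return ("URAM", math.ceil(bits / URAM_BITS))
    return ("BRAM", brams)


print(suggest_memory(1024, 36))  # fits in a single BRAM
print(suggest_memory(4096, 36))  # 144Kb, so the rule points to UltraRAM
print(suggest_memory(4096, 72))  # exactly one UltraRAM
```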
  • DSP48E2 (UG579)

  • IO:

  • GbT:

    • OTN and OTN mapping, FlexE (Flex Ethernet) interface.

    • GTR (6 Gb/s), GTX (12.5 Gb/s), GTH (16.3 Gb/s), GTZ (28 Gb/s), GTY (32 Gb/s), GTM (58 to 112 Gb/s)

      • GTYE5
  • Supported Protocols:

    • PCIe:
      • Gen1 (2.5 GT/s, 8b/10b), Gen2 (5 GT/s, 8b/10b), Gen3 (8 GT/s, 128b/130b).
      • Hard IP only: Integrated Block for PCIe (UltraScale arch).
      • Hard IP for Physical Layer PHY IP + Soft IP for PCIe MAC.
      • CPM block: Versal devices can include an integrated block for PCIe with DMA and cache coherent interconnect (CCIX).
    • Ethernet: Preamble (0x55), SFD, DA, SA, Length/Type, Payload, FCS, IFG.
    • Aurora: board-to-board (B2B) and chip-to-chip (C2C); system-synchronous or asynchronous clocking.
    • Interlaken: C2C.
    • JESD204: C2C for interfacing ADCs and DACs.
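
For the PCIe generations listed above, the line encoding determines how much of the raw transfer rate carries payload bits. The sketch below works that out per lane from the rates and encodings in the list; it ignores packet and protocol overhead (headers, flow control), so real throughput is lower. The name `payload_gbps` is hypothetical.

```python
from fractions import Fraction

# generation -> (raw rate per lane in GT/s, encoding efficiency), from the list above
PCIE_GENERATIONS = {
    "Gen1": (Fraction(5, 2), Fraction(8, 10)),   # 2.5 GT/s, 8b/10b
    "Gen2": (Fraction(5), Fraction(8, 10)),      # 5 GT/s, 8b/10b
    "Gen3": (Fraction(8), Fraction(128, 130)),   # 8 GT/s, 128b/130b
}


def payload_gbps(generation, lanes=1):
    """Bits per second left after line encoding, in Gb/s."""
    rate, efficiency = PCIE_GENERATIONS[generation]
    return float(rate * efficiency * lanes)


for gen in PCIE_GENERATIONS:
    print(f"{gen}: {payload_gbps(gen):.3f} Gb/s per lane, "
          f"{payload_gbps(gen, lanes=8):.3f} Gb/s for x8")
```

Moving from 8b/10b to 128b/130b cuts the encoding overhead from 20% to about 1.5%, which is why Gen3 nearly doubles Gen2 payload rate with only a 1.6x increase in raw rate.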
  • Line modulation?:

    • PAM4 (GTM; 64b/66b, 128b/130b)
    • NRZ (GTY/GTYP; 64b/66b)
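
The practical difference between the two modulations is bits per symbol: NRZ carries one bit per symbol and PAM4 carries two, so PAM4 halves the symbol rate for the same line rate. A minimal sketch, using the 32 Gb/s and 58 Gb/s figures from the transceiver list earlier as example inputs; the function name `symbol_rate_gbaud` is hypothetical.

```python
BITS_PER_SYMBOL = {"NRZ": 1, "PAM4": 2}  # two levels vs four levels


def symbol_rate_gbaud(line_rate_gbps, modulation):
    """Symbol (baud) rate needed to carry a given line rate."""
    return line_rate_gbps / BITS_PER_SYMBOL[modulation]


nrz_baud = symbol_rate_gbaud(32, "NRZ")    # 32 Gb/s NRZ link
pam4_baud = symbol_rate_gbaud(58, "PAM4")  # 58 Gb/s PAM4 link
print(nrz_baud, pam4_baud)
```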
  • Layers:

    • PMD (Physical Medium Dependent): LASER and Connector.
    • PMA (Physical Medium Attachment): SERDES
      • OOB
      • PCIe
      • Pre/Post Emphasis
      • PISO (parallel-in, serial-out) block: takes parallel data and serializes it LSb first.
      • Clock Interpolator.
      • Transmit Driver.
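
The LSb-first ordering of the PISO block can be illustrated in a few lines. This is a behavioural sketch only; `piso_serialize` and `sipo_deserialize` are hypothetical helper names, and the 4-bit width is chosen just to keep the example readable.

```python
def piso_serialize(word, width):
    """Return the bits of `word` in transmit order, least significant bit first."""
    return [(word >> i) & 1 for i in range(width)]


def sipo_deserialize(bits):
    """Rebuild the word from bits received LSb first (the receive-side inverse)."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word


tx_bits = piso_serialize(0b1101, width=4)
rx_word = sipo_deserialize(tx_bits)
print(tx_bits, bin(rx_word))
```

Bit 0 of the parallel word is the first bit on the wire, so 0b1101 goes out as 1, 0, 1, 1.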
    • PCS (Physical Coding Sublayer)
      • Encoder (8b/10b, 64b/66b, 128b/130b).
        • Review SYNC PREAMBLE (2-bit), TYPE (8-bit), CONTROL (7-bit), DATA (8-bit).
          • Sync Preamble:
            • 0b01: pure data
            • 0b10: type, control + data.
          • Control words:
            • 0xFB: /S/ Start, start of packet.
            • 0xFD: /T/ Terminate - end of packet.
            • 0x07 or 0x00: /I/ Idle - no payload available.
            • /E/ Error - error indication.
            • /Q/ Ordered sets - control and status words
      • Gearbox
      • Scrambler (64b/66b)
        • Scrambling of the payload provides adequate transition for clock recovery.
        • Handles DC bias problems.
      • CRC Generator
      • OOB Generator
      • PRBS (Pseudo Random Binary Sequence).
      • 64b/66b - line encoding scheme developed for 10 Gigabit Ethernet that uses a scrambling method combined with a non-scrambled sync pattern and control type.
        • Alignment:
          • A sync value of 01 or 10 appears every 66 bits; the receiver uses it to find block alignment.
          • To speed lock time, alternative/optional protocols replace data with special training or locking sequences that can ease alignment.
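
The alignment idea above (a valid 01 or 10 sync value every 66 bits) can be sketched as a brute-force search over the 66 possible bit offsets. This is an illustration, not how any particular transceiver implements block lock: the function names are hypothetical, the two sync bits are simply treated as a pair in stream order, and seeded random bits stand in for scrambled payload.

```python
import random

BLOCK_BITS = 66
VALID_SYNC = {(0, 1), (1, 0)}  # 00 and 11 never appear as a sync header


def make_block(sync, payload_bits):
    """Prepend a 2-bit sync header to 64 payload bits."""
    if sync not in VALID_SYNC or len(payload_bits) != 64:
        raise ValueError("need a valid sync header and 64 payload bits")
    return list(sync) + list(payload_bits)


def find_block_offset(bits, blocks_to_check=32):
    """Return the first offset (0..65) where every checked block has a valid header."""
    if len(bits) < BLOCK_BITS * (blocks_to_check + 1):
        raise ValueError("stream too short for the requested check depth")
    for offset in range(BLOCK_BITS):
        if all(
            (bits[offset + k * BLOCK_BITS], bits[offset + k * BLOCK_BITS + 1]) in VALID_SYNC
            for k in range(blocks_to_check)
        ):
            return offset
    return None


rng = random.Random(0)
skew = 17  # unknown to the receiver: junk bits before the first block
stream = [rng.getrandbits(1) for _ in range(skew)]
for i in range(64):
    payload = [rng.getrandbits(1) for _ in range(64)]
    sync = (0, 1) if i % 4 else (1, 0)  # mostly data blocks, some control blocks
    stream += make_block(sync, payload)

lock_offset = find_block_offset(stream)
print(lock_offset)
```

A wrong offset only survives while random payload bits happen to look like a valid header, which becomes vanishingly unlikely as more blocks are checked. This is also why scrambling the payload helps alignment.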
    • MAC (Media Access Control)
      • Hard IP Cores:
        • MRMAC (FEC + PCS + MAC)
        • DCMAC (FEC + PCS + MAC)
    • PMA:
      • System Synchronous, Source Synchronous, Implicitly Synchronous.

      • GTHE3:

        • Clock sources:
          • 2 pairs of Dedicated Differential Clock Pins/Buffers:
            • GTREFCLK0 (IBUFDS_GTE3, OBUFDS_GTE3)
            • GTREFCLK1 (IBUFDS_GTE3, OBUFDS_GTE3)
          • One from within the fabric: GTGREFCLK.
          • Neighboring QPLLs: GTNORTHREFCLK0, GTNORTHREFCLK1, GTSOUTHREFCLK0, GTSOUTHREFCLK1
        • PLLs (Ring vs LC?):
          • 4 Channel PLL (CPLL; GTHE3_CHANNEL[0,1,2,3]):
            • range 2 to 6.25 GHz,
            • one per transceiver channel (four per transceiver quad).
          • 2 Quad PLLs (QPLL; GTHE3_COMMON) :
            • Two fractional LC PLLs per transceiver quad.
            • LC-tank VCOs with the following frequency ranges:
            • QPLL0: 9.8 to 16.3 GHz; QPLL1: 8.0 to 13.0 GHz
          • Neighboring QPLLs:
            • Two north QPLLs: GTNORTHREFCLK0, GTNORTHREFCLK1
            • Two south QPLLs: GTSOUTHREFCLK0, GTSOUTHREFCLK1
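
Given the GTHE3 PLL ranges listed above, a small lookup can show which PLLs could generate a target PLL frequency. This is a sketch that checks the frequency range only; it does not model reference clock dividers or the relation between PLL frequency and line rate. The name `plls_covering` is hypothetical.

```python
# PLL frequency ranges in GHz, copied from the list above.
GTHE3_PLL_RANGES_GHZ = {
    "CPLL": (2.0, 6.25),
    "QPLL0": (9.8, 16.3),
    "QPLL1": (8.0, 13.0),
}


def plls_covering(freq_ghz):
    """Return the PLLs whose range includes the requested frequency."""
    return [
        name
        for name, (low, high) in GTHE3_PLL_RANGES_GHZ.items()
        if low <= freq_ghz <= high
    ]


print(plls_covering(5.0))      # only the channel PLL reaches this low
print(plls_covering(10.3125))  # both QPLLs overlap here
print(plls_covering(7.0))      # gap between CPLL and QPLL1 ranges
```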
      • GTYE5_QUAD:

        • Clock sources/pins:
          • Dedicated Clock Pins/Buffers:
            • *_GTREFCLK0 (IBUFDS_GTE5, OBUFDS_GTE5).
            • *_GTREFCLK1 (IBUFDS_GTE5, OBUFDS_GTE5).
          • Neighbor’s QPLLs:
            • *_NORTHREFCLK0, *_NORTHREFCLK1,
            • *_SOUTHREFCLK0, *_SOUTHREFCLK1
          • One from within the fabric: *_GTGREFCLK.
          • Note: the *_ prefix is one of:
            • HSCLK0_RPLL, HSCLK0_LC for HSCLK0 block
            • HSCLK1_RPLL, HSCLK1_LC for HSCLK1 block
        • xBUFDS_GTE5
          • one O output pin
          • one ODIV2 output pin (selectable as O, O/2, 0, or reserved)
        • PLLs (HSCLK0 block, and HSCLK1 block):
          • Two HSCLK blocks per GbT Quad
            • HSCLK0 for CHANNEL0 Xcvr and CHANNEL1 Xcvr
            • HSCLK1 for CHANNEL2 Xcvr and CHANNEL3 Xcvr
          • Each HSCLK block has:
            • One LCPLL (8GHz - 16GHz)
            • One RPLL (4GHz - 8GHz)
  • Constraints:

  • CDC:

    • Resources:
    • Clock Domains:
      • Related clocks: are a 10 MHz clock and its derived 5 MHz clocks a single clock domain?
    • Clocks are synchronous when both their frequency and phase relationship to each other is known.
      • Frequency difference is known.
      • Phase is not out of alignment.
    • Reports:
      • Clock Interaction Report:
      • CDC Report:
        • report_cdc -name cdc_1 (use GUI)
        • Summary by Clock Pair, Summary by Type, Detailed Report (schematic - use F4 key).
      • MTBF Report (Mean Time Between Failures):
        • report_synchronizer_mtbf
        • used to evaluate metastability issues
        • Design Summary
        • Default Synchronizer Chain Summary
        • Default hard FIFO Summary.
    • Constraints
      • ASYNC_REG property.
      • set_clock_groups (be careful and avoid).
      • set_false_path: declares a CDC path as a false path, point-to-point (from a given startpoint to a given endpoint).
      • set_max_delay -datapath_only: bounds the datapath delay only, which in effect tells the tool to place the cells within the specified time of each other.
      • TIMING-24: raised when the tool determines that a set_max_delay constraint has been overridden by a set_clock_groups or set_false_path constraint.
    • XPM CDC IP (PG382):
      • Single-bit Array Synchronizer.
      • Asynchronous Reset Synchronizer.
      • Synchronizer via Gray Encoding.
      • Bus Synchronizer with Full Handshake.
      • Pulse Transfer.
      • Single-bit Synchronizer.
      • Synchronous Reset Synchronizer.
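
The Gray-encoding synchronizer in the list above relies on one property: consecutive Gray codes differ in exactly one bit, so a multi-bit counter sampled in another clock domain is off by at most one count. A minimal sketch of the conversion in both directions; the function names are hypothetical and this is plain Python, not the XPM macro itself.

```python
def bin_to_gray(value):
    """Binary to reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_bin(gray):
    """Gray code back to binary by folding in successively shifted copies."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


WIDTH = 4
codes = [bin_to_gray(i) for i in range(1 << WIDTH)]

# Count differing bits between neighbours, including the wrap from 15 back to 0.
bit_flips = [
    bin(codes[i] ^ codes[(i + 1) % len(codes)]).count("1")
    for i in range(len(codes))
]
print([format(c, "04b") for c in codes[:4]], set(bit_flips))
```

Because only one bit changes per increment, a receiver that samples during a transition sees either the old or the new value, never an unrelated one.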
    • AXI
      • AXI Synchronization IP
      • AXI Stream CDC
      • AXI Clock Converter
      • AXI Interconnect for more complex circuitry like memmap (uses AXI clock converter and synchronization circuitry).
  • PCIe
