ZedBoard: The Zynq Book Notes
UG761 - AXI Reference Guide
UG1037 - Vivado AXI Reference Guide
- Zynq combines a dual-core ARM Cortex-A9 processor (revision r3p0, ARM v7), which forms the PS, with Field Programmable Gate Array (FPGA) logic, which forms the PL.
- ARM Cortex-A9 is an application grade processor, capable of running full operating systems such as Linux, while the programmable logic is based on Xilinx 7-series FPGA architecture.
- the architecture is completed by industry-standard AXI interfaces, which provide high-bandwidth, low-latency connections between the two parts of the device.
- system-on-a-board vs system-on-chip (ASIC) vs System-on-Programmable-Chip (non-ASIC, comprising HPS+FPGA):
- system-on-a-board.
- system-on-chip (completely ASIC) - typical microcontroller.
- System-on-Programmable-Chip, comprising HPS+FPGA (APSoC - All Programmable SoC, in Xilinx terminology):
- Processing System (PS/HPS ARM processor - supports software routines and/or operating systems), formed around a dual-core ARM Cortex-A9 (revision r3p0) processor.
- Programmable Logic (PL/FPGA - ideal for implementing high-speed logic, arithmetic and data flow subsystems).
- AXI bus (Advanced eXtensible Interface) - links the PS and PL.
- IP-XACT - industry standard IP packaging format.
[1]ARM, “AMBA 4 AXI4-Stream Protocol Specification”, v1.0, March 2010.
Available: http://www.arm.com/products/system-ip/amba/ (then “Download Specifications”).
[2]ARM, “AMBA AXI and ACE Protocol Specification: AXI3, AXI4, and AXI-Lite, ACE and ACE-Lite”, February 2013.
Available: http://www.arm.com/products/system-ip/amba/ (then “Download Specifications”).
[3]ARM, “AMBA Open Specifications”, webpage.
Available: http://www.arm.com/products/system-ip/amba/amba-open-specifications.php
[4]ARM, “Architectures, Processors and Devices Development Article”, May 2009.
Available: http://infocenter.arm.com/help/topic/com.arm.doc.dht0001a/DHT0001A_architecture_processors_and_devices.pdf
[5]ARM, “ARM Architecture Reference Manual: ARMv7-A and ARMv7-R edition”, July 2012.
Available: https://silver.arm.com/download/ARM_and_AMBA_Architecture/AR570-DA-70000-r0p0-00rel1/DDI0406C_b_arm_architecture_reference_manual.pdf (account sign in required)
[6]ARM white paper, “The ARM Cortex-A9 Processors”, v2.0, September 2009.
Available: http://www.arm.com/files/pdf/ARMCortexA-9Processors.pdf
[7]ARM, “Cortex-A9 Floating-Point Unit Technical Reference Manual”, revision r3p0, July 2011.
Available: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0408g/DDI0408G_cortex_a9_fpu_r3p0_trm.pdf
[8]ARM, “Cortex-A9 MPCore Technical Reference Manual”, revision r3p0, July 2011.
Available: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0407g/DDI0407G_cortex_a9_mpcore_r3p0_trm.pdf
[9]ARM, “Cortex-A9 NEON Media Processing Engine Technical Reference Manual”, revision r3p0, July 2011. Available:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0409g/DDI0409G_cortex_a9_neon_mpe_r3p0_trm.pdf
[10]Freescale Semiconductor, “Reference Manual - M68HC11” (Section 8. Synchronous Serial Peripheral Interface), Rev. 6.1, 2007.
Available: http://www.freescale.com/files/microcontrollers/doc/ref_manual/M68HC11RM.pdf
[11]“IEEE Standard for Floating-Point Arithmetic”, IEEE Computer Society, revision IEEE Std. 754-2008, August 2008.
[12]“IEEE Standard Test Access Port and Boundary-Scan Architecture”, IEEE Computer Society, revision IEEE Std. 1149.1-1990 including IEEE Std. 1149.1a-1993, February 1990 and June 1993.
[13]David A. Patterson and John L. Hennessy, Computer Organization and Design: The Hardware / Software Interface, 4th Ed., Morgan Kaufmann, 2012.
[14]Philips, “I2C Bus Specification and User Manual”, UM10204, Rev. 5, October 2012.
Available: http://www.nxp.com/documents/user_manual/UM10204.pdf
[15]Qin, Leon, “Using NEON for Parallel Data Processing; Zynq-7000 Hardware Architecture”, Xilinx presentation, October 2012.
Available: http://www.xilinx.com/Attachment/53775/Neon_Introduction_for_Avnet_training.pdf
[16]R. Wilson, “Truth About Xilinx Love Affair with AMBA”, Electronics Weekly, 28th June 2010.
Available: http://www.electronicsweekly.com/articles/28/06/2010/48931/truth-about-xilinx-love-affair-with-amba.htm
[17]Xilinx, Inc., “7 Series DSP48E1 Slice User Guide”, UG479, v1.7, May 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug479_7Series_DSP48E1.pdf
[18]Xilinx, Inc., “7 Series FPGAs and Zynq-7000 All Programmable SoC XADC Dual 12-Bit 1 MSPS Analog-to-Digital Converter User Guide”, UG480, v1.4, May 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug480_7Series_XADC.pdf
[19]Xilinx, Inc., “7 Series FPGAs Clocking Resources User Guide”, UG472, v1.10, May 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug472_7Series_Clocking.pdf
[20]Xilinx, Inc., “7 Series FPGAs Configurable Logic Block User Guide”, UG474, v1.5, August 2013.
Available: http://www.xilinx.com/support/documentation/user_guides/ug474_7Series_CLB.pdf
[21]Xilinx, Inc., “7 Series FPGAs GTX/GTH Transceivers User Guide”, UG476, v1.10, February 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug476_7Series_Transceivers.pdf
[22]Xilinx, Inc., “7 Series FPGAs Integrated Block for PCI Express Product Guide”, PG054, v3.0, June 2014. Available: http://www.xilinx.com/support/documentation/ip_documentation/pcie_7x/v3_0/pg054-7series-pcie.pdf
[23]Xilinx, Inc., “7 Series FPGAs Memory Resources User Guide”, UG473, v1.10.1, May 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug473_7Series_Memory_Resources.pdf
[24]Xilinx, Inc., “7 Series FPGAs SelectIO Resources User Guide”, UG471, v1.4, May 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug471_7Series_SelectIO.pdf
[25]Xilinx, Inc., “AXI Reference Guide”, UG761, v14.3, November 2012.
Available: http://www.xilinx.com/support/documentation/ip_documentation/axi_ref_guide/latest/ug761_axi_reference_guide.pdf
[26]Xilinx, Inc., “LogiCORE® IP 7 Series FPGAs Transceivers Wizard v2.6 User Guide”, UG769, June 2013. Available: http://www.xilinx.com/support/documentation/ip_documentation/gtwizard/v2_6/ug769_gtwizard.pdf
[27]Xilinx, Inc., “LogiCORE IP MicroBlaze Micro Controller System”, Product Specification, DS865, v1.1, April 2012.
Available: http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_1/ds865_microblaze_mcs.pdf
[28]Xilinx, Inc., “Security Solutions” webpage.
Available: http://www.xilinx.com/products/silicon-devices/soc/zynq-7000/security.html
[29]Xilinx, Inc., “Xilinx and ARM Announce Development Collaboration”, press release, 19th October 2009. Available: http://press.xilinx.com/2009-10-19-Xilinx-and-ARM-Announce-Development-Collaboration
[30]Xilinx, Inc., “Zynq-7000 All Programmable SoC Overview”, Preliminary Product Specification, DS190, v1.6, December 2013.
Available: http://www.xilinx.com/support/documentation/data_sheets/ds190-Zynq-7000-Overview.pdf
[31]Xilinx, Inc., “Zynq-7000 All Programmable SoC Packaging and Pinout Product Specification”, UG865, v1.3, November 2013.
Available: http://www.xilinx.com/support/documentation/user_guides/ug865-Zynq-7000-Pkg-Pinout.pdf
[32]Xilinx, Inc., “Zynq-7000 All Programmable SoCs Product Table”, XMP087, v1.7.
Available: http://www.xilinx.com/publications/prod_mktg/zynq7000/Zynq-7000-combined-product-table.pdf
[33]Xilinx, Inc., “Zynq-7000 Technical Reference Manual”, UG585, v1.7, February 2014.
Available: http://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf
[34]Y. Gosain and P. Palanichamy, “TrustZone Technology Support in Zynq-7000 All Programmable SoCs”, Xilinx White Paper, WP429, v1.0, May 2014.
Available: http://www.xilinx.com/support/documentation/white_papers/wp429-trustzone-zynq.pdf
- Zynq-7000 Technical Reference Manual
- AMBA, AXI, AHB
- architecture of the Zynq comprises two sections:
- Processing System (PS).
- Programmable Logic (PL).
- the alternative to a hard processor (Arm Cortex-A9 ver r3p0) is a ‘soft’ processor (Xilinx MicroBlaze).
- the presence of the ARM processor in the system does not preclude the use of soft processors.
- Zynq processing system encompasses:
- Application Processing Unit (APU: ARM processor + Processing Resources).
- peripheral interfaces.
- cache memory.
- memory interfaces.
- interconnect.
- clock generation circuitry.
- Application Processing Unit (APU)
- APU is primarily comprised of
- two ARM processing cores,
- NEON™ Media Processing Engine (MPE; 1 MPE per ARM Core)
- Floating Point Unit (FPU)
- Memory Management Unit (MMU)
- Level 1 cache memory (one L1 cache per ARM core, split into a 32KB instruction section and a 32KB data section).
- 512KB Level 2 cache memory (shared by all cores thru the SCU Bridge).
- 256KB On Chip Memory (OCM; shared by all cores thru the SCU Bridge)
- Snoop Control Unit (SCU) Bridge - bridges ARM Cores to shared L2 Cache and shared On Chip Memory, and also has some responsibility for interfacing with the PL, which is not depicted here.
- ARM Cortex-A9 ver r3p0 can operate at up to 1GHz
- MMU - translates between virtual and physical addresses (refer to Section 23.3).
- Snoop Control Unit (SCU) provides interfacing between the processors and Level 1 (Data) and Level 2 cache memories (ensuring cache coherency - managing the consistency of data across shared cache resources)
- SCU manages transactions between the PS and PL via the Accelerator Coherency Port (ACP);
- Timers and an interrupt controller located in the APU [8].
- Xilinx Software Development Kit (SDK) - provides support for ARM instructions and all necessary components to develop software for deployment on the ARM processor.
- SDK Compiler supports these instruction sets [5]:
- ARM (32-bit),
- Thumb® (16-bit; Thumb-2 mixes 16-bit and 32-bit instructions),
- Java bytecodes (8-bit, used for Java Virtual Machines).
- NEON Engine
- provides Single Instruction Multiple Data (SIMD) facilities to enable strategic acceleration of media and DSP type algorithms [9].
- NEON instructions are an extension to the standard ARM instruction set.
- NEON instructions can be used explicitly or by inference via C (coding must follow expected form).
- NEON engine can accept multiple sets of input vectors, upon which the same operation is performed simultaneously to provide a corresponding set of output vectors.
- NEON supports a variety of data types including (double precision is not supported [9]):
- signed and unsigned integers.
- single precision floating point
- half-precision floating point.
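The lane-wise behaviour described above can be pictured with a small Python model. This is only an illustration of the SIMD idea, not NEON code: the function name `simd_add_u8` and the choice of sixteen unsigned 8-bit lanes (one 128-bit register's worth) are assumptions made for the example.

```python
def simd_add_u8(a, b):
    """Add two vectors lane by lane, wrapping each lane at 8 bits.

    Models one SIMD add: every lane gets the same operation, and a carry
    out of one lane never spills into its neighbour.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    return [(x + y) & 0xFF for x, y in zip(a, b)]


# Sixteen 8-bit lanes, as would fit in a single 128-bit vector.
a = list(range(16))
b = [250] * 16
result = simd_add_u8(a, b)
```

In hardware all sixteen additions would be issued as a single instruction; the list comprehension here only mimics the result, not the timing.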
- Floating Point Unit (FPU):
- referred to as Floating Point Extensions, or VFP Extensions (Vector Floating Point)
- provides hardware acceleration of floating point operations
- supports single and double precision formats, with some additional support for half-precision and integer conversion.
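To make the difference between the three precisions concrete, the sketch below rounds the same value through the IEEE 754 half, single, and double formats [11] using Python's standard `struct` module. The helper name `round_trip` is hypothetical, and the code runs on the host machine; it says nothing about how the Zynq FPU is programmed.

```python
import struct


def round_trip(value, fmt):
    """Store value in the given IEEE 754 format and read it back as a Python float."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


half = round_trip(0.1, "<e")    # 16-bit half precision
single = round_trip(0.1, "<f")  # 32-bit single precision
double = round_trip(0.1, "<d")  # 64-bit double precision

# The rounding error grows as the format gets narrower.
half_error = abs(half - 0.1)
single_error = abs(single - 0.1)
```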
- Zynq-7000 specifically uses the r3p0 revision of the ARM Cortex-A9, which is based on the ARM v7-A architecture.
- PS and external interfaces (not PL; certain connections can be made via the Extended MIO to the resources shared with PL):
- achieved primarily via the Multiplexed Input/Output (MIO; provides 54 pins of flexible, on demand connectivity).
- Available IO (see Zynq-7000 Technical Reference Manual [33]):
- GPIO (e.g. buttons, switches, LEDs)
- SPI (x2)
- I2C (x2)
- CAN (x2)
- UART (x2)
- SD (x2)
- USB (x2)
- GigE (x2)
- Logic Fabric:
- Zynq PL architecture is based on the Artix®-7 and Kintex®-7 FPGA fabric.
- PL is composed of general purpose FPGA logic fabric (built from CLBs, each of which contains 2 slices) and Input/Output Blocks (IOBs) for interfacing.
- Configurable Logic Block (CLB) [20] - each CLB is positioned next to a switch matrix and contains two logic slices.
- Slice - composed of 4 Lookup Tables, 8 Flip-Flops, and other logic.
- Lookup Table (LUT) - can be used to implement a combinatorial logic function of up to six inputs, a small Read Only Memory (ROM), a small Random Access Memory (distributed RAM), or a shift register.
- Flip-flop (FF) - configurable as either a flip-flop or a latch.
- Switch Matrix - provides inter CLB connectivity.
- Carry logic - comprises a chain of routes and multiplexers to link slices in a vertical column.
- IO Blocks - provide interfacing between the PL logic resources, and the physical device ‘pads’ used to connect to external circuitry.
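The idea that a LUT implements any combinatorial function of up to six inputs can be sketched as a 64-entry truth table addressed by the inputs. The names `make_lut6` and `lut_read` are hypothetical, and the bit ordering (input 0 as the least significant address bit) is an assumption for the example rather than a statement about the 7-series hardware.

```python
def make_lut6(func):
    """Precompute a 64-entry truth table for a 6-input boolean function."""
    table = []
    for index in range(64):
        bits = [(index >> i) & 1 for i in range(6)]  # bits[0] is input 0
        table.append(func(*bits) & 1)
    return table


def lut_read(table, *inputs):
    """Look up the output: the six input bits together form the table address."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return table[index]


# Example function: 6-input odd parity (XOR of all inputs).
parity = make_lut6(lambda a, b, c, d, e, f: a ^ b ^ c ^ d ^ e ^ f)
```

Because the table is just stored bits, the same structure can also be read as a small 64 x 1 ROM, which is why the LUT doubles as memory.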
- Special Devices:
- Block RAM [23] - dense memory which
- can implement Random Access Memory (RAM), Read Only Memory (ROM), and First In First Out (FIFO) buffers, while also supporting Error Correction Coding (ECC)
- can store up to 36Kb of information.
- configured as one 36Kb RAM, or two independent 18Kb RAMs
- Block RAMs can normally be clocked at the highest clock frequency supported by the device.
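As a rough sizing aid, the 36Kb capacity gives a lower bound on how many Block RAMs a memory needs. The sketch below only divides total bits by block capacity; the function name `min_block_rams` is hypothetical, and the supported width/depth configurations, which can push the real count higher, are not modelled (see [23]).

```python
BRAM_BITS = 36 * 1024  # one Block RAM stores up to 36Kb


def min_block_rams(depth, width_bits):
    """Lower bound on the number of 36Kb Block RAMs for a depth x width memory."""
    if depth <= 0 or width_bits <= 0:
        raise ValueError("depth and width must be positive")
    total_bits = depth * width_bits
    return (total_bits + BRAM_BITS - 1) // BRAM_BITS  # integer ceiling division


# Example: an 8-bit-per-pixel 640x480 frame buffer.
frame_buffer_blocks = min_block_rams(depth=640 * 480, width_bits=8)
```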
- DSP48E1 [17] - high-speed arithmetic:
- used to implement high-speed arithmetic on signals with medium to long arithmetic wordlengths (LUTs are only useful for implementing arithmetic operators for short wordlengths).
- primarily comprise a pre-adder/subtractor, multiplier, and post-adder/subtractor with logic unit.
- capable of SIMD processing, implementing 2 or 4 shorter addition/subtraction/accumulation operations of 24 or 12 bits, respectively
- supports all of the fundamental boolean operations: bit-wise NOT, AND, OR, NAND, NOR, XOR, and XNOR.
- contains pattern detector (not shown in the diagram) used to detect overflow, perform rounding according to a selection of schemes, and other functions.
- arithmetic wordlengths as shown can be extended by combining multiple DSP48E1s.
- DSP48E1s can be clocked at the maximum clock frequency of the device.
- often used to implement Finite Impulse Response filters - each DSP48E1 can implement two filter taps, and entire filters can be formed by cascading DSP48E1s together.
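The multiply-accumulate pattern that makes these slices a natural fit for FIR filters can be sketched in software. This is a plain direct-form FIR model with a hypothetical function name `fir_filter`; it shows one multiply-accumulate per tap and does not model how taps are actually mapped onto or cascaded across DSP48E1 slices.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: each output is a sum of products over the most recent samples."""
    delay_line = [0] * len(coeffs)
    outputs = []
    for sample in samples:
        # Shift the new sample in and drop the oldest one.
        delay_line = [sample] + delay_line[:-1]
        acc = 0
        for tap, coeff in zip(delay_line, coeffs):
            acc += tap * coeff  # one multiply-accumulate per tap
        outputs.append(acc)
    return outputs


# Feeding an impulse returns the coefficients themselves.
impulse_response = fir_filter([1, 0, 0, 0, 0], [1, 2, 3])
```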
- General Purpose IO (IOBs a.k.a. SelectIO Resources)
- organised into banks of 50 IOBs
- each IOB contains one pad, which provides the physical connection to the outside world for a single input or output signal.
- I/O banks are categorized as (see [24]):
- High Performance (HP) - limited to voltages of 1.8V; typically used for high-speed interfaces to memory and other chips.
- High Range (HR) - permit voltages of up to 3.3V and cater for a wider variety of IO standards.
- Both single-ended (requiring 1 IOB) and differential signalling (requiring 2 IOB) are supported.
- Each IOB includes an IOSERDES resource for programmable conversion between parallel and serial data formats (serialisation and deserialisation), of between 2 and 8 bits.
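Serialisation and deserialisation can be illustrated with a pair of small functions. The names `serialise` and `deserialise` are hypothetical, and sending the least significant bit first is an assumption chosen for the example; the actual bit order and clocking are covered in [24].

```python
def serialise(word, width=8):
    """Turn a parallel word into a list of bits, least significant bit first."""
    if not 2 <= width <= 8:
        raise ValueError("width must be between 2 and 8 bits")
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")
    return [(word >> i) & 1 for i in range(width)]


def deserialise(bits):
    """Rebuild the parallel word from bits received least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


bits = serialise(0xA5, width=8)
```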
- GTX Transceivers:
- hardened, high-speed communications interface blocks.
- capable of supporting PCI Express (which requires a second hard IP block, the PCI Express block [22], and a BRAM), Serial RapidIO, SCSI and SATA.
- implemented as quads, i.e. groups of four lanes (each lane contains a PLL, a transmitter, and a receiver).
- rates of up to 12.5Gbps.
- used to create connections to networking equipment, hard disks, and further FPGA/Zynq devices.
- XADC block:
- features two separate 12-bit ADCs
- up to 1Msps sampling rate for external analog inputs.
- Control of the XADC is achieved using the PS-XADC interface block located within the PS - PS-XADC is programmable from software executing on the APU [18].
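A 12-bit conversion result maps to a voltage by simple scaling. The sketch below assumes a unipolar input with a 1.0V full-scale range and a plain right-aligned 12-bit code; both are assumptions for illustration, and the function name `xadc_code_to_volts` is hypothetical. The real transfer functions and register layout are described in [18].

```python
def xadc_code_to_volts(code, full_scale_volts=1.0, bits=12):
    """Scale an unsigned ADC code to volts (assumed unipolar, 1.0V full scale)."""
    if not 0 <= code < (1 << bits):
        raise ValueError("code does not fit in the ADC resolution")
    return code * full_scale_volts / (1 << bits)


# Mid-scale code for a 12-bit converter.
mid_scale = xadc_code_to_volts(2048)
```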
- Clocks:
- PL receives four separate clock inputs from the PS.
- PL has facilities to generate and distribute its own clock signals independently of the PS (FIXME: does this mean the clock can be sourced from on-chip oscillators instead of external oscillators?).
- JTAG ports:
- JTAG ports are provided in the PL section to facilitate configuration and debugging of the PL [33].
- Processing System - Programmable Logic Interfaces:
- PS uses AXI interconnects/interfaces to form the bridge between the PS and PL (EMIO can also be used).
- AXI4 (Advanced eXtensible Interface) standard; AXI was introduced with the ARM AMBA® 3.0 open standard, and AXI4 belongs to AMBA 4:
- AXI4 [2] - memory-mapped interface: the address of the first data word is supplied, followed by a burst of up to 256 data words (beats); the slave calculates the addresses of the data words that follow the first.
- AXI4-Lite [2] - memory-mapped interface: an address and a single data word are transferred; supports only one data transfer per transaction (no bursts).
- AXI4-Stream [1] - streaming non-memory-mapped interface: high-speed streaming data, supporting burst transfers of unrestricted size (no address mechanism).
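The address calculation a slave performs during an AXI4 burst can be sketched as follows. The example assumes an incrementing burst with an aligned start address and 4 bytes per beat; the function name `burst_addresses` and the example base address are hypothetical, and other burst types and protocol rules from [2] are not modelled.

```python
def burst_addresses(start_address, beats, bytes_per_beat=4):
    """Address of each beat in an incrementing burst.

    Only the first address travels on the address channel; the rest are
    derived from it, one step of bytes_per_beat per beat.
    """
    if not 1 <= beats <= 256:
        raise ValueError("an AXI4 burst carries between 1 and 256 beats")
    return [start_address + n * bytes_per_beat for n in range(beats)]


# Four 32-bit beats starting at an illustrative base address.
addresses = burst_addresses(0x4000_0000, beats=4)
```

An AXI4-Lite transfer corresponds to the degenerate case `beats=1`, and AXI4-Stream has no counterpart here because it carries no addresses at all.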
- AXI Interconnects and Interfaces [33]:
- Internally to the Processing System, AXI interfaces are used within the ARM APU, connecting the ARM cores, cache memory, and OCM (256KB) via the SCU.
- Interconnect - a switch which manages and directs traffic between attached AXI interfaces; the connections between these interconnects are also formed using AXI interfaces.
- Interfaces - a point-to-point connection for passing data, addresses, and hand-shaking signals between master and slave clients within the system.
- The PS-PL AXI Interface flavors:
- General Purpose AXI - 32-bit data bus, which is suitable for low and medium rate communications between the PL and PS. The interface is direct and does not include buffering.
- Accelerator Coherency Port - bus width of 64 bits; single asynchronous connection between the PL and the SCU within the APU. This port is used to achieve coherency between the APU caches and elements within the PL
- High Performance Ports - data width is either 32 or 64 bits; include FIFO buffers to accommodate “bursty” read and write behaviour, high rate communications between the PL and memory elements in the PS.
- PS and PL communicate via 9 AXI interfaces (each composed of multiple channels)
- M_AXI_GP0 - General Purpose (AXI_GP); PS is the master.
- M_AXI_GP1 - General Purpose (AXI_GP); PS is the master.
- S_AXI_GP0 - General Purpose (AXI_GP); PL is the master.
- S_AXI_GP1 - General Purpose (AXI_GP); PL is the master.
- S_AXI_ACP - Accelerator Coherency Port (ACP), cache coherent transaction; PL is the master.
- S_AXI_HP0 - High Performance Port (AXI_HP) with read/write FIFOs; PL is the master.
- S_AXI_HP1 - High Performance Port (AXI_HP) with read/write FIFOs; PL is the master.
- S_AXI_HP2 - High Performance Port (AXI_HP) with read/write FIFOs; PL is the master.
- S_AXI_HP3 - High Performance Port (AXI_HP) with read/write FIFOs; PL is the master.
- Note: AXI_HP interfaces are sometimes referred to as AXI FIFO Interfaces, or AFIs.
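The nine interfaces can be captured in a small lookup table, which makes the naming convention visible: the `M_` or `S_` prefix gives the role of the PS, so on every `S_` port the PL is the master. The structure and the helper name `ports_mastered_by` are hypothetical; the data widths are the ones given in the notes above.

```python
# (port name, category, master side, data widths in bits)
PS_PL_AXI_PORTS = [
    ("M_AXI_GP0", "GP", "PS", (32,)),
    ("M_AXI_GP1", "GP", "PS", (32,)),
    ("S_AXI_GP0", "GP", "PL", (32,)),
    ("S_AXI_GP1", "GP", "PL", (32,)),
    ("S_AXI_ACP", "ACP", "PL", (64,)),
    ("S_AXI_HP0", "HP", "PL", (32, 64)),
    ("S_AXI_HP1", "HP", "PL", (32, 64)),
    ("S_AXI_HP2", "HP", "PL", (32, 64)),
    ("S_AXI_HP3", "HP", "PL", (32, 64)),
]


def ports_mastered_by(side):
    """Return the names of the ports whose master is the given side ("PS" or "PL")."""
    return [name for name, _category, master, _widths in PS_PL_AXI_PORTS
            if master == side]


ps_masters = ports_mastered_by("PS")
pl_masters = ports_mastered_by("PL")
```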
- EMIO Interfaces
- Allows connections from the PS to be routed through the PL to external interface.
- not all MIO interfaces are supported in EMIO
- the connections are arranged in two 32-bit banks.
- Interfaces routed through the EMIO can be used to:
- directly connect the PS to the desired external pins of the PL (requires entries in the constraints file); EMIO can thus provide an additional 64 inputs, and 64 outputs with corresponding output enables.
- allow the PS to interface with a peripheral block in the PL.
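Since the 64 EMIO signals are arranged in two 32-bit banks, a signal index splits into a bank number and a bit position. The sketch below numbers the banks 0 and 1 relative to the EMIO only; that numbering and the function name `emio_bank_and_bit` are assumptions for illustration, not the GPIO controller's register map (see [33]).

```python
EMIO_SIGNALS = 64
BANK_WIDTH = 32


def emio_bank_and_bit(signal):
    """Split an EMIO signal index (0..63) into (bank, bit within bank)."""
    if not 0 <= signal < EMIO_SIGNALS:
        raise ValueError("EMIO signal index must be in the range 0..63")
    return divmod(signal, BANK_WIDTH)


first_of_second_bank = emio_bank_and_bit(32)
```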
- Other signals that cross the PS-PL boundary:
- watchdog timers
- reset signals
- interrupts
- DMA interfacing signals.
- Security - FIXME to be continued
ug873-zynq-ctt.pdf
ug1165-zynq-embedded-design-tutorial.pdf
ug898-vivado-embedded-design.pdf
ug940-vivado-tutorial-embedded-design.pdf