Using BRAMs - red-bote/VHDL_Demos GitHub Wiki
Xilinx Block RAMs
This page discusses the use of BRAM resources on Xilinx FPGAs with VHDL. It is based mostly on information from the following Xilinx manuals:
- Vivado Design Suite User Guide Synthesis UG901 (v2022.1) June 6, 2022
- XST User Guide for Virtex-6, Spartan-6, and 7 Series Devices UG687 (v 14.5) March 20, 2013
- XST User Guide for Virtex-4, Virtex-5, Spartan-3 UG627 (v 14.5) March 20, 2013
You can clone a GitHub repo to obtain the code files for the Xilinx XST examples.
The archive of the code samples for UG901 is a registered download; there is no direct link.
Block Memory on the FPGA
A memory is typically described in VHDL as a table (an array), although a ROM can just as well be described with a case statement. A table is indexed by the memory address, which must be an integer. Therefore, if the address is input to the memory as a std_logic_vector (SLV), it must be converted to an integer.
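The table and case forms described above can be sketched as two architectures of one small ROM. This is a minimal illustration, not one of the Xilinx templates: the entity name rom4x8, its contents, and the architecture names are hypothetical. The read is asynchronous, so a memory like this would be expected to map to LUTs rather than a BRAM.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-word by 8-bit ROM used only to compare the two styles
entity rom4x8 is
  port (
    addr : in  std_logic_vector(1 downto 0);
    data : out std_logic_vector(7 downto 0)
  );
end rom4x8;

-- Table form: the SLV address is converted to an integer array index
architecture table_arch of rom4x8 is
  type rom_type is array (0 to 3) of std_logic_vector(7 downto 0);
  constant ROM : rom_type := (X"11", X"22", X"33", X"44");
begin
  data <= ROM(to_integer(unsigned(addr)));
end table_arch;

-- Case form: the SLV address is matched directly, no conversion needed
architecture case_arch of rom4x8 is
begin
  process (addr)
  begin
    case addr is
      when "00"   => data <= X"11";
      when "01"   => data <= X"22";
      when "10"   => data <= X"33";
      when others => data <= X"44";  -- "11" and any metavalue
    end case;
  end process;
end case_arch;
```

The sketch uses to_integer(unsigned(...)) from ieee.numeric_std; the Xilinx listings on this page use conv_integer from the non-standard std_logic_unsigned package for the same purpose.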
Bulk memory storage is important for many applications. Blocks of memory can be constructed on the FPGA from the basic building elements of flip-flops and lookup-tables. An FPGA device is likely to offer some amount of dedicated bulk memory, typically referred to as block-RAM or BRAM storage.
HDL memory descriptions aren't completely analogous to discrete "physical" component RAMs and ROMs.
Asynchronous or Synchronous Memory
In order to synthesize a BRAM, the HDL must describe a synchronous memory, where either the data or the address, or both, are synchronized to the clock. One important consideration for the application is that there may need to be some compensation in the design to account for the clock-cycle consumed to read data from the block RAM. This is encountered in video image generation where the pixel data coming from a video-RAM has to be synchronized exactly to the raster scan timing in order for the image to display properly on the screen at the intended location.
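The clock-cycle compensation mentioned above can be as simple as delaying the companion signals by one register stage. The following is a minimal sketch, assuming a video pipeline in which sync strobes must line up with pixel data that arrives one clock late from a BRAM. The entity and port names (sync_align, hsync_in, and so on) are hypothetical and do not come from the Xilinx examples.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: delays the sync strobes by one clock so that
-- they stay aligned with data read from a synchronous (block) RAM.
entity sync_align is
  port (
    clk       : in  std_logic;
    hsync_in  : in  std_logic;
    vsync_in  : in  std_logic;
    hsync_out : out std_logic;
    vsync_out : out std_logic
  );
end sync_align;

architecture rtl of sync_align is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- One register stage matches the one-cycle BRAM read latency
      hsync_out <= hsync_in;
      vsync_out <= vsync_in;
    end if;
  end process;
end rtl;
```

If the memory path has more than one register stage (for example, an additional output register), the same number of stages would be needed here.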
If the HDL does not describe a synchronous memory, a distributed RAM will be synthesized on the FPGA. There may be an advantage to using an asynchronous memory, but the size of the memory will be limited compared to that which is typically available from block-RAM resources.
Inferred RAM Primitives
Infer as used in this context pertains to the ability of the synthesis tool to take a "generic" HDL description (no vendor-specific attributes) and realize the implementation that is best optimized for the underlying FPGA technology. The code examples discussed on this page are intended to demonstrate the Xilinx templates for writing vendor-neutral HDL code from which the tool is most likely to infer the intended implementation without explicitly referencing any underlying vendor IP.
ROMs Using Block RAM Resources
- rams_21c is a ROM with registered address
- rams_21a is a ROM with registered output (template 1)
- roms_1 is nearly identical to rams_21a; it was imported from ug901-vivado-synthesis-examples and infers a BRAM.
Only the circuit using roms_1 inferred a BRAM in this test. roms_1 has the ROM_STYLE attribute (discussed in Chapter 4: HDL Coding Techniques of Vivado User Guide Synthesis UG901), which rams_21a does not have:
attribute rom_style : string;
attribute rom_style of ROM : signal is "block";
Complete roms_1.vhdl listing (cleaned up on VHDL Beautifier, Formatter):
-- ROM Inference on array
-- File: roms_1.vhd
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity roms_1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;
    addr : in  std_logic_vector(5 downto 0);
    data : out std_logic_vector(19 downto 0)
  );
end roms_1;

architecture behavioral of roms_1 is
  type rom_type is array (63 downto 0) of std_logic_vector(19 downto 0);
  signal ROM : rom_type := (
    X"0200A", X"00300", X"08101", X"04000", X"08601", X"0233A",
    X"00300", X"08602", X"02310", X"0203B", X"08300", X"04002",
    X"08201", X"00500", X"04001", X"02500", X"00340", X"00241",
    X"04002", X"08300", X"08201", X"00500", X"08101", X"00602",
    X"04003", X"0241E", X"00301", X"00102", X"02122", X"02021",
    X"00301", X"00102", X"02222", X"04001", X"00342", X"0232B",
    X"00900", X"00302", X"00102", X"04002", X"00900", X"08201",
    X"02023", X"00303", X"02433", X"00301", X"04004", X"00301",
    X"00102", X"02137", X"02036", X"00301", X"00102", X"02237",
    X"04004", X"00304", X"04040", X"02500", X"02500",
    X"02500", X"0030D", X"02341", X"08201", X"0400D"
  );
  attribute rom_style : string;
  attribute rom_style of ROM : signal is "BLOCK";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if (en = '1') then
        data <= ROM(conv_integer(addr));
      end if;
    end if;
  end process;
end behavioral;
Run the synthesis (or implementation) tool and check the results under the Design Runs tab, or run the Utilization report for more detailed information (screenshot: images/rams/SQCrpV.png).
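Before synthesis, the one-cycle read latency of roms_1 can be checked in simulation. The following is a minimal testbench sketch, assuming roms_1 from the listing above is compiled into library work; the testbench name roms_1_tb and the 10 ns clock period are arbitrary choices. Because rom_type is declared 63 downto 0 and the initializer is positional, the first listed word lands at address 63 and the last at address 0.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity roms_1_tb is
end roms_1_tb;

architecture sim of roms_1_tb is
  signal clk  : std_logic := '0';
  signal en   : std_logic := '1';
  signal addr : std_logic_vector(5 downto 0) := (others => '0');
  signal data : std_logic_vector(19 downto 0);
  signal done : boolean := false;
begin
  -- 10 ns clock that stops toggling once the stimulus is finished
  clk <= not clk after 5 ns when not done else '0';

  dut : entity work.roms_1
    port map (clk => clk, en => en, addr => addr, data => data);

  stimulus : process
  begin
    addr <= "000000";
    wait until rising_edge(clk);   -- address is sampled on this edge
    wait until falling_edge(clk);  -- registered data is stable by now
    assert data = X"0400D"
      report "address 0 should hold the last listed word" severity error;

    addr <= "111111";
    wait until rising_edge(clk);
    wait until falling_edge(clk);
    assert data = X"0200A"
      report "address 63 should hold the first listed word" severity error;

    done <= true;
    wait;
  end process;
end sim;
```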
Multiple Architectures in VHDL Module
The ram_comp entity shows an example of multiple alternate architectures, any one of which can be selected when the entity is instantiated.
entity ram_comp is
  Port ( clk  : in  STD_LOGIC;
         addr : in  STD_LOGIC_VECTOR(5 downto 0);
         data : out STD_LOGIC_VECTOR(19 downto 0));
end ram_comp;

architecture arch_rams_21c of ram_comp is -- arch_rams_21c
  signal ram_addr : std_logic_vector(5 downto 0);
  signal ram_data : std_logic_vector(19 downto 0);
begin
  ram_addr <= addr;
  u_rom : entity work.rams_21c
    port map (
      clk  => clk,
      en   => '1',
      addr => ram_addr,
      data => ram_data
    );
  data <= ram_data;
end arch_rams_21c;

architecture arch_roms_1 of ram_comp is -- arch_roms_1
  signal ram_addr : std_logic_vector(5 downto 0);
  signal ram_data : std_logic_vector(19 downto 0);
begin
  ram_addr <= addr;
  u_rom : entity work.roms_1
    port map (
      clk  => clk,
      en   => '1',
      addr => ram_addr,
      data => ram_data
    );
  data <= ram_data;
end arch_roms_1;
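The following code illustrates entity instantiation with the architecture name designated explicitly. The two instantiations are alternatives: both use the label u_ram, so only one of them can appear in a given architecture.
-- Alternative 1: select the roms_1 based architecture
u_ram : entity work.ram_comp(arch_roms_1)
  Port map(
    clk  => clk,
    addr => ram_addr,
    data => ram_data
  );
-- Alternative 2: select the rams_21c based architecture
u_ram : entity work.ram_comp(arch_rams_21c)
  Port map(
    clk  => clk,
    addr => ram_addr,
    data => ram_data
  );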
Inferring BRAM Examples
Single Port and Dual Port RAMs
Single-port RAMs are introduced first. Simple dual-port RAMs, where one port is read-write and the other port is read-only (typical for a video-RAM application), will also be considered.
The table below reproduced from Xilinx XST User Guide UG687 describes minimum sizes of BRAMs in the Xilinx ISE tool suite that preceded Vivado.
BRAM Synchronization Modes
BRAMs must be described in VHDL in such a way that the address, or the data out, or both, are registered. The typical RAM synchronizing topologies in Xilinx are:
- write first (aka read-through)
- read first
- no change
The following top-level VHDL code was used to simulate BRAM synchronization modes in Vivado. The accumulator implements an unsigned up-counter with an increment of 1 and has a synchronous reset, which is preferred for use with BRAMs. The counter provides the data input and the address for the RAM, as well as an arbitrary periodic write-enable signal to test writing to the RAM.
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity rtl_top is
  Generic (constant COUNTER_BITS : integer := 16);
  Port ( clk   : in  STD_LOGIC;
         reset : in  STD_LOGIC;
         led   : out STD_LOGIC_VECTOR (15 downto 0));
end rtl_top;

architecture test_arch of rtl_top is
  signal count    : std_logic_vector (COUNTER_BITS-1 downto 0);
  signal ram_wre  : std_logic;
  signal ram_addr : std_logic_vector (5 downto 0);
  signal ram_din  : std_logic_vector (COUNTER_BITS-1 downto 0);
  signal ram_dout : std_logic_vector (COUNTER_BITS-1 downto 0);
begin
  u_accum : entity work.accumulators_2
    generic map (
      WIDTH => COUNTER_BITS)
    port map (
      clk => clk,
      rst => reset,
      D   => std_logic_vector(to_unsigned(1, COUNTER_BITS)),
      Q   => count
    );

  ram_din  <= count;
  ram_addr <= count(8 downto 3); -- 6-bit address is held for 8 clock cycles
  ram_wre  <= (not count(9))     -- allows all 64 words of RAM to be initialized at startup
              or (count(3) and not count(2) and count(1) and count(0))
              or (count(2) and not count(1));

  -- modify section to instantiate the RAM to be tested
  u_rams_02a : entity work.rams_02a
    port map (
      clk  => clk,
      we   => ram_wre,
      en   => '1',
      addr => ram_addr,
      di   => ram_din,
      do   => ram_dout
    );

  led <= ram_dout;
end test_arch;
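The listing above instantiates accumulators_2, which is not shown on this page. The following is a minimal stand-in sketch with a matching generic and port list and a synchronous reset. It is an assumption written to fit the port map in rtl_top, not the Xilinx example file of the same name; the default WIDTH and the internal signal name acc are arbitrary.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Stand-in accumulator (assumed interface, inferred from rtl_top)
entity accumulators_2 is
  generic (WIDTH : integer := 8);
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    D   : in  std_logic_vector(WIDTH-1 downto 0);
    Q   : out std_logic_vector(WIDTH-1 downto 0)
  );
end accumulators_2;

architecture rtl of accumulators_2 is
  signal acc : unsigned(WIDTH-1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        acc <= (others => '0');    -- synchronous reset
      else
        acc <= acc + unsigned(D);  -- wraps around on overflow
      end if;
    end if;
  end process;

  Q <= std_logic_vector(acc);
end rtl;
```

With D tied to 1, as in rtl_top, this behaves as a free-running up-counter.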
RAM with Asynchronous Read
Single-Port RAM with Asynchronous Read (Distributed RAM)
The write cycle is synchronized to the clock, but the read is a combinational (unclocked) assignment from the addressed location to the output port. New data is therefore available on the output as soon as it is written.
On a read cycle, data from the addressed location is available as soon as the RAM address changes. The data output is not in the clocked process (although in this test it still changes with the clock edge, because the RAM address is driven by a counter derived from the clock).
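The behavior described above can be sketched as follows. This is a minimal illustration in the spirit of the Xilinx asynchronous-read template, sized to match the 6-bit address and 16-bit data used by the test top level; the entity name ram_async_read is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port RAM with clocked write and unclocked read
entity ram_async_read is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(5 downto 0);
    di   : in  std_logic_vector(15 downto 0);
    do   : out std_logic_vector(15 downto 0)
  );
end ram_async_read;

architecture rtl of ram_async_read is
  type ram_type is array (0 to 63) of std_logic_vector(15 downto 0);
  signal RAM : ram_type := (others => (others => '0'));
begin
  -- Write is synchronous
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        RAM(to_integer(unsigned(addr))) <= di;
      end if;
    end if;
  end process;

  -- Read is outside the clocked process, so it follows addr directly
  do <= RAM(to_integer(unsigned(addr)));
end rtl;
```

Because neither the address nor the output is registered, a description like this is expected to map to distributed RAM rather than a BRAM.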
BRAM Write-First Mode
Single-Port BRAM Write-First Mode (recommended template)
In a write cycle, new data is stored to the addressed location and simultaneously copied to an output register in the same clock period.
Single-Port BRAM Write-First Mode (registered read address template) also infers a BRAM in write-first mode.
Single-Port RAM with Synchronous Read (Read Through) is very similar. A description of the terminology "Synchronous Read (Read Through)" is found in the older documentation, XST User Guide for Virtex-4, Virtex-5, Spartan-3 UG627: "A true synchronous read is the synchronization mechanism available in Virtex block RAMs, where the read address is registered on the RAM clock edge."
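The two write-first descriptions mentioned above can be sketched as two architectures of one entity. This is a minimal illustration modeled on the general shape of the Xilinx templates, not a copy of them; the entity name bram_write_first and the architecture names are hypothetical. The port list mirrors the one that rtl_top maps for rams_02a.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bram_write_first is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    en   : in  std_logic;
    addr : in  std_logic_vector(5 downto 0);
    di   : in  std_logic_vector(15 downto 0);
    do   : out std_logic_vector(15 downto 0)
  );
end bram_write_first;

-- Registered output: on a write, di goes to both the RAM and the output
architecture registered_output of bram_write_first is
  type ram_type is array (0 to 63) of std_logic_vector(15 downto 0);
  signal RAM : ram_type := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          RAM(to_integer(unsigned(addr))) <= di;
          do <= di;
        else
          do <= RAM(to_integer(unsigned(addr)));
        end if;
      end if;
    end if;
  end process;
end registered_output;

-- Registered read address: the output follows the RAM contents at the
-- latched address, so freshly written data is visible immediately
architecture registered_address of bram_write_first is
  type ram_type is array (0 to 63) of std_logic_vector(15 downto 0);
  signal RAM       : ram_type := (others => (others => '0'));
  signal read_addr : std_logic_vector(5 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          RAM(to_integer(unsigned(addr))) <= di;
        end if;
        read_addr <= addr;
      end if;
    end if;
  end process;

  do <= RAM(to_integer(unsigned(read_addr)));
end registered_address;
```

Either architecture could be tried in rtl_top by replacing the rams_02a instantiation with, for example, entity work.bram_write_first(registered_output).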
BRAM Read-First Mode
Single-Port BRAM Read-First Mode
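In a write cycle, the data previously stored at the addressed location is "read first" and appears on the output; the newly written data appears only on a following clock period in which that address is read again.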
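Read-first behavior can be sketched as below. This is a minimal illustration following the general shape of the Xilinx read-first template; the entity name bram_read_first is hypothetical, and the sizes match the test top level.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bram_read_first is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    en   : in  std_logic;
    addr : in  std_logic_vector(5 downto 0);
    di   : in  std_logic_vector(15 downto 0);
    do   : out std_logic_vector(15 downto 0)
  );
end bram_read_first;

architecture rtl of bram_read_first is
  type ram_type is array (0 to 63) of std_logic_vector(15 downto 0);
  signal RAM : ram_type := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          RAM(to_integer(unsigned(addr))) <= di;
        end if;
        -- The signal RAM still holds the old word at this point in the
        -- process, so the output register captures the previous contents
        do <= RAM(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end rtl;
```

The read-before-write result comes from VHDL signal semantics: the write to RAM is scheduled and only takes effect after the process suspends, so the read in the same clock edge sees the old value.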
BRAM No-Change Mode
Single-Port BRAM No-Change Mode
The data output register is not updated during a write cycle, so the new data does not appear on the output until a subsequent read occurs at that address.
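No-change behavior can be sketched as below. This is a minimal illustration following the general shape of the Xilinx no-change template; the entity name bram_no_change is hypothetical, and the sizes match the test top level.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bram_no_change is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    en   : in  std_logic;
    addr : in  std_logic_vector(5 downto 0);
    di   : in  std_logic_vector(15 downto 0);
    do   : out std_logic_vector(15 downto 0)
  );
end bram_no_change;

architecture rtl of bram_no_change is
  type ram_type is array (0 to 63) of std_logic_vector(15 downto 0);
  signal RAM : ram_type := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          -- Write cycle: the output register keeps its previous value
          RAM(to_integer(unsigned(addr))) <= di;
        else
          -- Read cycle: the output register is updated
          do <= RAM(to_integer(unsigned(addr)));
        end if;
      end if;
    end if;
  end process;
end rtl;
```

The only difference from the write-first sketch style is that the write branch does not assign do, which is what holds the output steady during writes.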