FMC for 8080 LCD interface (STM32 Nucleo F767ZI ILI9341 240x320 TFT) - NickNagy/Cortet GitHub Wiki

Introduction

Setting up my STM32 board's FMC interface gave me quite a bit of trouble, and in my search for support on proper setup it became apparent that the interface is not widely understood, nor is it perfectly documented. Hopefully, this page will serve as a cohesive tutorial for setting up FMC for an 8080 LCD interface, using information I gathered from a wide array of sources.

Note: I will only be addressing how to write to the LCD (ie, how to display images). I will be ignoring how to read from the LCD (ie, how to use the touch-screen), as for my shield the touchscreen cannot be used in parallel with the FMC interface.

This tutorial assumes that you are new to, or unfamiliar with STM32CubeMX, HAL libraries, the LCD 8080 interface, and/or the FMC interface. It does however assume that you are familiar with computer architecture, registers, timing, etc.

Two faster tutorials (if all goes smoothly, that is)

If you aren't terribly concerned with the nuances of the 8080 interface or the FMC and want some quick and dirty how-to's, I strongly recommend these two sources:

Both are concise and easy to follow for the most part, and I reference both of them here as I myself followed them for my own setup initially. If you're lucky they will help you set up your interface and have it running in about 20 minutes. However, if your experience is like mine, you might end up with a buggy screen and will have to dig a bit deeper to understand what is happening and how to fix it.

Because my tutorial is based off of the above links, there are a lot of parallels in the set-up stages. My personal suggestion:

Watch the video.
If you run into troubles, return here and check out the sections on timing, warnings, and possible bugs & solutions.

The TFT, and the 8080 interface

The TFT

The exact TFT shield I own is the Elegoo 2.8" TFT Touchscreen, designed for use with Arduino Uno and ATMega boards. The Elegoo distribution page is not terribly well-documented, but this page gives some more detail, in particular that the driver IC is the ILI9341 (datasheet here).

Shield as pictured from LCDWiki

The pins

The shield has 8 data pins [D7:D0], a chip select pin CS, a register select pin RS, a read pin RD, a write pin WR, and a reset pin RST. For our purposes, these are the only pins we're concerned with (not including the voltage and GND pins). Note: CS, WR, RD, and RST are all active-low, i.e. their idle state is at positive voltage and asserting them means pulling them to GND. RS is a select line rather than a strobe: its level tells the LCD whether the byte on the data pins is a command or data.

Writing commands and writing data

The way the LCD works is that it reads 8b commands and 8b data from the host board. The "commands" are actually registers, or addresses, and the data subsequently written by the host is interpreted depending on the register. I won't be going into much detail here about how to use the individual ILI9341 registers. The important thing to note is that there are no address lines on the shield, only data lines.

Addresses and data are both set on the data pins [D7:D0]. How does the LCD know what is a command vs what is data? That is what the register select RS pin is for. When RS is low, whatever is written to [D7:D0] gets interpreted as a command/address. Conversely, when RS is high, [D7:D0] gets interpreted as data. This matches the FMC addressing described later on this page: commands go to the address where the RS address line is 0, and data goes to the address where it is 1.

Let's look at an example: treating the LCD as rows and columns of pixels (which it is), to change a pixel or region of pixels will require you to set the appropriate range of columns and rows in the COLUMN_ADDRESS_SET (0x2A) and PAGE_ADDRESS_SET (0x2B) registers. The COLUMN_ADDRESS_SET register takes a 16b value SC[15:0] (start column) and a 16b value EC[15:0] (end column), each split up into two 8b writes. So to set the column address space, this is what your program would need to do:

With RS low (command), write 0x2A (0b00101010) to [D7:D0]. This is the command.
Set RS high (data) and write SC[15:8] to [D7:D0].
Keep RS high, and write SC[7:0] to [D7:D0].
Keep RS high, and write EC[15:8] to [D7:D0].
Keep RS high, and write EC[7:0] to [D7:D0].
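The column-address sequence can be sketched as host-side C that records each bus transfer instead of driving real pins. This is only an illustration: the types `bus_write_t` and `bus_log_t` and the functions `bus_write` and `lcd_set_column_range` are hypothetical names, and the sketch assumes the ILI9341 convention that RS low selects a command and RS high selects data (consistent with the FMC command/data addresses used later on this page).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One transfer on the 8-bit bus: the RS level and the byte on [D7:D0]. */
typedef struct {
    uint8_t rs;    /* 0 = command, 1 = data (assumed ILI9341 convention) */
    uint8_t value; /* byte driven on [D7:D0] */
} bus_write_t;

/* Hypothetical recorder standing in for the real bus, so the sequence
 * can be inspected on a host machine. */
typedef struct {
    bus_write_t log[16];
    size_t count;
} bus_log_t;

void bus_write(bus_log_t *bus, uint8_t rs, uint8_t value)
{
    assert(bus->count < sizeof bus->log / sizeof bus->log[0]);
    bus->log[bus->count].rs = rs;
    bus->log[bus->count].value = value;
    bus->count++;
}

/* COLUMN_ADDRESS_SET (0x2A): one command byte, then SC and EC,
 * each sent high byte first. */
void lcd_set_column_range(bus_log_t *bus, uint16_t sc, uint16_t ec)
{
    bus_write(bus, 0, 0x2A);                  /* command   */
    bus_write(bus, 1, (uint8_t)(sc >> 8));    /* SC[15:8]  */
    bus_write(bus, 1, (uint8_t)(sc & 0xFF));  /* SC[7:0]   */
    bus_write(bus, 1, (uint8_t)(ec >> 8));    /* EC[15:8]  */
    bus_write(bus, 1, (uint8_t)(ec & 0xFF));  /* EC[7:0]   */
}
```

For the full 240-pixel width, a call such as `lcd_set_column_range(&bus, 0, 239)` would record the command byte 0x2A followed by the four data bytes 0x00, 0x00, 0x00, 0xEF.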

Valid data (CS, WR, RD, etc)

The example above does not address how and when the values across [D7:D0] are considered valid. As mentioned, WR and CS are active-low -- data is actually sampled at the rising edge of WR (when WR goes from LOW state to HIGH state). CS needs to be set and remain LOW, RD needs to remain HIGH, and RS needs to be set and remain in its appropriate state, all for the duration of the WR half-cycle.
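To make the sampling rule concrete, here is a small host-side model of the panel end of the bus. It is a sketch under stated assumptions, not how the ILI9341 is implemented: `lcd_pins_t`, `lcd_pins_init`, `lcd_set_wr` and `lcd_write_cycle` are hypothetical names, and the model only captures the one rule described above (a byte is latched on a rising edge of WR while CS is low and RD is high).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pin-level model of the panel side of the bus. */
typedef struct {
    uint8_t cs, rs, rd, wr;  /* current pin levels, 1 = high        */
    uint8_t data;            /* value currently driven on [D7:D0]   */
    uint8_t latched;         /* last byte the panel sampled         */
    uint8_t latched_rs;      /* RS level at the moment of sampling  */
    unsigned latch_count;    /* number of bytes sampled so far      */
} lcd_pins_t;

/* Idle state: all control lines high, nothing latched yet. */
void lcd_pins_init(lcd_pins_t *p)
{
    p->cs = 1; p->rs = 1; p->rd = 1; p->wr = 1;
    p->data = 0; p->latched = 0; p->latched_rs = 0;
    p->latch_count = 0;
}

/* Drive WR to a new level. The panel samples [D7:D0] only on a rising
 * edge of WR, and only while CS is low and RD is high. */
void lcd_set_wr(lcd_pins_t *p, uint8_t level)
{
    assert(level <= 1);
    int rising = (p->wr == 0 && level == 1);
    p->wr = level;
    if (rising && p->cs == 0 && p->rd == 1) {
        p->latched = p->data;
        p->latched_rs = p->rs;
        p->latch_count++;
    }
}

/* One complete write cycle: select, present RS and data, pulse WR low. */
void lcd_write_cycle(lcd_pins_t *p, uint8_t rs, uint8_t value)
{
    p->cs = 0;
    p->rs = rs;
    p->data = value;
    lcd_set_wr(p, 0);
    lcd_set_wr(p, 1);  /* rising edge: the panel samples here */
    p->cs = 1;
}
```

In this model, pulsing WR while CS is high changes nothing, which mirrors why CS has to stay low for the whole WR half-cycle.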

(TODO: images)

There are timing constraints for writing data to the IC -- there is a limit to how quickly we can access registers or write the next set of data. This will be addressed later, when configuring the FMC. For now, here is the complete list of Min/Max timings for 8080, as detailed in the ILI9341 datasheet:

(TODO: image)

The FMC (flexible memory controller) interface

The STM32 FMC, in its most general sense, is an interface/controller for fast and easy writing to and reading from external memories. It can be configured to control synchronous or asynchronous NOR Flash, SRAM, NAND and SDRAM memories, and has multiple banks for theoretically controlling multiple external memories (though on a Nucleo board you'll be limited by your I/O pins).

Note: the FMC is a newer, upgraded version of the FSMC (flexible static memory controller). For our purposes, FMC and FSMC are interchangeable.

The easy part: how to make FMC send commands and data

Once the FMC is properly configured (which will be covered in the next few sections), the difference between writing commands to the LCD vs writing data is as simple as writing to two different addresses in memory.

The LCD uses the NOR/PSRAM configuration (and memory banks). For sub-bank 1, this begins at address 0x60000000 in the controller's memory. As mentioned, from the LCD's end, the difference between a command and data is dictated by the RS signal.

Something you will have to configure is which I/O pin the RS signal is output on. You have to choose a pin Ax, where x is an integer between 0 and 24. The address in memory where you write data to will correspond to (0x60000000 + (0b1 << x)).

I use A18, so my setup is:

0x60000000 - for commands

0x60040000 - for data

Note: the above holds for 8b interfaces. For 16b ([D15:D0]), the computation for the data address becomes (0x60000000 + (0b1 << (x+1))). So for the same A18, the data address would become 0x60080000.
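The address arithmetic can be checked with a few lines of C. This is a minimal sketch: `LCD_BANK_BASE` and `lcd_data_address` are names made up for illustration, and the function simply encodes the two formulas given above (shift by x for an 8b bus, by x+1 for a 16b bus).

```c
#include <assert.h>
#include <stdint.h>

/* NOR/PSRAM sub-bank 1 base address, as stated in the text. */
#define LCD_BANK_BASE 0x60000000u

/* Data address for an LCD whose RS pin is wired to FMC address line
 * A<rs_line>. Commands are written to the base address itself. */
uint32_t lcd_data_address(uint32_t base, unsigned rs_line,
                          unsigned bus_width_bits)
{
    assert(bus_width_bits == 8 || bus_width_bits == 16);
    assert(rs_line <= 24); /* range given in the text */
    unsigned shift = (bus_width_bits == 16) ? rs_line + 1 : rs_line;
    return base + ((uint32_t)1 << shift);
}
```

With A18 this gives 0x60040000 for an 8b bus and 0x60080000 for a 16b bus. On the target, one would typically cast the command and data addresses to `volatile uint8_t *` (or `volatile uint16_t *` for a 16b bus) and write through those pointers.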

WARNING: despite being conceptually straightforward, this can be a source of problems if your display ends up not working as expected. You'll note that in my actual code I use addresses 0xC0000000 and 0xC0040000 for commands and data respectively. I will explain this in the final section.

Tutorial

Open the images in separate tabs to view them at full resolution.

Configuring the FMC in STM32CubeMX

Start by selecting a New Project in STM32CubeMX, and finding and selecting your STM32 processor and board. Again, mine is the Nucleo F767ZI.

Pinout & Configuration

You will be redirected to a page like this:

Initializing FMC interface in CubeMX

In the Categories panel (left), open the drop-down menu under "Connectivity" and select "FMC" (NOTE: yours may be listed under a different category -- there is a search bar above the panel).

After you select FMC, an "FMC Mode and Configuration" panel will open (center). In the "Mode" section, set the following parameters:

Chip Select --> NE1

Memory type --> LCD Interface

LCD Register Select --> A18 (or whichever pin you're choosing to use)

Data --> 8 bits

Disabling the FIFO

In the "Configuration" section, disable the Write FIFO.

There are some timing parameters you can (and will need to) configure, but these can be easily changed after the code is generated. Ignore them for now.

Selecting a GPIO pin for Reset

In the right panel is the pinout view. Selecting and configuring the FMC will have automatically assigned almost every signal you need to its own pin. However, you will also need to manually select a pin for your LCD's RESET signal. I use PD2.

NOTE: double-check which pins are accessible on your board, especially with Nucleo boards. CubeMX may automatically put a signal on a pin that exists on the processor but is not connected to an I/O port on your actual physical board. You can manually reassign signals to other pins as needed. Refer to your board's datasheet and your processor's datasheet to check which pins are available for which signals.

Click on which pin you want to configure, and a pop-up menu will appear with the list of configurations available to that pin. For the RESET signal, select "GPIO Output".

Entering a user label

You can then right-click the pin and select "Enter User Label" and write a friendly, recognizable name for the pin. This will generate new macro definitions for the pin and port when you generate your code.

User label entered

Clock Configuration

Next, TODO: clock configuration picture is wrong!

Clock configuration

Project Manager

Project manager

CubeMX code: style, HAL libraries, and warnings

If you are new to STM32, and especially if you are new to CubeMX, the generated code you will see can be disorienting.

CubeMX is designed to be as convenient as possible - to a fault. It is built off of the STM32 HAL (hardware abstraction layer) libraries, which are designed to be easy-to-understand and generalized enough to be compatible with multiple STM32 processors. By using HAL, most of CubeMX's work (translating the user's configurations into a C program) is done for it - if you look at the main.c file that was generated in your project, you'll see just how few lines of code there actually are.

The HAL libraries are great for this purpose, but beware. Any STM32 hobbyist knows, and will tell you, the HAL libraries are notorious for being buggy and error-prone. The libraries (or at least the relevant files) are automatically included in your project directory when it's created. If anything unexpected happens when you run your FMC program, you'll want to sift through the functions and struct declarations and make sure that they are defined and written appropriately. Some examples are included in the "Debugging" portion of the tutorial.

You've likely also noticed in your "main.c" file some comments like so:

/* USER CODE BEGIN */ ... /* USER CODE END */

Between these comments are where CubeMX expects the user to add their own code. If at a later time you want to change a configuration in CubeMX, you would need to click "Generate Code" again to refresh the project. The code inside of these USER CODE BEGIN/END spaces would be left un-touched, but everything else would get changed as defined by CubeMX's programming. This can be rather constrictive, especially if you decide you want to make changes to non-user functions - your changes would be over-written the next time you clicked "Generate Code". If you made any personal changes to the HAL libraries, those would also get erased upon refresh.

My personal approach is to generate the code in STM32CubeMX the first time around, so that I have a main.c template ready to go. After that, any time I need to make a configuration change, I do so by editing the code directly. With most of the configuration code already written for me the first time around, finding the lines I need to change is pretty straightforward, and this way I don't risk losing personal code by constantly re-generating the project.

In short, STM32CubeMX and HAL are great for setting up projects to start, but don't lean on either too heavily, and be ready to look for errors.

Wiring the Nucleo to the LCD display

Timing Configuration

This is maybe the most tedious portion of configuration, as it requires gathering data from your devices and doing some calculations.

In your main.c file, there is an MX_FMC_Init() function:

FMC_Init func

This is where we manually configure the timing parameters for the FMC. The fields of the FMC_NORSRAM_TimingTypeDef struct are passed to a function FMC_NORSRAM_Timing_Init() (defined in stm32f7xx_ll_fmc), which uses them to set the bits of the FMC_BTR1 register:

TODO: image

Timing FMC_Init func

Each of these timings is expressed as a number of HCLK cycles.

For our application, BusTurnAroundDuration can be zero, and CLKDivision and DataLatency are irrelevant. TODO description of each definition.

The two most important fields here are AddressSetupTime and DataSetupTime.

If you're lazy, and you're using all the same equipment and configurations I'm using, you can just set AddressSetupTime = 6 and DataSetupTime = 5 and you should be good to move to the next section. If you aren't using the same LCD I'm using but still want to get everything running before trying to optimize the system, set AddressSetupTime and DataSetupTime to their maximum values to be safe.
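To give a feel for how these fields end up in FMC_BTR1, here is a host-side sketch that packs them into a 32-bit register word. It is not the HAL's implementation: `fmc_btr_pack` is a hypothetical helper, and the bit positions used (ADDSET in bits 3:0, ADDHLD in 7:4, DATAST in 15:8, BUSTURN in 19:16) are my reading of the register layout, so treat them as an assumption and check them against your reference manual. The clock-division, data-latency and access-mode fields are left at zero here, since the text above treats them as irrelevant for this application.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the four timing fields discussed in the text into a BTR-style word.
 * Assumed layout: ADDSET[3:0], ADDHLD[7:4], DATAST[15:8], BUSTURN[19:16]. */
uint32_t fmc_btr_pack(uint32_t addset, uint32_t addhld,
                      uint32_t datast, uint32_t busturn)
{
    assert(addset <= 15);                  /* 4-bit field */
    assert(addhld <= 15);                  /* 4-bit field */
    assert(datast >= 1 && datast <= 255);  /* 8-bit field, zero not used here */
    assert(busturn <= 15);                 /* 4-bit field */
    return (addset << 0) | (addhld << 4) | (datast << 8) | (busturn << 16);
}
```

With AddressSetupTime = 6, DataSetupTime = 5 and the other two fields at zero, `fmc_btr_pack(6, 0, 5, 0)` yields 0x00000506 under the assumed layout.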

The following is a diagram from the F76xx datasheet:

TODO: change this picture to have AddressHoldTime commented as to set to zero.

Timing diagram STM32 FMC

The address setup time controls the number of HCLK cycles from the assertion of the RS signal to the falling edge of the NWE signal, at which point the data is set. Data must be valid at the rising edge of NWE. The data setup time controls the number of HCLK cycles that NWE is low, plus 1 HCLK cycle (presumably to ensure data remains valid until after NWE has returned high and not before).

How are these values appropriately chosen? It depends on the frequency of HCLK and the timing constraints of your external memory interface.

Here is a diagram and table of the timing characteristics for the ILI9341 and 8080 interfacing:

Timing diagram ILI9341

STM32's Application Note 2790 (TFT LCD Interfacing with FSMC) has a section on calculating timings on page 14. The AN2790 uses NOR Flash configuration for timings instead of NOR SRAM -- I have slightly modified the equations below to account for the discrepancies.

((ADDSET) + (DATAST+1))tHCLK = max(tCYC, tCYC(READ))

DATAST*tHCLK = tWRLW

Where DATAST must satisfy:

DATAST = (((tACC + tAS) + (tsu(Data_NE) + tv(A_NE)))/tHCLK) - ADDSET - 4

Below is a table of symbol definitions, plus their constraints as given in the ILI9341 datasheet. For read timings, I use the ID versions instead of the FM versions. FM stands for "frame memory"; I am not sure what ID actually stands for, but it should correspond to the register read timings. We are not concerned with memory access times, especially for a write-only application.

Symbol	Description	Min Value	Max Value
tHCLK	HCLK period (for 168MHz, this is ~5.95ns)	NA	NA
tCYC	TFT write cycle time	66ns	None
tCYC(READ)	TFT read cycle time	160ns	None
tWRLW	TFT write signal low pulse width	15ns	None
tACC	TFT data access time	None	40ns
tAS	TFT address setup time	0ns	None
tsu(Data_NE)	NE1 low to data valid time	tHCLK - 1 (source)	None
tv(A_NE)	NE1 low to address valid time	None	0.5ns

Given that the equations are expressed as left side and right side equivalent, whereas the ILI9341 lists Min and Max values, this can be a little confusing at first glance. Using intuition on the first two equations helps to understand how they should be applied.

The LCD has a minimum valid WRITE time and a minimum valid READ time -- since we are only writing to the LCD, tCYC(READ) actually becomes irrelevant here. A complete "write" from the FMC interface involves setting the address and setting the data, the time frames of which are defined by ADDSET and DATAST respectively. So, logically, the total time frame of ADDSET + DATAST (+ 1 HCLK cycle) needs to span at least 66ns to be valid. And of course, this time frame depends on the HCLK frequency: if the frequency of HCLK is increased, the period of time an HCLK cycle spans is decreased, and therefore ADDSET + DATAST must be increased.

Similarly, evaluating equation 2: as we'd observed in the FMC timing diagram, DATAST corresponds to the period of time during which NWE is set low. So the time spanned by DATAST should meet the constraints of the LCD's low write pulse width.

The first two equations can therefore be more easily interpreted as:

(ADDSET + (DATAST + 1))tHCLK >= 66ns

DATAST*tHCLK >= 15ns

Note: fractions / decimals in these equations should be rounded up, rather than using integer division. (Alternatively, if using integer division, greater-than-or-equal-to relationships should be revised to greater-than).

The third equation, I find far less intuitive, especially since there were no upper bounds for LCD timings in the first two equations. But continuing the trend of ensuring that DATAST and ADDSET span the given time constraints of the LCD, I evaluated Equation 3 in a worst-case scenario. In other words, I assumed DATAST must at least equal the right side of the equation even at the LCD's upper bounds. I'm not even sure how significant this relationship is for a write-only interface (plus, as you'll see soon, the relationship is already satisfied in Equations 1 and 2).

The re-written equation below is just plugging in maximum values of parameters wherever given:

DATAST >= (40 + tHCLK - 1 + 0.5)/tHCLK - ADDSET - 4

Using our 168MHz frequency, the three equations give us the following information:

(ADDSET + DATAST + 1) >= 12

DATAST >= 3

DATAST >= 4 - ADDSET

Equation 3 essentially "collapses" into Equation 1. Then we have:

(ADDSET + DATAST + 1) >= 12

DATAST >= 3

Because I like symmetry, the values I use are ADDSET = 6 and DATAST = 5.
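The round-up arithmetic behind Equations 1 and 2 can be reproduced with a short C helper, which is handy if you change HCLK. This is a sketch under the assumptions already made above (tCYC = 66ns, tWRLW = 15ns, write-only use); the function names `min_hclk_cycles` and `fmc_write_timing_ok` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest whole number of HCLK cycles that spans at least t_ns
 * nanoseconds, i.e. ceil(t_ns * hclk_hz / 1e9), using integers only. */
uint32_t min_hclk_cycles(uint32_t t_ns, uint32_t hclk_hz)
{
    assert(hclk_hz > 0);
    uint64_t num = (uint64_t)t_ns * hclk_hz;
    return (uint32_t)((num + 999999999u) / 1000000000u);
}

/* Returns 1 when ADDSET and DATAST satisfy both write constraints:
 *   (ADDSET + DATAST + 1) * tHCLK >= tCYC   (66ns)
 *   DATAST * tHCLK               >= tWRLW  (15ns) */
int fmc_write_timing_ok(uint32_t addset, uint32_t datast, uint32_t hclk_hz)
{
    const uint32_t t_cyc_ns = 66;
    const uint32_t t_wrlw_ns = 15;
    return (addset + datast + 1 >= min_hclk_cycles(t_cyc_ns, hclk_hz)) &&
           (datast >= min_hclk_cycles(t_wrlw_ns, hclk_hz));
}
```

At 168MHz the helper returns 12 cycles for 66ns and 3 cycles for 15ns, matching the figures above, and `fmc_write_timing_ok(6, 5, 168000000u)` passes. At a slower clock the minimums shrink: at 24MHz, 66ns needs only 2 cycles and 15ns needs 1.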

The ILI9341 Library

Running a program in Atollic TrueSTUDIO

If you're using TrueSTUDIO, you'll note that there are icons for building the program and debugging it, but there is no "Run Program" button like in a lot of other IDEs. The debugger can be configured such that the "Debug" icon acts as a "Run Program".

Open the debugger configuration window by selecting the blue bug icon.

TrueStudio IDE screen

Go to the "Startup Scripts" tab.

startup scripts tab

By default, the debugger script is initialized to FLASH the board and load the .elf to the processor, then hardware-set enable the debugger and break at main() after the program has started running (the debugger will then break again at each subsequent breakpoint the user has set).

To run your program in real-time without any breakpoints, all you have to do is replace the breakpoint code with "quit". This script will FLASH the board, load the program, then quit out of the debugger.

revised startup script

If you want to have these settings saved as a new configuration, rename the configuration (in the "Name" box), then select "Apply".

Then by clicking "Debug" (green bug icon) from then on, after a buffer on the IDE's end, the program will compile and run independently.

Debugging

Use a logic analyzer

If you don't have a logic analyzer you are not going to get very far. It is difficult to pin-point the source of your problem if you can't see what's happening along each wire.

Personally, I use the LA2016 logic analyzer, which is a pretty decent analyzer for less than $100. Unfortunately, it only reaches up to 200MHz sample rate (so it can't probe a system at 168MHz without risk of aliasing), and some of the channels have even lower sampling thresholds. Therefore, while debugging, I had to drop my SYSCLK / HCLK to 24MHz to probe WR, CS, RS and D0 all simultaneously.

Output SYSCLK on MCO2

Though not a necessity, it will help to verify that your clock is running as expected, and that your FMC signals are timed correctly with respect to it.

For the F767ZI processor there are two MCOx pins that clocks can be output on, but for the Nucleo board itself the only one accessible from the GPIO pins is MCO2, which is why I use it instead of MCO1.

HCLK is not one of my options for clock sources I can output on MCO2, but with an AHB prescaler of 1, the HCLK and SYSCLK should be running at the same speeds.

Add the following line to the end of your MX_GPIO_Init() function (if you are using it):

HAL_RCC_MCOConfig(RCC_MCO2, RCC_MCO2SOURCE_SYSCLK, RCC_MCODIV_1);

This will configure the appropriate GPIO pin with alternate function MCO2 to output the SYSCLK. TODO: mention which pin