Cortex M7 sample boot up flow - MarekBykowski/readme GitHub Wiki

                         Cortex-M7 (ARMv7-M, 500 MHz)
                    ┌──────────────┼──────────────┐
                    │              │              │
               I-TCM (32 KB)   D-TCM (32 KB)   AXIM (64-bit)
                64-bit port     2x32-bit port          │
                                                       │
                                           I-Cache 32 KB (64-bit)
                                           D-Cache 32 KB (32-bit)
                                                       │
                                                AXI Master
                                                       │
                                                 NoC @300 MHz
                                                       │
        ┌───────────────────────┬──────────────────────────┬────────────────────┐
        │                       │                          │                    │
   BOOT & LIB ROM         MAIN RAM (~5 MB)            qSPI Controller        Other AXI
      928 KB             (32/64-bit wide)                  │                  Slaves
                                                           │
                                                   Memory-Mapped
                                                     Interface
                                                           │
                                                   HyperRAM
                                                  (External)

Memory map

Address range            Label
0x00000000–0x1FFFFFFF    Code
0x20000000–0x3FFFFFFF    SRAM
0x40000000–0x5FFFFFFF    Peripheral
0x60000000–0x9FFFFFFF    External RAM
0xA0000000–0xDFFFFFFF    External device
0xE0000000–0xE00FFFFF    Private Peripheral Bus
0xE0100000–0xFFFFFFFF    Vendor-specific memory
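
As a quick aid when reading fault addresses, the map can be expressed as a lookup. This is only a sketch: the function name `armv7m_region` is made up for illustration, and the labels follow the ARMv7-M default memory map, which also assigns 0xA0000000–0xDFFFFFFF to External device.

```c
#include <stdint.h>

/* Hypothetical helper: map an address to its region in the ARMv7-M
 * default memory map. Boundaries are the architectural ones. */
static const char *armv7m_region(uint32_t addr)
{
    if (addr <= 0x1FFFFFFFu) return "Code";
    if (addr <= 0x3FFFFFFFu) return "SRAM";
    if (addr <= 0x5FFFFFFFu) return "Peripheral";
    if (addr <= 0x9FFFFFFFu) return "External RAM";
    if (addr <= 0xDFFFFFFFu) return "External device";
    if (addr <= 0xE00FFFFFu) return "Private Peripheral Bus";
    return "Vendor-specific memory";
}
```

For example, a faulting address of 0xE000ED28 falls in the Private Peripheral Bus range, which is where the System Control Block registers live.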

Use a DSB, followed by an ISB instruction or exception return to ensure that the new MPU configuration is used by subsequent instructions.
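
A minimal sketch of that ordering follows. The names `mpu_regs_t`, `mpu_set_region`, and the `BARRIER_*` macros are assumptions made for this example; a real project would more likely use the CMSIS `MPU` register block and `__DMB()`/`__DSB()`/`__ISB()`. The barriers compile to nothing off-target so the sequence can be exercised on a host with a fake register block.

```c
#include <stdint.h>

/* Layout of the first five ARMv7-M MPU registers (MPU_TYPE is at
 * 0xE000ED90 on target). The type name is hypothetical. */
typedef struct {
    volatile uint32_t TYPE;
    volatile uint32_t CTRL;
    volatile uint32_t RNR;
    volatile uint32_t RBAR;
    volatile uint32_t RASR;
} mpu_regs_t;

#if defined(__arm__) || defined(__thumb__)
#define BARRIER_DMB() __asm volatile ("dmb" ::: "memory")
#define BARRIER_DSB() __asm volatile ("dsb" ::: "memory")
#define BARRIER_ISB() __asm volatile ("isb" ::: "memory")
#else  /* host build: no barriers needed for the fake register block */
#define BARRIER_DMB() ((void)0)
#define BARRIER_DSB() ((void)0)
#define BARRIER_ISB() ((void)0)
#endif

/* Program one region, then make the new settings visible. */
static void mpu_set_region(mpu_regs_t *mpu, uint32_t region,
                           uint32_t base, uint32_t rasr)
{
    BARRIER_DMB();                 /* finish earlier memory accesses */
    mpu->CTRL = 0u;                /* disable the MPU while editing */
    mpu->RNR  = region;            /* select the region number */
    mpu->RBAR = base & ~0x1Fu;     /* base address, VALID/REGION bits clear */
    mpu->RASR = rasr;              /* attributes, size, enable */
    mpu->CTRL = 0x5u;              /* ENABLE | PRIVDEFENA */
    BARRIER_DSB();                 /* wait for the register writes */
    BARRIER_ISB();                 /* refetch under the new configuration */
}
```

On target the first argument would be `(mpu_regs_t *)0xE000ED90u`. Whether PRIVDEFENA should be set depends on the system; it is chosen here only so that privileged code keeps the default map for regions the MPU does not cover.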

Interrupt handling

  • Thread mode (unprivileged): application code runs
  • Exception entry: the core stacks the context and switches to Handler mode (always privileged)
  • Handler runs
  • Exception return
  • Back to Thread mode (unprivileged)
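
Which mode and stack the exception return goes back to is encoded in the EXC_RETURN value that the core places in LR on exception entry. The decoder below is an illustrative sketch; the type and function names are hypothetical, and the bit meanings are those of ARMv7-M with the floating-point extension. Whether Thread mode is unprivileged is controlled separately by CONTROL.nPRIV and is not part of EXC_RETURN.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool valid;      /* bits [31:5] are all ones, as EXC_RETURN requires */
    bool to_thread;  /* true: return to Thread mode, false: Handler mode */
    bool uses_psp;   /* true: unstack from PSP, false: from MSP */
    bool fp_frame;   /* true: extended frame that includes FP registers */
} exc_return_info_t;

/* Decode the LR value seen inside an exception handler. */
static exc_return_info_t decode_exc_return(uint32_t lr)
{
    exc_return_info_t info;
    info.valid     = (lr & 0xFFFFFFE0u) == 0xFFFFFFE0u;
    info.to_thread = (lr & (1u << 3)) != 0u;
    info.uses_psp  = (lr & (1u << 2)) != 0u;
    info.fp_frame  = (lr & (1u << 4)) == 0u;  /* bit 4 clear = FP frame */
    return info;
}
```

For instance, 0xFFFFFFFD decodes as a return to Thread mode using PSP with a standard frame, and 0xFFFFFFF1 as a return to Handler mode (a nested exception) using MSP.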

Debugging

The debugging hardware of the Cortex-M processor is based on the CoreSight architecture. 

Unlike traditional ARM processors, the CPU core itself does not have a Joint Test Action Group (JTAG) interface. Instead, the debug interface module is decoupled from the core, and a bus interface called the Debug Access Port (DAP) is provided at the core level. Through this bus interface, external debuggers can access the control registers of the debug hardware as well as system memory, even while the processor is running.

Chip manufacturers can also include an Embedded Trace Macrocell (ETM) to allow instruction trace.

The data watchpoint function is provided by the Data Watchpoint and Trace (DWT) unit.

Execution Paths

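Operation type/path

Operation type        Path used
Instruction fetch     I-Code
Vector fetch          I-Code
Data load             D-Code/System
Data store            System
Exception stacking    System
PPB access            System
ITCM fetch            ITCM
DTCM access           DTCM

Note: I-Code, D-Code, and System are the Cortex-M3/M4 bus names. On the Cortex-M7, non-TCM instruction fetches and memory accesses go through the AXIM interface shown in the diagram above, with peripheral accesses optionally routed through AHBP.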

ARM recommends that you locate the vector table in either the CODE, SRAM, External RAM, or External Device areas of the system memory map.

Using the Peripheral, Private peripheral bus, or Vendor-specific memory areas can lead to unpredictable behavior in some systems.
This is because the processor uses different interfaces for load/store instructions and vector fetch in these memory areas.

If the vector table is located in a region of memory that is cacheable, you must treat any load or store to the vector table as self-modifying code and use cache maintenance instructions to synchronize the update to the data and instruction caches.

If code in ITCM:

Core → ITCM (direct, no cache, deterministic)

If code in MAIN RAM:

Core → I-Cache → AXI → NoC → MAIN RAM

If code in HyperRAM:

Core → I-Cache → AXI → NoC → qSPI → HyperBus → HyperRAM

This path is:

  • Longer
  • Higher latency
  • Dependent on qSPI configuration
  • Dependent on burst mode
  • Dependent on dummy cycles

Executing from HyperRAM means:

  • Instruction fetch goes through I-Cache
  • Cache miss triggers AXI burst
  • AXI request goes through NoC
  • Hits qSPI
  • qSPI translates to HyperBus transaction
  • HyperRAM returns data

If any of these is misconfigured:

  • wrong latency
  • no burst
  • MPU region wrong
  • cache disabled
  • qSPI not in memory-mapped mode

→ HardFault / IBUSERR / PRECISERR

If your Cortex-M7 (ARMv7-M) crashes when executing code from HyperRAM, the cause is almost always a cache, MPU, or memory-attribute issue, not the CPU itself.

MPU not configured for executable memory

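By default:

  • External memory may be marked XN (Execute Never); in the ARMv7-M default memory map, 0xA0000000–0xDFFFFFFF is XN
  • Or as Device memory
  • Or as Strongly Ordered

If MPU region is wrong → MemManage fault (IACCVIOL) on first instruction fetch, escalated to HardFault if the MemManage handler is not enabled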
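
To make the attribute choices concrete, here is a sketch that builds an MPU_RASR value from its fields. The helper name `mpu_rasr` and the 8 MB region size used in the usage note are assumptions for illustration; the field positions are those of the ARMv7-M MPU_RASR register.

```c
#include <stdint.h>

/* Field positions in MPU_RASR (ARMv7-M). */
#define RASR_XN_POS    28   /* 1 = Execute Never */
#define RASR_AP_POS    24   /* access permissions, 3 bits */
#define RASR_TEX_POS   19   /* type extension, 3 bits */
#define RASR_S_POS     18   /* shareable */
#define RASR_C_POS     17   /* cacheable */
#define RASR_B_POS     16   /* bufferable */
#define RASR_SIZE_POS  1    /* region size = 2^(SIZE+1) bytes */
#define RASR_ENABLE    1u

/* Hypothetical helper: size_log2 is log2 of the region size in bytes
 * (5..32), so the SIZE field is size_log2 - 1. */
static uint32_t mpu_rasr(unsigned size_log2, unsigned xn, unsigned ap,
                         unsigned tex, unsigned s, unsigned c, unsigned b)
{
    return ((uint32_t)(xn  & 1u) << RASR_XN_POS)
         | ((uint32_t)(ap  & 7u) << RASR_AP_POS)
         | ((uint32_t)(tex & 7u) << RASR_TEX_POS)
         | ((uint32_t)(s   & 1u) << RASR_S_POS)
         | ((uint32_t)(c   & 1u) << RASR_C_POS)
         | ((uint32_t)(b   & 1u) << RASR_B_POS)
         | ((uint32_t)((size_log2 - 1u) & 0x1Fu) << RASR_SIZE_POS)
         | RASR_ENABLE;
}
```

Assuming an 8 MB (2^23 byte) HyperRAM window, `mpu_rasr(23, 0, 3, 1, 0, 1, 1)` describes an executable, full-access, Normal write-back region, while `mpu_rasr(23, 1, 3, 0, 1, 0, 1)` describes the same window as shared Device memory with XN set, which would fault on instruction fetch.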

Check:

  • CFSR
  • HFSR
  • MMFSR
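
CFSR sits at 0xE000ED28 and HFSR at 0xE000ED2C; MMFSR is the low byte of CFSR, followed by BFSR (bits 15:8) and UFSR (bits 31:16). The following sketch turns a raw CFSR value into flag names. The table and function names are made up for this example; the bit positions are the ARMv7-M ones.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct { uint32_t mask; const char *name; } fault_bit_t;

static const fault_bit_t cfsr_bits[] = {
    /* MMFSR, CFSR[7:0] */
    { 1u << 0,  "IACCVIOL" },  { 1u << 1,  "DACCVIOL" },
    { 1u << 3,  "MUNSTKERR" }, { 1u << 4,  "MSTKERR" },
    { 1u << 5,  "MLSPERR" },   { 1u << 7,  "MMARVALID" },
    /* BFSR, CFSR[15:8] */
    { 1u << 8,  "IBUSERR" },   { 1u << 9,  "PRECISERR" },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR" },
    { 1u << 12, "STKERR" },    { 1u << 13, "LSPERR" },
    { 1u << 15, "BFARVALID" },
    /* UFSR, CFSR[31:16] */
    { 1u << 16, "UNDEFINSTR" }, { 1u << 17, "INVSTATE" },
    { 1u << 18, "INVPC" },     { 1u << 19, "NOCP" },
    { 1u << 24, "UNALIGNED" }, { 1u << 25, "DIVBYZERO" },
};

/* Write the names of all set CFSR flags, space separated, into out.
 * out_len must be at least 1; long results are truncated. */
static void cfsr_describe(uint32_t cfsr, char *out, size_t out_len)
{
    out[0] = '\0';
    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if (cfsr & cfsr_bits[i].mask) {
            size_t used = strlen(out);
            snprintf(out + used, out_len - used, "%s%s",
                     used ? " " : "", cfsr_bits[i].name);
        }
    }
}
```

On target the input would be read from the register, for example `*(volatile uint32_t *)0xE000ED28u`. If HFSR bit 30 (FORCED) is set, the HardFault was escalated from one of the faults reported in CFSR.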

I-Cache enabled but memory marked non-cacheable

If:

  • I-Cache ON
  • HyperRAM region is Normal memory but incorrectly configured

You can get:

  • Prefetch abort
  • Bus fault
  • Random crash
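
One way to check how a region is actually configured is to read back its MPU_RASR value and decode the TEX, C, and B bits. The sketch below covers only the common encodings of the ARMv7-M attribute table and reports everything else as `MEM_OTHER`; the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL_NONCACHEABLE,
    MEM_NORMAL_WRITE_THROUGH,
    MEM_NORMAL_WRITE_BACK,
    MEM_OTHER                 /* reserved or separate inner/outer policies */
} mem_type_t;

/* Classify a region from the TEX (bits 21:19), C (17), and B (16) fields. */
static mem_type_t rasr_mem_type(uint32_t rasr)
{
    unsigned tex = (rasr >> 19) & 0x7u;
    unsigned c   = (rasr >> 17) & 0x1u;
    unsigned b   = (rasr >> 16) & 0x1u;

    if (tex == 0u) {
        if (!c) return b ? MEM_DEVICE : MEM_STRONGLY_ORDERED;
        return b ? MEM_NORMAL_WRITE_BACK : MEM_NORMAL_WRITE_THROUGH;
    }
    if (tex == 1u) {
        if (!c && !b) return MEM_NORMAL_NONCACHEABLE;
        if (c && b)   return MEM_NORMAL_WRITE_BACK;
        return MEM_OTHER;
    }
    if (tex == 2u && !c && !b) return MEM_DEVICE;  /* non-shared Device */
    return MEM_OTHER;
}

/* XN is bit 28: instruction fetch is allowed only when it is clear. */
static int rasr_is_executable(uint32_t rasr)
{
    return ((rasr >> 28) & 0x1u) == 0u;
}
```

A region meant to hold code should come back as one of the Normal types with `rasr_is_executable` returning 1.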

HyperRAM latency too high for instruction fetch

HyperRAM is:

  • High latency
  • External
  • Variable latency (refresh inside device)

Instruction fetch is very timing sensitive.

If:

  • Controller not configured for burst
  • Configured latency too low
  • Wrong dummy cycles

CPU fetch → invalid instruction → crash.

D-Cache coherence issue

If code was copied to HyperRAM but the D-Cache was not cleaned, the copied instructions may still sit in the D-Cache when execution jumps there, and the CPU fetches stale data from HyperRAM.

You should:

SCB_CleanDCache_by_Addr(...)
SCB_InvalidateICache();
__DSB();
__ISB();

before jumping.
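
Cache maintenance by address works on whole cache lines, which are 32 bytes on the Cortex-M7. The sketch below computes a line-aligned range that covers a copied image; `cache_range_t` and `cache_align_range` are names invented for this example, and the CMSIS calls appear only in a comment because they need the target headers. Some CMSIS versions perform this alignment internally, so treat it as a precaution.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 32u   /* Cortex-M7 L1 cache line size in bytes */

typedef struct {
    uint32_t addr;   /* start address, rounded down to a cache line */
    uint32_t size;   /* byte count, rounded up to whole cache lines */
} cache_range_t;

/* Expand [addr, addr + len) to cache-line boundaries. */
static cache_range_t cache_align_range(uint32_t addr, uint32_t len)
{
    cache_range_t r;
    uint32_t end = addr + len;

    r.addr = addr & ~(CACHE_LINE_SIZE - 1u);
    r.size = ((end - r.addr) + CACHE_LINE_SIZE - 1u) & ~(CACHE_LINE_SIZE - 1u);
    return r;
}

/* On target (CMSIS-Core, not compiled here), after copying the code:
 *   cache_range_t r = cache_align_range(dst, len);
 *   SCB_CleanDCache_by_Addr((uint32_t *)r.addr, (int32_t)r.size);
 *   SCB_InvalidateICache();
 *   __DSB();
 *   __ISB();
 */
```

For example, an image copied to 0x60000010 with a length of 0x30 bytes needs the range starting at 0x60000000 with a size of 0x40 cleaned.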

HyperRAM controller not configured for memory-mapped mode

Many OctoSPI/HyperBus controllers require:

  • Memory-mapped mode enabled
  • Proper wrap/burst configuration
  • Linear addressing mode

If not → instruction fetch = garbage.