Extra 2K DRAM Maybe - mhightower83/Arduino-ESP8266-misc GitHub Wiki
CAUTION: For some sketches, global constructors (`do_global_ctors()`) use a lot of SYS stack, and some even crash into `0x3FFFE000` (stack overflow).
The space after `0x3FFFE000` and before `0x40000000` (an 8K block) is used for the System (SYS) and Continuation (CONT) stacks. Parts of this address space, closer to `0x3FFFE000`, are used to support some built-in ROM function calls. The stack starts at the higher address and grows downward (toward lower address values), toward the memory used by those functions. The SDK, for reasons we are not privy to, has replaced some of these ROM functions with IRAM ones, freeing up some of this memory space.
By not calling some of these ROM functions, this memory can be used for other purposes. Many releases ago, this was done with the Extra 4K Heap option, which required not using the flawed WPS feature. This allowed the SYS and CONT stacks to fit in this address space, budgeting ~4K for each. More recently, the only remaining function that we know of that was using a block (`0x3FFFEA80` up to `0x3FFFEB30`) in this address space was `aes_unwrap`. As of Arduino ESP8266 Core v3.0.0 this function was replaced, freeing that address block.
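To make the addresses above easier to follow, here is a minimal sketch that only restates them as constants and lets the compiler check the sizes. The constant names and the `stackReachesRomSpan()` helper are hypothetical, chosen for illustration; they are not part of the Arduino ESP8266 core or the SDK.

```cpp
#include <cstdint>

// Addresses quoted in the text; the names are illustrative only.
constexpr uint32_t kStackBlockStart = 0x3FFFE000u;  // low end of the 8K SYS/CONT stack block
constexpr uint32_t kStackBlockEnd   = 0x40000000u;  // one past the top of the block
constexpr uint32_t kAesUnwrapStart  = 0x3FFFEA80u;  // block formerly used by ROM aes_unwrap
constexpr uint32_t kAesUnwrapEnd    = 0x3FFFEB30u;

// Derived sizes.
constexpr uint32_t kStackBlockSize = kStackBlockEnd - kStackBlockStart;
constexpr uint32_t kAesUnwrapSize  = kAesUnwrapEnd - kAesUnwrapStart;
constexpr uint32_t kLowSpan        = kAesUnwrapEnd - kStackBlockStart;

static_assert(kStackBlockSize == 8 * 1024, "stack block is 8K");
static_assert(kAesUnwrapSize == 176, "aes_unwrap block is 0xB0 bytes");
static_assert(kLowSpan == 0x0B30, "span from block start to end of aes_unwrap block");

// Hypothetical helper: true when a stack pointer value has descended
// below the top of the old aes_unwrap block (stacks grow downward).
constexpr bool stackReachesRomSpan(uint32_t sp) {
  return sp >= kStackBlockStart && sp < kAesUnwrapEnd;
}
```

For example, `stackReachesRomSpan(0x3FFFEB00u)` is true, while a stack pointer at `0x3FFFF000` is still clear of that span.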
In most cases when ROM `aes_unwrap` was used, the SYS stack did not grow into the `0x3FFFEB30` address space. It was when "HWDT Stack Dump" pushed the stack higher that I observed problems. The problem event appeared to be WiFi activity-related. Without HWDT Stack Dump the SYS stack got close but did not appear to overlap at the same time `aes_unwrap` was used. That is as far as we know.
Options:
- `0x0B30` is 2864 bytes. The total block size for the stacks is 8K, of which 4K for the CONT stack, plus these 2864 bytes and some misc., was used up. That left the SDK SYS stack working with just ~1.2K. We should be able to increase the CONT stack to 6K. This leaves ~800 bytes as a buffer zone.[1] So really we are saying the SYS stack can have ~2K; however, we think it is only using ~1.2K of it. Note, my math does not rigidly account for some misc. space usage. The SYS stack space needed may vary with the sketch. In particular, callbacks are made from the SYS context, so the stack they use comes out of the SYS stack.
- Instead of growing the CONT stack, we add the 2K to the DRAM Heap. This requires umm_malloc changes to pre-register the `0x3FFFC000` through `0x3FFFE000` block as a pre-allocated block so it is left alone. It turns out that adding this free 2K block fragment to the DRAM Heap can reduce overall fragmentation, because the core uses umm_malloc with the `UMM_BEST_FIT` strategy, which causes small allocations to drop into our new 2K block area.[2]
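The stack budget in the first option can be checked with a few lines of compile-time arithmetic. This is only a sketch of the rough numbers quoted above: the names are hypothetical, and like the text it ignores the misc. space usage.

```cpp
#include <cstdint>

// Rough figures from the text; names are illustrative only.
constexpr uint32_t kStackBlockSize = 8 * 1024;  // 0x3FFFE000 up to 0x40000000
constexpr uint32_t kRomReserved    = 0x0B30;    // 2864 bytes, formerly off limits
constexpr uint32_t kContDefault    = 4 * 1024;  // CONT stack before the change
constexpr uint32_t kContEnlarged   = 6 * 1024;  // CONT stack in option 1

// Bytes left for the SYS stack once the CONT stack and any reserved
// low-address span are subtracted from the 8K block.
constexpr uint32_t sysStackBudget(uint32_t contSize, uint32_t reservedLow) {
  return kStackBlockSize - contSize - reservedLow;
}

constexpr uint32_t kSysBefore  = sysStackBudget(kContDefault, kRomReserved);  // ~1.2K
constexpr uint32_t kSysAfter   = sysStackBudget(kContEnlarged, 0);            // ~2K
constexpr uint32_t kBufferZone = kSysAfter - kSysBefore;                      // ~800 bytes

static_assert(kSysBefore == 1232, "SYS stack had about 1.2K to work with");
static_assert(kSysAfter == 2048, "SYS stack can have about 2K with a 6K CONT stack");
static_assert(kBufferZone == 816, "about 800 bytes of buffer zone");
```

The ~800 byte buffer zone only exists if the SYS stack really stays within the ~1.2K it appeared to use before, which is why the text treats it as something to monitor rather than rely on.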
My past observations are that the SDK will do callbacks in a context where the SYS stack is at a minimum loading. I think the dangerous time, the high-use point, for the SYS stack will be when an interrupt or exception is handled during a callback with high stack usage.[1]
Whenever you ask the question "Can you ...?", you should also ask the question "Should you?".
While I already use ESPAsyncTCP and the matching AsyncWebServer, which use a lot of SYS callbacks, and I never saw problems with a 1.2K SYS stack, I don't feel comfortable limiting the SDK stack to 1.2K. Well yes, it really is closer to 2K. If anyone tries this, you really should implement a salted guard band.[1] Or maybe a memory block exception to detect writing into the memory guard band.[3]
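A salted guard band could look roughly like the sketch below: fill the buffer zone with a known pattern, then later count how much of the pattern is still intact. This is a minimal host-side illustration under stated assumptions, not code from the core. The salt value and the function names are hypothetical, and on the device `band` would have to point at the real buffer zone below the SYS stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint32_t kGuardSalt = 0xFEEDF00Du;  // hypothetical salt pattern

// Fill the guard band with the salt pattern.
inline void saltGuardBand(uint32_t* band, size_t words) {
  assert(band != nullptr);
  for (size_t i = 0; i < words; ++i) {
    band[i] = kGuardSalt;
  }
}

// Count intact salt words starting from the low-address end.
// The stack grows downward, so an intruding stack overwrites the
// high-address end first; the intact run at the low end is the
// margin that was never touched.
inline size_t intactGuardWords(const uint32_t* band, size_t words) {
  assert(band != nullptr);
  size_t n = 0;
  while (n < words && band[n] == kGuardSalt) {
    ++n;
  }
  return n;
}

// True when any word of the band no longer holds the salt below
// the deepest write, i.e. something wrote into the band.
inline bool guardBandBreached(const uint32_t* band, size_t words) {
  return intactGuardWords(band, words) != words;
}
```

A sketch might salt the band once at startup and call `guardBandBreached()` periodically, for example from `loop()`, logging `intactGuardWords()` to see how close the SYS stack has come. A stack value that happens to equal the salt would go unnoticed, so this is a monitor, not a guarantee.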
[1] Buffer zone should be salted and monitored.
[2] TODO: Another topic to develop (need to write example) is the use of pre-fragmenting the heap to reduce overall fragmentation and allow for a larger contiguous block.
[3] TODO: Finalize Memory block breakpoint Mini-Library.