Interface protection - janomach/the-hardisc GitHub Wiki
Memory protection against bit-flips
The Hardisc integrates support for memory protected by ECC with SECDED capability, meaning that a single-bit error can be corrected and a double-bit error can be detected. The protection allows uninterrupted execution of instructions in the presence of correctable errors and provides a mechanism for reporting those errors to software. The core provides maskable interrupts for correctable errors at the instruction and data interfaces, with the affected memory address saved in a custom CSR. It also provides non-maskable interrupts for uncorrectable errors.
For more information about memory protection and how it affects power, performance, and area of the core, refer here.
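To make the SECDED behaviour concrete, the following is a minimal sketch of an extended Hamming code. It is an illustration only: the source does not state which ECC code or word width the Hardisc memory uses, so the 8-bit data width, the bit layout, and the function names `secded_encode` and `secded_decode` are assumptions chosen for brevity.

```python
def secded_encode(data: int, data_bits: int = 8) -> list:
    """Encode `data` into an extended Hamming codeword (illustrative layout).

    Index 0 holds the overall parity bit; indices 1..n follow the classic
    Hamming numbering, with check bits at power-of-two positions.
    """
    # Smallest r such that 2**r >= data_bits + r + 1
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    n = data_bits + r
    code = [0] * (n + 1)

    # Place data bits at positions that are not powers of two
    d = 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):
            code[pos] = (data >> d) & 1
            d += 1

    # Each check bit makes the parity of its covered positions even
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity

    # Overall parity bit extends single-error correction to double-error detection
    code[0] = sum(code[1:]) & 1
    return code


def secded_decode(code: list) -> tuple:
    """Return (data, status), where status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    n = len(code) - 1

    # Syndrome is the XOR of the positions of all set bits
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) & 1  # 0 for a valid codeword

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        # Odd number of flips: treat as a single error at position `syndrome`
        # (syndrome 0 means the overall parity bit itself was hit)
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: detected, not correctable
        status = "uncorrectable"

    data, d = 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):
            data |= code[pos] << d
            d += 1
    return data, status
```

In this model, flipping any one bit of an encoded word still decodes to the original data with status `corrected`, while flipping two bits yields `uncorrectable`. These two outcomes correspond to the maskable and non-maskable interrupt cases described above.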
Instruction interface
The checksum is saved in the IFB together with the fetched data. To preserve a high operating frequency, the data are corrected within the IFB, not between the IFB and the Decoder. A pipeline bubble is created if the data are faulty and the IFB contains only one entry.
Data interface
To preserve a high operating frequency, the loaded data are corrected in the WB stage without forwarding to lower stages. Consequently, loading faulty data results in a pipeline flush. Protection of the data interface is more complicated since it must also handle stores to memory. Every sub-word memory store uses a Read-Modify-Write sequence so that the final memory checksum is computed correctly.
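The Read-Modify-Write sequence for sub-word stores can be sketched as a small software model. The checksum covers a whole word, so a byte store must first read the old word, merge the new byte, and then recompute the checksum. The `checksum` function below is a deliberately simple stand-in (an XOR of the four bytes), not the SECDED code used by the core, and the 32-bit word size, little-endian byte lanes, and the `EccMemory` class are assumptions made for illustration.

```python
def checksum(word: int) -> int:
    """Stand-in checksum over a 32-bit word (XOR of its bytes, NOT real SECDED)."""
    return (word ^ (word >> 8) ^ (word >> 16) ^ (word >> 24)) & 0xFF


class EccMemory:
    """Hypothetical word-organised memory that stores a checksum per word."""

    def __init__(self):
        self.words = {}  # word-aligned address -> (data, checksum)

    def store_word(self, addr: int, value: int) -> None:
        # A full-word store can compute the checksum directly from the new data
        value &= 0xFFFFFFFF
        self.words[addr & ~3] = (value, checksum(value))

    def store_byte(self, addr: int, value: int) -> None:
        base = addr & ~3
        # Read: fetch the current word (a real core would also check and correct it)
        old, _ = self.words.get(base, (0, checksum(0)))
        # Modify: replace only the addressed byte lane (little-endian assumed)
        shift = 8 * (addr & 3)
        mask = 0xFF << shift
        new = ((old & ~mask) | ((value & 0xFF) << shift)) & 0xFFFFFFFF
        # Write: store the merged word with a checksum over all 32 bits
        self.words[base] = (new, checksum(new))
```

Writing only the new byte together with a checksum derived from that byte alone would leave the stored checksum inconsistent with the other three bytes, which is why the read step cannot be skipped.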
Error reporting and interrupts
The core can continue executing when a correctable error is present on either interface, but it does not automatically correct the data in memory. The software is responsible for implementing such functionality. A new CSR, maddrerr, stores the address at which the correctable error was detected. The core provides two new maskable interrupt sources (one for each interface). If an uncorrectable error is detected during an instruction fetch or data load, a non-maskable interrupt is raised, which causes an immediate jump to the provided trap handler.
Bus protection against transient faults
A hybrid protection approach was selected to cope with transient bus faults. Parity bits protect 8-bit sets of the core request signals, while the 1-bit response signals (HREADY and HRESP) from subordinates are triplicated so that the master can take a majority vote. If there is a parity mismatch in the request, a subordinate may signal the issue through the HRESP signal. The software can enable an automatic instruction restart via the custom CSR hrdctrl for the case where a bus error (HRESP) response is received during execution. If this feature is enabled and a bus error is observed, the pipeline is flushed and a jump to the address of the faulty instruction is performed to restart its execution. A transfer that failed due to a single-event transient (SET) is expected to succeed if repeated. If the restarted instruction fails again for the same reason, an exception is raised, followed by a jump to the trap handler. Such behaviour is likely caused by an incorrect data bus access or by other problems signalled by an AHB subordinate. This safety feature allows uninterrupted execution of the software in the presence of transient soft errors while preserving functional error response messages.
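The restart policy described above can be summarised as a small behavioural model. This is a sketch of the decision logic only, not of the hardware: `run_instruction`, the `transfer` callback, and the `restart_enabled` flag (standing in for the relevant hrdctrl setting, whose bit layout is not given here) are hypothetical names.

```python
from typing import Callable


def run_instruction(transfer: Callable[[], bool], restart_enabled: bool) -> str:
    """Model one instruction whose bus transfer may return an HRESP error.

    `transfer()` returns True for an OKAY response and False for an error.
    """
    if transfer():
        return "completed"
    if not restart_enabled:
        # Without the restart feature, a bus error is reported immediately
        return "exception"
    # Restart: the pipeline is flushed and the faulty instruction is re-executed
    if transfer():
        # A transient fault is expected to disappear on the second attempt
        return "completed after restart"
    # A repeated failure points to a functional error, so it is reported
    return "exception"
```

With the restart enabled, a transfer that fails once and then succeeds is invisible to software, whereas a transfer that fails twice still reaches the trap handler.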
The table below lists the bus signals protected by each parity bit. Not all signals are required in every implementation, so the table also gives the actual number of bits protected by each parity bit on the instruction and data interfaces.
Parity bit | Protected bus signals | Max. bits | Instr. bus | Data bus |
---|---|---|---|---|
0 | HADDR -> bits 0, 4, 8,12,16,20,24,28 | 8 | 8 | 8 |
1 | HADDR -> bits 1, 5, 9,13,17,21,25,29 | 8 | 8 | 8 |
2 | HADDR -> bits 2, 6,10,14,18,22,26,30 | 8 | 8 | 8 |
3 | HADDR -> bits 3, 7,11,15,19,23,27,31 | 8 | 8 | 8 |
4 | HSIZE, HWRITE, HPROT, HMASTLOCK, HBURST | 12 | 0 | 3 |
5 | HTRANS | 2 | 1 | 1 |
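The interleaved assignment of HADDR bits to parity bits 0 to 3 can be expressed compactly in code. The sketch below follows the bit groups from the table; the even-parity polarity and the function name `haddr_parity` are assumptions, since the polarity is not specified here.

```python
def haddr_parity(haddr: int) -> list:
    """Return parity bits 0..3 for a 32-bit HADDR value (even parity assumed).

    Parity bit i covers HADDR bits i, i+4, i+8, ..., i+28.
    """
    parity = []
    for i in range(4):
        p = 0
        for bit in range(i, 32, 4):
            p ^= (haddr >> bit) & 1
        parity.append(p)
    return parity
```

One consequence of this interleaving is that any four adjacent address bits belong to four different parity groups, so a fault that flips up to four neighbouring bits changes at least one parity bit and cannot cancel itself out within a single group.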
For more information about the protection against transient faults in the bus, refer here.
Integration into the protected pipeline
Instruction interface
The address phase signals are determined by pipeline 0, whereas the parity bits are connected to the bus from pipeline 1. Read data with checksum is routed from the bus into both pipelines. The replicated response signals (HREADY and HRESP) are connected into separate pipelines. The third replica is not utilized at the instruction interface.
Data interface
The address phase signals are determined by pipeline 0, and the corresponding parity bits are connected from pipeline 1. Each pipeline provides the data and checksum to the TMR voter, which selects the majority result for the bus. Read data with checksum is pushed into each of the three pipelines in the MA stage. The replicated response signals (HREADY and HRESP) are distributed separately into each pipeline.
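The majority selection performed by a TMR voter on the write data and checksum can be illustrated with a bitwise majority function. This is a generic sketch of triple-modular-redundancy voting, assuming the voter works bit by bit; the function name `tmr_vote` is hypothetical and the actual voter implementation in the core may differ.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three replicas: each output bit takes the value
    that at least two of the three inputs agree on."""
    return (a & b) | (a & c) | (b & c)
```

Because the vote is taken per bit, the result stays correct even when different pipelines are corrupted in different bit positions, as long as no single bit position is wrong in more than one replica.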