.pr_agent_auto_best_practices - iNavFlight/inav GitHub Wiki

Pattern 1: Make headers self-contained: include the headers that define every macro/type you use and add proper header guards (or #pragma once) so compilation does not depend on include order.

Example code before:

// foo.h (no guard)
#include <stdint.h>
uint32_t makeId(void) { return CONCAT4(A, B, C, D); } // CONCAT4 not defined here

Example code after:

// foo.h
#pragma once
#include <stdint.h>
#include "common/utils.h"  // defines CONCAT4
static inline uint32_t makeId(void) { return CONCAT4(A, B, C, D); }  // static inline: safe to include from several .c files
Relevant past accepted suggestions:
Suggestion 1: [reliability] Headers rely on transitive CONCAT4
The new bus_spi_stm32{h7,f7}xx.h headers use CONCAT4 but do not include the header that defines it, relying on the current include order and transitive includes from other headers. This is a latent build fragility: a future refactor or reuse of these headers elsewhere can cause compile failures (e.g., CONCAT4 undefined).

Issue description

The new SPI AF lookup table headers use CONCAT4(...) but do not include the header that defines it (common/utils.h). They currently compile only because of transitive includes (drivers/io.h -> io_def.h -> common/utils.h) and are therefore fragile to include-order changes or reuse.

Issue Context

This is a latent build fragility / maintainability issue: it may not break today, but it can break later during refactors or if another file includes these headers without including common/utils.h first.

Fix Focus Areas

  • src/main/drivers/bus_spi_stm32h7xx.h[32-38]
  • src/main/drivers/bus_spi_stm32f7xx.h[36-42]

Suggestion 2: [correctness] dronecan.h missing deps
`dronecan.h` uses `PG_DECLARE` but doesn’t include `config/parameter_group.h` (and lacks a header guard), making builds depend on include order. `fc_tasks.c` includes `dronecan.h` before any header that defines `PG_DECLARE`, which can cause compile failures.

Issue description

src/main/drivers/dronecan/dronecan.h is not self-contained: it uses PG_DECLARE but does not include the header that defines it (config/parameter_group.h). This creates brittle include-order dependencies and can break compilation in files that include dronecan.h early (e.g. fc_tasks.c).

Issue Context

PG_DECLARE is defined in src/main/config/parameter_group.h. Some compilation units include dronecan.h before any header that brings in parameter_group.h.

Fix Focus Areas

  • src/main/drivers/dronecan/dronecan.h[1-19]
  • src/main/fc/fc_tasks.c[18-43]
  • src/main/config/parameter_group.h[104-110]

Pattern 2: Validate external or variable-length inputs before reading/using them (env vars, protocol payloads, computed frame lengths, array indices), and fail deterministically when invalid rather than continuing with undefined behavior.

Example code before:

int pr = atoi(getenv("PR_NUMBER"));  // no NULL check, no way to detect a malformed value
uint8_t x = buf[2];        // assumes at least 3 bytes
port = ports[idx];         // assumes idx valid

Example code after:

const char *s = getenv("PR_NUMBER");
char *end = NULL;
long pr = s ? strtol(s, &end, 10) : -1;
if (pr < 0 || end == s || *end != '\0') { return ERROR_INVALID_INPUT; }

if (len < 3) { return ERROR_SHORT_PAYLOAD; }

if (idx < 0 || idx >= portCount) { return ERROR_BAD_INDEX; }
Relevant past accepted suggestions:
Suggestion 1: [reliability] `parseInt` result not validated
The PR comment step uses `parseInt(process.env.PR_NUMBER)` without checking for `NaN` or enforcing base-10 parsing, which can lead to non-deterministic behavior if the env var is missing/malformed. This violates the requirement to validate external inputs and handle invalid values deterministically.

Issue description

The workflow parses PR_NUMBER from an environment variable using parseInt(...) and then uses it without checking for NaN (and without specifying radix 10). If the env var is missing or malformed, this can produce non-deterministic behavior (e.g., NaN in URLs / API params) instead of a clear, deterministic failure.

Issue Context

This job runs with elevated permissions (pull-requests: write) and should validate external inputs (including env vars derived from artifacts/outputs) before use.

Fix Focus Areas

  • .github/workflows/pr-test-builds.yml[107-110]
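
The same validation idea, written in C to match the Pattern 2 example, might look like the sketch below. `parsePositiveInt` is a hypothetical helper, not an existing INAV or workflow function. It assumes that only values in the range 1..INT_MAX are acceptable, which is a reasonable guess for a PR number but still an assumption. Note that `strtol` itself skips leading whitespace and accepts a sign, so a stricter caller may want to check the first character as well.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse a strictly positive base-10 integer.
 * Returns true and stores the value in *out only when strtol consumed the
 * entire string and the result lies in 1..INT_MAX. */
bool parsePositiveInt(const char *s, int *out)
{
    if (s == NULL || *s == '\0') {
        return false;                   /* missing or empty input */
    }

    char *end = NULL;
    errno = 0;
    long value = strtol(s, &end, 10);   /* radix 10 is explicit */

    if (end == s || *end != '\0') {
        return false;                   /* no digits, or trailing garbage */
    }
    if (errno == ERANGE || value <= 0 || value > INT_MAX) {
        return false;                   /* overflow or outside accepted range */
    }

    *out = (int)value;
    return true;
}
```

A caller would typically pass `getenv("PR_NUMBER")` straight in and abort with a clear error message when the function returns false.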

Suggestion 2:

Add payload size validation check

Add a payload size check in the MSP_OSD_CUSTOM_POSITION handler to ensure the incoming data is at least 3 bytes before reading from the buffer.

src/main/fc/fc_msp.c [2718-2731]

 case MSP_OSD_CUSTOM_POSITION: {
+    if (dataSize < 3) {
+        return MSP_RESULT_ERROR;
+    }
     uint8_t item;
     sbufReadU8Safe(&item, src);
     if (item < OSD_ITEM_COUNT){ // item == addr
         osdEraseCustomItem(item);
         osdLayoutsConfigMutable()->item_pos[0][item] = sbufReadU16(src) | (1 << 13);
         osdDrawCustomItem(item);
     }
     else{
         return MSP_RESULT_ERROR;
     }
 
     break;
 }
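
One way to make such size checks hard to forget is to put them inside the decoding helper itself. The sketch below is only an illustration: `demoReader_t` and `demoReadItemPosition` are hypothetical names, not INAV's `sbuf_t` API, and the 3-byte layout (one item byte followed by a little-endian 16-bit position) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read cursor over a received payload (not INAV's sbuf_t). */
typedef struct {
    const uint8_t *ptr;   /* next unread byte */
    const uint8_t *end;   /* one past the last valid byte */
} demoReader_t;

static size_t demoBytesRemaining(const demoReader_t *r)
{
    return (size_t)(r->end - r->ptr);
}

/* Decode "item (1 byte) + position (2 bytes, little-endian)".
 * Returns false and leaves the outputs and cursor untouched when fewer
 * than 3 bytes remain. */
bool demoReadItemPosition(demoReader_t *r, uint8_t *item, uint16_t *pos)
{
    if (demoBytesRemaining(r) < 3) {
        return false;
    }
    *item = r->ptr[0];
    *pos = (uint16_t)(r->ptr[1] | (r->ptr[2] << 8));
    r->ptr += 3;
    return true;
}
```

With this shape, a handler can map a false return directly to its error result instead of reading past the end of the payload.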

Suggestion 3:
  • [learned best practice] Validate `"$#"` before reading `$1` and print a short usage message on error so failures are deterministic and user-friendly.
Suggestion 4:
  • [possible issue] Add a check to ensure the index returned by `findSerialPortIndexByIdentifier` is valid (>= 0) before using it to access the `portConfigs` array to prevent potential crashes.
Suggestion 5:
  • [learned best practice] Before writing variable-length frames, validate the computed payload/frame length against `CRSF_FRAME_SIZE_MAX`/`CRSF_PAYLOAD_SIZE_MAX` (and remaining `sbuf` capacity) and clamp counts or abort if it won’t fit.
Suggestion 6:
  • [general] Add a check to the firmware renaming script to ensure `.hex` files exist before attempting to loop through and rename them, preventing potential errors.
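
The index and frame-length checks from Suggestions 4 and 5 can be sketched as two small helpers. All names and constants here are hypothetical: `DEMO_PAYLOAD_SIZE_MAX` is a stand-in value and not the real CRSF limit, and the convention that a negative index means "not found" is assumed from the wording of Suggestion 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAYLOAD_SIZE_MAX ((size_t)60)  /* hypothetical limit, not the CRSF constant */
#define DEMO_BYTES_PER_VALUE  ((size_t)2)   /* each value is written as 16 bits */

/* Clamp the number of 16-bit values so that header + values fit both the
 * protocol payload limit and the free space left in the output buffer. */
size_t demoClampValueCount(size_t requested, size_t headerBytes, size_t bufferFree)
{
    size_t limit = (DEMO_PAYLOAD_SIZE_MAX < bufferFree) ? DEMO_PAYLOAD_SIZE_MAX : bufferFree;
    if (headerBytes >= limit) {
        return 0;                           /* no room for any value */
    }
    size_t maxValues = (limit - headerBytes) / DEMO_BYTES_PER_VALUE;
    return (requested < maxValues) ? requested : maxValues;
}

/* Assumed lookup convention: a negative index means "not found". */
bool demoIndexIsValid(int index, size_t count)
{
    return index >= 0 && (size_t)index < count;
}
```

Whether to clamp or to abort when the data does not fit depends on the protocol; clamping is shown here only because Suggestion 5 mentions it as an option.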

  • Pattern 3: Check and propagate return values and timeouts for hardware/IO operations (DMA/SDIO, bus reads/writes, CAN transmit/init, telemetry frame writers) and avoid consuming queues or proceeding with configuration when an operation reports failure.

    Example code before:

    canComputeTimings(&t);   // returns bool, ignored
    canInit();               // returns status, ignored
    txQueuePop();            // pops even if transmit failed
    while (!(SDIO->STA & DONE)) { /* no timeout */ }
    

    Example code after:

    if (!canComputeTimings(&t)) { return ERROR_BAD_TIMINGS; }
    
    int st = canInit();
    if (st != 0) { return st; }
    
    if (canTransmit(&frame) == 1) {
      txQueuePop();
    } else {
      // keep queued for retry
    }
    
    uint32_t start = micros();
    while (!(SDIO->STA & DONE)) {
      if (micros() - start > SD_TIMEOUT_US) { return SD_TIMEOUT; }
    }
    
    Relevant past accepted suggestions:
    Suggestion 1: [correctness] `canardSTM32ComputeTimings` unchecked
    `canardSTM32ComputeTimings()` returns `bool` but its result is ignored and `out_timings` is used unconditionally to configure the peripheral. If timing computation fails, CAN may be initialized with invalid/uninitialized timing values without any error propagation.

    Issue description

    CAN timing computation failure is ignored, potentially configuring hardware with invalid values.

    Issue Context

    The timing helper explicitly returns false for invalid/unsatisfied configurations; initialization should not proceed on failure.

    Fix Focus Areas

    • src/main/drivers/dronecan/libcanard/canard_stm32h7xx_driver.c[162-168]

    Suggestion 2: [correctness] `canardSTM32CAN1_Init()` return ignored
    `canardSTM32CAN1_Init()` returns a status code but `dronecanInit()` ignores it and continues initialization. This can leave DroneCAN partially initialized and failing silently at runtime.

    Issue description

    DroneCAN initialization ignores CAN peripheral init failures.

    Issue Context

    Proceeding after failed CAN init can cause confusing runtime behavior and make debugging difficult.

    Fix Focus Areas

    • src/main/drivers/dronecan/dronecan.c[404-440]

    Suggestion 3: [correctness] TX queue popped on failure
    `processCanardTxQueue()` always pops the libcanard TX queue even when `canardSTM32Transmit()` returns 0 (not sent, e.g. TX FIFO full). This will silently drop DroneCAN frames under load.

    Issue description

    processCanardTxQueue() drops frames by popping the TX queue even when the hardware transmit reports “not sent yet” (return 0).

    Issue Context

    On STM32H7 the transmit function returns 0 when HAL_FDCAN_AddMessageToTxFifoQ fails (e.g. TX FIFO full), which should be retried.

    Fix Focus Areas

    • src/main/drivers/dronecan/dronecan.c[358-375]
    • src/main/drivers/dronecan/libcanard/canard_stm32h7xx_driver.c[83-128]
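
The "pop only after a confirmed send" rule can be illustrated with a small ring-buffer queue. This is a sketch under stated assumptions, not the libcanard API: every `demo*` name is hypothetical, the transmit callback is assumed to return a positive value when the hardware accepted the frame and 0 when it is busy, and any non-positive result is treated as "keep the frame queued". The fake FIFO exists only so the example can run without hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUEUE_LEN 4

typedef struct { uint32_t id; } demoFrame_t;

typedef struct {
    demoFrame_t frames[DEMO_QUEUE_LEN];
    size_t head;    /* index of the oldest queued frame */
    size_t count;   /* number of queued frames */
} demoTxQueue_t;

/* Assumed convention: > 0 means accepted by hardware, 0 means busy. */
typedef int (*demoTransmitFn)(const demoFrame_t *frame);

bool demoQueuePush(demoTxQueue_t *q, demoFrame_t frame)
{
    if (q->count == DEMO_QUEUE_LEN) {
        return false;                       /* queue full */
    }
    q->frames[(q->head + q->count) % DEMO_QUEUE_LEN] = frame;
    q->count++;
    return true;
}

/* Send queued frames in order. A frame is removed only after the transmit
 * callback reports success; on the first refusal the loop stops and the
 * frame stays at the head for a later retry. Returns the number sent. */
size_t demoProcessTxQueue(demoTxQueue_t *q, demoTransmitFn transmit)
{
    size_t sent = 0;
    while (q->count > 0) {
        if (transmit(&q->frames[q->head]) <= 0) {
            break;
        }
        q->head = (q->head + 1) % DEMO_QUEUE_LEN;
        q->count--;
        sent++;
    }
    return sent;
}

/* Simulated hardware FIFO with a limited number of free slots. */
int demoFifoFreeSlots = 2;

int demoFakeTransmit(const demoFrame_t *frame)
{
    (void)frame;
    if (demoFifoFreeSlots == 0) {
        return 0;                           /* FIFO full: not sent */
    }
    demoFifoFreeSlots--;
    return 1;
}
```

A real driver may also want to distinguish permanent errors (where dropping the frame is the right call) from a temporarily full FIFO; this sketch deliberately does not model that.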

    Suggestion 4: [correctness] F7 transmit always succeeds
    On STM32F7, `canardSTM32Transmit()` returns 1 even when `HAL_CAN_Transmit()` fails, masking errors and causing upper layers to believe the frame was sent.

    Issue description

    STM32F7 canardSTM32Transmit() reports success even on HAL transmit failure, hiding errors and causing silent packet loss.

    Issue Context

    Callers use the return value to decide whether to keep or drop frames.

    Fix Focus Areas

    • src/main/drivers/dronecan/libcanard/canard_stm32f7xx_driver.c[127-170]
    • src/main/drivers/dronecan/dronecan.c[358-375]

    Suggestion 5:
  • [possible issue] In `SD_StartBlockTransfert`, check if the DMA disable loop timed out. If it did, set an error and return to avoid reconfiguring an active DMA stream, which could cause unpredictable behavior.
    Suggestion 6:
  • [possible issue] In `SD_HighSpeed`, if the `swTimeout` is reached while waiting for SDIO status, return an error like `SD_TIMEOUT` instead of just breaking the loop to prevent processing incomplete data.
    Suggestion 7:
  • [possible issue] In `teraRangerUpdate`, move the `busWrite()` call that triggers a new measurement to before any early `return` statements. This ensures the sensor is always re-triggered after a read attempt, preventing it from getting stuck.
    Suggestion 8:
  • [general] Add a timeout or retry counter to the polling loops in `STORAGE_Read` to prevent the system from hanging if the SD card becomes unresponsive.
    Suggestion 9:
  • [possible issue] In `processCrsf`, check the return value of `crsfRpm` and only call `crsfFinalize` if data was successfully written to the buffer.
    Suggestion 10:
  • [possible issue] In `processCrsf`, make the call to `crsfFinalize` conditional on `crsfTemperature` actually writing data to avoid sending empty temperature frames.
    Suggestion 11:
  • [learned best practice] Await the API call and wrap in try/catch to fail the step on error and aid troubleshooting.
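
Several of the suggestions above come down to "never poll without a bound, and report the timeout to the caller". A minimal sketch of that shape follows. The names `demoWaitReady`, `demoStatus_t` and the fake clock and device are hypothetical and exist only to make the example runnable; on real hardware the clock callback would be something like a microsecond timer and the ready callback a status-register check.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { DEMO_OK = 0, DEMO_TIMEOUT } demoStatus_t;

typedef uint32_t (*demoClockFn)(void);   /* returns current time in microseconds */
typedef bool (*demoReadyFn)(void);       /* returns true once the device is ready */

/* Poll isReady() until it succeeds or timeoutUs elapses. The elapsed time is
 * computed with unsigned subtraction, so it stays correct when the 32-bit
 * microsecond counter wraps around. */
demoStatus_t demoWaitReady(demoReadyFn isReady, demoClockFn nowUs, uint32_t timeoutUs)
{
    const uint32_t start = nowUs();
    while (!isReady()) {
        if ((uint32_t)(nowUs() - start) > timeoutUs) {
            return DEMO_TIMEOUT;         /* caller must handle this, not ignore it */
        }
    }
    return DEMO_OK;
}

/* Simulated clock and device, for demonstration only. */
uint32_t demoFakeTimeUs = 0;
uint32_t demoReadyAtUs = 0;

uint32_t demoFakeClock(void)
{
    demoFakeTimeUs += 10;                /* each call advances time by 10 us */
    return demoFakeTimeUs;
}

bool demoFakeReady(void)
{
    return demoFakeTimeUs >= demoReadyAtUs;
}
```

The caller is expected to propagate `DEMO_TIMEOUT` upward (or set an error state) rather than continue with partially transferred data.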

  • Pattern 4: Keep generated and environment-specific build artifacts out of version control, and avoid mutating tracked source-of-truth files in post-build steps unless gated behind an explicit “update” action.

    Example code before:

    # CMakeLists.txt
    add_custom_command(TARGET app POST_BUILD
      COMMAND ${CMAKE_COMMAND} -E echo "..." >> ${CMAKE_SOURCE_DIR}/cmake/pg_struct_sizes.db
    )
    # repo contains: CMakeFiles/**, DependInfo.cmake, generated sources
    

    Example code after:

    # CMakeLists.txt (write to build dir; update source file only on explicit flag)
    if(UPDATE_PG_DB)
      add_custom_command(TARGET app POST_BUILD
        COMMAND ${CMAKE_COMMAND} -E copy
          ${CMAKE_BINARY_DIR}/pg_struct_sizes.db.new
          ${CMAKE_SOURCE_DIR}/cmake/pg_struct_sizes.db
      )
    else()
      add_custom_command(TARGET app POST_BUILD
        COMMAND ${CMAKE_COMMAND} -E copy
          ${CMAKE_BINARY_DIR}/pg_struct_sizes.db.new
          ${CMAKE_BINARY_DIR}/pg_struct_sizes.db
      )
    endif()
    # .gitignore: CMakeFiles/, **/DependInfo.cmake, **/dsdlc_generated/
    
    Relevant past accepted suggestions:
    Suggestion 1: [correctness] `dsdlc_generated` code committed
    This PR adds `dsdlc_generated` DroneCAN DSDL outputs directly to the repo, which are generated artifacts and can make builds non-reproducible and the repo noisy. These files should be generated into the build directory (or updated via an explicit opt-in step) and excluded from normal source tracking.

    Issue description

    The PR commits DSDL-generated DroneCAN sources/headers under dsdlc_generated, which are generated artifacts.

    Issue Context

    Generated artifacts should be produced as part of the build (or via an explicit opt-in update command) and not be committed as normal source to keep the repository clean and reproducible.

    Fix Focus Areas

    • src/main/drivers/dronecan/dsdlc_generated/src/uavcan.equipment.air_data.Sideslip.c[1-8]
    • cmake/main.cmake[2-12]

    Suggestion 2:

    Remove generated build file from repository

    Remove the generated CMake build file from the repository. It contains user-specific absolute paths that will cause build failures for other developers and should be added to .gitignore.

    src/src/main/target/AXISFLYINGF7PRO/CMakeFiles/AXISFLYINGF7PRO_for_bl.elf.dir/DependInfo.cmake [1-474]

    -# Consider dependencies only in project.
    -set(CMAKE_DEPENDS_IN_PROJECT_ONLY OFF)
    +# This file should be removed from the repository.
     
    -# The set of languages for which implicit dependencies are needed:
    -set(CMAKE_DEPENDS_LANGUAGES
    -  "ASM"
    -  )
    -# The set of files for implicit dependencies of each language:
    -set(CMAKE_DEPENDS_CHECK_ASM
    -  "/Users/ahmed/Desktop/Projects/INAV-RPiOSD/inav/lib/main/CMSIS/DSP/Source/TransformFunctions/arm_bitreversal2.S" "/Users/ahmed/Desktop/Projects/INAV-RPiOSD/inav/src/src/main/target/AXISFLYINGF7PRO/CMakeFiles/AXISFLYINGF7PRO_for_bl.elf.dir/__/__/__/__/lib/main/CMSIS/DSP/Source/TransformFunctions/arm_bitreversal2.S.obj"
    -  "/Users/ahmed/Desktop/Projects/INAV-RPiOSD/inav/src/main/startup/startup_stm32f722xx.s" "/Users/ahmed/Desktop/Projects/INAV-RPiOSD/inav/src/src/main/target/AXISFLYINGF7PRO/CMakeFiles/AXISFLYINGF7PRO_for_bl.elf.dir/__/__/startup/startup_stm32f722xx.s.obj"
    -  )
    -set(CMAKE_ASM_COMPILER_ID "GNU")
    -
    -# Preprocessor definitions for this target.
    -set(CMAKE_TARGET_DEFINITIONS_ASM
    -...
    -
    -# The include file search paths:
    -set(CMAKE_ASM_TARGET_INCLUDE_PATH
    -  "main/target/AXISFLYINGF7PRO"
    -  "/Users/ahmed/Desktop/Projects/INAV-RPiOSD/inav/lib/main/STM32F7/Drivers/STM32F7xx_HAL_Driver/Inc"
    -  "/Users/ahmed/Desktop/Projects/INAV-RPiOSD/inav/lib/main/STM32F7/Drivers/CMSIS/Device/ST/STM32F7xx/Include"
    -...
    -

    Suggestion 3:

    Remove generated file from version control

    Remove the generated Makefile.cmake file from version control. This file is environment-specific and should be ignored by adding the CMakeFiles directory to .gitignore.

    src/CMakeFiles/Makefile.cmake [1-9]

    -# CMAKE generated file: DO NOT EDIT!
    -# Generated by "Unix Makefiles" Generator, CMake Version 4.1
    +# This file should be removed from the pull request and repository.
     
    -# The generator used is:
    -set(CMAKE_DEPENDS_GENERATOR "Unix Makefiles")
    -
    -# The top level Makefile was generated from the following files:
    -set(CMAKE_MAKEFILE_DEPENDS
    -  "CMakeCache.txt"
    -...
    -

    Suggestion 4:
  • [learned best practice] Avoid mutating `cmake/pg_struct_sizes.db` in a POST_BUILD step; instead write an updated copy to a build output file (or gate updates behind an explicit `--update-db` flag) and print instructions to apply it.

  • Pattern 5: Avoid large stack allocations and ambiguous “valid zero” defaults in embedded/task code paths by using static/preallocated buffers and initializing outputs to explicit invalid/no-data sentinels.

    Example code before:

    void handleMsg(void) {
      uint8_t tmp[377];          // large stack buffer
      int16_t rangeCm = 0;       // 0 can look like a valid measurement
    }
    

    Example code after:

    static uint8_t tmp[377];     // or use a shared pool
    void handleMsg(void) {
      int16_t rangeCm = RANGEFINDER_NO_NEW_DATA;
      // fill tmp safely...
    }
    
    Relevant past accepted suggestions:
    Suggestion 1: [correctness] 377-byte stack buffer used
    `handle_GetNodeInfo()` and `send_NodeStatus()` allocate `uint8_t buffer[UAVCAN_PROTOCOL_GETNODEINFO_RESPONSE_MAX_SIZE]` on the stack, where the max is 377 bytes. In embedded/IO paths this increases stack pressure and risk of overflow, especially if called in constrained task contexts.

    Issue description

    Large stack buffers are allocated in DroneCAN handler/sender paths, increasing stack pressure.

    Issue Context

    Embedded systems often have small stacks per task/ISR; using static/shared buffers or preallocated pools is safer.

    Fix Focus Areas

    • src/main/drivers/dronecan/dronecan.c[158-205]
    • src/main/drivers/dronecan/dsdlc_generated/include/uavcan.protocol.GetNodeInfo_res.h[10-16]

    Suggestion 2:

    Use static allocation for array

    The array servoMixerSwitchHelper is allocated on the stack with a size that could be large (MAX_SERVO_RULES/2). This could cause stack overflow in embedded systems with limited stack space. Consider using static allocation or dynamic allocation instead.

    src/main/flight/servos.c [211-227]

     //move the rate filter to new servo rules
     int maxMoveFilters = MAX_SERVO_RULES/2;
     int movefilterCount = 0;
    -servoMixerSwitch_t servoMixerSwitchHelper[maxMoveFilters]; // helper to keep track of servoSpeedLimitFilter of servo rules
    +static servoMixerSwitch_t servoMixerSwitchHelper[MAX_SERVO_RULES/2]; // helper to keep track of servoSpeedLimitFilter of servo rules
     memset(servoMixerSwitchHelper, 0, sizeof(servoMixerSwitchHelper));
     for (int i = 0; i < servoRuleCount; i++) {
         if(currentServoMixer[i].inputSource == INPUT_MIXER_SWITCH_HELPER || movefilterCount >= maxMoveFilters) {
             break;
         }
         if(currentServoMixer[i].speed != 0 && servoSpeedLimitFilter[i].state !=0) {
             servoMixerSwitchHelper[movefilterCount].targetChannel = currentServoMixer[i].targetChannel;
             servoMixerSwitchHelper[movefilterCount].speed = currentServoMixer[i].speed;
             servoMixerSwitchHelper[movefilterCount].rate = currentServoMixer[i].rate;
             servoMixerSwitchHelper[movefilterCount].speedLimitFilterState = servoSpeedLimitFilter[i].state;
             movefilterCount++;
         }
     }

    Suggestion 3:
  • [learned best practice] Initialize the measurement field to a fail-safe sentinel (e.g., `RANGEFINDER_NO_NEW_DATA`) so startup/first-read states don't appear as a valid `0cm` reading.
    Suggestion 4:

    Initialize static variable at declaration

    Initialize the static variable mavSystemId to a default value of 1 at declaration. This prevents it from being zero-initialized to an invalid system ID, making the code more robust.

    src/main/telemetry/mavlink.c [186-187]

     // Set mavSystemId from telemetryConfig()->mavlink.sysid
    -static uint8_t mavSystemId;
    +static uint8_t mavSystemId = 1;
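
Both halves of Pattern 5 can be combined in one small sketch: a statically allocated response buffer instead of a large stack array, and a measurement field that starts at an explicit "no data" sentinel. The identifiers and values are hypothetical; in particular `DEMO_RANGE_NO_NEW_DATA` is a stand-in and is not claimed to match the value of INAV's `RANGEFINDER_NO_NEW_DATA`. The shared buffer is not reentrant, so this assumes a single task uses it at a time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RESPONSE_MAX_SIZE 377       /* size taken from the suggestion above */
#define DEMO_RANGE_NO_NEW_DATA (-1)      /* hypothetical sentinel value */

/* Static storage: the buffer does not consume task stack on every call. */
static uint8_t demoResponseBuffer[DEMO_RESPONSE_MAX_SIZE];

typedef struct {
    int32_t rangeCm;                     /* sentinel until a real reading arrives */
} demoRangefinder_t;

void demoRangefinderInit(demoRangefinder_t *dev)
{
    dev->rangeCm = DEMO_RANGE_NO_NEW_DATA;   /* 0 would look like a valid 0 cm reading */
}

bool demoRangefinderHasReading(const demoRangefinder_t *dev)
{
    return dev->rangeCm != DEMO_RANGE_NO_NEW_DATA;
}

/* Copy a payload into the shared static buffer. Returns the number of bytes
 * stored, or 0 if the payload is NULL or larger than the buffer. */
size_t demoBuildResponse(const uint8_t *payload, size_t len)
{
    if (payload == NULL || len > sizeof(demoResponseBuffer)) {
        return 0;
    }
    memcpy(demoResponseBuffer, payload, len);
    return len;
}
```

If the buffer could be touched from more than one context (for example a task and an interrupt handler), it would need locking or a per-context buffer, which this sketch does not cover.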

  • [Auto-generated best practices - 2026-05-01]
