Node Status Monitoring (Safety) - Pitt-RAS/iarc7_common GitHub Wiki

The software system includes a Node Monitor, which forms a bond with each node that should be running. The bond lets the Node Monitor detect when a node crashes for any reason. The nodes are arranged in a hierarchy, with each node above all of its dependencies. When a node crashes, the Node Monitor triggers a safety response in the next node down in the hierarchy; that response is generally some kind of attempted landing. The type of landing depends on which node performs it and on how much of the system is still available. Many nodes in the hierarchy have no safety response because they cannot take any action (sensor nodes, for example), so the response continues propagating down the hierarchy until it reaches a node capable of handling the situation.
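The propagation rule described above can be illustrated with a small, self-contained Python sketch. It is not the project's implementation: the `Node` type, the `find_safety_responder` function, and the node names are all hypothetical, and the hierarchy is simplified to a flat list ordered from top to bottom.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a monitored node."""
    name: str
    # None means the node cannot take any action (e.g. a sensor node).
    safety_response: Optional[Callable[[], str]] = None


def find_safety_responder(hierarchy: List[Node], crashed: str) -> Optional[Node]:
    """Return the first node below `crashed` that has a safety response.

    `hierarchy` is ordered top to bottom, so every node appears above
    all of its dependencies. Nodes without a safety response are skipped,
    which models the response propagating further down.
    """
    names = [node.name for node in hierarchy]
    start = names.index(crashed) + 1  # begin with the next node down
    for node in hierarchy[start:]:
        if node.safety_response is not None:
            return node
    return None  # nothing below the crashed node can respond


# Illustrative hierarchy; the names and landing types are assumptions.
hierarchy = [
    Node("motion_planner", lambda: "planned landing"),
    Node("altimeter", None),  # sensor node: no safety response
    Node("low_level_controller", lambda: "fixed-throttle descent"),
]

responder = find_safety_responder(hierarchy, "motion_planner")
if responder is not None:
    print(responder.name, "->", responder.safety_response())
```

In this sketch, a crash of `motion_planner` skips the sensor node and lands on `low_level_controller`, whose response is more limited because less of the system is still available.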

Nodes can also ask the Node Monitor to trigger a safety response without crashing. This is used when a node detects a failure or invalid state and wants to trigger a response immediately but still shut down cleanly.
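A minimal sketch of the two trigger paths, crash detection and an explicit request, might look like the following. The class and method names (`NodeMonitorSketch`, `on_bond_broken`, `request_safety_response`, `AltitudeFilterNode`) are hypothetical and do not reflect the actual iarc7 API; the example only records events rather than commanding a landing.

```python
from typing import List, Tuple


class NodeMonitorSketch:
    """Hypothetical monitor that records why a safety response was triggered."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, str]] = []

    def on_bond_broken(self, node_name: str) -> None:
        # Path 1: the bond broke, so the node is assumed to have crashed.
        self.events.append((node_name, "crashed"))

    def request_safety_response(self, node_name: str) -> None:
        # Path 2: the node is still alive and asks for a response itself.
        self.events.append((node_name, "requested"))


class AltitudeFilterNode:
    """Hypothetical node that validates its own state."""

    def __init__(self, name: str, monitor: NodeMonitorSketch) -> None:
        self.name = name
        self.monitor = monitor
        self.running = True

    def process(self, altitude_m: float) -> bool:
        # A negative altitude is treated as an invalid state in this sketch.
        if altitude_m < 0.0:
            self.monitor.request_safety_response(self.name)  # trigger immediately
            self.shutdown()  # then exit cleanly instead of crashing
            return False
        return True

    def shutdown(self) -> None:
        # A real node would release resources here before exiting.
        self.running = False


monitor = NodeMonitorSketch()
node = AltitudeFilterNode("altitude_filter", monitor)
node.process(1.5)   # valid reading, nothing happens
node.process(-0.2)  # invalid reading, safety response requested
```

The point of the second path is ordering: the request is sent before shutdown begins, so the response does not have to wait for the monitor to notice a broken bond.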