Troubleshooting: Debugging tree deadlocks

A tree deadlock occurs when Galacticus is unable to find any node in a merger tree that can be advanced forward in time. This can happen if, for example, a node has an event attached to it that never occurs, which prevents the node from evolving beyond the time of that event.

Deadlocks can be difficult to diagnose, particularly in very large trees. To help, when Galacticus detects a deadlock it writes an extensive report to the output log. It also writes the tree data, including the evolution interdependencies between nodes, to a GraphViz-format file named galacticusDeadlockTree_1.gv, which can be visualized with dot, for example. If more than one deadlocked tree is found, one file is written per tree, with the numerical suffix increasing by 1 for each.

This GraphViz file can also be used to detect cycles in the node interdependencies, which can help to isolate the cause of the deadlock. Cycle detection can be performed using the dot_find_cycles.py script from Jason Antman (see the blog post here). Before using this script it is generally worth removing from the GraphViz file any nodes that depend only on themselves for their evolution (these can be single-node trees, or nodes in the future of the requested output), as they are typically unrelated to the deadlock. To do this, and then run dot_find_cycles.py, use:

awk 'BEGIN {p=-100} {if (index($0,"parent") > 0 || index($0,"future") > 0) p=NR; if (NR > p+1) print $0}' galacticusDeadlockTree_1.gv > galacticusDeadlockTree_1_cleaned.gv
./dot_find_cycles.py galacticusDeadlockTree_1_cleaned.gv
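
For readers unfamiliar with awk, the filter above skips every line that contains the string parent or future, together with the single line that follows it, and prints everything else. A rough Python equivalent, offered only as a reading aid, might look like the sketch below. The function name drop_self_dependent and the sample lines are made up for illustration and are not taken from an actual Galacticus deadlock file.

```python
def drop_self_dependent(lines):
    """Mirror the awk filter: skip any line containing 'parent' or 'future',
    plus the one line immediately after it; keep all other lines."""
    kept = []
    skip_next = False
    for line in lines:
        if "parent" in line or "future" in line:
            # Matching line: drop it and mark the following line for removal.
            skip_next = True
            continue
        if skip_next:
            # Line directly after a match: drop it too.
            skip_next = False
            continue
        kept.append(line)
    return kept


# Hypothetical input, only to show the behaviour of the filter.
sample = [
    "digraph {",
    "a parent x",
    "a -> a",
    "b -> c",
    "future node",
    "d -> d",
    "c -> b",
    "}",
]
cleaned = drop_self_dependent(sample)
```

To apply the same filter to a real file, read it with open(...).read().splitlines(), pass the list to the function, and write the result back out joined with newlines.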

If dot_find_cycles.py detects any cycles, it prints them as follows:

0000000000143110 -> 0000000000026179 -> 0000000000143110
0000000000143111 -> 0000000000027822 -> 0000000000143111

This output shows two cycles, each containing two nodes. Note that the numbers displayed are the uniqueIDs of the nodes, not the node indices.
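
To make concrete what cycle detection on such a file involves, here is a minimal, self-contained sketch using only the Python standard library. It is not Jason Antman's script and makes no claim about how that script works. It assumes each edge appears as a simple `a -> b` statement (chained edges such as `a -> b -> c` are not handled), and it reports one cycle per back edge found by a depth-first search rather than every possible cycle. The names read_edges and find_cycles, and the sample graph text, are hypothetical.

```python
import re
from collections import defaultdict

# Matches simple edge statements such as  "0001" -> "0002"  or  a -> b
EDGE = re.compile(r'"?(\w+)"?\s*->\s*"?(\w+)"?')


def read_edges(text):
    """Collect directed edges from GraphViz text (one edge per statement)."""
    graph = defaultdict(set)
    for src, dst in EDGE.findall(text):
        graph[src].add(dst)
    return graph


def find_cycles(graph):
    """Iterative depth-first search. Each reported cycle is a list of node
    IDs that starts and ends on the same node."""
    state = {}   # node -> "active" (on the current path) or "done"
    cycles = []
    for root in sorted(graph):
        if root in state:
            continue
        state[root] = "active"
        path = [root]
        stack = [iter(sorted(graph[root]))]
        while stack:
            nxt = next(stack[-1], None)
            if nxt is None:
                # All neighbours explored: leave this node.
                stack.pop()
                state[path.pop()] = "done"
            elif state.get(nxt) == "active":
                # Back edge to a node on the current path: a cycle.
                cycles.append(path[path.index(nxt):] + [nxt])
            elif nxt not in state:
                state[nxt] = "active"
                path.append(nxt)
                stack.append(iter(sorted(graph.get(nxt, ()))))
    return cycles


# Hypothetical graph text reusing the uniqueIDs from the example output.
sample = '''
digraph Tree {
"0000000000143110" -> "0000000000026179";
"0000000000026179" -> "0000000000143110";
"0000000000143111" -> "0000000000027822";
"0000000000027822" -> "0000000000143111";
"0000000000000001" -> "0000000000000002";
}
'''
cycles = find_cycles(read_edges(sample))
for cycle in cycles:
    print(" -> ".join(cycle))
```

The search may start each cycle from a different node than dot_find_cycles.py does, so the same cycle can appear rotated (for example, beginning at 0000000000026179 rather than 0000000000143110).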