EH outlining - ruijiefang/llvm-hcs GitHub Wiki

Exception handling (EH) in LLVM roughly follows this control-flow structure:

invoke-***
 |
lpad-***
 |
catch.dispatch
 |        |
catch     |
|         |
|  catch.fallthrough
|    |
resume

The difficulties of outlining are as follows:

  1. We cannot extract the block containing the invoke, since doing so would potentially extract the hot branch as well;
  2. We cannot extract the entire landing pad block, since the landingpad instruction must be the first non-PHI instruction in the block targeted by the invoke's unwind edge;
  3. It seems possible to simply split the lpad block in two after its first instruction (the landingpad) and start outlining from there; this is analogous to item 4, which we describe below;
  4. The catch.dispatch block potentially contains a series of calls to the eh.typeid.for intrinsic, which uses function-specific type information to decide whether a catch clause matches. As such, CodeExtractor cannot extract these calls (see the detailed discussion and example in https://bugs.llvm.org/show_bug.cgi?id=39545). Making eh.typeid.for outlining-friendly seems difficult in general, as the patch proposed in bug 39545 uses an entirely new pass to do so.
  5. What remains is the idea of hoisting the eh.typeid.for intrinsic calls further up in the control flow graph; since the control flow there is fairly ordinary, we can do so safely and store the resulting values in a variable. However, consider the following example of nested throws:

Since there are multiple catch.dispatch blocks, we cannot simply hoist the calls to eh.typeid.for out of them into an arbitrary block that precedes them. At first glance, it seems we could just move the call instructions into the block that sits between the landingpad block and catch.dispatch. However, doing so is potentially unsafe: the fallthrough branch of a try block in an outer nesting goes directly into the catch.dispatch block of the inner nesting, which means we cannot use the eh.typeid.for values there unless we insert a phi node.
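For illustration, nested handlers of the kind discussed here can come from C++ source like the minimal sketch below. The names `mayThrow` and `nested` are hypothetical, and the comments describe what Clang typically emits for the Itanium ABI; the exact block names and layout may differ by compiler version and target.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical callee that may throw; inside a try block, each call to it
// is typically lowered to an invoke with an unwind edge to a landing pad.
void mayThrow(int kind) {
  if (kind == 1) throw std::runtime_error("runtime");
  if (kind == 2) throw std::logic_error("logic");
}

// Two nested try blocks with unrelated exception types. Each catch clause is
// matched by comparing the landing pad's selector against eh.typeid.for, so
// this function usually ends up with more than one catch.dispatch block.
std::string nested(int outerKind, int innerKind) {
  try {
    mayThrow(outerKind);  // unwinds to the outer landing pad
    try {
      mayThrow(innerKind);  // unwinds to the inner landing pad
    } catch (const std::runtime_error &) {
      return "inner handler";
    }
    // A std::logic_error thrown by the inner call does not match the inner
    // clause and is handled by the outer one instead.
  } catch (const std::logic_error &) {
    return "outer handler";
  }
  return "no exception";
}
```

Calling `nested(0, 2)` exercises the path where the inner handler does not match and the outer handler does, which is the kind of path that makes a single hoisted eh.typeid.for value hard to share between the two dispatch blocks.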

A possible solution is to hoist these call instructions to the highest post-landingpad block that dominates them. This is the "safe" strategy we currently implement: https://github.com/ruijiefang/llvm-hcs/tree/eh-prepare. However, there are still complex cases of nested throws that we cannot fully outline, since this "highest block" containing the calls to eh.typeid.for might still lie inside the maximal detected SESE (single-entry, single-exit) region induced by a try/catch.
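The choice of hoist target can be sketched on a toy CFG, independent of the LLVM API. This is not the code in the eh-prepare branch: the `Cfg` type and the functions `reachable`, `dominators` and `hoistTarget` are hypothetical, and the sketch assumes one reading of "post-landingpad", namely a block that is not itself a landing pad and that can only be reached from the entry by passing through one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG (an assumption of this sketch): succ[i] lists the successors of
// block i, and block 0 is the entry.
using Cfg = std::vector<std::vector<int>>;

// Blocks reachable from the entry without entering any block marked in `blocked`.
std::vector<bool> reachable(const Cfg &succ, const std::vector<bool> &blocked) {
  std::vector<bool> seen(succ.size(), false);
  if (succ.empty() || blocked[0]) return seen;
  std::vector<int> work{0};
  seen[0] = true;
  while (!work.empty()) {
    int b = work.back();
    work.pop_back();
    for (int s : succ[b])
      if (!seen[s] && !blocked[s]) { seen[s] = true; work.push_back(s); }
  }
  return seen;
}

// dom[b][d] is true iff block d dominates block b (iterative data-flow form).
std::vector<std::vector<bool>> dominators(const Cfg &succ) {
  const std::size_t n = succ.size();
  std::vector<std::vector<int>> pred(n);
  for (std::size_t b = 0; b < n; ++b)
    for (int s : succ[b]) pred[s].push_back(static_cast<int>(b));
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true;
  for (bool changed = true; changed;) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      std::vector<bool> next(n, true);
      for (int p : pred[b])
        for (std::size_t d = 0; d < n; ++d) next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) { dom[b] = next; changed = true; }
    }
  }
  return dom;
}

// Highest block that dominates `user`, is not a landing pad, and is only
// reachable through a landing pad. Returns -1 if there is no such block.
int hoistTarget(const Cfg &succ, const std::vector<bool> &isLpad, int user) {
  assert(succ.size() == isLpad.size());
  assert(user >= 0 && static_cast<std::size_t>(user) < succ.size());
  const auto dom = dominators(succ);
  const auto all = reachable(succ, std::vector<bool>(succ.size(), false));
  const auto withoutLpads = reachable(succ, isLpad);
  int best = -1;
  std::size_t bestDepth = 0;
  for (std::size_t d = 0; d < succ.size(); ++d) {
    const bool postLpad = all[d] && !withoutLpads[d] && !isLpad[d];
    if (!dom[user][d] || !postLpad) continue;
    std::size_t depth = 0;  // number of dominators of d; fewer means higher up
    for (bool x : dom[d]) depth += x;
    if (best < 0 || depth < bestDepth) {
      best = static_cast<int>(d);
      bestDepth = depth;
    }
  }
  return best;
}
```

With a single landing pad followed by a straight-line block, the sketch picks that intermediate block as the hoist target. When an outer catch.dispatch is reached from two different landing pads, no earlier post-landingpad block dominates it, so the target is the dispatch block itself, which mirrors the nested cases that remain hard to outline.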