The Reaction Possibility Engine - PathogenDavid/Periodic GitHub Wiki

The reaction possibility engine is the most complex subpart of the reaction processor. It consists of a data structure that describes the possible compounds that the system understands. The reaction processor iterates over the possible compounds for each element in the reaction and tests if that possible compound is present in this reaction, considering the current element as the root.

The makeup of a possible compound

All possible compounds are a tree of linked ReactionNode instances. (Do note, however, that ReactionNode isn't usually instantiated directly, only subclasses of it are.) In fact, the possible reaction database is implemented as a ReactionNode with several children.

Each possible compound has a specific root node. This node must be a pass-through node in order to process reactions in a logical manner. (More on pass-through nodes later.) The root node usually doesn't need to be a specific node, but there is usually one that makes logical sense to a human.

All nodes have a GetOutput method, which takes a single element as its input and produces a single element as its output. Outputting NULL signals that the process failed, and the node is considered not to have been satisfied. Whether this results in a failed reaction, however, depends on the node's parent.

After a node gets its output, and provided it didn't fail, it processes all of its children. Processing a child gets that child's output and then processes its children in turn, recursively. By default, if any one child fails to produce an output, the possible compound is not satisfied and fails. However, some nodes modify this behavior, such as the EitherOrNode.
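The recursion described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual class: the Element struct, the pass-through default for GetOutput, and the SymbolFilter subclass are hypothetical stand-ins, and only the names ReactionNode, GetOutput, Process, and AddChild come from the text.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the project's Element type.
struct Element {
    std::string symbol;
};

// Sketch of the contract described above (not the real ReactionNode).
class ReactionNode {
public:
    virtual ~ReactionNode() = default;

    void AddChild(ReactionNode* child) {
        assert(child != nullptr);
        children.push_back(child);
    }

    // Returns the element this node settles on, or nullptr on failure.
    // Assumption: the base version simply passes its input through.
    virtual Element* GetOutput(Element* input) { return input; }

    // Default policy: this node must produce an output, and then every
    // child must succeed when given that output as its own input.
    virtual bool Process(Element* input) {
        Element* output = GetOutput(input);
        if (output == nullptr) return false;
        for (ReactionNode* child : children) {
            if (!child->Process(output)) return false;
        }
        return true;
    }

protected:
    std::vector<ReactionNode*> children;
};

// Hypothetical subclass: succeeds only when the input has the given symbol.
class SymbolFilter : public ReactionNode {
public:
    explicit SymbolFilter(std::string s) : symbol(std::move(s)) {}

    Element* GetOutput(Element* input) override {
        assert(input != nullptr);
        return input->symbol == symbol ? input : nullptr;
    }

private:
    std::string symbol;
};
```

In this sketch a single failing child makes the whole subtree fail, which is the default behavior that nodes such as EitherOrNode override.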

There are three general types of ReactionNode implementations: Processor nodes, pass-through nodes, and logical nodes.

Processor nodes

Processor nodes are the nodes that move things forward. They look for potential bonds on the input that satisfy some condition.

For instance, if we had the potential reaction [A][B][C] where B was our root, and we wanted C to be the next element, a processor node looking for a bond to a C would ignore the A and be satisfied by the C.

Processor nodes should always try a new output if they are called an additional time. For the most part, this is implemented transparently to the ReactionNode implementer: the ReactionNode::Process method adds filter bits to the Elements in the reaction, and any Elements with filter bits are excluded from most Element query methods.

Two examples of processor nodes are ElementSymbolNode and ElementGroupNode. The former looks for bonds to elements with specific symbols, and the latter to elements in specific periodic groups.
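To make the filter-bit idea concrete, here is a rough sketch of a processor-style lookup. Everything in it is an assumption made for illustration: the Element struct, the neighbors list, the single filtered flag standing in for the filter bits, and the FindBondWithSymbol helper are hypothetical, and the real engine sets filter bits inside ReactionNode::Process rather than in the lookup itself.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the project's Element class.
struct Element {
    explicit Element(std::string s) : symbol(std::move(s)) {}
    std::string symbol;
    std::vector<Element*> neighbors;  // elements this one could bond with
    bool filtered = false;            // stand-in for the filter bits
};

// Returns the first unfiltered neighbor with the requested symbol and marks
// it as filtered, so a second call cannot return the same element again.
Element* FindBondWithSymbol(Element& input, const std::string& symbol) {
    for (Element* candidate : input.neighbors) {
        assert(candidate != nullptr);
        if (candidate->filtered) continue;  // already claimed by another node
        if (candidate->symbol == symbol) {
            candidate->filtered = true;
            return candidate;
        }
    }
    return nullptr;  // no remaining candidate satisfies the condition
}
```

With [A][B][C] and B as the input, a lookup for "C" skips A and returns C. Asking again returns nothing, because C is already marked.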

Pass-through nodes

Pass-through nodes will always output their input or NULL.

They are used to filter inputs. They are always used for the start of a reaction to determine if the current element being processed is a valid root for the possible reaction.

Two examples of pass-through nodes are ElementSymbolFilterNode and ElementGroupFilterNode. The former passes only elements with specific symbols, and the latter only elements in specific periodic groups.
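The pass-through contract (output is either the input itself or NULL) can be illustrated with two small free functions. This is only a sketch: the Group enum, the Element struct, and the function names FilterBySymbol and FilterByGroup are hypothetical and do not mirror the real node classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical periodic-group tag, for illustration only.
enum class Group { AlkaliMetal, AlkalineEarth, Halogen, Other };

// Hypothetical stand-in for the project's Element class.
struct Element {
    std::string symbol;
    Group group;
};

// Pass-through by symbol: returns the input unchanged, or nullptr.
const Element* FilterBySymbol(const Element* input, const std::string& symbol) {
    assert(input != nullptr);
    return input->symbol == symbol ? input : nullptr;
}

// Pass-through by periodic group: returns the input unchanged, or nullptr.
const Element* FilterByGroup(const Element* input, Group group) {
    assert(input != nullptr);
    return input->group == group ? input : nullptr;
}
```

Neither function ever returns a different element than the one it was given, which is what distinguishes a pass-through node from a processor node.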

Logical nodes

Logical nodes are a type of pass-through node, but they don't usually check any conditions on the input. Instead, they affect how the reaction is processed.

A NoOpNode does nothing directly. It doesn't affect how children are processed, and it always outputs its input. It is, however, useful for grouping reaction nodes together under nodes that affect child-processing logic (such as the EitherOrNode). A NoOpNode can be used to represent a concept similar to the scope braces ( {...} ) in C.

An EitherOrNode is used to allow multiple paths to be taken by the possible compound. It only requires a single child to succeed in order to be considered a successful branch of the possible compound processing tree. Additionally, it uses short-circuit logic: children are processed from first-added to last-added, and processing stops at the first child that succeeds, so later children are never processed.
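The difference between the default AND behavior and the either-or behavior can be sketched as follows. This is an illustration under assumptions, not the real implementation: Node, EitherOr, the ProcessChildren hook, and CountingSymbolFilter are hypothetical names, and the call counter exists only to make the short-circuiting visible.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the project's Element class.
struct Element {
    std::string symbol;
};

class Node {
public:
    virtual ~Node() = default;

    void AddChild(Node* child) {
        assert(child != nullptr);
        children.push_back(child);
    }

    virtual Element* GetOutput(Element* input) { return input; }

    bool Process(Element* input) {
        Element* output = GetOutput(input);
        return output != nullptr && ProcessChildren(output);
    }

protected:
    // Default: logical AND over the children, in insertion order.
    virtual bool ProcessChildren(Element* output) {
        for (Node* child : children) {
            if (!child->Process(output)) return false;
        }
        return true;
    }

    std::vector<Node*> children;
};

// Either-or: logical OR over the children, stopping at the first success.
class EitherOr : public Node {
protected:
    bool ProcessChildren(Element* output) override {
        for (Node* child : children) {
            if (child->Process(output)) return true;
        }
        return false;
    }
};

// Symbol filter that counts how often it was consulted.
class CountingSymbolFilter : public Node {
public:
    explicit CountingSymbolFilter(std::string s) : symbol(std::move(s)) {}

    Element* GetOutput(Element* input) override {
        ++calls;
        return input->symbol == symbol ? input : nullptr;
    }

    int calls = 0;

private:
    std::string symbol;
};
```

If two CountingSymbolFilter children for "Be" and "Mg" are added in that order and a Be element is processed, only the first filter is consulted. The second is reached only when the first fails.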

Example potential compounds

Simple two-element compound (LiI)

Here is an example of a very simple possible compound definition for lithium iodide:

namespace LithiumIodine
{
    ElementSymbolFilterNode root("Li");
    ElementSymbolNode node_1("I");

    void Initialize()
    {
        root.AddChild(&node_1);
        node_1.SetBondInfo(BondType_Ionic);
        CompoundDatabaseRoot.AddChild(&root);
    }
}

The object diagram of this possible compound looks like this:

So if we have the elements [I][Li]:

I is chosen as a root
The ElementSymbolFilterNode tries to filter "Li". This fails.
Li is chosen as a root
The ElementSymbolFilterNode tries to filter "Li". This succeeds, and [Li] is the output.
The ElementSymbolNode tries to find a bond with "I" on the input ([Li]). It finds the [I] on the left and outputs it.
ElementSymbolNode has no children, so it applies the bond between its input ([Li]) and output ([I]) and returns success.
ElementSymbolFilterNode has no more children. All of its children succeeded, and the node has no bond information, so it simply reports success.
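The walkthrough above boils down to two loops: try every element as a candidate root, then look for the required bond from that root. The following is a hand-flattened sketch of that idea for this one compound, assuming a hypothetical Element struct with a neighbors list. The real engine walks the node tree instead of hard-coding the symbols.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the project's Element class.
struct Element {
    explicit Element(std::string s) : symbol(std::move(s)) {}
    std::string symbol;
    std::vector<Element*> neighbors;  // elements this one could bond with
};

// Returns the element that works as the root of the LiI possible compound,
// or nullptr if no element in the reaction qualifies.
Element* FindLithiumIodideRoot(const std::vector<Element*>& reaction) {
    for (Element* candidate : reaction) {
        assert(candidate != nullptr);
        // Root filter step: only "Li" is a valid root.
        if (candidate->symbol != "Li") continue;
        // Processor step: look for a potential bond to an "I".
        for (Element* neighbor : candidate->neighbors) {
            if (neighbor->symbol == "I") return candidate;
        }
    }
    return nullptr;
}
```

For [I][Li], the iodine is rejected by the root filter, and the lithium is accepted because it has an iodine neighbor.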

Example possible compound with logic

For this example, we want to find an element in the alkaline earth group that has two elements from the halogen group bonded to it. However, there is an exception for this reaction: if the alkaline earth element is [Be], we use different bond info.

The logic might look something like this:

if (Input is AlkaliEarth)
{
    // Make sure we have two halogens bonded
    if (not Input has bonds with 2 Halogen)
    {
        return Failure;
    }
    
    if (Input is Be)
    {
        CreateBond(Covalent)
    }
    else
    {
        CreateBond(Ionic)
    }
}

However, keep in mind that nodes only know about their parent's output and nothing else. So we will re-arrange this so the input from the first conditional is used directly:

if (Input is AlkaliEarth)
{
    if (Input is Be)
    {
        if (Input has bonds with 2 Halogen)
        {
            CreateBond(Covalent)
        }
    }
    else
    {
        if (Input has bonds with 2 Halogen)
        {
            CreateBond(Ionic)
        }
    }
}

This may look annoying since we check the same condition in both branches, but it is effectively the same.
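To check that the two orderings really agree, both can be written out as ordinary functions. This is only a sketch: the Candidate struct, the Bond enum, and both function names are hypothetical, and "has bonds with 2 Halogen" is assumed here to mean at least two potential halogen bonds.

```cpp
#include <cassert>
#include <string>

enum class Bond { None, Ionic, Covalent };

// Hypothetical summary of the root element under consideration.
struct Candidate {
    std::string symbol;
    bool isAlkalineEarth;
    int halogenBonds;  // number of potential bonds to halogens
};

// First form: check for the two halogens, then pick the bond type.
Bond CheckHalogensFirst(const Candidate& in) {
    assert(in.halogenBonds >= 0);
    if (!in.isAlkalineEarth) return Bond::None;
    if (in.halogenBonds < 2) return Bond::None;
    return in.symbol == "Be" ? Bond::Covalent : Bond::Ionic;
}

// Second form: branch on Be first, then check the halogens in each branch.
Bond CheckBerylliumFirst(const Candidate& in) {
    assert(in.halogenBonds >= 0);
    if (!in.isAlkalineEarth) return Bond::None;
    if (in.symbol == "Be") {
        if (in.halogenBonds >= 2) return Bond::Covalent;
    } else {
        if (in.halogenBonds >= 2) return Bond::Ionic;
    }
    return Bond::None;
}
```

Both functions return the same result for every input. The second form maps more directly onto a node tree, because each branch only needs the element passed down from its parent.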

The possible compound definition looks like this:

namespace AlkaliEarth_2Halogen
{
    ElementGroupFilterNode halogenRoot(ALKALIEARTH); // Root filter: matches the alkaline earth element
    EitherOrNode or1; // Used as if-else for Be
        ElementSymbolFilterNode beFilter("Be");
            ElementGroupNode be_1(HALOGEN);
            ElementGroupNode be_2(HALOGEN);
        /* else */
        NoOpNode elseNode;
            ElementGroupNode else_1(HALOGEN);
            ElementGroupNode else_2(HALOGEN);

    void Initialize()
    {
        halogenRoot.AddChild(&or1);
        or1.AddChild(&beFilter);
        {
            //If the alkali earth metal is Beryllium, the bond will be covalent
            beFilter.AddChild(&be_1);
            beFilter.AddChild(&be_2);
            be_1.SetBondInfo(BondType_Covalent, 1);
            be_2.SetBondInfo(BondType_Covalent, 1);
        }
        or1.AddChild(&elseNode);
        {
            //If they are the other alkali earth metals, they will make an ionic bond
            elseNode.AddChild(&else_1);
            elseNode.AddChild(&else_2);
            else_1.SetBondInfo(BondType_Ionic);
            else_2.SetBondInfo(BondType_Ionic);
        }

        CompoundDatabaseRoot.AddChild(&halogenRoot);
    }
}

This one is obviously a bit more complex than the last one, but it will hopefully make more sense with a diagram of what is happening.

So in this example, beFilter is the condition of our "if statement" and or1 allows us to have more than one possible outcome for success.

Remember that all nodes can cause success and failure, so everything is a bool. All nodes (except EitherOrNode) act like a logical AND operator between the results of their children. Therefore, else_1 and else_2 would not act correctly if they were directly under or1. (If they were, only one halogen would be necessary to satisfy the possible compound when the root isn't Be. Additionally, only one of the two bonds would be created. This is very wrong.)

Another thing worth pointing out here is that bonds aren't applied by the logical nodes, but by the final be_# and else_# nodes. Remember, the logical nodes keep passing the input along, so the input to those final nodes is still the alkali earth from the beginning.

Example of compounds with long chains of elements

There are special considerations to be made for compounds where there is a chain of elements away from the root with a common ancestor. Take perchloric acid for example:

Let's say we do the human-logical thing and choose the chlorine as the root node. (Note that it is possible in this case to use the hydrogen as the root node and avoid the issue discussed here entirely, but that isn't the point.)

Here is the possible compound definition for perchloric acid:

namespace PerchloricAcid
{
    ElementSymbolFilterNode root("Cl");
    ElementSymbolNode node_1("O");
    ElementSymbolNode node_1_1("H");
    ElementSymbolNode node_2("O");
    ElementSymbolNode node_3("O");
    ElementSymbolNode node_4("O");

    void Initialize()
    {
        root.AddChild(&node_1);
        root.AddChild(&node_2);
        root.AddChild(&node_3);
        root.AddChild(&node_4);

        node_1.AddChild(&node_1_1);

        node_1.SetBondInfo(BondType_Covalent, 1);
        node_2.SetBondInfo(BondType_Covalent, 2);
        node_3.SetBondInfo(BondType_Covalent, 2);
        node_4.SetBondInfo(BondType_Covalent, 2);

        node_1_1.SetBondInfo(BondType_Covalent, 1);

        CompoundDatabaseRoot.AddChild(&root);
    }
}

The object diagram for this definition is as follows:

The fact that node_1 is the first child is very important. It is the longest chain starting with an oxygen that branches off the common root chlorine. If the longest chain isn't the first child, you can experience some erratic behavior with reaction processing. Sometimes a reaction will work, and sometimes it won't - seemingly at random. (In reality, the order is influenced by the cube IDs used for each element, the sides the elements are touching each other on, etc.)

If we moved node_2 to be the first child (as in the following diagram), it might take the oxygen that node_1 needs to use in order to find the hydrogen. This results in a failed reaction because the hydrogen is never found. (The reaction possibility engine is not like a regular expression engine, which backtracks and tries different permutations of mutually satisfied conditions.)

So if we have a reaction like this:

Where:

The green [Cl] is used as the root
The blue [O] is used for node_2
Each red [O] fails for node_1 because node_1_1 fails on each one.
The [H] is gray because it is never even considered by any node.

The point is: since the blue [O] is reserved by node_2, it is never considered for node_1, and therefore node_1_1 never sees the [H].
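The ordering problem can be reproduced with a small greedy matcher. This is a toy model under assumptions, not the engine itself: Atom, Pattern, MatchOne, MatchAll, and MatchPerchloricAcid are hypothetical names, the reserved flag stands in for the filter bits, and releasing a candidate whose children failed is an assumption of this sketch. The hydroxylSlot parameter models the order in which the chlorine happens to see its oxygens.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Atom {
    explicit Atom(std::string s) : symbol(std::move(s)) {}
    std::string symbol;
    std::vector<Atom*> neighbors;
    bool reserved = false;  // stand-in for the filter bits
};

struct Pattern {
    std::string symbol;
    std::vector<Pattern> children;
};

bool MatchAll(Atom& input, const std::vector<Pattern>& patterns);

// Tries each unreserved neighbor in order; keeps the first one that works.
bool MatchOne(Atom& input, const Pattern& pattern) {
    for (Atom* candidate : input.neighbors) {
        if (candidate->reserved || candidate->symbol != pattern.symbol) continue;
        candidate->reserved = true;
        if (MatchAll(*candidate, pattern.children)) return true;
        candidate->reserved = false;  // assumption: failed candidates are released
    }
    return false;
}

// Children are matched in order; earlier siblings are never revisited.
bool MatchAll(Atom& input, const std::vector<Pattern>& patterns) {
    for (const Pattern& pattern : patterns) {
        if (!MatchOne(input, pattern)) return false;
    }
    return true;
}

// Builds HClO4 with the H-bearing oxygen at position hydroxylSlot (0..3) in
// the chlorine's neighbor list, then matches with the chosen child order.
bool MatchPerchloricAcid(std::size_t hydroxylSlot, bool longestChainFirst) {
    assert(hydroxylSlot < 4);
    Atom cl("Cl"), h("H");
    std::vector<Atom> oxygens(4, Atom("O"));
    for (Atom& o : oxygens) {
        o.neighbors.push_back(&cl);
        cl.neighbors.push_back(&o);
    }
    oxygens[hydroxylSlot].neighbors.push_back(&h);
    h.neighbors.push_back(&oxygens[hydroxylSlot]);

    const Pattern plainO{"O", {}};
    const Pattern hydroxylO{"O", {Pattern{"H", {}}}};
    const std::vector<Pattern> children = longestChainFirst
        ? std::vector<Pattern>{hydroxylO, plainO, plainO, plainO}
        : std::vector<Pattern>{plainO, hydroxylO, plainO, plainO};

    cl.reserved = true;  // the root is already in use
    return MatchAll(cl, children);
}
```

With the longest chain first, the match succeeds wherever the H-bearing oxygen sits. With a plain oxygen first, it fails exactly when that oxygen is the first one the chlorine sees, which is the kind of order-dependent failure described above.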