IOEngineOverview - billbreit/BitWiseApps GitHub Wiki

Under Construction

This page describes a new piece of functionality that drives the actions of the IOMapper ( maybe in mid-alpha phase ).

IOEngine is something like an Event-Condition-Action ( ECA ) engine, a small implementation of a rule-based system ( RBS ). An ECA engine answers the question 'what do I need to do next ?'

Actions, not knowledge assertions, are the primary goal of ECA. It's more like a process engine than a 'rule engine', a much-overused term that, after four decades of 'rule engines', is virtually undefinable.

The design of the IOEngine class limits computational burden by shifting all state-changing actions to IOMapper. This allows for optimizations using fast bitwise operations for pattern matching rather than navigating through a complex network structure of interlocking nodes. IOEngine doesn't do anything other than apply logic rules: it makes assertions about relationships, such as X == Y. Things that get 'done', like action X.update( Y ), are delegated to IOMapper.

In fact, bitwise 'binary indexing' is often used to optimize general-purpose reasoners - the difference between this approach and node-based implementations is that here bitwise operations are the fundamental driver of pattern matching and logical deduction, albeit within limited domains.
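
As a rough illustration of the idea ( the names rules and matching_rules and the bit layout are hypothetical, not the actual IOEngine API ), each condition in a rule set can be assigned one bit position, and a rule matches when all of its condition bits are set:

# Hypothetical sketch of bitwise rule matching.
rules = {
    # rule name -> bitmask of required conditions
    'rule_a': 0b0011,   # needs conditions 0 and 1
    'rule_b': 0b0101,   # needs conditions 0 and 2
}

def matching_rules(cond_state):
    """Return the rules whose conditions are all currently True.
       Bit i of cond_state is 1 when condition i evaluates True."""
    return [name for name, mask in rules.items()
            if cond_state & mask == mask]

print(matching_rules(0b0111))   # ['rule_a', 'rule_b']
print(matching_rules(0b0011))   # ['rule_a']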

Limitations

Many full-featured rule engine implementations can boast decent performance with several thousand rules, usually modeling complex business processes in one massive rule base and requiring powerful hardware and business rule management software to run well. The IOEngine can support about 100 rules per rule set - there can be an unlimited number of rule sets in a rule base.

The basic algorithmic characteristic of a full-featured rule engine is something like log N ( where N = number of rules ), which is good. The IOEngine is something like N^2, which is bad, but the bitwise matching part of the performance equation carries a constant factor of about 1/100, so the total is roughly N^2/100. That doesn't change the quadratic shape of the curve, but it scales the whole curve down enough to allow a higher level of performance than the basic N^2 would suggest: at N = 100 rules, N^2/100 works out to about 100 effective operations per cycle.

Brute force can work as well as sophisticated algorithms ... up to about 100 rules in most environments. In practice, a rule base of several hundred rules can quickly become intractable without sophisticated rule management software. Proper rule base design can help simplify the task of navigating through a giant morass of thousands of cross-references: by combining repetitive sequences of conditions shared between rules into 'definitional' rules, or by using a rules table/tableau to represent tightly structured logic, something like insurance tables or schedules.

Drools is a great open-source, open-licence 'big ticket' solution, with a long track record, a huge user base, etc. It's also very big: it needs 4GB of memory to run a demo, 8GB to run well.

Given the limitations of IOEngine, a small set of 20-30 rules ( plus a few device imports, etc. ) can run reasonably well on a Raspberry Pi Pico with 256K of memory ... no expensive nodes to maintain, only interlocking structures of bit integers. IOEngine is a node-free zone !

First the Worst: Rule Base Normalization

At this point one might say, "wow, small, fast, what else could I ask for ?" ... maybe ease of use ?

IOEngine rule encoding requires what I call 'rule base normalization', although various search engines don't seem to have heard of the term ... seems pretty clear to me ... the closest reference I've found is Wikipedia's article on Rule of Replacement.

The normalized rule form is simply a list of conditions, each a relation between things that must hold in order to assert that some other relationship also holds, written in a form that is easier to evaluate and faster to run.

To do so, one must manipulate the logic of the rule form to produce a list of AND conditions ( no ORs or NOTs ) of the general form A AND B AND C, etc. The steps for normalization of a rule set consist of:

  • NOT Elimination: Invert to a negated relationship
NOT A == B                ==> A != B  THEN assert X
NOT ( A == B OR C == D )  ==> A != B AND C != D  THEN assert X   # De Morgan
NOT ( A == B AND C == D ) ==> A != B OR C != D  THEN assert X    # De Morgan
  • OR Elimination: Split Rule or Lookup
A == B OR C == D         ==> split rule, two ways to solve:
                             IF A == B THEN assert X
                             IF C == D THEN assert X

A == B OR A == C         ==> lookup in value list   # common A variable
                             IF A in [ B, C ] THEN assert X
  • Parenthesis Elimination ( comes for free with NOT and OR elimination )

The ELSE construct is not supported by rule-based systems in general. Instead, create an explicit ELSE rule with normalized, negated conditions.
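
For example, a hypothetical IF/ELSE pair becomes two independent rules, the second carrying the negated condition:

IF A == B
THEN
   X == Y

IF A != B     # the explicit 'ELSE' rule
THEN
   X == Z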

One winds up with a simplified list containing only AND conditions; each item in the list is one test of the conjunction.
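
To make the target form concrete, here is a minimal Python sketch of a normalized rule represented as plain data ( the names rule_1, OPS and conditions_hold are illustrative, not the actual IOEngine encoding ):

# Illustrative only: a normalized rule as plain data - a flat list of
# AND-ed ( lhs, op, rhs ) conditions plus a single assertion.
rule_1 = { 'if':   [ ('A', '==', 'B'),
                     ('B', '!=', 'C'),
                     ('D', '!=', 'E') ],
           'then': ('X', '==', 'Y') }

OPS = { '==': lambda a, b: a == b,
        '!=': lambda a, b: a != b }

def conditions_hold(rule, values):
    """True when every AND condition holds for the given bindings."""
    return all(OPS[op](values[lhs], values[rhs])
               for lhs, op, rhs in rule['if'])

print(conditions_hold(rule_1, {'A': 1, 'B': 1, 'C': 2, 'D': 3, 'E': 4}))  # True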

For example:

'Human Readable' Rule 1:

IF A == B AND
   NOT ( B == C OR D == E )
THEN
   X == Y

Normalized:

IF A == B AND
   B != C AND
   D != E
THEN
   X == Y

'Human Readable' Rule 2:

IF A == B AND
   NOT ( B == C AND D == E )
THEN
   X == Y

Normalized: Slightly tricky. The De Morgan form generates B != C OR D != E, so the rule must be split into two normalized rules.

IF A == B AND
   B != C 
THEN
   X == Y

IF A == B AND
   D != E
THEN
   X == Y

Tricky Situations

Suppose the original form of the rule had been:

IF A == B AND
   NOT ( B == C AND B == D )  # with common B value
THEN
   X == Y

It looks OK on the face of it, but what's variable B doing in there ? If A == B, then why not just test NOT ( A == C AND A == D ) ? And why must B be equal to ( or not equal to ) two values at the same time ? Even without knowing what A, B, C and D mean, the way the rule is constructed makes it difficult to see what it's trying to say. On the other hand, sometimes an anomalous rule structure of this sort functions as a formal definition of a legal or business term. But even then, when the form of a rule looks this strange, the entire definition may need rethinking.

In fact, there may be a significant amount of rule analysis work to be done before even attempting 'rule normalization'. Think of rule analysis as a database-like normalization phase that precedes generating the performance-enhanced, rule-normalized form of the rule base that can actually be deployed on small platforms.

Returning to the normalized form B != C OR B != D ( common B value ), it is tempting to express it as a single lookup within the rule, that is B not_in [ C, D ], but that would be incorrect, although the error is difficult to see. Sometimes a small truth table can solve the logical mystery. [ Note the mixing of values with booleans - the values on the left are the sets of generated match conditions. ]

B  C  D      ( B == C AND B == D )      NOT ( B == C AND B == D )
1  1  1                1                            0
1  1  0                0                            1

... etc ... all ones in the evaluated NOT expression ...

1  0  0                0                            1
0  0  0                1                            0

Only the combinations ( 0, 0, 0 ) and ( 1, 1, 1 ) produce a final evaluation of False; the rest are all True.

Compare this to the B not_in [ C, D ] relationship, that is NOT ( B in [ C, D ] ).

B  C  D      B not_in [ C, D ]
1  1  1              0
1  1  0              0

... whoops ... already in trouble ...

1  0  0              1
0  0  0              0

Only the combinations ( 1, 0, 0 ) and ( 0, 1, 1 ) produce a boolean value of True; the rest are False. Not what we want.

From this one might conclude that B in [ C, D ] is not the same as NOT ( B not_in [ C, D ] ), but in fact they are the same. The problem is the logical mode of the 'not_in' relation: it produces an AND ( B != C AND B != D ) where the normalized form requires an OR ( B != C OR B != D ). Reverting to the split two-rule form is an option, but there needs to be an explicit B notall_eq [ C, D, E ] relationship to implement B != C OR B != D OR B != E lookup expressions, which after all only asserts the negation of "all values are the same".
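
A brute-force check in plain Python confirms the difference between the two forms ( the names target and lookup are just for the demonstration ):

from itertools import product

# Enumerate all 0/1 combinations and compare the normalized target
# ( B != C OR B != D ) against the not_in lookup ( B != C AND B != D ).
for b, c, d in product((0, 1), repeat=3):
    target = (b != c) or (b != d)        # NOT ( B == C AND B == D )
    lookup = b not in (c, d)             # B not_in [ C, D ]
    mark = '' if target == lookup else '   <-- differs'
    print(b, c, d, int(target), int(lookup), mark)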

So a rule of the form:

IF A == B AND
   NOT ( B == C AND B == D )  # with common B value
THEN
   X == Y

would be expressed as:

IF A == B AND
   B notall_eq [ C, D ]
THEN
   X == Y

rather than splitting the rule implementation into two normalized rules.
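
Such a relationship is a one-liner to implement; a minimal sketch, assuming a simple Python relation function rather than the actual IOMapper relation set:

def notall_eq(b, values):
    """True when b differs from at least one value in the list,
       i.e. B != C OR B != D OR ... - the negation of 'all equal'."""
    return any(b != v for v in values)

print(notall_eq(1, [1, 1]))   # False : B == C AND B == D
print(notall_eq(1, [1, 0]))   # True  : B != D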

Other specialized relationships may need to be created to make rule normalization a little bit easier.

The NOT ( B == C AND B == D ) exclusion pattern is similar to the 'nomatch' logic of an XOR expression ( which can be read as 'one of' ), which often presents unique problems for rule analysis and design.

In general form ( boolean evaluations now, not match patterns ):

XOR(B, C) ==> ( B OR C ) AND NOT ( B AND C )

B  C      XOR( B, C )
0  0          0
0  1          1
1  0          1
1  1          0

Only ( 1, 0 ) and ( 0, 1 ) result in an evaluation of True. The 'one of' reading allows a simple extension: for XOR( B, C, D ), only the values ( 1, 0, 0 ), ( 0, 1, 0 ) and ( 0, 0, 1 ) are True, the rest are False. In fact, XOR is not really a relationship in itself; it's a rule governing other relationships, either one xor the other. It often appears in the 'conflict detection' phase of the rule engine cycle.
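
Under the 'one of' reading, the n-ary extension is a count rather than a chained bitwise XOR ( chaining ^ computes parity, which is True for any odd number of True inputs ); a small sketch with a hypothetical one_of helper:

def one_of(*conds):
    """The 'one of' reading of XOR: exactly one condition is True."""
    return sum(1 for c in conds if c) == 1

print(one_of(1, 0, 0))   # True
print(one_of(1, 1, 0))   # False
print(one_of(1, 1, 1))   # False - note: chained 1 ^ 1 ^ 1 gives 1 ( parity )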

Conflict Detection and Resolution

In the rules above, we asserted that X == Y is True. But what if X == Y is False in the outside world ? What happens then ?

In rule-based systems, this is called "conflict detection and resolution", and it is a major feature of RBS that distinguishes it from other implementations of logic engines. In fact, a detected conflict can be the trigger that causes an action to be performed: setting the value of X to Y is popular, but there may be others, such as generating a nasty message to accounting or whatever.
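
A minimal sketch of the detect-and-resolve step, with hypothetical names ( in the real design, the triggered action would be delegated to IOMapper ):

def resolve(name, asserted, actual, on_conflict):
    """Fire the registered action when the asserted value disagrees
       with the outside-world value."""
    if asserted != actual:
        on_conflict(name, asserted, actual)

def set_to_asserted(name, asserted, actual):
    # one popular resolution: force the world to match the assertion
    print('conflict on', name, ': setting', actual, '->', asserted)

resolve('X', 'Y', 'Z', set_to_asserted)   # conflict on X : setting Z -> Y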

Many thorny issues of rule base design have been addressed and solved over the course of decades. Much of this rule engine wisdom of the ages is preserved in the history of the 'Rete Algorithm', the mother of all rule engines.

What is Rete ?

Many rule-based systems are general-purpose reasoners, for example making deductions from a set of animal classification rules and the fact that X has 4 legs, a long tail, a coat of fur and likes to eat mice. This process is called 'forward chaining' and is central to all rule-based systems.
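
Forward chaining itself is just a fixed-point loop over the rule set; a toy sketch, independent of any Rete machinery, with made-up facts and rules:

# Toy rules: ( set of required facts, fact to assert when they all hold ).
rules = [ ({'legs_4', 'long_tail', 'fur', 'eats_mice'}, 'cat'),
          ({'cat'}, 'mammal') ]

def forward_chain(facts):
    """Apply rules repeatedly until no new fact can be asserted."""
    facts, changed = set(facts), True
    while changed:
        changed = False
        for required, conclusion in rules:
            if required <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({'legs_4', 'long_tail', 'fur', 'eats_mice'}))
# -> the input facts plus 'cat' and 'mammal'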

The Rete Algorithm is a design framework for efficient forward chaining through a network of logical expressions and dependencies on a set of data values ( 'facts' ). It simulates deductive reasoning such as "Socrates is a man, all men are mortal, therefore ...".

The main goal of Rete is efficiency and all that implies: never do anything until it is required, i.e. 'bind on demand'. In most implementations, successful tests on nodes are encoded by 'tokens' which propagate through the network of nodes and eventually result in 'activations' of actions or assertions, such as 'ship_customer_order' or X == 'mammal' is True. This represents the state of the system carried forward from one iteration to the next.

Rete Overview

"Big Rete" implementations are very dynamic: alpha and beta nodes can be created and deleted or asserted and retracted at will, changing the configuration and state of the rule base. Suppose a rule appears in the form:

IF legs = 4 AND
   tails = 1 AND
   sound = 'barks' AND
   coat = 'furry'
THEN
  animal_type = 'dog'

and the example 'Fido' has only 3 legs. A Big Rete implementation with 'truth maintenance' functionality could dynamically create a new rule for Fido:

IF legs = 3 AND
   dog_name in known_3_legged_dogs     # where known_3_legged_dogs = ['Fido']
THEN
  animal_type = 'dog'

The big implementations will often be able to 'backtrack' into a prompt, asking the user if the unknown animal with 3 legs is named 'Fido'. It's all very dynamic and computationally expensive.

IOEngine essentially retains the alpha/beta structure of the classic Rete network diagram, representing the network as a structure of bit integers. The IOMapper 'read' functions effectively pre-compute all queries, formulas, etc. on the right-hand side of that diagram on the first pass and then only re-compute net changes on subsequent passes. Using 'bind-first' rather than 'bind-on-demand' is inherently less efficient, but for a limited ECA-type process engine with 100 rules and maybe 30-40 values to bind, it doesn't seem to matter that much. Most source values are probably going to change anyway; certainly they will in a transaction-type application. [ Needs testing. ]
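
A rough sketch of the net-change idea, using a bit integer to mark which conditions were touched since the last pass ( illustrative names only, not the actual IOMapper interface ):

# Each rule's mask marks the conditions it depends on; only rules whose
# mask overlaps the changed bits need to be re-tested on this pass.
rules = { 'rule_a': 0b0011,
          'rule_b': 0b1100 }

def rules_to_retest(changed_bits):
    """Select the rules affected by conditions changed since last pass."""
    return [name for name, mask in rules.items() if mask & changed_bits]

print(rules_to_retest(0b0001))   # ['rule_a'] - only condition 0 changed
print(rules_to_retest(0b0110))   # ['rule_a', 'rule_b']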

Not shown in the diagram are the embedded queries and computations in beta nodes, such as Socrates.species = ?, and other functions needed to evaluate and bind values in logical expressions. Rete engines often embed database-like SELECT and JOIN queries directly into the network. The IOEngine may need to do a limited 'bind-on-demand' for computationally and IO-expensive queries.

Note that the query might return Socrates.species = 'human', which would fail the "Socrates is a man" test. What if Socrates were a woman, how would that work ? For example, a 'definitional' rule such as IF X.species == 'human' => X.man = True would work for both a man and a woman, although X.gender might be 'man' or 'woman'. In practice, this kind of semantic jostling is more than common: it's the norm. Even an ECA engine as primitive as IOEngine may need to account for the phenomenon with 'definitional' rules.

Many of the challenges addressed by Rete are the same problems facing the IOEngine or any other logic engine - how to keep the computational burden of syncing data values and their evaluations to a minimum. Rete is particularly efficient where few data values change between one cycle and the next, more like monitoring scenarios than transaction scenarios, such as order entry where every transaction is a different customer and the node network has to be updated or recreated from scratch.

The IOEngine class will ( or should, at this point in the project ) be able to implement a static rule-based transaction engine more efficiently than an equivalent node-based monitoring engine, for small rule sets of about 100 rules. Even for monitoring applications, it should ( pretty sure it will ) consume much less memory.

A good 'Pythonic' reference model is PyRete ( last updated 7 years ago ), much more concrete and informative than pages of Rete design principles, although the IBM RetePlus Overview and various other 'big ticket' systems are also good places to start. All these companies are big and very market-driven, so to avoid layers ( and minutes ) of marketing re-routing, just google "ibm rete algorithm" or whatever.

Another good source is Sparkling Logic's Rete documents, with no marketing hype. The article How the Rete Algorithm Works has an excellent example.

For the really hardcore, there's the Drools Rules documentation.

Some implementations of Prolog do something like 'rule normalization', Amzi Prolog for instance.

Using Semantic Frameworks

  • Structure/Function/Behavior ( SFB ) Models
  • Ontology ( W6 )
