Code Planting Tutorial Part 4 - GetPoplog/Seed GitHub Wiki

Step 4: Short-circuit logic

In this tutorial it helps to know that:

  • Pop-11 uses words rather than strings to represent symbols. The difference is that words with the same spelling are identical (c.f. interned strings) whereas there can be many copies of a string with a given spelling.
  • In Pop-11, the inside of lists [ .... ] introduces a special context. Lisp programmers will know this as quasi-quoting.
  • Inside a list, identifiers are treated as words and do not have their normal meaning. e.g. [ cat dog ] is a list with two words "cat" and "dog".
  • To get the value of a variable into a list, use the ^ operator. e.g. [ cat ^dog ] is a list with the word cat followed by the value of the variables dog.

Introduction

The and and or operators are directly implemented by the code-planting procedures sysAND and sysOR. This is a nice convenience that saves us from inefficient pushing and popping from variables. The formal descriptions of these procedures is quite hard to follow though and I won't list them here. However their usage is very simple and the best way to understand them. This is how to implement E1 and E2:

lvars done_label = sysNEW_LABEL();
plantCodeFor( E1 );                           ;;;   E1
sysAND( done_label );                         ;;;   AND on false goto done_label
plantCodeFor( E2 );                           ;;;   E2
sysLABEL( done_label );                       ;;; done_label:

And it's just as easy to implement E1 or E2:

lvars done_label = sysNEW_LABEL();
plantCodeFor( E1 );                           ;;;   E1
sysOR( done_label );                          ;;;   OR on true goto done_label
plantCodeFor( E2 );                           ;;;   E2
sysLABEL( done_label );                       ;;; done_label:

Example: Compiling a query-tree

The Scenario

At this point we're ready to take on a more challenging example and one that's more realistic in terms of when we use code-planting. The scenario that we will develop is a query-tree. This particular scenario crops up quite often in the kind of web-applications that I have worked on over the past ten years, where an administrator can configure a complex data filter on a table of records involving and/or/not and various comparisons. We won't do the whole problem because this is only a tutorial but we will just concentrate on turning this query tree into a filter that can be applied to a single record.

We will start from the position where we have a tree-like representation of the query in memory. To keep it easy to follow, I am going to use nested lists for this. Here's the representation:

  • Properties will have been mapped into a procedure that takes a record and returns a value
  • An equality of a property P to a value V is [ = ^P ^V ], other operators are <, >, <=, >= and /=
  • T and U is the list [ and ^T ^U ]
  • T or U is the list [ or ^T ^U ]

(The assumption that properties have been mapped to procedures is a big simplification that we make just to keep this example short enough to follow - sorry about that!)

The Strategy

We will generate a procedure with a single argument called "the_record". This will be popped off the stack as the first activity. Then we recursively compile the query-tree, compiling each part into a boolean expression that can access "the_record".

  • We will call properties directly using sysCALLQ that allows us to invoke procedure values directly.
  • We will translate operators into simple calls of the corresponding Pop-11 variables.
  • We will use sysAND and sysOR to implement and and or, recursively compiling the left and right sides in the natural way. Because they are compiled in such a similar way we will parameterise a single short-circuit helper procedure.

Code

Our entry point will be plantQueryTree, which will do that pop into "the_record".

define plantQueryTree( tree ); lvars tree;
    sysLVARS( "the_record", 0 );             ;;; declare our single variable.
    sysPOP( "the_record" );                  ;;; and assign it from the stack.
    plantQTree( tree );                      ;;; then compile the tree, knowing the variable is bound
enddefine;

So our first task is to determine which case we have and dispatch to helper procedures:

define plantQTree( tree ); lvars tree;
    lvars ( op, args ) = tree.dest;
    if op == "and" then
        plantShortCircuit( sysAND, args.dest.hd )
    elseif op == "or" then
        plantShortCircuit( sysOR, args.dest.hd )
    elseif member( op, [ = < > <= >= /= ] ) then
        plantComparison( op, args.dest.hd )
    else
        mishap( 'Unknown operator', [ ^op ] )
    endif
enddefine;

The plantShortCircuit procedure is a simple rewrite of the templates shown above:

define plantShortCircuit( shortCircuit, T, U ); lvars procedure shortCircuit; lvars T, U;
    lvars done_label = sysNEW_LABEL();
    plantQTree( T );
    shortCircuit( done_label );
    plantQTree( U );
    sysLABEL( done_label );
enddefine;

Compiling a comparison such as [ = ^doIt 99 ] needs to be translated into something like this pseudo-assembler:

PUSH the_record           ;;; the_record is a variable, so we access it via sysPUSH
CALLQ the value of doIt   ;;; we have mapped doIt to a procedure value, so we sysCALLQ it
PUSHQ 99                  
CALL =                    ;;; the operator is a word, so we use sysCALL not sysCALLQ

So this becomes:

define plantComparison( op, property, value );
    sysPUSH( "the_record" );
    sysCALLQ( property );
    sysPUSHQ( value );
    sysCALL( op );
enddefine;

And that's it.

Put it all together

We'll bring all the code together for this example:

define buildNewProcedure( code_planter ); lvars procedure code_planter;
    procedure();
        sysPROCEDURE( false, 0 );
        code_planter();
        sysPUSHQ( sysENDPROCEDURE() );
        sysEXECUTE()
    endprocedure.sysCOMPILE    
enddefine;

;;; Forward declaration.
vars procedure plantQTree;

define plantShortCircuit( shortCircuit, T, U ); lvars procedure shortCircuit; lvars T, U;
    lvars done_label = sysNEW_LABEL();
    plantQTree( T );
    shortCircuit( done_label );
    plantQTree( U );
    sysLABEL( done_label );
enddefine;

define plantComparison( op, property, value );
    sysPUSH( "the_record" );
    sysCALLQ( property );
    sysPUSHQ( value );
    sysCALL( op );
enddefine;

define plantQTree( tree ); lvars tree;
    lvars ( op, args ) = tree.dest;
    if op == "and" then
        plantShortCircuit( sysAND, args.dest.hd )
    elseif op == "or" then
        plantShortCircuit( sysOR, args.dest.hd )
    elseif member( op, [ = < > <= >= /= ] ) then
        plantComparison( op, args.dest.hd )
    else
        mishap( 'Unknown operator', [ ^op ] )
    endif
enddefine;

define plantQueryTree( tree ); lvars tree;
    buildNewProcedure(
        procedure();
            sysLVARS( "the_record", 0 );             ;;; declare our single variable.
            sysPOP( "the_record" );                  ;;; and assign it from the stack.
            plantQTree( tree );                      ;;; then compile the tree, knowing the variable is bound
        endprocedure
    )
enddefine;

And here's some simple test data:

vars example_tree = (
    [ and 
        [ and
            [ >= ^identfn 3 ]     ;;; x >= 3
            [ <= ^identfn 8 ]     ;;; x <= 8
        ]
        [ = ^(nonop &&(%1%)) 1 ]  ;;; must be odd, n.b. && is bitwise AND
    ]
);

And here's an interactive session trying out the compiled query on the list of single-digit numbers, successfully finding the odd digits between 3 and 8:

: lvars queryp = plantQueryTree( example_tree );
: lvars i; for i in [ 0 1 2 3 4 5 6 7 8 9 ] do if queryp(i) then i => endif endfor;
** 3
** 5
** 7
:

Reflection - what did this accomplish?

The first question is whether code-planting was essential to implement this? And the answer is plainly no. In fact a moment's reflection should convince you that code-planting cannot extend the power of the abstract machine - because we can always simulate it! And here's a naive alternative implementation:

define evalQuery( the_record, tree ); lvars the_record, tree;
    lvars ( op, arg1, arg2 ) = tree.dest.dest.hd;
    if op == "and" then
        evalQuery( the_record, arg1 ) and evalQuery( the_record, arg2 )
    elseif op == "or" then
        evalQuery( the_record, arg1 ) or evalQuery( the_record, arg2 )
    elseif member( op, [ = < > <= >= /= ] ) then
        valof( op )( arg1( the_record ), arg2 )
    else
        mishap( 'Unknown operator', [ ^op ] )
    endif
enddefine;

So the next question is whether our code-planting version is faster and/or more compact? Well, assuming we apply this procedure to a reasonably large data (otherwise we don't care about performance anyway) then is clearly much faster than the naive alternative. That is because it avoids the overhead of destructuring the tree and the runtime dispatch on the operator symbols. And the native implementation is much worse than you might think because of the expression [ = < > <= >= /= ] constructs the 6-element list each time!

But would it be possible to write an equally efficient version without code-planting? That's a good question! Can you can come up with a solution that is as good as the code-planting version just using normal functional-programming techniques? When you have picked your answer, turn here to find out more.

The third and final question is whether our code-planting version is a readable and maintainable way to achieve good performance? To a degree this is subjective, of course. But the abstract machine instructions are easy to understand and compose well together and by emitting to a hidden code-stream, rather like printing to a file, it avoids the complication of passing partially built objects around. So, for a good variety of problems, this Builder-like pattern is effective.

All of which brings us to the following conclusion, which is that code-planting is a heavy-duty powertool in our toolbox. It is reasonably straightforward to drive and, with a little care, delivers the best performance that Poplog is capable of. And though we can often get good enough performance through composition with higher-order procedures, code-planting is often a simpler answer and always at least as efficient.

Next Step

⚠️ **GitHub.com Fallback** ⚠️