Using the Stack Idiomatically in Pop 11 - GetPoplog/Seed GitHub Wiki
By and large, a Pop-11 programmer isn't aware or or thinking about the open value-stack. This is because they are intuitively following a very natural discipline that I call 'good stack citizenship'. In this note I explain a little about what this is and why it works.
A procedure is a good stack citizen when:
- On entry, it picks all its arguments off the stack and binds them to local variables
- It only invokes other procedures with appropriate arguments (right number, types etc)
- Items are only pushed on the stack because EITHER
- They are arguments to function calls or assignments
- They are intended to be return values
Some procedures and syntax forms take a variable number of parameters. For example the list-brackets construct a list of all the values that get pushed on the stack between the brackets [
and ]
. Also consstring(char1, char2, ..., charN, N) -> string
constructs a string from a counted set of characters. And using or writing these kind of procedures are the main occasions when needs to think about using the stack.
For example, this is the most natural way to code a list comprehension:
define remove_negative_numbers( L );
[%
lvars i;
for i in L do
if i >= 0 then
i
endif
endfor
%]
enddefine;
In this style, it is obviously true that a well-behaved function call does not peek at values of the stack that do not belong to them. So if a function has signature (say) T, U -> W
then it also has signature Item*, T, U -> Item*, W
. In other words, it is valid to see those excess input values as just extra return values.
This argument introduces a somewhat grey area. This is when we deliberately pass excess inputs arguments are to a function with the intention of them being additional return values. For example, the operator A // B returns two values: divid // divis -> (rem, quot)
. So one might write this expression with the intention of returning A rem B; A quo B + 1
.
( A // B ) + 1
Some Pop-11 programmers would elect to write this in strict good citizenship style:
lvars ( R, Q ) = A // B;
( R, Q + 1 )
An alternative way to understand the notion of good stack-citizenship is via the (imaginary) notion of nested stack-locking. When we lock the stack, no one can peek at the locked items until it is unlocked. A good stack-citizen obeys the following locking rules:
- Calling a procedure means [1] locking the stack [2] pushing the arguments [3] calling the procedure [4] unlocking the stack.
- And the first thing a procedure does is [1] binds its arguments [2] clears the stack back to the lock [3] runs its body.
Let's look at an idiom that violates the rules. The procedure applist( L, p )
applies the procedure p to each element of L, much like L.forEach( p )
in Java. Here's an idiom for summing a list:
applist( 0, L, nonop + ) ;;; nonop + = use `+` as an ordinary (non-operator) variable
This idiom fails the rule that the procedure should remove from the stack all the arguments it is passed - because it is passed an extra 0 argument that it will use. During the execution of the body, the top-of-stack is treated as a running variable. It happens to work because applist( L, P )
is implemented to be the equivalent of calling P(L1); P(L2); ... ; P(Ln)
. And not something oddball but also correct such as, say, Ln; ....; L2; L1; P(); P(); ...; P()
.
For good stack citizens, the basic question is how we can tell what is a return value from a procedure and what is not? The answer is that expressions are always either an argument to a form/function-call (or syntax-form) OR they are a return-expression.
define foo( i );
if test( i ) then ;;; test(i) is an argument to the if-form
f( i ) ;;; f(i) is a return-value
else
f( g(i) ) ;;; g(i) is an argument to f, and f(g(i)) is the return-expression
endif
enddefine;
This introduces the concept of being called in a return-position. It is important to know whether or not a procedure is called in return-position. Part of good stack citizenship is being clear on whether a procedure is called in return-position or not i.e. whether any 'excess' values will be treated as valid return values from the main procedure. Armed with this notion, we can now see that applist
guarantees to call its procedure in return-position.
This is what powers a common idiom for stringifying items. Imperative, duck-typed languages like Pop-11 and Python have to solve a basic issue: how to keep the way objects are stringified consistent with how they are printed. In Python the decision was to make stringification the primitive and derive printing from that. In Pop-11 the decision is to make printing the primitive. This is enabled through the key idiom of defining printing in terms of calling a dlocalisable variable cucharout.
So in order to stringify an item X we locally redefine cucharout. What do we define it as? Well the simplest approach is to define it as identfn, the do-nothing procedure. Provided that cucharout
is always called in return-position, it will just drops the characters to print onto the stack, where they can be gathered into a string.
dlocal cucharout = identfn;
lvars N = stacklength();
pr( X ); ;;; drops character codes on the stack
consstring( stacklength() - N );
Idiomatically we would always rewrite these using the "counting" brackets #| STMNTS |#
. These leave a count of the number of items that STMNTS leaves on the stack (on top of all the items).
dlocal cucharout = identfn;
consstring(#| pr(X) |#)
So now we know this idiom for stringification, let's look at the two forms of printf. The printf procedure is designed to work a bit like the C printf function. The easy to understand form is this:
printf(<string>, [<item1> <item2> ... <itemN>]);
It takes a control-string and marches along it looking for the magic '%' character. When it find '%' it uses the next character to determine the format and then prints the next item in the list and advances the list. Simple enough.
The second form looks like this:
printf(<itemN>, ..., <item2>, <item1>, <string>);
This version marches along the control-string and picks arguments off the stack as it goes. Of course you have to put the items on the stack in the reverse order. However this second form is not a good stack-citizen. It does not pick up its arguments on entry but leaves them littering the stack. As a consequence, when our redefined cucharout drops values on the stack they will horribly interfere with the arguments that are left on the stack. In fact the second version of printf is incompatible with this idiom for stringification.
We can look at this in three ways:
- Redefining cucharout (signature: int -> ()) as identfn (signature Item -> Item) is a bad idea.
- Complain that sprintf/2 is a bad stack citizen and need rewriting.
- Concede that there is no contractual guarantee that
cucharout
is called in a return-position and therefore the change of signature is not supported.