Closures in zeptoforth - tabemann/zeptoforth GitHub Wiki

zeptoforth, as of release 0.52.0, includes support for closures. Due to the nature of the Forth language, these are not like closures in languages such as Haskell or Scheme: there is no automatic name binding, and one has to manually manage the memory taken up by the closure. Note that while zeptoforth release 0.52.1 supports closures with arbitrary numbers of arguments, for technical reasons it is more memory-efficient to manually construct a data structure and have a single-cell closure point to it.

Closures themselves are data structures whose addresses serve as execution tokens, and which can be arbitrarily overwritten or reused. They can even be overwritten or reused within the code to which they point. They can be located anywhere in RAM: in the dictionary, in a memory pool, in a heap, in an array, in a structure, in an object, or in any other kind of data structure. There are only two requirements. First, enough space must be available: closure-size bytes for single-cell closures, 2closure-size bytes for double-cell closures, or the number of bytes returned by nclosure-size for a given number of cells for multi-cell closures. Second, they must be suitably aligned: halfword-aligned for single-cell and double-cell closures, and cell-aligned for multi-cell closures.
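As an illustration of placing closures in an array, the following is a minimal sketch rather than anything taken from the zeptoforth sources. The names closure-stride, scaler-count, scalers, nth-scaler, set-scaler, and scale are hypothetical. The stride is rounded up to an even number of bytes purely as a precaution to keep every element halfword-aligned; closure-size may well already be even. It also assumes, as the other examples on this page do, that buffer: yields a suitably aligned address.

```forth
closure import

\ Distance in bytes between array elements: closure-size rounded up to an
\ even number so that every element stays halfword-aligned (a precaution,
\ not something the closure module is known to require)
closure-size 1+ -2 and constant closure-stride

4 constant scaler-count
closure-stride scaler-count * buffer: scalers

\ Get the address of the closure at a given index in the array
: nth-scaler ( index -- addr ) closure-stride * scalers + ;

\ Bind a multiplier to the closure at a given index
: set-scaler ( n index -- ) nth-scaler ['] * bind ;

\ Execute the closure at a given index on a value
: scale ( n index -- n' ) nth-scaler execute ;

2 0 set-scaler
3 1 set-scaler
5 0 scale .  \ prints 10
5 1 scale .  \ prints 15
```

Each element of the array is an independent closure, so rebinding one index leaves the others untouched.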

A simple example of the use of a single-cell closure is as follows:

closure import  ok
closure-size buffer: my-adder  ok
: set-add ( n -- ) my-adder ['] + bind ;  ok
: add ( n -- n' ) my-adder execute ;  ok
0 set-add  ok
1 add . 1  ok
2 add . 2  ok
1 set-add  ok
1 add . 2  ok
2 add . 3  ok
2 set-add  ok
1 add . 3  ok
2 add . 4  ok

Here we allot space for a single-cell closure at my-adder. set-add takes a value and binds it with +, creating a closure at my-adder. add executes the address of my-adder as an execution token; the closure pushes the bound value and then executes +, so add takes a value and adds to it the value bound by set-add. Each call to set-add overwrites the previous closure at my-adder, rebinding + to a new value. We can see this in how the result of add for a given argument changes whenever we change the value bound with set-add.
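Because + is commutative, the example above does not show where the bound value ends up relative to the argument. The following sketch, using the hypothetical names my-subtracter, set-sub, and sub, binds - instead. It assumes, consistently with the other examples on this page, that the bound value is pushed on top of whatever is already on the stack before the bound word is executed.

```forth
closure import
closure-size buffer: my-subtracter

\ Bind a value to - so that it is subtracted from the argument
: set-sub ( n -- ) my-subtracter ['] - bind ;

\ Execute the closure on an argument
: sub ( n -- n' ) my-subtracter execute ;

3 set-sub
10 sub .  \ prints 7, i.e. 10 - 3: the bound 3 is the top stack item
```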

A simple example of the use of double-cell closures is as follows:

closure import  ok
2closure-size buffer: my-typer  ok
: set-type ( c-addr u -- ) my-typer ['] type 2bind ;  ok
: do-type ( -- ) my-typer execute ;  ok
: set-foo s" FOO" set-type ;  ok
: set-bar s" BAR" set-type ;  ok
: set-baz s" BAZ" set-type ;  ok
set-foo  ok
do-type FOO ok
set-bar  ok
do-type BAR ok
set-baz  ok
do-type BAZ ok
set-bar  ok
do-type BAR ok
set-foo  ok
do-type FOO ok

Here we allot space for a double-cell closure at my-typer. set-type takes a string (as an address and length) and binds it with type as the closure my-typer. do-type executes the address of my-typer as an execution token. set-foo, set-bar, and set-baz call set-type to (re)bind type with different strings (note that in this case the strings cannot be specified in interpretation mode, because strings entered in interpretation mode are only temporary). After each of these is called, the current binding of type in my-typer is overwritten, and subsequent calls to do-type type the appropriate string.
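The two bound cells need not be a string. As a further sketch, the hypothetical words my-clamper, set-clamp, and clamp below bind a lower and an upper bound with a quotation that clamps its argument to that range. It assumes that the two cells are pushed in the same order in which they were passed to 2bind, as the type example above implies.

```forth
closure import
2closure-size buffer: my-clamper

\ Bind a lower and upper bound with a quotation that clamps a value
: set-clamp ( lo hi -- )
  my-clamper [: ( n lo hi -- n' ) rot min max ;] 2bind
;

\ Clamp a value to the currently bound range
: clamp ( n -- n' ) my-clamper execute ;

0 10 set-clamp
-5 clamp .  \ prints 0
5 clamp .   \ prints 5
15 clamp .  \ prints 10
```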

A simple example of the use of multi-cell closures is as follows:

closure import  ok
3 nclosure-size buffer: my-closure  ok
: set-closure ( x y z -- ) 3 my-closure [: ." (" rot . swap . (.) ." )" ;] nbind ;  ok
: do-closure ( -- ) my-closure execute ;  ok
0 1 2 set-closure  ok
do-closure (0 1 2) ok
3 4 5 set-closure  ok
do-closure (3 4 5) ok
6 7 8 set-closure  ok
do-closure (6 7 8) ok

Here we allot space for a closure, my-closure, of the size returned by nclosure-size for three cells. Then we define set-closure, which takes three values off the stack and binds them, as the closure my-closure, with a quotation that prints three values. When my-closure is executed, the three bound values are pushed onto the stack and the quotation prints them. We also define do-closure, which simply executes the address of my-closure as an execution token. Finally we call set-closure with three different sets of values, calling do-closure after each to print the values bound by set-closure.
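The introduction notes that having a single-cell closure point to a manually constructed data structure is more memory-efficient than a multi-cell closure. A minimal sketch of that alternative for the same three values might look as follows. The names my-values, print-values, set-closure, and do-closure are illustrative, and the three values are simply stored in a three-cell buffer whose address is the single bound value.

```forth
closure import
closure-size buffer: my-closure
3 cells buffer: my-values

\ Print the three cells stored at an address
: print-values ( addr -- )
  ." (" dup @ . dup cell+ @ . 2 cells + @ (.) ." )"
;

\ Store three values and bind the address of their storage
: set-closure ( x y z -- )
  my-values 2 cells + !  my-values cell+ !  my-values !
  my-values my-closure ['] print-values bind
;

: do-closure ( -- ) my-closure execute ;

0 1 2 set-closure
do-closure  \ prints (0 1 2)
```

The trade-off is that the bound word has to fetch its values from memory itself rather than finding them already on the stack.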

An example involving storing closures and associated data in a memory pool is as follows:

closure import
pool import
begin-structure coord-closure-size
  closure-size +field coord-closure
  field: coord-x
  field: coord-y
  field: coord-z
end-structure
16 constant coord-closure-count
pool-size buffer: my-pool
coord-closure-size coord-closure-count * buffer: my-pool-data
coord-closure-size my-pool init-pool
my-pool-data coord-closure-size coord-closure-count * my-pool add-pool
: bind-coord { x y z xt -- closure }
  my-pool allocate-pool { closure }
  x closure coord-x !
  y closure coord-y !
  z closure coord-z !
  closure dup xt bind
  closure
;
: bind-coord+ ( x y z -- closure )
  [: { coord } rot coord coord-x @ + rot coord coord-y @ + rot coord coord-z @ + ;] bind-coord
;
: bind-coord- ( x y z -- closure )
  [: { coord } rot coord coord-x @ - rot coord coord-y @ - rot coord coord-z @ - ;] bind-coord
;
: coord. ( x y z -- ) ." (" rot . swap . (.) ." )" ;
1 2 3 bind-coord+ constant 1_2_3+
4 5 6 bind-coord+ constant 4_5_6+
1 2 3 bind-coord- constant 1_2_3-
4 5 6 bind-coord- constant 4_5_6-

Afterwards, execute:

0 0 0 1_2_3+ execute coord. (1 2 3) ok
0 0 0 4_5_6+ execute coord. (4 5 6) ok
0 0 0 1_2_3- execute coord. (-1 -2 -3) ok
0 0 0 4_5_6- execute coord. (-4 -5 -6) ok
10 10 10 1_2_3+ execute coord. (11 12 13) ok
10 10 10 4_5_6+ execute coord. (14 15 16) ok
10 10 10 1_2_3- execute coord. (9 8 7) ok
10 10 10 4_5_6- execute coord. (6 5 4) ok

Here we define a structure containing a closure and xyz coordinate values, and a memory pool containing these structures. We then define a word bind-coord which allocates a structure from the memory pool, stores a passed-in xyz coordinate triple in it, binds a passed-in execution token with the structure's address as a closure located at the start of the structure, and returns the structure. That the closure is first in the structure is significant because it enables the address of the structure to be used directly as an execution token. Afterwards, we define two words bind-coord+ and bind-coord- which bind quotations that add the stored triple to, or subtract it from, a passed-in xyz coordinate triple, and which return the resulting structures. We also define coord. to print xyz coordinate triples. We then use bind-coord+ and bind-coord- to define two adders and two subtracters, for (1 2 3) and (4 5 6). Finally we apply each of these to (0 0 0) and (10 10 10) to show how they act on passed-in values.
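Since the memory taken up by closures is managed manually, a closure allocated from a pool eventually has to be returned to it. The following sketch shows one way this might be done for a pool of plain single-cell closures. It assumes that the pool module provides free-pool ( addr pool -- ) as the counterpart to allocate-pool. The names adder-block-size, adder-count, adder-pool, adder-pool-data, make-adder, free-adder, and add5 are hypothetical. The block size is rounded up to a whole number of cells as a precaution, in case the pool keeps cell-sized bookkeeping in free blocks.

```forth
closure import
pool import

\ Block size: closure-size rounded up to a multiple of the cell size
\ (a precaution assumed here, not a documented requirement)
closure-size cell 1- + cell negate and constant adder-block-size

4 constant adder-count
pool-size buffer: adder-pool
adder-block-size adder-count * buffer: adder-pool-data
adder-block-size adder-pool init-pool
adder-pool-data adder-block-size adder-count * adder-pool add-pool

\ Allocate a closure from the pool and bind a value with +
: make-adder ( n -- closure )
  adder-pool allocate-pool dup >r ['] + bind r>
;

\ Return a closure to the pool; it must not be executed afterwards
: free-adder ( closure -- ) adder-pool free-pool ;

5 make-adder constant add5
10 add5 execute .  \ prints 15
add5 free-adder    \ add5 is now a dangling address
```

After free-adder the block may be handed out again by a later allocate-pool, so any saved copy of the old address, such as the constant add5 here, should be treated as invalid.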

This pattern extends to other kinds of memory management beyond memory pools, such as storing closures in heaps. Memory pools are convenient when one has many closures of the same size, since they use memory more efficiently and are faster; heaps are more suitable when the data associated with the closure varies in size. In other use cases a closure may be kept separate from the data it acts on, e.g. when it points at a preexisting data structure elsewhere in memory. As implemented, closures are flexible enough to support many use cases, even if in their raw form they are not as simple to use as closures in traditional lexically scoped languages.