CamelForthSaveLoad - nealcrook/multicomp6809 GitHub Wiki

This discusses various CamelForth internals and works up to a description of a set of words that can be used to load/compile code and to load/save compiled code. This description assumes that you have worked through a FORTH tutorial and have a vague idea what's going on!

CamelForth memory map

When you write new definitions in CamelForth they are stored in a data-structure called the Dictionary. All the "baked in" definitions are in ROM and new definitions that you write are stored in RAM, starting at address 0. The dictionary in RAM contains header information, code and read/write data. The dictionary in ROM necessarily does not contain read/write data; instead it contains references to locations in RAM.

In order to run correctly on even the smallest Multicomp6809 system, CamelForth is built to use 2KBytes of RAM, in the region 0x0000 - 0x07FF. The dictionary grows up from 0. At the top of memory is a small region for data storage (variables declared in the ROM dictionary), a region reserved for the structure stack and a region reserved for the return stack. Below that is the data stack pointer; the data stack (aka "the parameter stack" or simply "the stack") grows downwards from that point. If the Dictionary or the Stack grow too big, they will meet and corrupt one another (there is no protection or detection of this condition).

Most Multicomp6809 systems (once debugged) have loads of memory. There's a very simple trick to avoid the memory restriction just described. After reset, simply go:

HEX 1000 ALLOT

This reserves 0x1000 bytes in the dictionary, from the current starting point (which is 0x0000 after reset) and therefore it "steps over" the stack and continues allocation of the dictionary at address 0x1000. You could just as well pick 0x0800, but I prefer this nice round number.
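A quick way to confirm the effect is to print HERE before and after the ALLOT. This is only a sketch using standard words; on a freshly reset system the first number should be 0, per the description above, but the exact values depend on what has already been compiled.

```forth
HEX
HERE U.        \ dictionary pointer before the ALLOT
1000 ALLOT     \ step over the region used by the stacks
HERE U.        \ dictionary pointer is now 0x1000 higher
```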

Block Files

The traditional form of storage on Forth systems is a "block file". This is a low-level mechanism that does not rely on a file system or host operating system. It is simply an area on a disk, and the programmer/user is required to remember where it is and how big it is and what is in it. A block file has no structure but is conventionally considered to consist of a number of "blocks" each 1Kbyte in size. When a block file contains FORTH source code, it is conventionally presented as 16 lines of 64 characters (16*64=1024). There are no line endings and each block is contiguous with the next.
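The 16-by-64 layout means that the position of any character in a block is simple arithmetic. A minimal sketch, assuming standard words only; CHARS/LINE, LINE>OFFSET, OFFSET>LINE and .LINE are names invented for illustration and are not part of CamelForth:

```forth
DECIMAL
64 CONSTANT CHARS/LINE

\ Byte offset, within a 1K block, of the start of a line (0..15)
: LINE>OFFSET ( line -- offset )  CHARS/LINE * ;

\ Convert a byte offset (0..1023) back to a line and a column
: OFFSET>LINE ( offset -- line col )  CHARS/LINE /MOD SWAP ;

\ Display one line of a block, using the standard word BLOCK
: .LINE ( line blk -- )  BLOCK SWAP LINE>OFFSET + CHARS/LINE TYPE ;
```

For example, `15 LINE>OFFSET .` prints 960, the offset of the last line, and `3 1 .LINE` would display line 3 of block 1.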

Another convention is to use blocks in pairs with one block of the pair holding source code and the other block of the pair holding comments or program commentary. One common pattern is to consider the first half of the blocks in a block file to be the code, and the second half to be the comments, called "shadow blocks".

Using Block Files with CamelForth

CamelForth on Multicomp6809 is set up to use 4 block files, numbered 0-3. Each of them is 256 blocks in size and therefore the set occupies 1MByte on the SDcard. It is trivial to assign more block files or to make them bigger (in fact, there is nothing here to check the size of a block file -- no check to stop you from running off the end of a block file). The SD offsets for these block files are as follows:

  • 0x2.0800 - block file 0
  • 0x2.0A00 - block file 1
  • 0x2.0C00 - block file 2
  • 0x2.0E00 - block file 3

One of these block files is selected by storing its start address into the double variable BLKADRS. After reset, block file 0 is selected. You can switch to (for example) block file 3 like this:

HEX 2 BLKADRS ! E00 BLKADRS CELL+ !
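Because each block file is 256 blocks long (0x200 sectors of 512 bytes), the start address of block file n follows directly from the list of offsets above. A sketch, assuming those offsets; BLKFILE>SDA is a hypothetical helper, not a CamelForth word:

```forth
HEX
\ Return the low and high cells of the SD start address of block file n (0..3)
: BLKFILE>SDA ( n -- lo hi )
  200 *  800 +    \ low cell: 0800, 0A00, 0C00 or 0E00
  2 ;             \ high cell is 2 for all four block files
```

With this defined, `3 BLKFILE>SDA BLKADRS ! BLKADRS CELL+ !` should select block file 3 in the same way as the explicit stores shown above.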

On a newly-created SD image, block files 0, 1, 2 will have undefined content and block file 3 will include the CamelForth source and binary image. If block file 3 has been prepared correctly, you can load and start the binary CamelForth image in that block file by typing:

SDFORTH

This can, for example, allow you to try out a new version of CamelForth without the need to reprogram the FPGA. There is one restriction to running CamelForth from RAM in this way: you cannot, from this image, swap to another image (for example, boot to FLEX or CUBIX). If you try to do so, the system will hang. There are ways that this restriction could be overcome, but there seems little benefit to be had from adopting them.

The Multicomp6809 version of CamelForth provides 2 sets of words for working with block files. Both sets of words are layered on a set of words used to read and write the SDcard. The two sets are:

  • The ANS Standard set of BLOCK and BLOCK EXT words (with the exception of REFILL). These are used for listing blocks, working with buffers and loading/compiling source code from block files
  • A novel and non-standard set of words that I designed and implemented for loading and saving compiled code in block files. This mechanism allows you to save a chunk of the dictionary (representing compiled code and data) to SDcard and to load it again in a future session. The code is stored within a block file and I call the thing that you load and save a "blob".

ANS Standard words

These words should all behave as described in the ANS Standard. They are layered on a set of lower-level words that are not intended to be used directly and which will not be described here.

Words

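BLK ( -- a-addr)    Variable holding the number of the block currently being interpreted (0 when input is not coming from a block)
SCR ( -- a-addr)    Variable holding the number of the block most recently LISTed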
\                   Behaviour of this word is extended so that, when working with a block file, the next 64-character boundary is treated as the end of the line
EVALUATE            Behaviour of this word is extended to store 0 in BLK.
BLOCK ( blk -- a-addr) Return the address of a 1k buffer containing the contents of block blk. Make this buffer the Current buffer.
BUFFER ( blk -- a-addr) Return the address of a 1k buffer assigned to block blk (contents of the buffer are undefined). Make this buffer the current buffer.
LOAD ( blk --)      Load the specified blk (ie interpret/compile the code in that block). Note that ANS does NOT allow you to load block 0.
THRU ( blk1 blk2 --) Load a range of blocks blk1..blk2 inclusive. Note that ANS does NOT allow you to load block 0.
SAVE-BUFFERS ( --)  Save any buffers marked as Modified, then mark them as unmodified.
FLUSH ( --)         Save any buffers marked as Modified, then mark them as Unassigned.
UPDATE ( --)        Mark the Current buffer as Modified.
EMPTY-BUFFERS ( --) Mark all buffers as Unassigned (thereby discarding the contents of any Modified buffers)
LIST ( blk --)      Display the content of block blk as 16 lines of 64 characters

Implementation details

The BLOCK and BLOCK EXT words make use of 4 1Kbyte memory buffers which are explicitly used by the BLOCK and BUFFER words and implicitly used by words like LOAD, THRU, LIST. These buffers have many of the semantics of a write-back cache:

  • Each buffer is in one of 3 states: Unassigned, Assigned-Clean, Modified (= Assigned-Dirty)
  • Each buffer that is Assigned is associated with a specified block in the current block file
  • When access to a block is desired, BLOCK or BUFFER is used to assign a buffer to that block. First, a check is made to see if the block is already associated with a buffer. If that check fails, a check is made for an unassigned buffer that can be assigned to the block. If that check fails, an assigned buffer is reassigned to the block (writing its contents back to the block file first, if it is marked as Modified). A round-robin (least-recently assigned) rule is used for selecting which buffer to reassign.
  • The fact of there being 4 buffers is an implementation detail that should be transparent to any use of the BLOCK and BLOCK EXT words. Providing >1 buffer merely provides a performance improvement.
  • The buffers are assigned to the top of memory: $C000-$CFFF.
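The assignment policy in the list above can be modelled in a few lines. This is a simplified sketch of the bookkeeping only (no SDcard access and no data buffers), written from the description rather than taken from the CamelForth source; every name in it is invented for illustration.

```forth
DECIMAL
4 CONSTANT #BUFS
CREATE BUF-BLK   #BUFS CELLS ALLOT   \ block number held by each buffer
CREATE BUF-STATE #BUFS CELLS ALLOT   \ 0=Unassigned 1=Clean 2=Modified
VARIABLE VICTIM                      \ next buffer to reassign (round-robin)
VARIABLE WRITEBACKS                  \ counts simulated write-backs

: BLK-OF   ( i -- addr )  CELLS BUF-BLK + ;
: STATE-OF ( i -- addr )  CELLS BUF-STATE + ;

: INIT-BUFS ( -- )
  #BUFS 0 DO  0 I STATE-OF !  LOOP
  0 VICTIM !  0 WRITEBACKS ! ;

\ Index of the buffer already assigned to blk, or -1
: FIND-BLK ( blk -- i | -1 )
  #BUFS 0 DO
    I STATE-OF @ 0 <>  OVER I BLK-OF @ =  AND
    IF  DROP I UNLOOP EXIT  THEN
  LOOP  DROP -1 ;

\ Index of the first Unassigned buffer, or -1
: FIND-FREE ( -- i | -1 )
  #BUFS 0 DO
    I STATE-OF @ 0= IF  I UNLOOP EXIT  THEN
  LOOP  -1 ;

\ Pick the next buffer in rotation; count a write-back if it is Modified
: EVICT ( -- i )
  VICTIM @  DUP 1+ #BUFS MOD VICTIM !
  DUP STATE-OF @ 2 = IF  1 WRITEBACKS +!  THEN ;

\ Model of how BLOCK or BUFFER chooses a buffer for blk
: ASSIGN ( blk -- i )
  DUP FIND-BLK  DUP 0< 0= IF  NIP EXIT  THEN  DROP
  FIND-FREE  DUP 0< IF  DROP EVICT  THEN
  TUCK BLK-OF !  1 OVER STATE-OF ! ;

\ Model of UPDATE applied to buffer i
: MODIFY ( i -- )  2 SWAP STATE-OF ! ;
```

After INIT-BUFS, assigning blocks 10, 11, 12 and 13 fills buffers 0 to 3. If buffer 0 is then marked with MODIFY, `14 ASSIGN` reuses buffer 0 and WRITEBACKS becomes 1, which corresponds to the write-back described in the list.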

Tutorial

A typical block file containing source will contain a "load screen" as screen 1. A "load screen" is code that loads the remainder of the screens. For example, for a section of code contained in screens 3-14 the load screen might look like this:

\ Load screen for whizz-bang Editor (WBE)
\ to load this code type "1 LOAD". Then, to start the editor
\ type WBE
DECIMAL 3 14 THRU

So you might do this:

0 LIST
1 LIST \ should see that this is a "load screen"
1 LOAD

The blob save/load mechanism

Design goals for the blob save/load mechanism

  • Ability to load and then link in a block of pre-compiled code (no requirement for this to be relocated, and it's OK to require the system to be in a clean state when this is done. It would be nice, however, if this lump of code could be linked to successive ROM versions).

  • Ability to save block of compiled code, with linkage rules embedded within it (so that loading it doesn't require gory knowledge about how it was saved).

  • Ability to edit source code in block form using a WYSIWYG editor (from Kelly&Spies) on the Multicomp6809 ANSI VDU

  • Ability to compile code from a blocks file - including LOAD and THRU and nesting thereof.

  • Support block-based VM system based on a blocks file stored on SDcard.

  • Best if block file contained 3 sections: source code, shadow screens, compiled code (any combination of these).

  • Want all of this to be boot-strappable from ROM. Therefore, need the minimum word-set needed to support all of this to be stored in the ROM. The source file currently has 3 spare blocks so it would be GREAT if it could fit in that size. And it needs to fit into the spare ROM space, too.

  • Want the ability to discard compiled code (in the same way as "marker" but explicit).

Tutorial introduction

HEX 1000 ALLOT
BLOB FRED \ creates a dictionary entry with a DOES> effect
: MYDEF1 .... ;
: MYDEF2 .... ;
(etc)

FRED \ "seals" the binary by setting the end-addr

40 WRBLOB FRED     \ write FRED to block file at block offset 40

( power cycle )

40 RDBLOB          \ restore definitions

(etc)

RMBLOB FRED        \ delete definitions

(etc)

40 RDBLOB          \ restore definitions
: MYDEF99 ... ;
FRED               \ "re-seal" by writing new end-addr
50 WRBLOB FRED     \ store FRED with new additions

Words

BLKADRS            double variable storing sda2, sda10 of blocks file. ie:
                   lba2 BLKADRS ! lba10 BLKADRS CELL+ !
BADBLOB ( c-addr u -- ) print an error message that includes the string c-addr u, then ABORT
BLOB ( "FOO" -- )  create marker data structure for binary blob
"FOO"              seal/reseal the data structure created by BLOB FOO
SDWRZ  ( sda10 blocks -- ) zero out that many 512-byte blocks on the SDcard -
                   like other SD words, assumes SDLBA2 has been run already.
WRBLOB ( n "FOO" -- ) write binary of FOO (defined by BLOB) to the current blocks
                   file starting at 1k block n. Error if FOO fails magic
                   number check. Error if FOO has a byte count of 0.
RDBLOB ( n -- )    Read binary from the current blocks file, at 1k block n, into
                   memory starting at HERE. Error if the binary fails
                   magic number check. Error if HERE does not match load address.
                   After the load, HERE and LATEST are updated so that the definitions
                   in the binary appear as though they were just compiled.
RMBLOB ( "FOO" -- ) Update HERE and LATEST so that FOO and anything defined after it
                   (including the whole of the binary) disappear from the wordlist
FIXBLOB ( "FOO" -- ) Patch FOO so that it can be used to reseal the binary even if
                   the ROM has been updated since FOO was originally created (see
                   notes below)
MARKER ( "BAR" --) Not needed for this but handy to add now.

Restrictions

  • There is no relocation going on. The blob must always be loaded to the same address it was saved from. Typically this is arranged by doing HEX 1000 ALLOT first, both before creating the blob and before loading it.

  • The DOES> effect baked in to FRED references an address in the definition of BLOB. If that location has changed (eg, because BLOB is in ROM and the ROM has been modified and rebuilt), executing FRED will cause undefined behaviour. That is the purpose of FIXBLOB. FIXBLOB finds the current DOES> address of BLOB and patches FRED. FIXBLOB is not itself portable but provided it is defined alongside BLOB it will work correctly.

  • The parameter field of FRED holds:
      • 2 bytes: start-addr, the first address used by FRED. Set once. Used in the save and also as a check during the load.
      • 2 bytes: end-addr, the first free address when FRED was sealed. Set by the run-time effect of FRED.
      • 2 bytes: link field for the newest definition, copied from LATEST @ when FRED was sealed. Set by the run-time effect of FRED.
      • 4 bytes: the "magic value" of "CF09" (CamelForth6809), as a sanity check.
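The layout above can be captured as a handful of accessor words. This is a sketch, not part of the CamelForth source; the BLOB-xxx names are invented for illustration. On the 16-bit CamelForth described here CELL+ adds 2, so three cells reach the magic value at offset 6, the same offset that BADMAG uses in the listing later on this page.

```forth
\ Each word takes the parameter field address (PFA) of a word created by BLOB
: BLOB-START ( pfa -- addr )    @ ;                    \ first address used
: BLOB-END   ( pfa -- addr )    CELL+ @ ;              \ first free address when sealed
: BLOB-LINK  ( pfa -- addr )    CELL+ CELL+ @ ;        \ LATEST when sealed
: BLOB-MAGIC ( pfa -- c-addr )  CELL+ CELL+ CELL+ ;    \ address of the magic value
: BLOB-SIZE  ( pfa -- u )       DUP BLOB-END SWAP BLOB-START - ;
```

For a sealed blob, `' FRED >BODY BLOB-SIZE U.` should print the byte count that WRBLOB uses to decide how many sectors to write.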

FORTH Code for blob save/load

This code is "baked in" to the latest CamelForth ROM image but the code is included here for instruction.

HEX 1000 ALLOT

VARIABLE BLKADRS CELL ALLOT
0002 BLKADRS ! 0800 BLKADRS 2 + ! \ Is endian correct?

: BADBLOB ." ERROR BAD BLOB " TYPE CR ABORT ;
\ Each check takes the PFA of a blob word and leaves it on the stack.
\ Abort if PFA does not contain magic number (at offset 6)
: BADMAG DUP 6 + S" CF09" S= IF S" MAGIC NUMBER" BADBLOB THEN ;
\ Abort if PFA has load-addr == end-addr
: BADSIZ DUP @ OVER 2 + @ = IF S" ZERO BYTES" BADBLOB THEN ;
\ Abort if PFA has load-addr that differs from HERE
: BADLD  DUP @ HERE <> IF S" LOAD ADDR" BADBLOB THEN ;

: BLOB LATEST @ HERE DUP CREATE , , , \ start-addr, end-addr, link-addr
  4346 , 3039 , DOES>     \ "CF09" magic number
    CELL+ DUP HERE SWAP !   \ run-time: fill in end-addr
    CELL+ LATEST @ SWAP ! ; \ and link-addr

: RMBLOB ' >BODY BADMAG DUP @ DP ! @ @ LATEST ! ;

HEX

\ A word created by BLOB contains code field: BD xx yy
\ Where xxyy is an address in BLOB. That address is
\ a fixed offset from ' BLOB. Find the offset (currently
\ 21) and code it below.
: FIXBLOB ['] BLOB 21 +  \ current addr for BLOB DOES>
  ' >BODY BADMAG 2 - ! ; \ overwrite addr of BLOB DOES> in defined blob

: WRBLOB 1 LSHIFT BLKADRS 2 + @ +
  ' >BODY BADMAG BADSIZ
  DUP CELL+ @ OVER @ - \ size in bytes
  9 RSHIFT 1 + \ size in 512b sd sectors
  SWAP @ SWAP
  BLKADRS @ SDLBA2 SDWRn ;

: RDBLOB ( n -- ) \ read blob from block file starting at block n
  BLKADRS @ SDLBA2 1 LSHIFT BLKADRS CELL+ @ + HERE \ 1st-sector load-addr
  2DUP SDRD \ load first sector.
  HERE CELL+ NFA>CFA 3 + \ PFA of blob
  BADMAG BADLD  \ sanity checks
  \ 1st-sector load-addr pfa

  \ calculate how many more sectors, then load them
  DUP CELL+ @ OVER @ - \ size in bytes
  9 RSHIFT \ remaining 512b sd sectors
  SWAP >R  \ 1st-sector load-addr count
  BEGIN
    DUP 0 <> WHILE
    >R \ stash current count
    200 + SWAP 1+ SWAP \ next addr sector
    2DUP SDRD \ load next sector
    R> 1-
  REPEAT
  DROP 2DROP R>

  \ stitch the blob into place
  LATEST    @ HERE !     \ Update ..oldest link in loaded blob.
  CELL+ DUP @ DP !       \ ..HERE from blob's end-addr
  CELL+     @ LATEST ! ; \ ..LATEST from value stored in blob
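
The size arithmetic shared by WRBLOB and RDBLOB can be tried in isolation. A sketch using invented names (BLK>SECTOR and BYTES>SECTORS are not CamelForth words) that mirrors the `1 LSHIFT` and `9 RSHIFT 1 +` steps in the listing above:

```forth
HEX
\ A 1K block is two 512-byte SD sectors
: BLK>SECTOR ( blk -- sector-offset )  1 LSHIFT ;

\ Sectors written for a blob of u bytes: u divided by 512 (rounded down), plus one
: BYTES>SECTORS ( u -- n )  9 RSHIFT 1+ ;
```

With these, `40 BLK>SECTOR U.` prints 80 and `345 BYTES>SECTORS U.` prints 2. A blob of exactly 200 (hex) bytes also occupies 2 sectors, one more than strictly needed. RDBLOB reads the first sector and then `9 RSHIFT` of the size more, so the two words agree on the count.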