2.1 The Structure of the Interpreter - naver/lispe GitHub Wiki

Internal components of LispE

Fundamental structure in Lisp: the list.

LispE is based on a fundamental structure: the list, represented with parentheses, the elements being separated from each other by spaces.

(FUNCTION list|data list|data...)
- A function can be an operator: '+ - * / %'
- A predefined function of the language quote, cons, list, eval... 
- Or a user-defined function.

The arguments are of five types: 

* list
* dictionary
* atom
* character string
* number

'quote' is a particular function that returns its argument without evaluating it. It is most often written in the form: 'expression.

Lisp
(setq test '(1 2 3))

Object Representation

The interpreter has been implemented in C++ and therefore uses class derivation as the basic mechanism to represent all LispE objects.

//Any object in LispE, be it an instruction or data,
//is a derivation of the class at the root: 

class Element {
...
}

In this way, a list can be represented by a simple LIST of Elements:

class List : public Element {
public:
    LIST<Element*> liste;
    
    List(): Element(l_list) {}

Thus, the basic objects of the language are associated with their equivalent object (see elements.cxx):

  • Atoms: class Atom.
  • Operators: class Operator
  • The key words of the language: class Instruction
  • The strings: class String (unicode strings)
  • Numbers: class Number and class Integer.
  • Lists: class List
  • Dictionaries: class Dictionary and class Dictionary_n

LIST

This structure is a very specific vector implementation, which has been designed to propose a very efficient way to execute cdr functions while retaining a direct access to elements. In traditional Lisp implementation, a list is actually a linked list, where a cdr simply returns a pointer to the next element in the list. However, in such a list, direct access to an element via its index, requires traversing the list pointer by pointer.

A LIST element offers a simple solution to both problems. First, a LIST element contains a pointer to a vector and a home integer that indicates the position of the first element in this vector.

Since this vector is implemented as an array, access via an index is straightforward. A cdr is a little more complicated as it consists of creating a new LIST instance, where the current LIST home value is incremented by 1. This new LIST instance still shares the same array as this current LIST object.

//Our initial LIST object:

   LIST: [home = 0, item = array]

//cdr provides: 

   LIST: [home = 1, item = array]

//access to the element of index pos: 

Element* at(long pos) {
     return item[home+pos];
}

Pools

LispE manages most of the data in the form of data pools.

There are several pools:

  • pool of atoms: indexed to the numerical identifier of the atom.
  • pool of operators: indexed on the numeric identifier of the operator.
  • pool of numbers: returns Integer objects that can be re-used
  • pool of integers: returns Number objects that can be re-used
  • pool of strings: returns String objects that can be re-used.

Thus, instead of creating a new object for a string, a number or an atom, one draws from this pool, allowing the same object to be shared throughout a program. LispE provides four basic methods for this purpose:

  • Element* provideAtom(short identifier): the pool indexes Atom objects using its identifier.
  • Element* provideOperator(short identifier): the pool indexes Operator objects using its identifier.
  • Element* provideString(wstring): Return a pointer to a Stringpool object.
  • Element* provideNumber(double value): Return a pointer to a Numberpool object.
  • Element* provideInteger(double value): Return a pointer to a Integerpool object.

Thus, in the code, each time a new atom or a new string is created, it is checked whether or not they are already registered in their respective pool so that the corresponding pointer can be reused. This allows in particular to limit the creation and destruction of objects, a cost often important in terms of CPU time.

For strings, numbers and integers when an element is no longer used, it is pushed back in its respective pool. When one these elements is requested, we return the top one on the pool. Hence, these objects can be re-used.

Execution stack

When a function of type defun or deflib is executed, a Stackelement object is created in the execution stack: LispE::stack, in which the variables and arguments are stored. When the execution of the function is finished, the stored variables are unstacked and discarded.

Note that all objects have a status which is incremented when they are saved and decremented when they are removed from the stack.

Note bis: lambdas are executed within the current stack. If a local variable in the lambda shares the same name as another variable already stored in the stack, then we replace it temporarily with the new variable. When the lambda is executed, we clean the new variables that where created on the stack or put the old variables back in place. In this way, we ensure that a lambda will have access to all active variables.

Life cycle of objects

All objects derived from Element have a status field which can take the following values:

s_destructible = 0;
s_protect = 0xC000;
s_constant = 0x4000;

This definition is saved in the file: elements.h

Important An object is never destroyed directly, but via the methods release or decrement, whose purpose is to check the value of the status before deciding whether or not to destroy the object. Thus, constant objects are protected against untimely destruction thanks to this mechanism.

An object can take the following status:

  • s_destructible: the initial default value which, as its name indicates, allows the destruction of an object in release.
  • s_protect: used to temporarily protect a value against destruction, especially when the stack is unstacked.
  • s_constant: This is a value stored in one of the pools shown above or in the garbage collector.

Each time an item is added to a list or dictionary, or saved to the stack via setq, this status is incremented by 1. When the object is removed from a container or from the stack, its status is decremented. When this status is back to '0', the object is destroyed. This status therefore acts as a reference counter.

With one subtlety, the values of s_protect and s_constant are fixed at compile time and prevent the unexpected destruction of some objects. Thus, an object registered in the garbage collector is considered as 'constant' and cannot be destroyed during code execution. It is destroyed once the program is finished.

Evaluation

Each time an evaluation is done, do not forget to call release, in order to release the object.

//Each instruction is associated to a specific method in the List class

Element* List::evall_if(LispE* lisp) {
     //evaluation of the test
     try {
            first = liste[1]->eval(lisp);
            bool test = first->Boolean();
            //in all cases, we try to free 
            // the result of the evaluation
            first->release();
            first = null_;
            // Depending on the value of the 'test' we call the THEN
            // or the ELSE part
            if (test)
                return liste[2]->eval(lisp);
            return liste[3]->eval(lisp);
    }
    catch(Error* err) {
      //When an error occurs, we throw an Error object
      //We catch it here to get rid of dangling values
      first->release();
      throw err;
    }

Let's look at the example above to better understand how the life cycle of a value is organized.

It is the assessment of an if which is composed of four sections:

(if TEST THEN ELSE) 

what is found in the code above.

  • liste[0] is an Instruction object whose type is: l_if
  • liste[1] is the test. Note that after the call to eval, we check which Boolean value it corresponds to and we _relaunch it.
  • liste[2] is the THEN
  • liste[3] is the ELSE

The stack

When a value is stored in the stack with a certain label, there are three cases:

  1. It is a constant value, but it is not a container, so it is stored as such in the stack.
  2. It is a constant value which is also a container, we duplicate it and place it in the stack.
  3. It is any value, we store it in the stack and increment its status.

Let us note that because of point 2, we avoid an untimely modification of a constant object...

⚠️ **GitHub.com Fallback** ⚠️