2.3 Lists - naver/lispe GitHub Wiki

Implementation

In Lisp, lists are usually implemented as a linked list:

As we can see on this example, applying a cdr to this structure consists of positioning a pointer to the right cell. However, moving to the cell containing: 5 requires three steps. A list is a very powerful data structure, which allows for complicated updates, but prevents from any random access. Every access to a node imposes some structure traversal first. This is the reason why in most Lisps: push adds an element to the head of the list and not to the tail. Adding an element to the tail means traversing the whole structure to reach the last element, while adding an element to the head simply means: adding a new cell that points to the current head and transforming this new cell into the new head.

For people used to languages such as C, Java or Python, this way of handling lists is counter-intuitive as most instructions usually work the other way. appendin Python adds an element to the end of the list, not to the head.

LispE

LIspE proposes a different implementation that breaks with this tradition in Lisp to handle lists... well as lists. Our solution considers a list as a vector, in which elements can be directly accessed through their indexes and values are pushed at the end of a list. This solution certainly poses some problems and some traditional Lisp algorithms won't work as expected. However, it gives the developer a much natural way to access lists, while maintaining the way functions such as cdror carworks.

Home made list

The list implementation in LispE is homemade (see listes.h) and may need some explanations.

It is split in two classes: ITEM and LIST.

  • A LIST is composed of only two fields: home and item, with item being an ITEM element.
  • An ITEM is composed of a buffer, its size: sz and the position of the last element that was appended: last. status is a reference counter that indicates how many elements share this ITEM.

When a list is created, a LIST object is also created together with its ITEM object. The initial status value of this ITEM is 0. insert or push instructions will be processed as a push_back or an insert ITEM method. When the buffer is too small, a reserve is automatically called to enlarge the size of the buffer to accommodate new elements.

  1. Note that the status of an Element object is then handled in ITEM's methods.

  2. Note also that item is deleted and cleared only when status is zero.

cdr

The initial LIST object is created with its home field being set to 0: Every access to an element through an index will automatically add home to this index.

  //home is added to pos
  inline Element*& operator [](long pos) {
        return item->buffer[pos+home];
    }

   //Same to compute the size of the structure
    long size() {
        return item->last - home;
    }

When a cdr is applied to this list, a new LIST object is created. However this new LIST object will share the same item as the list to which cdr was applied and will set home to reflect the number of cdr that was applied:

    // The new LIST object shares the item
    // Its home value is the previous current home with the new position
    // The ITEM status is then incremented
    LIST(LIST& l, long pos) {
        home  = pos + l.home;
        item = l.item;
        //We modify the common reference counter
        item->status++;
    }

Hence: (cdddr a) will be handled through a LIST object with home = 3.

Sharing ITEM

Pushing or inserting a new element in the ITEM structure will of course be reflected in the variables that point to that structure through different calls to cdr. We also ensure that the structure is cleaned and deleted, only when status is set to zero.

When a LIST object based on a shared ITEM is deleted, we check the status of this element to delete it:

   //if item->status is zero, then we can safely delete item
   /// otherwise we simply decrement it...
    ~LIST() {
        if (!item->status)
            delete item;
        else
            item->status--;
    }

Final cleaning

The elements in a list are only cleaned when item is deleted, hence when the status is zero.


    void decrement() {
        if (status)
            return;
        for (long i = 0; i < last; i++) {
            buffer[i]->decrement();
        }
    }

Note that the decrement method can only decrement the elements when status is set to zero...

Double Reference Counter

The whole structure can be understood as a double reference counter. The LispE List object still handles its own status value, which is parallel to the status value handled by the ITEM class.

The very interesting point in this implementation is that traversing the structure is almost never necessary. For instance, the management of the Element status in the buffer is done once when the elements are stored in the structure. When you add a List into another List, you only need to increment its own status.

The only case, when this structure is less efficient is when you need to insert an element at the head. In this case, we need to move all elements one step forward, which in a buffer is quite simple and efficient:

        //we move all elements one step forward
        for (long i = last; i > pos; i--) {
            buffer[i] = buffer[i-1];
        }
        buffer[pos] = val;
        last++;

This is the reverse of regular Lisps where adding an element at the head is faster than at the end. However the above lines are usually compiled into very efficient machine code, which makes the whole operation quite fast.