6.9 Retrofitting Haskell into Lisp - naver/lispe GitHub Wiki

From Haskell to Lisp

Most of us know that Functional Languages in large part derive from the work of Alonzo Church: the lambda-calculus. This was the main inspiration to McCarthy when he proposed a new programming language, at the end of the fifties, which has become an ancestor to all functional languages: Lisp.

One of the most emblematic of these languages is Haskell, which was designed by mathematicians as an embodiment of Church's demonstrations and theorems.

However, if Lisp and Haskell share a common origin, the latter is not exactly a direct offspring of the first one. Actually, Lisp is more of a great-uncle to Haskell than a grand-father.

Still Lisp is not exactly dead, and frankly it could learn quite a few tricks from Haskell...

Actually, it might even give it a new youth, if we could retrofit some of its features into Lisp...

Lazy Evaluation

Lazy Evaluation is a mechanism that allows the evaluation of an expression only when the need arises. To understand the interest of this mechanism, let us imagine a list whose definition is given via an infinite arithmetic expression: [1,2,3...]. It is obviously impossible to extract this list first before processing it afterwards. The only way to manipulate this list is to constrain it via a condition with a function such as takeWhile whose role is to stop the iteration as soon as its condition has been verified.

takeWhile (<6) [1...]
[1,2,3,4,5]

Composition

But the interest of Haskell does not stop there. Indeed, this language knows how to compose functions with each other. The . is often used to emphasize this composition (and also to remove a few parentheses in passing !!!):

sum . takeWhile (<10000) . filter odd . map (^2) [1 .]
166650

As we can see in this example, not only did we compose together a series of functions, but more importantly, we applied this composition to an infinite list that takeWhile bounds with the condition: < 10000.

Lazy evaluation and composition are two aspects of Haskell that, in our opinion, had to be integrated into LispE.

map/filter

First of all, we will look at the two simplest functions of our arsenal: map and filter.

These instructions, which are directly available in LispE are actually not directly executed but compiled into LispE code. This allows for a much more flexible mechanism to handle these instructions as we will show now.

(map fonc list)

What is map?

This function takes two arguments. It applies the first argument (a function or an operator) to the second argument, a list of elements.

Let's take an example:

(map (lambda (x) (* x 2)) '(1 2 3 4))

;(2 4 6 8)

Note that LispE also offers another way to write a lambda in imitation of Haskell:

(map (\ (x) (* x 2)) '(1 2 3 4))

You can even write if you want to be even more readable:

(map (λ (x) (* x 2)) '(1 2 3 4))

The simplest interpretation of a map is to see this function as a loop on the list as an argument:

for i in list calculate λ(i)

Now we have a function in LispE that does exactly that: loop

(loop i list
      ( λ (x) (* x 2) ) i)
)

One step is missing to turn this operation into a real map. We need to push the result in a list. We will use push that appends a value to the end of a list (note that in most Lisp, this instruction pushes the value at the beginning of the list).

(setq r ())
(loop i list
      ; we apply our function on an element of the list
      (setq v ( [λ (x) (* x 2)] i))
      (push r v)
)

At the end of the calculation, r will contain our final list.

Note that LispE allows the use of square brackets: [] instead of parentheses to make the code more readable.

filter

filter returns a list where only those elements that satisfy a particular condition are kept.

(filter '(< 10) '(1 3 20 10 21 34 4 5))

; (1 3 4 5)

Like map, filter performs a loop on the list and checks, for each element, if the condition is verified. So we could rewrite our above expression as follows:

(setq r ())
(loop i list
      (check (< i 10)
             (push r i)
      )
)

We chose to build this test with check and not with if. The reason is simple. When the check condition is verified, the following instructions are all executed one after the other like in a block. By choosing check we simplify the code injection process.

Compose

So we have transformed each of these functions into a loop. Now let's imagine that we want to compose these two functions.

Let's first observe two things:

For a map, the application of a function results in the creation of a loop and a v variable.
For a filter, the application of the condition results in the creation of a loop and a check on a condition.

In both cases, we have a loop that has exactly the same shape and that we can factorize.

Let's take the following example:

(map '+ (filter '(< 10) '(1 10 2 3 9 12 21 4))))

We will first extract the common loop:

(loop i '(1 10 2 3 9 12 21 4) ...)

Then we will insert our condition:

(setq r ())
(loop i '(1 10 2 3 9 12 21 4) 
     (check (< i 10)
        (push r i)
     )
)

Now we need to introduce the map. Its introduction depends both on the condition and on the value pushed in r.

(setq r ())
(loop i '(1 10 2 3 9 12 21 4) 
     (check (< i 10)
        (setq v (+ i i))
        (push r v)
     )
)

We can see on the generated code that we have taken into account both the condition and operator '+' required by the map.

But what happens if we reverse the order of the operations:

(filter '(< 10) (map '+ '(1 10 2 3 9 12 21 4))))

The initial code for the map will be as follows:

(setq r ())
(loop i '(1 10 2 3 9 12 21 4) 
      (setq v (+ i i))
      (push r v)
)

The condition introduced by filter no longer relates to i but to the result of the map...

Which we will be able to re-transcribe in the following way:

(setq r ())
(loop i '(1 10 2 3 9 12 21 4) 
      (setq v (+ i i))
      (check (< v 10)
         (push r v)
      )
)

take, drop, fold and scan

Most of the other functions takewhile, take, dropwhile, drop, fold, scan are simple variations on this composition, with sometimes the injection of additional variables to control for example the number of iterations. What is special here is that we compile these functions in the form of LispE code that we can easily visualize or even transform. This choice may seem less efficient than a direct interpretation of these functions, but it allows both lazy evaluation and composition without any deep modification of the LispE interpreter. Moreover, it makes these notions clearer and more explicit for neophytes, who often have problems grasping these concepts in other functional languages.

If you want to see in details how LispE composes these different functions together, the easiest way is to create a defun function and then display its content:


(defun see() (map '* (scanl1 '+ '(10 20 30))))

(prettify see)

(defun see ()
   (#compose
      (0 0) 
      (setq #accu0 (* (+ #accu0 #i0) (+ #accu0 #i0))) ; intermediate slot for operation composition 
      (push #recipient0 #accu0) ; intermediate slot for storage composition
      (setq #recipient0 ()) ; the actual code starts here
      (setq #first0 true)
      (loop #i0 '(10 20 30)
         (ncheck #first0
            (setq #accu0 (* (+ #accu0 #i0) (+ #accu0 #i0)))
            (setq #first0 nil)
            (setq #accu0 #i0)
         )
         (push #recipient0 #accu0)
      )
   )
)

The above example is an actual composition as it happens in LispE. We can remark that variables start with a # to avoid confusion with user variables. ncheck is a specific function, which executes the first line, when the condition fails, or the list of instructions after the first line. Hence:

when we enter the loop body the first time, #first0 is true, and we execute the initialisation of #acccu0 with the iterator #i0 and we set #first0 to nil.
In subsequent loops, since #first0 is nil, we will execute the actual calculus in: (setq #accu0 (* (+ #accu0 #i0) (+ #accu0 #i0))).

Stateless programming

Stateless programming is certainly one of the most fundamental concepts of functional programming.

But what is it?

If we were to give a simple answer, let's say that it is an approach to programming where the execution of a program cannot be influenced by a previous execution. In other words, the result of the execution of (f a b) will always give the same result for the same set of arguments (a,b).

Let's take the following example:

// This function is impure as it depends on a global variable

int state = 0;

int f(int e) {
  state += e;
  return state;
}

cost << f(10) < " " << f(10) < " " << f(10) << endl;

We can see in the above example that successive calls to f with the same argument will give different results each time. With each call, state will change and f will return a different value. In Functional Programming parlance, this function is said to be impure. A pure function is a function that does not depend on a context to run.

For instance:

// This function is pure

int f(int e, int state) {
  state += e;
  return state;
}

cost << f(10,0) < " " << f(10,0) < " " << f(10,0) << endl;

// Each of these calls will return the same value

What's the problem?

In the case of functional programming, this kind of programming is generally not recommended. In particular, the notion of class found in Java and C++ is problematic in the sense that one can modify a field declared in an object and therefore make methods sensitive to previous executions.

Most computer scientists often have some difficulty understanding why such a behavior is problematic, since changing states within an instance is most often at the heart of their implementation strategies.

In fact, the problems are numerous:

Debugging: changing a single variable can impact code 1000 lines away.
Multi-threading: if a function or method depends on an external variable, concurrent access to this variable poses metaphysical problems that only semaphores can solve. The complexity of such architectures is unfortunately no longer to be demonstrated...

Data Types

The solution is again given by Haskell: Data Type.

Data Type superficially resemble class descriptions, which often throws computer scientists accustomed to Java or C++ into an advanced form of mental confusion (dare I confess that this was my case in the beginning?). A first encounter with these objects often leads to questions such as: "but how do we change the values of these guys?".

Obviously, there is no answer...

Because data structures in these programming languages are immutable.

In Java we choose a variable through which we will execute a method.
In Haskell, the choice of a function is based on polymorphism. We look for the function whose parameters correspond to the type of the arguments.

In other words, in a functional language, an instance of a data type has no other role than to guide the choice of a function.

Let's take a simple example.

For example, here is the description of a class to handle circles and rectangles areas in C++:


class Circle {
public:
  
 float radius;

  Circle(s) {
    radius = r;
  }

  float area() {
     return (M_PI*r*r);
  }
};


class Rectangle {
public:
  float width, length;

  Rectangle(float l, float L) {
      width = l;
      length = L;
  }

  float area() {
     return width*length;
  }
};

void main() {
   Circle c(20);
   Rectangle r(20,30);

  cost << c.area() << endl;
  cost << r.area() << endl;

  // Nothing prevents us from modifying c:
  c.radius = 100;
  cost << c.area() << endl;

}

We can perfectly modify the values of the instances and obtain different values for area each time.

In a functional language, the logic is reversed. Functions apply depending on their arguments.


(data [Circle _] [Rectangle _ _])

(defpat area( [Circle r] ) (* _pi r r))
(defpat area( [Rectangle w h] ) (* w h))

(println 
   (area [Circle 10])
   (area [Rectangle 10 20]).
)

(see Data Types for an explanation of the code above.)

Haskellish

Of course, Haskell puts much more emphasis on verification than LispE, which as an interpreted language, will not make all the right verifications beforehand.

However, LispE provides you with a quite simple framework, in which you can play and understand these concepts in a way that is slightly less complicated than GHC.