Compilation transformation - IronScheme/IronScheme GitHub Wiki

This will walk you through how code gets transformed in IronScheme by means of a simple example.

We will use the well-known Ackermann function to demonstrate this, and compare it to a best-case version and a naive version in C#.

We start off with the plain Scheme version:

(define (ack m n)
  (cond
    [(= m 0) (+ n 1)]
    [(= n 0) (ack (- m 1) 1)]
    [else (ack (- m 1) (ack m (- n 1)))]))
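For readers who do not read Scheme, here is a direct transcription in Python (purely illustrative; this is my rendering, not anything IronScheme produces):

```python
# Direct transcription of the Scheme definition above.
def ack(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))

print(ack(3, 3))  # 61 -- keep the arguments small, the recursion gets deep quickly
```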

We then improve on this by redefining define as follows (using the (ironscheme typed) library):

(define-syntax-rule (define (name arg ...) body ...)
  (define: (name (arg : fixnum) ... -> fixnum)
    body ...))

The above macro annotates all the parameter types and the return type as fixnum (int in C#).

We also redefine +, - and = to come from the (ironscheme typed fixnums) library.

When the above code is pushed through the macro expansion stage, we end up with:

(typed-case-lambda
 ((g$m$31$1aFkwB g$n$32$1aFkwB)
  ((fixnum fixnum) fixnum)
  (if (g$fx=?$2488$1aDpbh g$m$31$1aFkwB '0)
      (g$fx+$2479$1aDpbh g$n$32$1aFkwB '1)
      (if (g$fx=?$2488$1aDpbh g$n$32$1aFkwB '0)
          (g$ack$27$1aFkwB (g$fx-$2481$1aDpbh g$m$31$1aFkwB '1) '1)
          ((case-lambda
             (()
              (begin
                '#f
                (g$ack$27$1aFkwB
                  (g$fx-$2481$1aDpbh g$m$31$1aFkwB '1)
                  (g$ack$27$1aFkwB
                    g$m$31$1aFkwB
                    (g$fx-$2481$1aDpbh g$n$32$1aFkwB '1)))))))))))

As we have a hygienic macro system, identifiers are made unique (you can still make out what they were).

The above language is the input language for the IronScheme compiler. It depends on only a few core forms: in this case, typed-case-lambda, case-lambda, begin and if, as well as procedure application.

Once this goes through the compiler and a few optimization passes, the produced MSIL ends up looking like:

public static int ack(int m, int n)
{
  while (!fixnums.fx=?(m, 0))
  {
    if (fixnums.fx=?(n, 0))
    {
      int num = fixnums.fx-(m, 1);
      int num2 = 1;
      m = num;
      n = num2;
    }
    else
    {
      int num3 = fixnums.fx-(m, 1);
      int num4 = ack(m, fixnums.fx-(n, 1));
      m = num3;
      n = num4;
    }
  }
  return fixnums.fx+(n, 1);
}

You may notice what seem like redundant assignments, but those are there to make sure evaluation occurs in the order the expressions are written (which makes things easier later, but is not required by any standard), and they end up costing almost nothing.

If you have been following along, you will note there was no loop in the original definition, but since the procedure tail-calls itself, it effectively becomes a loop. This optimization, which avoids an expensive recursive tail call, is called tail call elimination (TCE). It is a common technique in many compilers, applied even to non-tail calls and to non-recursive functions with a single callsite. Tail calls in .NET are normally about 5 times slower than a normal method call.
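The rewrite can be sketched in Python (again purely illustrative): each tail call to ack becomes a parameter update followed by another trip around a loop, while the inner call, which is not in tail position, remains a real recursive call. Note that the tuple assignment evaluates its right-hand side before updating m, which is exactly the evaluation-order concern mentioned above.

```python
def ack_loop(m, n):
    # Tail calls to ack become parameter updates plus a loop iteration (TCE);
    # the inner ack(m, n - 1) is not in tail position, so it stays a call.
    while m != 0:
        if n == 0:
            m, n = m - 1, 1
        else:
            # Right-hand side is evaluated with the old m, preserving order.
            m, n = m - 1, ack_loop(m, n - 1)
    return n + 1

print(ack_loop(3, 3))  # 61, same result as the recursive version
```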

At this stage you might ask: "why are fx+ and friends not replaced with 'native' operators?" That is unnecessary, as the .NET JIT compiler does a very good job of inlining those functions, effectively making them just as fast as the 'native' operators. Inlined code is normally around 5 times faster than an actual method call in .NET.

To confirm the above, let's rewrite the output into optimized C# with 'native' operators and compare the execution speed:

public static int ack(int m, int n)
{
  while (m != 0)
  {
    if (n == 0)
    {
      m--;
      n = 1;
    }
    else
    {
      n = ack(m, n - 1);
      m--; // note I changed the execution order here
    }
  }
  return n + 1;
}

Benchmarking this shows the C# code above running on my PC in about 8ms per iteration, while the Scheme version runs at about 13ms per iteration for ack(3, 9). Not bad, considering :) (The difference could well be due to the change in execution order.)
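For completeness, here is a sketch (in Python, purely illustrative) of how such a per-iteration measurement can be taken with timeit; the absolute numbers are in no way comparable to the C# figures above, only the methodology is:

```python
import timeit

def ack_loop(m, n):
    # Loop-rewritten Ackermann, mirroring the MSIL shape shown earlier.
    while m != 0:
        if n == 0:
            m, n = m - 1, 1
        else:
            m, n = m - 1, ack_loop(m, n - 1)
    return n + 1

# Average wall-clock time per call; small arguments keep Python's
# recursion depth and runtime manageable.
per_iter = timeit.timeit(lambda: ack_loop(3, 6), number=10) / 10
print(f"ack(3, 6) -> {ack_loop(3, 6)}, ~{per_iter * 1000:.2f} ms per iteration")
```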

And now, if you are still concentrating, you will say "but wait! that is not the way I would write Ackermann in C#". So, let's bench that:

public static int ack(int m, int n)
{
  if (m == 0) return n + 1;
  if (n == 0) return ack(m - 1, 1);
  return ack(m - 1, ack(m, n - 1));
}

This time it takes a whopping 41ms per iteration!

Final thoughts:

So can we say IronScheme compiles code better than, say, C#? The answer is no.

IronScheme's compiler optimizations are quite similar to the loop optimizations in F#'s compiler, so given the original definition written in F#, I would expect it to run somewhere between 8 and 13ms per iteration. This does not make F# any faster than C# (although there are a few snake oil salesmen who would want you to believe it). C#'s compiler simply does not apply these optimizations, which generally make debugging a lot harder. Nothing stops you, as an experienced coder, from making this transformation yourself.

In general, IronScheme, being an incremental compiler, tries to be as fast as possible without spending much time on optimization. The focus is rather on cheap optimizations with big gains, such as TCE and inlining, and on producing code the JIT can optimize by itself.
