Tutorial ~ Example (Part 3) - miniboxing/ildl-plugin GitHub Wiki

To understand the warnings, we will examine how the Scala compiler transforms the code. Luckily, this is not very difficult, as we can output the intermediate representation after every compiler phase as source code. While some parts of the tree are verbose and difficult to follow, the specific one we are interested in is quite clear.
Before we see the compiler output, we can see the compiler phases:
```
$ ildl-scalac -Xshow-phases
         phase name  id  description
         ----------  --  -----------
             parser   1  parse source into ASTs, perform simple desugaring
    ildl-postparser   2
              namer   3  resolve names, attach symbols to named trees
     packageobjects   4  load package objects
              typer   5  the meat and potatoes: type the trees
        ildl-inject   6
             patmat   7  translate match expressions
     superaccessors   8  add super accessors in traits and nested classes
         extmethods   9  add extension methods for inline classes
            pickler  10  serialize symbol tables
          refchecks  11  reference/override checking, translate nested objects
            uncurry  12  uncurry, translate function values to anonymous classes
        ildl-bridge  13
        ildl-coerce  14
        ildl-commit  15
          tailcalls  16  replace tail calls by jumps
         specialize  17  @specialized-driven class and method specialization
      explicitouter  18  this refs to outer pointers
            erasure  19  erase types, add interfaces for traits
        posterasure  20  clean up erased inline classes
    ildl-tweakera...  21
           lazyvals  22  allocate bitmaps, translate lazy vals into lazified defs
         lambdalift  23  move nested functions to top level
       constructors  24  move field definitions into constructors
            flatten  25  eliminate inner classes
              mixin  26  mixin composition
            cleanup  27  platform-specific cleanups, generate reflective calls
         delambdafy  28  remove lambdas
              icode  29  generate portable intermediate code
                jvm  30  generate JVM bytecode
           terminal  31  the last phase during a compilation run
```
The phases starting with `ildl` are the ones introduced by our plugin. Specifically, the interesting phase is `ildl-commit` (we will see a walkthrough of what each phase does later on). For now, we can see the output at the `ildl-commit` phase:
```
$ ildl-scalac example.scala -Xprint:ildl-commit
... [warnings]
[[syntax trees at end of ildl-commit]] // example.scala
package <empty> {
  object Test extends Object {
    ... [lots of code]
    def main(args: Array[String]): Unit = {
      val x1: Long = Test.this.IntPairAsComplex.toRepr(new (Int, Int)(3, 5));
      val x2: Long = Test.this.IntPairAsComplex.toRepr(new (Int, Int)(2, 8));
      scala.this.Predef.println(Test.this.IntPairAsComplex(Test.this.IntPairAsComplex.toHigh(x1)).+(Test.this.IntPairAsComplex.toHigh(x2)));
      scala.this.Predef.println(Test.this.IntPairAsComplex(Test.this.IntPairAsComplex.toHigh(x1)).*(Test.this.IntPairAsComplex.toHigh(x2)))
    }
  }
}
```
This is what the `main` method looks like after the `ildl` transformation. We can simplify it by hiding the qualified names:
```scala
def main(args: Array[String]): Unit = {
  val x1: Long = IntPairAsComplex.toRepr(new (Int, Int)(3, 5));
  val x2: Long = IntPairAsComplex.toRepr(new (Int, Int)(2, 8));
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).+(IntPairAsComplex.toHigh(x2)));
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).*(IntPairAsComplex.toHigh(x2)))
}
```
In the code above, there are two suboptimal parts:

- for the `new` operator, the new tuple being created is immediately converted to a long integer value, creating useless heap garbage that needs to be collected later;
- for the `+` and `*` operations, there are even more operations.

Let's take one of the statements and see exactly what it is doing:
```
println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).+(IntPairAsComplex.toHigh(x2)));
                         \_________________________/    \_________________________/
                          x1 converted to (Int,Int)      x2 converted to (Int,Int)
        \___________________________________________/
          x1 converted to (Int, Int) and later to
          IntPairAsComplex (which has the + operation
          we defined for Complex numbers before)
        \___________________________________________________________________/
         x1 + x2 using the + operation defined in the implicit IntPairAsComplex class
```
So, values are first converted to tuples, then `x1` is converted from a tuple to the implicit class `IntPairAsComplex`, which has the `+` operation. Finally, the `+` operation occurs between the two tuples, returning a tuple, corresponding to complex number addition.
Overall, the transformed code allocates four redundant tuples:

- two for the `new` operator and
- two for the `+` and `*` operations
- the implicit value class `IntPairAsComplex` is removed later by the Scala compiler, in the `posterasure` phase.
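To make the overhead concrete, here is a plain-Scala sketch (without the plugin; `toRepr` and `toHigh` below are stand-ins mirroring the transformation's conversions) that performs the same steps as the transformed `main`, with the four redundant tuple allocations marked:

```scala
object AllocationSketch {
  // plain-Scala stand-ins for the plugin-generated conversions:
  def toRepr(p: (Int, Int)): Long = (p._1.toLong << 32) | (p._2.toLong & 0xFFFFFFFFL)
  def toHigh(l: Long): (Int, Int) = ((l >>> 32).toInt, (l & 0xFFFFFFFFL).toInt)

  def main(args: Array[String]): Unit = {
    val x1: Long = toRepr((3, 5))   // tuple 1: allocated, packed, discarded
    val x2: Long = toRepr((2, 8))   // tuple 2: allocated, packed, discarded
    // the + operation decodes both longs back into tuples first:
    val a = toHigh(x1)              // tuple 3
    val b = toHigh(x2)              // tuple 4
    println((a._1 + b._1, a._2 + b._2))
  }
}
```

Each conversion crossing the `Long`/tuple boundary costs an allocation; eliminating the conversions is exactly what the warnings are about.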
Ideally, we should avoid all allocations, and the warnings are telling us exactly how to do this:

```
$ ildl-scalac example.scala
example.scala:24: warning: The new operator can be optimized if you define a public,
non-overloaded and matching constructor method for it in object IntPairAsComplex,
with the name ctor_Tuple2:
      val x1: Complex = (3, 5)
                        ^
example.scala:25: warning: The new operator can be optimized if you define a public,
non-overloaded and matching constructor method for it in object IntPairAsComplex,
with the name ctor_Tuple2:
      val x2: Complex = (2, 8)
                        ^
example.scala:26: warning: The method + can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_+:
      println(x1 + x2)
              ^
example.scala:27: warning: The method * can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_*:
      println(x1 * x2)
              ^
four warnings found
```
Let us now try to optimize the constructor. As suggested, we create a `ctor_Tuple2` method in the `IntPairAsComplex` transformation. We won't go into explaining how to create this method, as this is done later in the tutorial, but the simple idea is that we have to match the tuple constructor arguments, namely, two integers. To implement `ctor_Tuple2`, we refactor the transformation object a bit (the constructor is at the bottom):
```scala
import ildl._

object IntPairAsComplex extends TransformationDescription {

  // bitwise conversions:
  private def real(l: Long @high): Int = (l >>> 32).toInt
  private def imag(l: Long @high): Int = (l & 0xFFFFFFFFL).toInt
  private def pack(re: Int, im: Int): Long @high =
    (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)

  // conversions:
  def toRepr(p: (Int, Int)): Long @high = pack(p._1, p._2)
  def toHigh(l: Long @high): (Int, Int) = (real(l), imag(l))

  // constructor:
  def ctor_Tuple2(_1: Int, _2: Int): Long @high = pack(_1, _2)
}
```
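To see why the packing works, here is a plain-Scala sketch of the same bit manipulation (dropping the `@high` annotation, which requires the plugin); it also shows why the `& 0xFFFFFFFFL` mask matters for negative components:

```scala
object PackSketch {
  // plain-Scala version of pack/real/imag (no @high annotation):
  def pack(re: Int, im: Int): Long = (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)
  def real(l: Long): Int = (l >>> 32).toInt
  def imag(l: Long): Int = (l & 0xFFFFFFFFL).toInt

  def main(args: Array[String]): Unit = {
    val c = pack(3, 5)             // 3 in the high 32 bits, 5 in the low 32 bits
    println(c)                     // 3 * 2^32 + 5 = 12884901893
    println((real(c), imag(c)))    // round-trips back to (3,5)
    // without the mask, a negative imaginary part would sign-extend its
    // `toLong` widening and overwrite the real part:
    val d = pack(3, -5)
    println((real(d), imag(d)))    // still (3,-5) thanks to the mask
  }
}
```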
Instead of returning a tuple, the constructor directly produces the equivalent long integer, skipping the tuple allocation. If we compile the example now, we get just two warnings:
```
$ ildl-scalac example.scala
example.scala:32: warning: The method + can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_+:
      println(x1 + x2)
              ^
example.scala:33: warning: The method * can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_*:
      println(x1 * x2)
              ^
two warnings found
```
And, as you can probably imagine, the constructors have been transformed:
```scala
def main(args: Array[String]): Unit = {
  val x1: Long = IntPairAsComplex.ctor_Tuple2(3, 5);
  val x2: Long = IntPairAsComplex.ctor_Tuple2(2, 8);
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).+(IntPairAsComplex.toHigh(x2)));
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).*(IntPairAsComplex.toHigh(x2)))
}
```
This got rid of two of the tuple allocations. We can now implement the operations, by adding them to the `IntPairAsComplex` transformation description object as well:
```scala
// operations:
def implicit_IntPairAsComplex_+(c1: Long @high, c2: Long @high) =
  pack(real(c1) + real(c2), imag(c1) + imag(c2))

def implicit_IntPairAsComplex_*(c1: Long @high, c2: Long @high) =
  pack(real(c1) * real(c2) - imag(c1) * imag(c2),
       real(c1) * imag(c2) + imag(c1) * real(c2))
```
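The arithmetic can be checked in plain Scala (again without `@high`); the hypothetical `plus`/`times` methods below mirror the two extension methods and operate entirely on the packed encoding:

```scala
object OpsSketch {
  // plain-Scala stand-ins for the packing helpers:
  def pack(re: Int, im: Int): Long = (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)
  def real(l: Long): Int = (l >>> 32).toInt
  def imag(l: Long): Int = (l & 0xFFFFFFFFL).toInt

  // complex addition and multiplication, working directly on the Long encoding:
  def plus(c1: Long, c2: Long): Long =
    pack(real(c1) + real(c2), imag(c1) + imag(c2))
  def times(c1: Long, c2: Long): Long =
    pack(real(c1) * real(c2) - imag(c1) * imag(c2),
         real(c1) * imag(c2) + imag(c1) * real(c2))

  def main(args: Array[String]): Unit = {
    val x1 = pack(3, 5)  // 3 + 5i
    val x2 = pack(2, 8)  // 2 + 8i
    println((real(plus(x1, x2)), imag(plus(x1, x2))))    // (5,13)
    println((real(times(x1, x2)), imag(times(x1, x2))))  // (-34,34): 3*2-5*8 and 3*8+5*2
  }
}
```

No tuple is allocated anywhere between packing the inputs and reading out the result.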
Now if we compile the program, there are no more warnings (and no more redundant allocations):
```scala
def main(args: Array[String]): Unit = {
  val x1: Long = IntPairAsComplex.ctor_Tuple2(3, 5);
  val x2: Long = IntPairAsComplex.ctor_Tuple2(2, 8);
  println(IntPairAsComplex.toHigh(IntPairAsComplex.implicit_IntPairAsComplex_+(x1, x2)));
  println(IntPairAsComplex.toHigh(IntPairAsComplex.implicit_IntPairAsComplex_*(x1, x2)));
}
```
Let's look at the operation now:
println(IntPairAsComplex.toHigh(IntPairAsComplex.implicit_IntPairAsComplex_+(x1, x2)));
\__________________________________________/
our newly-defined + operation, equivalent
to the IntPairAsComplex.+
\__________________________________________________/
x1 + x2, in long integer format
\___________________________________________________________________________/
x1 + x2, in integer tuple format (so they can be printed)
Thus, if we follow the code, we notice it keeps the long integer encoding for as long as possible:

- the `x1` and `x2` values are created directly encoded as long integers by the `ctor_Tuple2` method;
- the `+` and `*` operations occur directly on the long integer encoding;
- only at the last step, before printing, are the long integers decoded into tuples, as otherwise the transformation would have been incorrect (this page explains the problem in detail).
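The last point can be illustrated in plain Scala (hypothetical `pack`/`toHigh` helpers mirroring the transformation's conversions): printing the raw `Long` would leak the representation, so the result must be decoded first:

```scala
object PrintSketch {
  // plain-Scala stand-ins for the transformation's conversions:
  def pack(re: Int, im: Int): Long = (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)
  def toHigh(l: Long): (Int, Int) = ((l >>> 32).toInt, (l & 0xFFFFFFFFL).toInt)

  def main(args: Array[String]): Unit = {
    val sum = pack(5, 13)
    // printing the raw encoding would expose the representation:
    println(sum)          // 21474836493
    // decoding first preserves the program's observable behavior:
    println(toHigh(sum))  // (5,13)
  }
}
```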
While these optimizations might not seem very significant, we will later see how exactly this transformation was able to speed up the complex number example by almost 13x.
By now, our transformation description has grown rather large:
```scala
// complex number support:
object Complex {

  // "define" type Complex based on integer pairs
  type Complex = (Int, Int)

  // add the addition and multiplication operations to complex numbers
  implicit class IntPairAsComplex(val p1: Complex) extends AnyVal {
    def +(p2: Complex): Complex = (p1._1 + p2._1, p1._2 + p2._2)
    def *(p2: Complex): Complex = (p1._1 * p2._1 - p1._2 * p2._2,
                                   p1._1 * p2._2 + p1._2 * p2._1)
    // we could define other operations here as well...
  }

  import ildl._

  // transform Complex into a long integer
  object IntPairAsComplex extends TransformationDescription {

    // bitwise conversions:
    private def real(l: Long @high): Int = (l >>> 32).toInt
    private def imag(l: Long @high): Int = (l & 0xFFFFFFFFL).toInt
    private def pack(re: Int, im: Int): Long @high =
      (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)

    // conversions:
    def toRepr(p: (Int, Int)): Long @high = pack(p._1, p._2)
    def toHigh(l: Long @high): (Int, Int) = (real(l), imag(l))

    // constructor:
    def ctor_Tuple2(_1: Int, _2: Int): Long @high = pack(_1, _2)

    // operations:
    def implicit_IntPairAsComplex_+(c1: Long @high, c2: Long @high) =
      pack(real(c1) + real(c2), imag(c1) + imag(c2))
    def implicit_IntPairAsComplex_*(c1: Long @high, c2: Long @high) =
      pack(real(c1) * real(c2) - imag(c1) * imag(c2),
           real(c1) * imag(c2) + imag(c1) * real(c2))

    def extension_toString(c: Long @high) =
      "(" + real(c) + "," + imag(c) + ")"
  }
}

object Test {
  import ildl._
  import Complex._

  adrt(IntPairAsComplex) {
    // test the output
    def main(args: Array[String]): Unit = {
      val x1: Complex = (3, 5)
      val x2: Complex = (2, 8)
      println(x1 + x2)
      println(x1 * x2)
    }
  }
}
```
In the next section we will be able to play with the transformation description we have painstakingly developed.
- continue with the example, part four
- return to the home page