Tutorial ~ Example (Part 3) - miniboxing/ildl-plugin GitHub Wiki

To understand the warnings, we will examine how the Scala compiler transforms the code. Luckily, this is not very difficult, as we can output the intermediate representation after every compiler phase as source code. While some parts of the tree are verbose and difficult to follow, the specific one we are interested in is quite clear.
Before we see the compiler output, we can see the compiler phases:
```
$ ildl-scalac -Xshow-phases
         phase name  id  description
         ----------  --  -----------
             parser   1  parse source into ASTs, perform simple desugaring
    ildl-postparser   2
              namer   3  resolve names, attach symbols to named trees
     packageobjects   4  load package objects
              typer   5  the meat and potatoes: type the trees
        ildl-inject   6
             patmat   7  translate match expressions
     superaccessors   8  add super accessors in traits and nested classes
         extmethods   9  add extension methods for inline classes
            pickler  10  serialize symbol tables
          refchecks  11  reference/override checking, translate nested objects
            uncurry  12  uncurry, translate function values to anonymous classes
        ildl-bridge  13
        ildl-coerce  14
        ildl-commit  15
          tailcalls  16  replace tail calls by jumps
         specialize  17  @specialized-driven class and method specialization
      explicitouter  18  this refs to outer pointers
            erasure  19  erase types, add interfaces for traits
        posterasure  20  clean up erased inline classes
    ildl-tweakera...  21
           lazyvals  22  allocate bitmaps, translate lazy vals into lazified defs
         lambdalift  23  move nested functions to top level
       constructors  24  move field definitions into constructors
            flatten  25  eliminate inner classes
              mixin  26  mixin composition
            cleanup  27  platform-specific cleanups, generate reflective calls
         delambdafy  28  remove lambdas
              icode  29  generate portable intermediate code
                jvm  30  generate JVM bytecode
           terminal  31  the last phase during a compilation run
```
The phases starting with `ildl` are the ones introduced by our plugin. Specifically, the interesting phase is `ildl-commit` (we will see a walkthrough of what each phase does later on). For now, we can see the output at the `ildl-commit` phase:
```
$ ildl-scalac example.scala -Xprint:ildl-commit
... [warnings]
[[syntax trees at end of ildl-commit]] // example.scala
package <empty> {
  object Test extends Object {
    ... [lots of code]
    def main(args: Array[String]): Unit = {
      val x1: Long = Test.this.IntPairAsComplex.toRepr(new (Int, Int)(3, 5));
      val x2: Long = Test.this.IntPairAsComplex.toRepr(new (Int, Int)(2, 8));
      scala.this.Predef.println(Test.this.IntPairAsComplex(Test.this.IntPairAsComplex.toHigh(x1)).+(Test.this.IntPairAsComplex.toHigh(x2)));
      scala.this.Predef.println(Test.this.IntPairAsComplex(Test.this.IntPairAsComplex.toHigh(x1)).*(Test.this.IntPairAsComplex.toHigh(x2)))
    }
  }
}
```
This is what the `main` method looks like after the `ildl` transformation. We can simplify it by hiding the qualified names:
```scala
def main(args: Array[String]): Unit = {
  val x1: Long = IntPairAsComplex.toRepr(new (Int, Int)(3, 5));
  val x2: Long = IntPairAsComplex.toRepr(new (Int, Int)(2, 8));
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).+(IntPairAsComplex.toHigh(x2)));
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).*(IntPairAsComplex.toHigh(x2)))
}
```
In the code above, there are two suboptimal parts:

- for the `new` operator, the new tuple being created is immediately converted to a long integer value, creating useless heap garbage that needs to be collected later;
- for the `+` and `*` operations, there are even more operations.

Let's take one of the statements and see exactly what it is doing:
```
println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).+(IntPairAsComplex.toHigh(x2)));
                         \_________________________/    \_________________________/
                          x1 converted to (Int,Int)      x2 converted to (Int,Int)
        \___________________________________________/
          x1 converted to (Int, Int) and later to
          IntPairAsComplex (which has the + operation
          we defined for Complex numbers before)
        \___________________________________________________________________/
         x1 + x2 using the + operation defined in the implicit IntPairAsComplex class
```
So, values are first converted to tuples, then `x1` is converted from a tuple to the implicit class `IntPairAsComplex`, which has the `+` operation. Finally, the `+` operation occurs between the two tuples, returning a tuple, corresponding to complex number addition.
Overall, the transformed code allocates four redundant tuples:

- two for the `new` operator and
- two for the `+` and `*` operations
- the implicit value class `IntPairAsComplex` is removed later by the Scala compiler, in the `posterasure` phase.
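To make the overhead concrete, here is a plain-Scala sketch (without the plugin; `toRepr` and `toHigh` below are stand-ins mirroring the transformation's conversions) that performs the same steps as the transformed `main`, with the four redundant tuple allocations marked:

```scala
object AllocationSketch {
  // plain-Scala stand-ins for the plugin-generated conversions:
  def toRepr(p: (Int, Int)): Long = (p._1.toLong << 32) | (p._2.toLong & 0xFFFFFFFFL)
  def toHigh(l: Long): (Int, Int) = ((l >>> 32).toInt, (l & 0xFFFFFFFFL).toInt)

  def main(args: Array[String]): Unit = {
    val x1: Long = toRepr((3, 5))   // tuple 1: allocated, packed, discarded
    val x2: Long = toRepr((2, 8))   // tuple 2: allocated, packed, discarded
    // the + operation decodes both longs back into tuples first:
    val a = toHigh(x1)              // tuple 3
    val b = toHigh(x2)              // tuple 4
    println((a._1 + b._1, a._2 + b._2))
  }
}
```

Each conversion crossing the `Long`/tuple boundary costs an allocation; eliminating the conversions is exactly what the warnings are about.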
Ideally, we should avoid all allocations, and the warnings are telling us exactly how to do this:

```
$ ildl-scalac example.scala
example.scala:24: warning: The new operator can be optimized if you define a public,
non-overloaded and matching constructor method for it in object IntPairAsComplex,
with the name ctor_Tuple2:
      val x1: Complex = (3, 5)
                        ^
example.scala:25: warning: The new operator can be optimized if you define a public,
non-overloaded and matching constructor method for it in object IntPairAsComplex,
with the name ctor_Tuple2:
      val x2: Complex = (2, 8)
                        ^
example.scala:26: warning: The method + can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_+:
      println(x1 + x2)
              ^
example.scala:27: warning: The method * can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_*:
      println(x1 * x2)
              ^
four warnings found
```
Let us now try to optimize the constructor. As suggested, we create a `ctor_Tuple2` method in the `IntPairAsComplex` transformation. We won't go into explaining how to create this method, as this is done later in the tutorial, but the simple idea is that we have to match the tuple constructor arguments, namely, two integers. To implement `ctor_Tuple2`, we refactor the transformation object a bit (the constructor is at the bottom):
```scala
import ildl._

object IntPairAsComplex extends TransformationDescription {

  // bitwise conversions:
  private def real(l: Long @high): Int = (l >>> 32).toInt
  private def imag(l: Long @high): Int = (l & 0xFFFFFFFFL).toInt
  private def pack(re: Int, im: Int): Long @high =
    (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)

  // conversions:
  def toRepr(p: (Int, Int)): Long @high = pack(p._1, p._2)
  def toHigh(l: Long @high): (Int, Int) = (real(l), imag(l))

  // constructor:
  def ctor_Tuple2(_1: Int, _2: Int): Long @high = pack(_1, _2)
}
```
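To see why the packing works, here is a plain-Scala sketch of the same bit manipulation (dropping the `@high` annotation, which requires the plugin); it also shows why the `& 0xFFFFFFFFL` mask matters for negative components:

```scala
object PackSketch {
  // plain-Scala version of pack/real/imag (no @high annotation):
  def pack(re: Int, im: Int): Long = (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)
  def real(l: Long): Int = (l >>> 32).toInt
  def imag(l: Long): Int = (l & 0xFFFFFFFFL).toInt

  def main(args: Array[String]): Unit = {
    val c = pack(3, 5)             // 3 in the high 32 bits, 5 in the low 32 bits
    println(c)                     // 3 * 2^32 + 5 = 12884901893
    println((real(c), imag(c)))    // round-trips back to (3,5)
    // without the mask, a negative imaginary part would sign-extend its
    // `toLong` widening and overwrite the real part:
    val d = pack(3, -5)
    println((real(d), imag(d)))    // still (3,-5) thanks to the mask
  }
}
```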
Instead of returning a tuple, the constructor directly produces the equivalent long integer, skipping the tuple allocation. If we compile the example now, we get just two warnings:
```
$ ildl-scalac example.scala
example.scala:32: warning: The method + can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_+:
      println(x1 + x2)
              ^
example.scala:33: warning: The method * can be optimized if you define a public,
non-overloaded and matching exension method for it in object IntPairAsComplex,
with the name implicit_IntPairAsComplex_*:
      println(x1 * x2)
              ^
two warnings found
```
And, as you can probably imagine, the constructors have been transformed:
```scala
def main(args: Array[String]): Unit = {
  val x1: Long = IntPairAsComplex.ctor_Tuple2(3, 5);
  val x2: Long = IntPairAsComplex.ctor_Tuple2(2, 8);
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).+(IntPairAsComplex.toHigh(x2)));
  println(IntPairAsComplex(IntPairAsComplex.toHigh(x1)).*(IntPairAsComplex.toHigh(x2)))
}
```
This got rid of two of the tuple allocations. We can now implement the operations, by adding them to the `IntPairAsComplex` transformation description object as well:
```scala
// operations:
def implicit_IntPairAsComplex_+(c1: Long @high, c2: Long @high) =
  pack(real(c1) + real(c2), imag(c1) + imag(c2))

def implicit_IntPairAsComplex_*(c1: Long @high, c2: Long @high) =
  pack(real(c1) * real(c2) - imag(c1) * imag(c2),
       real(c1) * imag(c2) + imag(c1) * real(c2))
```
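The arithmetic can be checked in plain Scala (again without `@high`); the hypothetical `plus`/`times` methods below mirror the two extension methods and operate entirely on the packed encoding:

```scala
object OpsSketch {
  // plain-Scala stand-ins for the packing helpers:
  def pack(re: Int, im: Int): Long = (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)
  def real(l: Long): Int = (l >>> 32).toInt
  def imag(l: Long): Int = (l & 0xFFFFFFFFL).toInt

  // complex addition and multiplication, working directly on the Long encoding:
  def plus(c1: Long, c2: Long): Long =
    pack(real(c1) + real(c2), imag(c1) + imag(c2))
  def times(c1: Long, c2: Long): Long =
    pack(real(c1) * real(c2) - imag(c1) * imag(c2),
         real(c1) * imag(c2) + imag(c1) * real(c2))

  def main(args: Array[String]): Unit = {
    val x1 = pack(3, 5)  // 3 + 5i
    val x2 = pack(2, 8)  // 2 + 8i
    println((real(plus(x1, x2)), imag(plus(x1, x2))))    // (5,13)
    println((real(times(x1, x2)), imag(times(x1, x2))))  // (-34,34): 3*2-5*8 and 3*8+5*2
  }
}
```

No tuple is allocated anywhere between packing the inputs and reading out the result.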
Now if we compile the program, there are no more warnings (and no more redundant allocations):
```scala
def main(args: Array[String]): Unit = {
  val x1: Long = IntPairAsComplex.ctor_Tuple2(3, 5);
  val x2: Long = IntPairAsComplex.ctor_Tuple2(2, 8);
  println(IntPairAsComplex.toHigh(IntPairAsComplex.implicit_IntPairAsComplex_+(x1, x2)));
  println(IntPairAsComplex.toHigh(IntPairAsComplex.implicit_IntPairAsComplex_*(x1, x2)));
}
```
Let's look at the operation now:
println(IntPairAsComplex.toHigh(IntPairAsComplex.implicit_IntPairAsComplex_+(x1, x2)));
\__________________________________________/
our newly-defined + operation, equivalent
to the IntPairAsComplex.+
\__________________________________________________/
x1 + x2, in long integer format
\___________________________________________________________________________/
x1 + x2, in integer tuple format (so they can be printed)
Thus, if we follow the code, we notice it keeps the long integer encoding for as long as possible:

- the `x1` and `x2` values are created directly encoded as long integers by the `ctor_Tuple2` method;
- the `+` and `*` operations occur directly on the long integer encoding;
- only at the last step, before printing, are the long integers decoded into tuples, as otherwise the transformation would have been incorrect (this page explains the problem in detail).
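The last point can be illustrated in plain Scala (hypothetical `pack`/`toHigh` helpers mirroring the transformation's conversions): printing the raw `Long` would leak the representation, so the result must be decoded first:

```scala
object PrintSketch {
  // plain-Scala stand-ins for the transformation's conversions:
  def pack(re: Int, im: Int): Long = (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)
  def toHigh(l: Long): (Int, Int) = ((l >>> 32).toInt, (l & 0xFFFFFFFFL).toInt)

  def main(args: Array[String]): Unit = {
    val sum = pack(5, 13)
    // printing the raw encoding would expose the representation:
    println(sum)          // 21474836493
    // decoding first preserves the program's observable behavior:
    println(toHigh(sum))  // (5,13)
  }
}
```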
While these optimizations might not seem very significant, we will later see how exactly this transformation was able to speed up the complex number example by almost 13x.
By now, our transformation description has grown rather large:
```scala
// complex number support:
object Complex {

  // "define" type Complex based on integer pairs
  type Complex = (Int, Int)

  // add the addition and multiplication operations to complex numbers
  implicit class IntPairAsComplex(val p1: Complex) extends AnyVal {
    def +(p2: Complex): Complex = (p1._1 + p2._1, p1._2 + p2._2)
    def *(p2: Complex): Complex = (p1._1 * p2._1 - p1._2 * p2._2,
                                   p1._1 * p2._2 + p1._2 * p2._1)
    // we could define other operations here as well...
  }

  import ildl._

  // transform Complex into a long integer
  object IntPairAsComplex extends TransformationDescription {

    // bitwise conversions:
    private def real(l: Long @high): Int = (l >>> 32).toInt
    private def imag(l: Long @high): Int = (l & 0xFFFFFFFFL).toInt
    private def pack(re: Int, im: Int): Long @high =
      (re.toLong << 32) | (im.toLong & 0xFFFFFFFFL)

    // conversions:
    def toRepr(p: (Int, Int)): Long @high = pack(p._1, p._2)
    def toHigh(l: Long @high): (Int, Int) = (real(l), imag(l))

    // constructor:
    def ctor_Tuple2(_1: Int, _2: Int): Long @high = pack(_1, _2)

    // operations:
    def implicit_IntPairAsComplex_+(c1: Long @high, c2: Long @high) =
      pack(real(c1) + real(c2), imag(c1) + imag(c2))
    def implicit_IntPairAsComplex_*(c1: Long @high, c2: Long @high) =
      pack(real(c1) * real(c2) - imag(c1) * imag(c2),
           real(c1) * imag(c2) + imag(c1) * real(c2))

    def extension_toString(c: Long @high) =
      "(" + real(c) + "," + imag(c) + ")"
  }
}

object Test {
  import ildl._
  import Complex._

  adrt(IntPairAsComplex) {
    // test the output
    def main(args: Array[String]): Unit = {
      val x1: Complex = (3, 5)
      val x2: Complex = (2, 8)
      println(x1 + x2)
      println(x1 * x2)
    }
  }
}
```
In the next section we will be able to play with the transformation description we have painstakingly developed.
- continue with the example, part four
- return to the home page