Lecture07 - nus-cs2030/2324-s2 GitHub Wiki
In this lecture, we attempt to demonstrate the design pattern for building computation contexts by creating a Maybe context that mimics Java's Optional in handling missing values, and a Lazy context which facilitates lazy evaluation.
A computation context is a container type that wraps one or more values.
We have seen examples of computation contexts (or simply contexts) in Optional, and the ImList as a collection pipeline.
Programming with contexts requires a client to manipulate the value within the context by passing functionality to the context via higher order methods.
This is known as cross-barrier manipulation.
Now, the client no longer directly manipulates the value in an imperative way using state-changing instructions.
Rather, the client passes instructions as functionalities into the context, i.e. the client programs declaratively.
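To make the contrast concrete, here is a minimal sketch that uses java.util.Optional as the context. The class name CrossBarrierDemo and both method names are made up for illustration; only the Optional calls come from the standard library.

```java
import java.util.Optional;

// Hypothetical demo class: contrasts the two programming styles on the same task.
class CrossBarrierDemo {
    // Imperative style: the client pulls the value out of the context
    // and manipulates it directly.
    static int imperativeLength(Optional<String> name) {
        if (name.isPresent()) {
            String raw = name.get();
            return raw.trim().length();
        }
        return 0;
    }

    // Declarative style: the client passes functions across the barrier;
    // the value never leaves the context until the final orElse.
    static int declarativeLength(Optional<String> name) {
        return name.map(String::trim)
                   .map(String::length)
                   .orElse(0);
    }

    public static void main(String[] args) {
        System.out.println(declarativeLength(Optional.of("  abc ")));
        System.out.println(declarativeLength(Optional.<String>empty()));
    }
}
```

Both methods compute the same result; the second one never asks the context whether a value is present.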
A context must first provide some way to wrap a given value (or values) within the context.
This is done via static factory methods.
For the Maybe (or Optional) class, we have the of and empty methods:
class Maybe<T> { // new Maybe<Integer>(..) binds T (of class scope) to Integer
private final T value;
private Maybe(T value) {
this.value = value;
}
static <T> Maybe<T> of(T value) { // Maybe.<Integer>of(1) binds T (of method scope) to Integer
return new Maybe<T>(value);
}
static <T> Maybe<T> empty() {
return new Maybe<T>(null);
}
}
Notice that there is an explicit <T> declaration for each factory method.
This declaration is necessary to allow for bindings such as
Maybe.<Integer>of(1).
In this case, since no Maybe object is created before calling the factory method, the Integer type is bound to the generic declaration T in the method (not the class).
In contrast, calling new Maybe<Integer>(1) will bind Integer to the T
declared with class scope.
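The sketch below shows the method-scoped T being bound in three ways: through an explicit type witness, through inference from the argument, and through inference from the assignment target. It uses a cut-down copy of Maybe with a toString added purely so the results can be printed; TypeWitnessDemo is a hypothetical name.

```java
// Cut-down copy of Maybe, just enough to show how the factory methods bind T.
class Maybe<T> {
    private final T value;

    private Maybe(T value) {
        this.value = value;
    }

    static <T> Maybe<T> of(T value) {
        return new Maybe<T>(value);
    }

    static <T> Maybe<T> empty() {
        return new Maybe<T>(null);
    }

    @Override
    public String toString() {
        return this.value == null ? "Maybe.empty" : "Maybe[" + this.value + "]";
    }
}

// Hypothetical demo class.
class TypeWitnessDemo {
    static String run() {
        Maybe<Integer> explicit = Maybe.<Integer>of(1); // T given by the type witness
        Maybe<Integer> inferred = Maybe.of(1);          // T inferred from the argument
        Maybe<Number> widened = Maybe.<Number>of(1);    // witness picks a super-type of Integer
        Maybe<String> none = Maybe.empty();             // T inferred from the assignment target
        return explicit + " " + inferred + " " + widened + " " + none;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```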
Next we write some helper methods to assist with defining the rest of the Maybe
class.
private boolean isEmpty() {
return this.value == null;
}
private boolean isPresent() {
return !this.isEmpty();
}
private T get() {
return this.value;
}
Unlike Optional, we declare the above methods with the private modifier.
This will make sure that a client of Maybe does not look at or fetch the value
from the container in order to perform imperative operations on it.
Indeed, this is what many students try to do in Optional;
we desire to maintain the context for as long as possible and
at no time should we prematurely expose the value until the very end of the computation.
Let us go ahead and override the toString and equals methods:
@Override
public String toString() {
if (this.isEmpty()) {
return "Maybe.empty";
}
return "Maybe[" + this.get() + "]";
}
@Override
public boolean equals(Object obj) {
if (this == obj) { // trivial check
return true;
}
if (obj instanceof Maybe<?> other) {
return this.get().equals(other.get());
}
return false;
}
For the equals method, if it is not trivially true (i.e. same object), then we can just check if obj is an instanceof a Maybe that contains any type.
We can do this because we will rely on the equals method of
the values contained in the Maybe objects to perform the
eventual comparison.
The equals method above is incomplete. Since value can be null, this.get() may return null, and invoking equals on a null reference throws a NullPointerException; we need to guard against that case.
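One possible way to close that gap, offered as a sketch rather than the intended solution, is to delegate the comparison to java.util.Objects.equals, which treats two nulls as equal and never dereferences a null. The hashCode override is an addition not discussed in this lecture; it is included only because equals and hashCode are normally overridden together. MaybeEqualsDemo is a hypothetical name.

```java
import java.util.Objects;

// Cut-down copy of Maybe with a null-safe equals.
class Maybe<T> {
    private final T value;

    private Maybe(T value) {
        this.value = value;
    }

    static <T> Maybe<T> of(T value) {
        return new Maybe<T>(value);
    }

    static <T> Maybe<T> empty() {
        return new Maybe<T>(null);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) { // trivial check
            return true;
        }
        if (!(obj instanceof Maybe<?>)) {
            return false;
        }
        Maybe<?> other = (Maybe<?>) obj;
        // Objects.equals handles the cases where either value is null.
        return Objects.equals(this.value, other.value);
    }

    @Override
    public int hashCode() {
        // Assumption: kept consistent with equals; returns 0 for an empty Maybe.
        return Objects.hashCode(this.value);
    }
}

// Hypothetical demo class.
class MaybeEqualsDemo {
    public static void main(String[] args) {
        System.out.println(Maybe.empty().equals(Maybe.empty()));
        System.out.println(Maybe.of(1).equals(Maybe.empty()));
    }
}
```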
There are a number of higher order methods we can write.
Here we focus on the more interesting ones: map and flatMap.
map takes in a mapper function that transforms the type T value contained in Maybe<T> to another value of type R while retaining the Maybe context, but of type Maybe<R>.
<R> Maybe<R> map(Function<? super T, ? extends R> mapper) {
if (this.isEmpty()) {
return Maybe.<R>empty();
}
R r = mapper.apply(this.get());
return Maybe.<R>of(r);
}
We first check if the Maybe<T> is empty, in which case we
return an empty Maybe<R>.
Otherwise, we perform the mapping.
Notice that type T data flows into the mapper and type R data is expected from it and assigned to the variable r.
mapper can read the type T data as type T or as any super-type of T, hence ? super T.
Furthermore, since type R data is expected from mapper, it can produce data that is of type R or any sub-type of R, hence ? extends R.
This more general sub-typing of mapper allows for more use cases:
jshell> Function<Object, Integer> f = x -> x.hashCode()
f ==> $Lambda$..
jshell> Maybe<Number> mn = Maybe.<String>of("abc").map(f)
mn ==> Maybe[96354]
In the above, T is bound to String and R is bound to Number.
However, the function passed to map has input type Object and output type Integer.
What if the above function is defined as follows?
jshell> Function<Object, Maybe<Integer>> g = x -> Maybe.<Integer>of(x.hashCode())
g ==> $Lambda$..
Performing a map with this function g will result in Maybe[Maybe[...]].
jshell> Maybe.<String>of("abc").map(g)
$.. ==> Maybe[Maybe[96354]]
In order to flatten the context, we need to use flatMap.
<R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<? extends R>> mapper) {
if (this.isEmpty()) {
return Maybe.<R>empty();
}
Maybe<? extends R> mr = mapper.apply(this.get());
R r = mr.get();
return Maybe.<R>of(r);
}
Now we have
jshell> Maybe<Number> mn = Maybe.<String>of("abc").flatMap(g)
mn ==> Maybe[96354]
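The same difference between map and flatMap can be observed with java.util.Optional, whose two methods have the same shape. The sketch below chains two steps that may each fail; parse, safeDivide and the class name FlatMapDemo are hypothetical helpers invented for this illustration.

```java
import java.util.Optional;

// Hypothetical demo class: each helper returns its result inside a context.
class FlatMapDemo {
    // Produces an empty Optional when the text is not an integer.
    static Optional<Integer> parse(String s) {
        try {
            return Optional.of(Integer.parseInt(s.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Produces an empty Optional instead of dividing by zero.
    static Optional<Integer> safeDivide(int numerator, int denominator) {
        return denominator == 0
            ? Optional.empty()
            : Optional.of(numerator / denominator);
    }

    // flatMap keeps a single layer of context.
    static Optional<Integer> hundredOver(String s) {
        return parse(s).flatMap(d -> safeDivide(100, d));
    }

    // map wraps the second context inside the first.
    static Optional<Optional<Integer>> nested(String s) {
        return parse(s).map(d -> safeDivide(100, d));
    }

    public static void main(String[] args) {
        System.out.println(hundredOver("4"));
        System.out.println(nested("4"));
    }
}
```

With flatMap, a failure at either step collapses to a single empty context, so the client does not need to peel off layers.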
In the definition of flatMap, you may be tempted to chain the three statements together, as follows:
<R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<? extends R>> mapper) {
if (this.isEmpty()) {
return Maybe.<R>empty();
}
return Maybe.<R>of(mapper.apply(this.get()).get());
}
However, this will result in a compilation error.
The key observation to make here is that get is a private method defined in Maybe, while the static type of mapper.apply(this.get()) is some unknown subtype of Maybe<? extends R> (i.e. ? extends Maybe<? extends R>). Private members are not inherited, so the compiler cannot resolve get on that unknown subtype. Hence, the result of mapper.apply(this.get()) first has to be assigned to a variable of type Maybe<? extends R>, on which get can be called.
Some of you may wonder: since the mapping function already produces a Maybe, why
not return that Maybe directly, rather than taking out the value and re-wrapping it?
Indeed this can be done, but only if we make use of a stricter output type:
<R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<R>> mapper) {
if (this.isEmpty()) {
return Maybe.<R>empty();
}
Maybe<R> mr = mapper.apply(this.get());
return mr;
}
If we want to further disallow sub-classes of Maybe as the function output type, we can simplify the signature to:
<R> Maybe<R> flatMap(Function<? super T, Maybe<R>> mapper)
Eventually, after all processing is done, we would like to obtain the result stored in a Maybe.
At this point of time, we are not sure if Maybe contains a value or otherwise.
If a value is present, Maybe will return us the expected value.
However in the absence of a value, we need it to return us some default value.
We provide two variants of such a method.
T orElse(T other) {
if (this.isEmpty()) {
return other;
}
return this.get();
}
T orElseGet(Supplier<? extends T> supplier) {
if (this.isEmpty()) {
return supplier.get();
}
return this.get();
}
At first glance, the two methods above look similar in behaviour, with orElseGet merely wrapping the default value inside a Supplier.
Consider the following situation:
jshell> int foo() {
...> System.out.println("foo evaluated");
...> return -1;
...> }
| created method foo()
jshell> Maybe.<Integer>empty().orElse(foo())
foo evaluated
$.. ==> -1
jshell> Maybe.<Integer>of(1).orElse(foo())
foo evaluated
$.. ==> 1
Notice that both situations result in foo() being
evaluated, although the last test case does not really need
the value.
Let us now use the orElseGet method instead by first wrapping
foo in a Supplier,
jshell> Maybe.<Integer>empty().orElseGet(() -> foo())
foo evaluated
$.. ==> -1
jshell> Maybe.<Integer>of(1).orElseGet(() -> foo())
$.. ==> 1
then foo() is not evaluated when Maybe contains a value.
By passing foo() directly into the orElse method, foo()
has to be eagerly (or strictly) evaluated before its
return value is passed to the orElse method. In contrast, by
wrapping foo() in a Supplier, only the Supplier object
created is passed to orElseGet and foo() is only evaluated when
Maybe is empty and the get method of the Supplier is
called. This delayed evaluation is known as lazy evaluation.
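The same behaviour can be checked programmatically against java.util.Optional, which offers orElse and orElseGet with the same meaning. In this sketch the counter, the fallback method and the class name EagerVsLazyDemo are hypothetical; the counter simply records how many times the default value was computed.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: counts evaluations of the default value.
class EagerVsLazyDemo {
    static final AtomicInteger calls = new AtomicInteger();

    // Stands in for foo(): records each evaluation.
    static int fallback() {
        calls.incrementAndGet();
        return -1;
    }

    // The argument is evaluated before orElse is even entered.
    static int callsWithOrElse() {
        calls.set(0);
        Optional.of(1).orElse(fallback());
        return calls.get();
    }

    // Only the Supplier is passed; it is never invoked because a value is present.
    static int callsWithOrElseGet() {
        calls.set(0);
        Optional.of(1).orElseGet(() -> fallback());
        return calls.get();
    }

    // With an empty context the Supplier is invoked exactly once.
    static int callsWhenEmpty() {
        calls.set(0);
        Optional.<Integer>empty().orElseGet(() -> fallback());
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(callsWithOrElse());
        System.out.println(callsWithOrElseGet());
        System.out.println(callsWhenEmpty());
    }
}
```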
Now consider the following:
jshell> Supplier<Integer> supp = () -> foo()
supp ==> $Lambda$..
jshell> Maybe.<Integer>of(1).orElseGet(supp)
$.. ==> 1
jshell> Maybe.<Integer>empty().orElseGet(supp)
foo evaluated
$.. ==> -1
jshell> Maybe.<Integer>empty().orElseGet(supp)
foo evaluated
$.. ==> -1
Notice that the last two test cases each evaluate foo() and give the same outcome.
In the absence of side-effects (i.e. foo() will always be evaluated to
the same value), there is really no need to re-evaluate foo() again after the first
evaluation since the outcome should remain the same.
This leads us to create our next context — the Lazy context.
We can view the Lazy context as an improved form of a Supplier that performs
caching of the result of the first evaluation such that subsequent evaluations
need only require this cached value to be returned.
Upon creating the Lazy object, the cache should be empty; it is only assigned
with a value after the first invocation of the get method.
This suggests that the cache should be declared as
Optional<T>.
class Lazy<T> implements Supplier<T> {
private final Supplier<? extends T> supplier;
private Optional<T> cache;
private Lazy(Supplier<? extends T> supplier, Optional<T> cache) {
this.supplier = supplier;
this.cache = cache;
}
static <T> Lazy<T> of(Supplier<T> supplier) {
return new Lazy<T>(supplier, Optional.<T>empty());
}
static <T> Lazy<T> of(T value) {
return new Lazy<T>(() -> value, Optional.<T>of(value));
}
}
We provide two overloaded static factory methods, one that takes a Supplier and the other that takes a value. The former sets cache to be empty, while the latter wraps the value in the cache.
For the get method, we first check if the cache is empty.
If the cache contains a value, then return that value.
Otherwise, we invoke supplier.get(), cache this value and then return the value. We do this using the orElseGet method of Optional.
public T get() {
return this.cache.orElseGet(() -> {
T v = this.supplier.get();
this.cache = Optional.<T>of(v);
return v;
});
}
jshell> Lazy<Integer> lazy = Lazy.<Integer>of(() -> foo())
lazy ==> Lazy@..
jshell> lazy.get()
foo evaluated
$.. ==> -1
jshell> lazy.get()
$.. ==> -1
Note that in the last test case, the foo method no longer needs to be evaluated.
You will also realize that in order for cache to be assigned
after the first evaluation, it cannot be declared final.
Even though our Lazy class is no longer immutable, the
client still perceives Lazy as immutable since every get
returns the same value. We say that Lazy is observably
immutable.
Let us include the map method as follows:
<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
R r = mapper.apply(this.get());
return Lazy.<R>of(r);
}
Even though the above compiles, mapper.apply(this.get()) is evaluated eagerly, inside map itself, before the client has called get() on the resulting Lazy.
jshell> Lazy<Integer> lazy = Lazy.<Integer>of(() -> foo())
lazy ==> Lazy@..
jshell> lazy.map(x -> x + 1)
foo evaluated
$.. ==> Lazy@..
To delay the evaluation, redefine map as
<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
Supplier<R> supplier = () -> mapper.apply(this.get());
return Lazy.<R>of(supplier);
}
jshell> Lazy<Integer> lazy = Lazy.<Integer>of(() -> foo())
lazy ==> Lazy@..
jshell> lazy.map(x -> x + 1)
$.. ==> Lazy@..
jshell> lazy.map(x -> x + 1).map(x -> x * 2)
$.. ==> Lazy@..
jshell> lazy.map(x -> x + 1).map(x -> x * 2).get()
foo evaluated
$.. ==> 0
jshell> lazy.map(x -> x + 1).map(x -> x * 2).get()
$.. ==> 0
From the above, it is evident that the foo method is not
evaluated until the first get method is invoked, no matter
how many map calls precede it.
Also, subsequent invocations of the get method will result in
the cached value being returned.
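The pieces of Lazy can be assembled into one runnable sketch that counts how often the underlying computation runs. This is a simplified variant under stated assumptions: it has a single factory method and a one-argument constructor, and the names LazyDemo, calls and expensive are hypothetical stand-ins (expensive plays the role of foo but returns 20, so that each step of the pipeline gives a distinct number).

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

// Simplified Lazy: one factory method, cache starts empty.
class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private Optional<T> cache;

    private Lazy(Supplier<? extends T> supplier) {
        this.supplier = supplier;
        this.cache = Optional.empty();
    }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<T>(supplier);
    }

    @Override
    public T get() {
        // Return the cached value, or compute, cache and return it.
        return this.cache.orElseGet(() -> {
            T v = this.supplier.get();
            this.cache = Optional.of(v);
            return v;
        });
    }

    <R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
        // Nothing is evaluated here; the work is deferred into a new Supplier.
        return Lazy.<R>of(() -> mapper.apply(this.get()));
    }
}

// Hypothetical demo class.
class LazyDemo {
    static final AtomicInteger calls = new AtomicInteger();

    static int expensive() {
        calls.incrementAndGet();
        return 20;
    }

    static int[] run() {
        calls.set(0);
        Lazy<Integer> lazy = Lazy.<Integer>of(() -> expensive());
        Lazy<Integer> mapped = lazy.map(x -> x + 1).map(x -> x * 2);
        int before = calls.get();               // no evaluation yet
        int first = mapped.get();               // triggers expensive() once
        int second = mapped.get();              // served from the cache
        int other = lazy.map(x -> x + 1).get(); // new Lazy, but the source is cached
        return new int[] {before, first, second, other, calls.get()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

Note that every map call creates a fresh Lazy with its own empty cache; what avoids re-running expensive() is the cache of the original lazy object at the start of the chain.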
Let us study the lambda expression that was used in the map method,
<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
Supplier<R> supplier = () -> mapper.apply(this.get());
return Lazy.<R>of(supplier);
}
The lambda expression makes use of mapper and this.get() where mapper is the parameter (a local variable) of the map method, and this refers to the Lazy object where the map method was called.
The lambda expression behaves much like a local class (i.e. a class defined in an enclosing method of an enclosing class), even though it is not literally compiled into one. Like a local class, it captures (or makes a copy of) variables in the enclosing method, and also the reference to the enclosing class (more precisely, the object that is an instance of the enclosing class). We say that the lambda closes over the enclosing method and the enclosing class. This is the lambda closure.
In addition,
- variables of the enclosing method that are captured cannot be modified in the method; they need to be final or effectively final.
- reference to the enclosing class is captured using a qualified this, e.g. Lazy.this
<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
Supplier<R> supplier = () -> mapper.apply(Lazy.this.get());
return Lazy.<R>of(supplier);
}
The captures are necessary because one may create a lambda to be returned from the method. For example,
jshell> class A {
...> private final int a;
...> A (int a) {
...> this.a = a;
...> }
...> Function<Integer,Integer> foo(int b) {
...> int c = 10;
...> return d -> a + b + c + d;
...> }
...> }
| created class A
jshell> new A(1).foo(2).apply(3)
$.. ==> 16
Specifically, new A(1).foo(2) will return a
Function<Integer,Integer>. This lambda expression is a
local class created in the foo method of class A.
Hence the lambda captures variables b and c from the foo
method and also A.this which refers to the new A(1)
object.
More importantly, as the foo method returns with the Function, the local variables
b and c no longer persist in (stack) memory.
Thankfully, due to the
capture, the Function still has a copy of these values.
We can then invoke the method apply(3) on the Function
which gives 16.
That is, a + b + c + d or more precisely this.a + b + c + d
(even more precisely A.this.a + b + c + d) evaluates to 1 + 2 + 10 + 3 = 16.
Try to reassign the values of b and c anywhere inside the
foo method and you will notice that it violates the
effectively final property.
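The two kinds of capture behave differently once the enclosing object changes, which the following sketch tries to show. The class Account, its methods and ClosureDemo are hypothetical; unlike class A above, the field here is deliberately not final so that the effect is visible.

```java
import java.util.function.Supplier;

// Hypothetical class with mutable state, used only to expose how capture works.
class Account {
    private int balance; // reached through the captured Account.this

    Account(int balance) {
        this.balance = balance;
    }

    void deposit(int amount) {
        this.balance += amount;
    }

    Supplier<Integer> projected(int bonus) {
        // bonus is effectively final: its value is copied into the lambda.
        // balance is not copied; it is read through the captured reference
        // to this Account each time the Supplier is invoked.
        return () -> this.balance + bonus;
    }
}

// Hypothetical demo class.
class ClosureDemo {
    static int[] run() {
        Account account = new Account(100);
        Supplier<Integer> view = account.projected(5);
        int before = view.get(); // 100 + 5
        account.deposit(50);     // changes the object the lambda refers to
        int after = view.get();  // 150 + 5
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```

The copied local (bonus) stays fixed for the lifetime of the lambda, whereas the field is looked up afresh on each call because only the reference to the enclosing object was captured.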
So far we have looked at the lambda expression as a local
class.
An anonymous inner class is also a local class, and thus we can define the map method as follows:
<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
Supplier<R> supplier = new Supplier<R>() {
public R get() {
return mapper.apply(Lazy.this.get());
}
};
return Lazy.<R>of(supplier);
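}
In this case, replacing Lazy.this.get() with just this.get() (or get()) has different connotations.
While Lazy.this.get() invokes the get() method of the enclosing Lazy object, the latter refers to the get() method of the anonymous Supplier itself, i.e. the method would be calling itself. In this particular map the self-call does not even type-check, since it yields an R where mapper expects a T; where the types do line up (for instance when T and R are the same type), the call recurses without terminating.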
Try it out!
In a nutshell, defining a lambda expression is deemed safer as you will not need to worry about invoking its single abstract method. If you use anonymous inner classes, the explicit definition of the single abstract method exposes the method for possible invocation or misuse.
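The different meaning of this in the two forms can be observed directly with a small sketch. The class Outer and its three methods are hypothetical; each returns a Supplier that reports which object its this (or qualified this) refers to.

```java
import java.util.function.Supplier;

// Hypothetical class: each method builds a Supplier that describes "this".
class Outer {
    @Override
    public String toString() {
        return "Outer";
    }

    // In a lambda, this is the enclosing Outer object.
    Supplier<String> viaLambda() {
        return () -> this.toString();
    }

    // In an anonymous inner class, this is the anonymous Supplier itself.
    Supplier<String> viaAnonymous() {
        return new Supplier<String>() {
            @Override
            public String get() {
                return this.toString();
            }

            @Override
            public String toString() {
                return "AnonymousSupplier";
            }
        };
    }

    // A qualified this reaches the enclosing Outer object again.
    Supplier<String> viaQualifiedThis() {
        return new Supplier<String>() {
            @Override
            public String get() {
                return Outer.this.toString();
            }

            @Override
            public String toString() {
                return "AnonymousSupplier";
            }
        };
    }

    public static void main(String[] args) {
        Outer outer = new Outer();
        System.out.println(outer.viaLambda().get());
        System.out.println(outer.viaAnonymous().get());
        System.out.println(outer.viaQualifiedThis().get());
    }
}
```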