Lecture07 - nus-cs2030/2324-s2 GitHub Wiki

Programming with Contexts

In this lecture, we attempt to demonstrate the design pattern for building computation contexts by creating a Maybe context that mimics Java's Optional in handling missing values, and a Lazy context which facilitates lazy evaluation.

Computation context

A computation context is a container type that wraps one or more values. We have seen examples of computation contexts (or simply contexts) in Optional, and in ImList as a collection pipeline. Programming with contexts requires a client to manipulate the value within the context by passing functionality to the context via higher-order methods. This is known as cross-barrier manipulation. The client no longer manipulates the value directly in an imperative way using state-changing instructions. Rather, the client passes instructions as functions into the context, i.e. the client programs declaratively.

Wrapping a value in a context

A context must first provide some way to wrap a given value (or values) within the context. This is done via static factory methods. For the Maybe (or Optional) class, we have the of and empty methods:

class Maybe<T> { // new Maybe<Integer>(..) binds T (of class scope) to Integer
    private final T value;

    private Maybe(T value) {
        this.value = value;
    }

    static <T> Maybe<T> of(T value) { // Maybe.<Integer>of(1) binds T (of method scope) to Integer
        return new Maybe<T>(value);
    }

    static <T> Maybe<T> empty() {
        return new Maybe<T>(null);
    }
}

Notice that there is an explicit <T> declaration for each factory method. This declaration is necessary to allow for bindings such as Maybe.<Integer>of(1). In this case, since no Maybe object is created before calling the factory method, the Integer type is bound to the generic declaration T of the method (not of the class). In contrast, calling new Maybe<Integer>(1) will bind Integer to the T declared with class scope.

Next we write some helper methods to assist with defining the rest of the Maybe class.

private boolean isEmpty() {
    return this.value == null;
}

private boolean isPresent() {
    return !this.isEmpty();
}

private T get() {
    return this.value;
}

Unlike Optional, we declare the above methods with the private modifier. This ensures that a client of Maybe cannot inspect or fetch the value from the container in order to perform imperative operations on it, which is what many students try to do with Optional. We want to maintain the context for as long as possible and never expose the value prematurely; it should only be released at the very end of the computation.

Let us go ahead and override the toString and equals methods:

@Override
public String toString() {
    if (this.isEmpty()) {
        return "Maybe.empty";
    }
    return "Maybe[" + this.get() + "]";
}

@Override
public boolean equals(Object obj) {
    if (this == obj) { // trivial check
        return true;
    }
    if (obj instanceof Maybe<?> other) {
        return this.get().equals(other.get());
    }
    return false;
}

For the equals method, if it is not trivially true (i.e. same object), then we can just check if obj is an instance of a Maybe that contains any type. We can do this because we will rely on the equals method of the values contained in the Maybe objects to perform the eventual comparison.

The equals method above is incomplete. Since value can be null, we need to check if we can invoke the get method at all.
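
One possible way to complete equals is sketched below. It is only a sketch under stated assumptions: it delegates the null checks to java.util.Objects.equals, and it also adds a matching hashCode, which the notes above do not define. The class is trimmed to the parts needed for the comparison.

```java
import java.util.Objects;

class Maybe<T> {
    private final T value;

    private Maybe(T value) {
        this.value = value;
    }

    static <T> Maybe<T> of(T value) {
        return new Maybe<T>(value);
    }

    static <T> Maybe<T> empty() {
        return new Maybe<T>(null);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) { // trivial check
            return true;
        }
        if (obj instanceof Maybe<?> other) {
            // Objects.equals is true when both values are null, false when
            // only one is null, and otherwise calls this.value.equals(other.value)
            return Objects.equals(this.value, other.value);
        }
        return false;
    }

    @Override
    public int hashCode() { // assumption: added to stay consistent with equals
        return Objects.hashCode(this.value);
    }
}
```

With this version, two empty Maybe objects compare equal, and comparing an empty Maybe with a non-empty one returns false instead of throwing a NullPointerException.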

Higher-order methods in Maybe

There are a number of higher order methods we can write. Here we focus on the more interesting ones: map and flatMap.

map takes in a mapper function that transforms the type T value contained in Maybe<T> to another value of type R while retaining the Maybe context, but of type Maybe<R>.

<R> Maybe<R> map(Function<? super T, ? extends R> mapper) {
    if (this.isEmpty()) {
        return Maybe.<R>empty();
    }
    R r = mapper.apply(this.get());
    return Maybe.<R>of(r);
}

We first check if the Maybe<T> is empty, in which case we return an empty Maybe<R>. Otherwise, we perform the mapping. Notice that type T data flows into the mapper and type R data is expected from it and assigned to the variable r. mapper can read the type T data as type T or as any super-type of T, hence ? super T. Furthermore, since type R data is expected from mapper, it can produce data that is of type R or any sub-type of R, hence ? extends R. This more general sub-typing of mapper allows for more use cases:

jshell> Function<Object, Integer> f = x -> x.hashCode()
f ==> $Lambda$..

jshell> Maybe<Number> mn = Maybe.<String>of("abc").map(f)
mn ==> Maybe[96354]

In the above, T is bound to String and R is bound to Number. However, the function passed to map has input type Object and output type Integer.

What if the above function is defined as follows?

jshell> Function<Object, Maybe<Integer>> g = x -> Maybe.<Integer>of(x.hashCode())
g ==> $Lambda$..

Performing a map with this function g will result in Maybe[Maybe[...]].

jshell> Maybe.<String>of("abc").map(g)
$.. ==> Maybe[Maybe[96354]]

In order to flatten the context, we need to use flatMap.

<R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<? extends R>> mapper) {
    if (this.isEmpty()) {
        return Maybe.<R>empty();
    }
    Maybe<? extends R> mr = mapper.apply(this.get());
    R r = mr.get();
    return Maybe.<R>of(r);
}

Now we have

jshell> Maybe<Number> mn = Maybe.<String>of("abc").flatMap(g)
mn ==> Maybe[96354]

In the definition of flatMap, you may be tempted to chain the three statements together, as follows:

<R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<? extends R>> mapper) {
    if (this.isEmpty()) {
        return Maybe.<R>empty();
    }
    return Maybe.<R>of(mapper.apply(this.get()).get());
}

However, this will result in a compilation error.

The key observation to make here is that get is a private method defined in Maybe, and mapper.apply(this.get()) returns a value whose type is some unknown subtype of Maybe (? extends Maybe<? extends R>). Private methods are not inherited, so we cannot guarantee being able to call get on a subtype of Maybe. Hence, the result of mapper.apply(this.get()) first has to be assigned to a variable of type Maybe, on which we can guarantee being able to call get.

Some of you may realize that since the mapping function already produces a Maybe, we could just return that Maybe directly, rather than taking out the value and re-wrapping it. Indeed this can be done, but only if we make use of a stricter output type:

<R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<R>> mapper) {
    if (this.isEmpty()) {
        return Maybe.<R>empty();
    }
    Maybe<R> mr = mapper.apply(this.get());
    return mr;
}

If we want to further disallow sub-classes of Maybe as the function output type, we can simplify the signature to:

<R> Maybe<R> flatMap(Function<? super T, Maybe<R>> mapper)
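
To see why flattening matters, consider chaining several steps that may each fail. The sketch below is illustrative only: the helpers parse and divide100By and the class FlatMapDemo are hypothetical names that do not appear elsewhere in these notes, and Maybe is trimmed to what the example needs.

```java
import java.util.function.Function;

class Maybe<T> {
    private final T value;

    private Maybe(T value) {
        this.value = value;
    }

    static <T> Maybe<T> of(T value) {
        return new Maybe<T>(value);
    }

    static <T> Maybe<T> empty() {
        return new Maybe<T>(null);
    }

    <R> Maybe<R> flatMap(Function<? super T, ? extends Maybe<? extends R>> mapper) {
        if (this.value == null) {
            return Maybe.<R>empty();
        }
        Maybe<? extends R> mr = mapper.apply(this.value);
        return Maybe.<R>of(mr.value); // re-wrap; a null value stays empty
    }

    @Override
    public String toString() {
        return this.value == null ? "Maybe.empty" : "Maybe[" + this.value + "]";
    }
}

class FlatMapDemo {
    // Hypothetical helper: empty when the text is not an integer
    static Maybe<Integer> parse(String s) {
        try {
            return Maybe.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Maybe.empty();
        }
    }

    // Hypothetical helper: empty when the divisor is zero
    static Maybe<Integer> divide100By(int d) {
        return d == 0 ? Maybe.empty() : Maybe.of(100 / d);
    }

    // Each step returns a Maybe, so flatMap keeps a single layer of context
    static String run(String input) {
        return Maybe.of(input)
            .flatMap(FlatMapDemo::parse)
            .flatMap(FlatMapDemo::divide100By)
            .toString();
    }
}
```

Here FlatMapDemo.run("5") gives Maybe[20], while run("0") and run("abc") both give Maybe.empty. Once any step produces an empty Maybe, the remaining steps are skipped.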

Terminating the context pipeline

Eventually, after all processing is done, we would like to obtain the result stored in a Maybe. At this point, we cannot be sure whether Maybe contains a value. If a value is present, Maybe returns that value. In the absence of a value, however, we need it to return some default value. We provide two variants of such a method.

T orElse(T other) {
    if (this.isEmpty()) {
        return other;
    }
    return this.get();
}

T orElseGet(Supplier<? extends T> supplier) {
    if (this.isEmpty()) {
        return supplier.get();
    }
    return this.get();
}

At first glance, the two methods above look similar in behaviour, except that orElseGet wraps the default value inside a Supplier. Consider the following situation:

jshell> int foo() {
   ...>     System.out.println("foo evaluated");
   ...>     return -1;
   ...> }
|  created method foo()

jshell> Maybe.<Integer>empty().orElse(foo())
foo evaluated
$.. ==> -1

jshell> Maybe.<Integer>of(1).orElse(foo())
foo evaluated
$.. ==> 1

Notice that both situations result in foo() being evaluated, although the last test case does not really need the value. Let us now use the orElseGet method instead by first wrapping foo in a Supplier,

jshell> Maybe.<Integer>empty().orElseGet(() -> foo())
foo evaluated
$.. ==> -1

jshell> Maybe.<Integer>of(1).orElseGet(() -> foo())
$.. ==> 1

then foo() is not evaluated when Maybe contains a value.

By passing foo() directly into the orElse method, foo() has to be eagerly (or strictly) evaluated before its return value is passed to orElse. In contrast, by wrapping foo() in a Supplier, only the Supplier object is passed to orElseGet, and foo() is evaluated only when Maybe is empty and the get method of the Supplier is called. This delayed evaluation is known as lazy evaluation.
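
The same difference can be checked by counting evaluations rather than printing. The sketch below uses Java's own Optional, whose orElse and orElseGet methods behave in the same way; the class EvalCounter and its members are hypothetical names introduced only for this illustration.

```java
import java.util.Optional;

class EvalCounter {
    static int calls = 0;

    // Stand-in for foo(): counts how many times it is evaluated
    static int foo() {
        calls++;
        return -1;
    }

    // Eager: foo() runs before orElse is entered, whether or not it is needed
    static int eager(Integer valueOrNull) {
        return Optional.ofNullable(valueOrNull).orElse(foo());
    }

    // Lazy: foo() runs only when the Optional is empty
    static int lazy(Integer valueOrNull) {
        return Optional.ofNullable(valueOrNull).orElseGet(() -> foo());
    }
}
```

Starting from calls == 0, EvalCounter.eager(1) returns 1 but raises calls to 1, a subsequent EvalCounter.lazy(1) returns 1 and leaves calls at 1, and EvalCounter.lazy(null) returns -1 and raises calls to 2.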

Now consider the following

jshell> Supplier<Integer> supp = () -> foo()
supp ==> $Lambda$..

jshell> Maybe.<Integer>of(1).orElseGet(supp)
$.. ==> 1

jshell> Maybe.<Integer>empty().orElseGet(supp)
foo evaluated
$.. ==> -1

jshell> Maybe.<Integer>empty().orElseGet(supp)
foo evaluated
$.. ==> -1

Notice that the last two test cases both evaluate foo() and give the same outcome. In the absence of side-effects (i.e. foo() always evaluates to the same value), there is really no need to re-evaluate foo() after the first evaluation since the outcome remains the same. This leads us to create our next context: the Lazy context.

Lazy context

We can view the Lazy context as an improved form of a Supplier that caches the result of the first evaluation, so that subsequent evaluations simply return the cached value. Upon creating the Lazy object, the cache should be empty; it is only assigned a value after the first invocation of the get method. This suggests that the cache should be declared as Optional<T>.

class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private Optional<T> cache;

    private Lazy(Supplier<? extends T> supplier, Optional<T> cache) {
        this.supplier = supplier;
        this.cache = cache;
    }

    static <T> Lazy<T> of(Supplier<T> supplier) {
        return new Lazy<T>(supplier, Optional.<T>empty());
    }

    static <T> Lazy<T> of(T value) {
        return new Lazy<T>(() -> value, Optional.<T>of(value));
    }
}

We provide two overloaded static factory methods, one that takes a Supplier and the other that takes a value. The former sets cache to be empty, while the second wraps the value in the cache.

For the get method, we first check if the cache is empty. If the cache contains a value, then return that value. Otherwise, we invoke supplier.get(), cache this value and then return the value. We do this using the orElseGet method of Optional.

public T get() {
    return this.cache.orElseGet(() -> {
        T v = this.supplier.get();
        this.cache = Optional.<T>of(v);
        return v;
    });
}

jshell> Lazy<Integer> lazy = Lazy.<Integer>of(() -> foo())
lazy ==> Lazy@..

jshell> lazy.get()
foo evaluated
$.. ==> -1

jshell> lazy.get()
$.. ==> -1

Note that in the last test case, the foo method no longer needs to be evaluated.

You will also realize that in order for cache to be assigned after the first evaluation, it cannot be declared final. Even though our Lazy class is no longer immutable, the client still perceives Lazy as immutable since every get returns the same value. We say that Lazy is observably immutable.
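
Observable immutability can be checked by counting how often the wrapped supplier actually runs. The following is a minimal sketch that assumes only the Supplier-taking factory is needed; LazyCacheDemo, expensive and evaluationsAfter are hypothetical names used for illustration.

```java
import java.util.Optional;
import java.util.function.Supplier;

class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private Optional<T> cache; // not final: assigned on the first get()

    private Lazy(Supplier<? extends T> supplier, Optional<T> cache) {
        this.supplier = supplier;
        this.cache = cache;
    }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<T>(supplier, Optional.<T>empty());
    }

    public T get() {
        return this.cache.orElseGet(() -> {
            T v = this.supplier.get();
            this.cache = Optional.<T>of(v); // note: Optional.of rejects null
            return v;
        });
    }
}

class LazyCacheDemo {
    static int evaluations = 0;

    // Stand-in for a costly computation; counts its own evaluations
    static int expensive() {
        evaluations++;
        return 42;
    }

    // Calls get() n times on one Lazy and reports how often expensive() ran
    static int evaluationsAfter(int n) {
        evaluations = 0;
        Lazy<Integer> lazy = Lazy.of(() -> expensive());
        for (int i = 0; i < n; i++) {
            lazy.get();
        }
        return evaluations;
    }
}
```

evaluationsAfter(0) is 0, while evaluationsAfter(1) and evaluationsAfter(3) are both 1: the internal state changes exactly once, and the client sees the same value from every get.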

Let us include the map method as follows:

<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
    R r = mapper.apply(this.get());
    return Lazy.<R>of(r);
}

Even though the above compiles, mapper.apply(this.get()) is evaluated eagerly at the moment map is called, even though the client has not yet called get() on the resulting Lazy.

jshell> Lazy<Integer> lazy = Lazy.<Integer>of(() -> foo())
lazy ==> Lazy@..

jshell> lazy.map(x -> x + 1)
foo evaluated
$.. ==> Lazy@..

To delay the evaluation, redefine map as

<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
    Supplier<R> supplier = () -> mapper.apply(this.get());
    return Lazy.<R>of(supplier);
}

jshell> Lazy<Integer> lazy = Lazy.<Integer>of(() -> foo())
lazy ==> Lazy@..

jshell> lazy.map(x -> x + 1)
$.. ==> Lazy@..

jshell> lazy.map(x -> x + 1).map(x -> x * 2)
$.. ==> Lazy@..

jshell> lazy.map(x -> x + 1).map(x -> x * 2).get()
foo evaluated
$.. ==> 0

jshell> lazy.map(x -> x + 1).map(x -> x * 2).get()
$.. ==> 0

From the above, it is evident that the foo method is not evaluated until the first get method is invoked, no matter how many map calls precede it. Also, subsequent invocations of the get method result in the cached value being returned.
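
The deferred behaviour of map can also be verified with a counter instead of reading printed output. This is a sketch under the same assumptions as the Lazy class above, reduced to one factory method; LazyMapDemo, source and trace are hypothetical names.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private Optional<T> cache;

    private Lazy(Supplier<? extends T> supplier, Optional<T> cache) {
        this.supplier = supplier;
        this.cache = cache;
    }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<T>(supplier, Optional.<T>empty());
    }

    public T get() {
        return this.cache.orElseGet(() -> {
            T v = this.supplier.get();
            this.cache = Optional.<T>of(v);
            return v;
        });
    }

    // Builds a new Lazy; nothing is evaluated until get() is called on it
    <R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
        Supplier<R> supplier = () -> mapper.apply(this.get());
        return Lazy.<R>of(supplier);
    }
}

class LazyMapDemo {
    static int evaluations = 0;

    static int source() {
        evaluations++;
        return -1;
    }

    // Returns { evaluations before get, result of get, evaluations after two gets }
    static int[] trace() {
        evaluations = 0;
        Lazy<Integer> mapped = Lazy.<Integer>of(() -> source())
            .map(x -> x + 1)
            .map(x -> x * 2);
        int beforeGet = evaluations;
        int result = mapped.get();
        mapped.get();
        return new int[] { beforeGet, result, evaluations };
    }
}
```

LazyMapDemo.trace() yields { 0, 0, 1 }: no evaluation while the pipeline is being built, a result of (-1 + 1) * 2 = 0, and a single evaluation of source() despite two calls to get.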

Local class and variable capture

Let us study the lambda expression that was used in the map method,

<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
    Supplier<R> supplier = () -> mapper.apply(this.get());
    return Lazy.<R>of(supplier);
}

The lambda expression makes use of mapper and this.get() where mapper is the parameter (a local variable) of the map method, and this refers to the Lazy object where the map method was called.

The lambda expression can be viewed as a local class (i.e. a class defined within a method of an enclosing class). A local class captures (or makes a copy of) the variables of the enclosing method that it uses, and also the reference to the enclosing object (the instance of the enclosing class on which the method was called). We say that the lambda closes over the enclosing method and the enclosing class. This is the lambda closure.

In addition,

  • variables of the enclosing method that are captured cannot be modified in the method; they need to be final or effectively final.
  • the reference to the enclosing object is captured using a qualified this, e.g. Lazy.this

<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
    Supplier<R> supplier = () -> mapper.apply(Lazy.this.get());
    return Lazy.<R>of(supplier);
}

The captures are necessary because one may create a lambda to be returned from the method. For example,

jshell> class A {
   ...>     private final int a;
   ...>     A (int a) {
   ...>         this.a = a;
   ...>     }
   ...>     Function<Integer,Integer> foo(int b) {
   ...>         int c = 10;
   ...>         return d -> a + b + c + d;
   ...>     }
   ...> }
|  created class A

jshell> new A(1).foo(2).apply(3)
$.. ==> 16

Specifically, new A(1).foo(2) will return a Function<Integer,Integer>. This lambda expression is a local class created in the foo method of class A. Hence the lambda captures variables b and c from the foo method and also A.this which refers to the new A(1) object.

More importantly, as the foo method returns with the Function, the local variables b and c no longer persist in (stack) memory. Thankfully, due to the capture, the Function still has a copy of these values. We can then invoke the method apply(3) on the Function which gives 16. That is, a + b + c + d or more precisely this.a + b + c + d (even more precisely A.this.a + b + c + d) evaluates to 1 + 2 + 10 + 3 = 16. Try to reassign the values of b and c anywhere inside the foo method and you will notice that it violates the effectively final property.
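
Because each call to foo captures its own copies, two functions produced from the same object do not interfere with one another. The sketch below reuses class A from above; ClosureDemo and results are hypothetical names added for illustration.

```java
import java.util.function.Function;

class A {
    private final int a;

    A(int a) {
        this.a = a;
    }

    Function<Integer, Integer> foo(int b) {
        int c = 10;
        // b = 0; // uncommenting this makes b no longer effectively final,
        //        // and the lambda below no longer compiles
        return d -> a + b + c + d;
    }
}

class ClosureDemo {
    // Each call to foo yields a Function holding its own copies of b and c
    static int[] results() {
        A obj = new A(1);
        Function<Integer, Integer> f2 = obj.foo(2);
        Function<Integer, Integer> f5 = obj.foo(5);
        return new int[] { f2.apply(3), f5.apply(3), f2.apply(3) };
    }
}
```

ClosureDemo.results() gives { 16, 19, 16 }: f2 still sees b = 2 after f5 was created with b = 5, long after both calls to foo have returned.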

So far we have looked at the lambda expression as a local class. An anonymous inner class is also a local class, and thus we can define the map method as follows:

<R> Lazy<R> map(Function<? super T, ? extends R> mapper) {
    Supplier<R> supplier = new Supplier<R>() {
        public R get() {
            return mapper.apply(Lazy.this.get());
        }
    };
    return Lazy.<R>of(supplier);
}

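In this case, replacing Lazy.this.get() with just this.get() (or get()) changes the meaning. While Lazy.this.get() invokes the get() method of the Lazy class, the latter refers to the get() method of the anonymous Supplier itself. In this particular map the compiler should reject the call, since the Supplier's get() returns an R where mapper expects a T; in situations where the types do line up, the method would keep calling itself until a StackOverflowError occurs. Try it out!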

In a nutshell, defining a lambda expression is deemed safer as you will not need to worry about invoking its single abstract method. If you use anonymous inner classes, the explicit definition of the single abstract method exposes the method for possible invocation or misuse.
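
The different meaning of this in the two forms can be observed directly. The sketch below is only an illustration; Outer, name, viaLambda and viaAnonymousClass are hypothetical names.

```java
import java.util.function.Supplier;

class Outer {
    String name() {
        return "Outer";
    }

    Supplier<String> viaLambda() {
        // Inside a lambda, this is the enclosing Outer object
        return () -> this.name();
    }

    Supplier<String> viaAnonymousClass() {
        return new Supplier<String>() {
            String name() {
                return "anonymous Supplier";
            }

            public String get() {
                // Here this is the anonymous Supplier object itself;
                // the enclosing object needs the qualified Outer.this
                return this.name() + " / " + Outer.this.name();
            }
        };
    }
}
```

new Outer().viaLambda().get() returns "Outer", whereas new Outer().viaAnonymousClass().get() returns "anonymous Supplier / Outer". The lambda has no this of its own, so there is no method of the lambda that can be called by accident.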
