Lecture10 - nus-cs2030/2324-s2 GitHub Wiki

Asynchronous Programming

Finally, we look at asynchronous programming and how internal threading and execution are managed by the CompletableFuture context.

The running example that we are using comprises the following:

class A { 
    int x; 
    A(int x) { 
        this.x = x;
    }
}

class B { }

class C { }

class D { }

class E { }

B f(A a) { 
    System.out.println("f: start");
    doWork(a.x);
    System.out.println("f: done");
    return new B(); 
}

C g(B b, int n) {
    System.out.println("g: start");
    doWork(n);
    System.out.println("g: done");
    return new C(); 
}

D h(B b, int n) { 
    System.out.println("h: start");
    doWork(n);
    System.out.println("h: done");
    return new D(); 
}

E n(C c, D d) { 
    System.out.println("n: proceeds");
    return new E(); 
}

void foo(int m, int n) {
    B b = f(new A(5));
    C c = g(b, m);
    D d = h(b, n);
    E e = n(c, d);
}

where doWork(n) is used to represent some delay during computation (here assumed to be proportional to n). More importantly,

  • methods g and h require f to be completed;
  • if methods g and h are effect-free and do not depend on one another, both of them can be executed together;
  • method n needs to wait for both methods g and h to complete execution.
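The notes do not show doWork itself. The following is only a minimal sketch of what it might look like, assuming it sleeps for a time proportional to n and then returns n (the return value is inferred from later usage such as return doWork(x)). The class name DoWorkDemo and the 100 ms unit are arbitrary choices for illustration.

```java
class DoWorkDemo {
    // Hypothetical stand-in for doWork: sleeps for about n * 100 ms, then returns n.
    static int doWork(int n) {
        try {
            Thread.sleep(Math.max(0, n) * 100L); // negative n is treated as no delay
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return n;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int result = doWork(3);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("doWork(3) returned " + result + " after about " + elapsedMs + " ms");
    }
}
```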

With synchronous computation, we expect that invoking foo(5, 10) will execute g and wait for it to complete before h executes.

jshell> foo(5, 10)
f: start
f: done
g: start
g: done
h: start
h: done
n: proceeds

Java threads

We can make use of threads to speed up computation asynchronously.

  • let f be executed in the main thread;
  • when f completes, spawn a new thread to execute g;
  • at the same time, let the main thread execute h;
  • have the main thread wait for g to complete;
  • continue with the rest of the computation (method n) on the main thread.

void foo(int m, int n) {
    B b = f(new A(5));
    Thread t = new Thread(() -> g(b, m));
    t.start();
    D d = h(b, n);
    try {
        t.join();
    } catch (InterruptedException e) {}
    System.out.println("DONE");
}

Notice that thread t is created with g(b, m) wrapped in a Runnable (not a Supplier) and passed to the constructor. A Runnable is a functional interface with the abstract method run that takes no arguments and returns void, so we cannot expect a value to be returned from t. Execution of the thread begins with t.start(), while t.join() performs a blocking wait for thread t to complete. Without join(), the main thread does not wait for thread t to complete, but goes on to execute the println statement. This is an issue particularly when m > n, since g then takes longer than h and DONE may be printed before g is done. Also note that join throws the checked InterruptedException, which must be handled explicitly (by catching or rethrowing).
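Since a Runnable cannot return a value, one possible workaround is to have the thread write its result into a shared holder that the main thread reads after join(). The sketch below assumes an AtomicReference as the holder; slowTask and runOnThread are hypothetical names standing in for a computation such as g(b, m).

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadResultDemo {
    // Hypothetical stand-in for a slow computation such as g(b, m).
    static String slowTask(int n) {
        try {
            Thread.sleep(n * 50L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + n;
    }

    // Runs slowTask on a new thread and returns its result once that thread has finished.
    static String runOnThread(int n) {
        AtomicReference<String> holder = new AtomicReference<>();
        Thread t = new Thread(() -> holder.set(slowTask(n))); // Runnable: no return value
        t.start();
        try {
            t.join(); // blocks until t terminates, so holder is populated afterwards
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the thread", e);
        }
        return holder.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnThread(2)); // prints result-2
    }
}
```

This works, but the holder, the start and the join all have to be managed by hand, which is the kind of bookkeeping a computation context can take over.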

In the following output, notice how both g and h complete before DONE is printed:

jshell> foo(5, 10)
f: start
f: done
h: start
g: start
g: done
h: done
DONE

Up to this point, we have been managing threads explicitly. Moreover, because this style is stateful, bugs arise if we order the instructions incorrectly. In the following version, h is called before the thread is started:

void foo(int m, int n) {
    B b = f(new A(5));
    D d = h(b, n);
    Thread t = new Thread(() -> g(b, m));
    t.start();
    try {
        t.join();
    } catch (InterruptedException e) {}
    System.out.println("DONE");
}

jshell> foo(5, 10)
f: start
f: done
h: start
h: done // h has to be completed before a new thread is created to process g
g: start
g: done
DONE

Java's CompletableFuture

Java provides us with a computation context for handling asynchronous computations. To start the CompletableFuture pipeline, we make use of two static factory methods, supplyAsync and runAsync, which take in a Supplier and a Runnable respectively. Like the Runnable passed to the Thread constructor, the Runnable given to runAsync returns no value, so runAsync is seldom useful as we almost always require a value to be returned.

We can wrap f(new A(5)) in a CompletableFuture as follows:

jshell> CompletableFuture.<B>supplyAsync(() -> f(new A(5)))
f: start
$.. ==> java.util.concurrent.CompletableFuture@...[Not completed]

jshell> f: done

Notice that the thread starts immediately. More importantly, the statement completes execution before f completes. This is due to the absence of join().

jshell> CompletableFuture.<B>supplyAsync(() -> f(new A(5))).join()
f: start
f: done
$.. ==> B@...
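The difference between the two transcripts can also be observed programmatically with isDone(). The following is a sketch only: the CountDownLatch is an addition (not part of the notes) that keeps the task unfinished until isDone() has been checked, so that the outcome does not depend on timing. The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class SupplyAsyncDemo {
    // Returns "<isDone before release> <joined value> <isDone after join>".
    static String observe() {
        CountDownLatch gate = new CountDownLatch(1); // keeps the task unfinished until isDone() is checked
        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 42;
        });
        boolean doneBefore = cf.isDone(); // false: the task is still waiting at the gate
        gate.countDown();                 // let the task finish
        int value = cf.join();            // blocks until the result is available
        return doneBefore + " " + value + " " + cf.isDone();
    }

    public static void main(String[] args) {
        System.out.println(observe()); // prints false 42 true
    }
}
```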

Callback function

How do we add further processing to the CompletableFuture pipeline? Functions to be executed after completion of an existing process are called callbacks (or call-afters). CompletableFuture operates based on the Hollywood Principle, "don't call us, we'll call you", where an existing process initiates callback execution after it completes, rather than the callback having to constantly poll the previous process to check for completion.
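As a small illustration of "we'll call you", the sketch below registers a callback with thenAccept, a CompletableFuture method that takes a Consumer (it is not used elsewhere in these notes). The callback is never invoked by our code; it is triggered when the supplied task completes. CallbackDemo and the log messages are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class CallbackDemo {
    // Records the order in which the task and its callback run.
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>(); // thread-safe, as it is written from another thread
        CompletableFuture<Void> cf = CompletableFuture
            .supplyAsync(() -> {
                log.add("task done");
                return 7;
            })
            .thenAccept(v -> log.add("callback got " + v)); // runs only after the task has produced 7
        cf.join(); // wait for the callback stage, so the log is complete
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [task done, callback got 7]
    }
}
```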

How are callbacks passed to a CompletableFuture? One of these methods is thenApply. Suppose g is the callback:

jshell> CompletableFuture.<B>supplyAsync(() -> f(new A(5))).thenApply(b -> g(b, 5)).join()
f: start
f: done
g: start
g: done
$.. ==> C@...

Notice that thenApply takes in any Function whose input type is the type of the value wrapped in the preceding CompletableFuture. Recall that contexts like Optional, Stream, ImList, Try and Lazy have a map method that takes in the same kind of Function. Indeed, thenApply is map! This also implies that CompletableFuture is a functor, and the functor laws (identity and composition) apply.
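The functor laws can be spot-checked on the joined values. This is a sketch rather than a proof: it compares results for one input, and the functions inc and show are made up for the example. Note that the futures themselves are distinct objects, so only the values obtained from join() are compared.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class FunctorLawDemo {
    // Identity: mapping with x -> x leaves the wrapped value unchanged.
    static boolean identityLaw(int x) {
        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> x);
        return cf.thenApply(v -> v).join().equals(cf.join());
    }

    // Composition: mapping inc then show equals mapping the composed function once.
    static boolean compositionLaw(int x) {
        Function<Integer, Integer> inc = v -> v + 1;
        Function<Integer, String> show = v -> "<" + v + ">";
        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> x);
        String stepwise = cf.thenApply(inc).thenApply(show).join();
        String composed = cf.thenApply(inc.andThen(show)).join();
        return stepwise.equals(composed);
    }

    public static void main(String[] args) {
        System.out.println(identityLaw(5));    // prints true
        System.out.println(compositionLaw(5)); // prints true
    }
}
```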

How do we extend the pipeline to include h? Like g, h is executed after f, so we attach it to cfb as a second callback:

void foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApply(b -> h(b, n));
    cfc.join();
    cfd.join();
    System.out.println("DONE");
}

Running foo(5, 10) gives

jshell> foo(5, 10)
f: start
f: done
h: start
h: done
g: start
g: done
DONE

The above output is not what we want, as h and g are executed one after another. The reason is that thenApply does not start a new thread: in this run, both callbacks are executed by the same thread that completed cfb, so the whole computation uses only one thread. What we really desire is:

  • cfb executed on a thread, say A
  • cfc executed on thread A after cfb completes
  • cfd executed on a separate thread B after cfb completes

We can achieve a separate thread execution with thenApplyAsync.

void foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApplyAsync(b -> h(b, n)); // note the use of thenApplyAsync
    cfc.join();
    cfd.join();
    System.out.println("DONE");
}

Now the execution behaves as expected:

jshell> foo(5, 10)
f: start
f: done
g: start
h: start
g: done
h: done
DONE
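Whether two callbacks really run at the same time can be checked without relying on printed output. In the sketch below, each branch announces that it has started and then waits up to one second for the other branch to start as well; this can only succeed for both if they overlap. Everything here is an assumption made for illustration: the latches, the one-second timeout and the names are not from the notes, and the asynchronous case uses thenApplyAsync for both branches so that the result does not depend on the order in which callbacks are fired.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class AsyncBranchDemo {
    // Waits for the latch to open; returns false on timeout or interruption.
    static boolean await(CountDownLatch latch, long millis) {
        try {
            return latch.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns true only if the two callbacks attached to cfb were running at the same time.
    static boolean branchesOverlap(boolean async) {
        CountDownLatch attached = new CountDownLatch(1);    // holds cfb back until both callbacks are attached
        CountDownLatch bothStarted = new CountDownLatch(2); // opens once both callbacks have started

        // Stand-in for g and h: announce the start, then wait up to 1 s for the other branch.
        Function<String, Boolean> branch = b -> {
            bothStarted.countDown();
            return await(bothStarted, 1_000);
        };

        CompletableFuture<String> cfb = CompletableFuture.supplyAsync(() -> {
            await(attached, 5_000);
            return "b";
        });
        CompletableFuture<Boolean> cfc = async ? cfb.thenApplyAsync(branch) : cfb.thenApply(branch);
        CompletableFuture<Boolean> cfd = async ? cfb.thenApplyAsync(branch) : cfb.thenApply(branch);
        attached.countDown(); // both callbacks are attached; let cfb complete

        return cfc.join() && cfd.join();
    }

    public static void main(String[] args) {
        System.out.println("thenApply overlaps:      " + branchesOverlap(false)); // false
        System.out.println("thenApplyAsync overlaps: " + branchesOverlap(true));  // true
    }
}
```

With thenApply, the thread that completes cfb runs the two callbacks one after the other, so the first one times out waiting for the second. With thenApplyAsync, each callback is submitted as a separate task and both can start together.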

Finally, the results of cfc and cfd have to be passed to method n. We can make use of thenCombine, and return the value of cfe.join().

E foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApplyAsync(b -> h(b, n));
    CompletableFuture<E> cfe = cfc.thenCombine(cfd, (c, d) -> n(c, d));
    return cfe.join();
}

Calling foo(5, 10) returns a value of type E:

jshell> foo(5, 10)
f: start
f: done
g: start
h: start
g: done
h: done
n: proceeds
$.. ==> E@...

The thenCombine method is called from one CompletableFuture, and the other CompletableFuture is passed to the method, together with a BiFunction to dictate how the two values are to be combined.
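A stripped-down sketch of thenCombine, using two independent futures of different types and a BiFunction that joins their values into a String. The computations (doubling a number, prefixing "x") are placeholders chosen for the example.

```java
import java.util.concurrent.CompletableFuture;

class CombineDemo {
    // Combines the results of two independent asynchronous computations.
    static String combine(int left, int right) {
        CompletableFuture<Integer> cfLeft = CompletableFuture.supplyAsync(() -> left * 2);
        CompletableFuture<String> cfRight = CompletableFuture.supplyAsync(() -> "x" + right);
        // The BiFunction runs only after both cfLeft and cfRight have completed.
        return cfLeft.thenCombine(cfRight, (l, r) -> l + ":" + r).join();
    }

    public static void main(String[] args) {
        System.out.println(combine(2, 3)); // prints 4:x3
    }
}
```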

How do we map over multiple Optional/Stream/ImList/Try/Lazy contexts? We use flatMap. The equivalent of flatMap in CompletableFuture is thenCompose. The following will produce a similar behaviour.

E foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApplyAsync(b -> h(b, n));
    CompletableFuture<E> cfe = cfc.thenCompose(c -> cfd.thenApply(d -> n(c, d)));
    return cfe.join();
}

Since CompletableFuture has thenCompose (its flatMap) in addition to thenApply (its map), it qualifies as a monad! Furthermore, the identity and associativity laws of the monad apply!
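The monad laws can be spot-checked in the same spirit, comparing joined values for a given input. This sketch assumes CompletableFuture.completedFuture, a factory method that wraps an already-evaluated value, as the unit operation; the functions F and G are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class MonadLawDemo {
    // Two hypothetical functions that return a value wrapped in a CompletableFuture.
    static final Function<Integer, CompletableFuture<Integer>> F =
        x -> CompletableFuture.supplyAsync(() -> x + 1);
    static final Function<Integer, CompletableFuture<Integer>> G =
        x -> CompletableFuture.supplyAsync(() -> x * 2);

    // Left identity: unit(x).flatMap(F) behaves like F(x).
    static boolean leftIdentity(int x) {
        Integer lhs = CompletableFuture.completedFuture(x).thenCompose(F).join();
        return lhs.equals(F.apply(x).join());
    }

    // Right identity: m.flatMap(unit) behaves like m.
    static boolean rightIdentity(int x) {
        CompletableFuture<Integer> m = CompletableFuture.supplyAsync(() -> x);
        Integer lhs = m.thenCompose(v -> CompletableFuture.completedFuture(v)).join();
        return lhs.equals(m.join());
    }

    // Associativity: the grouping of two flatMap steps does not matter.
    static boolean associativity(int x) {
        CompletableFuture<Integer> m = CompletableFuture.supplyAsync(() -> x);
        Integer lhs = m.thenCompose(F).thenCompose(G).join();
        Integer rhs = m.thenCompose(v -> F.apply(v).thenCompose(G)).join();
        return lhs.equals(rhs);
    }

    public static void main(String[] args) {
        System.out.println(leftIdentity(3) + " " + rightIdentity(3) + " " + associativity(3));
    }
}
```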

Converting synchronous methods to asynchronous

So far we have wrapped the provided methods within a CompletableFuture context without modifying the methods.

Now consider the following two methods:

int foo(int x) {
    System.out.println("foo starts");
    if (x < 0) {
        return 0;
    } else {
        return doWork(x);
    }
}

int bar(int x) {
    System.out.println("bar starts");
    return doWork(x);
}

Here is the outcome of bar(foo(3)).

jshell> bar(foo(3))
foo starts
bar starts
$.. ==> 3

To make use of these functions as part of some asynchronous computation, we can modify the functions to return CompletableFuture instead. Let's modify foo first.

CompletableFuture<Integer> foo(int x) {
    System.out.println("foo starts");
    if (x >= 0) {
        return CompletableFuture.supplyAsync(() -> doWork(x));
    } else {
        return CompletableFuture.completedFuture(0);
    }
}

In the then block of the if statement, we wrap doWork within a Supplier and pass to supplyAsync as usual. However, in the else block, although we could have also passed () -> 0 into supplyAsync, this will unnecessarily spawn a thread for a computation that is already completed! As such we use an alternative factory method completedFuture that simply wraps an evaluated value in a CompletableFuture.
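The difference between the two factory methods can be observed directly: a future created with completedFuture is done as soon as it exists. The sketch below uses a hypothetical fooAsync that mirrors the shape of the converted foo, with the delay left out so that it stays self-contained.

```java
import java.util.concurrent.CompletableFuture;

class CompletedFutureDemo {
    // Hypothetical async variant of foo: negative inputs need no computation at all.
    static CompletableFuture<Integer> fooAsync(int x) {
        if (x < 0) {
            return CompletableFuture.completedFuture(0); // already-evaluated value, no task is submitted
        }
        return CompletableFuture.supplyAsync(() -> x); // placeholder for the real work
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> ready = fooAsync(-5);
        System.out.println(ready.isDone());      // prints true
        System.out.println(ready.getNow(-1));    // prints 0 (the fallback -1 is not needed)
        System.out.println(fooAsync(3).join());  // prints 3
    }
}
```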

To perform bar(foo(3)) with only foo converted, we use thenApply. Remember to include the join() method.

jshell> foo(3).thenApply(x -> bar(x)).join()
foo starts
bar starts
$.. ==> 3

What if we further convert the bar method to

CompletableFuture<Integer> bar(int x) {
    System.out.println("bar starts");
    return CompletableFuture.supplyAsync(() -> doWork(x));
}

To perform the same asynchronous computation, we will need thenCompose:

jshell> foo(3).thenCompose(x -> bar(x)).join()
foo starts
bar starts
$.. ==> 3

What if we use thenApply instead of thenCompose?

jshell> foo(3).thenApply(x -> bar(x)).join()
foo starts
bar starts
$.. ==> java.util.concurrent.CompletableFuture@...[Not completed]

Notice that we do not get 3 from the join() method, but another CompletableFuture. Compare this with

jshell> Optional.of(1).map(x -> Optional.of(x))
$.. ==> Optional[Optional[1]]

jshell> Optional.of(1).map(x -> Optional.of(x)).get()
$.. ==> Optional[1]

jshell> Optional.of(1).map(x -> Optional.of(x)).get().get()
$.. ==> 1

By now, you should know that Optional.of(1).map(x -> Optional.of(x)) returns an Optional that wraps another Optional, and that the get() method only unwraps the outer Optional, leaving the inner Optional exposed. Hence, you need two get() calls, although this is not the right way. You ought to have used flatMap instead:

jshell> Optional.of(1).flatMap(x -> Optional.of(x)).get()
$.. ==> 1

Likewise, although you can get away with two join() method calls,

jshell> foo(3).thenApply(x -> bar(x)).join().join()
foo starts
bar starts
$.. ==> 3

you should start off on the right foot and use thenCompose instead.
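The nesting is visible in the static types. In the sketch below, twiceAsync is a hypothetical function returning a CompletableFuture; chaining it with thenApply yields a CompletableFuture<CompletableFuture<Integer>>, while thenCompose yields a flat CompletableFuture<Integer>.

```java
import java.util.concurrent.CompletableFuture;

class FlattenDemo {
    // Hypothetical asynchronous function: doubles its argument.
    static CompletableFuture<Integer> twiceAsync(int x) {
        return CompletableFuture.supplyAsync(() -> x * 2);
    }

    // thenCompose flattens the result, so a single join() suffices.
    static int viaCompose(int x) {
        CompletableFuture<Integer> flat = twiceAsync(x).thenCompose(v -> twiceAsync(v));
        return flat.join();
    }

    // thenApply keeps the inner future wrapped, so two join() calls are needed.
    static int viaApply(int x) {
        CompletableFuture<CompletableFuture<Integer>> nested =
            twiceAsync(x).thenApply(v -> twiceAsync(v));
        return nested.join().join();
    }

    public static void main(String[] args) {
        System.out.println(viaCompose(3)); // prints 12
        System.out.println(viaApply(3));   // prints 12
    }
}
```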

There are many methods associated with Java's CompletableFuture, and you will get acquainted with some of them in due time. Finally, we hope you see that, with a good appreciation of effect-free, declarative programming with contexts, you can readily pick up and adapt to a new context, since the usage and design patterns are almost always the same.
