Lecture10 - nus-cs2030/2324-s2 GitHub Wiki
Finally, we look at asynchronous programming and how internal threading and execution are managed by the CompletableFuture context.
The running example that we are using comprises the following:
class A {
    int x;
    A(int x) {
        this.x = x;
    }
}
class B { }
class C { }
class D { }
class E { }
B f(A a) {
    System.out.println("f: start");
    doWork(a.x);
    System.out.println("f: done");
    return new B();
}
C g(B b, int n) {
    System.out.println("g: start");
    doWork(n);
    System.out.println("g: done");
    return new C();
}
D h(B b, int n) {
    System.out.println("h: start");
    doWork(n);
    System.out.println("h: done");
    return new D();
}
E n(C c, D d) {
    System.out.println("n: proceeds");
    return new E();
}
void foo(int m, int n) {
    B b = f(new A(5));
    C c = g(b, m);
    D d = h(b, n);
    E e = n(c, d);
}
where doWork(n) is used to represent some delay during computation (here assumed to be proportional to n).
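The notes do not show the body of doWork. A minimal sketch, assuming it sleeps for a time proportional to n and returns n unchanged (the 100 ms unit, the class name DoWorkSketch, and the clamping of negative inputs are all assumptions made for illustration), could look like this:

```java
class DoWorkSketch {
    // Hypothetical stand-in for doWork: sleeps for a time proportional to n
    // (100 ms per unit is an arbitrary choice) and returns n unchanged.
    static int doWork(int n) {
        try {
            // Clamp at zero because Thread.sleep rejects negative durations.
            Thread.sleep(Math.max(0, n) * 100L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return n;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int result = doWork(2);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("doWork(2) returned " + result + " after about " + elapsedMs + " ms");
    }
}
```

Returning n is consistent with the later examples in these notes, where `return doWork(x)` is used as an int-valued expression.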
More importantly,
- methods g and h require f to be completed;
- if methods g and h are effect-free and do not depend on one another, both of them can be executed together;
- method n needs to wait for both methods g and h to complete execution.
With synchronous computation, one would expect that invoking foo(5, 10) will execute g and wait for it to complete before h executes.
jshell> foo(5, 10)
f: start
f: done
g: start
g: done
h: start
h: done
n: proceeds
We can make use of threads to speed up the computation by running parts of it asynchronously:
- let f be executed in the main thread;
- when f completes, spawn a new thread to execute g;
- at the same time, let the main thread execute h;
- have the main thread wait for g to complete;
- continue executing n on the main thread (the code below simply prints DONE at this point).
void foo(int m, int n) {
    B b = f(new A(5));
    Thread t = new Thread(() -> g(b, m));
    t.start();
    D d = h(b, n);
    try {
        t.join();
    } catch (InterruptedException e) {}
    System.out.println("DONE");
}
Notice that thread t is created with g(b, m) wrapped in a Runnable (not a Supplier) and passed to the constructor. A Runnable is a functional interface with the abstract method run that takes no arguments and returns void, so we cannot expect a value to be returned from t.
Execution of the thread begins with t.start(), while t.join() performs a blocking wait for thread t to complete.
In the absence of join(), the main thread will not wait for thread t to complete, but goes on to execute the println statement.
This would be an issue particularly when m > n, since g would then still be running when DONE is printed.
Also note that the join method can throw the checked InterruptedException, which must be handled explicitly (by catching it or declaring it with throws).
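Since a Runnable returns nothing, any result computed on a raw thread has to be handed back through shared state. The following is a minimal sketch of one way to do that, using an AtomicReference as the holder; the class and method names are hypothetical, and the arithmetic merely stands in for real work. It also shows one common way of handling the InterruptedException from join.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadResultSketch {
    // Runs a computation on a separate thread and hands the result back
    // through a shared holder, since Runnable.run() returns void.
    static int computeOnThread() {
        AtomicReference<Integer> holder = new AtomicReference<>();
        Thread t = new Thread(() -> holder.set(6 * 7));
        t.start();
        try {
            t.join(); // blocks until t terminates
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt status
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return holder.get(); // safe to read: join() guarantees t has finished
    }

    public static void main(String[] args) {
        System.out.println(computeOnThread());
    }
}
```

This kind of bookkeeping (a holder, an explicit start, an explicit join, exception handling) is what the CompletableFuture context introduced later takes care of.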
In the following output, notice how both g and h complete before DONE is printed:
jshell> foo(5, 10)
f: start
f: done
h: start
g: start
g: done
h: done
DONE
Up to this point, we are explicitly performing thread management. Moreover, due to its stateful nature, bugs arise if we order the instructions incorrectly.
void foo(int m, int n) {
    B b = f(new A(5));
    D d = h(b, n);
    Thread t = new Thread(() -> g(b, m));
    t.start();
    try {
        t.join();
    } catch (InterruptedException e) {}
    System.out.println("DONE");
}
jshell> foo(5, 10)
f: start
f: done
h: start
h: done // h has to be completed before a new thread is created to process g
g: start
g: done
DONE
Java provides us with a computation context for handling asynchronous computations.
To start the CompletableFuture pipeline, we make use of two static factory methods, supplyAsync and runAsync, which respectively take in a Supplier and a Runnable. Like a Runnable passed to the Thread constructor, runAsync is seldom useful as we almost always require a value to be returned.
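To make the difference between the two factory methods concrete, here is a small sketch (the class name FactorySketch and the values are illustrative only). supplyAsync yields a CompletableFuture of the supplied type, whereas runAsync yields a CompletableFuture<Void> whose join() returns null.

```java
import java.util.concurrent.CompletableFuture;

class FactorySketch {
    // supplyAsync takes a Supplier<T>, so the future carries a value.
    static Integer supplied() {
        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> 40 + 2);
        return cf.join();
    }

    // runAsync takes a Runnable, so the future carries no value (Void).
    static Void ran() {
        CompletableFuture<Void> cf =
            CompletableFuture.runAsync(() -> System.out.println("side effect only"));
        return cf.join(); // join() on a CompletableFuture<Void> returns null
    }

    public static void main(String[] args) {
        System.out.println(supplied());
        System.out.println(ran());
    }
}
```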
We can wrap f(new A(5))
in a CompletableFuture
as follows:
jshell> CompletableFuture.<B>supplyAsync(() -> f(new A(5)))
f: start
$.. ==> java.util.concurrent.CompletableFuture@...[Not completed]
jshell> f: done
Notice that the task starts immediately. More importantly, the statement completes execution before f completes. This is due to the absence of join().
jshell> CompletableFuture.<B>supplyAsync(() -> f(new A(5))).join()
f: start
f: done
$.. ==> B@...
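The difference between a future that is still running and one that has been joined can also be observed programmatically. The sketch below is an illustration only: the class name is hypothetical, and a CountDownLatch is used to hold the task back so that the "not completed" state is observed reliably rather than by timing luck.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class JoinSketch {
    // Returns { isDone() before the task may finish, isDone() after join() }.
    static boolean[] observe() {
        CountDownLatch gate = new CountDownLatch(1);
        CompletableFuture<String> cf = CompletableFuture.supplyAsync(() -> {
            try {
                gate.await(); // the task cannot finish until the gate opens
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "done";
        });
        boolean before = cf.isDone(); // the gate is still closed here
        gate.countDown();             // let the task proceed
        cf.join();                    // block until it completes
        boolean after = cf.isDone();
        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println("before join: " + r[0] + ", after join: " + r[1]);
    }
}
```

Creating the future does not block the caller; only join() does.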
How do we add further processing to the CompletableFuture pipeline?
Functions to be executed after completion of an existing process are named callbacks (or call-afters).
CompletableFuture operates based on the Hollywood Principle, "don't call us, we'll call you", where an existing process initiates callback execution after it completes, rather than a callback having to constantly probe the previous process to check for completion.
How are callbacks passed to a CompletableFuture?
One of the methods for doing so is thenApply. Suppose g is the callback:
jshell> CompletableFuture.<B>supplyAsync(() -> f(new A(5))).thenApply(b -> g(b, 5)).join()
f: start
f: done
g: start
g: done
$.. ==> C@...
Notice that thenApply takes in any Function whose input type is the type of the value wrapped in the preceding CompletableFuture. Recall that contexts like Optional, Stream, ImList, Try, Lazy, etc. have a map method that takes in the same kind of Function.
Indeed, thenApply is map! This also implies that CompletableFuture is a functor, and the identity and associativity laws of the functor apply.
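As a quick sanity check of the functor laws on the resulting values, the sketch below compares both sides of each law for a sample input. The class name and the functions inc and dbl are made up for illustration, and the second law is written in its composition form (mapping twice versus mapping once with the composed function), which is assumed here to be what the text refers to as associativity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class FunctorLawSketch {
    // Identity: mapping x -> x leaves the wrapped value unchanged.
    static boolean identityHolds(int v) {
        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> v);
        Integer mapped = cf.thenApply(x -> x).join();
        return mapped.equals(cf.join());
    }

    // Composition: mapping inc then dbl equals mapping inc.andThen(dbl) once.
    static boolean compositionHolds(int v) {
        Function<Integer, Integer> inc = x -> x + 1;
        Function<Integer, Integer> dbl = x -> x * 2;
        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> v);
        Integer stepwise = cf.thenApply(inc).thenApply(dbl).join();
        Integer composed = cf.thenApply(inc.andThen(dbl)).join();
        return stepwise.equals(composed);
    }

    public static void main(String[] args) {
        System.out.println(identityHolds(5) + " " + compositionHolds(5));
    }
}
```

This only compares final values; it says nothing about which thread runs each callback.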
How do we extend the pipeline to include h? Notice that h is executed after f, and as such:
void foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApply(b -> h(b, n));
    cfc.join();
    cfd.join();
    System.out.println("DONE");
}
Running foo(5, 10) gives:
jshell> foo(5, 10)
f: start
f: done
h: start
h: done
g: start
g: done
DONE
The above output is not quite what we want, as h and g are executed one after another. The reason is that thenApply runs its callback on the same thread that completed the preceding stage (here, the thread that ran f). In the above computation, there is only one worker thread.
What we really desire is:
- cfb executed on a thread, say A
- cfc executed on thread A after cfb completes
- cfd executed on a separate thread B after cfb completes
We can achieve execution on a separate thread with thenApplyAsync.
void foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApplyAsync(b -> h(b, n)); // note the use of thenApplyAsync
    cfc.join();
    cfd.join();
    System.out.println("DONE");
}
Now the execution behaves as expected:
jshell> foo(5, 10)
f: start
f: done
g: start
h: start
g: done
h: done
DONE
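One way to see where callbacks run is to have each callback report the name of its thread. The sketch below is illustrative only (the class name is hypothetical). It starts from an already completed future, in which case thenApply runs its callback immediately on the calling thread, while thenApplyAsync submits the callback to the default executor. The async callback typically reports a ForkJoinPool.commonPool worker, but the exact thread is not guaranteed, so no particular name should be relied upon.

```java
import java.util.concurrent.CompletableFuture;

class ThreadChoiceSketch {
    // Returns { caller thread, thread used by thenApply, thread used by thenApplyAsync }.
    static String[] threadNames() {
        CompletableFuture<Integer> done = CompletableFuture.completedFuture(1);
        String caller = Thread.currentThread().getName();
        // The future is already complete, so this callback runs right here.
        String sync = done.thenApply(x -> Thread.currentThread().getName()).join();
        // This callback is submitted to the default asynchronous executor.
        String async = done.thenApplyAsync(x -> Thread.currentThread().getName()).join();
        return new String[] { caller, sync, async };
    }

    public static void main(String[] args) {
        String[] names = threadNames();
        System.out.println("caller:         " + names[0]);
        System.out.println("thenApply:      " + names[1]);
        System.out.println("thenApplyAsync: " + names[2]);
    }
}
```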
Finally, the results of cfc and cfd have to be passed to method n.
We can make use of thenCombine, and also return the value of cfe.join().
E foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApplyAsync(b -> h(b, n));
    CompletableFuture<E> cfe = cfc.thenCombine(cfd, (c, d) -> n(c, d));
    return cfe.join();
}
Calling foo(5, 10) returns a value of type E:
jshell> foo(5, 10)
f: start
f: done
g: start
h: start
g: done
h: done
n: proceeds
$.. ==> E@...
The thenCombine method is called from one CompletableFuture, and the other CompletableFuture is passed to the method, together with a BiFunction to dictate how the two values are to be combined.
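Stripped of the running example, the shape of thenCombine can be sketched with two independent futures and a BiFunction. The class name, method name, and values below are illustrative assumptions only.

```java
import java.util.concurrent.CompletableFuture;

class CombineSketch {
    // Two independent asynchronous computations, combined once both complete.
    static int area(int width, int height) {
        CompletableFuture<Integer> w = CompletableFuture.supplyAsync(() -> width);
        CompletableFuture<Integer> h = CompletableFuture.supplyAsync(() -> height);
        // The BiFunction receives the value of w first and the value of h second.
        CompletableFuture<Integer> product = w.thenCombine(h, (a, b) -> a * b);
        return product.join();
    }

    public static void main(String[] args) {
        System.out.println(area(3, 4));
    }
}
```

The combining function runs only after both futures have completed, regardless of which one finishes first.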
How do we map over multiple Optional/Stream/ImList/Try/Lazy contexts? We use flatMap.
The equivalent of flatMap in CompletableFuture is thenCompose.
The following will produce a similar behaviour.
E foo(int m, int n) {
    CompletableFuture<B> cfb = CompletableFuture.<B>supplyAsync(() -> f(new A(5)));
    CompletableFuture<C> cfc = cfb.thenApply(b -> g(b, m));
    CompletableFuture<D> cfd = cfb.thenApplyAsync(b -> h(b, n));
    CompletableFuture<E> cfe = cfc.thenCompose(c -> cfd.thenApply(d -> n(c, d)));
    return cfe.join();
}
Since CompletableFuture has both thenApply and thenCompose, it qualifies as a monad!
Furthermore, the identity and associativity laws of the monad apply!
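The monad laws can be spot-checked on resulting values in the same spirit. In the sketch below, completedFuture plays the role of the unit (wrapping a plain value) and thenCompose plays the role of flatMap; the class name and the functions inc and dbl are hypothetical, and the check covers a sample input rather than proving the laws.

```java
import java.util.concurrent.CompletableFuture;

class MonadLawSketch {
    static CompletableFuture<Integer> inc(Integer x) {
        return CompletableFuture.supplyAsync(() -> x + 1);
    }

    static CompletableFuture<Integer> dbl(Integer x) {
        return CompletableFuture.supplyAsync(() -> x * 2);
    }

    // Left identity: unit(v).flatMap(f) gives the same value as f(v).
    static boolean leftIdentity(int v) {
        CompletableFuture<Integer> start = CompletableFuture.completedFuture(v);
        CompletableFuture<Integer> lhs = start.thenCompose(MonadLawSketch::inc);
        return lhs.join().equals(inc(v).join());
    }

    // Right identity: m.flatMap(unit) gives the same value as m.
    static boolean rightIdentity(int v) {
        CompletableFuture<Integer> m = CompletableFuture.supplyAsync(() -> v);
        CompletableFuture<Integer> back = m.thenCompose(x -> CompletableFuture.completedFuture(x));
        return back.join().equals(m.join());
    }

    // Associativity: chaining two composes equals nesting the second inside the first.
    static boolean associativity(int v) {
        CompletableFuture<Integer> m = CompletableFuture.supplyAsync(() -> v);
        CompletableFuture<Integer> chained =
            m.thenCompose(MonadLawSketch::inc).thenCompose(MonadLawSketch::dbl);
        CompletableFuture<Integer> nested =
            m.thenCompose(x -> inc(x).thenCompose(MonadLawSketch::dbl));
        return chained.join().equals(nested.join());
    }

    public static void main(String[] args) {
        System.out.println(leftIdentity(5) + " " + rightIdentity(5) + " " + associativity(5));
    }
}
```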
So far we have wrapped the provided methods within a CompletableFuture context without modifying the methods.
Now consider the following two methods:
int foo(int x) {
    System.out.println("foo starts");
    if (x < 0) {
        return 0;
    } else {
        return doWork(x);
    }
}
int bar(int x) {
    System.out.println("bar starts");
    return doWork(x);
}
Here is the outcome of bar(foo(3)).
jshell> bar(foo(3))
foo starts
bar starts
$.. ==> 3
To make use of these functions as part of some asynchronous computation, we can modify the functions to return CompletableFuture instead.
Let's modify foo first.
CompletableFuture<Integer> foo(int x) {
    System.out.println("foo starts");
    if (x > 0) {
        return CompletableFuture.supplyAsync(() -> doWork(x));
    } else {
        return CompletableFuture.completedFuture(0);
    }
}
In the then block of the if statement, we wrap doWork within a Supplier and pass it to supplyAsync as usual.
However, in the else block, although we could have also passed () -> 0 into supplyAsync, this would unnecessarily hand a task to another thread for a computation that is already completed!
As such, we use an alternative factory method completedFuture that simply wraps an evaluated value in a CompletableFuture.
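A short sketch of what "already completed" means in practice: a future built with completedFuture reports isDone() straight away, and its value can be read without blocking. The class and method names are illustrative assumptions.

```java
import java.util.concurrent.CompletableFuture;

class CompletedSketch {
    // A future created by completedFuture needs no thread to produce its value.
    static boolean alreadyDone() {
        CompletableFuture<Integer> cf = CompletableFuture.completedFuture(0);
        return cf.isDone();
    }

    // getNow returns the value if complete, else the supplied fallback.
    static int valueNow() {
        CompletableFuture<Integer> cf = CompletableFuture.completedFuture(0);
        return cf.getNow(-1);
    }

    public static void main(String[] args) {
        System.out.println(alreadyDone() + " " + valueNow());
    }
}
```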
To perform bar(foo(3)) with only foo converted, we use thenApply.
Remember to include the join() method.
jshell> foo(3).thenApply(x -> bar(x)).join()
foo starts
bar starts
$.. ==> 3
What if we further convert the bar method to return a CompletableFuture as well?
CompletableFuture<Integer> bar(int x) {
    System.out.println("bar starts");
    return CompletableFuture.supplyAsync(() -> doWork(x));
}
To perform the same asynchronous computation, we will need thenCompose:
jshell> foo(3).thenCompose(x -> bar(x)).join()
foo starts
bar starts
$.. ==> 3
What if we use thenApply instead of thenCompose?
jshell> foo(3).thenApply(x -> bar(x)).join()
foo starts
bar starts
$.. ==> java.util.concurrent.CompletableFuture@...[Not completed]
Notice that we do not get 3 from the join() method, but another CompletableFuture.
Compare this with:
jshell> Optional.of(1).map(x -> Optional.of(x))
$.. ==> Optional[Optional[1]]
jshell> Optional.of(1).map(x -> Optional.of(x)).get()
$.. ==> Optional[1]
jshell> Optional.of(1).map(x -> Optional.of(x)).get().get()
$.. ==> 1
By now, you should know that Optional.of(1).map(x -> Optional.of(x)) will return an Optional that wraps another Optional, and the get() method only unwraps the outer Optional, leaving the inner Optional exposed.
Hence, you will need two get() calls, although this is not the right way.
You ought to have used flatMap instead:
jshell> Optional.of(1).flatMap(x -> Optional.of(x)).get()
$.. ==> 1
Likewise, although you can get away with two join() method calls,
jshell> foo(3).thenApply(x -> bar(x)).join().join()
foo starts
bar starts
$.. ==> 3
you should start off on the right foot and use thenCompose instead.
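Writing out the static types makes the difference visible at compile time. In the sketch below, twice is a hypothetical function that already returns a CompletableFuture: thenApply nests the result, whereas thenCompose flattens it.

```java
import java.util.concurrent.CompletableFuture;

class FlattenSketch {
    // Hypothetical asynchronous function, standing in for a converted bar.
    static CompletableFuture<Integer> twice(int x) {
        return CompletableFuture.supplyAsync(() -> x * 2);
    }

    // thenApply wraps the returned future again, so two joins are needed.
    static int nested(int v) {
        CompletableFuture<Integer> start = CompletableFuture.completedFuture(v);
        CompletableFuture<CompletableFuture<Integer>> cf = start.thenApply(FlattenSketch::twice);
        return cf.join().join();
    }

    // thenCompose flattens the result, so a single join suffices.
    static int flat(int v) {
        CompletableFuture<Integer> start = CompletableFuture.completedFuture(v);
        CompletableFuture<Integer> cf = start.thenCompose(FlattenSketch::twice);
        return cf.join();
    }

    public static void main(String[] args) {
        System.out.println(nested(3) + " " + flat(3));
    }
}
```

Both versions compute the same value; the nested type is simply more awkward to work with downstream.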
There are many methods associated with Java's CompletableFuture, and you will get acquainted with some of them in due time.
Finally, we hope that students will see that with a good appreciation of effect-free, declarative programming with contexts, one is able to readily pick up and adapt to using a new context, since the usage and design pattern is almost always the same.