Lecture 09
Functional programming is rooted in principles and concepts from mathematics, the most fundamental of which is writing equations to express the equality of two expressions. As an example, equations such as $x = 1$ and $y = x$ give the solution $y = 1$.
It is unfortunate that programming languages use `=` to express assignment instead of equality, e.g.

```java
x = 1;
y = x;
y = 2;
```

where the variable `y` is first assigned the value `1` and then the value `2`.
Functional programming has no notion of execution history due to the absence of side-effects. As we have seen throughout this course, this makes the behaviour of our program more predictable and less buggy, which allows us to focus on the logic of the solution.
However, that does not mean that we cannot enforce execution ordering. The way we enforce order is through function evaluation: to evaluate $f(g(x))$, for example, the inner expression $g(x)$ must be evaluated before $f$ can be applied to its result.
Let us start with the concept of a function, which is a mapping from a domain to a range of values within a codomain. Consider the function $f(x, y) = x / y$. Here is an attempt to define a method to represent it:
```java
int f(int x, int y) {
    return x / y;
}
```
On closer inspection, the `int y` parameter (part of the domain) is not exactly right: one could pass the value `0` as the argument for `y`, which would result in an `ArithmeticException`. We could have done better by declaring the parameter `y` as
```java
int f(int x, nonZeroInt y) { // assuming the primitive type exists
    return x / y;
}
```
thereby allowing the compiler to type-check the arguments passed to `f`. Since `int` is not a subtype of `nonZeroInt`, values of type `int`, in particular the value `0`, cannot be passed to the parameter `y`. In this case, leaving the return type (codomain) as `int` is fine.
Rather than restricting the type of the domain, an alternative solution is to declare the return type as `Optional<Integer>`.
```java
Optional<Integer> f(int x, int y) {
    return Optional.<Integer>of(y)
            .filter(n -> n != 0)
            .map(n -> x / n);  // use a fresh name n; the lambda cannot re-declare the parameter y
}
```
Now the range (and also the codomain) of values is of type `Optional<Integer>`, with `Optional.empty` being one of these values.
We have now defined `f` as a pure function:

- every possible argument maps to exactly one return value of the return type (i.e. the codomain);
- multiple arguments can map to the same return value, e.g. `f(4, 3)` and `f(5, 4)` both map to `Optional[1]` (see the jshell check below);
- not all values of the return type (codomain) may be mapped.
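A quick check in jshell (assuming the `Optional`-returning `f` above has been defined in the session) illustrates these points:

```
jshell> f(4, 3)
$.. ==> Optional[1]

jshell> f(5, 4)
$.. ==> Optional[1]

jshell> f(4, 0)
$.. ==> Optional.empty
```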
A pure function returns a deterministic value for every argument and has no side-effects (e.g. it does not result in an exception, which would be outside the return type). In particular, the absence of side-effects is a necessary condition for referential transparency. That is to say, an expression (e.g. `f(4, 3)`) can always be replaced by an equivalent expression, e.g. its result value `Optional[1]`, or another (referentially transparent) expression such as `f(5, 4)`.
In general, to check for referential transparency, the execution of a statement of code `...e...e...` should be equivalent to invoking a function definition `{ v = e; return (...v...v...); }`.
Suppose we are given the function

```java
int r(List<Integer> queue, int i) {
    queue.add(i);         // side-effect: mutates the list passed in
    return queue.size();
}
```

and the statement `foo(bar(r(q, 1)), baz(r(q, 1)));`. We cannot replace this with `{ v = r(q, 1); return (foo(bar(v), baz(v))); }` because the state of the list `q` is changed after the first call to `r(q, 1)`, and calling `r(q, 1)` again will give a different result. Moreover, if `q` is an unmodifiable list (e.g. one created with `List.of`, backed by `ImmutableCollections`), calling `add` will result in an `UnsupportedOperationException`.
How about the following?

```java
int s(int i) {
    return this.x + i;
}
```

Clearly, reading a value via `this.x` is effect-free. In particular, if `s` is an instance method of an immutable class, say `A`, then `new A(..).s(1)` will be deterministic.
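For concreteness, here is a minimal sketch of such an immutable class `A` (the field name `x` and the constructor are assumptions, not part of the notes):

```java
class A {
    private final int x;    // set once in the constructor, never reassigned

    A(int x) {
        this.x = x;
    }

    int s(int i) {
        return this.x + i;  // only reads this.x; no side-effects
    }
}
```

Since `x` cannot change after construction, `new A(10).s(1)` always evaluates to `11`.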
Thus far, our examples have been simple and require no assignments. What about more practical programs with control flow? These typically rely heavily on side-effects. As you shall see, rather than using side-effects to model control flow, we model control flow as context values that encapsulate the side-effects while maintaining pure function definitions.
Java 8 was the first attempt to make functions first class. Now a function (or function object) can be:
- passed as argument to higher-order functions
- returned as a result from higher-order functions
- assigned to variables, or stored inside other data structures
Moreover, rather than passing a function as an instance of a named class, we can pass a function directly to a higher-order function using a lambda expression or anonymous inner class definition.
Other than `Function`, we have already seen other function objects such as `Consumer`, `Supplier`, `Predicate`, etc., which are also first class. But here we are particularly interested in `Function` due to its connotation with mathematical functions.
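For instance, a `Function` can be assigned to a variable and then passed to a higher-order function such as the `map` method of `Stream` (a small jshell sketch; `incr` is just an illustrative name):

```
jshell> Function<Integer,Integer> incr = x -> x + 1
incr ==> $Lambda$..

jshell> Stream.of(1, 2, 3).map(incr).toList()
$.. ==> [2, 3, 4]
```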
Two mathematical functions $f$ and $g$ can be composed as $f \circ g$, where $(f \circ g)(x) = f(g(x))$. Similarly, we can compose two `Function` objects using `compose` and `andThen`, as illustrated in the jshell snippet after this list:

- `f.compose(g)` is equivalent to `x -> f.apply(g.apply(x))`, which follows from $f \circ g$.
- `f.andThen(g)` is equivalent to `x -> g.apply(f.apply(x))`, i.e. apply `x` to `f`, and then apply the result to `g`. This is similar to Unix pipes, e.g. to concatenate all Java files, then look for lines containing the `return` keyword, and then count the number of such lines:

```
cat *.java | grep return | wc -l
```
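Back in Java, here is a small jshell illustration (the functions `f` and `g` are arbitrary examples):

```
jshell> Function<Integer,Integer> f = x -> x + 1
f ==> $Lambda$..

jshell> Function<Integer,Integer> g = x -> x * 2
g ==> $Lambda$..

jshell> f.compose(g).apply(3)   // f(g(3)) = 3 * 2 + 1
$.. ==> 7

jshell> f.andThen(g).apply(3)   // g(f(3)) = (3 + 1) * 2
$.. ==> 8
```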
It is interesting to note the method signatures of `compose` and `andThen` in the Java API. These are `default` methods that are implemented out of necessity within the `Function<T,R>` functional interface, thus making the interface impure.

The first method has the signature `<V> Function<T,V> andThen(Function<? super R,? extends V> after)`.
This method is called with respect to the `this` object of type `Function<T,R>`. The function takes in a value of type `T` and outputs a value of type `R`. This latter value is passed to the `after` function, which has an input type of `R` and a possibly different output type `V`. Hence the function generated by the `andThen` method is of type `Function<T,V>`.
```
T -> [Function<T,R>] -> R -> [Function<R,V>] -> V
```

which is equivalent to

```
T -> [Function<T,V>] -> V
```
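For example, `andThen` can change the output type along the way; a small sketch with `T` as `String`, `R` as `Integer` and `V` as `Boolean` (the function names are illustrative):

```
jshell> Function<String,Integer> len = s -> s.length()
len ==> $Lambda$..

jshell> Function<Integer,Boolean> isEven = n -> n % 2 == 0
isEven ==> $Lambda$..

jshell> len.andThen(isEven).apply("java")   // a Function<String,Boolean>
$.. ==> true
```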
The second method has the signature `<V> Function<V,R> compose(Function<? super V,? extends T> before)`. Here the `before` function is applied first, so that its output can be passed to the `this` object of type `Function<T,R>`. This implies that the output type of `before` must be `T`, while its input type can be any other type, say `V`. Hence the function generated via the `compose` method is of type `Function<V,R>`.
```
V -> [Function<V,T>] -> T -> [Function<T,R>] -> R
```

which is equivalent to

```
V -> [Function<V,R>] -> R
```
A tupled function takes in its arguments as a tuple. For example, the `add` function takes in a tuple of two integer arguments and produces an integer result.

```
add :: (int, int) -> int
```
We can define the `add` function as an implementation of `BinaryOperator<Integer>` by providing the definition of the `Integer apply(Integer x, Integer y)` method.
```
jshell> BinaryOperator<Integer> add = (x, y) -> x + y
add ==> $Lambda$..

jshell> add.apply(2, 3)
$.. ==> 5
```
In contrast to a tupled function, `add` can also be implemented as a curried function.

```
add :: int -> int -> int
```
Here, we simply make use of the usual `Function` functional interface.
```
jshell> Function<Integer,Function<Integer,Integer>> add = x -> (y -> x + y)
add ==> $Lambda$..

jshell> add.apply(2).apply(3)
$.. ==> 5
```
Note that the `add` function has type `Function<Integer,Function<Integer,Integer>>`. It is a function that takes in an integer input, and gives an output of type `Function<Integer,Integer>`. This suggests that `add.apply(2)` only produces a partial application of the function, which requires another invocation of the `apply` method to complete the evaluation.
More importantly, currying suggests that a tupled function of any number of arguments can be implemented as a curried function using only the `Function` functional interface.
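For example, a hypothetical three-argument sum can be curried with nested `Function`s:

```
jshell> Function<Integer,Function<Integer,Function<Integer,Integer>>> sum3 =
   ...>     x -> y -> z -> x + y + z
sum3 ==> $Lambda$..

jshell> sum3.apply(1).apply(2).apply(3)
$.. ==> 6
```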
Any practical software application will need to interact with users via some input/output, interact with databases to read/write data, establish internet connections to get/post data, etc. All these make it seemingly hard to write programs using only pure functions. However, we have also seen how we can make use of `Optional` to wrap the side-effect of invalid or missing values, and to propagate the value (with its accompanying effects) via `map` and `flatMap`.
Similarly,

- the side-effect of exception handling can be wrapped in a `Try`;
- the side-effect of cached evaluation can be wrapped in a `Lazy`;
- the side-effect of looping can be wrapped in a `Stream`;
- ...
All the above computation contexts have `map` and `flatMap`, which can be generalized to functors and monads, concepts borrowed from category theory and abstract algebra.

Let us start our discussion with the functor concept.
Suppose we have defined a `Maybe` class that encapsulates a possibly `null` value, but does not yet have `map` or `flatMap` methods defined. To qualify `Maybe` as a functor with the `map` method, one can have `Maybe` extend from `Functor` (for simplicity we drop the bounded wildcards).
```java
abstract class Functor<T> {
    abstract <R> Functor<R> map(Function<T,R> mapper);
}

class Maybe<T> extends Functor<T> {
    ....
    <R> Maybe<R> map(Function<T,R> mapper) { ... }
}
```
As an aside, one can also define `Functor` as follows if higher-kinded types are allowed (e.g. in Scala):

```java
// not valid Java: C<_> is a higher-kinded type parameter
abstract class Functor<C<_>> {
    abstract <T,R> C<R> map(Function<T,R> mapper, C<T> someContext);
}
```

You can think of it as implementing `map` in a `MaybeFunctor` class by passing in both the `Function` and the `Maybe`; doing this will not require the existing `Maybe` class to be modified. Since Java does not support higher-kinded types, we shall not pursue this further.
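To make the Java-friendly version concrete, here is one possible sketch of how `Maybe` might implement `map`, assuming `Maybe` simply wraps a possibly-`null` value, `Functor` is as defined above, and `java.util.function.Function` is imported:

```java
class Maybe<T> extends Functor<T> {
    private final T value;   // possibly null

    Maybe(T value) {
        this.value = value;
    }

    @Override
    <R> Maybe<R> map(Function<T,R> mapper) {
        return this.value == null
            ? new Maybe<R>(null)                       // nothing to map over
            : new Maybe<R>(mapper.apply(this.value));  // apply mapper to the wrapped value
    }
}
```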
Let us use the `Log` context as an example, since logging is a more visible side-effect. Suppose we initialize a `Log<Integer>` context with an initial log `"init 2"`.
```
jshell> Log.<Integer>of(2, "init 2")
$.. ==> Log[2]
init 2
```
To manipulate the value within the context without logging, we use `map`.

```
jshell> Log.<Integer>of(2, "init 2").map(x -> x + 5)
$.. ==> Log[7]
init 2
```
Notice that the `map` method allows us to change the content of the `Log<Integer>` (from `2` to `7`) without changing the context `"init 2"`.
There are also two laws governing the `map` operation of a functor object `obj`:

- Identity law: `obj.map(x -> x) <-> obj`
- Composition law: `obj.map(f).map(g) <-> obj.map(g.compose(f)) <-> obj.map(x -> g.apply(f.apply(x)))`

assuming that both `f` and `g` are pure functions.
When we implement `map` for `Log` (or `map` for any other context), we need to make sure that these laws are followed.
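For instance, we can observe both laws holding for `Optional`, whose `map` is law-abiding:

```
jshell> Function<Integer,Integer> f = x -> x + 5
f ==> $Lambda$..

jshell> Function<Integer,Integer> g = x -> x * 10
g ==> $Lambda$..

jshell> Optional.of(2).map(x -> x)      // identity law: same as Optional.of(2)
$.. ==> Optional[2]

jshell> Optional.of(2).map(f).map(g)    // composition law: both give Optional[70]
$.. ==> Optional[70]

jshell> Optional.of(2).map(g.compose(f))
$.. ==> Optional[70]
```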
Since `map` does not log, to add further logging we need `flatMap`.
Suppose `Log` is now a monad, which is an extension of a functor with the `flatMap` method.
```java
abstract class Monad<T> extends Functor<T> {
    abstract <R> Monad<R> flatMap(Function<T,Monad<R>> mapper);
    abstract Monad<T> unit(T t);
}

class Maybe<T> extends Monad<T> {
    ....
    <R> Maybe<R> map(Function<T,R> mapper) { ... }
    <R> Maybe<R> flatMap(Function<T,Maybe<R>> mapper) { ... } // assuming it overrides...
    Maybe<T> unit(T t) { ... }
}
```
Or alternatively, as a higher-kinded type,

```java
// not valid Java: higher-kinded type parameter
abstract class Monad<C<_>> extends Functor<C<_>> {
    abstract <T,R> C<R> flatMap(Function<T,C<R>> mapper, C<T> someContext);
    abstract C<T> unit(T t);
}
```
We shall illustrate logging starting with an imperative solution. Note that the assignment to the `log` variable is a side-effect.
```
jshell> String log = ""
log ==> ""

jshell> int addFive(int x) {
   ...>     log = log + "addFive;";
   ...>     return x + 5;
   ...> }
|  modified method addFive(int)

jshell> int multTen(int x) {
   ...>     log = log + "multTen;";
   ...>     return x * 10;
   ...> }
|  modified method multTen(int)

jshell> x = 2; r1 = addFive(x); r2 = multTen(r1)
x ==> 2
r1 ==> 7
r2 ==> 70

jshell> log
log ==> "addFive;multTen;"
```
Now suppose we have a `Log` context that handles the side-effect of logging. We can re-express the solution so that the logging is handled implicitly in the context.
```
jshell> Log<Integer> addFive(int x) {
   ...>     return Log.<Integer>of(x + 5, "addFive");
   ...> }
|  replaced method addFive(int)

jshell> Log<Integer> multTen(int x) {
   ...>     return Log.<Integer>of(x * 10, "multTen");
   ...> }
|  replaced method multTen(int)

jshell> Log.of(2).flatMap(x -> addFive(x)).flatMap(r1 -> multTen(r1))
$.. ==> Log[70]: addFive, multTen
```
Notice from the above that each logging operation requires a `flatMap` in order to concatenate the logs together.
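The `Log` class itself is not listed in these notes; a minimal sketch consistent with the behaviour above might look like the following (the field names, the private constructor and the exact log-concatenation format are assumptions):

```java
import java.util.function.Function;

class Log<T> {
    private final T value;
    private final String log;

    private Log(T value, String log) {
        this.value = value;
        this.log = log;
    }

    static <T> Log<T> of(T value) {
        return new Log<T>(value, "");
    }

    static <T> Log<T> of(T value, String log) {
        return new Log<T>(value, log);
    }

    <R> Log<R> map(Function<? super T, ? extends R> mapper) {
        // change the value, keep the log (the context) unchanged
        return new Log<R>(mapper.apply(this.value), this.log);
    }

    <R> Log<R> flatMap(Function<? super T, ? extends Log<? extends R>> mapper) {
        Log<? extends R> next = mapper.apply(this.value);
        // the side-effect of logging is encapsulated here: the two logs are concatenated
        String combined = this.log.isEmpty() ? next.log : this.log + ", " + next.log;
        return new Log<R>(next.value, combined);
    }

    @Override
    public String toString() {
        return "Log[" + this.value + "]: " + this.log;
    }
}
```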
You may also conceptualize an intermediate (albeit informal) solution in the form of a monad comprehension, which is syntactic sugar for long chains of `flatMap`. Note that Java does not support comprehensions.
```
do {
    x  <- unit(2);
    r1 <- addFive(x);
    r2 <- multTen(r1);
    unit(r2);
}
```
Every line of a monad comprehension is a monadic (or context) value, and we use `<-` to inspect the value of type `T` wrapped inside `Monad<T>`. The last monadic value is the result that represents the final outcome of the monad comprehension. All side-effects are handled implicitly within the monad.
We now need a translation scheme to translate a monad comprehension into a chain of `flatMap`, `map` and `filter` calls (just like for a list comprehension). In fact, a list comprehension

```
[ ex | x <- L1; test(x); y <- L2 ]
```

can be re-expressed as a monad comprehension
```
do {
    x <- L1;
    if test(x);
    y <- L2;
    unit(ex); // ex is an expression over x and y
}
```
since `L1` is also a monad with `flatMap`, `map` and `filter`.
You should get the same result for both, i.e.

```
L1.filter(x -> test(x)).flatMap(x -> L2.map(y -> ex))
```
Here is the translation scheme:

```
do { x <- L1; if test(x); rest } <-> do { x <- L1.filter(x -> test(x)); rest }
do { x <- L1; rest }             <-> L1.flatMap(x -> do { rest })
do { L1 }                        <-> L1
```
For the special case of `do { x <- L1; unit(ex) }` where `ex` is an expression over `x`, it can simply be translated to a `map` operation, `L1.map(x -> ex)`.
Here is an example of how you can apply the translation. Suppose you are given the monad comprehension

```
do { x <- L1; if (test(x)); y <- L2; unit(x + y); }
```
The translation proceeds as follows:

```
    do { x <- L1; if (test(x)); y <- L2; unit(x + y); }
<-> do { x <- L1.filter(x -> test(x)); y <- L2; unit(x + y); }          // translate test to filter
<-> L1.filter(x -> test(x)).flatMap(x -> do { y <- L2; unit(x + y); })  // translate do to flatMap
<-> L1.filter(x -> test(x)).flatMap(x -> L2.flatMap(y -> unit(x + y)))  // translate do to flatMap
```

or alternatively

```
<-> L1.filter(x -> test(x)).flatMap(x -> L2.map(y -> x + y))            // using the special-case translation to map
```
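Since `Stream` is one such monad available in Java, we can mimic the translated chain directly, taking `test(x)` to be `x % 2 == 0` and `ex` to be `x + y` purely for illustration:

```
jshell> List<Integer> L1 = List.of(1, 2, 3, 4)
L1 ==> [1, 2, 3, 4]

jshell> List<Integer> L2 = List.of(10, 20)
L2 ==> [10, 20]

jshell> L1.stream()
   ...>     .filter(x -> x % 2 == 0)                     // if test(x)
   ...>     .flatMap(x -> L2.stream().map(y -> x + y))   // y <- L2; unit(x + y)
   ...>     .toList()
$.. ==> [12, 22, 14, 24]
```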
Just like the functor laws, there are monad laws governing the `flatMap` operation. Let us take inspiration from the corresponding laws of the monoid, a first-order cousin of the monad.
In abstract algebra, a monoid is a triplet comprising:

- a set $S$;
- an associative binary operator $\bigoplus$ that maps $(S,S) \rightarrow S$; and
- an identity element $\varepsilon \in S$.
Here are some examples of monoids:

- $(\mathbb{Z},+,0)$
- $(\mathbb{Z},\times,1)$
- $(S,\cup,\emptyset)$ where $S$ represents a set
- `(String, +, "")` where `+` is the concatenation operator
Here are the two laws of the monoid governing the associative binary operator:

- Identity law: $\varepsilon \bigoplus x \equiv x \bigoplus \varepsilon \equiv x$, e.g. for $(\mathbb{Z},+,0)$ we have $0 + x \equiv x + 0 \equiv x$
- Associative law: $(x \bigoplus y) \bigoplus z \equiv x \bigoplus (y \bigoplus z)$, e.g. for $(\mathbb{Z},\times,1)$ we have $x \times (y \times z) \equiv (x \times y) \times z$
Correspondingly for monads, we have a context `C<T>` with two properties:

- an associative `flatMap :: C<T> -> (T -> C<R>) -> C<R>`
- the operator `unit :: T -> C<T>`
and two corresponding laws governing `flatMap`:

- Identity law: `unit(a).flatMap(x -> f(x)) <-> f(a).flatMap(x -> unit(x)) <-> f(a)`
- Associativity law: `(obj.flatMap(g)).flatMap(h) <-> obj.flatMap(x -> g.apply(x).flatMap(h))`
The associativity law of monads warrants a little explanation. It is similar to, but not quite the same as, the composition law of the functor, which was:

```
(obj.map(g)).map(h) <-> obj.map(h.compose(g)) <-> obj.map(x -> h.apply(g.apply(x)))
```

If we try to use this composition law of the functor as the associativity law for monads, we might write:

```
(obj.flatMap(g)).flatMap(h) <-> obj.flatMap(h.compose(g)) <-> obj.flatMap(x -> h.apply(g.apply(x)))
```
However, `g.apply(x)` gives a monadic value and hence cannot be passed to `h`, i.e. `h.compose(g)` is invalid. Therefore we need to re-express `h.apply(g.apply(x))` as `g.apply(x).flatMap(h)`.
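As a quick sanity check, both laws can be observed for `Optional` in jshell (using two arbitrary `Optional`-returning functions):

```
jshell> Function<Integer,Optional<Integer>> g = x -> Optional.of(x + 5)
g ==> $Lambda$..

jshell> Function<Integer,Optional<Integer>> h = x -> Optional.of(x * 10)
h ==> $Lambda$..

jshell> Optional.of(2).flatMap(g)                   // identity: same as g.apply(2)
$.. ==> Optional[7]

jshell> g.apply(2).flatMap(x -> Optional.of(x))     // identity: still g.apply(2)
$.. ==> Optional[7]

jshell> Optional.of(2).flatMap(g).flatMap(h)        // associativity: both give Optional[70]
$.. ==> Optional[70]

jshell> Optional.of(2).flatMap(x -> g.apply(x).flatMap(h))
$.. ==> Optional[70]
```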
Lastly, a word of advice: do be mindful that compilers cannot check whether the functor or monad laws are adhered to. The onus is on you to test your `map` and `flatMap` methods.