Step 1: Refactor existing code to pure functions - SoftDevGang/RefactorLegacyCodeThroughPureFunctions GitHub Wiki

The first step in the method is refactoring existing code to pure functions.

We start from an arbitrary piece of code and make small changes aimed at obtaining immutability and static functions. These changes separate pure code from impure code, with the goal of making as much of the code pure as possible.

The fundamental technique

The fundamental steps are the following:

NB: needs review for various programming languages. Works in C++ as detailed, and in Java or Groovy with small adjustments.

NB: needs more structure; in reality many of these steps get intertwined. Perhaps split it into steps for immutability and steps for context-free?

  • pick a piece of the code
  • extract it to a method using automated refactoring
  • make the method static. The compiler or IDE should show you the dependencies.
  • if the code doesn't work anymore, undo the "make static" change
  • extract each dependency as a parameter
  • make the method static again; now it should work
  • make each parameter const (or find the parameters that change)
  • refactor to avoid state change (described in the following)
  • extract the pure function
  • double check that the function is pure
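As a rough illustration, the steps above might lead to an end state like the following C++ sketch. The `Invoice` class, its members, and `computeTotal` are hypothetical names invented for this example; they do not come from any particular code base.

```cpp
#include <utility>
#include <vector>

// Hypothetical legacy class, invented for illustration only.
class Invoice {
public:
    Invoice(std::vector<double> prices, double taxRate)
        : prices_(std::move(prices)), taxRate_(taxRate) {}

    // Before the refactoring, the computation read prices_ and taxRate_
    // directly, so making the extracted method static failed to compile
    // and revealed both dependencies.
    // After: the member function only forwards its dependencies.
    double total() const { return computeTotal(prices_, taxRate_); }

    // Static: no access to `this`. Const parameters: the inputs cannot change.
    // The result depends only on the arguments, so the function is pure.
    static double computeTotal(const std::vector<double>& prices,
                               const double taxRate) {
        double sum = 0.0;  // state change local to the function
        for (const double price : prices) {
            sum += price;
        }
        return sum * (1.0 + taxRate);
    }

private:
    std::vector<double> prices_;
    double taxRate_;
};
```

Because `computeTotal` needs no `Invoice` instance, it can be called and tested on its own, for example as `Invoice::computeTotal({10.0, 20.0}, 0.5)`.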

The catalog of techniques

Extract I/O

Whenever we encounter an I/O operation, we extract the smallest scope possible into a static function. We cannot avoid mutation due to the I/O operations being mutable by their nature, but we can still ensure that the function is free of context as much as possible.

Another consequence of the nature of I/O operations is that the functions might not return a value.

The canonical example is printing to the console (C++-like pseudocode):

print("The number is: ", number);

which should be extracted into an I/O function:

static void printNumber(const int number){
    print("The number is: ", number);
}

or it could become two functions, one pure and one I/O:


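static string buildPrintNumberMessage(const int number){
    // to_string is required: in C++, adding an int to a string literal is pointer arithmetic
    return "The number is: " + to_string(number);
}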

static void printString(const string message){
    print(message);
}

Both functions are context free and can be moved around as needed. An additional step could be to transform the functions into lambdas, allowing us to take advantage of functional composition:

template <class F, class G>
auto compose(F f, G g){
  return [=](auto value){return f(g(value));};
}

auto buildPrintNumberMessage = [](const int number) -> string {
    return "The number is: " + to_string(number);
};

auto printString = [](const string message) -> void {
    print(message);
};

// compose(f, g) calls g first and then f, so the message builder goes second
auto printNumberString = compose(printString, buildPrintNumberMessage);

// printNumberString(20) results in printing "The number is: 20"

Refactor value assignment

A common construct in impure functions is the assignment of values. As we said, we accept that state change can happen locally within a function. However, we need to refactor state changes that affect values outside the function's scope: data members of the current class (static or non-static), data members of other classes, or global variables.

The general pattern in this situation is to replace any construct of the form:

value = newValue

with

value = pure_function(value)

The refactoring steps are the following (NB: needs review):

  • Extract the smallest scope around the assignment into another function
  • Extract the new value computation into a function
  • Make the new function pure
  • Inline in reverse order
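The pattern can be sketched in C++ as follows. The `Counter` class and the `nextValue` function are hypothetical, chosen only to keep the example short; the assignment to the data member remains, but the computation of the new value moves into a pure static function.

```cpp
// Hypothetical class, invented for illustration only.
class Counter {
public:
    explicit Counter(const int step) : step_(step) {}

    // Before: the new value was computed inline, reading the members directly:
    //     void increment() { value_ = value_ + step_; }
    // After: the assignment has the form value = pure_function(value).
    void increment() { value_ = nextValue(value_, step_); }

    int value() const { return value_; }

    // Pure: static, const parameters, result depends only on the arguments.
    static int nextValue(const int value, const int step) {
        return value + step;
    }

private:
    int value_ = 0;
    int step_;
};
```

The remaining impure code is a single assignment, and `Counter::nextValue(3, 2)` can be checked without constructing a `Counter`.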

Refactor if/else blocks

TBD

Refactor if blocks

TBD