symbolic_regression_part2 - morinim/ultra GitHub Wiki

Symbolic regression - custom evaluator

...that is all great, BUT my problem needs a particular evaluator / requires a unique data-access technique / has a peculiar way of doing things.

No problem: you can fully customise the evaluator.

Toy problem

Given a, b and c find a function f such that $a = b * f(c)$.

This example is deliberately simple. While not particularly interesting on its own, it highlights a pattern that appears in more complex problems and provides a convenient way to introduce a general technique.

Setting up code

const double a = ultra::random::between(-10.0, 10.0);
const double b = ultra::random::between(-10.0, 10.0);

a and b are assigned fixed random values, chosen once at startup.

c is somewhat different: it's a terminal. Terminals and functions are the alphabet of the program to be evolved (f). The terminal set consists of variables and constants.

In this example, c is the only terminal required (in general, numerical constants are also included):

class c : public ultra::terminal
{
public:
  c() : ultra::terminal("c") {}

  [[nodiscard]] value_t instance() const noexcept final
  {
    static const double val(ultra::random::between(-10.0, 10.0));
    return val;
  }
};

The constructor sets the name of the terminal (used for display purposes).

The instance() function returns a fixed random value: the static local variable is initialised exactly once, on the first call, so every subsequent call returns the same value.


int main()
{
  using namespace ultra;

  problem prob;

  // SETTING UP SYMBOLS
  prob.insert<c>();          // terminal
  prob.insert<real::add>();  // functions
  prob.insert<real::sub>();
  prob.insert<real::mul>();

  // ...
}

Note that the base problem class is used instead of src::problem.

src::problem provides many ready-to-use features (datasets, validation strategies, standard evaluators), whereas problem is more general and suitable for custom scenarios.

Alongside the terminal c, we include add, sub, and mul as building blocks (the function set).


https://xkcd.com/534/

Now only the evaluator (aka fitness function) is missing:

using candidate_solution = ultra::gp::individual;

// Given an individual (i.e. a candidate solution of the problem), returns a
// score measuring how good it is.
[[nodiscard]] double my_evaluator(const candidate_solution &x)
{
  using namespace ultra;

  const auto ret(run(x));

  const double f(has_value(ret) ? std::get<D_DOUBLE>(ret) : 0.0);

  const double model_output(b * f);

  const double delta(std::fabs(a - model_output));

  return -delta;
}

candidate_solution is simply an alias for gp::individual, which represents a program in linear form (a Straight Line Program).

Let us examine the evaluation process step by step.

const auto ret(run(x));

Evaluates the candidate solution and stores its output.

ret is a std::variant (see value_t for further details).

Variants allow efficient manipulation of different data types: here we're working with real numbers, but ULTRA also supports integers and strings.

const double f(has_value(ret) ? std::get<D_DOUBLE>(ret) : 0.0);

std::get<D_DOUBLE>(ret) extracts the real number from the variant.

The user must check whether the variant holds a valid value (has_value(ret)), as evolution may produce invalid programs (e.g. division by zero).

const double model_output(b * f);

const double delta(std::fabs(a - model_output));

delta measures the error as an absolute difference. Other norms (e.g. the squared error) may give better results, depending on the problem.

return -delta;

Returning -delta may look unusual: ULTRA uses standardised fitness (higher is better), rather than raw error. See fitness.h for details.


All that remains is to put the pieces together:

int main()
{
  // ...

  // AD HOC EVALUATOR
  search s(prob, my_evaluator);

  // SEARCHING
  const auto result(s.run());

  std::cout << "\nCANDIDATE SOLUTION\n"
            << out::c_language << result.best_individual()
            << "\n\nFITNESS\n" << *result.best_measurements().fitness << '\n';
}

The search object (s) is instructed to use our evaluator before being launched (s.run()).

(For convenience, the complete code is available in the examples/symbolic_regression/symbolic_regression03.cc file.)

PROCEED TO PART 3 →
