symbolic_regression_part2 - morinim/vita GitHub Wiki
Symbolic regression - Custom evaluator
...that is great BUT my problem needs a particular evaluator / requires a unique data access technique / has a peculiar way of doing things.
No problem at all, you can customize the evaluator!
Toy problem
Given a, b and c, find a function f such that $a = b \cdot f(c)$.
This is probably not of immediate interest in itself, but it illustrates a trait that may be shared by other, more complicated problems, and it is a way to explain a more general problem-solving technique.
Setting up code
const double a = vita::random::between(-10.0, 10.0);
const double b = vita::random::between(-10.0, 10.0);
a and b get two fixed, random values.
c is somewhat different: it's a terminal. Terminal and function sets are the alphabet of the to-be-evolved program (f). The terminal set consists of the variables and the constants.
For our problem c is the only terminal required (in general we also add some numbers):
class c : public vita::terminal
{
public:
  c() : vita::terminal("c") {}

  vita::value_t eval(vita::symbol_params &) const override
  {
    static const double val(vita::random::between(-10.0, 10.0));
    return val;
  }
};
The constructor (c() : vita::terminal("c") {}) sets the name of the terminal (used for display purposes).
The eval function returns a fixed random value.
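The value is fixed because the local variable in eval is static: it is initialised the first time the function runs and reused afterwards. A minimal sketch of that behaviour using only the standard library follows; random_between and fixed_random_value are hypothetical names, and random_between is merely a stand-in for vita::random::between, not its implementation.

```cpp
#include <random>

// Hypothetical stand-in for vita::random::between, built on <random>.
double random_between(double lo, double hi)
{
  static std::mt19937 engine{std::random_device{}()};
  std::uniform_real_distribution<double> dist(lo, hi);
  return dist(engine);
}

// Mirrors the body of c::eval: the static local is initialised on the first
// call only, so every later call returns the same value.
double fixed_random_value()
{
  static const double val(random_between(-10.0, 10.0));
  return val;
}
```

Calling fixed_random_value() repeatedly within one run of the program yields the same number each time; a new run draws a new number.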
int main()
{
  vita::problem prob;

  // SETTING UP SYMBOLS
  prob.insert<c>();
  prob.insert<vita::real::add>();
  prob.insert<vita::real::sub>();
  prob.insert<vita::real::mul>();

  // ...
}
Note how the base problem class is used instead of the derived src_problem. src_problem has a lot of ready-to-use functionality (dataframes for training and validation, evaluator functions for scoring the goodness of a candidate solution...) but problem is more general and adaptable to different tasks (not only symbolic regression).
Besides the terminal c we use the functions add, sub, mul as building blocks (function set).
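To make the idea of an alphabet concrete, here is a small, library-independent sketch of a program built from one terminal and the three functions. It assumes a plain expression tree purely for illustration (Vita itself uses a different representation), and node, terminal_c, binary, op and sample_program are hypothetical names that do not belong to Vita.

```cpp
#include <memory>
#include <utility>

// Hypothetical node types for illustration only (not Vita's classes).
struct node
{
  virtual ~node() = default;
  virtual double eval(double c) const = 0;
};

// The only terminal: evaluates to the value of c.
struct terminal_c : node
{
  double eval(double c) const override { return c; }
};

enum class op { add, sub, mul };

// A function node with two arguments.
struct binary : node
{
  binary(op o, std::unique_ptr<node> l, std::unique_ptr<node> r)
    : o_(o), l_(std::move(l)), r_(std::move(r)) {}

  double eval(double c) const override
  {
    const double x(l_->eval(c)), y(r_->eval(c));
    switch (o_)
    {
    case op::add: return x + y;
    case op::sub: return x - y;
    default:      return x * y;
    }
  }

  op o_;
  std::unique_ptr<node> l_, r_;
};

// Builds the program f(c) = (c + c) * c.
std::unique_ptr<node> sample_program()
{
  auto sum = std::make_unique<binary>(op::add, std::make_unique<terminal_c>(),
                                      std::make_unique<terminal_c>());
  return std::make_unique<binary>(op::mul, std::move(sum),
                                  std::make_unique<terminal_c>());
}
```

Every candidate f the search can produce is some combination of these symbols; evolution only decides which combination.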
Now what is missing is the evaluator (aka fitness function):
using candidate_solution = vita::i_mep;

// Given an individual (i.e. a candidate solution of the problem), returns a
// score measuring how good it is.
class my_evaluator : public vita::evaluator<candidate_solution>
{
public:
  vita::fitness_t operator()(const candidate_solution &x) override
  {
    const auto ret(vita::run(x));

    const double f(vita::has_value(ret) ? std::get<vita::D_DOUBLE>(ret)
                                        : 0.0);

    const double model_output(b * f);
    const double delta(std::fabs(a - model_output));

    return {-delta};
  }
};
candidate_solution is just an alias for i_mep; i_mep (Multi Expression Programming) is a kind of linear representation used for Genetic Programming.
A line by line description of the evaluation process follows:
const auto ret(vita::run(x));
Simply gets and stores the output of the candidate_solution. ret is a std::variant (see vita::value_t for further details).
Variants allow efficient manipulation of different data types: here we are working with real numbers but Vita also supports integers and strings.
const double f(vita::has_value(ret) ? std::get<vita::D_DOUBLE>(ret)
: 0.0);
std::get<vita::D_DOUBLE>(ret) extracts the real number from the variant.
The user must check that the variant is not empty (vita::has_value(ret)): this is required because the evolution process generates many nefarious individuals that could blow up for specific input values.
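The check-then-extract pattern can be tried in isolation with a plain std::variant. The sketch below assumes a hypothetical value type whose first alternative, std::monostate, plays the role of "empty". Vita's actual vita::value_t may be defined differently, and holds_something and as_double_or_zero are illustrative names only.

```cpp
#include <string>
#include <variant>

// Hypothetical value type modelled on the description above; the real
// vita::value_t may list different alternatives.
using value = std::variant<std::monostate, double, int, std::string>;

// True when the variant holds something other than the "empty" state.
bool holds_something(const value &v)
{
  return !std::holds_alternative<std::monostate>(v);
}

// Extracts a double, falling back to 0.0 for an empty result, as the
// evaluator does. Assumes a non-empty value really is a double.
double as_double_or_zero(const value &v)
{
  return holds_something(v) ? std::get<double>(v) : 0.0;
}
```

Note that std::get throws std::bad_variant_access when the variant holds a different alternative, so this shortcut is only safe when, as in the toy problem, every symbol works with real numbers.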
const double model_output(b * f);
const double delta(std::fabs(a - model_output));
delta is a measure of the error based on the absolute value. Different norms may give better results, depending on the problem.
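As an illustration of what a different norm could look like, the sketch below places the absolute error next to a squared error. The function names are hypothetical, and the squared error is just one possible alternative, not something the example file uses.

```cpp
#include <cmath>

// Absolute error, as used in the evaluator above.
double absolute_error(double target, double output)
{
  return std::fabs(target - output);
}

// Squared error: one possible alternative that penalises large misses more
// heavily than small ones.
double squared_error(double target, double output)
{
  const double d(target - output);
  return d * d;
}
```

Both are zero exactly when the model output equals the target, so they agree on what a perfect solution is and differ only in how imperfect candidates are ranked against each other when several errors are combined.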
return {-delta};
The last instruction can be confusing, so let's see some details:
- -delta instead of delta. Vita uses standardized fitness (greater is better) not raw fitness. See the comments in fitness.h;
- {-delta} instead of -delta. Fitness is a vector type. Here it's just a one dimensional vector; in general it could have more dimensions (side note, by default evolution uses a lexicographic comparison for fitness).
All that remains is to put the pieces together:
int main()
{
  // ...

  // AD HOC EVALUATOR
  vita::search<candidate_solution> s(prob);
  s.training_evaluator<my_evaluator>();

  // SEARCHING
  const auto result(s.run());

  std::cout << "\nCANDIDATE SOLUTION\n"
            << vita::out::c_language << result.best.solution
            << "\n\nFITNESS\n" << result.best.score.fitness << '\n';
}
The search object (s) is instructed to use our evaluator (s.training_evaluator<my_evaluator>()) before being launched (s.run()).
(For convenience, all the code is in the examples/symbolic_regression03.cc file.)