symbolic_regression_part4 - morinim/ultra GitHub Wiki
Evolving multiple programs at the same time is useful, but some problems also require multiple input variables.
How should we handle both aspects together?
If you only need multiple variables (without teams), src::search is sufficient. In that case, you can safely stop here and refer to the tutorial or the example source.
In general, prefer src::search whenever possible: it provides built-in support for model metrics and validation strategies.
The problem considered here extends the previous example: we now need to evolve multiple programs that depend on multiple variables.
This combination (teams + variables) cannot be handled by a single predefined abstraction. Instead, the user must customise the generic ultra::search class to match the problem.
A painstaking extension of the previous example is technically viable, but we have a better option.
Instead of defining a custom terminal (c), we use the predefined src::variable terminal.
Variables act as placeholders whose values are provided at execution time.
In the main() function:
prob.sset.insert<c>();

has been replaced with:

prob.insert<src::variable>(0, "x1");
prob.insert<src::variable>(1, "x2");
prob.insert<src::variable>(2, "x3");

Each variable is defined by:
- an index, used to retrieve its value at execution time (e.g. 0);
- a name, used for display.
At evaluation time, the index refers to a position inside a vector of input values.
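The placeholder idea can be illustrated without the library. The following is a minimal sketch, not Ultra's implementation: `toy_variable` and `toy_program` are hypothetical names, and plain `double` inputs stand in for the library's value type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a variable terminal: it stores only an index and
// a display name, never a value.
struct toy_variable
{
  toy_variable(std::size_t i, std::string n) : index(i), name(std::move(n)) {}

  // The value is looked up at execution time, by position, in the inputs
  // supplied for the current example.
  double eval(const std::vector<double> &inputs) const
  {
    assert(index < inputs.size());
    return inputs[index];
  }

  std::size_t index;
  std::string name;
};

// A fixed toy "program" computing x1 + x2 * x3 from three variables.
inline double toy_program(const std::vector<double> &inputs)
{
  const toy_variable x1(0, "x1"), x2(1, "x2"), x3(2, "x3");
  return x1.eval(inputs) + x2.eval(inputs) * x3.eval(inputs);
}
```

Calling `toy_program` with different input vectors evaluates the same expression on different data, which is what evaluating one program over many training examples amounts to.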
A training example can be represented as:
struct example
{
  example(const std::vector<double> &ex_a, const ultra::matrix<double> &ex_b,
          const std::vector<double> &ex_x)
    : a(ex_a), b(ex_b)
  {
    std::ranges::copy(ex_x, std::back_inserter(x));
  }

  std::vector<double> a;
  ultra::matrix<double> b;
  std::vector<ultra::value_t> x {};
};

x stores the values of the variables for a given example (x[i] corresponds to the variable with index i).
Although the problem uses real numbers, Ultra employs ultra::value_t to support multiple data types. Therefore, we convert the input vector (ex_x) into a vector of value_t.
std::ranges::copy performs this conversion once, avoiding repeated conversions during evaluation.
The training set is a collection of examples:
using training_set = std::vector<example>;

Any iterable container can be used (e.g. std::list instead of std::vector).
We can now leverage the existing sum_of_errors_evaluator class (see src/evaluator.h) to simplify the implementation.
sum_of_errors_evaluator is a template class that:
- computes the total error of a program over the training set;
- converts this error into a standardized fitness.
template<Individual P, class F, class D = multi_dataset<dataframe>>
  requires ErrorFunction<F, D>
class sum_of_errors_evaluator : public evaluator<D>
{
public:
  explicit sum_of_errors_evaluator(D &);

  [[nodiscard]] auto operator()(const P &) const;

  // ...
};

The error functor (F) is constructed from a candidate solution and computes the error for a single example:

const F err_fctr(prg);
auto error(err_fctr(example));

The implementation closely follows the previous example, with one key difference: inputs are now provided explicitly.
class error_functor
{
public:
  error_functor(const candidate_solution &s) : s_(s) {}

  double operator()(const example &ex) const
  {
    using namespace ultra;

    // Run every program of the team on the inputs of the current example.
    // A program that returns no value contributes 0.0.
    std::vector<double> f(N);
    std::ranges::transform(s_, f.begin(),
                           [&ex](const auto &prg)
                           {
                             const auto ret(run(prg, ex.x));
                             return has_value(ret) ? std::get<D_DOUBLE>(ret) : 0.0;
                           });

    // model = b * f (matrix-vector product).
    std::vector<double> model(N, 0.0);
    for (unsigned i(0); i < N; ++i)
      for (unsigned j(0); j < N; ++j)
        model[i] += ex.b(i, j) * f[j];

    // Sum of the absolute differences between the target (a) and the model.
    double delta(std::inner_product(ex.a.begin(), ex.a.end(),
                                    model.begin(), 0.0,
                                    std::plus{},
                                    [](auto v1, auto v2)
                                    {
                                      return std::fabs(v1 - v2);
                                    }));

    return delta;
  }

private:
  candidate_solution s_;
};

For convenience, the complete code is available in examples/symbolic_regression05.cc.
Note that run(prg) becomes run(prg, ex.x), allowing variable values from the current example to be passed to the program.
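To make the error computation concrete, here is a small worked example with N = 2, written without the library. It is only a sketch: `toy_matrix`, `apply`, `abs_error` and `worked_example` are hypothetical names, a nested `std::vector` stands in for ultra::matrix, and the vector f is given directly instead of being produced by running evolved programs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for ultra::matrix<double>.
using toy_matrix = std::vector<std::vector<double>>;

// model = b * f (matrix-vector product).
inline std::vector<double> apply(const toy_matrix &b,
                                 const std::vector<double> &f)
{
  std::vector<double> model(b.size(), 0.0);
  for (std::size_t i = 0; i < b.size(); ++i)
  {
    assert(b[i].size() == f.size());
    for (std::size_t j = 0; j < f.size(); ++j)
      model[i] += b[i][j] * f[j];
  }
  return model;
}

// Sum of the absolute differences between target and model, computed with
// the generalised form of std::inner_product.
inline double abs_error(const std::vector<double> &a,
                        const std::vector<double> &model)
{
  assert(a.size() == model.size());
  return std::inner_product(a.begin(), a.end(), model.begin(), 0.0,
                            std::plus{},
                            [](double v1, double v2)
                            { return std::fabs(v1 - v2); });
}

inline double worked_example()
{
  const toy_matrix b = {{1.0, 2.0}, {3.0, 4.0}};
  const std::vector<double> f = {1.0, 1.0};  // outputs of the two programs
  const std::vector<double> a = {2.0, 9.0};  // target values
  return abs_error(a, apply(b, f));          // model is {3, 7}
}
```

Here the model is {3, 7}, so the error is |2 - 3| + |9 - 7| = 3.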
In summary, each example provides a different set of input values, and the same team of programs is evaluated across all examples:
| Summary |
|---|
| Teams → multiple programs |
| Variables → dynamic inputs |
| Evaluator → combines both |