anatomy - morinim/ultra GitHub Wiki
This document is a work in progress.
Abstract: this paper presents an extended specification for the Ultra evolutionary framework. Architectural choices are detailed and their conceptual significance is discussed. Key implementation aspects are also covered.
Logical view
Ultra implements an object-oriented layered architecture, built on five classes:
- basic_search and evolution;
- problem;
- layered_population;
- parameters.
classDiagram
note "High level logical view"
namespace ultra {
class basic_search {
+run()
}
class evolution
class layered_population
class parameters
class problem {
+parameters params
+symbol_set sset
}
}
evolution "1" <-- "1" basic_search
layered_population "1" <-- "1" evolution
problem "1" <-- "1" basic_search
problem "1" *-- parameters
The purpose of the layered architecture is to clearly separate foundational concepts, such as the internal structure of individuals, from higher-level abstractions more directly related to the evolutionary process.
This is the minimal skeleton of a program using the Ultra framework:
ultra::basic_search<a_given_evolutionary_strategy> s(a_problem_instance, a_fitness_function);
s.run();
The basic_search class lets users select the evolutionary strategy to employ and provides a unified interface for different types of search. In most cases, users can simply rely on ultra::search, a convenient instantiation of ultra::basic_search that implements a general-purpose evolutionary strategy:
ultra::search s(a_problem_instance, a_fitness_function);
s.run();
There are various specialisations of ultra::basic_search for different tasks: de::search for Differential Evolution, ga::search for Genetic Algorithms, hga::search for Heterogeneous Genetic Algorithms and src::search for Symbolic Regression and Classification.
All strategies, regardless of the specific search class, rely on ultra::problem or one of its specialisations, to access problem parameters / constraints (via the params data member), and the building blocks for individuals (via the sset data member).
classDiagram
note "Problem logical view"
namespace ultra {
class problem {
+parameters params
+symbol_set sset
}
class symbol
class symbol_set {
+insert(symbol, weight)
}
class w_symbol {
+symbol sym
+unsigned weight
}
}
symbol_set "1" *-- "*" w_symbol
problem "1" *-- "1" symbol_set
w_symbol "1" *-- "1" symbol
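The relationship between symbol_set, w_symbol and symbol can be illustrated with a small self-contained sketch. It is not Ultra's implementation: toy_symbol, toy_w_symbol, toy_symbol_set, total_weight and pick are hypothetical names, and the sketch assumes that a weight biases how often a symbol is drawn when individuals are built.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for ultra::symbol and ultra::w_symbol.
struct toy_symbol
{
  std::string name;
};

struct toy_w_symbol
{
  toy_symbol sym;
  unsigned weight;
};

// Hypothetical stand-in for ultra::symbol_set.
class toy_symbol_set
{
public:
  // Mirrors the insert(symbol, weight) operation shown in the diagram.
  void insert(toy_symbol s, unsigned weight)
  {
    assert(weight > 0);
    symbols_.push_back({std::move(s), weight});
    total_ += weight;
  }

  unsigned total_weight() const { return total_; }

  // Roulette-wheel lookup (an assumption about how weights are used):
  // `ticket` must lie in [0, total_weight()).
  const toy_symbol &pick(unsigned ticket) const
  {
    assert(ticket < total_);
    for (const auto &ws : symbols_)
    {
      if (ticket < ws.weight)
        return ws.sym;
      ticket -= ws.weight;
    }
    return symbols_.back().sym;  // not reached when the precondition holds
  }

private:
  std::vector<toy_w_symbol> symbols_;
  unsigned total_ = 0;
};
```

With weights 3 and 1, the first symbol owns three of the four tickets, so a uniformly drawn ticket selects it three times out of four.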
Being an evolutionary framework, Ultra performs its work with the help of the ultra::evolution and ultra::population classes.
Problem
classDiagram
note "Problem in-depth view"
class problem {
+parameters params
+symbol_set sset
}
class symbol
class symbol_set {
+insert(symbol, weight)
}
class w_symbol {
+symbol sym
+unsigned weight
}
class `de::problem` {
+problem(dimensions, interval)
+problem(intervals)
+insert(interval, category) real *
}
class `ga::problem` {
+problem(dimensions, interval)
+problem(intervals)
+insert(interval, category) integer *
}
class `hga::problem` {
+insert(...) terminal *
}
class `src::problem` {
+problem(dataframe)
+problem(filepath)
+problem(input_stream)
+setup_symbols()
+setup_terminals()
}
symbol_set "1" *-- "*" w_symbol
problem "1" *-- "1" symbol_set
w_symbol "1" *-- "1" symbol
problem <|-- `de::problem`
problem <|-- `hga::problem`
problem <|-- `ga::problem`
problem <|-- `src::problem`
Specialisations of the problem class simplify the setup of the evolutionary environment and the evaluation of individuals. For instance:
ultra::de::problem prob(5, {-5.12, 5.12});
ultra::de::search search(prob, function_to_be_optimized);
const auto result(search.run());
defines a five-dimensional search space where each variable ranges over the interval $[-5.12;5.12]$, and then launches an optimisation based on differential evolution.
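As an illustration of what function_to_be_optimized might look like, the sketch below implements a negated Rastrigin function, a common benchmark on this interval. The choice of benchmark is ours, not the document's, and the candidate alias (a plain vector of doubles) is a hypothetical stand-in for the individual type that Ultra actually passes to the function.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Stand-in for an individual: one real value per dimension (assumption).
using candidate = std::vector<double>;

// Negated Rastrigin function: 0 at the origin, negative everywhere else,
// so that bigger values denote better candidates.
double neg_rastrigin(const candidate &x)
{
  assert(!x.empty());
  const double pi = std::acos(-1.0);

  double sum = 10.0 * static_cast<double>(x.size());
  for (double xi : x)
  {
    assert(-5.12 <= xi && xi <= 5.12);  // search interval of the example
    sum += xi * xi - 10.0 * std::cos(2.0 * pi * xi);
  }

  return -sum;
}
```

The sign is flipped because the framework treats larger fitness values as better, while Rastrigin is usually stated as a minimisation problem.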
Population
The default population is organized in multiple subgroups or layers. In the picture, each subgroup is represented as a torus to mark the fact that many evolutionary strategies may use a Trivial Geography scheme, where individuals are viewed as having a 1-dimensional spatial structure, essentially a circle, where the first and last positions are considered adjacent. The production of an individual for location i is permitted to involve only parents from i's local neighborhood.
This behaviour can be disabled by setting parameters::population::mate_zone to a sufficiently large value. Each layer is implemented in the linear_population class.
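The ring-shaped neighbourhood can be sketched as follows. This is a minimal illustration under stated assumptions rather than Ultra's code: ring_neighbour and ring_distance are hypothetical helpers, and mate_zone is interpreted here as the maximum number of steps, in either direction, between a location and its potential parents.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Index of a potential parent for location `i` on a ring of `n` individuals,
// chosen uniformly among positions at most `mate_zone` steps away
// (location `i` itself included).
std::size_t ring_neighbour(std::size_t i, std::size_t n, std::size_t mate_zone,
                           std::mt19937 &rng)
{
  assert(n > 0 && i < n);

  // A zone covering the whole ring degenerates to unrestricted selection.
  if (mate_zone >= n / 2)
    return std::uniform_int_distribution<std::size_t>(0, n - 1)(rng);

  std::uniform_int_distribution<std::size_t> offset(0, 2 * mate_zone);

  // Add n - mate_zone instead of subtracting mate_zone, so the unsigned
  // arithmetic never underflows; the modulo wraps around the ring.
  return (i + n - mate_zone + offset(rng)) % n;
}

// Shortest distance between two positions on a ring of `n` individuals.
std::size_t ring_distance(std::size_t a, std::size_t b, std::size_t n)
{
  assert(a < n && b < n);
  const std::size_t d = a > b ? a - b : b - a;
  return d < n - d ? d : n - d;
}
```

Because the first and last positions are adjacent, a parent for location 0 may come from the end of the sequence. Once mate_zone reaches half the layer size, every individual is a valid parent, which matches the way the restriction is disabled.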
The global population is implemented in the layered_population class. The first and last layers are highlighted in distinct colours to reflect their special handling in some algorithms. Notably, the ALPS algorithm segregates individuals based on their ages:
- the first layer contains the youngest individuals;
- upper layers contain progressively older individuals;
- the last layer contains the oldest individuals (without an age limit).
In contrast, the standard evolutionary strategy treats each layer in the same way.
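The age-based segregation used by ALPS can be sketched as a lookup from an individual's age to a layer index. The helper layer_for_age and the age limits used in the usage note are hypothetical. The document does not state how Ultra computes per-layer limits, so they are passed in as plain data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the lowest layer whose age limit admits `age`.
// `max_age[l]` is the (hypothetical) age limit of layer l. The last layer
// has no entry because it accepts individuals of any age.
std::size_t layer_for_age(unsigned age, const std::vector<unsigned> &max_age)
{
  assert(std::is_sorted(max_age.begin(), max_age.end()));

  for (std::size_t l = 0; l < max_age.size(); ++l)
    if (age <= max_age[l])
      return l;

  return max_age.size();  // last, unbounded layer
}
```

With limits {20, 40, 80} there are four layers: an individual of age 21 belongs to layer 1, while one of age 500 ends up in the last layer.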
Regardless of the chosen strategy, all layers are evolved in parallel. Possible interactions among layers depend on the strategy. For example, with the standard evolutionary strategy and the differential evolution strategy there is no direct interaction; with ALPS the i-th layer may sometimes access the (i-1)-th layer for selection / recombination and upper layers for replacement.
The linear_population class provides a mutex via the method [[nodiscard]] auto &mutex() const, which must be used to coordinate access in multi-threaded environments.
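The locking pattern can be sketched with a toy class that exposes its mutex in the same way. toy_layer and fill_concurrently are hypothetical. The use of std::shared_mutex is an assumption too, since the document only gives the accessor's signature.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a layer that exposes its mutex.
class toy_layer
{
public:
  // Same shape as the accessor described above; `mutable` lets a const
  // member function hand out a lockable reference.
  [[nodiscard]] auto &mutex() const { return mutex_; }

  // Callers are expected to hold the appropriate lock.
  void push_back(int individual) { members_.push_back(individual); }
  [[nodiscard]] std::size_t size() const { return members_.size(); }

private:
  mutable std::shared_mutex mutex_;
  std::vector<int> members_;
};

// Several workers append to the same layer, each under an exclusive lock.
std::size_t fill_concurrently(toy_layer &layer, unsigned workers,
                              unsigned per_worker)
{
  assert(workers > 0);

  std::vector<std::thread> pool;
  for (unsigned w = 0; w < workers; ++w)
    pool.emplace_back([&layer, per_worker] {
      for (unsigned k = 0; k < per_worker; ++k)
      {
        std::lock_guard lock(layer.mutex());  // exclusive: we modify
        layer.push_back(static_cast<int>(k));
      }
    });

  for (auto &t : pool)
    t.join();

  std::shared_lock lock(layer.mutex());  // shared: we only read
  return layer.size();
}
```

Writers take an exclusive lock and readers a shared one. Without the lock, concurrent push_back calls on the same vector would be a data race.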
Fitness
Fitness is a scalar/vector value assigned to an individual which reflects how well the individual solves the task.
The literature identifies (at least) four distinct measures of fitness:
- raw fitness
- standardized fitness
- adjusted fitness
- normalized fitness
but we are mostly interested in the first two.
The raw fitness is the measurement of fitness that is stated in the natural terminology of the problem itself, so the better value may be either smaller or larger.
For example, in an optimal control problem one may be trying to minimise some cost measure, so a lower value of raw fitness is better.
Since raw fitness is defined in problem-specific terms, its interpretation may vary significantly across domains. Therefore, Ultra adopts standardized fitness as the primary measure, ensuring consistency across tasks. The only requirement for standardized fitness is that bigger values represent better individuals (this may differ in other frameworks).
Many of the fitness functions in Ultra (see class src::evaluator and its specialisations) define the optimal fitness as the scalar value 0, or the vector (0, ..., 0), and use negative values for sub-optimal solutions. However, this is not mandatory.
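The convention just described (optimum at 0, negative values for sub-optimal solutions) can be obtained from an error-based raw fitness by negation. The sketch below assumes a sum-of-absolute-errors measure. raw_error and standardized_fitness are hypothetical helpers and are not part of src::evaluator.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Raw fitness: sum of absolute errors, stated in the problem's own terms
// (a lower value is better).
double raw_error(const std::vector<double> &predicted,
                 const std::vector<double> &expected)
{
  assert(predicted.size() == expected.size());

  double err = 0.0;
  for (std::size_t i = 0; i < predicted.size(); ++i)
    err += std::fabs(predicted[i] - expected[i]);

  return err;
}

// Standardized fitness: negate the error so that bigger is better.
// A perfect candidate scores 0, every other candidate scores below 0.
double standardized_fitness(const std::vector<double> &predicted,
                            const std::vector<double> &expected)
{
  return -raw_error(predicted, expected);
}
```

A candidate that misses one target by 2 gets a fitness of -2 and therefore ranks below a perfect candidate, whose fitness is 0.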
Except in the simplest cases, fitness alone is often not enough to understand the strengths and flaws of a candidate solution.