3. A (not so) short Constraint Programming tutorial - richoux/GHOST GitHub Wiki
Constraint Programming is a programming paradigm for modeling and solving combinatorial problems. Constraint Programming is declarative: users declare what their problem is, and a program designed to search for solutions, called a solver, aims to quickly find an optimal or near-optimal solution. Constraint Programming is part of AI, in the sense that it aims to design solvers able to solve any kind of combinatorial problem.
The idea of Constraint Programming is to model combinatorial problems in such a way as to give the solver a structure over the problem that it can exploit efficiently. This structure is called the constraint network. This network is built according to which constraints are linked together: two constraints in the model are linked if they have at least one variable in common.
There exist many different Constraint Programming formalisms. GHOST has been designed to handle the two main ones, namely Constraint Satisfaction Problems (CSP) and Constrained Optimization Problems (COP), as well as two not-so-well-known but very efficient formalisms: Error Function-based Constraint Satisfaction Problems (EF-CSP) and Error Function-based Constrained Optimization Problems (EF-COP).
Different Constraint Programming formalisms handled by GHOST
Constraint Satisfaction Problems (CSP) and Constrained Optimization Problems (COP) are two closely related formalisms to model combinatorial satisfaction and optimization problems, respectively. For a decision problem, also called a yes-no problem, we want to know if a solution exists, and if so, what this solution is. For an optimization problem, we are looking for an optimal solution, the best one according to a given criterion. For instance, "Does a route of 5 km at most from A to B exist?" is a decision problem. "What is the shortest route from A to B?" is an optimization one.
Thus, the difference between a CSP and a COP is simple:
- A CSP models a decision problem where all possible solutions are equivalent, so usually, you just want to find one of them. A solution of a CSP is a vector of values, one value for each variable of the problem, such that all constraints are satisfied.
- A COP models an optimization problem, where some solutions are better than others, and you want the best one, or at least a 'good enough' solution. Here, a solution of a COP is a vector of values, one value for each variable of the problem, such that all constraints are satisfied, and for which the objective function outputs the best score among possible solutions.
Some vectors of values satisfy all constraints, some don't. In Constraint Programming, any vector of values is called a candidate. A candidate satisfying all constraints is a solution, optimal or not.
GHOST has been developed to work in programs and systems requiring a solution within a very short time. For such systems, it is usually preferable to get a very good solution within 100 milliseconds rather than the perfect solution within 1 second.
Finally, Error Function-based Constraint Satisfaction Problems (EF-CSP) and Error Function-based Constrained Optimization Problems (EF-COP) model combinatorial satisfaction and optimization problems, respectively. Remember the constraint network structure? The goal of error functions is to provide the solver with a second structure, this time over candidates, so the solver can search for solutions more efficiently.
Definition of a Constraint Satisfaction Problem (CSP)
Let's start by defining a CSP. To model your problem with a CSP, you need to define three things:
- V, the set of variables of your CSP.
- D, the domain of your CSP, i.e., the set of values your variables can take.
- C, the set of constraint predicates of your CSP.
Constraint predicates are functions outputting true or false. They take as input the values of the variables in their scope, and output true if the combination of values satisfies the constraint, and false otherwise. Let's consider the constraint predicate <, representing the less-than relation and taking two variables in its scope, say x and y. We thus have the constraint <(x,y), more conventionally written x<y. With x=3 and y=2, 3<2 outputs false. With x=1 and y=5, 1<5 outputs true.
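To make this concrete, here is a minimal sketch in plain Python (not GHOST's actual API): a constraint predicate is just a Boolean function over the values of the variables in its scope.

```python
def less_than(x, y):
    # Constraint predicate for x < y: True when the values satisfy
    # the constraint, False otherwise.
    return x < y

print(less_than(3, 2))  # False: 3 < 2 does not hold
print(less_than(1, 5))  # True: 1 < 5 holds
```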
Let's take now a simple example of a CSP model:
V = {x, y, z}. The variables of our CSP are x, y and z.
D = {0, 1, 2}. Our variables x, y and z can take a value from D, *i.e.*, be equal to 0, 1 or 2.
We can have for instance x = 1, y = 1 and z = 0, giving the candidate (110).
C = {=, ≠, <}. We have three types of constraint predicates here: equal, different and less than.
Ok, now what? Well, to describe an instance of our problem, we have to build a formula from our CSP. This is a bit like playing with Lego: you combine blocks to build bigger blocks, then you combine your bigger blocks to create an object.
Here, your blocks are your variables. You can combine them with a constraint predicate to build a bigger block, i.e., a constraint. For instance, we can build the constraint x=z, and the constraint z≠y, etc.
Then, we can build a formula by combining constraints. Combining means here that we have a conjunction, i.e., an "and"-operator linking two constraints. A formula with the CSP described above could be for instance:
z=y AND y≠x AND x<z
A first example of a CSP formula
Consider the following CSP:
V = {a, b, c, d}.
D = {0, 1, 2}.
C = {=, <}.
and suppose our problem instance is modeled by the formula
a=b AND b<d AND c<d AND b=c
A solution of our problem is an evaluation of each variable to a value of the domain D such that our 4 constraints are satisfied.
For instance, the evaluation a=1, b=0, c=1 and d=2 - i.e., the candidate (1012) - is not a solution of the formula, because the first and fourth constraints, a=b and b=c, are not satisfied.
However, the evaluation a=1, b=1, c=1 and d=2 - the candidate (1112) - satisfies all constraints of the formula, and is then a solution to our problem.
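To see this concretely, here is a brute-force check of the formula over the domain {0, 1, 2}; this is a plain Python illustration, not how GHOST actually searches for solutions.

```python
from itertools import product

def satisfies(a, b, c, d):
    # The four constraints of the formula, linked by "and".
    return a == b and b < d and c < d and b == c

# (1, 0, 1, 2) violates a=b and b=c; (1, 1, 1, 2) satisfies everything.
print(satisfies(1, 0, 1, 2))  # False
print(satisfies(1, 1, 1, 2))  # True

# Enumerate every candidate over the domain {0, 1, 2}.
solutions = [cand for cand in product(range(3), repeat=4) if satisfies(*cand)]
print(solutions)  # [(0, 0, 0, 1), (0, 0, 0, 2), (1, 1, 1, 2)]
```

The enumeration shows the formula has exactly three solutions; such exhaustive search only works for tiny domains, which is precisely why dedicated solvers exist.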
Definition of a Constrained Optimization Problem (COP)
A COP is defined as a CSP but with an objective function f in addition.
- V, the set of variables of your COP.
- D, the domain of your COP, i.e., the set of values your variables can take.
- C, the set of constraint predicates of your COP.
- f, the objective function to maximize or minimize.
As with CSPs, we define a COP formula to express a concrete problem instance.
A first example of a COP formula
Consider the same example as in the CSP section, with an objective function in addition.
V = {a, b, c, d}.
D = {0, 1, 2}.
C = {=, <}.
f = Minimize( a + b + c + d )
and suppose our problem instance is modeled by the same formula as previously:
a=b AND b<d AND c<d AND b=c
The evaluation a=1, b=0, c=1 and d=2, aka the candidate (1012), is still not a solution of the formula since the first and fourth constraints, a=b and b=c, are not satisfied.
The evaluation a=1, b=1, c=1 and d=2 satisfies all constraints of the formula, so the candidate (1112) is a solution to our problem. We have f(1,1,1,2) = 5. Since we want to minimize the value output by f, you can see we can do better.
Indeed, the evaluation a=0, b=0, c=0 and d=1 is a solution since it satisfies all constraints of the formula, and it is the optimal one, with f(0,0,0,1) = 1. No other solution can lead to a better value of our objective function.
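Under the same caveat (a brute-force sketch in plain Python, not GHOST's solving method), we can enumerate all solutions and keep the one minimizing the objective:

```python
from itertools import product

def satisfies(a, b, c, d):
    # The constraints of the formula.
    return a == b and b < d and c < d and b == c

def objective(a, b, c, d):
    # The objective function to minimize.
    return a + b + c + d

# Among all candidates over {0, 1, 2} satisfying the constraints,
# keep the one with the smallest objective value.
solutions = [cand for cand in product(range(3), repeat=4) if satisfies(*cand)]
best = min(solutions, key=lambda cand: objective(*cand))
print(best, objective(*best))  # (0, 0, 0, 1) 1
```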
Definition of an Error Function-based Constraint Satisfaction Problem (EF-CSP)
An EF-CSP is very similar to a CSP. In the same fashion you get a COP by adding an objective function to a CSP, you get an EF-COP by adding an objective function to an EF-CSP.
For the sake of simplicity, we will only talk about decision problems with EF-CSPs in this section, but keep in mind you just have to add an objective function to model an EF-COP.
An EF-CSP is defined as a CSP but replacing constraint predicates with error functions, one for each type of constraints to express.
- V, the set of variables of your EF-CSP.
- D, the domain of your EF-CSP, i.e., the set of values your variables can take.
- F, the set of error functions.
Where a constraint predicate is either true or false, an error function outputs a real value. Not only does this value tell whether a constraint is satisfied, it also indicates how close or how far the input is from satisfying the constraint. If the error function outputs 0, the combination of values given as input satisfies the constraint. Otherwise, the higher the value, the further the combination is from satisfying the constraint.
Again, consider the constraint x=y. Assume its error function is f(x,y) = |x-y|. For all values such that we actually have x=y (0=0, 1=1, ...), the value of f(x,y) is 0, meaning those combinations satisfy the constraint. If we have x=1 and y=2, then f(x,y) = 1, meaning it is not a solution, but it is close to being one. With x=1 and y=100, we have f(x,y) = 99, meaning this combination is much further from being a solution.
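As a quick sketch in plain Python (again, not GHOST code), this error function is simply:

```python
def error_eq(x, y):
    # Error function for the constraint x = y: 0 when satisfied,
    # otherwise the distance between x and y.
    return abs(x - y)

print(error_eq(1, 1))    # 0: the constraint is satisfied
print(error_eq(1, 2))    # 1: close to a solution
print(error_eq(1, 100))  # 99: far from a solution
```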
As with CSPs and COPs, an EF-CSP (or EF-COP) formula corresponds to a concrete problem instance.
A first example of an EF-CSP formula
Consider the same example as in the CSP section, with a set of error functions instead of constraint predicates.
V = {a, b, c, d}.
D = {0, 1, 2}.
F = {f=, f<}.
such that f= and f<, respectively representing the constraints = and <, are defined as follows:
f=(x,y) = |x-y|
f<(x,y) = max(0, x - y + 1)
Note the + 1 in f<: since the constraint x < y is strict, the error must be positive when x = y.
We cannot link constraints represented by error functions with a conjunction anymore. There are several possibilities to combine error functions. The simplest and most intuitive one is certainly addition. Thus, we have the following EF-CSP formula:
f=(a,b) + f<(b,d) + f<(c,d) + f=(b,c)
and solutions remain exactly the same: a candidate is a solution if and only if this sum of errors evaluates to 0.
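The formula can be sketched in plain Python as follows; note the sketch uses max(0, x - y + 1) for the strict constraint x < y over integers, so that the error is positive when x = y:

```python
def f_eq(x, y):
    # Error of the constraint x = y: distance between x and y.
    return abs(x - y)

def f_lt(x, y):
    # Error of the strict constraint x < y over integers:
    # 0 when satisfied, positive otherwise (hence the + 1 when x = y).
    return max(0, x - y + 1)

def total_error(a, b, c, d):
    # The EF-CSP formula: the sum of all constraint errors.
    return f_eq(a, b) + f_lt(b, d) + f_lt(c, d) + f_eq(b, c)

print(total_error(1, 0, 1, 2))  # 2: candidate (1012) is not a solution
print(total_error(1, 1, 1, 2))  # 0: candidate (1112) is a solution
```

A total error of 0 characterizes exactly the solutions found in the CSP section.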
The Knapsack Problem
Ok, now how to model a problem through a CSP formula? This is not always trivial and requires some experience. One of GHOST's main goals is to make the modeling part easy, making Constraint Programming usable for non-experts.
Let's take a very classic combinatorial problem: the knapsack problem. It is more than a toy problem, since it is difficult to solve and is used to express very concrete resource allocation problems, for instance in finance or in cryptography.
The problem is the following: You have a knapsack with a given capacity (say, 30 liters). You have many objects of different types you want to bring with you. For the sake of simplicity, let's say you only possess two kinds of objects: 50 bottles of 1 liter of iced tea, costing 500 yen each (yes, I live in Japan, and right now I am in a coffee shop), and 10 large sandwiches of 1.25 liters, costing 650 yen each.
There exist two versions of this problem:
- the decision version, where you want to answer the question "Is it possible to bring objects for a total of at least 15,000 yens?", for instance.
- the optimization version, which is the usual one for the knapsack problem: "What is the combination of objects I can put in the knapsack to maximize the total value?"
A CSP model for the knapsack problem
V = {bottle, sandwich}.
D = {{[0,50]}, {[0,10]}}.
C = {capacity, at_least}.
where
capacity(bottle, sandwich) = (bottle * 1 + sandwich * 1.25) ≤ 30
at_least(bottle, sandwich) = (bottle * 500 + sandwich * 650) ≥ 15,000
Observe that the domain D is here a set of domains. When all variables can take the same values, D is a simple set of values, but in the general case, your variables can take different values (here, up to 50 bottles and 10 sandwiches). Thus, D is in general a set of sets of values, like in this example.
Our CSP formula will then be
capacity(bottle, sandwich) AND at_least(bottle, sandwich)
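As a rough sketch (plain Python, not GHOST code), we can check a candidate against both constraints:

```python
def capacity(bottle, sandwich):
    # Total volume must fit in the 30-liter knapsack.
    return bottle * 1 + sandwich * 1.25 <= 30

def at_least(bottle, sandwich):
    # Total value must reach 15,000 yen.
    return bottle * 500 + sandwich * 650 >= 15000

def formula(bottle, sandwich):
    # The CSP formula: the conjunction of both constraints.
    return capacity(bottle, sandwich) and at_least(bottle, sandwich)

# 20 bottles and 8 sandwiches: 20 + 10 = 30 liters, 10,000 + 5,200 = 15,200 yen.
print(formula(20, 8))  # True: this candidate is a solution
```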
A COP model for the knapsack problem
We only need to add an objective function that will replace the at_least constraint (optionally, you can keep this constraint to reject solutions below a given threshold).
V = {bottle, sandwich}.
D = {{[0,50]}, {[0,10]}}.
C = {capacity}.
f = Maximize(bottle * 500 + sandwich * 650)
Without the value constraint, our COP formula is simply
capacity(bottle, sandwich)
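A brute-force sketch of this COP in plain Python enumerates all candidates within the domains and keeps the best feasible one (with such small domains, exhaustive search is still tractable):

```python
def capacity(bottle, sandwich):
    # Total volume must fit in the 30-liter knapsack.
    return bottle * 1 + sandwich * 1.25 <= 30

def value(bottle, sandwich):
    # The objective to maximize: total value in yen.
    return bottle * 500 + sandwich * 650

# Enumerate the whole domain: up to 50 bottles and 10 sandwiches.
best = max(
    ((b, s) for b in range(51) for s in range(11) if capacity(b, s)),
    key=lambda cand: value(*cand),
)
print(best, value(*best))  # (20, 8) 15200
```

The optimum packs 20 bottles and 8 sandwiches for exactly 30 liters and 15,200 yen.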
An EF-CSP model for the knapsack problem
Let's consider the decision version of the knapsack problem. Then an EF-CSP model can be the following:
V = {bottle, sandwich}.
D = {{[0,50]}, {[0,10]}}.
F = {fc, fl}.
with
fc(bottle, sandwich) = max(0, (bottle * 1 + sandwich * 1.25) - 30)
fl(bottle, sandwich) = max(0, 15,000 - (bottle * 500 + sandwich * 650))
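These error functions can be sketched directly in plain Python; a candidate is a solution exactly when both errors are 0:

```python
def fc(bottle, sandwich):
    # Capacity error: how many liters overflow the 30-liter knapsack.
    return max(0, (bottle * 1 + sandwich * 1.25) - 30)

def fl(bottle, sandwich):
    # Value error: how many yen are missing to reach 15,000.
    return max(0, 15000 - (bottle * 500 + sandwich * 650))

# (20, 8) fills exactly 30 liters and is worth 15,200 yen: both errors are 0.
print(fc(20, 8) + fl(20, 8))  # 0
# (10, 5) fits in the knapsack but is far from the value target.
print(fc(10, 5), fl(10, 5))  # 0 6750
```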