GSoC 2015 Application Luv Agarwal: Cylindrical Algebraic Decomposition - sympy/sympy GitHub Wiki

About Me

Personal Background

Hello, I am Luv Agarwal a sophomore at IIIT - Hyderabad, India pursuing a degree in B-Tech in Computer Science with MS by research in Computational and Natural Sciences. Some of the courses that I have taken are set theory, real algebraic geometry, probability theory. As this project demands the knowledge of abstract algebra, I have made myself quite familiar with it's basic concepts (like groups, fields, rings, extensions) and others that are used in this project.

Programming Experience

I work on ubuntu 14.04 and use VIM and Sublime interchangeably as my text editor but nowadays I am more VIM oriented after I accidentally got introduced to some of it's cool features. I have done coding in a low level language ARM and various high level languages like C, C++, PHP, Python and javascript. I am also familiar with web frameworks like web2py and Django. I have been using git (with bitbucket) for around 1.5 years, github for around a year and quite familiar with their workflow. I also use mercurial while contributing to Mozilla.

Here are the few projects which depicts my interest in programming:

  1. A clone of FruitNinja for web in here

  2. I, with my team, have worked on building a web Analytics system for a startup in Django with MySQL.

  3. Other small projects

I use C, C++ while doing Competitive Programming. For development purposes I use Python as it easily allows me to convert my ideas into code. Some of the features of python that I like are list comprehension, argument unpacking, multiple assignment as these help avoid to use of unnecessary variables and keep the code clean.

Username and Contact Information

Name : Luv Agarwal

University : International institute of Information Technology - Hyderabad

Email : [email protected]

Github Username : luviiit

Time Zone : IST (UTC+05:30)

Blog : http://luvagarwal.blogspot.in/

Contributions to SymPy

  1. changed all the occurrences of Eq and Ne appearing as == and != respectively
  2. added an example of imageset intersection in docs

The Project

Motivations and Overview

Some terms and variables used later:

  • CAD - Cylindrical Algebraic Decomposition
  • Projection | Projection operator - A component of CAD algorithm.
  • CAD tree - the tree data-structure used by CAD algorithm
  • cell - a node of CAD tree
  • level of a cell- height of a cell in CAD tree
  • F - input formula. For ex. ForAll((x), And(x**2 - y > 0, x*y < 1))
  • F_qf - quantifier free standard formula of F. For above example its value is And(x**2 - y > 0, x*y - 1 < 0)
  • A - list of polynomials occurring in F. For above example its value is [x**2 - y, x*y - 1]
  • B - coarsest square free basis of A
  • k - level
  • truth value of a cell - truth value of F_qf on this cell
  • candidate cell - cell whose truth value is not determined

Solving real multivariable polynomial systems is a crucial feature for a Computer Algebra System in a long term. SymPy has great solving capabilities but currently it doesn't have support for handling such systems

There are several algorithms that are used to deal with such systems.

CAD algorithm is the best algorithm known till date to deal with real polynomial systems in general. In time, there are different modifications done in CAD in order to improve it's performance to solve different problems or workout some special cases. Some of these enhancements are as follows:

  • Hong's improved Projection operator (This operator is complete and is used in place of the original projection operator introduced by Collin)
  • George E. Collins and Hoon Hong's Partial Cylindrical Algebraic Decomposition for Quantifier Elimination.
  • Adam W. Strzebonski Cylindrical Algebraic Decomposition using validated numerics - used to reduce the algebraic number computations

How others tackle?

As Wolfram states "As The primary method used by the Wolfram Language for solving real polynomial systems and real quantifier elimination is the CAD algorithm. There are, however, simpler methods applicable in special cases."

In Wolfram language CAD implementation uses the Collins-Hong algorithm (Partial CAD as mentioned above) with Brown-McCallum projection operator for a special condition - if set of input polynomials is a well-oriented set and Hong projection operator in general. While constructing CAD, we need to deal with plenty of algebraic number computations (computationally intensive) which can sometimes be omitted by using Strzebonski's genealogy-based method using validated numerics backed up by exact algebraic number.

PROJECT PLAN

The project comprise of implementation of following things:

  1. Cylindrical Algebraic Decomposition with Hong's improved projection operator. (reference[1] and [3])
  2. Hong's partial CAD derived from complete CAD^ to perform quantifier elimination. (reference [2])
  3. A platform to solve real polynomial inequalities using CAD.

ALGORITHMS

Given n polynomials in r variables, the objective of CAD algorithm is to divide complete r dimensional space in cells such that each of the n polynomials is invariant on each cell. In other words each input polynomial has the same sign over complete cell and therefore each cell can be denoted by a sample point (and a definition) which can be used to calculate the sign of each polynomial over every point in that cell.

Now I will explain the algorithms in some more detail in following way:

1. CAD in two parts:

1.1 a general CAD algorithm assuming all algebraic numbers are rational numbers so that I can represent all the numbers exactly in numerical form and don't have to deal with algebraic number manipulations which will help focus more on understanding the principles of the algorithm.

1.2 workout for removing the assumption in 1.1

2. partial CAD using complete CAD^ to perform quantifier elimination.

3. Inequalities solving

1.1 General CAD algorithm with assumption

Input: a set of polynomials in r variables.
Output: r-dimensional cells each represented by a sample point and a definition

1. compute the coarsest squarefree basis of input polynomials using basis

2. Apply projection operator iteratively r-1 times so that we end up with projection polynomials (proj(A), proj^2(A), ... proj^(r-1)(A)) where proj^(k+1)(A) = proj(proj^k(A)).

3. Start from proj^(r-1)(A).
    Let the roots of these polynomials be a[1], a[2], ..., a[n]
    
    Set cells = [(b[1], ), (a[1], ), (b[2], ), (a[2], ), ..., (a[n], ), (b[n+1], )]
    where b[i] is any rational number between a[i-1] and a[i] for i in {2..n}
    b[1] is any rational number less than a[1]
    b[n+1] is any rational number greater than a[n]
    Now each cell is a one-dimensional point.

4. define each cell using define_base.

5. Extension phase:

    level = 1
    for p in reverse(projection_polynomials):
        level += 1
        A = p
        cells_new = []

        # cells = [(b[1][1], .., b[1][level]), (b[2][1], .., b[2][level]), ..., (b[m][1], .., b[m][level])]
        for cell in cells:
            A = A(cell) # substitute the cell into A
                        # Now A is polynomial in variable x[level]
            A = basis(A)
            r = isolate_roots(A)
            [cells_new.append(cell+[root]) for root in r]
        cells = cells_new
        define all these new cells using define
    
    Now we have the cells and their definitions.

1.2 workout for removing assumption: (lines having leading '+' are added in above algorithm)

1. compute the coarsest squarefree basis of input polynomials using basis

2. Apply projection operator iteratively r-1 times so that we end up with projection polynomials (proj(A), proj^2(A), ... proj^(r-1)(A)) where proj^(k+1)(A) = proj(proj^k(A)).

3. Start from proj^(r-1)(A).
    Let the roots of these polynomials be a[1], a[2], ..., a[n]

+   if not root.is_rational: represent it with polynomial over `Q` and isolating interval.
    Set cells = [(b[1], ), (a[1], ), (b[2], ), (a[2], ), ..., (a[n], ), (b[n+1], )]
    where b[i] is any rational number between a[i-1] and a[i] for i in {2..n}
    b[1] is any rational number less than a[1]
    b[n+1] is any rational number greater than a[n]
    Now each cell is a one-dimensional point.

4. define each cell using define_base.

5. Extension phase:

    level = 1
    for p in reverse(projection_polynomials):
        level += 1
        A = p
        cells_new = []

        # cells = [(b[1][1], .., b[1][level]), (b[2][1], .., b[2][level]), ..., (b[m][1], .., b[m][level])]
        for cell in cells:
+           # cell = (b<sub>i,1</sub>, .., b<sub>i,level</sub>)
+           α, B[1], B[2], ..., B[level] = simple(cell)
+           # B's are the polynomials over Q
+           # α is an algebraic number represented by an polynomial over Q and an isolating interval such that B[j](α) = b[i][j]
            A = A(cell) # substitute the cell into A
                        # Now A is polynomial in variable x[level]
+           # Now A is a polynomial over Q(α)
            A = basis(A)
            r = isolate_roots(A, extension=α)
            [cells_new.append(cell+[root]) for root in r]
        cells = cells_new
        define all these new cells using define
    
    Now we have the cells and their definitions.
  1. Partial CAD to perform quantifier elimination:

    a) After the construction of children of the cell try to determine the truth value of F_qf on as many children as possible Function: evaltv

    b) Try to determine the truth value of cell if a) is successful. If it is determined then try to determine the truth value of as many ancestors as possible dropping the subtree of the cell whose truth value is determined. Function: prptv

    b) Instead of simply running bfs over the CAD tree (as we did in extension phase of above algorithm) we will choose cells such that it will result in dropping of more subtrees. For example we can choose cells based on their level, type of cell (sector or section), et al. Function: choose

FUNCTIONS: (from references [1], [2, [3], [4], [6])

1. proj

p1

p2

proj2_new

  1. Collin's original projection operator is the union of PROJ1 and PROJ2

  2. Hong's improved projection operator is the union of PROJ1 and PROJ2*

As both Collin's and Hong's projection operators are complete but the Hong's operator is more efficient so I will implement Hong's operator.

2. simple

Input: α and β both represented in form of a polynomial and an isolating interval
Output: γ, fnew, gnew where γ represented in form of polynomial and an isolating interval; fnew and gnew are polynomials over Q such that fnew(γ) = α and gnew(γ) = β.

>>> simple(f, g, f_interval, g_interval)
>>> (h, h_interval, fnew, gnew)

Algorithm: (from reference [4])

1. r = resultant(f.subs({x: x-t*y}), g.subs({x: y}), y)
2. calculate least t (say t1) such that gcd(r, r.diff(x)) is of degree = 0.
   Set h = r(x, t1)
3. by repeated bisection of f_interval and g_interval construct h_interval such that
h_interval = f_interval + t1*g_interval and h_interval is an isolating interval for h representing γ.
4. b = gcd(f(γ - t1*x), g(x), extension=γ) # γ is the root represented by h and h_interval
5. gnew = -(trailing coefficient of b)
   fnew = x - t1*gnew
3. normal

Input: a polynomial over Q(α) and an isolating interval representing a root β
Output: a polynomial over Q and an isolating interval representing β

>>> normalize_roots(f, f_interval)
(fnew, fnew_interval)

Algorithm: NORMAL in Reference [4]

4. isolate_roots

Input: list of squarefree polynomials over Q(α)
Output: list of Interval's containing the roots and a mapping which maps these intervals to the corresponding polynomials

>>> isolate_roots([x-sqrt(2), x+1], extension=sqrt(2))
[((-3/2, -1/2), (1, 3/2)), (1, 0)]
>>> # _[1] is the mapping of _[0] in the input polynomials

Algorithm:

using `dup_isolate_real_roots_sqf` (currently raises DomainError for AlgebraicField) isolate the roots of the given squarefree polynomials to the accuracy that no interval overlaps.

For making dup_isolate_real_roots_sqf work for algebraic field following features (and possibly others) need to be added in AlgebraicField:

  1. comparison between algebraic numbers: current status is as follows:
>>> a = AlgebraicField(QQ, sqrt(5))
>>> a.convert(sqrt(5)+1) > a.convert(2)
False

The output to the above example should have been True.

To implement comparison between algebraic numbers, thanks to Kalevi, here is a very nice description of the idea.

  1. a.log: current status
>>> a  = AlgebraicField(QQ, sqrt(2))
>>> a.log(2, 2)
NotImplementedError:
5. count_roots

Input: a polynomial over Q(α) and an interval
Output: number of roots of the polynomial in the given interval

Already implemented as count_roots

todo: make count_roots work for polynomials over algebraic numbers.

Current status:

>>> a = Poly(x**2-sqrt(2), extension=sqrt(2))
>>> a.count_roots(0, 10)
0
6. define_base

Input: root cell, Projr-1(A)

Output: Represent children of root cell using polynomial inequalities in free variables

Algorithm:

   if no of free variables == 0:
       return ()
   if len(cells.childrens) == 1: # single cell representing (-∞, ∞)
       return (-∞ < x1 < ∞)
   calculate the sign of each polynomial over each children of root cell using `signj`
   def = [None]*(len(rootcell.children))
   for cell in rootcell.children:
       index = rootcell.index(cell)
       if index%2 != 0: # section
           # let the poly representing this cell be B
           # (a/b, c/d) is it's sample point
           def[index] = (And(B = 0, b*x1 - a > 0, d*x1 - c < 0))
       else: # sector
           # let the poly representing the cell just before and after this cell be B1, B2
           # (a/b, c/d) is it's sample point
           def[index] = (And(B1 > 0, B2 < 0, b*x1 - a > 0, d*x1 - c <0))
   return def
7. define

Input: cell at level k, substituted projection polys at level k+1 i.e. Projr-(k+1)(A)(cell) Output: Represent children of input cell using polynomial inequalities in free variables

Algorithm: Reference [1]

8. choose

Input: D (CAD tree)
Output: return a cell c

Algorithm: As I mentioned earlier there can be various choices on the basis of which we choose cells. For different type of problems different strategies takes different amount of time. May be we can implement choose in multiple ways and associate a name with each way which will be passed in cylindrically_decompose as argument choose_type with a default value set to the best possible choose SymPy can detect for that particular problem.

9. signg

Input: basis set of projection polys (leading coeff of each poly = 1), cell
Output: a 2-d array of of size len(cell.children) by input polynomials in which index i by j gives the sign of jth polynomial over ith cell

cells = cell.children Algorithm: As the odd indexed cells contain all the roots of input basis set, so by iterating on the following rules we can get sign of each polynomial over each cell:

1. sign of every poly on the last cell will be +ve as each poly (as leading coeff of each poly = 1)
1. sign of a poly is zero only on the cell which is represented by this poly. (let's call this cell as c<sub>i</sub>)
2. sign of this poly on c<sub>i-1</sub> will be opposite it's sign on c<sub>i+1</sub>
10. signj

Input: projection polynomials of kth level, cell at level k-1
Output: sign of each projection polynomial over each cell represented in a form of a 2-d array

cells = cell.children basis set, degree of occurrences = basis(projection polynomials) Algorithm: it uses signg to compute the signs of basis set and then manipulate the degree of occurrences along with the calculated signs to compute the sign of each projection polynomial. For complete algo Refer algorithm signj of [2]

11. evaltv

Input: A , F_qf, r, k, Projr - k(A), cell at level k-1
Output: This algorithm tries to computes the truth value of F_qf over each cell (children).

This algorithm uses the following result (from Theorem 2.7 in reference [2]) to determine the truth value of F_qf:
A(k) ⊆ Projr - k(A) where k is the level of children cells, A(k) is the set of polynomials from A which are build only from variables in {r1, r2, ..., rk}
So what it does is it computes the sign of Projr - k(A) using signj and then using the above result tries to calculate the truth value of F_qf.

To make it more clear I will take an arbitrary example and simultaneously explain the algorithm as follows:

Let F_qf be Or(And(x*y + y > 0, x**2 - 3*x + 2 > 0), x + y**3 < 0, y + y**2*z < 0) and variable order as [x, y, z] then
A = [x*y + y, x**2 - 3*x + 2, x + y**3, y + y**2*z]
r = 3

Now as per the definition of A(k):
A(1) = [x**2 - 3*x + 2]
A(2) = [x*y + y, x**2 - 3*x + 2, x + y**3]
A(3) = [x*y + y, x**2 - 3*x + 2, x + y**3, y + y**2*z]

Let input k = 2

From the above result we know that A(2) ⊆ Projr - k(A).

Now we determine the sign of each polynomial in Projr - k(A) over all children using signj. So by above result we have sign of each polynomial in A(2) i.e. [x*y + y, x**2 - 3*x + 2, x + y**3] over each children

Finally:

for cell in children:
    # try to determine the truth value of F\_qf by substituting the sign of known Polynomials
    F_qf -> Or(And(known, known), known, y + y\*\*2\*z < 0)
12. prptv

Input: D (partial CAD tree), a cell c of level k, k, f (number of free variables), quantifier (if any) on variable xk+1
Output: determine the truth value of cell c and as many ancestors as possible and drop the subtree of the cell whose truth value has been determined.

Algorithm:

1. c' = c; k = level of c
2. if k < 0:
       return
3. if k >= f:
       if quantifier == Exists and at least a single child of c' has truth value == True:
           c'.truthvalue = True
       elif quantifier == ForAll and at least a single child  of c' has truth value == False:
           c'.truthvalue = False
       else:
           return
   else: # below free variable space
       if truth value of every child of c' is True:
           c'.truthvalue = True
       elif truth value of every child of c' is False:
           c'.truthvalue = False
       else:
           return
   # drop the subtree of c'
   c'.children = None
   k -= 1
   c' = c'.parent
   # recursively call the same function
   prptv(D, c', k, f, ...)

API

Cylindrical Algebraic Decomposition: cylindrically_decompose

>>> cylindrically_decompose(x**2 + y**2 < 1, (x, y), choose_type="default")
And(-1 < x, x < 1, -sqrt(-x**2 + 1) < y, y < sqrt(x**2 - 1))

>>> cylindrically_decompose(x**2 + y**2 > 1, (x, y), choose_type="default")
Or(And(Or(y < -sqrt(-x**2 + 1), y > sqrt(-x**2 + 1)), x <= 1, x >= -1), x < -1, x > 1)

Quantifier Elimination: resolve

>>> from sympy.logic.FOL import Exists
>>> resolve(Exists((x), x**2 + y**2 < 1), (x, y))
∃x × (-1, 1)

Inequalities solving: solve

>>> solve(x**2 + y**2 < 1, (x, y))
(0, 1) × (-sqrt(-x**2 + 1), sqrt(-x**2 + 1))

Timeline (tentative)

community bonding period and week 1

  • implement proj, isolate_roots, basis, count_roots. Submit a PR.

week 2

  • implement simple, normal

week 3

  • implement a cell class
  • add test cases for above functions
  • design and implement choose with an efficient structure for arrangement of the cells so that it will be able to return the best possible cell without comparing all cells everytime it is called.

week 4 - week 5

  • implement define_base, define

week 6 - week 7

  • use above implementations in function: cylindrically_decompose, add test cases, bug fixes. Submit a PR with working cylindrically_decompose.

week 8 - week 9

  • implement signg, signj, evaltv, prptv and few other small functions

week 10 - week 11

  • use these above implemeted functions to implement
    • resolve
    • solve Submit a PR.

week 12 - week 13

  • add tests, bug fixes, complete documentation with working of the algorithm

My college exams are ending at April 30 and I have no major plans in the gsoc period. I will start coding right after that (40-45 hours per week) except a few days in which I will travel from college to home and vice versa.

If I am able to complete the specified work before time then I will start work on using validated numerics (reference [5]).

I will blog weekly about my progress at http://luvagarwal.blogspot.in/

How do I fit in?

I have been reading and thinking about CAD and it's implementation for about 1.5 months. Though it initially took me time to pick up it's concepts as I was not familiar with abstract algebra and the papers were difficult to read but now I hold a clear picture of it's structural elements, different enhancements and challenges in my mind. For now I may have missed or misthought some of it's implementation in SymPy but I can cover those things before and while gsoc period and I feel confident about the successful completion of this project. Also I have gone through all the references in references section except ref [5] which builds a strong base for the completion of the project.

Other Applications:

  1. Obtain complex solutions of inequalities - Replace the complex variables with a pair of real variables and express each poly in terms of complex and real part.

References

[1]: Quantifier Elimination for Real Closed Fields by Cylindrical Algebraic Decomposition by George E. Collins
[2]: Partial Cylindrical Algebraic Decomposition by George E. Collins and Hoon Hong
[3]: An Improvement of the Projection Operator in Cylindrical Algebraic Decomposition by Hoon Hong
[4]: Loos, R. G. K., and Collins G. E. Resultant Algorithms for Exact Arithmetic on Algebraic Numbers
[5]: Cylindrical Algebraic Decomposition using validated numerics by Adam W. Strzebonski
[6]: Implement comparison of algebraic numbers [7]: Cylindrical Algebraic Decomposition I: The Basic Algorithm by Dennis S. Arnon, George E. Collins, Scot McCallum [8]: Companion to the Tutorial Cylindrical Algebraic Decomposition Presented at ISSAC 2004

⚠️ **GitHub.com Fallback** ⚠️