GSoC 2019 Solvers Progress - sympy/sympy GitHub Wiki

GSoC 2019 Progress - Solvers : Extending solveset

Student: Jogi Miglani
Mentors: Shekhar Rajak, Amit Kumar, Yathartha Joshi
Github: jmig5776
Blog Link : jmig5776.github.io
PR(s) : https://github.com/sympy/sympy/pulls/jmig5776

Progress Tracking Table

Week #	Goals	Blog Post	Status/PR Links (if any)
Community Bonding	lambert, bivariate	First meeting	--
1	proof of correctness and pull request	Week 1	# 16890
2	Continuing with lambert	Week 2	# 16890, # 16976
3	Lambert, creating `power_list`	Week 3	# 16890, # 16976
4	creating `power_list`,Discussing `_solve_modular`	Week 4	# 16890, # 16976
5	Documentation improvement in Lambert	Week 5	# 16890, # 16976
6	Removal of `power_list`	Week 6	# 16890, # 16976
7	Completed and merged Lambert Solver	Week 7	# 16890, # 16976
8	Work on _solve_modular	Week 8	# 16976
9	Work on both _solve_modular and ImageSet Union	Week 9	# 16976
10	Work on both _solve_modular and ImageSet Union	Week 10	# 16976, # 17079
11	Completed _solve_modular and work on ImageSet Union	Week 11	# 16976, # 17079
12	Final report of GSoC 2019	Week 12	final report link

Meeting Schedule

Meeting #01 (Community bonding period)

Date: Saturday 18th May, 2019
Time: 11:30 AM - 12:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Plans for community bonding period

I will be planning on basic details of how to implement this project. My end semester examinations are over already so I will try to contribute as much as possible.
Frequency of meetings

Another thing that was discussed was the frequency of meetings. So Amit, Yathartha and I agreed upon having meetings on weekly basis, probably Saturday/Sunday. Also, Yathartha suggested on having a gitter channel for discussions on Solvers project and discuss the doubts there if any.
Plan for completing lambert solver

How lambert solver will be completed and what will be its flow was discussed. Amit and Yathartha suggested on having a thorough understanding of bivariate and its workflow. Advised to figure out why all of the solutions are not returned and find solution so that all of the solutions should be returned. Also to write tests for lambert so that to plan the implementation.

Goals

Understanding the flow of bivariate
Figuring out the problem for returning all solutions
Writing tests for lambert

Pull Requests to review (if any):

None yet

Meeting #02 (Week 1)

Date: Sunday 2nd June, 2019
Time: 11:30 AM - 12:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Discussion on code of # 16890

Amit requested changes at the pull request to describe the problem in detail and describe the plan more briefly. And more discussion was there on various test cases.

Proof Of Correctness

As my solution to the problem was based on some heuristics so it is necessary to give proof of correctness.

Goals

Proof of Correctness and changes in pull request
Writing tests for modular equations

Pull Requests to review (if any):

# 16890

Meeting #03 (Week 2)

Date: Saturday 8th June, 2019
Time: 12:00 - 01:00 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar, Harsh Gupta

Minutes of Meeting:

Changing the plan of # 16890

Previously implemented approach was not very accurate. So Amit and Yathartha requested to change the plan and solve the problem of lambert by considering the even degrees which results in more than one equation.

Harsh also provided his opinion on whatever we were going to implement that everything should be well documented and strategised.

Goals

Implementing new plan
Proof of correctness of new plan

Pull Requests to review (if any):

Meeting #04 (Week 3)

Date: Sunday 16th June, 2019
Time: 12:00 - 01:00 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar, Shekhar Prasad Rajak

Minutes of Meeting:

Discussing the code of # 16890

In this meeting mainly it was discussed to improve the code quality and make routine of independent working code.

Code Quality Every programmer should master this skill to write a good quality code. Everything from naming, documentation to make routines includes in this skill.

And it comes that a new feature in polytools is required named power_list to return powers of symbols present in an expression. An approach I will be impelementing this week.

Goals

Implementing power_list
Code quality

Pull Requests to review (if any):

Meeting #05 (Week 4)

Date: Sunday 22th June, 2019
Time: 12:30 - 01:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi

Minutes of Meeting:

Discussing previous week's progress

In the previous week many things happened very suddenly. In the starting of the week it was decided that we had to create a function power_list which could return powers. So I thought to create this type of function as a new feature in this PR #17043. But as @jksoum and @smichr advised that this function can act as helper function in bivariate itself so it was decided to make it as helper function there. And we discussed the code for handling modular equations also.

Next week goals

Implementing and get plan for defining imageset union

As per the timeline we decided that Imageset Union should be defined first rather than code for handling nested trigonometric equation solver. So according to timeline this week goes for discussion and getting plan for implementing imageset union. A PR #17079 is also started based on fundamental approach to solve the problem.

Getting merge Lambert Solver

Most of the work had been done only a little improvement is to be done in power_list function. Hope it gets merged by this week.

Pull Requests to review (if any):

Meeting #06 (Week 5)

Date: Saturday 29th June, 2019
Time: 11:30 - 12:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Discussing previous week's progress

Mentors suggested me to work upon the code improvements and documentation improvement. And make it more readable to user. Although somehow @smichr had some doubts on the logic that we were implementing. Although a lot of progress has been there. So I decided to create and discussion for thinking new logic for implementing Lambert all solutions and work on the current PR as goes on.

Next week goals

Improving existing PR for Lambert The main improvement left is of Documentation and code readability.
Improving _solve_modular PR also.
If time left then find plan for Imageset Union.

Pull Requests to review (if any):

Meeting #07 (Week 6)

Date: Sunday 7th July, 2019
Time: 1:45 - 2:45 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Discussing previous week's progress

In this meeting both the mentors were convinced by the code for Lambert’s. And few modifications in documentation and code clean up were suggested by them. In this week the whole idea of power_list was droppped because @smichr suggested code for replacing the symbol more targetted as we wanted by which the whole code was improved. And it was decided to work upon on _solve_modular mainly now onwards.

Next week goals

Getting merge existing PR for Lambert
Work upon _solve_modular PR
If time left then find plan for Imageset Union.

Pull Requests to review (if any):

Meeting #08 (Week 7)

Date: Monday 15th July, 2019
Time: 6:00 - 7:00 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi

Minutes of Meeting:

Merged Lambert solver PR

Explaining the function `_solve_lambert` (main function to solve lambert equations)

Input - f, symbol, gens
OutPut - Solution of f = 0 if its lambert type expression else NotImplementedError

This function separates out cases as below based on the main function present in the main equation.

For the first ones:
1a1) B**B = R != 0 (when 0, there is only a solution if the base is 0,
                   but if it is, the exp is 0 and 0**0=1
                   comes back as B*log(B) = log(R)
1a2) B*(a + b*log(B))**p = R or with monomial expanded or with whole
                            thing expanded comes back unchanged
   log(B) + p*log(a + b*log(B)) = log(R)
   lhs is Mul:
       expand log of both sides to give:
       log(B) + log(log(B)) = log(log(R))
1b) d*log(a*B + b) + c*B = R
   lhs is Add:
       isolate c*B and expand log of both sides:
       log(c) + log(B) = log(R - d*log(a*B + b))

If the equation are of type 1a1, 1a2 and 1b then the mainlog of the equation is taken into concern as the deciding factor lies in the main logarithmic term of equation.

For the next two,
   collect on main exp
   2a) (b*B + c)*exp(d*B + g) = R
       lhs is mul:
           log to give
           log(b*B + c) + d*B = log(R) - g
   2b) -b*B + g*exp(d*B + h) = R
       lhs is add:
           add b*B
           log and rearrange
           log(R + b*B) - d*B = log(g) + h

If the equation are of type 2a and 2b then the mainexp of the equation is taken into concern as the deciding factor lies in the main exponential term of equation.

3) d*p**(a*B + b) + c*B = R
   collect on main pow
   log(R - c*B) - a*B*log(p) = log(d) + b*log(p)

If the equation are of type 3 then the mainpow of the equation is taken into concern as the deciding factor lies in the main power term of equation.

Eventually from all of the three cases the equation is meant to be converted to this form:-

f(x, a..f) = a*log(b*X + c) + d*X - f = 0 which has the
solution,  X = -c/b + (a/d)*W(d/(a*b)*exp(c*d/a/b)*exp(f/a)).

And the solution calculation process is done by _lambert function.

Everything seems flawless?? You might be thinking no modification is required. Lets see what loopholes are there in it.

What does PR #16890 do?

There are basically two flaws present with the this approach.

Not considering all branches of equation while taking log both sides.
Calculation of roots should consider all roots in case having rational power.

1. Not considering all branches of equation while taking log both sides.

Let us consider this equation to be solved by _solve_lambert function.

-1/x**2 + exp(x/2)/2 = 0

So what the old _solve_lambert do is to convert this equation to following.

2*log(x) + x/2 = 0

and calculates its roots from _lambert. But it missed this branch of equation while taking log on main equation.

2*log(-x) + x/2 = 0

Yeah you can reproduce the original equation from this equation.So basically the problem was that it missed the branches of equation while taking log. And when does the main equation have more than one branch?? The terms having even powers of variable x leads to two different branches of equation.

So how it is solved? What I has done is that before actually gets into solving I preprocess the main equation and if it has more than one branches of equation while converting taking log then I consider all the equations generated from them.(with the help of _solve_even_degree_expr)

How I preprocess the equation? So what I do is I replace all the even powers of x present with even powers of t(dummy variable).

Code for targeted replacement
lhs = lhs.replace(
            lambda i:  # find symbol**even
                i.is_Pow and i.base == symbol and i.exp.is_even,
            lambda i:  # replace t**even
                t**i.exp)
Example:-
Main equation -> -1/x**2 + exp(x/2)/2 = 0
After replacement -> -1/t**2 + exp(x/2)/2 = 0

Now I take logarithms on both sides and simplify it.

After simplifying -> 2*log(t) + x/2 = 0

Now I call function _solve_even_degree_expr to replace the t with +/-x to generate two equations.

Replacing t with +/-x
1. 2*log(x) + x/2 = 0
2. 2*log(-x) + x/2 = 0

And consider the solutions of both of the equations to return all lambert real solutions of -1/x**2 + exp(x/2)/2 = 0.

Hope you could understand the logic behind this work.

2. Calculation of roots should consider all roots in case having rational power.

This flaw is in the calculation of roots in function _lambert. Earlier the function_lambert has the working like :-

Find all the values of a, b, c, d, e in the required loagrithmic equation
Then it defines a solution of the form

-c/b + (a/d)*l where l = LambertW(d/(a*b)*exp(c*d/a/b)*exp(-f/a), k)

and then it included that solution. I agree everything seems flawless here. but try to see the step where we are defining l.

Let us suppose a hypothetical algorithm just like algorithm used in _lambert in which equation to be solved is

x**3 - 1 = 0

and in which we define solution of the form

x = exp(I*2*pi/n) where n is the power of x in equation

so the algorithm will give solution

x = exp(I*2*pi/3) # but expected was [1, exp(I*2*pi/3), exp(-I*2*pi/3)]

which can be found by finding all solutions of

x**n - exp(2*I*pi) = 0

by a different correct algorithm. Thats y it was wrong. The above algorithm would have given correct values for x - 1 = 0.

And the question in your mind may arise that why only exp() because the possiblity of having more than one roots is in exp(), because if the algorithm would have been like x = a, where a is some real constant then there is not any possiblity of further roots rather than solution like x = a**(1/n). And its been done in code like this:

code
num, den = ((c*d-b*f)/a/b).as_numer_denom()
p, den = den.as_coeff_Mul()
e = exp(num/den)
t = Dummy('t')
args = [d/(a*b)*t for t in roots(t**p - e, t).keys()]

Next week goals

Work upon _solve_modular PR
If time left then find plan for Imageset Union.

Pull Requests to review (if any):

# 16976

Meeting #09 (Week 8)

Date: Saturday 20th July, 2019
Time: 12:30 - 1:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Defining goals for time left of GSoC

In this meeting the goals for the time left for GSoC were decided as follows:-

Lambert:- Completed and merged.
Solve modular:- What I experienced with lambert's PR that got stretched so much which I was not expecting. Algorithms made from heuristics takes so much time to be ready for merging so I really don't know how much time it will take. But It is sure that it will be completed before the final evaluation.
ImageSet Union:- This task will be taken after the solve modular. This is a very complex task and will need proper guidance and algorithm. I had searched for some research papers, but what I found out was not that what we want. Before GSoC final evaluation this task will be started to implement, but I am not really sure if it would get merged before final evaluation.

Next week goals

Work upon _solve_modular PR
If time left then find plan for Imageset Union.

Pull Requests to review (if any):

# 16976

Meeting #10 (Week 9)

Date: Saturday 27th July, 2019
Time: 1:30 - 2:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Discussing previous week’s progress

Progress of _solve_modular:- In PR #16976 I implemented the basic design of _solve_modular. Some very good suggestion by Yathartha for creating _invert_modular were very helpful. Now basically the _solve_modular first do preprocessing and invert it and then find solution and in final domain intersection takes place.
Progress of ImageSet Union:- In PR #17079 I will be implementing an algorithm based on heurestics for performing ImageSet Union this week. Basic code is being already written for defining a function and only algorithm part is left which will be performed in this week.

Next week goals

Improve Code of both the PR's i.e _solve_modular and Imageset Union

Pull Requests to review (if any):

# 16976

Meeting #11 (Week 10)

Date: Sunday 4th August, 2019
Time: 1:00 - 2:00 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi

Minutes of Meeting:

Discussing previous week's progress

Progress of _solve_modular:- In PR #16976 After discussing with Yathartha, I decided to change the basic model of the _solve_modular such that I should be able to target equations more efficiently and also the rest of the types of equation should return ConditionSet. Cases like Mod(a**x, m) - rhs = 0 are special type and will be handled differently with the helper functions of ntheory module.
Progress of ImageSet Union:- In PR #17079 This PR is currently been left for review.

Next week goals

Work upon _solve_modular

In the following week I will be changing the domain of solving equations to Integers only.

Pull Requests to review (if any):

Meeting #12 (Week 11)

Date: Sunday 11th August, 2019
Time: 11:30 - 12:30 PM (IST)
Attendees: Jogi Miglani, Yathartha Joshi, Amit Kumar

Minutes of Meeting:

Here is all the brief description about new solver _solve_modular for solving modular equations.

What type of equations to be considered and what domain?

A - Mod(B, C) = 0

    A -> This can or cannot be a function specifically(Linear, nth degree single
         Pow, a**f_x and Add and Mul) of symbol.(But currently its not a
        function of x)
    B -> This is surely a function of symbol.
    C -> It is an integer.
And domain should be a subset of S.Integers.

Filtering out equations

A check is being applied named _is_modular which verifies that only above mentioned type equation should return True.

Working of `_solve_modular`

In the starting of it there is a check if domain is a subset of Integers.

domain.is_subset(S.Integers)

Only domain of integers and it subset are being considered while solving these equations. Now after this it separates out a modterm and the rest term on either sides by this code.

modterm = list(f.atoms(Mod))[0]
rhs = -(S.One)*(f.subs(modterm, S.Zero))
if f.as_coefficients_dict()[modterm].is_negative:
    # f.as_coefficient(modterm) was returning None don't know why
    # checks if coefficient of modterm is negative in main equation.
    rhs *= -(S.One)

Now the equation is being inverted with the helper routine _invert_modular like this.

n = Dummy('n', integer=True)
f_x, g_n = _invert_modular(modterm, rhs, n, symbol)

I am defining n in _solve_modular because _invert_modular contains recursive calls to itself so if define the n there then it was going to have many instances which of no use. Thats y I am defining it in _solve_modular.

Now after the equation is inverted now solution finding takes place.

if f_x is modterm and g_n is rhs:
        return unsolved_result

First of all if _invert_modular fails to invert then a ConditionSet is being returned.

    if f_x is symbol:
        if domain is not S.Integers:
            return domain.intersect(g_n)
        return g_n

And if _invert_modular is fully able to invert the equation then only domain intersection needs to takes place. _invert_modular inverts the equation considering S.Integers as its default domain.

    if isinstance(g_n, ImageSet):
        lamda_expr = g_n.lamda.expr
        lamda_vars = g_n.lamda.variables
        base_set = g_n.base_set
        sol_set = _solveset(f_x - lamda_expr, symbol, S.Integers)
        if isinstance(sol_set, FiniteSet):
            tmp_sol = EmptySet()
            for sol in sol_set:
                tmp_sol += ImageSet(Lambda(lamda_vars, sol), base_set)
            sol_set = tmp_sol
        return domain.intersect(sol_set)

In this case when g_n is an ImageSet of n and f_x is not symbol so the equation is being solved by calling _solveset (this will not lead to recursion because equation to be entered is free from Mod) and then the domain intersection takes place.

What does `_invert_modular` do?

This function helps to convert the equation A - Mod(B, C) = 0 to a form (f_x, g_n). First of all it checks the possible instances of invertible cases if not then it returns the equation as it is.

a, m = modterm.args
if not isinstance(a, (Dummy, Symbol, Add, Mul, Pow)):
        return modterm, rhs

Now here is the check for complex arguments and returns the equation as it is if somewhere it finds I.

if rhs.is_real is False or any(term.is_real is False \
            for term in list(_term_factors(a))):
        # Check for complex arguments
        return modterm, rhs

Now after this we check of emptyset as a solution by checking range of both sides of equation. As modterm can have values between [0, m - 1] and if rhs is out of this range then emptySet is being returned.

if (abs(rhs) - abs(m)).is_positive or (abs(rhs) - abs(m)) is S.Zero:
        # if rhs has value greater than value of m.
        return symbol, EmptySet()

Now the equation haveing these types are being returned as the following

if a is symbol:
        return symbol, ImageSet(Lambda(n, m*n + rhs), S.Integers)

    if a.is_Add:
        # g + h = a
        g, h = a.as_independent(symbol)
        if g is not S.Zero:
            return _invert_modular(Mod(h, m), (rhs - Mod(g, m)) % m, n, symbol)

    if a.is_Mul:
        # g*h = a
        g, h = a.as_independent(symbol)
        if g is not S.One:
            return _invert_modular(Mod(h, m), (rhs*invert(g, m)) % m, n, symbol)

The more peculiar case is of a.is_Pow which is handled as following.

if a.is_Pow:
        # base**expo = a
        base, expo = a.args
        if expo.has(symbol) and not base.has(symbol):
            # remainder -> solution independent of n of equation.
            # m, rhs are made coprime by dividing igcd(m, rhs)
            try:
                remainder = discrete_log(m / igcd(m, rhs), rhs, a.base)
            except ValueError: # log does not exist
                return modterm, rhs
            # period -> coefficient of n in the solution and also referred as
            # the least period of expo in which it is repeats itself.
            # (a**(totient(m)) - 1) divides m. Here is link of theoram:
            # (https://en.wikipedia.org/wiki/Euler's_theorem)
            period = totient(m)
            for p in divisors(period):
                # there might a lesser period exist than totient(m).
                if pow(a.base, p, m / igcd(m, a.base)) == 1:
                    period = p
                    break
            return expo, ImageSet(Lambda(n, period*n + remainder), S.Naturals0)
        elif base.has(symbol) and not expo.has(symbol):
            remainder_list = nthroot_mod(rhs, expo, m, all_roots=True)
            if remainder_list is None:
                return symbol, EmptySet()
            g_n = EmptySet()
            for rem in remainder_list:
                g_n += ImageSet(Lambda(n, m*n + rem), S.Integers)
            return base, g_n

Two cases are being created based of a.is_Pow

x**a
a**x

x**a - It is being handled by the helper function nthroot_mod which returns required solution. I am not going into very mch detail for more information you can read the documentation of nthroot_mod.

a**x - For this totient is being used in the picture whose meaning can be find on this Wikipedia page. And then its divisors are being checked to find the least period of solutions.

Pull Requests to review (if any):

Meeting #12 (Week 12)

Final report of GSoC 2019 of Solvers: Extending solveset is here.

GSoC 2019 Solvers Progress - sympy/sympy GitHub Wiki

GSoC 2019 Progress - Solvers : Extending solveset

Progress Tracking Table

Meeting Schedule

Meeting #01 (Community bonding period)

Minutes of Meeting:

Goals

Pull Requests to review (if any):

Meeting #02 (Week 1)

Minutes of Meeting:

Goals

Pull Requests to review (if any):

Meeting #03 (Week 2)

Minutes of Meeting:

Goals

Pull Requests to review (if any):

Meeting #04 (Week 3)

Minutes of Meeting:

Goals

Pull Requests to review (if any):

Meeting #05 (Week 4)

Minutes of Meeting:

Next week goals

Pull Requests to review (if any):

Meeting #06 (Week 5)

Minutes of Meeting:

Next week goals

Pull Requests to review (if any):

Meeting #07 (Week 6)

Minutes of Meeting:

Next week goals

Pull Requests to review (if any):

Meeting #08 (Week 7)

Minutes of Meeting:

Explaining the function _solve_lambert (main function to solve lambert equations)

What does PR #16890 do?

1. Not considering all branches of equation while taking log both sides.

2. Calculation of roots should consider all roots in case having rational power.

Next week goals

Pull Requests to review (if any):

Meeting #09 (Week 8)

Minutes of Meeting:

Next week goals

Pull Requests to review (if any):

Meeting #10 (Week 9)

Minutes of Meeting:

Next week goals

Pull Requests to review (if any):

Meeting #11 (Week 10)

Minutes of Meeting:

Pull Requests to review (if any):

Meeting #12 (Week 11)

Minutes of Meeting:

What type of equations to be considered and what domain?

Filtering out equations

Working of _solve_modular

What does _invert_modular do?

Pull Requests to review (if any):

Meeting #12 (Week 12)

Explaining the function `_solve_lambert` (main function to solve lambert equations)

Working of `_solve_modular`

What does `_invert_modular` do?