GSoC 2018 Application Arighna Chakrabarty : Improving Series Expansions - sympy/sympy GitHub Wiki

The same proposal is uploaded here in Google Docs.

Table Of Contents

About Me

Name and Contact Info

Name : Arighna Chakrabarty

University : Indian Institute Of Technology, Guwahati

Major : Electronics and Communication Engineering

G-Mail : [email protected]

Github Handle : ArighnaIITG

Blog : https://arighnaiitg.github.io

Time Zone : IST(UTC +5:30)

Age : 19

Personal Background

Hello, I am Arighna Chakrabarty, a second year undergraduate student studying at IIT Guwahati, India. I am currently majoring in Electronics and Communication Engineering, with a minor in Computer Science and Engineering. I had been introduced to the world of programming five years ago, when I started out with Java.

But I commenced coding in Python roughly two years ago, by the book "Delve Deep into Python", which turned out to be my favourite language. The lucid syntax of Python, and its usage in many related fields, was what drew me closer to Python. I have implemented many Machine Learning and Deep Learning projects in Python, the most notable being the project : Convolutional Neural Networks for Automobile Recognition. I have recently stood 2nd among 60 competing teams in India's largest AI hackathon in IIT Bombay, for implementing an AI based Email Spam Detection Filter using Tensorflow library in Python.

I am also fairly competent in Data Structures and Algorithms in C and C++. I have also worked as a Web Developer in the Students' Committee in my institute for about an year, using front-end technologies like HTML, CSS, JavaScript and Bootstrap, and some back-end technologies. I have been using Git and GitHub for about 10 months now. All my other projects are there in my GitHub handle. I am also currently trying my hands at Cython through the book “Cython: A Guide for Python Programmers” by Kurt W. Smith.

Platform Details

OS : Ubuntu 17.04

Hardware Details : i5 7700HQ, 8 GB

Code Editor : Sublime Text 3

IDE : Python - Pycharm

SymPy and its favourite feature

I was first introduced to SymPy when I was skimming through the list of previous years GSoC organizations in October 2017. When I saw that this project needed a combination of Mathematics + Python, I immediately understood, that this was what I would contribute to. I also wanted to develop my Python skills, and since I had taken several courses in Maths, and was good at it, SymPy was my first choice.

My favourite feature of SymPy would be surely be the integration module of Sympy and in that too, the functionality of Fourier and Laplace transform as I use them quite often for my academic purposes (It saves quite a lot of effort and time when I deal with complex expressions).

>>> from sympy import fourier_transform, exp
>>> from sympy.abc import x, k
>>> fourier_transform(exp(-x**2), x, k)
sqrt(pi)*exp(-pi**2*k**2)

Contributions

I started studying the code-base in November 2017, and made my first contribution in SymPy in mid-December 2017. Since then, I have created many pull requests, many of which got merged. It has been a great journey, and I intend to make it more fruitful in the coming GSoC days.

Here are the list of my merged/closed/un-merged Pull Requests, and open/closed Issues. These have been arranged in chronological order, in the order they were created.

  • (Merged) - ValueError thrown when trying to find out Inverse of zero and infinite quaternion #13700
  • (Merged) - Modified the docstring of collect_const() #13725
  • (Closed) - Tests for making AntiSymmetric Tensor canonical #13741
  • (Merged) - Computation of negative powers of a quaternion #13745
  • (Closed) - Lambdify modified to handle functions and their simultaneously #13777
  • (Unmerged) - Evaluating powers of TensorProduct #13784
  • (Merged) - Used fuzzy_bool instead of Ternary Sort #13796
  • (Merged) - Implementation of symbolic functions in solveset #13838
  • (Unmerged) - Sympy.fps modified to always return a formal power series #13854
  • (Merged) - Modified coeffs generator to remove runtime error in solve/nonlinsolve #13856
  • (Unmerged) - det matrix supports method argument #13857
  • (Merged) - Fixed cot.rewrite(sin) #13884
  • (Merged) - Expand Max and Min functions in fcode and octave #13903
  • (Closed) - Implementing integer_log #13911
  • (Unmerged) - Added test cases for syntax error in lambdify #13938
  • (Merged) - Updated the docstring in elementary.rst #13951
  • (Merged) - Added evaluate flags in miscellaneous.py #14130
  • (Merged) - Implemented Sum((-1)**n/sqrt(n), (n, 1, oo)).is_absolutely_convergent() #14158
  • (Unmerged) - Added is_Empty property in Set class #14206
  • (Merged) - Fixed a typo #14252
  • (Unmerged) - Added simplify in limit_seq #14252
  • (Unmerged) - Added Deprecation warnings in stats module (Approved) #14282
  • (Merged) - Added printing methods for zoo and negative infinity #14306
  • (Merged) - Limit comparison test introduced #14401
  • (Unmerged) - [WIP] Modified _eval_nseries of Mul #14419
  • (Merged) - Modified the TR3 function in simplify.fu #14420
  • (Merged) - Fixed triple integral #14460
  • (Merged) - Modified invert_real() to use only free symbols, not has #14462
  • (Closed) - Added new subs property in ConditionSet #14501
  • (Unmerged) - Binomial Documentation fixed #14537

The Project

Project Outline and Motivation

Series expansions, in the field of symbolic computation, is an area of major study in mathematical analysis. Series are used in most areas of symbolic mathematics, even for studying finite structures (such as in combinatorics), through generating functions. Presently, SymPy has a well established series code-base , including the Formal Power Series, the Sequence Class, and the Gruntz Limits. So, why not complete this entire field, by fixing all corner cases, implementing some special functions, and unifying all the Series Expansions under one module.

My implementation will run along the above lines. The project will contain various subprojects in itself. The general outline of my GSoC project will be the following --

  1. Asymptotic Expansions -- In 2014, work was initiated in the very crucial field of Asymptotic Expansions by Avichal Dayal, as his GSoC project. The MRV Gruntz algorithm will be implemented for the computation of asymptotic series. Just as eval_nseries, eval_aseries will be extended out to all kinds of special functions, like Li, Bessel Variants, Fibonacci, etc.

  2. Formal Power Series -- In 2015, Sartaj Singh had implemented this, as a result of which Sympy has a well-coded series/formal.py. Most of the basic operations in formal power series are implemented in formal.py. I intend to do 2 particular enactments here -:

a) Remove all the `XFAIL` tests in `formal.py` , making `solve_DE()` more robust.

b) Introduce more operations for formal power series, i.e, creating functions 
for `composition, inversion, convolution`, etc.
  1. Ring series -- Presently, SymPy has a fast implementation of ring_series called the rs_series, which has already been extended out to trigonometric functions, exp, log, etc. This is probably the most important series function, and needs to be improved and extended, for it to become the ultimate series function !! I intend to break this into 2 sub-projects--
a) Implementing the Laurent and Puiseux counterpart of Formal Series, by constructing a ring deriving generator and domain from options and input expressions. This is basically bringing FormalSeries under the radar of rs_series.

b) New functions will be added to "polys/ring_series.py", keeping in mind the points written in the introduction of the same file. Focus will also be on improvement of the existing functions in 'ring_series`.
  1. Improving limit algorithms -- Presently, SymPy has implemented limits of series along the lines of Gruntz Algorithm of most rapidly varying subexpression, in series/gruntz.py. There are various slow and XFAIL tests concerned with this, which requires the algorithm to be improved. Also, other series-related limit failures will be debugged, and fixed.
Also, under this, I would like to consider the XFAIL tests in the is_convergent() method,
in concrete/summations.py. Those shall be removed by making the method more efficient,
by adding new tests.
  1. Unifying series expansion -- This is kind of, like, the ultimate goal for the series GSoC project. According to me, SymPy need a good User-Interface for it's Series Representation and Expansion for users, as given in UD - Function expansions and series - User Interface. A entire new class structure will be defined, which will wrap all Formal Power Series, Taylor Series, Laurent Series, etc., thus unifying the series interface, into a single module.

I have always been fascinated with the power Mathematica brings to the process of mathematical study. I have a vision which focuses on the complete packaging and unification of the Series module as in Mathematica. What fascinates me the most in Series module is its simplification of a seemingly complex function.

>>> series(cos(3*acos(x)))
-3*x + 4*x**3 + O(x**6)

After taking a basic course in Series and its Representations as a part of Real and Complex Analysis, I came to understand the importance of this aspect of mathematics, and how this links many other topics together, and thus, being a contributor of SymPy for over more than 5 months, I am determined to finish all the unfinished works in this module and to seal it off as a complete well-functioning package, composed of a variety of functions and representations. How awesome would that be !!

Implementation Details

The various sub-projects under this project have already been discussed in the previous section. I will now be providing the detailed implementation of how the changes will be brought about.

1. Asymptotic Expansions

In mathematics, asymptotic expansions, or Poincare Expansions is a formal series of functions which has the property that truncating the series after a finite number of terms provides an approximation to a given function as the argument of the function tends towards a particular, often infinite, point. It's basically a non-convergent series.

One very good example is the Gamma Function --

The gamma function

    .. math::
        \Gamma(x) := \int^{\infty}_{0} t^{x-1} e^{-t} \mathrm{d}t.

Now, presently SymPy performs the following with this asymptotic function -:

>>> series((sin(1/x+exp(-x))-sin(1/x)), x, oo)
 ⎛1        
O⎜──; x  ∞⎟
  6       
 ⎝x        

But, upon implementation of an asymptotic algorithm, which can provide a value close to the desired answer for a finite number of terms, will deliver this -:

>>> series((sin(1/x+exp(-x))-sin(1/x)), x, oo)
(1  1/(2*x**2) +1/(24*x**4) + O(1/x**6)) * exp(-x)

This depicts the difference and the importance brought by the asymptotic series into the foray.

Now how will I implement it ? Most of the routines needed for the following algorithm are already present in gruntz.py, so it will be easy to implement.

a) MRV_Gruntz algorithm

In "gruntz.py", finding the most-rapidly varying term(mrv function) and rewriting expression(rewrite function) in terms of omega is already implemented. This algorithm will -:

1) Determine the set of most rapidly varying sub-expressions.
2) Choose one w from the above set and rewrite the other expression as
   as g(x) = h(x)*w**p, where mrv(h(x)) < w, and p is in R+
3) Compute Puiseux series about w = 0.
4) Coefficients are recursively expanded in terms of their most rapidly varying subexpression.
   Algorithm terminates when coefficients become constant.

In 2014, Avichal Dayal had started off this work ( PR #7529 ). But, it had many loopholes and the work was unfinished. I tend to finish off that PR, by tweaking on the changes a bit.

#Methods

An aseries function will be created in core/expr.py, which will return the asymptotic expansion for self.

aseries will be equivalent to self.series(x, oo, n)        # n - order upto which expr to be truncated

Bound parameter will be used in the function definition to give limit 
on rewriting coefficients in its normalised form.

A formal pseudocode -:

def aseries(self, x, n=6, bound=0):

     omega, exps = mrv(self, x)
     if x in omega :
          Substitute x with exp(x), then perform aseries, and again substitute with log(x).
     Return along with the order term "Order(1/x**n, (x, S.Infinity))", if any.

     f, logw = rewrite(exps, omega, x, Dummy_var)
     
     if self in omega :     # Find a canonical representative
          Find asymptotic series for the exponent, and remove all order terms.
          r = Substitute x with (1/x), find leading term , and then resubstitute.
          logw = log(1/r)
     
     # The above will actually form a base for the hierarchical representation, which will follow later.

The above pseudo-code will be easy to implement given that the core structure has already been worked upon.

  • Hierarchical representation -: Consider the following case --
>>> e = sin(x) * cos(exp(-x))
>>> e.aseries(x)
ValueError: gamma function pole

There are cases like above, when computing the asymptotic expansion using only the mrv and rewrite routines in MRV_gruntz algorithm will result in a complex expansion, and may also lead to recursion loops, leading to "Gamma Pole Errors". Hierarchical series tend to break after the first recursion. The coefficients are not expanded completely, which results in a more informative expansion.

We will declare a "hierar" parameter for the hierarchical representation.

s = Canonical series (obtained after the previous pseudocode)
if hierar:
    return s.subs(Dummy_var, exp(logw))     # Only the first recursion takes place
else :
    move on to the recursive implementation

b) Extension of eval_aseries in special functions

Quite a few special functions don't have their asymptotic as well as eval_nseries method implemented. Fresnel_integrals already has it implemented. I am thinking to extend _eval_aseries to other functions in functions/special/.

     * **erf** : `Gaussian error function
     * **Ei** : `Exponential integral`
     * **Riemann zeta function**
     * **Si, Ci** : sine and cosine integral respectively. 

In case expansion does not exist at infinity, functions need to be rewritten in terms of exponential functions. Test cases also need to be added in many functions. Many functions don't have their _eval_nseries written in a tractable way. Those can be rewritten. This part will be dealt with in a later topic( specifically about eval_nseries)

The entire **Asymptotic Expansions** module will not be hard to implement, as there is already work done in some open PRs, and there's enough literature on the net to support the rest of the implementations.

There have been many implementations of special asympt functions, most notable being the Work in Progress PR #7552, created by Avichal Dayal. There are many loopholes in the PR. I intend to port the existing code, modify it, for example, remove some redundant _eval_aseries with existing aseries functions, and insert some newer algorithms, to finish off this section. Discussions about this will be done with the mentors.

2. Formal Power Series

Presently, SymPy has a well-established series/formal.py, due to the work done in 2015 by Sartaj Singh. The FormalPowerSeries class is defined, which contains various mathematical operations under its belt. The fps function returns the formal series expansion of f around x = x0with respect to x in the form of a FormalPowerSeries object.

>>> ps=fps(sin(x),x)
>>> ps
     3     5        
    x     x      6
x - ── + ─── + O⎝x 
    6    120        

It also can be represented in an infinite series format, something which is very useful, and something which is implemented in Mathematica.

>>> ps.infinite
                                                                    
    ________                                                         
                                                                    
                            k   1                                  
                             -                                   
                            2   2  k                               
                        -1/4     ⋅x                                
            ⎪────────────────────────────────────  for Mod(k, 2) = 1
                                k   1 ⎛k   1                    
x +         ⎪RisingFactorial⎜3/2,  - ─⎟⋅⎜─ - ─⎟!                   
                                2   2 ⎝2   2                    
                                                                   
                             0                        otherwise    
                                                                   
                                                                    
                                                                    
    ‾‾‾‾‾‾‾‾                                                         
     k = 2                                       

There are various other basic mathematical operations which you can perform upon fps, such as add, subtract, multiply, differentiate, integrate, truncate, etc. But the module has large scope of improvement, and many existing functions can be rewritten in a nicer way.

I have divided this entire part into 3 subprojects. Here are their implementation details.

a) Correcting all the XFAIL tests in formal.py

The FormalPowerSeries consists of the following structure --

Class FormalPowerSeries(Series Base):

Methods :

1. simpleDE() -> Converts a function to a simple differential equation.
2. hyper_RE() -> Converts a differential equation to recurrence equation, using the hypergeometric algorithm.
3. rsolve_hypergeometric() -> Solves recurrence equation of hypergeometric type.
4. dsolve() -> Solves differential equation
4. compute_fps() -> Computes the formula for Formal Power Series of a function.

Example of an XFAIL test --

>>> fps(x**n*sin(x**2),x).truncate(8)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "sympy/series/formal.py", line 1250, in fps
    result = compute_fps(f, x, x0, dir, hyper, order, rational, full)
  File "sympy/series/formal.py", line 900, in compute_fps
    return _compute_fps(f, x, x0, dir, hyper, order, rational, full)
  File "sympy/series/formal.py", line 820, in _compute_fps
    result = hyper_algorithm(f, x, k, order)
  File "sympy/series/formal.py", line 738, in hyper_algorithm
    sol = solve_de(f, x, DE, i, g, k)
  File "sympy/series/formal.py", line 695, in solve_de
    DE = _transform_explike_DE(DE, g, x, order, syms)
  File "sympy/series/formal.py", line 621, in _transform_explike_DE
    sol = dict(zip(syms, (i for s in linsolve(eq, list(syms)) for i in s)))
  File "sympy/solvers/solveset.py", line 1465, in linsolve
    '') % eq)
ValueError: 
(n**2 + 2*n)/x**2 contains non-linear terms in the variables to be evaluated

But Mathematica gives us an answer to it,

>>> f = x**n*sin(x**2)
fps(f, x).truncate(8) == x**2*x**n - x**6*x**n/6 + O(x**(n + 8), x)

If we trace the example with a debugger, we can easily see that solve_de() causes the ValueError, since it is not yet equipped to calculate the differential equation, which has non-linear terms in itself.

But if we observe the solution carefully, we can see that the solution can be written as -

>>> x**n*fps(sin(x**2),x,0).truncate(18)
         6    10    14          
 n  2   x    x     x        18⎞⎟
x ⋅⎜x  - ── + ─── - ──── + O⎝x  ⎠⎟
        6    120   5040         

This is similar to the FPS obtained after the first example, which inadvertently means, that we can seperate the (x**n) term, compute the fps for the remaining terms, and then multiply x**n again. This may be performed in the hyper_RE algorithm or in the transformation functions, or it might be wrapped in the solve_DE() function itself.

A pseudocode would be like this --

def solve_de(f, x, DE, order, g, k):
     if isinstance(DE, Mul):
        Check for a Pow object , might specifically be x**n
        new_DE = Formed after seperating the x**n
        DE = new_DE
     
     syms = DE.free_symbols.difference({g, x})
     # Algorithm continues as [here](https://github.com/sympy/sympy/blob/master/sympy/series/formal.py#L684)

This approach will solve other XFAIL tests listed here.

There are some other XFAIL tests, which need to be registered:

  1. Unable to compute series for functions which are under the class of binomial Expansion ,like
>>> fps((1+x)**n, x)
ValueError: 
_k*(_k + 1) - a1*n + 2*a1*(_k + 1) - n**2 + n contains non-linear terms in the variables to be evaluated

A Seperate exp_bino() function can be defined which will take in the function, expand it, and compute the fps term by term, and then delete the redundant terms, printing the order alongwith.

  1. Unable to compute series for functions which have the form x**f(x), like
>>> fps(x**log(x), x, 1)
 log(x)
x             (from present fps)

But x**log(x) can be expanded about 1, as demonstrated by Mathematica.

>>>fps(x**log(x),x,1)

1+ (x-1)**2 - (x-1)**3 + 17 * (x-1)**4 - 11  * (x-1)**5 + O((x-1)**6)
                         __              __
                         12              6 

Various new operations have to be implemented in the FormalPowerSeries class base, which can remove these XFAIL tests, which brings me to the next sub-project.

b) Extension Of FormalPowerSeries by implementing newer operations -:

As I had already said before, formal.py has its basic operations implemented ( PR #9782 ). But, important extensions can be made in this arena, in the form of newer operations. Some more newer operations are listed below, along with their pseudocodes --

1. Computing the Coefficient sequence of power of FPS -:

This is in particular necessary, to obtain the coefficient sequence of equations of type f(x)^n, where f(x) is a FPS. Mathematica can do it, and this also comes handy in matters relating convergence. The entire theory is given in Wikipedia. I will try to implement it for positive "n", and later extend it to negative and fractional -> the general case.

Algorithm -->> Let a formal power series be given as w(x) = summation(a_n*x**n, (n, 0, oo)) and a0 != 0, where n is a natural number.

Then, c_0 = a_0**n
      c_m = (1/(m*a_0)) * summation((k*n - m + k) * a_k * c_(m - k), (k, 1, m))

As you can see, this is a recurrence relation of hypergeometric type, with transformation function - f'(x): b(k + m) = ((k + m + 1)/(k + 1))*R(k + 1)*b(k).

"rsolve_hypergeometric(f, x, P, Q, k, m)" solves RE of hypergeometric type.
SO, this method will be used to solve the above relation, and obtain the coefficient sequence.

API :- @property
       coefficient(fps, pow=some_value, order=some_value) , where pow is the power to which the FPS is raised.

"Sample Test"

>>> ps = fps(sin(x), x)
>>> ps.coefficient(pow=10, order=3)

[1, - 5/3 , -59 / 72]    

2. Composition -:

Given two formal power series,

f(x)=summation(a_n * x**n, (n, 0, oo))
g(x)=summation(b_n * x**n, (n, 0, oo))

one can compute the composition like -

g(f(x))= summation(b_n * f(x)**n , (n, 0, oo))) =summation(c_n * X**n ,(n,0, oo))

where the coefficients c_n are determined by "expanding out" the powers of f(x). There exists a closed formula for composition of two power series and also to find the coefficients. It's called the Faà di Bruno's formula.

There is also another algorithm given in a 1978 publication of jacm-brent-kung. The theory is given there and the pseudocode will somewhat look like this -:

def compose_series(f, g, x):
    f(x) = f_m(x) + f_r(x)      #  where f_m(x) = summation(a_n * x**n, (n, 0, m))
                                         f_r(x) = summation(a_n * x**n, (n, m+1, oo))  , m < n

    # Value of m to determined from a Lemma.(http://www.eecs.harvard.edu/~htk/publication/1978-jacm-brent-kung.pdf)

    g(f_m(x)) = compose_series(f_m, g, x) mod x**(n+1)
    l = n/m
    for counter in range(0, l):
        Store in list_1 repeated derivative terms of g(f_m(x)) upto l, each term having mod x**(n+1)
        Store in list_2 repeated derivative terms of f_r(x) upto l, each term having mod x**(n+1)
    
    result = mul(list1, list2)   # Term by term multiplication , adding up, and eliminating redundant terms
    return (1/factorial(l))*result

c) Convolution -:

Convolution, or the Cauchy Product of two formal power series, is actually, the discrete convolution of two power series. This is used mainly in case of Infinite series.

API -: def convolute(f, g, x, order=6) which convolutes two functions, and truncates upto keyword arg "order".

f(x) = summation(a_n * x**n, (n, 0, oo))
g(x) = summation(b_n * x**n, (n, 0, oo))

def convolute(f, g, x, order=3):
    c_k = summation(a_k * b_(k-l), (l, 0, k))
    result = summation(c_k * x_k, (k, 0, oo))
    return result

An example, after it will be implemented -:

>>> g(x) = fps(exp(a**x), x)
                 2         3           4          5        
                x          x          x          x      ⎛ 6⎞
1 + a*x + a**2 *── + a**3 *── + a**4 *── + a**5 *─── + O⎝x ⎠
                2          6          24         120        

>>> convolute(exp(a**x), exp(b**x), x, order =3)
                              2          2    
 1   ⎛                       x          x     ⎛ 3⎞ )
 ── *| 1 + a*x - b*x + a**2 *── - b**2 *── + O⎝x ⎠ |
 a-b ⎝                       2           2         ⎠

d) Inversion -:

The series f(x) = Summation(a_n * x**n, (n, 0, oo)) has an inverse if and only if the d only if its constant coefficient a_0 is invertible in R, the ring of the series. The theory of the inversion algorithm i also given in Wikipedia.

Let g(x) = Summation(b_n * x**n, (n, 0, oo)) be the inverse of f(x). Then the API of the function, will be somewhat like this.

@property
def invert():
    b0 = 1/a0
    b(n) = -(Summation(a(i) * b(n-i), (i, 0, oo)) / a(0))    
    return Summation(b_n * x**n, (n, 0, oo)) 

Again, b(n) will be invoked by rsolve_hypergeometric, since it is a basic recurrence relation.

Example, after it will be implemented -:

>>> fps(sin(x), x)
     3     5        
    x     x      6
x - ── + ─── + O⎝x 
    6    120        

>>> fps(sin(x), x).invert()
     3      5        
    x    3⋅x      6
x + ── + ──── + O⎝x 
    6     40 

3. Ring Series Expansion

SymPy has rs_series which makes use of the efficient representation and operations of sparse polynomials for very fast multivariate series manipulations. This project would extend it to work for all functions. Eventually the goal is to make it the default series expansion in SymPy. There are mainly two parts of rs_series I would like to focus on. Optimization of rs_series is seriously awesome. But it requires extension and restructuring. The application of the sub-projects are given below -:

a) Implementing Formal Laurent and Puiseux Series with ring implementation

The best way to represent a series class is debatable and not so straightforward. While there may be many structures that can be executed, I do have the following structure for Formal Series as a starting point.

FormalSeries     
          FormalPowerSeries   
          FormalLaurentSeries    
          FormalPuiseuxSeries

The unification regarding the above structure of Formal Power Series will be done in a later section. Here I intend to discuss about the ring implementation part.

Now that Formal Power Series has already been implemented, its time to shift towards the more general side, the Laurent and the Puiseux Series. I have seen many implementations of them, but I feel that they should be applied by constructing a ring deriving generator and domain from options and input expressions, basically using the sring method in polys/rings.py. This also drives forward the goal of SymPy, which is to make rs_series the ultimate series class in SymPy. And not to forget the fact, that the computation of complex operations involving formal series will improve if you initialize variables as polynomial rings.

The code I am about to put below has been taken from Shivam Vats' GSoC 2015 application, which has not been implemented yet. I intend to complete the structure and port the code into another file polys/ring_formal_series.py, and, in future unify it in the main module.

class FormalSeries(object):
    def __init__(cls, data):
        pass

class FormalPowerSeries(FormalSeries):
    def __init__(self, data):
        if isinstance(data, PolyElement):
            self.series = data
            self.ring = data.ring
        else:
             self.ring, self.series = sring(data, domain=QQ)

    def __repr__(self):
        from sympy.printing import sstr
        return sstr(self.series, order=None)

    def __str__(self):
        from sympy.printing import sstr
        return sstr(self.series, order=None)

    def __add__(self, other):
        return FormalPowerSeries(self.series + other.series)

    def __mul__(self, other):
        if(self.ring.ngens == 1 and other.ring.ngens == 1):
            x = ((self.ring).gens)[0]
            prec = min((self.series).degree(), (other.series).degree()) + 1
            return FormalPowerSeries(rs_mul(self.series, other.series, x, prec))

    def __pow__(self, n):
        if(self.ring.ngens == 1):
            x = ((self.ring).gens)[0]
            prec = (self.series).degree() + 1
            return FormalPowerSeries(rs_pow(self.series, n, x, prec))

    def __div__(self, other):
        if(self.ring.ngens == 1 and other.ring.ngens == 1):
            x = ((self.ring).gens)[0]
            return self * other**(-1)

class FormalLaurentSeries(FormalSeries):
    def __init__(self, data):          #Converts a laurent series into a power series by multiplying the 
        x = (data.atoms(Symbol)).pop() #  whole series by the lowest negative exponent.
        lead = data.extract_leading_order(x)
        if lead < 0:
            self.ring, self.series = sring(data * (x**(-lead)), domain=QQ)
        else FormalPowerSeries.__init__(data)

We can then be able to obtain Formal Power Series, by constructing a polynomial ring, which leads to faster and optimized results.

R, x = Ring("x", QQ)
f = FormalPowerSeries(x**2 + 2*x + 1, prec=3)

The above implementation will also be extended to support Puiseux series. I am currently reading about it, and will come up with a pseudocode soon.

b) Improving rs_series and addition of special functions to ring_series.py

The goal, eventually, for SymPy is to make rs_series the default series expansion. For that, we need to extend it to all functions, and we also need to work towards its betterment.

Some areas where rs_series needs to be improved -:

1. Extension of rs_log for expressions without any constant terms -:

Currently, rs_log, can only perform this kind of functionality -:

>>> from sympy import *
>>> from sympy.polys.domains import QQ
>>> from sympy.polys.rings import ring
>>> from sympy.polys.ring_series import rs_log
>>> R, x = ring('x', QQ)
>>> rs_log(1 + x, x, 8)
1/7*x**7 - 1/6*x**6 + 1/5*x**5 - 1/4*x**4 + 1/3*x**3 - 1/2*x**2 + x
>>> rs_log(x + x**2, x, 8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sympy/polys/ring_series.py", line 1042, in rs_log
    raise NotImplementedError
NotImplementedError

This issue can be solved by following the below sub-routine -:

def rs_log(p, x, prec):
   # Some code here
   c = _get_constant_term(p, x)
   if c:
      # The code for computing the series with constant terms.

   else:
      # Find out the factors of the expression. In the above case,
      # log(x + x^2) = log(x) + log(1 + x) .
      # Define log(x) expansion separately, using another method, and then add them up.

2. rs_series_inversion can be extended to support cases with constant term in series. 3. Hadamard exponentiation can be extended for non-rational domains. 4. Univariate series expansion of the nth root can be extended to support cases with constant term in series.

These are some issues listed above, which need to be solved, and will bring about the improvement of the existing ring_series functions.

Then comes the topic of extending special functions to ring_series.py. As given in the introduction of ring_series.py, here are some points to keep in mind while implementing a new function -:

The implementation should work on all possible input domains/rings.
Special cases include the ``EX`` ring and a constant term in the series
to be expanded. There can be two types of constant terms in the series:

     -- A constant value or symbol.
     -- A term of a multivariate series not involving the generator, with
        respect to which the series is to expanded.

More is described here about how to add new functions, and extend it to work with rs_series.

I am thinking of extending special functions, under functions/special, to work with rs_series. Some of them are -:

1. Error functions - erf
2. Bessel functions - BesselJ(n, x), and BesselY(n, x)
3. All the remaining hyperbolic and inverse hyperbolic functions
4. Fibonacci polynomial.

These can be implemented easily, since there is enough of literature supporting these algorithms. Discussions woth mentors will be done prior to this.

4. Improvement in limit algorithms

After all the major implementation part, comes the portion of fixing issues, and thus improving the codebase. At present, SymPy has a good representation of limits in sequences in series/limits.py. The function limit(e, z, z0, dir="+") computes the limit of e(z) at the point z0. SymPy tries some heuristics for easy and frequent cases like "x", "1/x", "x**2" and similar, so that it's fast. For all other cases, the Gruntz algorithm is used.

a) Implementation of Shackell's algorithm -:

Gruntz use series for limits calculation, and Series algorithm use limits to calculate terms. Gruntz can solve only fixed number of functions. But the current Gruntz algorithm can content itself with rougher rewritings, and does make heavy recursive calls. This may become hard to understand, and becomes less optimizable.

There exists an alternate algorithm, called the Shackell's Algorithm. It basically iterates the use of nested forms, and says that -:

To simplify, given an exp-log expression, this algorithm rewrites it into its nested form,
an almost normal form on which the asymptotic behaviour reads off easily.
The difficult operation when doing this is addition because of possible cancellations.
When faced with a sum, the algorithm rst computes a list of the growth orders in the
expressions, then orders it, and by substituting the largest ones by zero, tries to determine which one provides
the right asymptotic scale, then it rewrites the expression in terms of this one and a recursive application
of this gives the nested form.

This algorithm will make all slow tests under test_gruntz.py faster.

The entire algorithm is given in Nested forms section in Computing Limits in a symbolic manipulation world – Dominik Gruntz. There has been an improvement in the algorithm in the form of finding out the estimate function, which adopts the nested form algorithm, using the Z-Functions.

Class <Shackell(Series base)> -:

API of the methods would be -:

  1. construct_expr_tree(f, x, x0) -:

Creates an expression tree out of the function given to us, and has a method which does internal leaf-traversing. Returns the rewritten exponential or logarithmic form "exp_log_form".

Example given here in Section 6.1

  1. compute_estimate_form(exp_log_form, x, x0) -:

Rewrites the estimate form in terms of z-functions. Some predetermined z-functions will be defined, like zexp(x), zinv(x), zlog(x), etc. Calculates the total estimate form of the function.

  1. compute_rordering(est, x, x0) -:

Compared with the previous standard z-functions, and found out the comparable terms. Applied the R-Ordering Lemma, and found out the Case the function belongs to. Returns "case", and the terms.

  1. def shackell(f, x, x0, dir='+') -:

Compute the limit of f(x) at the point x0 using the Shackell algorithm, by selecting the case of the r-ordering and then performing the concerned operation. Returns the "Limit" value.

Example -:

>>> func = exp(1/x +exp(-x)) + exp(1/x)
>>> shackell(func, x, oo)
1

# Takes much lesser time than Gruntz algorithm.

There are many kinds of implementations of this particular algorithm, but this worked out to be my choice. Other possible executions can be discussed with my mentor.

b) Improving existing limit algorithms -:

Here comes the part of fixing issues. Many of the functions fail in certain situations. Present algorithm can be extended further to work for more functions. Some examples and issues to be solved here -:

1. Limit doesn't work with hyper (Issue 10802) - This issue basically states that -:

>>> limit(4*x*hyper([S(1)/2],[S(3)/2, S(3)/2],-x**2), x, oo)
AttributeError: 'TupleArg' object has no attribute 'compute_leading_term'

while,

>>> limit(hyperexpand(4*x*hyper([S(1)/2],[S(3)/2, S(3)/2],-x**2)), x, oo)
π

According to Aaron's comment on the issue, _eval_expand_tractable should be implemented in hyper. We can make limits work on hyper. We can implement _eval_expand_tractable to return hyperexpand(self), something like -:

if e.has(hyper):
    e.expand(tractable=True)

2. Limit(x**n, x, -oo) is sometimes wrong Issue 8634 -: The inconsistency of the result, with changing values of n is the problem, as described in the issue above.

Kalevi's comment on the issue describes the issue -:

I`m afraid this is not going to be easy to fix. It seems that limit invokes gruntz
even if that would not be necessary in this case. That will routinely replace the 
variable x with an expression of type -exp(y) (as x is expected to be negative). 
Then x**n will take the form exp(y*n + I*pi*n), which is correct. But afterwards, 
in passing to the limit y -> oo, the I*pi*n term is lost and the result will be exp(oo) = oo.

It seems to me that the limit should be computed without passing over to gruntz, 
but that would imply a major rearrangement of limit.

This major rearrangement of limits is what I intend to implement. The functions whose limits can be calculated are limited by the functions whose series can be calculated as the algorithm (gruntz) uses series. The case of x**n preceding the expression can be processed somewhat like this -:

trunc = False

class Limit(Expr) -:
     ....     #Existing code
     
     def doit(self, **hints):
          ..
          def truncate(self, z):
               # if Expr contains a Pow instance of `z**n`, truncate the expression, perform operations, 
               # leaving behind the term `z**n`. Also make `trunc` = True.

          if trunc:
               r = heuristics(e, z, z0, dir)   # Try a heuristics approach
          else:
               try:
                   r = gruntz(e, z, z0, dir)
                   if r is S.NaN:
                       raise PoleError()
               except (PoleError, ValueError):
                   r = heuristics(e, z, z0, dir)
                   if r is None:
                       return self
          
          return r

Truncating the expression is needed when x**n is present in multiple terms of the expression, as in Issue 8635, or in the case of an expansion, when truncate can directly be called.

3. Limit of fraction with oscillating term in the numerator -:

>>> limit((n+cos(n))/n,n,oo)
>>>  n + cos(n)
lim ──────────
n─→∞    n     

As described above, Gruntz cannot handle oscillating stuff due to the properties of the function fields. If we could find a way to express the fact that cos(x) is bounded oscillatory as x -> oo, gruntz could probably be adapted to compute limits like this. This is based on so called transseries, described in On the computation of limsups. The algorithm can be discussed with the mentors, and implemented effectively.

4. Correcting XFAIL tests in is_convergent() and test_kauers -:

There are a lot of XFAIL tests in concrete/test_sums_products.py, which are failing. The is_convergent() method should be made more robust. I have already implemented two new tests in the is_convergent() method, the Ratio Test PR #14158, and the Limit Comparison Test PR #14401, which helped solve previous failing examples. Likewise more tests have to be added to make is_convergent more robust.

Like, this example -:

>>> Sum(sin(n)/n, (n, 1, oo)).is_convergent()
NotImplementedError: The algorithm to find the Sum convergence of sin(n)/n**3 is not yet implemented

# whereas should be True

This suffers from an oscillatory problem, which has to be introduced under Dirichlet's tests.

5. Unifying series expansion

I think that Series needs a good user_interface for it's series package. Users should get a condensed information about series. So I plan to implement a new type of Series. After improvement and implementation, the same FormalPowerSeries Algorithm[1] can be used under this type of series. A new class can be constructed to work with all types of series and it's operations.

Class Unified_Series ---> 
        series(FormalPowerSeries, TaylorSeries, LaurentSeries, Expansion(truncated series with order term O())).

In this Way we will get unified version of series.

Basic class structure: <--un_series-->

  • Points Regarding to implement it-

    • I will be using the older FormalPowerSeries file for this.I am going to create a new class ,called the un_series(), where I will handle the properties of different types of series. This will give the user more idea about the corresponding series.

    • Methods, which are given below just require some checks while computing series. And It will be easy to implement all of them. This new Series will be more compatible for the user.

API -: un_series(x, x0, type="formal")

where, `type` can be -:
    * formal, for `FormalPowerSeries`
    * taylor, for `TaylorSeries`
    * laurent, for `LaurentSeries`
    * claurent, for `ClassicalLaurentSeries`

For an example,

>>> s = sin(x).un_series(x, 0, type="formal")
>>> s
  x + x**3/6 + x**5/120 + ...
  or it can interpreted as "summation(a(n)*x**n, (n, 0, oo))"

>>> s.coefficients(n=5)
[1, 0, 1/6, 0, 1/120]

Sometimes it is more convenient to obtain asymptotic expansion for some expression or function, bringing the order term back into the result:

>>> s = sin(x).un_series(x, 0, type="formal")
>>> s.as_expansion(x,6)
x + x**3/6 + x**5/120 + O(x**6)

Operations on un_series

All the existing operations of the FormalPowerSeries will be included, the only difference being that while performing operation, un_series() should track about the property of function (what is the type of series).

un_series() should also have the property of Internal Representation. One very important difference in series, which is a classic example of difference between Internal Representation, and Screen Representation is -:

While both PowerSeries and TaylorSeries are the same on screen,
PowerSeries track coefficients near (x^k) terms, but TaylorSeries 
track coefficients near (x^k/{n!) ones.

Some new properties are to be added to the existing ones, whose implementations are listed below -:

>>> s = sin(x).un_series(x, 0, type="formal")
>>> s.coefficients(n=5)
[1, 0, 1/6, 0, 1/120]

>>> s = sin(x).un_series(x, 0, Kind="taylor")
>>> s
x + x**3/6 + x**5/120 + ...(same as it for formal)
>>> s.coefficients(n=5)
[1, 0, 1, 0, 1]

# The above difference is due to the internal representation theory, given above

>>> s = sin(x).un_series(x, 0, type="formal")
>>> s.expression
sin(x)
>>> s.variable
x

>>> s.is_periodical()
False

>>> s.is_finite()
False

>>> s.to_taylor().coefficients(n=8)
[0, 1, 0, -1, 0, 1, 0, -1]

>>> s.to_taylor().is_periodical()
True

>>> s.to_taylor().period()
4

>>> s.to_taylor().sequence
Sequence(periodical=True, [0, 1, 0, -1])

Now comes the application of the Laurent Series part. (Presence of both negative and positive powers).

>>> s = (sin(x)/x**4).series(x)
>>> s
x**(-3) - 1/(6*x) + x/120 - x**3/5040 + x**5/362880 + ...
>>> type(s)
<class 'sympy.series.FormalLaurentSeries'>

>>> s.principalPart()
x**(-3) - 1/(6*x)

>>> (sin(x)/x**4).series(x, type="taylor")
SeriesError: Cannot expand series.

Now comes the application of the Classical Laurent Series part. (Presence of only negative powers).

>>> s = (exp(1/x)).un_series(x, oo)
>>> s
1 + 1/x +  1/(2*x**2) + 1/(6*x**3) + 1/(24*x**4) + 1/(120*x**5) + ...(if power of x is ClassicalLaurent , we can detect while handling it)
>>> type(s)
<class 'sympy.un_series.ClassicalLaurentSeries'>
>>> s.x0 | s.point
oo
>>> s.principalPart().is_finite
False
>>> s.positivePart().is_finite
True

All these types of series can be distinguished internally, so the user will receive a good interface for the unified representation of series. All the properties discussed in the above examples will be implemented, corresponding to the series they belong. More can be implemented, by discussion with the mentors.

Proposed Timeline

Given below is the timeline, pointing the exact time-frame I will require to complete each subproject.

Community Bonding Period (23 April - 13 May)

I will use this period to study the rs_series module, understanding all its functions, and their ring implementations. I will be discussing all the subprojects with my mentors, so that all the ideas and their approaches are crystal-clear. I will try to be as familiar as possible with the SymPy series code-base. My exams will be going on from 1st May upto 5th May, but I will be able to finish off the pre-requisites needed to start the coding period.

Coding Period

Week 1 (14 May - 20 May)

  • In this week, I would be focusing on implementing the MRV_Gruntz algorithm. PR #7529 has an unmerged version of the algorithm, which I will be modifying along the lines of the pseudocode I have provided in the implementation details.
  • I will implement the hierarchical representation in the MRV_Gruntz algorithm, by adding suitable parameters and keyword arguments.
  • A Pull Request will be submitted at the end of week 1, ensuring the implementation of asymptotic expansions using this algorithm.

Week 2 (21 May - 27 May)

  • Test cases, and documentation will be added for MRV_Gruntz algorithm in a separate appropriate file.
  • Focus will then be shifted towards the extension of eval_aseries for special functions in functions/special. It will ensure the extension of asymptotic series for all functions. Test cases will be added.
  • PR #7552 has some unfinished work in this area. I will port the code from the above-mentioned PR, and make a PR on this sub-project by the end of this week.

Week 3 (28 May - 3 June)

  • Next sub-project under the plan, FormalPowerSeries will now be commenced. I will start by correcting all the XFAIL tests currently listed above under this section.
  • Discuss with mentors regarding greater optimization of existing fps methods.
  • Adding new tests intest_formal.py through a PR, which will be submitted by the end of this week.

Week 4 & 5 (4 June - 17 June)

  • Implement newer operations in formal.py, as stated in the plan.
  • Majority of focus in Week 4 will be given upon the application of computing the coefficient sequence of power of FPS``, and composition. Methods coefficientandcompose_series` will be suitably implemented, and their corresponding test cases also will be added, along with the documentation..
  • Week 5 will be used for implementing convolution and inversion operations in formal.py .Corresponding methods ,APIs and test cases will be implemented. This will vouch for the completion of extension of the existing formal.py.
  • A PR containing the revised changes will be submitted by Week 5.

** PHASE-1 Evaluation -:**

As per the above schedule, I will have complemented my proposed implementations in Asymptotic expansions and Formal Power Series,and also corrected the XFAIL tests, wherever proposed. I will also take care of the required documentation in each PR I submit, which leads to a better understanding of the code changes I am bringing in.

Week 6 & 7 (18 June - 1 July)

This, according to me, will be the most challenging part of the project, because ring_series is a difficult nut to crack. So, I will devote more time to this particular sub-project.

  • In Week 6, I will be implementing Formal Power Series along the lines of ring field. I will port the code, based on the pseudocode I have given in the plan, into a new file polys/ring_formal_series.py. This will drive the project towards SymPy's series vision : Make rs_series the ultimate series module.
  • Test cases will be added, and all operations in FormalPowerSeries class in formal.py will be ported, under sring implementation, into the above file.
  • In Week 7, I will be extending my above approach to Formal Laurent and Puiseux series.
  • All relevant test and documentation will be added, and a PR will be sent.

Week 8 (2 July - 8 July)

  • In this week, I will add special functions to ring_series.py, to make rs_series method more efficient. Best methods to implement, will be discussed by me with the mentors.
  • Test cases, and relevant documentation will be added.
  • A PR containing the implementation of special functions will be issued.

PHASE-2 Evaluation -:

As stated above, project work related the ring-series will be wrapped up by the end of Phase-2. I will try to ensure that the basic implementations in the PRs are present, and that all related test cases are included.

Week 9 & 10 (9 July - 22 July)

  • The major part of Week 9 & 10 will focus upon the implementation of Shackell's algorithm, in a separate file series/shackell.py. This will optimize the already present Gruntz Algorithm used for computing limits, and @slow tests will become faster to compute.
  • All methods described in the plan will be implemented, and test cases will also be added.
  • Week 10 will focus on completing the test cases, and loose corners of the above algorithm, and then move on to focus upon Fixing existing major limit issues. This will hugely improve the present limit algorithms.
  • is_convergent() will be made more robust, by fixing the XFAIL tests.
  • A PR will be submitted after Week 9, containing Shackell's algorithm, and another one, addressing limit issues will be submitted by Week 10.

Week 11 (23 July - 29 July)

  • The last subproject Unifying series expansions will be implemented by this week.
  • A basic class structure un_series() will be created, which will contain various methods, linking to the variousd series methods.
  • All operations will be implemented in this class, as described in the details of the subproject.
  • All the tests and documentation will be added.
  • A PR containing the above changes will be submitted.

Week 12 (30 July - 5 August)

  • The last week will be reserved for completing any unfinished PR, or documentation issues.
  • A final PR would be sent, issuing the final coding changes, bringing the coding period to an end.
  • The final report, describing my entire GSoC work, will be written and posted by me, on my blog. I will also create a Wiki page, describing what needs to be done in the future in the series module.

Week 13 (6 August - 13 August)

  • Final submission of the project, and Final evaluation.

Post GSoC

After working with SymPy over 8 months, I would be very eager to continue my affiliation with SymPy, and continue to contribute in Series and limits, and also to other issues. I would keep working for the betterment of this library, and possibly become a mentor, or participate in some conferences to engage myself in the picturesque world of SymPy and open-source.

Notes

I have got no other commitments for the summer, and thus would be able to devote 45-50 hours full time per week. My college will restart on 27th July, but due to the light coursework at the start of the semester, I would be able to devote the same amount of energy and time, and thus finish the project. Also, due to familiarity with SymPy for about 5 months, and also having no prior commitments in summer, I can devote extra time to my project, possibly commence in the community bonding period if all prior discussions with the mentors are completed, or if I am running behind schedule. If I miss a particular alloted slot in some week, I will make sure to inform my mentor in advance, and would also surely make up for the time, by devoting extra time the following weeks.

References

[1] - On the computation of limsups - by Joris van Der Hoeven

[2] - A New Algorithm Computing for Asymptotic Series - by Dominik Gruntz

[3] - On computing limits in a symbolic manipulation world - by Dominik Gruntz and Wolfram Koepf

[4] - Fast algorithms for manipulating formal power series - by RP Brent

[5] - Mathematica - Wolfram.com

[6] - Algorithms for computing asymptotic forms - by John R. Shackell

[7] - Mailing list discussion on my proposal

[8] - Wikipedia, on various topics of my sub-projects.

[9] - Previous years' proposals listed on this page

[10] - Shivam Vats' GSoC 2015 Application

[11] - Pull requests by Avichal Dayal

⚠️ **GitHub.com Fallback** ⚠️