GSoC 2015 Report Shivam Vats: Fast Series Expansion - sympy/sympy GitHub Wiki

Brief Information

Project : Fast series expansion using sparse representation.
Proposal : wiki page
Blog : blog
Email : [email protected]

I am Shivam Vats, currently a 3rd year undergraduate student of Mathematics and Computing at IIT Kharagpur.

Introduction

I was introduced to Sympy by Harsh Gupta who is himself a former GSoC student and senior to me in my department. I wanted to learn Python and he suggested Sympy as a great place to learn it by doing useful work. And indeed, it is one of the most beginner friendly projects out there. I started from scratch and was guided in by initial days by Sergey. Later on, I accidentally stumbled upon SymEngine and started contributing to it too.

Application Phase

I had experimented with quite a few modules of Sympy and most of the topics I was interested in (Plotting, polynomials, etc) were either too advanced for me or were not being actively developed. On the other hand, SymEngine was quite new and a lot of things needed to be done. Moreover, I like coding in C++ and Series expansion seemed to be an important functionality.

I had extensive discussions with Ondrej, during which he told me about the things he had tried and what seemed to work best. The very first decision we had to take was how to represent a series. Sympy does so as expressions, which is quite slow. Moreover, he also pointed out that a truncated series is effectively a polynomial. So, we finally decided to use polynomial operations, which meant we could reuse the polynomial module. Further, sparse representation was chosen as it gave decent execution timings over a large range of series sizes.

SymEngine lacked a Polynomial module that the time, which meant I had to write the parts required for my project. I was thus, expecting most of my work to be in SymEngine. Though I planned to implement fast expansion in Sympy too, as a proof of concept, I expected to finish it in the first month.

Coding

Fortunately for me, Sumith's proposal for Polynomials got accepted as well. This meant I could focus on series expansion. Sympy at the time, already had a limited expansion functionality implemented using Rings. It had been done by Mario, who had written optimised algorithms for the same. However, a larger porting of his work on series expansion had not been merged and lay in a long forgotten pull request. I started by porting all those functions in ring_series.py. This proved to be quite a daunting task as I was not aware of all the algorithms used and in many cases new functions had to be implemented to get equivalent behaviour.

One important point about Mario's code was that it was written for power series, meaning all the exponents in the series were positive integers. We wanted out algos to support the most general series, meaning a puiseux series( fractional and negative powers). This called for rewriting all the functions. The possibility of fractional powers meant we had to change the way we calculated the order of the series. Some of the algorithms are semi-numerical in nature, which make use of the fact that the series is a power series. This was done in #9495 with major help from Mario who also pointed our the flaws in my method.

The way we create a new sparse polynomial is with something like ring('x', QQ), which basically creates a polynomial object in the variable x with rational coefficients. We can of, course choose any number of variables and different types of coefficients. The next step was to have series involving symbolic terms- things like sin(x + a), when you expand wrt say, x. This was partly done in #9575. Polynomials, do in fact have an EX domain, which allows symbolic coefficients, but at the cost of significant slow down. Since, a major goal of my project was to improve execution speed, this was not acceptable.

What I then did was simple. I used the definition of a variable. For a ring, a variable is something that was specified at the time of creation and a coefficient is anything multiplied with or added to any variable. So, instead of creating a new coefficient domain, I added all the required symbolic coefficients as variables to the ring. For example, for sin(x + a), this means adding sin(a) and cos(a) as variable to the ring. Problem solved?

Not yet! It turned out that the restriction of a polynomial to positive powers was hard coded in polynomials which made operations involving different series very difficult. Eventually, I had to remove the restrictions, compromising mathematical accuracy in exchange for practical usability. So, now I had the elementary operations ready. But they weren't too useful for most users. We needed the equivalent of Sympy's existing series function that could expand an arbitrary expression. This was done in #9775 and was definitely the most challenging part of my project.

I had proposed in my proposal to write classes for different types of series and I did start writing them in #9614. But I eventually realised that they weren't really needed. If the rs_series could be smart enough to know which ring_series function to call and with the right arguments, I was done. The way the functions are evaluated, different series have different rings associated with them. Thus, while operating on two or more different series, they need to be converted appropriately. A lot of similar conversions and checks are done in rs_series to make sure that the answer is correct and uses the minimum possible operations.

I kept running into unforeseen difficulties during the course of the project and the timeline in my proposal became irrelevant after a point. That is partly also because I started developing ring_series as a potential replacement of the present series method, instead of implementing it as a proof of concept only. As for SymEngine I could not do any work related to series expansion as the polynomial module couldn't be completed. So, I mostly helped out on a few issues not related to my project.

Future Plans

Ring series seems very promising with excellent speed ups. I intend to continue developing it. The first target is the make rs_series work with all the ring_series functions. There are a number of bugs and I am sure many more are to be caught. I am also working on detailed documentation so that interested people can help. The important thing is to get people to start using it.

There is a lot of work in porting the code to SymEngine.

Conclusion

I had a great time working with Sympy during the summer. The whole GSoC experience is full of life lessons, of which, coding is a minor part. Community building is definitely the most important aspect of open source development; because at the end of the day all of us want to make new friends :)