GSoC 2022 : Improving Series Expansions and Limit Computations Report by Anutosh Bhat - sympy/sympy GitHub Wiki
This report summarizes my work during GSoC 2022, on the project Improving Series Expansions and Limit Computations with SymPy. The links to the PRs are in chronological order.
About Me
I am Anutosh Bhat, a fourth-year undergraduate student pursuing a major in Data Science from the Indian Institute of Technology (IIT) Madras. I am open source enthusiast always looking forward to contribute to project involving any scientific/mathematical computation.
Project Synopsis
The primary goal of the project could be divided into 4 - 5 parts.
- Refactoring/Implementing leading term & series method for functions involving branch cuts/ branch points such as
pow
,log
,inverse trig
andinverse hyperbolic
. - Adding leading term and series support to elementary and special functions mainly revolving around
bessel
,gamma
anderror
type functions. - Fixing limit errors through gruntz algorithm and improving performance of the algorithm overall by refactoring blocks in some functions like
mrv_leadterm
. - Addressing miscellaneous bugs and implementation ideas related to the series module.
- Refactoring written code for order.py and extending functionality of the order term. (Was proposed but the goal couldn't be achieved due to time constraints and hence this section has been shifted to Future Work,)
Pull Requests
Merged
- #23492: Refactored leading term method for
log
Function: This Pr was essentially the foundation for the Phase 1 of my project. As the limit calculations for functions involving logarithms weren't performing well at points lying on branch cuts, the_eval_as_leading_term
helper method for the log function needed to be refactored/re-written. This would eventually be helping us with addressing series and limits of other functions like hyperbolic and inverse hyperbolics that can be expressed using the log function. My mentors and I engineered a 5 step algorithm that was able to tackle all cases from basics to the complex ones. The basis of the algorithm is that, in leading term methods the parameterx
is expected to tend to0
along the half-rayx = cdir*t
wheret
isreal
andpositive
.Hence we can define a positive dummy variablet
and substitutecdir*t
forx
in the argument oflog: z = arg0.subs(x, cdir*t)
. The algorithm is further explained in the Pr description. Apart form this, some notable things/issues which were fixed are -
- Fixed buggy leading term method for
Ei
,Chi
andbessely
functions. - Implemented branch cut handling for leading term methods of
inverse trigonometric functions
. - #21227 : Series expansions for nested logarithms involving the
logx
parameter were fixed
def test_issue_21227():
f = log(-log(x))
assert f.nseries(x, logx=y) == log(-y)
assert f.nseries(x, logx=-x) == log(x)
f = log(log(x))
assert f.nseries(x, logx=y) == log(y)
assert f.nseries(x, logx=-x) == log(-x)
assert f.nseries(x, logx=x) == log(x)
f = log(log(log(1/x)))
assert f.nseries(x, logx=y) == log(log(-y))
assert f.nseries(x, logx=-y) == log(log(y))
assert f.nseries(x, logx=x) == log(log(-x))
assert f.nseries(x, logx=-x) == log(log(x))
Currently no new issues for limit calculations of functions involving logarithms have been reported after the new algorithm was introduced . If any issues will be discovered , those would be looked into. Overall 5 issues were closed through this pull request.
- #23715: Implemented/Fixed a few series methods for
inverse trig/hyperbolic
functions: This PR fixes a long-standing issue (#3663) from 2008 where SymPy isn't able to calculate correct series expansions for inverse hyperbolic functions. Hence it wasn't unknown that SymPy doesn't perform well while calculating limits and series expansions forinverse trig/hyperbolic functions
at points lying on branch cuts and otherwise. To tackle these changes, thetaylor_term
methods for all 12 inverse trig/hyperbolic functions was looked into where the buggy ones were fixed and in some cases a new method was introduced. After this, the_eval_as_leading_term
methods for all 6 inverse hyperbolic functions were re-written and specific code blocks in_eval_as_leading_term
methods for all 6 inverse trig functions were refactored. Once these methods were correctly framed, heavy testing was added catering to points from all over the axis (branch points, points lying on branch cuts and general points). We also made sure to test leading terms being calculated from different directions of approach (cdir
beingreal
orimaginary
). Apart from this, some other notable changes introduced in this Pr are -
- #23752 : Fixed branch cut handling for
pow._eval_as_leading_term
andpow._eval_nseries
method where new direction (ndir
) being calculated with respect tobase
of thepow
expression isreal
.
def test_issue_23752():
expr1 = sqrt(-I*x**2 + x - 3)
expr2 = sqrt(-I*x**2 + I*x - 3)
assert limit(expr1, x, 0, '+') == -sqrt(3)*I
assert limit(expr1, x, 0, '-') == -sqrt(3)*I
assert limit(expr2, x, 0, '+') == sqrt(3)*I
assert limit(expr2, x, 0, '-') == -sqrt(3)*I
def test_nseries():
x = Symbol('x')
# test issue 23752
assert sqrt(-I*x**2 + x - 3)._eval_nseries(x, 4, None, 1) == -sqrt(3)*I + sqrt(3)*I*x/6 - \
sqrt(3)*I*x**2*(-S(1)/72 + I/6) - sqrt(3)*I*x**3*(-S(1)/432 + I/36) + O(x**4)
assert sqrt(-I*x**2 + x - 3)._eval_nseries(x, 4, None, -1) == -sqrt(3)*I + sqrt(3)*I*x/6 - \
sqrt(3)*I*x**2*(-S(1)/72 + I/6) - sqrt(3)*I*x**3*(-S(1)/432 + I/36) + O(x**4)
assert cbrt(-I*x**2 + x - 3)._eval_nseries(x, 4, None, 1) == -(-1)**(S(2)/3)*3**(S(1)/3) + \
(-1)**(S(2)/3)*3**(S(1)/3)*x/9 - (-1)**(S(2)/3)*3**(S(1)/3)*x**2*(-S(1)/81 + I/9) - \
(-1)**(S(2)/3)*3**(S(1)/3)*x**3*(-S(5)/2187 + 2*I/81) + O(x**4)
assert cbrt(-I*x**2 + x - 3)._eval_nseries(x, 4, None, -1) == -(-1)**(S(2)/3)*3**(S(1)/3) + \
(-1)**(S(2)/3)*3**(S(1)/3)*x/9 - (-1)**(S(2)/3)*3**(S(1)/3)*x**2*(-S(1)/81 + I/9) - \
(-1)**(S(2)/3)*3**(S(1)/3)*x**3*(-S(5)/2187 + 2*I/81) + O(x**4)
Currently no new issues for faulty leading terms or taylor terms for inverse trig/hyperbolic functions
have been reported after the new methods were introduced . If any issues will be discovered , those would be looked into. Overall 5 issues were closed through this pull request.
- #23798: Fixing/Implementing
nseries
methods forinverse hyperbolic/trig
andlog
functions: The last Pr as we know was focused on fixing/implementing thetaylor_term
and_eval_as_leading_term
methods, hence by obvious choice the following Pr should work towards correcting_eval_nseries
for these functions. Hence_eval_nseries
methods for all 6inverse hyperbolic
functions was implemented and specific code blocks (those responsible for handling points lying onbranch cuts
) from the method were refactored in case of all 6inverse trigonometric
functions. As these functions can be expressed in terms of logarithms, the Pr also had to make sure thatlog._eval_nseries()
was working perfectly for all cases, as we would be using theself.rewrite(log)._eval_nseries()
call for some specific cases in these functions. Hencelog._eval_nseries()
method was refactored too by introducing apositive dummy variable - t
and replacingx
withcdir*t
. A block based on theHeaviside
function was added and we would be subtracting a factor of2*I*S.Pi
once a set of conditions are met. Good amount of testing was added for the same.
def test_log_nseries():
p = Symbol('p')
assert log(1/x)._eval_nseries(x, 4, logx=-p, cdir=1) == p
assert log(1/x)._eval_nseries(x, 4, logx=-p, cdir=-1) == p + 2*I*pi
assert log(2*x + (3 - I)*x**2)._eval_nseries(x, 3, None, 1) == log(2) + log(x) + \
x*(S(3)/2 - I/2) + x**2*(-1 + 3*I/4) + O(x**3)
assert log(2*x + (3 - I)*x**2)._eval_nseries(x, 3, None, -1) == -2*I*pi + log(2) + \
log(x) - x*(-S(3)/2 + I/2) + x**2*(-1 + 3*I/4) + O(x**3)
assert log(-2*x + (3 - I)*x**2)._eval_nseries(x, 3, None, 1) == -I*pi + log(2) + log(x) + \
x*(-S(3)/2 + I/2) + x**2*(-1 + 3*I/4) + O(x**3)
assert log(-2*x + (3 - I)*x**2)._eval_nseries(x, 3, None, -1) == -I*pi + log(2) + log(x) - \
x*(S(3)/2 - I/2) + x**2*(-1 + 3*I/4) + O(x**3)
assert log(sqrt(-I*x**2 - 3)*sqrt(-I*x**2 - 1) - 2)._eval_nseries(x, 3, None, 1) == -I*pi + \
log(sqrt(3) + 2) + I*x**2*(-2 + 4*sqrt(3)/3) + O(x**3)
assert log(-1/(1 - x))._eval_nseries(x, 3, None, 1) == I*pi + x + x**2/2 + O(x**3)
assert log(-1/(1 - x))._eval_nseries(x, 3, None, -1) == I*pi + x + x**2/2 + O(x**3)
Apart from this, some other notable changes introduced in this Pr are -
- Refactored series method in expr.py to use the cdir parameter effectively. As a result of this change SymPy can now cater to a more vaster cases of interest.
def test_log_series():
# Note Series at infinities other than oo/-oo were introduced as a part of
# pull request 23798. Refer https://github.com/sympy/sympy/pull/23798 for
# more information.
expr1 = log(1 + x)
expr2 = log(x + sqrt(x**2 + 1))
assert expr1.series(x, x0=I*oo, n=4) == 1/(3*x**3) - 1/(2*x**2) + 1/x + \
I*pi/2 - log(I/x) + O(x**(-4), (x, oo*I))
assert expr1.series(x, x0=-I*oo, n=4) == 1/(3*x**3) - 1/(2*x**2) + 1/x - \
I*pi/2 - log(-I/x) + O(x**(-4), (x, -oo*I))
assert expr2.series(x, x0=I*oo, n=4) == 1/(4*x**2) + I*pi/2 + log(2) - \
log(I/x) + O(x**(-4), (x, oo*I))
assert expr2.series(x, x0=-I*oo, n=4) == -1/(4*x**2) - I*pi/2 - log(2) + \
log(-I/x) + O(x**(-4), (x, -oo*I))
I had encountered an issue based on asymptotic series expansions(#23843), when I was doing some basic testing after the Pr was merged. This issue will be looked into soon and other than this everything with respect to series expansions and limit calculations for inverse trig/hyperbolic functions
looks good. Overall 1 issue was closed through this pull request.
- #23844: Implemented/Fixed a few series methods for
bessel
functions: This PR was meant to fix/implement_eval_as_leading_term
and_eval_nseries
methods forbessel
andmodified bessel
functions but ended up fixing a lot more than what was expected. Good amount of testing was added for limit of all 4 bessel functions at bothreal
andimaginary
infinities. Apart from this ,some notable changes introduced were
- Refactored
mrv_leadterm
function ofgruntz.py
to make use of theleadterm
method rather than thecalculate_series
method . This refactoring essentially deserved another pull request but a couple of issues based onbessel
functions like #19154 would require thegruntz
refactoring if we would like to fix that, hence code for that was added . This eventually also led to deprecation ofcalculate_series
function and thecompute_leading_term
function from the codebase. - Fixed limits/leading term methods for
Chi
,Shi
,polygamma
and theharmonic
function. - Series support was added for
pow._eval_nseries()
where theexponent
isvariable
and the number of terms in the series expansion can be expressed in terms of the exponent.
>>> n, k = symbols('n, k', positive=True)
>>> ((1 + 1/n)**k)._eval_nseries(n, -k + 3, None)
n**(-k) + k*n**(2 - k)*(k - 1)/2 + k*n**(1 - k) + O(n**(3 - k))
>>> ((1 + 1/n)**k)._eval_nseries(n, 4, None)
n**(-k) + k*n**(3 - k)*(k - 2)*(k - 1)/6 + k*n**(2 - k)*(k - 1)/2 + k*n**(1 - k) + O(n**4)
All limit and series issues based on bessel
functions have now been addressed from the issues page . If any issues turn up in future, those would be looked into. Overall 6 issues were closed through this pull request.
Work in Progress
- Ongoing
- #24020 : Impelementing leading term and nseries methods for
frac
function - #24027 : Implementing some leading term and series methods for
gamma
functions (uppergamma
,lowergamma
andloggamma
) anderror
functions (expint
,Si
,Shi
andChi
).
- To be done
- Fix
Piecewise._eval_as_leading_term()
. SymPy's limit handling forPiecewise
functions is not up to the mark as of now (at points where the piece changes or at points of discontinuity). Some related issues which need to be addressed are the following : #23836, #23299, #6171, #23299 - Implementing leading term and series methods for the following functions -
zeta
,lerchphi
,polygamma
,beta
,elliptical
,inverse error functions
likeerfcinv
,erfinv
. There are issues scattered on SymPy's issues page which demands for implementation of these methods, a recent one being #23635 . Alsoaseries
support should also be added forgamma
,factorial
andbessel
functions.
Issues Opened
- #23843: Asymptotic series for atan/acot functions shifted to their branch cuts gives wrong answer
- #23752: Fix limits and series for a few particular cases of sqrt function at branch point 0
Future Work
- A lot of special functions now have an
_eval_as_leading_term
and_eval_nseries
methods correctly implemented, but a few remain. Support for the remaining few should be added and also asymptotic expansions through_eval_aseries
could be added if they exist for the respective function. - Improving documentation throughout the series/limits module - SymPy's series module makes use of lot of helper methods like
taylor_term
,_eval_nseries
,_eval_lseries
,_eval_aseries
and_eval_as_leading_term
but there isn't any clear guideline or documentation regarding what each of these do or the significance of these methods. We might also need to highlight the important constraints to keep in mind while implementing any of these methods for a particular function from the functions module. These documentation changes are being tracked in #23625 - Refactor Order code and order.py to handle more cases. This was proposed in my proposal but couldn't be acheived due to time constraints. Some notable issues which should be addressed are the following
- #22836 - Possible improvements for Order of expressions involving factorials.
- #22247 - Refactoring Order for Multivariate expressions and improving individual if-else blocks to handle specific cases.
- #19120, #10290, #7872 - Issues related to
_eval_subs
method inorder.py
. - #22590 - Issues related to convergence tests (particularly
Dirichlet
andRaabe
test) and order term.
- Extending
ring_series
module.
Conclusion
I am quite impressed with the current state of the series and limits module right now . A good number of issues which were pending since a long time have now been resolved. Earlier, the gruntz
algorithm was responsible for most of the limit computations for functions of type pow
, log
, inverse trig
and inverse hyperbolic
at points lying on branch cuts
. The algorithm being somewhat recursive was slow in many cases and as the calculate_series
method was fragile and buggy, the algorithm would often return wrong results . Now that the _eval_as_leading_term
and _eval_nseries
methods have been refactored or completely re-written in most cases we not only have more control on the correctness of the results being returned but also on the speed behind these computations. The calculate_series
method also stands deprecated as of SymPy version 1.12
. As the series module reaches a good extent of development, extending the faster ring_series
module should be the next target for development.
I had been hearing amazing things about GSoC since the past 2 - 3 years and my GSoC tenure exactly lived upto its expectation. I had an exceptional experience past summer working for SymPy. Apart from improving my coding skills, I got a gist of how documentation and testing driven software development works which would surely be of great help when I move to the industry in the upcoming years. I am grateful to SymPy for this experience. I am highly grateful to my mentors, Kalevi Suominen and Sidharth Mundhra for always being there whenever and wherever I was stuck and have guided me in every possible way. I would also like to thank the org admins Aaron Meurer and Oscar Benjamin for being readily available throughout the Summer and pitching in for code reviews whenever they had time.
I plan to keep contributing to SymPy, trying to finish the work proposed in the Future Work after addressing the issues pointed out in the Work in Progress section. I would also like to pitch in as a co-mentor for modules such as the series, concrete and stats module for GSoC 2023 if required.