En route to Polynomial - isuruf/symengine GitHub Wiki
Design decisions and progress while writing the Polynomial
(in chronological order)
Developed during GSoC 2015 under Sumith and Shivam's project.
Inputs received from Ondřej Čertík and Francesco Biscani.
- Planned to implement our own
hashtableandIntegerin SymEngine so as to have a fast Polynomial module, Piranha has both of it's own and the speed that which we have to match.
- Sumith started work on
UnivariatePolynomialclass, got merged eventually.
Remarks
- Uses
std::mapinstead ofstd::unordered_map(or evenpiranha::hash_set) which is why it is slow.
- Supports only univariate polynomial, can be deprecated once multivariate is implemented.
-
mul_polyuses a very naive algorithm so doesevalandeval_bit
- Uses
- Decided to work on the
poly_mulalready present in rings.cpp (here), try to speed up expand2b. Once we hit the speed in lower level, then wrap into aPolynomialclass.
Todos to reach there were:
- Use
piranha::integer
- Pack exponents of the polynomial instead of using a vector for them.
- Use
- To begin,
expand2baveraged94.6msat start, I'll refer to the benchmark results in my machine so that you get an idea of speedups and slowdowns.
- Before packing we used
vectorof exponents tompz_classunordered_map, one more option instead of vector wasvalarray, but didn't lead anywhere, instead brought slowdown.expand2baveraged around112ms.There were very few instances were the syntactic sugar came handy. We are assuming that bottleneck is in memory allocation time, valarray will probably not bring much over vector.
- We used
piranha::integerinstead ofmpz_classwhich brought speedup but not at all close to Piranha. Hence we moved from using containerstd::unordered_maptopiranha::hash_settoo
Following were the problems faced and how they were resolved
- As
piranha::hash_setis syntactically similar tostd::unordered_set, each element is a pair. Here, we were unable to usestd::pairbecause of immutability. Hence,m_pairwas written.
- We noticed that sparsity of
hash_setwas less, this was a real slowdown. Due to the fact ofpower of twopacking of exponents were used, hence the hashing was not good and poor sparsity. Then we decided to use Piranha's packingpiranha::encode. This uses packing byprimes. Two advantages; one,negativeexponents can also be packed, two, sparsity improved, hence speed. This is when we decided to hard-depend on Piranha for the Polynomial module.
- Once both
piranha::integerandpiranha::hash_setwere used the speed came down to23ms, that is 4X from the initial94ms. But we were still far from Piranha's13.5ms.
- After investigating we realised that
DNDEBUGwas not passed as a compiler flag leading toDebugbuild of Piranha, once this was fixed, we reached an average of14.3ms. Yay
- As
- One problem still remained. It was noticed after all of this was done
expand2bnow showed averages of114ms, a slowdown. So wasexpand2. Usinggit bisectthis was cornered to include ofpiranha.hppheader. The problem being this inclusion includedthread_poolheader of Piranha, which declared a global thread variable, which turned off some compiler optimizations. But this is not necessary for us as we use only single threaded functions. This was resolved by amending Piranha, the classthread_poolwas templatized i.e. replacedclass thread_pool: private detail::thread_pool_base<>withtemplate <typename = void> class thread_pool_: private detail::thread_pool_base<>.
All the speed reports of benchmarks can be found here.
Good to start the wrapping of Polynomial class.
Polynomials are like a 3D space --- one axis is dense/sparse, another axis is the list of generators, and another axis are the coefficient type (ZZ, QQ or a polynomial, which by itself has a 3D space of options). Probably one can find an application where each of the choices will be the fastest. However, SymPy and Piranha experience seems to suggest that using a sparse and a non-recursive (i.e. providing the full list of generators and just use ZZ, QQ or EX) polynomial is very fast and should provide a competitive solution for most applications. This motivates our design below.
There is a class Ring<ZZ> (integers), Ring<QQ> (rational) and Ring<EX> (any kind of SymEngine expression) for storing the coefficient type as well as generators. A Polynomial class then stores something like RCP<const Ring<ZZ>> inside it to provide the type of coefficients as well as the list of generators. poly_mul then checks the pointer to the Ring class to ensure it is the exact same instance, otherwise it raises an exception (i.e. the generators and the coefficient type must be the same, otherwise it raises an exception).
Example 1: a polynomial 2*x*y^2 + y^4 can be represented as a Polynomial instance with a hashtable {(1, 2): 2, (0, 4): 1} and a pointer to a Ring<ZZ>(x, y) (i.e. with generators x and y).
Example 2: a polynomial 2/3*x*y^2 + y^4 can be represented as a Polynomial instance with a hashtable {(1, 2): 2/3, (0, 4): 1} and a pointer to a Ring<QQ>(x, y).
Example 3: a polynomial sin(a)*2/3*x*y^2 + y^4 can be represented as a Polynomial instance with a hashtable {(1, 2, 1): 2/3, (0, 4, 0): 1} and a pointer to a Ring<QQ>(x, y, sin(a)).
Example 4: a polynomial a*x can be represented as a Polynomial instance with a hashtable {(1, 1): 1} and a pointer to a Ring<ZZ>(x, a). Another way to represent it is as {(1): a} with a Ring<EX>(x), but that will be slower, because the general expressions EX as coefficients are slower than ZZ.
Example 5: a polynomial 2*x*y^2 + y^(-4) can be represented as a Polynomial instance with a hashtable {(1, 2): 2, (0, -4): 1} and a pointer to a Ring<ZZ>(x, y) (i.e. with generators x and y).
Example 6: a polynomial 2*x*y^(3/2) + y^(-1/4) can be represented as a Polynomial instance with a hashtable {(1, 3/2): 2, (0, -1/4): 1} and a pointer to a Ring<ZZ, QQ>(x, y). The first template (ZZ) is the coefficient type, the second template is the exponents type (QQ).
The exponents are signed integers (i.e. positive or negative) by default. They can be specified with the second template to Ring to be rational.
Instead of using ZZ, QQ, EX, we can instead use the C++ type itself, for example piranha::integer instead of ZZ, piranha::rational instead of QQ and SymEngine::Expression instead of EX.
The ring series is implemented just like in SymPy, e.g.:
rs_mul(const Polynomial &a, const Polynomial &b, Polynomial &c, Generator x)
inside it calls poly_mul which checks that a and b have the same Ring, otherwise it raises an exception.
-
Taylor series are polynomials with positive exponents and the ring
Ring<ZZ>,Ring<QQ>,Ring<EX>(i.e. it depends on the coefficient types, but the exponents are always ZZ, the default). -
Laurent series are polynomials with positive or negative exponents and the ring
Ring<ZZ>,Ring<QQ>,Ring<EX>(i.e. it depends on the coefficient types, but the exponents are always ZZ, the default). -
Puiseux series are polynomials with positive or negative exponents and the ring
Ring<ZZ, QQ>,Ring<QQ, QQ>,Ring<EX, QQ>(i.e. it depends on the coefficient types, but the exponents are always QQ).