Prob 0012 ‐ Highly Divisible Triangular Number - maccergit/Euler GitHub Wiki
Yet another problem that builds upon factorization of numbers.
01
We immediately realize that triangular numbers are given by Gauss's closed-form formula, so we don't need a loop to compute them:
T_n = \frac{n(n + 1)}{2}
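As a quick sanity check, the closed form is a one-liner (a minimal sketch; the `triangular` name is just for illustration):

```python
def triangular(n):
    # Gauss's closed-form formula: T_n = n(n + 1) / 2.
    # One of n and n + 1 is always even, so integer division is exact.
    return n * (n + 1) // 2

print([triangular(n) for n in range(1, 8)])  # → [1, 3, 6, 10, 15, 21, 28]
```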
We've already investigated approaches to prime factorization and have some fast ways to do it. However, we want a count of all the divisors, not just the prime factors. Since we know all the prime factors, the divisors are all the ways the prime factors can be combined. If we write the number as a product of prime powers ($p^n$) and include $p^0 = 1$ as one of the choices for each prime, then each prime factor contributes $n + 1$ possible choices, and the total number of divisors is the product of the choices over all prime factors:
T = p_1^{n_1} \times p_2^{n_2} \times \cdots \times p_z^{n_z}
\displaystyle \text{number of divisors of }T = (n_1 + 1) \times (n_2 + 1) \times \cdots \times (n_z + 1) = \prod_{i = 1}^{z}{(n_i + 1)}
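This counting rule can be sketched as follows, with plain trial division standing in for the wiki's own factorization library (which is not shown here):

```python
def prime_factorization(n):
    # Trial division: returns {prime: exponent}.
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def divisor_count(n):
    # Number of divisors = product of (exponent + 1) over the prime factors.
    count = 1
    for exp in prime_factorization(n).values():
        count *= exp + 1
    return count

# 28 = 2^2 * 7, so it has (2 + 1)(1 + 1) = 6 divisors: 1, 2, 4, 7, 14, 28
print(divisor_count(28))  # → 6
```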
Putting this all together with our home-grown factorization library produces the answer in a few seconds:
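The overall shape of this first solution might look like the sketch below; trial division again stands in for the home-grown library, and the function names are illustrative:

```python
def divisor_count(n):
    # Multiply (exponent + 1) over the prime factorization of n.
    count, d = 1, 2
    while d * d <= n:
        exp = 0
        while n % d == 0:
            exp += 1
            n //= d
        count *= exp + 1
        d += 1
    if n > 1:
        count *= 2  # one remaining prime factor with exponent 1
    return count

def first_triangular_with_divisors(target=500):
    n = 1
    while True:
        t = n * (n + 1) // 2  # Gauss's formula
        if divisor_count(t) > target:
            return t
        n += 1
```

For example, `first_triangular_with_divisors(5)` returns 28, the first triangular number with more than 5 divisors.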
02
We also know that our home-grown prime factoring code is not the fastest, and we can substitute faster libraries. Exploring those libraries reveals that some calls return the powers of the prime factors directly, so we don't have to count them ourselves. SymPy can even provide the divisor count directly - the resulting solution is an order of magnitude faster and much easier to read:
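A sketch of the SymPy version, using its `divisor_count` function (assuming SymPy is installed; the search-function name is illustrative):

```python
from sympy import divisor_count  # SymPy computes the divisor count directly

def first_triangular_with_divisors(target=500):
    n = 1
    while True:
        t = n * (n + 1) // 2
        if divisor_count(t) > target:
            return t
        n += 1
```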
03
The fastest prime factorization we have so far is the pyprimesieve library. It's almost as easy to use as SymPy, since it returns the factors with their associated powers, so we don't need to count the prime factors to get the powers. Using it gives us another order of magnitude performance increase:
04
The solution discussion mentions that the numerator of Gauss's formula contains coprime elements - $n$ and $(n + 1)$ are coprime. Thus their prime factors do not overlap, and we can find the number of divisors of each element and multiply the counts together to get the total (applying the division by two to whichever element is even). Factoring two smaller numbers is much faster than factoring one large one, so we get a performance boost. Interestingly, the boost is not as large as we might expect:
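The coprime split can be sketched like this (trial division again standing in for the faster libraries):

```python
def divisor_count(n):
    # Multiply (exponent + 1) over the prime factorization of n.
    count, d = 1, 2
    while d * d <= n:
        exp = 0
        while n % d == 0:
            exp += 1
            n //= d
        count *= exp + 1
        d += 1
    if n > 1:
        count *= 2
    return count

def first_triangular_with_divisors(target=500):
    n = 1
    while True:
        # n and n + 1 are coprime, so their divisor counts multiply;
        # the division by two is applied to whichever element is even.
        if n % 2 == 0:
            divisors = divisor_count(n // 2) * divisor_count(n + 1)
        else:
            divisors = divisor_count(n) * divisor_count((n + 1) // 2)
        if divisors > target:
            return n * (n + 1) // 2
        n += 1
```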
05
Using coprimes with the SymPy version results in a better performance improvement:
06
Adding the coprimes approach to the pyprimesieve implementation actually slows it down a bit:
07
Closer examination reveals that the code that computes the number of divisors is a candidate for memoization. When the function is memoized, the direct coprime implementation gets a performance boost. Note that the solution without coprimes cannot benefit from this, as it never calls the function with repeated values:
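With the coprime split, the $(n + 1)$ element of one iteration reappears as the $n$ element of the next, so a cache pays off. A sketch using `functools.lru_cache` (the document's memoization mechanism is not shown, so this is one plausible choice):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def divisor_count(n):
    # Memoized trial-division divisor count.
    count, d = 1, 2
    while d * d <= n:
        exp = 0
        while n % d == 0:
            exp += 1
            n //= d
        count *= exp + 1
        d += 1
    if n > 1:
        count *= 2
    return count

def first_triangular_with_divisors(target=500):
    n = 1
    while True:
        if n % 2 == 0:
            divisors = divisor_count(n // 2) * divisor_count(n + 1)
        else:
            divisors = divisor_count(n) * divisor_count((n + 1) // 2)
        if divisors > target:
            return n * (n + 1) // 2
        n += 1
```

`lru_cache` also provides `divisor_count.cache_clear()`, which is handy for resetting the cache between timing runs.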
08
Adding memoization to the SymPy implementation requires wrapping the SymPy call in an extra function, as we can't simply memoize the SymPy call itself. Memoization again provides a boost:
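The wrapper approach might look like this (assuming SymPy and `lru_cache`): the decorator goes on a thin local function rather than on SymPy's own call.

```python
from functools import lru_cache

import sympy

@lru_cache(maxsize=None)
def divisor_count(n):
    # Thin wrapper so lru_cache can memoize calls into SymPy.
    return sympy.divisor_count(n)
```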
09
Adding memoization to the pyprimesieve implementation removes the bottleneck, and we now see a performance improvement:
Final Notes
Running repeated tests that include memoization can be tricky - the earlier test runs can prime the cache so that later test runs are much faster. We get better results when we clear the cache before each test run, but the overhead of clearing the cache will get included in the timing of the test. This overhead is small, and I feel the benefits of running comparable tests outweigh the slight inaccuracy caused by including the overhead.
The lack of a big performance improvement when first moving to the coprime implementation shows that we had multiple bottlenecks: fixing the algorithm alone was not enough, and it took a second change before we saw the full benefit. Note that the non-coprime implementations had no good place for memoization, as argument values never repeated while we iterated over the triangular numbers. Only after we broke the triangular numbers apart into smaller values did functions get called with repeated values.