Prob 0002 ‐ Even Fibonacci Numbers

Another easy problem, but one where a bit of research reveals some interesting solutions, including the use of a provided library.

01

Brute force : compute all the Fibonacci numbers up to the limit, and then filter out the odd ones.

A slightly deeper investigation into Python, as this results in a Fibonacci sequence "generator", using the "yield" keyword. Using a generator allows the function to maintain state across function calls, so we don't need to do that ourselves. Also, since generators are iterable, we can use them in comprehensions to build lists, sets, other generators, etc...
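
As a rough sketch (illustrative names and seed values, not necessarily the repository's code), the generator and the filtering might look like this:

```python
def fibonacci(limit):
    """Yield the Fibonacci numbers 1, 2, 3, 5, 8, ... up to and including limit."""
    a, b = 1, 2
    while a <= limit:
        yield a
        a, b = b, a + b

# Because the generator is iterable, the even terms can be filtered and
# summed in a single generator expression.
print(sum(n for n in fibonacci(4_000_000) if n % 2 == 0))  # 4613732
```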

Modern machines are so fast that, again, the sample and the problem do not push the machine very hard - even with a naive approach, the two cases take around 1.5 and 4 µsec respectively. Again, we plot out the values well past the original range to see how performance trends - while it's pretty noisy (even when taking the average of 100k runs for each point), it appears to grow logarithmically. I believe this is because the Fibonacci numbers get sparser as they get larger, while the computation for the next number takes fixed time. Examining the curve for thousands of data points as the limit moves into the quadrillions shows that the curve never stops growing, but the rate of growth decreases. For many cases, the direct approach is probably "good enough". Note that in the graph, the range of results only spans about 700 nanoseconds - at this level of resolution, noise becomes significant.

02

Optimize the previous approach by taking advantage of the fact that the Fibonacci numbers follow a repeating odd-odd-even pattern. By computing them all, but only returning every 3rd result, we avoid the need to test each value for evenness. As a result, we get similar results, but about 2x faster :
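
A hedged sketch of the every-third-value idea (again, names are illustrative rather than the repo's code):

```python
def even_fibonacci(limit):
    """Yield every third Fibonacci number - exactly the even ones - up to limit."""
    a, b = 1, 2
    while b <= limit:
        yield b                 # b is even here: the pattern repeats odd, odd, even
        for _ in range(3):      # advance three steps to reach the next even value
            a, b = b, a + b

print(sum(even_fibonacci(4_000_000)))  # 4613732
```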

03

Optimize the previous approach by avoiding generating the odd values altogether - substitute the expressions for the odd values into the recurrence, and then simplify to get :

\displaystyle E_n = 4E_{n - 1} + E_{n - 2}
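
A minimal sketch of a generator built directly on this recurrence (seed values chosen so that it produces 2, 8, 34, 144, ...):

```python
def even_fibonacci(limit):
    """Yield even Fibonacci numbers using E(n) = 4*E(n-1) + E(n-2)."""
    a, b = 0, 2        # seeds: 4*2 + 0 = 8 is the next even Fibonacci number
    while b <= limit:
        yield b
        a, b = b, 4 * b + a

print(sum(even_fibonacci(4_000_000)))  # 4613732
```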

This is a bit faster, but we are still generating all the previous even values to get to the one we want, and we are seeing diminishing returns from optimizing the current approach :

04

There is a closed form for computing a Fibonacci number without having to compute all the previous ones. In our case, since we are computing sums, we need all those previous numbers - but it's worth seeing if using the closed form does any better. So, we use Binet's Formula for computing a Fibonacci number :

\displaystyle \frac{\phi^n - \psi^n}{\sqrt{5}}

where :

\displaystyle \phi = \frac{1 + \sqrt{5}}{2}
\displaystyle \psi = 1 - \phi

However, this assumes we know when to stop. We can check the result to see if we have hit the limit, or we can derive the index from the formula to get :

\displaystyle n = \frac{\ln(\text{limit}) + \frac{\ln(5)}{2}}{\ln(\phi)}

Even though computing the index is complicated, we only need to do it once, instead of running a comparison on every result to see if we have reached the limit.

While this is a much faster approach to generating a single Fibonacci number, building a sum by traversing the ones we need turns out to be a bit slower due to the extra work done to compute each item in the series. It would be better to get a closed form for the sum of the elements, but that is beyond me (but see below for an update) - the best I was able to do was to compute the sum of the numerators, and then finally divide by $\sqrt{5}$ - the goal being to reduce the number of operations done for each element. Even pulling as much out of the repeating loop as possible, the end result is still slower than the optimized direct approach.
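
A simplified sketch of that approach - each even term's Binet numerator is accumulated in the loop, and the division by $\sqrt{5}$ happens once at the end (function and constant names are mine, and plain floats limit how large the limit can get):

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2       # the golden ratio
PSI = 1 - PHI

def even_fib_sum_binet(limit):
    """Sum the even Fibonacci numbers up to limit using Binet's formula."""
    # Largest index n with F(n) <= limit, from the inverted formula above
    # (int() truncates, which is a floor for positive values).
    n_max = int((math.log(limit) + math.log(5) / 2) / math.log(PHI))
    numerator_sum = 0.0
    for n in range(3, n_max + 1, 3):    # even Fibonacci numbers sit at indices 3, 6, 9, ...
        numerator_sum += PHI ** n - PSI ** n
    return round(numerator_sum / SQRT5)

print(even_fib_sum_binet(4_000_000))  # 4613732
```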

05

The SymPy library includes a function to compute Fibonacci numbers - so let's see if the library has a better way to do things. Turns out this is a case where "rolling your own" has much better performance - I'm not sure why the library is so slow, unless it is because it is incurring the overhead of doing the math symbolically instead of numerically.
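
For reference, a minimal example of the library call in question - `sympy.fibonacci(n)` returns the nth Fibonacci number exactly:

```python
from sympy import fibonacci

limit = 4_000_000
total, n = 0, 3
f = fibonacci(n)                # the even Fibonacci numbers are F(3), F(6), F(9), ...
while f <= limit:
    total += f
    n += 3
    f = fibonacci(n)
print(total)  # 4613732
```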

06

Using Google Gemini AI, I was able to prompt it to determine the closed form for the solution. It would not do it directly - I had to tell it to combine the various formulas, and even then it was not happy about the results until I told it to include the "floor" function.

\displaystyle \text{Sum} = \frac{\Phi^{3 \left\lfloor \frac{\ln(\text{limit}) + \frac{1}{2}\ln(5)}{3 \ln(\Phi)} \right\rfloor + 2} - \Psi^{3 \left\lfloor \frac{\ln(\text{limit}) + \frac{1}{2}\ln(5)}{3 \ln(\Phi)} \right\rfloor + 2} - \sqrt{5}}{2\sqrt{5}}

Refactoring the resulting code to reduce the number of duplicated operations then results in the fastest solution :
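
A sketch of what the refactored closed form might look like with plain floats - the shared sub-expressions are computed once, and only the final exponentiation and division remain (names are illustrative, not the repo's code):

```python
import math

def even_fib_sum_closed_form(limit):
    """Closed-form sum of even Fibonacci numbers up to limit (float version)."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    psi = 1 - phi
    # k = number of even Fibonacci terms that fit under the limit.
    k = math.floor((math.log(limit) + math.log(5) / 2) / (3 * math.log(phi)))
    exponent = 3 * k + 2
    return round((phi ** exponent - psi ** exponent - sqrt5) / (2 * sqrt5))

print(even_fib_sum_closed_form(4_000_000))  # 4613732
```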

Investigation Into Issues With "float"

While Python has built-in support for arbitrarily large integers, it uses the standard IEEE-754 floating point format for floats. This has limited precision (11 bits in the exponent, 52 bits in the mantissa, and a sign bit - 64 bits total), which means problems can occur when floats are raised to large powers or when very large float values are needed. We'll discuss very large float values in a later problem - but the loss of precision when raising floats to large powers can become an issue when using Binet's formula.
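
A quick interpreter session illustrating the 52-bit mantissa limit (standard IEEE-754 behavior, not specific to this problem):

```python
>>> 2.0 ** 53               # integers up to 2**53 are exactly representable
9007199254740992.0
>>> 2.0 ** 53 + 1.0         # ...but the next integer is not - the +1 is silently lost
9007199254740992.0
>>> float(10 ** 17) + 1.0   # near 10**17, adjacent floats are 16 apart
1e+17
```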

For example, using the limit of "10**17" (note we don't use "1e17" here - that will be a float, and not an integer), the first approach produces an answer of 49597426547377748, but the closed form approach produces 49597426547377888. They are close, but not the same.

07 decimal.Decimal

One approach to handling floating-point issues is to use the "decimal" package that comes with Python. While it's convenient and written in "pure Python", it does not provide arbitrary precision. Instead, you tell it the precision you want, and it uses that as a fixed precision - the default is 28 significant digits, compared to about 16 decimal digits for float.

It's important to realize that Decimal is not simply an extension of float with more bits: it's implemented in base-10, not binary, and it gives fine-grained control over how the precision is used. While calculations can silently truncate if the precision is exceeded, other operations will throw an exception if they require more precision than is available, and "traps" are available to change the behavior of calculations so that they throw exceptions when precision is exceeded instead of silently returning the wrong result.

For example, changing the closed form to use Decimal allows it to return the correct result when the limit is $10^{17}$. However, while the correct solution for $10^{30}$ is 727244555616386341839153320976, the Decimal solution returns 727244555616386341839153296895, because the calculations exceeded the default precision. Also, to display this result, I had to temporarily increase the precision when calling "quantize()" to round the result, as "quantize()" is one of the operations that will throw an exception if the precision is exceeded. This also means that if I increase the limit even further (until it exceeds the local precision used for quantize), then an exception will be thrown. However, using traps can make Decimal very picky, to the point where you spend a lot of time dealing with adjusting significant figures to make the calculations run.
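
A hedged sketch of the Decimal variant described above - the working precision comes from `getcontext().prec`, and `quantize()` runs inside a temporarily widened `localcontext()` so it does not throw (names are mine, not the repo's):

```python
from decimal import Decimal, getcontext, localcontext

getcontext().prec = 28                      # the default; raise this for larger limits

SQRT5 = Decimal(5).sqrt()
PHI = (1 + SQRT5) / 2
PSI = 1 - PHI

def even_fib_sum_decimal(limit):
    k = int((Decimal(limit).ln() + Decimal(5).ln() / 2) / (3 * PHI.ln()))
    exponent = 3 * k + 2
    total = (PHI ** exponent - PSI ** exponent - SQRT5) / (2 * SQRT5)
    with localcontext() as ctx:
        ctx.prec += 10                      # widen so quantize() can round to an integer
        return total.quantize(Decimal(1))

print(even_fib_sum_decimal(10 ** 17))  # 49597426547377748
```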

Using the same algorithm, but using Decimal, we turn the fastest version into the slowest - it runs over 100x slower!

08 mpmath

Another high-precision library is "mpmath". It also uses a fixed precision that can be configured dynamically, and it's also "pure Python". While it does not come with Python, it is what SymPy uses for floats - so it's worth investigating, since any features or issues with mpmath will also carry through to SymPy. Its default precision is 15 decimal digits - so a proper comparison requires setting the precision to 28. It also tends not to throw exceptions when there is a loss of precision (it behaves more like float in this regard), and will silently produce wrong values when the precision is exceeded. Converting the Decimal version to use mpmath is simply a case of using a slightly different set of API calls, changing the precision of the global context, and removing the local context. The resulting code is a little cleaner - but does not have the same safety as with Decimal.
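
A corresponding mpmath sketch, with `mp.dps` set to 28 so the comparison against Decimal is fair (again, names are illustrative rather than the repo's code):

```python
from mpmath import mp, sqrt, log, floor, power, nint

mp.dps = 28                     # decimal digits of precision for the global context

def even_fib_sum_mpmath(limit):
    sqrt5 = sqrt(5)
    phi = (1 + sqrt5) / 2
    psi = 1 - phi
    k = int(floor((log(limit) + log(5) / 2) / (3 * log(phi))))
    exponent = 3 * k + 2
    total = (power(phi, exponent) - power(psi, exponent) - sqrt5) / (2 * sqrt5)
    return int(nint(total))     # round to the nearest integer

print(even_fib_sum_mpmath(10 ** 17))  # 49597426547377748
```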

This is almost twice as fast as Decimal. It still has the problem of returning the wrong answer when the limit is $10^{30}$ - but a simple increase of the precision to 100 allows it to return the correct answer, at the cost of a few extra microseconds. With the limit of $10^{100}$, the integer version returns an exact value of 12065007678944807420403981310014175239608005638595098371630805388439212255831420630608529497465143520, while the mpmath version with 100 digits of precision returns 1.206500767894480742040398131001417523960800563859509837163080538843921225583142063060852949746514355e+100. However, changing the precision to 200 gives the correct result and adds about 10 microseconds.

09 gmpy2

So you want arbitrary precision floats AND performance? The "gmpy2" library is a Python wrapper around the GMP (GNU Multiple Precision) library. GMP is written in C, and is specially coded to leverage hardware optimizations to be very fast. Like Decimal and mpmath, it uses a fixed precision that is dynamically configurable. Like before, this means converting the code just involves coding to a new set of API calls. However, while it seems like a straightforward conversion to use the gmpy2 API, it insisted on producing results that did not match the other versions. I know the library works, as I have successfully used it in a later problem - but for some reason, I've had no luck here.
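
Purely for illustration (this is not the non-working port itself), here is how the gmpy2 context is typically configured - one easy thing to trip over is that its precision is specified in bits rather than decimal digits:

```python
import gmpy2

# 28 decimal digits needs roughly 28 * log2(10) ≈ 93 bits of mantissa.
gmpy2.get_context().precision = 100

sqrt5 = gmpy2.sqrt(gmpy2.mpfr(5))
phi = (1 + sqrt5) / 2
print(phi)                      # 1.6180339887498948482...
```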

TODO - add tons of diagnostic prints to show intermediate values of all calculations, for both gmpy2 version and mpmath version. Comparison of values should show where the problem comes in. Once reduced to a single operation or segment, can then investigate with known inputs in python shell and determine what is going on.

NOTES

If future problems include a need to compute Fibonacci numbers, then the optimized code should be refactored out to the "utils" project.

The O(log n) behavior of the iterative approach is nearly flat for very large limits (such as $10^{10000}$), so even such limits still produce a result in under a second. The closed form approach, by contrast, was very difficult to get right. While it is still faster than the iterative approach (the mpmath version with a precision of 15000 produced the result for $10^{10000}$ in 31 msec), it might not be worth the effort unless the limits are staggeringly large.