Prob 0005 - Smallest Multiple - maccergit/Euler GitHub Wiki

This problem describes finding the Lowest Common Multiple (LCM) of the natural numbers 1..N. Finding the LCM is a well-known problem in mathematics, and there are many approaches to it.

01

This can be solved easily by hand. Since the problem tells us LCM(1..10) is 2520, we just need to add any missing prime factors for the numbers 11..20, and we get 11, 13, 2, 17, and 19. Note that 2 is included because the powers of 2 in 1..10 are $2^1$, $2^2$, and $2^3$ - but 16 is $2^4$. The answer is then $2520 \times 2 \times 11 \times 13 \times 17 \times 19$. However, we would like an automated approach that also generalizes to any limit.
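The hand calculation can be checked in a couple of lines of Python (the variable name is mine; the wiki's original listing is not shown here) :

```python
# Extend LCM(1..10) = 2520 with the prime factors missing from 11..20:
# 11, 13, an extra 2 (for 16 = 2^4), 17, and 19.
answer = 2520 * 2 * 11 * 13 * 17 * 19
print(answer)  # 232792560

# Cross-check: the result is divisible by every number in 1..20.
assert all(answer % n == 0 for n in range(1, 21))
```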

02

The direct trial division approach can still be optimized by realizing that the result will need to be a multiple of 2520, allowing us to skip the intervening 2519 numbers for each trial :
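A sketch of this optimization (function and parameter names are mine, not necessarily the wiki's) : since the answer must be a multiple of 2520, we only test every 2520th number.

```python
def smallest_multiple_trial(limit: int = 20, step: int = 2520) -> int:
    """Trial division over multiples of a known divisor (2520 = LCM(1..10)),
    skipping the 2519 numbers in between each candidate."""
    candidate = step
    while True:
        if all(candidate % n == 0 for n in range(2, limit + 1)):
            return candidate
        candidate += step

print(smallest_multiple_trial(20))  # 232792560
```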

03

We can use the same approach as before, but take the optimization even further by using the LCM found so far, instead of just using 2520 :
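One way this incremental idea might look in code (a sketch under my own naming, not the wiki's original) : once we know LCM(1..k), we step by that value while searching for the smallest multiple that also covers k+1.

```python
def smallest_multiple_incremental(limit: int = 20) -> int:
    """Build the LCM incrementally, stepping by the LCM found so far."""
    result = 1
    for n in range(2, limit + 1):
        candidate = result
        while candidate % n != 0:
            candidate += result  # step by LCM(1..n-1), not by 1
        result = candidate
    return result

print(smallest_multiple_incremental(20))  # 232792560
```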

While this is much faster than the previous approach, it still does not take long before it is too slow to provide results for larger limits :

04

A more efficient approach is to factor the numbers 1..limit, and then use these to compute the LCM. While factoring is more difficult (aka "slower") than trial division, we only need to factor a much smaller range of numbers. By using a Python dictionary keyed by prime factors found, we can also keep track of the exponent needed for that prime - and then do the necessary multiplication at the end. This ends up being much faster than the most optimized trial division approach :
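The dictionary-of-exponents idea can be sketched as follows (trial-division factoring and all names are my own; the wiki's listing may differ) :

```python
from collections import defaultdict

def smallest_multiple_factored(limit: int = 20) -> int:
    """Factor each number in 2..limit and track the highest power seen for
    each prime; the LCM is the product of those maximal prime powers."""
    max_power = defaultdict(int)  # prime -> largest exponent needed
    for n in range(2, limit + 1):
        d, remaining = 2, n
        while d * d <= remaining:
            power = 0
            while remaining % d == 0:
                remaining //= d
                power += 1
            if power:
                max_power[d] = max(max_power[d], power)
            d += 1
        if remaining > 1:  # leftover factor is prime
            max_power[remaining] = max(max_power[remaining], 1)
    result = 1
    for prime, power in max_power.items():
        result *= prime ** power
    return result

print(smallest_multiple_factored(20))  # 232792560
```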

This allows us to scale to much larger limits :

At this point, you may be asking why we would need to scale to such large limits. Consider adding fractions with unequal denominators - the first step is to find the lowest common denominator, which is the LCM of the denominators. Now consider adding elements of a series (such as a Taylor series or a Maclaurin series expansion of a function). These are often fractions, and computing the series to desired accuracy may require adding a large number of fractions with different denominators - thus the need to compute the LCM of a large set of numbers.

05

SymPy includes an integer LCM function for an arbitrary number of arguments, so we don't need to roll our own. It's also pretty fast :
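Assuming the function in question is `sympy.ilcm` (SymPy's integer LCM, which accepts an arbitrary number of arguments), the whole computation collapses to one call :

```python
from sympy import ilcm

# ilcm takes any number of integer arguments, so the entire
# range can be unpacked into a single call.
print(ilcm(*range(1, 21)))  # 232792560
```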

06

NumPy also includes an LCM function. However, it only operates on 2 arguments at a time (although, being an array processing library, those arguments can be full arrays rather than just scalars). NumPy ufuncs also provide a "reduce()" method that reduces an array along an axis, repeatedly applying the ufunc to accumulate a single result. This turns out to be even faster than the SymPy version :
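With `np.lcm.reduce`, that reduction looks like this :

```python
import numpy as np

# np.lcm is a binary ufunc; its .reduce method folds it across the array,
# computing lcm(...lcm(lcm(1, 2), 3)..., 20) in one call.
print(np.lcm.reduce(np.arange(1, 21)))  # 232792560
```

Note that the result stays within NumPy's fixed-width integer types, so very large limits could eventually overflow where pure-Python integers would not.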

07

Returning to factoring to find the LCM, but using the fastest factoring we have found so far (pyprimesieve), we do much better than the brute force factoring - but not as well as the libraries :

08

Some research reveals that another way to find the LCM is to use the "greatest common divisor" (GCD) :

\displaystyle lcm(a,b) = \frac{ab}{gcd(a,b)}

This works because the GCD is the product of the prime factors common to both arguments. The product of the two arguments contains two copies of those common factors, along with one copy of the prime factors unique to each argument. Dividing by the GCD cancels one of those copies, leaving the product of the GCD and the factors unique to each argument - which is the LCM. This turns the problem into how to find the GCD, which is another well-studied problem.
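A quick worked example of the identity, using Python's `math.gcd` just for the check :

```python
from math import gcd

# lcm(a, b) = a*b // gcd(a, b)
a, b = 12, 18                     # 12 = 2^2 * 3,  18 = 2 * 3^2
assert gcd(a, b) == 6             # common factors: 2 * 3
assert a * b // gcd(a, b) == 36   # lcm: 2^2 * 3^2
```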

One of the earliest computational algorithms was recorded by Euclid around 300 BC and is called "Euclid's algorithm", although it's believed the algorithm was well known before that time. This algorithm is central to many areas of mathematics, and the original version has been improved upon - the improved version, using modulo arithmetic, is the one shown here :

\displaystyle gcd(a,b) = \begin{dcases}
                           a & b = 0 \\
                           gcd(b, a \bmod b) & b \neq 0
                         \end{dcases}
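A sketch of this approach (the iterative form of Euclid's algorithm; function names are mine) : define `gcd`, derive a two-argument `lcm` from the identity above, and fold it across the range.

```python
from functools import reduce

def gcd(a: int, b: int) -> int:
    """Euclid's algorithm, modulo-arithmetic version: gcd(a, b) = gcd(b, a mod b)."""
    while b != 0:
        a, b = b, a % b
    return a

def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)

# Fold the two-argument lcm across the whole range.
print(reduce(lcm, range(1, 21)))  # 232792560
```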

This results in better times than the best factoring approach, but is still slower than the library versions :

09

The standard library includes a GCD function - perhaps that can be used to improve the GCD version of the code...and it does, but the result is still slower than NumPy :
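Substituting the C-implemented `math.gcd` into the same reduction might look like this (a sketch, not the wiki's original listing) :

```python
from functools import reduce
from math import gcd

# Same fold as before, but with the standard library's gcd doing the work.
print(reduce(lambda a, b: a * b // gcd(a, b), range(1, 21)))  # 232792560
```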

10

Since Python 3.9, the standard library also includes its own LCM function, which is almost as fast as NumPy :
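With `math.lcm`, which accepts any number of arguments, no explicit reduction is needed :

```python
from math import lcm  # requires Python 3.9+

print(lcm(*range(1, 21)))  # 232792560
```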

The big question is how the NumPy version scales on larger limits compared to the math library version :

NumPy is the clear winner - and further exploration shows what appears to be linear scaling even up to a limit of 10 million, running in under 2 seconds.