Halley's method - avachon100501/ultrafractalcollection GitHub Wiki

In numerical analysis, Halley's method is a root-finding algorithm used for functions of one real variable with a continuous second derivative. It is named after its inventor Edmond Halley.

The algorithm is second in the class of Householder's methods, after Newton's method. Like the latter, it iteratively produces a sequence of approximations to the root; their rate of convergence to the root is cubic. Multidimensional versions of this method exist.

Halley's method exactly finds the roots of a linear-over-linear Padé approximation to the function, in contrast to Newton's method or the secant method which approximate the function linearly, or Muller's method which approximates the function quadratically.

History

In January 1684, Edmond Halley, Christopher Wren and Robert Hooke met to discuss planetary motion and the inverse square law. Hooke claimed to have derived it, but was never able to provide evidence of having done so. In August of that year, Halley turned to Isaac Newton, who also claimed to have solved the inverse square law, and who actually was able to provide evidence. Newton then gave Halley a paper called "Of the Motion of Bodies in an Orbit". That paper became the basis for the series of books we call the "Principia", first published on July 5th, 1687. The Principia states Newton's laws of motion, forming the foundation of classical mechanics; Newton's law of universal gravitation; and a derivation of Kepler's laws of planetary motion (which Kepler had first obtained empirically).

In December of 1689, Halley nominated Joseph Raphson to join the Royal Society. Raphson is best known for the Newton-Raphson method, originally published in the book "Analysis Aequationum Universalis" in 1690. Then, in 1692, a French mathematician named Thomas Fantet de Lagny presented a new method for finding cube roots in the book "Méthodes Nouvelles et Abrégées Pour L'extraction et L'approximation des Racines":

a + ab/(3a^3 + b) < cuberoot(a^3 + b) < a/2 + sqrt(a^2/4 + b/(3a)), for a^3 >> b > 0,

which intrigued Halley. In 1694, Halley published "A New Exact & Easy Method of Finding the Roots of Any Equations Generally Without Any Reduction", which we can now refer to as the basis of Halley's method. He built upon the work of de Lagny and Raphson.

Skipping to 1705, Halley published what is probably his most famous work, his synopsis of comets, in "Astronomiae Cometicae Synopsis". In the English translation he writes: "Hitherto I have considered the orbits of comets as exactly parabolic." He then writes: "But since they appear frequently enough, and since none of them can be found to move with an hyperbolic motion, or a motion swifter than what the comet might acquire by its gravity from the Sun, it is highly probable they rather move in very eccentric orbits and make their returns after long periods of time, though the period of its revolution be vastly long. Then whenever a new comet shall appear, we may be able to know whether it be any of those which have appeared before, and consequently to determine its period, the axis of its orbit, and to foretell its return." He then notes three sightings of what was probably the same comet: in 1531, in 1607, and in 1682, when he saw it himself. "Hence I dare venture to foretell, that it will return again in the year 1758." On Christmas of 1758, Johann Georg Palitzsch spotted Halley's Comet. Had Halley lived to see it, he would have been well over one hundred years old; he died in 1741 at the age of 85.

Back in 1712, Brook Taylor wrote a letter to John Machin on the 12th of July. He writes: "While I was thinking of (solving the Kepler Problem) I fell into a general method of applying Dr. Halley's extraction of roots to all polynomials, ... And it is comprehended in this theorem ..." The result is what we now refer to as Taylor's theorem, which will have its own page.

Derivation setup

From: Feigenbaum, Lenore. "Brook Taylor and the method of increments." Archive for history of exact sciences 34.1-2 (1985): 1-140. "If I interpret Taylor's notes correctly, he arrived at [alpha(z) - alpha(p) = alpha'(p)(z - p) + (alpha''(p)/2)(z - p)^2 + (alpha'''(p)/1*2*3)(z - p)^3 + ...] and [alpha(z) - alpha(p) = -alpha'(p)(p - z) - (alpha''(p)/2)(p - z)^2 - (alpha'''(p)/1*2*3)(p - z)^3 - ...] in the following way. He first noticed what Halley had failed to realize before him: that the coefficients in [a generalized version of his equations] are directly related to successive derivatives of the original polynomial."

From: Scavo, T.R., and J.B. Thoo. "On the Geometry of Halley's Method." The American Mathematical Monthly 102.5 (1995): 417-426. "Despite Taylor's achievements, he was unable to provide a general formula for Halley's method. It remained to Schröder, more than one-and-a-half centuries later, to derive Halley's iteration function as we now know it. But Schröder made no reference to Halley. Indeed, Schröder... mentioned Halley's formula in passing... Frame, on the other hand, was the first to derive Halley's iteration function via a second-degree Taylor polynomial... Salehov (in 1952) was apparently the first to suggest that Halley's iterative function could be derived using an osculating rational function of the form ... [h(x) = ((x-xn)+c)/(a(x-xn)+b)] ... In other words Halley's formula can be derived using an osculating hyperbola. Indeed Halley's method is sometimes called the method of tangent hyperbolas..."

Tangent hyperbola derivation

In the figure, the red line is the tangent hyperbola, which gives our new value x2. It may not look like a tangent hyperbola unless we zoom out, or use a different function, where it is clearer:

As explained, the tangent hyperbola is given by this function:

h(x) = ((x - xn) + c)/(a(x - xn) + b),

designed to pass through our point at xn and to be tangent there, using the coefficients

a = -f''(xn)/d, b = 2f'(xn)/d, c = 2f(xn)f'(xn)/d,

with the helper variable

d = 2f'(xn)^2 - f(xn)f''(xn).

From this equation we can actually derive Halley's method. All we really care about for our next value of x is where the hyperbola is zero, so plug in xn+1 and set h(xn+1) = 0:

((xn+1 - xn) + c)/(a(xn+1 - xn) + b) = 0.

For the fraction to vanish, all we really need is for the numerator to equal zero: xn+1 - xn + c = 0. Therefore, if we move the xn and c terms over and substitute what we know for c, this gives us Halley's method,

xn+1 = xn - 2f(xn)f'(xn)/(2f'(xn)^2 - f(xn)f''(xn)),

although this is not the traditional derivation.
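As a sanity check, the construction above can be verified numerically. The following sketch (the helper names are ours, not from any library) builds the hyperbola for f(x) = x^2 - x - 1 at xn = 0 and confirms that its zero is exactly Halley's step:

```python
def hyperbola_step(f, df, ddf, xn):
    """One Halley step via the osculating hyperbola h(x) = ((x-xn)+c)/(a(x-xn)+b)."""
    d = 2 * df(xn)**2 - f(xn) * ddf(xn)       # helper variable
    a = -ddf(xn) / d                          # hyperbola coefficients
    b = 2 * df(xn) / d
    c = 2 * f(xn) * df(xn) / d
    h = lambda x: ((x - xn) + c) / (a * (x - xn) + b)
    assert abs(h(xn) - f(xn)) < 1e-12         # passes through (xn, f(xn))
    assert abs(h(xn - c)) < 1e-12             # the zero of h sits at xn - c
    return xn - c                             # exactly Halley's iterate

f, df, ddf = (lambda x: x**2 - x - 1), (lambda x: 2*x - 1), (lambda x: 2.0)
print(hyperbola_step(f, df, ddf, 0.0))  # -0.5
```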

Derivation from the Taylor series

Let's imagine a line y(x) = a(x - xn) + b, and have that line go through our function at xn with the same derivative there: f(xn) = y(xn) and f'(xn) = y'(xn). That gives

y(x) = f'(xn)(x - xn) + f(xn).

We want the point where this line intersects the x-axis to be our next value of x, so plug in xn+1 and set y(xn+1) = 0:

f'(xn)(xn+1 - xn) + f(xn) = 0.

Move the f term over, f'(xn)(xn+1 - xn) = -f(xn), divide by the derivative term, xn+1 - xn = -f(xn)/f'(xn), then move the x term over, and we get Newton's method:

xn+1 = xn - f(xn)/f'(xn).

Strangely enough, that line equation is actually our first-order Taylor polynomial. What if we instead used the second-order Taylor polynomial,

y(x) = f(xn) + f'(xn)(x - xn) + (f''(xn)/2)(x - xn)^2?

Using the same process, plug in xn+1 and set y(xn+1) = 0:

f(xn) + f'(xn)(xn+1 - xn) + (f''(xn)/2)(xn+1 - xn)^2 = 0.

Then move our f term to the right and factor out (xn+1 - xn):

(xn+1 - xn)(f'(xn) + (f''(xn)/2)(xn+1 - xn)) = -f(xn),

then divide the other factor over:

xn+1 = xn - f(xn)/(f'(xn) + (f''(xn)/2)(xn+1 - xn)).

But we're still left with an xn+1 term on the right side. The way to get rid of it is as follows: recall Newton's method and replace that (xn+1 - xn) with the Newton step -f(xn)/f'(xn):

xn+1 = xn - f(xn)/(f'(xn) - f''(xn)f(xn)/(2f'(xn))).

Multiply numerator and denominator by 2f'(xn) to clean up, and we get Halley's method:

xn+1 = xn - 2f(xn)f'(xn)/(2f'(xn)^2 - f''(xn)f(xn)).
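The derivation can be checked numerically. Below is a minimal sketch (the function names are ours): it computes the step in both the "corrected Newton" form and the traditional single-fraction form, and asserts that they agree.

```python
def halley(f, df, ddf, x, eps=1e-7, max_iter=50):
    """Halley's method; returns the approximate root and the iteration count."""
    for n in range(max_iter):
        fx, dfx, ddfx = f(x), df(x), ddf(x)
        if abs(fx) < eps:
            return x, n
        # Newton step with the second-order correction in the denominator:
        step1 = fx / (dfx - ddfx * fx / (2 * dfx))
        # The traditional single-fraction form of the same step:
        step2 = 2 * fx * dfx / (2 * dfx**2 - fx * ddfx)
        assert abs(step1 - step2) < 1e-9      # same step, just rearranged
        x -= step2
    return x, max_iter

root, iters = halley(lambda x: x**2 - x - 1, lambda x: 2*x - 1, lambda x: 2.0, 0.0)
print(round(root, 6), iters)  # -0.618034 3
```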

Examples

f(x) = x^2 - x - 1, f'(x) = 2x - 1, f''(x) = 2, epsilon = 10^(-7)

n = 0, xn = 0, f(xn) = -1, f'(xn) = -1, f''(xn) = 2

n = 1, xn = -0.5, f(xn) = -0.25, f'(xn) = -2, f''(xn) = 2

n = 2, xn = -0.617647, f(xn) = -0.000865, f'(xn) = -2.235294, f''(xn) = 2

n = 3, xn = -0.618034, f(xn) = -2.6e-11

After three iterations, we arrive at the root x ≈ -0.618034.
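The table above can be reproduced with a short loop (the lambda helpers are ours):

```python
# Trace Halley's method on f(x) = x^2 - x - 1 starting from x0 = 0.
f, df, ddf = (lambda x: x**2 - x - 1), (lambda x: 2*x - 1), (lambda x: 2.0)
x = 0.0
for n in range(4):
    print(f"n = {n}, xn = {x:.6f}, f(xn) = {f(x):.6g}")
    x -= 2 * f(x) * df(x) / (2 * df(x)**2 - f(x) * ddf(x))  # Halley update
```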

f(x) = x^3 - x^2 - x - 1, f'(x) = 3x^2 - 2x - 1, f''(x) = 6x - 2, epsilon = 10^(-7)

n = 0, xn = -2, f(xn) = -11, f'(xn) = 15, f''(xn) = -14

n = 1, xn = -0.88514, f(xn) = -1.5918, f'(xn) = 3.120663, f''(xn) = -7.31081

n = 2, xn = 0.382123, f(xn) = -1.47234, f'(xn) = -1.32619, f''(xn) = 0.292737

n = 3, xn = -0.6069, f(xn) = -0.98496, f'(xn) = 1.318762, f''(xn) = -5.64138

n = 10, xn = 1.8392866, f(xn) = -8.22e-7, f'(xn) = 5.470352, f''(xn) = 9.03572

Compare it with Newton's method:

Newton: n = 0, xn = -2, f(xn) = -11

n = 1, xn = -1.266666667, f(xn) = -3.370074074

n = 2, xn = -0.7356676004, f(xn) = -1.203687539

n = 3, xn = -0.161102914, f(xn) = -0.8690325239

n = 4, xn = -1.609655287, f(xn) = -6.151936

n = 35, xn = 1.839286755, f(xn) = 0

We notice that Newton's method doubles back on itself and takes a very long time to converge: it finds the root after 35 iterations, while Halley's method did so in only 10.
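This comparison is easy to script; a rough sketch (the function names are ours), counting iterations until |f(x)| < 10^-7 starting from x0 = -2:

```python
def newton(f, df, x, eps=1e-7, max_iter=200):
    for n in range(max_iter):
        if abs(f(x)) < eps:
            return x, n
        x -= f(x) / df(x)                               # Newton update
    return x, max_iter

def halley(f, df, ddf, x, eps=1e-7, max_iter=200):
    for n in range(max_iter):
        if abs(f(x)) < eps:
            return x, n
        x -= 2*f(x)*df(x) / (2*df(x)**2 - f(x)*ddf(x))  # Halley update
    return x, max_iter

f   = lambda x: x**3 - x**2 - x - 1
df  = lambda x: 3*x**2 - 2*x - 1
ddf = lambda x: 6*x - 2

xh, nh = halley(f, df, ddf, -2.0)
xn, nn = newton(f, df, -2.0)
print("Halley:", nh, "iterations; Newton:", nn)
```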

One more example, the arctangent function (f(x) = atan(x)).

f(x) = atan(x), f'(x) = 1/(x^2 + 1), f''(x) = -2x/(x^2 + 1)^2, epsilon = 10^(-7)

n = 0, xn = 1.5, f(xn) = 0.98279, f'(xn) = 0.30769, f''(xn) = -0.28402

n = 1, xn = 0.20904, f(xn) = 0.20607, f'(xn) = 0.95813, f''(xn) = -0.38381

n = 2, xn = 0.00284, f(xn) = 0.00284, f'(xn) = 0.99999, f''(xn) = -0.00569

n = 3, xn = 7.7e-9, f(xn) = 7.7e-9

Let's compare it to Newton's method:

In this example, Newton's method diverges, while Halley's method converges. In fact, Halley's method converges in more cases overall and with a faster order: cubic convergence (order 3), versus Newton's quadratic convergence (order 2).
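The contrast can be made visible with a few iterations of each method from x0 = 1.5 (the lambda helpers are ours): Newton's iterate overshoots and grows without bound, while Halley's shrinks toward the root at 0.

```python
import math

f   = math.atan
df  = lambda x: 1 / (x**2 + 1)
ddf = lambda x: -2*x / (x**2 + 1)**2

xn = xh = 1.5
for i in range(5):
    xn -= f(xn) / df(xn)                                   # Newton: |xn| grows
    xh -= 2*f(xh)*df(xh) / (2*df(xh)**2 - f(xh)*ddf(xh))   # Halley: |xh| shrinks
    print(f"n = {i+1}: Newton {xn:12.4g}, Halley {xh:12.4g}")
```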

Fractals rendered using Halley's method

Halley fractal for f(z) = z^3 - 1.

With Halley's method, we have far fewer extrema that would cause the iteration to diverge, and it also converges much more quickly than Newton's method.
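A toy version of such a fractal can be sketched in a few lines of Python (this is our own illustration, not the Ultra Fractal formula): each grid point is iterated with Halley's method for f(z) = z^3 - 1 and marked by which cube root of unity it reaches.

```python
import cmath

ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]  # cube roots of unity

def halley_basin(z, max_iter=50, eps=1e-7):
    """Index of the root of z^3 - 1 reached by Halley's iteration, or -1."""
    for _ in range(max_iter):
        f, df, ddf = z**3 - 1, 3*z**2, 6*z
        denom = 2*df**2 - f*ddf
        if denom == 0:                 # e.g. z = 0: would divide by zero
            return -1
        z -= 2*f*df / denom            # Halley update in the complex plane
        if abs(z) > 1e6:               # treat a runaway orbit as divergent
            return -1
        for i, r in enumerate(ROOTS):
            if abs(z - r) < eps:
                return i
    return -1

# ASCII rendering of the three basins of attraction on [-1.5, 1.5]^2.
for row in range(21):
    y = 1.5 - row * 0.15
    print("".join(".ox?"[halley_basin(complex(-1.5 + col * 0.075, y))]
                  for col in range(41)))
```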

Halley fractal for f(z) = z^2 - z - 1

This fractal uses the first example above. It unfortunately doesn't give any fractal patterns, but it does remind me of the Star Gate sequence from Stanley Kubrick's 2001: A Space Odyssey.

Halley fractal for f(z) = z^3 - z^2 - z - 1

This one uses the function from the second example.

Halley fractal for the arctangent function (f(z) = atan(z))

It is now obvious that Halley's method converges for this function, instead of diverging, which is what Newton's method does.

Notes

  • More complex than Newton's method
  • Requires a twice-differentiable function
  • Cubic convergence order
  • Can still diverge or divide by zero
  • Halley's Comet is due to return in 2061; its last encounter was in 1986
  • Fewer extrema, yielding a less interesting fractal
  • The Nova variant has cubic Mandelbrot sets, related to the convergence order