Shanks Transformation and Wynn's ε-Algorithm

Perturbation Methods

Shanks transformation and Wynn's ε-algorithm are NONLINEAR sequence accelerators — they take a sequence ${S_{0}, S_{1}, S_{2}, \dots}$ and produce a new sequence with the same limit but much faster convergence. Unlike Padé approximants (which work with the power-series coefficients) or Borel-Padé (which tames factorially divergent ones), Shanks takes the PARTIAL SUMS directly as input. That makes it ideal for accelerating slowly convergent series — alternating-series of arithmetic functions, definite integrals approximated by infinite sums, anywhere you have a sequence of approximations and want them to converge faster.

The transformation was discovered by Daniel Shanks in 1955 (writing for the Naval Ordnance Laboratory), and the cheap recursive form was given by Peter Wynn in 1956. Together they give one of the simplest and most useful tools in numerical analysis. Twenty lines of code.

The idea

Suppose a sequence $S_{n}$ approaches its limit $S^{*}$ with a single dominant geometric tail:

S_{n} = S^{*} + A r^{n} + (faster-decaying terms) .

Three consecutive terms $S_{n - 1}, S_{n}, S_{n + 1}$ are three equations in three unknowns $(S^{*}, A, r)$ . Solving for $S^{*}$ :

T [S_{n}] = \frac{S _{n - 1} S _{n + 1} - S _{n}^{2}}{S _{n - 1} - 2 S _{n} + S _{n + 1}} .

This is the Shanks transformation. If the assumed form of the error is exact, $T [S_{n}] = S^{*}$ EXACTLY at every $n$ . If the error has multiple geometric components or polynomial corrections, $T [S_{n}]$ kills the leading one and leaves a smaller residual. Iterate: feed the transformed sequence back through $T$ , killing the next-leading error each time.

The denominator $S_{n - 1} - 2 S_{n} + S_{n + 1}$ is exactly the discrete second difference $Δ^{2} S_{n - 1}$ . When the sequence is smooth, the second difference is small — and the transformation amplifies the cancellations in the numerator. This is also why Shanks can lose precision in floating point: differencing nearly equal numbers loses significant digits. Worth running in extended precision for very accurate work.

Wynn's ε-algorithm

Repeated Shanks transformations involve forming products of three terms, taking second differences, and dividing. Doing this naively for the $k$ -fold accelerated sequence requires $O (k N)$ operations and storage. Wynn (1956) discovered a much cleaner recursion that produces the same numerical values via simple rhomboid updates.

Define a table $ε_{k}^{(n)}$ with two boundary conditions:

ε_{- 1}^{(n)} = 0, ε_{0}^{(n)} = S_{n},

and the rhomboid recursion

ε_{k + 1}^{(n)} = ε_{k - 1}^{(n + 1)} + \frac{1}{ε _{k}^{(n + 1)} - ε _{k}^{(n)}} .

Wynn's THEOREM: the even-numbered entries $ε_{2 k}^{(n)}$ are exactly the $k$ -fold Shanks transforms of the original sequence starting at index $n$ . The odd-numbered entries are auxiliary "carrier" values — finite, but not directly meaningful as approximations to the limit. The $ε_{2 k}^{(0)}$ column gives a sequence of accelerations of the starting sequence: $ε_{2}^{(0)}$ is one Shanks transform, $ε_{4}^{(0)}$ is two, and so on.

The algorithm uses only the previous two columns of the table at any time, so storage is $O (N)$ total even for high-order acceleration. No division by a polynomial, no linear system to solve — just simple subtractions and reciprocals. It is one of the cheapest non-trivial algorithms in numerical analysis.

Code

# Shanks transformation: nonlinear sequence acceleration.
# Given three consecutive terms S_{n-1}, S_n, S_{n+1} of a sequence,
# define
#     T[S_n] = (S_{n-1} * S_{n+1} - S_n^2) / (S_{n-1} - 2 S_n + S_{n+1}).
# T eliminates the leading geometric-decay error and produces a new
# sequence converging faster than the original.
#
# Wynn's epsilon algorithm is a rhomboid recursion that REPEATEDLY
# applies Shanks without ever forming the products of three S_n
# explicitly. Storage and arithmetic stay linear.

import numpy as np

def shanks(S):
    """One pass of the Shanks transformation. Input: length-N sequence.
    Output: length-(N-2) accelerated sequence."""
    S = np.asarray(S, dtype=float)
    T = np.empty(len(S) - 2)
    for n in range(1, len(S) - 1):
        denom = S[n-1] - 2*S[n] + S[n+1]
        T[n-1] = (S[n-1]*S[n+1] - S[n]**2) / denom if denom != 0 else S[n]
    return T

def wynn_epsilon(S):
    """Wynn's epsilon algorithm. Returns the full triangular table.
    Convention: eps[0] holds eps_{-1} (all zeros); eps[1] holds eps_0
    (the input sequence). The k-fold Shanks acceleration of S_0 is
    eps_{2k}^(0), which lives at table index [2k+1][0]."""
    N = len(S)
    eps = [[0.0] * (N + 2), list(S) + [0.0]]
    for k in range(1, N):
        row = [0.0] * (N - k + 1)
        for j in range(N - k):
            d = eps[-1][j + 1] - eps[-1][j]
            row[j] = eps[-2][j + 1] + (1.0 / d if abs(d) > 1e-300 else 0.0)
        eps.append(row)
    return eps

def shanks_n(S, k):
    """k-fold Shanks acceleration of S_0 via Wynn's epsilon."""
    eps = wynn_epsilon(S)
    return eps[2*k + 1][0] if 2*k + 1 < len(eps) else None

# ─── Example 1: pi/4 = sum (-1)^n / (2n+1) ──────────────────────────────
S_pi = np.cumsum([(-1)**n / (2*n + 1) for n in range(20)])
print("pi/4 = sum (-1)^n / (2n+1)  (slowly convergent alternating):")
print(f"  Raw S_5            = {S_pi[5]:.10f}  err {abs(S_pi[5] - np.pi/4):.2e}")
print(f"  Raw S_10           = {S_pi[10]:.10f}  err {abs(S_pi[10] - np.pi/4):.2e}")
print(f"  Raw S_20           = {S_pi[-1]:.10f}  err {abs(S_pi[-1] - np.pi/4):.2e}")
for k in range(1, 6):
    v = shanks_n(list(S_pi), k)
    print(f"  Shanks^{k} (eps_{{{2*k}}}^(0)) = {v:.10f}  err {abs(v - np.pi/4):.2e}")

# ─── Example 2: ln 2 = sum (-1)^{n+1}/n ─────────────────────────────────
print("\nln(2) = sum (-1)^{n+1}/n:")
S_ln2 = np.cumsum([(-1)**(n+1) / n for n in range(1, 21)])
print(f"  Raw S_20 = {S_ln2[-1]:.10f}  err {abs(S_ln2[-1] - np.log(2)):.2e}")
for k in range(1, 6):
    v = shanks_n(list(S_ln2), k)
    print(f"  Shanks^{k} = {v:.10f}  err {abs(v - np.log(2)):.2e}")

Output on two classical alternating series:

pi/4 = sum (-1)^n / (2n+1)  (slowly convergent alternating):
  Raw S_5            = 0.7440115440  err 4.14e-02
  Raw S_10           = 0.8080789524  err 2.27e-02
  Raw S_20           = 0.7729059517  err 1.25e-02
  Shanks^1 (eps_{2}^(0)) = 0.7916666667  err 6.27e-03
  Shanks^2 (eps_{4}^(0)) = 0.7855855856  err 1.87e-04
  Shanks^3 (eps_{6}^(0)) = 0.7854037267  err 5.56e-06
  Shanks^4 (eps_{8}^(0)) = 0.7853983280  err 1.65e-07
  Shanks^5 (eps_{10}^(0)) = 0.7853981683  err 4.86e-09

ln(2) = sum (-1)^{n+1}/n:
  Raw S_20 = 0.6687714032  err 2.44e-02
  Shanks^1 = 0.7000000000  err 6.85e-03
  Shanks^2 = 0.6933333333  err 1.86e-04
  Shanks^3 = 0.6931524548  err 5.27e-06
  Shanks^4 = 0.6931473324  err 1.52e-07
  Shanks^5 = 0.6931471850  err 4.40e-09

Read across the rows. The Leibniz series for $π /4$ is famously slow — after twenty terms the partial sum is still off by $1 0^{- 2}$ , accurate to less than two digits. ONE Shanks transform improves this to $6 \times 1 0^{- 3}$ ; TWO transforms gives $2 \times 1 0^{- 4}$ ; five iterations of Shanks (using the same twenty terms) gives $5 \times 1 0^{- 9}$ — nine digits of accuracy from data that natively delivers fewer than two. Each Shanks transform earns roughly two more digits. The $ln 2$ series behaves identically: 2.4 × 10⁻² raw error, 4.4 × 10⁻⁹ after five accelerations.

What Shanks is doing geometrically

The simplest model: $S_{n} = S^{*} + A r^{n}$ , a single geometric correction with ratio $r$ . Shanks transforms $S_{n}$ by EXTRAPOLATING the geometric decay: it assumes the next three terms fit this functional form and analytically solves for $S^{*}$ as if you had infinitely many. The transformation is the analogue of computing the sum of an infinite geometric series from the first few terms.

Iterated Shanks generalizes this. Two transformations effectively model a sum of TWO geometric components $A_{1} r_{1}^{n} + A_{2} r_{2}^{n}$ ; three transformations handle three; and so on. Each iteration "subtracts off" the next geometric decay rate. For alternating series — where the error has both a smooth and an oscillating component — Shanks finds and removes both, which is why the convergence acceleration is so dramatic.

The connection to Padé approximants is not coincidental: Wynn's epsilon algorithm on the partial sums of a power series produces exactly the diagonal Padé approximants of that series, evaluated at the appropriate point. Shanks/Wynn and Padé are the same construction from two different starting points (sequences vs power series).

When Shanks fails

The sequence is already converging too fast. If $S_{n}$ converges geometrically with very small ratio (or super-geometrically), Shanks can introduce noise from differencing nearly-equal numbers. The geometric series $\sum (1/2)^{n} = 2$ is "solved" exactly by one Shanks transform; applying it again amplifies floating-point round-off rather than improving the answer.
Logarithmic convergence. Sequences with $S_{n} = S^{*} + A / lo g n$ have error not of geometric form. Shanks does little for them; you need Levin-type transformations instead.
Asymptotic series. Shanks/Wynn applied to the partial sums of an asymptotic divergent series doesn't help past optimal truncation. You need Borel-Padé on the coefficients, not Shanks on the partial sums.

What it's good for

Concrete applications, in order of how often you see them in numerical work:

Slowly convergent alternating series. Series like the Leibniz formula, log-related sums, certain Fourier-series partial sums. Shanks gets you to machine precision from far fewer terms than would otherwise be needed.
Numerical integration. Apply to the Romberg extrapolation table for trapezoidal-rule estimates of an integral. Shanks's nonlinear extrapolation is sometimes more robust than linear Richardson when the error structure isn't pure polynomial-in-h.
Fixed-point iteration. Apply to the iterates of a fixed-point map $x_{n + 1} = f (x_{n})$ . Shanks (the original 1955 paper called this "Aitken-Shanks") routinely converts linearly convergent iterations into super-linearly convergent ones.
Eigenvalue / iteration convergence. The successive approximations to an eigenvalue produced by power iteration or by long iterative refinement of a discretization can be accelerated. Three Lanczos-iteration eigenvalue approximations, fed through Shanks, often improve accuracy by several orders.
Partial sums of asymptotic expansions BEFORE optimal truncation. If the series is asymptotic but you're still in the regime where partial sums improve, Shanks accelerates that improvement; it gives up at optimal truncation just like the raw series does.

Padé approximants — applied to power-series coefficients rather than partial sums. The two methods coincide for power-series cases.
Borel-Padé — the right tool for divergent series, where Shanks doesn't help.
Variational perturbation theory — a third strategy for taming divergent series, useful at strong coupling.