Shanks Transformation and Wynn's ε-Algorithm
Perturbation Methods
Shanks transformation and Wynn's ε-algorithm are NONLINEAR sequence accelerators — they take a sequence and produce a new sequence with the same limit but much faster convergence. Unlike Padé approximants (which work with the power-series coefficients) or Borel-Padé (which tames factorially divergent ones), Shanks takes the PARTIAL SUMS directly as input. That makes it ideal for accelerating slowly convergent series — alternating-series of arithmetic functions, definite integrals approximated by infinite sums, anywhere you have a sequence of approximations and want them to converge faster.
The transformation was discovered by Daniel Shanks in 1955 (writing for the Naval Ordnance Laboratory), and the cheap recursive form was given by Peter Wynn in 1956. Together they give one of the simplest and most useful tools in numerical analysis. Twenty lines of code.
The idea
Suppose a sequence approaches its limit with a single dominant geometric tail:
Three consecutive terms are three equations in three unknowns . Solving for :
This is the Shanks transformation. If the assumed form of the error is exact, EXACTLY at every . If the error has multiple geometric components or polynomial corrections, kills the leading one and leaves a smaller residual. Iterate: feed the transformed sequence back through , killing the next-leading error each time.
The denominator is exactly the discrete second difference . When the sequence is smooth, the second difference is small — and the transformation amplifies the cancellations in the numerator. This is also why Shanks can lose precision in floating point: differencing nearly equal numbers loses significant digits. Worth running in extended precision for very accurate work.
Wynn's ε-algorithm
Repeated Shanks transformations involve forming products of three terms, taking second differences, and dividing. Doing this naively for the -fold accelerated sequence requires operations and storage. Wynn (1956) discovered a much cleaner recursion that produces the same numerical values via simple rhomboid updates.
Define a table with two boundary conditions:
and the rhomboid recursion
Wynn's THEOREM: the even-numbered entries are exactly the -fold Shanks transforms of the original sequence starting at index . The odd-numbered entries are auxiliary "carrier" values — finite, but not directly meaningful as approximations to the limit. The column gives a sequence of accelerations of the starting sequence: is one Shanks transform, is two, and so on.
The algorithm uses only the previous two columns of the table at any time, so storage is total even for high-order acceleration. No division by a polynomial, no linear system to solve — just simple subtractions and reciprocals. It is one of the cheapest non-trivial algorithms in numerical analysis.
Code
# Shanks transformation: nonlinear sequence acceleration.
# Given three consecutive terms S_{n-1}, S_n, S_{n+1} of a sequence,
# define
# T[S_n] = (S_{n-1} * S_{n+1} - S_n^2) / (S_{n-1} - 2 S_n + S_{n+1}).
# T eliminates the leading geometric-decay error and produces a new
# sequence converging faster than the original.
#
# Wynn's epsilon algorithm is a rhomboid recursion that REPEATEDLY
# applies Shanks without ever forming the products of three S_n
# explicitly. Storage and arithmetic stay linear.
import numpy as np
def shanks(S):
"""One pass of the Shanks transformation. Input: length-N sequence.
Output: length-(N-2) accelerated sequence."""
S = np.asarray(S, dtype=float)
T = np.empty(len(S) - 2)
for n in range(1, len(S) - 1):
denom = S[n-1] - 2*S[n] + S[n+1]
T[n-1] = (S[n-1]*S[n+1] - S[n]**2) / denom if denom != 0 else S[n]
return T
def wynn_epsilon(S):
"""Wynn's epsilon algorithm. Returns the full triangular table.
Convention: eps[0] holds eps_{-1} (all zeros); eps[1] holds eps_0
(the input sequence). The k-fold Shanks acceleration of S_0 is
eps_{2k}^(0), which lives at table index [2k+1][0]."""
N = len(S)
eps = [[0.0] * (N + 2), list(S) + [0.0]]
for k in range(1, N):
row = [0.0] * (N - k + 1)
for j in range(N - k):
d = eps[-1][j + 1] - eps[-1][j]
row[j] = eps[-2][j + 1] + (1.0 / d if abs(d) > 1e-300 else 0.0)
eps.append(row)
return eps
def shanks_n(S, k):
"""k-fold Shanks acceleration of S_0 via Wynn's epsilon."""
eps = wynn_epsilon(S)
return eps[2*k + 1][0] if 2*k + 1 < len(eps) else None
# ─── Example 1: pi/4 = sum (-1)^n / (2n+1) ──────────────────────────────
S_pi = np.cumsum([(-1)**n / (2*n + 1) for n in range(20)])
print("pi/4 = sum (-1)^n / (2n+1) (slowly convergent alternating):")
print(f" Raw S_5 = {S_pi[5]:.10f} err {abs(S_pi[5] - np.pi/4):.2e}")
print(f" Raw S_10 = {S_pi[10]:.10f} err {abs(S_pi[10] - np.pi/4):.2e}")
print(f" Raw S_20 = {S_pi[-1]:.10f} err {abs(S_pi[-1] - np.pi/4):.2e}")
for k in range(1, 6):
v = shanks_n(list(S_pi), k)
print(f" Shanks^{k} (eps_{{{2*k}}}^(0)) = {v:.10f} err {abs(v - np.pi/4):.2e}")
# ─── Example 2: ln 2 = sum (-1)^{n+1}/n ─────────────────────────────────
print("\nln(2) = sum (-1)^{n+1}/n:")
S_ln2 = np.cumsum([(-1)**(n+1) / n for n in range(1, 21)])
print(f" Raw S_20 = {S_ln2[-1]:.10f} err {abs(S_ln2[-1] - np.log(2)):.2e}")
for k in range(1, 6):
v = shanks_n(list(S_ln2), k)
print(f" Shanks^{k} = {v:.10f} err {abs(v - np.log(2)):.2e}") Output on two classical alternating series:
pi/4 = sum (-1)^n / (2n+1) (slowly convergent alternating):
Raw S_5 = 0.7440115440 err 4.14e-02
Raw S_10 = 0.8080789524 err 2.27e-02
Raw S_20 = 0.7729059517 err 1.25e-02
Shanks^1 (eps_{2}^(0)) = 0.7916666667 err 6.27e-03
Shanks^2 (eps_{4}^(0)) = 0.7855855856 err 1.87e-04
Shanks^3 (eps_{6}^(0)) = 0.7854037267 err 5.56e-06
Shanks^4 (eps_{8}^(0)) = 0.7853983280 err 1.65e-07
Shanks^5 (eps_{10}^(0)) = 0.7853981683 err 4.86e-09
ln(2) = sum (-1)^{n+1}/n:
Raw S_20 = 0.6687714032 err 2.44e-02
Shanks^1 = 0.7000000000 err 6.85e-03
Shanks^2 = 0.6933333333 err 1.86e-04
Shanks^3 = 0.6931524548 err 5.27e-06
Shanks^4 = 0.6931473324 err 1.52e-07
Shanks^5 = 0.6931471850 err 4.40e-09 Read across the rows. The Leibniz series for is famously slow — after twenty terms the partial sum is still off by , accurate to less than two digits. ONE Shanks transform improves this to ; TWO transforms gives ; five iterations of Shanks (using the same twenty terms) gives — nine digits of accuracy from data that natively delivers fewer than two. Each Shanks transform earns roughly two more digits. The series behaves identically: 2.4 × 10⁻² raw error, 4.4 × 10⁻⁹ after five accelerations.
What Shanks is doing geometrically
The simplest model: , a single geometric correction with ratio . Shanks transforms by EXTRAPOLATING the geometric decay: it assumes the next three terms fit this functional form and analytically solves for as if you had infinitely many. The transformation is the analogue of computing the sum of an infinite geometric series from the first few terms.
Iterated Shanks generalizes this. Two transformations effectively model a sum of TWO geometric components ; three transformations handle three; and so on. Each iteration "subtracts off" the next geometric decay rate. For alternating series — where the error has both a smooth and an oscillating component — Shanks finds and removes both, which is why the convergence acceleration is so dramatic.
The connection to Padé approximants is not coincidental: Wynn's epsilon algorithm on the partial sums of a power series produces exactly the diagonal Padé approximants of that series, evaluated at the appropriate point. Shanks/Wynn and Padé are the same construction from two different starting points (sequences vs power series).
When Shanks fails
- The sequence is already converging too fast. If converges geometrically with very small ratio (or super-geometrically), Shanks can introduce noise from differencing nearly-equal numbers. The geometric series is "solved" exactly by one Shanks transform; applying it again amplifies floating-point round-off rather than improving the answer.
- Logarithmic convergence. Sequences with have error not of geometric form. Shanks does little for them; you need Levin-type transformations instead.
- Asymptotic series. Shanks/Wynn applied to the partial sums of an asymptotic divergent series doesn't help past optimal truncation. You need Borel-Padé on the coefficients, not Shanks on the partial sums.
What it's good for
Concrete applications, in order of how often you see them in numerical work:
- Slowly convergent alternating series. Series like the Leibniz formula, log-related sums, certain Fourier-series partial sums. Shanks gets you to machine precision from far fewer terms than would otherwise be needed.
- Numerical integration. Apply to the Romberg extrapolation table for trapezoidal-rule estimates of an integral. Shanks's nonlinear extrapolation is sometimes more robust than linear Richardson when the error structure isn't pure polynomial-in-h.
- Fixed-point iteration. Apply to the iterates of a fixed-point map . Shanks (the original 1955 paper called this "Aitken-Shanks") routinely converts linearly convergent iterations into super-linearly convergent ones.
- Eigenvalue / iteration convergence. The successive approximations to an eigenvalue produced by power iteration or by long iterative refinement of a discretization can be accelerated. Three Lanczos-iteration eigenvalue approximations, fed through Shanks, often improve accuracy by several orders.
- Partial sums of asymptotic expansions BEFORE optimal truncation. If the series is asymptotic but you're still in the regime where partial sums improve, Shanks accelerates that improvement; it gives up at optimal truncation just like the raw series does.
Related
- Padé approximants — applied to power-series coefficients rather than partial sums. The two methods coincide for power-series cases.
- Borel-Padé — the right tool for divergent series, where Shanks doesn't help.
- Variational perturbation theory — a third strategy for taming divergent series, useful at strong coupling.