Lanczos Iteration

Linear Algebra

Symmetric Lanczos is the Hermitian special case of Arnoldi iteration — and the simplification it brings is one of the cleanest payoffs in numerical linear algebra. When , the upper-Hessenberg projection produced by Arnoldi collapses to a TRIDIAGONAL matrix, and the orthogonalisation against all previous Krylov vectors collapses to a THREE-TERM recursion against only the two most recent. Cost per step drops from to ; storage drops from to . That collapse is what makes Lanczos the workhorse for extremal eigenvalues of symmetric sparse matrices and for evaluating on systems too large to factor.

Why three terms

Build an orthonormal basis of the Krylov subspace . By orthogonality, the new vector expands as with . For Hermitian we can use . Both and live in the Krylov subspace of dimension , so the inner product vanishes whenever . Combining with the standard Arnoldi argument ( for ) gives:

So only THREE coefficients per column are non-zero: the diagonal and the two off-diagonal entries (the latter two equal by symmetry). The matrix is tridiagonal:

And the recursion that builds the basis is:

Rearranging, . Computing the next basis vector requires only the current and previous one — never the older ones. That's the famous "three-term recursion" and the reason Lanczos's memory footprint is constant in the chain length.

The algorithm

# Symmetric Lanczos — the Hermitian special case of Arnoldi.
# Three-term recursion in place of full Gram-Schmidt:
#    A q_j = beta_{j-1} q_{j-1} + alpha_j q_j + beta_j q_{j+1}
# Output: Q (n x m) with orthonormal columns, and tridiag T with diagonal
# alpha and off-diagonal beta:
#    Q^T A Q = T,  with T = tridiag(beta, alpha, beta).
# Crucially, only THREE vectors (q_prev, q, q_next) need to be in memory
# at once — independent of how long the chain runs.

import numpy as np

def lanczos(A, b, m):
    n = A.shape[0]
    Q = np.zeros((n, m))
    alphas = np.zeros(m)
    betas  = np.zeros(m - 1)
    q = b / np.linalg.norm(b)
    q_prev = np.zeros(n)
    beta = 0.0
    for j in range(m):
        Q[:, j] = q
        v = A @ q - beta * q_prev          # subtract the j-1 direction
        alpha = q @ v                      # diagonal entry
        alphas[j] = alpha
        v = v - alpha * q                  # subtract the j direction
        if j < m - 1:
            beta = np.linalg.norm(v)
            betas[j] = beta
            if beta < 1e-12:
                return Q[:, :j+1], alphas[:j+1], betas[:j]
            q_prev = q
            q = v / beta
    return Q, alphas, betas

# ─── Validation on a symmetric, well-conditioned matrix ─────────────────
np.random.seed(0)
n, m = 40, 15
A = np.random.randn(n, n)
A = (A + A.T) / 2 + n * np.eye(n)        # symmetric, SPD by construction
b = np.random.randn(n)

Q, alphas, betas = lanczos(A, b, m)
T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)

print(f"max |Q^T Q - I|   = {np.max(np.abs(Q.T @ Q - np.eye(m))):.2e}")
print(f"max |Q^T A Q - T| = {np.max(np.abs(Q.T @ A @ Q - T)):.2e}")

ritz = np.sort(np.linalg.eigvalsh(T))
true = np.sort(np.linalg.eigvalsh(A))
print(f"Smallest:  Ritz {ritz[0]:.5f}    true {true[0]:.5f}    diff {abs(ritz[0]-true[0]):.2e}")
print(f"Largest:   Ritz {ritz[-1]:.5f}    true {true[-1]:.5f}    diff {abs(ritz[-1]-true[-1]):.2e}")

# Demonstration: loss of orthogonality at long chain length
Q_long, _, _ = lanczos(A, b, m=35)
print(f"After m=35 iters: max |Q^T Q - I| = {np.max(np.abs(Q_long.T @ Q_long - np.eye(35))):.2e}")

Output on a 40×40 symmetric positive-definite matrix with :

max |Q^T Q - I|   = 2.46e-13
max |Q^T A Q - T| = 8.87e-12
Smallest:  Ritz 30.88945    true 30.88945    diff 7.68e-07
Largest:   Ritz 48.20060    true 48.20061    diff 1.28e-05
After m=35 iters: max |Q^T Q - I| = 4.10e-03

Three things to read off. (1) Orthogonality holds to and the tridiagonal relation to — both at the level of typical accumulated round-off. (2) The smallest and largest Ritz values (eigenvalues of ) match the true extremal eigenvalues of to many decimal places after just 15 iterations on a 40-dim problem. This rapid convergence at the SPECTRAL EDGES is the central practical strength of Lanczos for symmetric eigenproblems. (3) The third number reveals the famous gotcha.

Loss of orthogonality

Extending the chain to 35 iterations, the residual orthogonality error climbs to — eleven orders of magnitude worse than at iteration 15. This is not a bug in the algorithm; it is intrinsic to the no-re-orthogonalisation Lanczos recursion in finite precision. As Ritz values converge to true eigenvalues, the directions they correspond to get amplified in the basis. Tiny round-off errors that point along already-converged directions are not subtracted by the three-term recursion (since those directions only show up in ), so they accumulate.

The effect was identified by Paige (1971-1976) and analyzed in detail by Cullum and Willoughby. The pathology in the eigenvalue problem is the production of "GHOST" Ritz values: spurious duplicate eigenvalues that appear when a converged Ritz value is re-discovered along a drifted basis vector. Distinguishing ghosts from genuine multiplicities requires either re-orthogonalisation (expensive, restores Arnoldi cost) or post-hoc filtering (Cullum-Willoughby ghost test). For most extremal-eigenvalue applications a small partial re-orthogonalisation against converged Ritz vectors is enough.

For the resolvent / continued-fraction applications (see bi-orthogonal Lanczos and Liouville-Lanczos), loss of orthogonality is essentially benign: ghost eigenvalues contribute zero residue to the matrix element you actually care about. The chain coefficients are still the continued-fraction coefficients of the true resolvent.

What it's used for

Related methods