Bi-Orthogonal Lanczos
Linear Algebra
The bi-orthogonal (or "two-sided") Lanczos algorithm tridiagonalises a NON-Hermitian matrix. Standard symmetric Lanczos relies on to truncate the Krylov projection to three terms; that property doesn't hold for general operators. The bi-orthogonal variant restores a three-term recursion by carrying TWO Krylov sequences — one driven by , one driven by — and enforcing duality between them. The output is a (non-symmetric) tridiagonal matrix that represents in a bi-orthogonal basis.
The motivating use case is resolvents of non-Hermitian operators: as a function of . This matrix element shows up everywhere — Green's functions in many-body physics, response functions in linear-response TDDFT, transfer-matrix calculations, transient response of non-self-adjoint linearized PDEs. Bi-orthogonal Lanczos turns the resolvent into a CONTINUED FRACTION whose coefficients are exactly the entries of the tridiagonal projection — no eigendecomposition required.
Why two sequences
For a Hermitian matrix, the Krylov subspace has a natural orthonormal basis under the standard inner product, and a SYMMETRIC three-term recursion suffices. For a non-Hermitian , you can still build and orthonormalise it (that's Arnoldi) — but the result is a HESSENBERG matrix, not tridiagonal, because there is no reason for to vanish when unless is Hermitian under the chosen inner product.
Bi-orthogonal Lanczos sidesteps Arnoldi's Hessenberg cost. Build TWO Krylov subspaces:
and choose bases for the first and for the second so that they are DUAL:
This is BI-ORTHOGONALITY — not orthogonality in the usual sense. The two sets are not individually orthonormal; only their pairwise inner products vanish. Once the dual basis is in hand, the matrix (with and ) turns out to be TRIDIAGONAL — and the off-diagonal entries are non-zero only between adjacent levels, because matrix elements vanish when by the dual recursion structure. That is the bi-orthogonal analogue of the symmetric Lanczos three-term collapse, and it is the algorithmic payoff for maintaining two sequences instead of one.
The recursion
Assume the bases satisfy:
Three remarks. (i) Both recursions share the diagonal coefficient — they have to, because appears as . (ii) The off-diagonal coefficients are TWO different families, and , swapped between the right and left recursions. This is forced by bi-orthogonality, as can be checked: take of the first equation and of the second; the matching imposes that shows up in the recursion's coefficient and shows up in the recursion's coefficient. (iii) The PRODUCT is determined by the residuals; how you SPLIT it between and individually is a gauge choice (you can pick for the symmetric split, or any other consistent choice).
The bookkeeping at each iteration:
- Compute — this is one diagonal entry of .
- Form right and left residuals:
- Compute their bi-orthogonal product .
- Split the product (gauge choice — usually , ).
- Normalize the new basis vectors: , .
Three vectors of storage per side ( and the analogous 's), one application each of and per iteration. The cost per step is the same order as symmetric Lanczos, just doubled: two mat-vecs instead of one, and two stored Krylov vectors instead of one.
The algorithm
# Bi-orthogonal Lanczos on a (generic, non-Hermitian) matrix A.
#
# Inputs: A (n x n), v0 (right seed), w0 (left seed), n_iter
# Outputs: alphas, betas, gammas — tridiagonal coefficients
# V, W (n x n_iter) — right and left Krylov bases
# seed = <w0 | v0> — overall scaling factor
#
# Bi-orthogonality: W^T V = I (the columns are dual bases of the Krylov
# subspaces of A acting on v0 and A^T acting on w0).
# Tridiagonal projection: T = W^T A V has
# T[j, j] = alpha_j (diagonal)
# T[j+1, j] = beta_{j+1} (sub-diagonal — built into the q-recursion)
# T[j-1, j] = gamma_j (super-diagonal — built into the p-recursion)
# In particular, T is NOT symmetric; that is the whole point.
import numpy as np
def biortho_lanczos(A, v0, w0, n_iter):
seed = np.dot(w0, v0)
if abs(seed) < 1e-14:
raise ValueError("Seed vectors orthogonal — pick a different start.")
scale = np.sqrt(seed + 0j) # complex sqrt handles seed < 0 cleanly
q = (v0 / scale).astype(complex)
p = (w0 / scale).astype(complex) # <p | q> = 1 by construction
n = A.shape[0]
V = np.zeros((n, n_iter), dtype=complex)
W = np.zeros((n, n_iter), dtype=complex)
q_prev = np.zeros_like(q)
p_prev = np.zeros_like(p)
beta_prev = gamma_prev = 0.0 + 0.0j
alphas, betas, gammas = [], [], []
for j in range(n_iter):
V[:, j] = q
W[:, j] = p
Aq = A @ q
ATp = A.T @ p
# Bi-orthogonal projection: <p_j | A q_j> = alpha_j
alpha = np.dot(p, Aq)
alphas.append(alpha)
# Three-term residuals: project out the components already in
# span{q_j, q_{j-1}} and span{p_j, p_{j-1}}.
r = Aq - alpha * q - gamma_prev * q_prev
s = ATp - alpha * p - beta_prev * p_prev
rs = np.dot(s, r) # = beta_{j+1} * gamma_{j+1}
if abs(rs) < 1e-13:
# Either lucky breakdown (invariant subspace found) or serious
# breakdown (the bi-orthogonal recursion fails). For a small
# demo we just stop; production code uses a look-ahead variant.
break
beta = np.sqrt(rs + 0j)
gamma = rs / beta # any split with beta * gamma = rs works
betas.append(beta)
gammas.append(gamma)
q_prev, p_prev = q, p
q = r / beta
p = s / gamma
beta_prev, gamma_prev = beta, gamma
return (np.array(alphas, dtype=complex),
np.array(betas, dtype=complex),
np.array(gammas, dtype=complex),
V, W, seed) What the output represents
After iterations you have:
- with — basis of .
- with — basis of .
- The identity (bi-orthogonality, to machine precision).
- The tridiagonal projection :
Three uses of , in increasing order of how much of the spectrum you need:
1. Extremal eigenvalues / Ritz values. The eigenvalues of (called Ritz values when seen as approximations to those of ) converge to the extreme eigenvalues of very rapidly as grows. Same idea as symmetric Lanczos for ground states, just on a non-Hermitian operator. Be warned: Ritz values can be complex even for a real , and they can MISS eigenvalues with small overlap to the seed.
2. Action of a function of . For any function regular on the spectrum of :
Why this is true: when and span a Krylov subspace large enough to capture the spectral content that probes. Then the matrix element . The most important specialization is — the resolvent.
3. Resolvent matrix elements as continued fractions. The (0,0) entry of the inverse of a tridiagonal matrix is a CONTINUED FRACTION in its entries. For with diagonal and off-diagonal products :
This is the most useful property of the algorithm. The Lanczos coefficients ARE the continued-fraction coefficients of the resolvent matrix element. After one chain of length you can evaluate the resolvent at any for work per point. That is exactly what makes Liouville-Lanczos linear-response TDDFT tractable for solids.
Validation
Run on a generic random non-Hermitian matrix and verify the three structural claims: bi-orthogonality, tridiagonal projection, and continued-fraction resolvent.
# Validate the algorithm against direct linear algebra:
# (1) bi-orthogonality W^T V = I,
# (2) tridiagonal projection W^T A V = tridiag(alpha, beta, gamma),
# (3) resolvent <w0|(A - z)^{-1}|v0> = seed * [(T - z)^{-1}]_{00}
# evaluated as a continued fraction in the Lanczos coefficients.
import numpy as np
np.random.seed(7)
n = 10
A = np.random.randn(n, n) # generic non-Hermitian matrix
v0 = np.random.randn(n)
w0 = np.random.randn(n)
alphas, betas, gammas, V, W, seed = biortho_lanczos(A, v0, w0, n_iter=n)
# (1) Bi-orthogonality
print(f"max |W^T V - I| = {np.max(np.abs(W.T @ V - np.eye(n))):.2e}")
# (2) Tridiagonal projection
T_proj = W.T @ A @ V
T_built = (np.diag(alphas)
+ np.diag(betas, -1)
+ np.diag(gammas, +1))
print(f"max |W^T A V - tridiag| = {np.max(np.abs(T_proj - T_built)):.2e}")
# (3) Continued-fraction resolvent
def resolvent_lanczos(alphas, betas, gammas, seed, omega, eta=0.05):
z = omega + 1j * eta
cf = 0.0 + 0.0j
# Bottom-up: at level j the off-diagonal product is beta_j * gamma_j,
# stored at array index j-1.
for j in range(len(alphas) - 1, 0, -1):
cf = (betas[j-1] * gammas[j-1]) / ((alphas[j] - z) - cf)
cf = 1.0 / ((alphas[0] - z) - cf)
return seed * cf
def resolvent_direct(A, v0, w0, omega, eta=0.05):
z = omega + 1j * eta
return w0 @ np.linalg.solve(A - z * np.eye(A.shape[0]),
v0.astype(complex))
print("\nResolvent <w0|(A - omega - i*eta)^{-1}|v0>:")
for w in np.linspace(-2.0, 2.0, 5):
rd = resolvent_direct(A, v0, w0, w)
rl = resolvent_lanczos(alphas, betas, gammas, seed, w)
print(f" omega = {w:+.2f} direct = ({rd.real:+.5f}, {rd.imag:+.5f})"
f" Lanczos = ({rl.real:+.5f}, {rl.imag:+.5f})"
f" |diff| = {abs(rd - rl):.1e}") Output:
max |W^T V - I| = 1.81e-12
max |W^T A V - tridiag| = 8.47e-12
Resolvent <w0|(A - omega - i*eta)^{-1}|v0>:
omega = -2.00 direct = (+1.12493, +0.07443) Lanczos = (+1.12493, +0.07443) |diff| = 2.7e-14
omega = -1.00 direct = (+3.78649, +0.14042) Lanczos = (+3.78649, +0.14042) |diff| = 1.8e-13
omega = +0.00 direct = (+4.67512, +0.08985) Lanczos = (+4.67512, +0.08985) |diff| = 1.9e-13
omega = +1.00 direct = (+15.84230, +0.27023) Lanczos = (+15.84230, +0.27023) |diff| = 1.0e-12
omega = +2.00 direct = (-4.05682, +1.44429) Lanczos = (-4.05682, +1.44429) |diff| = 4.0e-13 All three properties hold to machine precision when the chain runs to the full dimension. The resolvent reproduces the direct linear-solve answer at every sampled frequency.
Breakdowns
Bi-orthogonal Lanczos has TWO failure modes that the symmetric variant doesn't. (1) Lucky breakdown: or — the chain has discovered an exact invariant subspace, and you can stop. Eigenvalues of the current are exact eigenvalues of . (2) SERIOUS breakdown: while neither residual is zero — the two Krylov subspaces have "collided" in a degenerate way and the bi-orthogonal recursion cannot continue without modification. Serious breakdown is the price of trading orthogonality for the three-term recursion; it does not occur in Arnoldi.
Production codes use LOOK-AHEAD Lanczos, which lets the dual basis vectors be "blocked" together over a few iterations to step past the singularity. In small or well-conditioned problems serious breakdowns are rare. For the linear-response problems on this site (where the seed vectors are physical perturbations and the operator is the Liouvillian of a physically well-posed system), they essentially never come up.
Relation to other methods
- Symmetric Lanczos — the special case and . The two sequences collapse to one and .
- Arnoldi iteration — works on non-Hermitian with one Krylov sequence and full Gram-Schmidt orthogonalisation; produces a HESSENBERG (not tridiagonal) projection. More numerically robust, but every step requires work and storage rather than constant. Use Arnoldi when you need eigenvalues robustly; use bi-orthogonal Lanczos when you need resolvent matrix elements cheaply and the operator structure is friendly.
- Power iteration — the underlying Krylov-building primitive both use.
- Liouville-Lanczos linear-response TDDFT — the physics application that motivates this page.