Bi-Orthogonal Lanczos

Linear Algebra

The bi-orthogonal (or "two-sided") Lanczos algorithm tridiagonalises a NON-Hermitian matrix. Standard symmetric Lanczos relies on to truncate the Krylov projection to three terms; that property doesn't hold for general operators. The bi-orthogonal variant restores a three-term recursion by carrying TWO Krylov sequences — one driven by , one driven by — and enforcing duality between them. The output is a (non-symmetric) tridiagonal matrix that represents in a bi-orthogonal basis.

The motivating use case is resolvents of non-Hermitian operators: as a function of . This matrix element shows up everywhere — Green's functions in many-body physics, response functions in linear-response TDDFT, transfer-matrix calculations, transient response of non-self-adjoint linearized PDEs. Bi-orthogonal Lanczos turns the resolvent into a CONTINUED FRACTION whose coefficients are exactly the entries of the tridiagonal projection — no eigendecomposition required.

Why two sequences

For a Hermitian matrix, the Krylov subspace has a natural orthonormal basis under the standard inner product, and a SYMMETRIC three-term recursion suffices. For a non-Hermitian , you can still build and orthonormalise it (that's Arnoldi) — but the result is a HESSENBERG matrix, not tridiagonal, because there is no reason for to vanish when unless is Hermitian under the chosen inner product.

Bi-orthogonal Lanczos sidesteps Arnoldi's Hessenberg cost. Build TWO Krylov subspaces:

and choose bases for the first and for the second so that they are DUAL:

This is BI-ORTHOGONALITY — not orthogonality in the usual sense. The two sets are not individually orthonormal; only their pairwise inner products vanish. Once the dual basis is in hand, the matrix (with and ) turns out to be TRIDIAGONAL — and the off-diagonal entries are non-zero only between adjacent levels, because matrix elements vanish when by the dual recursion structure. That is the bi-orthogonal analogue of the symmetric Lanczos three-term collapse, and it is the algorithmic payoff for maintaining two sequences instead of one.

The recursion

Assume the bases satisfy:

Three remarks. (i) Both recursions share the diagonal coefficient — they have to, because appears as . (ii) The off-diagonal coefficients are TWO different families, and , swapped between the right and left recursions. This is forced by bi-orthogonality, as can be checked: take of the first equation and of the second; the matching imposes that shows up in the recursion's coefficient and shows up in the recursion's coefficient. (iii) The PRODUCT is determined by the residuals; how you SPLIT it between and individually is a gauge choice (you can pick for the symmetric split, or any other consistent choice).

The bookkeeping at each iteration:

  1. Compute — this is one diagonal entry of .
  2. Form right and left residuals:
  3. Compute their bi-orthogonal product .
  4. Split the product (gauge choice — usually , ).
  5. Normalize the new basis vectors: , .

Three vectors of storage per side ( and the analogous 's), one application each of and per iteration. The cost per step is the same order as symmetric Lanczos, just doubled: two mat-vecs instead of one, and two stored Krylov vectors instead of one.

The algorithm

# Bi-orthogonal Lanczos on a (generic, non-Hermitian) matrix A.
#
# Inputs:  A (n x n),  v0 (right seed),  w0 (left seed),  n_iter
# Outputs: alphas, betas, gammas         — tridiagonal coefficients
#          V, W (n x n_iter)             — right and left Krylov bases
#          seed = <w0 | v0>              — overall scaling factor
#
# Bi-orthogonality: W^T V = I  (the columns are dual bases of the Krylov
# subspaces of A acting on v0 and A^T acting on w0).
# Tridiagonal projection:  T = W^T A V  has
#     T[j, j]   = alpha_j     (diagonal)
#     T[j+1, j] = beta_{j+1}  (sub-diagonal — built into the q-recursion)
#     T[j-1, j] = gamma_j     (super-diagonal — built into the p-recursion)
# In particular, T is NOT symmetric; that is the whole point.

import numpy as np

def biortho_lanczos(A, v0, w0, n_iter):
    seed = np.dot(w0, v0)
    if abs(seed) < 1e-14:
        raise ValueError("Seed vectors orthogonal — pick a different start.")
    scale = np.sqrt(seed + 0j)   # complex sqrt handles seed < 0 cleanly
    q = (v0 / scale).astype(complex)
    p = (w0 / scale).astype(complex)        # <p | q> = 1 by construction

    n = A.shape[0]
    V = np.zeros((n, n_iter), dtype=complex)
    W = np.zeros((n, n_iter), dtype=complex)
    q_prev = np.zeros_like(q)
    p_prev = np.zeros_like(p)
    beta_prev = gamma_prev = 0.0 + 0.0j

    alphas, betas, gammas = [], [], []
    for j in range(n_iter):
        V[:, j] = q
        W[:, j] = p

        Aq  = A @ q
        ATp = A.T @ p

        # Bi-orthogonal projection: <p_j | A q_j> = alpha_j
        alpha = np.dot(p, Aq)
        alphas.append(alpha)

        # Three-term residuals: project out the components already in
        # span{q_j, q_{j-1}} and span{p_j, p_{j-1}}.
        r = Aq  - alpha * q - gamma_prev * q_prev
        s = ATp - alpha * p - beta_prev  * p_prev

        rs = np.dot(s, r)            # = beta_{j+1} * gamma_{j+1}
        if abs(rs) < 1e-13:
            # Either lucky breakdown (invariant subspace found) or serious
            # breakdown (the bi-orthogonal recursion fails). For a small
            # demo we just stop; production code uses a look-ahead variant.
            break

        beta  = np.sqrt(rs + 0j)
        gamma = rs / beta            # any split with beta * gamma = rs works
        betas.append(beta)
        gammas.append(gamma)

        q_prev, p_prev = q, p
        q = r / beta
        p = s / gamma
        beta_prev, gamma_prev = beta, gamma

    return (np.array(alphas, dtype=complex),
            np.array(betas,  dtype=complex),
            np.array(gammas, dtype=complex),
            V, W, seed)

What the output represents

After iterations you have:

Three uses of , in increasing order of how much of the spectrum you need:

1. Extremal eigenvalues / Ritz values. The eigenvalues of (called Ritz values when seen as approximations to those of ) converge to the extreme eigenvalues of very rapidly as grows. Same idea as symmetric Lanczos for ground states, just on a non-Hermitian operator. Be warned: Ritz values can be complex even for a real , and they can MISS eigenvalues with small overlap to the seed.

2. Action of a function of . For any function regular on the spectrum of :

Why this is true: when and span a Krylov subspace large enough to capture the spectral content that probes. Then the matrix element . The most important specialization is — the resolvent.

3. Resolvent matrix elements as continued fractions. The (0,0) entry of the inverse of a tridiagonal matrix is a CONTINUED FRACTION in its entries. For with diagonal and off-diagonal products :

This is the most useful property of the algorithm. The Lanczos coefficients ARE the continued-fraction coefficients of the resolvent matrix element. After one chain of length you can evaluate the resolvent at any for work per point. That is exactly what makes Liouville-Lanczos linear-response TDDFT tractable for solids.

Validation

Run on a generic random non-Hermitian matrix and verify the three structural claims: bi-orthogonality, tridiagonal projection, and continued-fraction resolvent.

# Validate the algorithm against direct linear algebra:
#  (1) bi-orthogonality W^T V = I,
#  (2) tridiagonal projection W^T A V = tridiag(alpha, beta, gamma),
#  (3) resolvent <w0|(A - z)^{-1}|v0> = seed * [(T - z)^{-1}]_{00}
#      evaluated as a continued fraction in the Lanczos coefficients.

import numpy as np

np.random.seed(7)
n = 10
A  = np.random.randn(n, n)          # generic non-Hermitian matrix
v0 = np.random.randn(n)
w0 = np.random.randn(n)

alphas, betas, gammas, V, W, seed = biortho_lanczos(A, v0, w0, n_iter=n)

# (1) Bi-orthogonality
print(f"max |W^T V - I| = {np.max(np.abs(W.T @ V - np.eye(n))):.2e}")

# (2) Tridiagonal projection
T_proj  = W.T @ A @ V
T_built = (np.diag(alphas)
           + np.diag(betas,  -1)
           + np.diag(gammas, +1))
print(f"max |W^T A V - tridiag| = {np.max(np.abs(T_proj - T_built)):.2e}")

# (3) Continued-fraction resolvent
def resolvent_lanczos(alphas, betas, gammas, seed, omega, eta=0.05):
    z = omega + 1j * eta
    cf = 0.0 + 0.0j
    # Bottom-up: at level j the off-diagonal product is beta_j * gamma_j,
    # stored at array index j-1.
    for j in range(len(alphas) - 1, 0, -1):
        cf = (betas[j-1] * gammas[j-1]) / ((alphas[j] - z) - cf)
    cf = 1.0 / ((alphas[0] - z) - cf)
    return seed * cf

def resolvent_direct(A, v0, w0, omega, eta=0.05):
    z = omega + 1j * eta
    return w0 @ np.linalg.solve(A - z * np.eye(A.shape[0]),
                                v0.astype(complex))

print("\nResolvent <w0|(A - omega - i*eta)^{-1}|v0>:")
for w in np.linspace(-2.0, 2.0, 5):
    rd = resolvent_direct(A, v0, w0, w)
    rl = resolvent_lanczos(alphas, betas, gammas, seed, w)
    print(f"  omega = {w:+.2f}  direct = ({rd.real:+.5f}, {rd.imag:+.5f})"
          f"   Lanczos = ({rl.real:+.5f}, {rl.imag:+.5f})"
          f"   |diff| = {abs(rd - rl):.1e}")

Output:

max |W^T V - I| = 1.81e-12
max |W^T A V - tridiag| = 8.47e-12

Resolvent <w0|(A - omega - i*eta)^{-1}|v0>:
  omega = -2.00  direct = (+1.12493, +0.07443)   Lanczos = (+1.12493, +0.07443)   |diff| = 2.7e-14
  omega = -1.00  direct = (+3.78649, +0.14042)   Lanczos = (+3.78649, +0.14042)   |diff| = 1.8e-13
  omega = +0.00  direct = (+4.67512, +0.08985)   Lanczos = (+4.67512, +0.08985)   |diff| = 1.9e-13
  omega = +1.00  direct = (+15.84230, +0.27023)  Lanczos = (+15.84230, +0.27023)  |diff| = 1.0e-12
  omega = +2.00  direct = (-4.05682, +1.44429)   Lanczos = (-4.05682, +1.44429)   |diff| = 4.0e-13

All three properties hold to machine precision when the chain runs to the full dimension. The resolvent reproduces the direct linear-solve answer at every sampled frequency.

Breakdowns

Bi-orthogonal Lanczos has TWO failure modes that the symmetric variant doesn't. (1) Lucky breakdown: or — the chain has discovered an exact invariant subspace, and you can stop. Eigenvalues of the current are exact eigenvalues of . (2) SERIOUS breakdown: while neither residual is zero — the two Krylov subspaces have "collided" in a degenerate way and the bi-orthogonal recursion cannot continue without modification. Serious breakdown is the price of trading orthogonality for the three-term recursion; it does not occur in Arnoldi.

Production codes use LOOK-AHEAD Lanczos, which lets the dual basis vectors be "blocked" together over a few iterations to step past the singularity. In small or well-conditioned problems serious breakdowns are rare. For the linear-response problems on this site (where the seed vectors are physical perturbations and the operator is the Liouvillian of a physically well-posed system), they essentially never come up.

Relation to other methods