Recognize a problem as a fixed-point iteration — an operator that depends on its own eigenvector — and solve by iterative refinement: linearize with a guess, solve, use the result to rebuild the operator, iterate. Diagnose convergence, oscillation, and the use of damping or DIIS-style acceleration.
1 worked example · 7 practice problems · 2 check problems
Worked example: scalar fixed-point iteration
Problem. The SCF iteration in Hartree-Fock has the structure "Fock matrix depends on density, density depends on Fock-matrix eigenvectors." That coupled structure is the matrix version of a scalar fixed-point problem. To build intuition for what convergence looks like, solve the scalar fixed-point x=2+x starting from x0=0. Run 5 iterations by hand. Identify the fixed point analytically.
Diagnosis. A fixed point of g(x) is any x∗ satisfying x∗=g(x∗). Fixed-point iteration xk+1=g(xk) converges to x∗ when ∣g′(x∗)∣<1 and you start close enough. The HF SCF loop is exactly this — with x replaced by a Fock matrix, g replaced by "build new Fock matrix from the current one's eigenvectors."
Predict before reading on: solve x=2+x algebraically before iterating. Square both sides; you get a quadratic. The positive root is the fixed point.
Analytical fixed point.x=2+x⇒x2=2+x⇒x2−x−2=0⇒x=(1±3)/2, so x∗=2 (taking the positive root, since the iteration involves a positive square root).
Iteration trace.
k
xk
computation
error ∣xk−2∣
0
0.000
(initial)
2.0
1
1.414
2+0
0.59
2
1.848
2+1.414
0.15
3
1.962
3.848
0.038
4
1.990
3.962
0.0096
5
1.998
3.990
0.0024
Convergence rate. Each iteration reduces the error by a factor of roughly 4 — the iteration is linearly convergent with rate g′(2)=1/(24)=1/4=0.25. This matches the formal theory: linear-rate fixed-point iteration converges as ∣xk−x∗∣∼∣g′(x∗)∣k.
Predict before reading on: how does the rate g′(x∗) compare to the worst-case rate (κ−1)/(κ+1) of gradient descent? What does it look like when g′(x∗)>1?
If ∣g′(x∗)∣>1, the iteration diverges away from the fixed point even if you start arbitrarily close. The HF SCF can do this — oscillate or blow up — which is what motivates damping (linear extrapolation of the new density with the old) and DIIS (Direct Inversion in Iterative Subspace, which uses a history of past iterates).
Verification.
import numpy as npx = 0.0for k in range(6): print(f"iter {k}: x = {x:.6f}") x = np.sqrt(2 + x)# Converges to x = 2.0
Articulate: state in one sentence the structural analogy between scalar fixed-point iteration and Hartree-Fock SCF.
Practice problems
Seven problems. The unifying move is the same — iterate until self-consistent — across very different surface contexts. Some are abstract (scalar fixed points, power iteration); some are physical (SCF, mean-field Ising); the last is a transfer to a non-quantum-chemistry self-consistency.
P.1scalar fixed-point convergence analysis
Consider the fixed-point iteration xk+1=21(xk+2/xk), starting from x0=1.
(a) Identify the fixed point analytically.
(b) Compute the rate ∣g′(x∗)∣ and predict the convergence behavior.
(c) Run 3 iterations and verify your prediction.
Find the analogue:
same iteration structure as the worked example, different g. Compute g′(x∗) and use the linear-rate formula. Then verify against actual iterations.
show answer
(a) Fixed point: x=21(x+2/x)⇒x/2=1/x⇒x2=2⇒x∗=2.
(b) g′(x)=21(1−2/x2). At x=2: g′(2)=21(1−1)=0. Quadratic convergence — the iteration is Newton's method for 2, and the rate is even better than linear: error squares each step.
This is the Babylonian (or Heron's) method for computing square roots, in use for ~4000 years. It's just Newton's method on f(x)=x2−2, and the g′(x∗)=0 property is what makes Newton converge quadratically.
P.2cosine fixed-point (Dottie number)
Iterate xk+1=cos(xk) starting from any x0. What does it converge to? Why is this called the "Dottie number" and what does it look like graphically?
Find the analogue:
same fixed-point iteration, different g(x)=cosx. Solve x=cos(x) graphically (intersection of y=x and y=cosx) and characterize the convergence rate.
show answer
Converges to x∗≈0.7391 — the Dottie number, the unique real solution of cosx=x.
It's called the Dottie number because the iteration converges from any starting point — try it on your calculator: hit cos repeatedly. Whatever you started with (in radians), you end up at 0.7391.
Graphically: the curve y=cosx intersects the line y=x exactly once, at x≈0.7391. The slope of cosine there is −sin(0.7391)≈−0.674, so ∣g′(x∗)∣=0.674<1 and the iteration converges linearly. The negative sign means it oscillates around the fixed point as it converges — successive iterates alternate above and below.
No closed form exists for the Dottie number — it's transcendental. Like many fixed points, you can only compute it by iterating.
P.3power iteration for dominant eigenvector
Run power iteration on the matrix A=diag(3,1,−2) starting from v0=(1,1,1)T. Normalize after each step. Where does it converge? Compute the Rayleigh quotient vTAv at convergence to find the dominant eigenvalue.
Find the analogue:
power iteration is fixed-point iteration on the normalized matrix-vector product map: v↦Av/∥Av∥. The fixed point is the dominant eigenvector. The SCF iteration is power iteration on the Fock-matrix-density loop, with extra structure.
show answer
import numpy as npA = np.diag([3.0, 1.0, -2.0])v = np.array([1.0, 1.0, 1.0])for _ in range(20): v = A @ v v = v / np.linalg.norm(v)print(v) # ≈ (1, 0, 0)print(v @ A @ v) # ≈ 3.0
Converges to (1,0,0)T, the eigenvector of the largest-magnitude eigenvalue 3.
Key subtlety: the eigenvalue −2 has larger magnitude than 1 but smaller than 3. Power iteration finds the largest-magnitude eigenvalue, not the largest signed one. Successive iterates alternate sign if the dominant eigenvalue is negative.
Convergence rate: ∣λ2/λ1∣k where λ1 is the dominant eigenvalue and λ2 the next-largest in magnitude. Here ∣λ2/λ1∣=2/3, so error shrinks by 2/3 per step — slow. If the spectrum were (3,0.1,0.01), convergence would be much faster.
The HF SCF inherits this same slow-when-eigenvalues-cluster behavior. When orbital energies are nearly degenerate near the HOMO/LUMO gap, vanilla SCF converges slowly or oscillates — which is what motivates DIIS and level shifting.
P.4oscillation and damping
Consider the iteration xk+1=1−2xk, starting from x0=0.
(a) What's the fixed point?
(b) Run 5 iterations. What happens?
(c) Apply linear damping: xk+1=α(1−2xk)+(1−α)xk with α=0.4. Run 5 iterations and observe.
Find the analogue:
damping (a.k.a. linear mixing) is the standard fix for SCF oscillation: instead of taking the full new density, take a weighted average of the old and new. Same trick applied here to a scalar diverging iteration.
show answer
(a) x=1−2x⇒x=1/3.
(b) g′(x)=−2, so ∣g′∣=2>1 — the iteration diverges. Trace: x0=0→1→−1→3→−5→11. Alternating sign, growing magnitude.
(c) Damped iteration: xk+1=0.4(1−2xk)+0.6xk=0.4−0.2xk. Effective g′(x)=−0.2, magnitude < 1, so now the iteration converges. Fixed point unchanged at 0.4/(1+0.2)=1/3.
Trace: x0=0→0.4→0.32→0.336→0.3328→0.3334. Converges nicely to 1/3.
The general formula: if the bare iteration has effective derivative g′(x∗)=m with ∣m∣>1, the damped iteration with mixing α∈(0,1) has effective derivative αm+(1−α). Choose α small enough to push this into (−1,1), then make it as large as possible for fastest convergence.
In SCF practice, α∼0.3−0.5 is a typical safe value for difficult systems. DIIS does this adaptively, using past iterates to project out the slowly-converging direction altogether.
P.5mean-field Ising self-consistency
The mean-field Ising model has self-consistent magnetization m=tanh(βJm) where βJ is the dimensionless coupling.
(a) At βJ=1.5, iterate from m0=0.5. What does it converge to?
(b) At βJ=0.5, iterate from m0=0.5. What does it converge to?
(c) The transition happens at βJ=1 (the mean-field Tc). Show this by examining the slope of g(m)=tanh(βJm) at m=0.
Find the analogue:
same fixed-point iteration as the worked example, but the function depends on a physical parameter βJ. The qualitative behavior changes at a critical value — exactly the same way HFB pairing has a critical Gc below which Δ=0 is the only fixed point.
show answer
(a) βJ=1.5: iterate m→tanh(1.5m) from m0=0.5. Converges to m∗≈0.859. Non-trivial fixed point — the system is in the symmetry-broken (ferromagnetic) phase.
(b) βJ=0.5: converges to m∗=0. Disordered (paramagnetic) phase.
(c) g′(m)=βJ⋅(1−tanh2(βJm)). At m=0: g′(0)=βJ. The trivial fixed point m∗=0 is stable iff ∣g′(0)∣<1, i.e., βJ<1. At βJ=1 the stability transitions: above this threshold, the trivial fixed point becomes unstable and a non-trivial fixed point emerges. This is a bifurcation, the fixed-point version of a phase transition.
The HFB gap equation has the same structure (P.1 of the HFB exercise set derived Gc=a/Ω). Both phase transitions are special cases of "loss of stability of the trivial fixed point" — fixed-point iteration with an external parameter, crossing the rate threshold ∣g′∣=1 at the critical value.
P.6SCF convergence diagnostics
Below are dE-per-iteration logs from two SCF runs:
Run A: iter 0 E = -2.7116 dE = -2.71e+00 iter 1 E = -2.8151 dE = -1.04e-01 iter 2 E = -2.8162 dE = -1.10e-03 iter 3 E = -2.81625 dE = -1.48e-05 iter 4 E = -2.81625 dE = -2.22e-07 iter 5 E = -2.81625 dE = -3.39e-09Run B: iter 0 E = -3.2 dE = -3.2e+00 iter 1 E = -3.1 dE = +0.1e+00 iter 2 E = -3.3 dE = -0.2e+00 iter 3 E = -3.1 dE = +0.2e+00 iter 4 E = -3.3 dE = -0.2e+00
(a) Which run is converging, and at what rate? Estimate the linear convergence factor r.
(b) What's happening in Run B? Propose two interventions.
Find the analogue:
Run A's pattern matches the worked example's: linear convergence, error shrinking by a constant factor per iteration. Run B's alternating-sign pattern is the same diagnostic as P.4: ∣g′∣>1 with a negative g′, producing the divergent oscillation.
show answer
(a) Run A is converging. Linear rate from successive ratios:
∣dE2/dE1∣=1.10×10−3/1.04×10−1≈0.011
∣dE3/dE2∣≈0.013
∣dE4/dE3∣≈0.015
Roughly r≈0.013. Each iteration reduces the energy error by ~80x — fast linear convergence, consistent with a well-conditioned SCF (large HOMO-LUMO gap).
(b) Run B is oscillating. The energy alternates between -3.1 and -3.3 — the iteration has crossed into ∣g′∣>1 territory, like the worked-example divergence but stuck in a 2-cycle rather than blowing up. Likely cause: small HOMO-LUMO gap or near-degeneracy at the Fermi level.
Interventions: (i) Linear damping: replace the new density Pnew with 0.3Pnew+0.7Pold (or similar). Costs O(basis size2) per iteration, easy to implement, sometimes works on its own. (ii) Level shifting: add a large positive shift b to the unoccupied-orbital block of the Fock matrix to artificially open the HOMO-LUMO gap. After convergence, the orbitals are unchanged but the gap is now controlled by the user. (iii) DIIS (Pulay's Direct Inversion of the Iterative Subspace): build a history of error vectors and find the linear combination that minimizes their norm. Standard tool in production quantum-chemistry codes.
Wien's displacement law states that the wavelength at which a blackbody's spectrum peaks satisfies λmaxT=b for some constant b. Deriving b reduces to solving the transcendental equation (5−x)ex=5.
(a) Show that this is equivalent to the fixed-point equation x=5(1−e−x).
(b) Iterate from x0=4 and find x to 4 decimal places.
(c) Why is this a fixed-point iteration, and what does it have to do with HF SCF?
Find the analogue:
transcendental equations from physics frequently reduce to fixed-point iterations. Wien's constant, Kepler's equation, the Saha equation in stellar atmospheres — all the same pattern as HF SCF, with the operator being scalar or low-dimensional rather than matrix-valued.
show answer
(a) (5−x)ex=5⇒5−x=5e−x⇒x=5(1−e−x). ✓
(b) Iterate x→5(1−e−x) from x0=4: 4→5(1−e−4)=4.908→5(1−e−4.908)=4.963→4.965. Converges to x∗≈4.9651.
From this, Wien's constant: b=hc/(kB⋅4.9651)≈2.898×10−3 m·K. Matches the tabulated value.
(c) It's a fixed-point iteration because x=g(x) with g(x)=5(1−e−x) has g′(x∗)=5e−x∗≈0.035, so ∣g′∣<1 and iteration converges. The connection to HF SCF: both are problems of the form "find x such that x=g(x)" where g represents some self-referential physical law. In SCF, g is "build new Fock matrix from current density." In Wien's law, g is "where does the derivative of x5/(ex−1) vanish?" Same algorithm, same convergence analysis, vastly different domains.
The pattern is so common that the term "self-consistent" appears in: SCF (Hartree-Fock), DFT (Kohn-Sham), GW (many-body perturbation theory), CCSD (coupled-cluster, technically not fixed-point but iterative), Vlasov-Poisson (plasma physics), mean-field neural-network theory, fixed-point combinators in functional programming. The shared mathematical structure is the Banach fixed-point theorem and its variants.
Check problems
Check 1derivation
Prove the local convergence theorem for fixed-point iteration:
If g:R→R is continuously differentiable and x∗ is a fixed point with ∣g′(x∗)∣<1, then there is a neighborhood of x∗ such that the iteration xk+1=g(xk) starting from any point in that neighborhood converges to x∗, with linear rate ∣g′(x∗)∣.
Specifically:
(a) Use the mean value theorem to derive xk+1−x∗=g′(ξk)(xk−x∗) for some ξk.
(b) Use continuity of g′ to argue that g′(ξk)→g′(x∗) as xk→x∗.
(c) Conclude the rate result.
show solution sketch
(a) By the mean value theorem applied to g on the interval between xk and x∗:
g(xk)−g(x∗)=g′(ξk)(xk−x∗) for some ξk between xk and x∗.
But g(xk)=xk+1 and g(x∗)=x∗ (fixed-point property), so
xk+1−x∗=g′(ξk)(xk−x∗) ✓
(b) Pick a neighborhood Nδ=(x∗−δ,x∗+δ) small enough that ∣g′(y)∣≤r for all y∈Nδ, where r is some number with ∣g′(x∗)∣<r<1. This is possible because g′ is continuous and ∣g′(x∗)∣<1.
If xk∈Nδ, then ξk∈Nδ (it's between xk and x∗), so ∣g′(ξk)∣≤r.
Therefore: ∣xk+1−x∗∣≤r⋅∣xk−x∗∣.
(c) By induction, ∣xk−x∗∣≤rk⋅∣x0−x∗∣, which goes to 0 as k→∞ since r<1.
Moreover, ∣xk+1−x∗∣/∣xk−x∗∣≤r for all k, and by continuity this ratio approaches ∣g′(x∗)∣ as xk→x∗. So the asymptotic rate is exactly ∣g′(x∗)∣. ∎
This proof generalizes to multi-dimensional fixed-point iterations on Rn with the Jacobian ∇g(x∗) replacing g′(x∗) and "all eigenvalues of ∇g(x∗) inside the unit disk" replacing ∣g′(x∗)∣<1. This is what governs HF SCF convergence: the SCF iteration is locally a multidimensional fixed-point iteration on the density matrix, and its convergence rate is set by the largest-magnitude eigenvalue of the corresponding Jacobian — which depends on the orbital structure (HOMO-LUMO gap especially).
Check 2articulation
In 150-250 words, articulate the relationship between three quantities in Hartree-Fock SCF:
The energy difference between consecutive iterations, ΔEk=Ek−Ek−1.
The density-matrix change between iterations, ∥Pk−Pk−1∥.
The HOMO-LUMO gap of the converged solution.
Explain why "small ΔE" is not a sufficient convergence criterion in practice, why density-based criteria are more reliable, and how the gap controls the asymptotic convergence rate. Connect your explanation to the fixed-point theory from C1.
show solution sketch
SCF energy is variational in the density matrix: E[P] has a local minimum at the converged density P∗, so ∂E/∂P=0 there. Near convergence, ΔEk=Ek−Ek−1 is second-order in ∥Pk−P∗∥ — the energy can be tiny while the density is still substantially off. This is a real problem: ∣ΔE∣<10−6 hartree might correspond to ∥ΔP∥∼10−3, which still produces wrong orbital coefficients for downstream uses (gradients, transition densities, embedding). Production codes always check both: ∣ΔE∣<10−7and∥ΔP∥<10−6.
The HOMO-LUMO gap controls the asymptotic rate. The SCF iteration is, near convergence, a linear fixed-point iteration on the density: Pk+1−P∗≈J(Pk−P∗) where J is the SCF Jacobian. The eigenvalues of J depend on inverse orbital-energy differences: (ϵi−ϵa)−1 where i indexes occupied orbitals and a indexes virtuals. The smallest difference — the HOMO-LUMO gap — produces the largest-magnitude Jacobian eigenvalue and the slowest converging component. Systems with large gaps (closed-shell organic molecules) converge in 5-10 iterations; systems with small gaps (transition metals, near-degenerate states) need 50+ or fail to converge without DIIS.
This is exactly the C1 result applied to the matrix case. The convergence rate r=∣g′(x∗)∣ generalizes to "spectral radius of the SCF Jacobian," and the SCF Jacobian's spectrum is what links convergence behavior to the system's electronic structure.