Quantum Chemistry

Hartree-Fock SCF — exercises

Recognize a problem as a fixed-point iteration — an operator that depends on its own eigenvector — and solve by iterative refinement: linearize with a guess, solve, use the result to rebuild the operator, iterate. Diagnose convergence, oscillation, and the use of damping or DIIS-style acceleration.

1 worked example · 7 practice problems · 2 check problems

Worked example: scalar fixed-point iteration

Problem. The SCF iteration in Hartree-Fock has the structure "Fock matrix depends on density, density depends on Fock-matrix eigenvectors." That coupled structure is the matrix version of a scalar fixed-point problem. To build intuition for what convergence looks like, solve the scalar fixed-point $x = 2 + x$ starting from $x_{0} = 0$ . Run 5 iterations by hand. Identify the fixed point analytically.

Diagnosis. A fixed point of $g (x)$ is any $x^{*}$ satisfying $x^{*} = g (x^{*})$ . Fixed-point iteration $x_{k + 1} = g (x_{k})$ converges to $x^{*}$ when $∣ g^{'} (x^{*}) ∣ < 1$ and you start close enough. The HF SCF loop is exactly this — with $x$ replaced by a Fock matrix, $g$ replaced by "build new Fock matrix from the current one's eigenvectors."

Predict before reading on: solve $x = 2 + x$ algebraically before iterating. Square both sides; you get a quadratic. The positive root is the fixed point.

Analytical fixed point. $x = 2 + x \Rightarrow x^{2} = 2 + x \Rightarrow x^{2} - x - 2 = 0 \Rightarrow x = (1 \pm 3) /2$ , so $x^{*} = 2$ (taking the positive root, since the iteration involves a positive square root).

Iteration trace.

k	$x_{k}$	computation	error $∣ x_{k} - 2∣$
0	0.000	(initial)	2.0
1	1.414	$2 + 0$	0.59
2	1.848	$2 + 1.414$	0.15
3	1.962	$3.848$	0.038
4	1.990	$3.962$	0.0096
5	1.998	$3.990$	0.0024

Convergence rate. Each iteration reduces the error by a factor of roughly 4 — the iteration is linearly convergent with rate $g^{'} (2) = 1/ (24) = 1/4 = 0.25$ . This matches the formal theory: linear-rate fixed-point iteration converges as $∣ x_{k} - x^{*} ∣ \sim ∣ g^{'} (x^{*}) ∣^{k}$ .

Predict before reading on: how does the rate $g^{'} (x^{*})$ compare to the worst-case rate $(κ - 1) / (κ + 1)$ of gradient descent? What does it look like when $g^{'} (x^{*}) > 1$ ?

If $∣ g^{'} (x^{*}) ∣ > 1$ , the iteration diverges away from the fixed point even if you start arbitrarily close. The HF SCF can do this — oscillate or blow up — which is what motivates damping (linear extrapolation of the new density with the old) and DIIS (Direct Inversion in Iterative Subspace, which uses a history of past iterates).

Verification.

import numpy as np

x = 0.0
for k in range(6):
    print(f"iter {k}: x = {x:.6f}")
    x = np.sqrt(2 + x)
# Converges to x = 2.0

Articulate: state in one sentence the structural analogy between scalar fixed-point iteration and Hartree-Fock SCF.

Practice problems

Seven problems. The unifying move is the same — iterate until self-consistent — across very different surface contexts. Some are abstract (scalar fixed points, power iteration); some are physical (SCF, mean-field Ising); the last is a transfer to a non-quantum-chemistry self-consistency.

P.1 scalar fixed-point convergence analysis

Consider the fixed-point iteration $x_{k + 1} = \frac{1}{2} (x_{k} + 2/ x_{k})$ , starting from $x_{0} = 1$ .

(a) Identify the fixed point analytically.

(b) Compute the rate $∣ g^{'} (x^{*}) ∣$ and predict the convergence behavior.

Find the analogue: same iteration structure as the worked example, different $g$ . Compute $g^{'} (x^{*})$ and use the linear-rate formula. Then verify against actual iterations.

show answer

(a) Fixed point: $x = \frac{1}{2} (x + 2/ x) \Rightarrow x /2 = 1/ x \Rightarrow x^{2} = 2 \Rightarrow x^{*} = 2$ .

(b) $g^{'} (x) = \frac{1}{2} (1 - 2/ x^{2})$ . At $x = 2$ : $g^{'} (2) = \frac{1}{2} (1 - 1) = 0$ . Quadratic convergence — the iteration is Newton's method for $2$ , and the rate is even better than linear: error squares each step.

(c) $x_{0} = 1$ → $x_{1} = (1 + 2) /2 = 1.5$ → $x_{2} = (1.5 + 1.333) /2 = 1.4167$ → $x_{3} \approx 1.41422$ . Each error squares: $0.414 \to 0.086 \to 0.0025 \to 5 \times 1 0^{- 6}$ . Quadratic ✓

This is the Babylonian (or Heron's) method for computing square roots, in use for ~4000 years. It's just Newton's method on $f (x) = x^{2} - 2$ , and the $g^{'} (x^{*}) = 0$ property is what makes Newton converge quadratically.

P.2 cosine fixed-point (Dottie number)

Iterate $x_{k + 1} = cos (x_{k})$ starting from any $x_{0}$ . What does it converge to? Why is this called the "Dottie number" and what does it look like graphically?

Find the analogue: same fixed-point iteration, different $g (x) = cos x$ . Solve $x = cos (x)$ graphically (intersection of $y = x$ and $y = cos x$ ) and characterize the convergence rate.

show answer

Converges to $x^{*} \approx 0.7391$ — the Dottie number, the unique real solution of $cos x = x$ .

It's called the Dottie number because the iteration converges from any starting point — try it on your calculator: hit cos repeatedly. Whatever you started with (in radians), you end up at 0.7391.

Graphically: the curve $y = cos x$ intersects the line $y = x$ exactly once, at $x \approx 0.7391$ . The slope of cosine there is $- sin (0.7391) \approx - 0.674$ , so $∣ g^{'} (x^{*}) ∣ = 0.674 < 1$ and the iteration converges linearly. The negative sign means it oscillates around the fixed point as it converges — successive iterates alternate above and below.

No closed form exists for the Dottie number — it's transcendental. Like many fixed points, you can only compute it by iterating.

P.3 power iteration for dominant eigenvector

Run power iteration on the matrix $A = diag (3, 1, - 2)$ starting from $v_{0} = (1, 1, 1)^{T}$ . Normalize after each step. Where does it converge? Compute the Rayleigh quotient $v^{T} A v$ at convergence to find the dominant eigenvalue.

Find the analogue: power iteration is fixed-point iteration on the normalized matrix-vector product map: $v \mapsto A v /∥ A v ∥$ . The fixed point is the dominant eigenvector. The SCF iteration is power iteration on the Fock-matrix-density loop, with extra structure.

show answer

import numpy as np
A = np.diag([3.0, 1.0, -2.0])
v = np.array([1.0, 1.0, 1.0])
for _ in range(20):
    v = A @ v
    v = v / np.linalg.norm(v)
print(v)                        # ≈ (1, 0, 0)
print(v @ A @ v)                # ≈ 3.0

Converges to $(1, 0, 0)^{T}$ , the eigenvector of the largest-magnitude eigenvalue 3.

Key subtlety: the eigenvalue $- 2$ has larger magnitude than 1 but smaller than 3. Power iteration finds the largest-magnitude eigenvalue, not the largest signed one. Successive iterates alternate sign if the dominant eigenvalue is negative.

Convergence rate: $∣ λ_{2} / λ_{1} ∣^{k}$ where $λ_{1}$ is the dominant eigenvalue and $λ_{2}$ the next-largest in magnitude. Here $∣ λ_{2} / λ_{1} ∣ = 2/3$ , so error shrinks by 2/3 per step — slow. If the spectrum were $(3, 0.1, 0.01)$ , convergence would be much faster.

The HF SCF inherits this same slow-when-eigenvalues-cluster behavior. When orbital energies are nearly degenerate near the HOMO/LUMO gap, vanilla SCF converges slowly or oscillates — which is what motivates DIIS and level shifting.

P.4 oscillation and damping

Consider the iteration $x_{k + 1} = 1 - 2 x_{k}$ , starting from $x_{0} = 0$ .

(a) What's the fixed point?

(b) Run 5 iterations. What happens?

Find the analogue: damping (a.k.a. linear mixing) is the standard fix for SCF oscillation: instead of taking the full new density, take a weighted average of the old and new. Same trick applied here to a scalar diverging iteration.

show answer

(a) $x = 1 - 2 x \Rightarrow x = 1/3$ .

(b) $g^{'} (x) = - 2$ , so $∣ g^{'} ∣ = 2 > 1$ — the iteration diverges. Trace: $x_{0} = 0 \to 1 \to - 1 \to 3 \to - 5 \to 11$ . Alternating sign, growing magnitude.

(c) Damped iteration: $x_{k + 1} = 0.4 (1 - 2 x_{k}) + 0.6 x_{k} = 0.4 - 0.2 x_{k}$ . Effective $g^{'} (x) = - 0.2$ , magnitude < 1, so now the iteration converges. Fixed point unchanged at $0.4/ (1 + 0.2) = 1/3$ .

Trace: $x_{0} = 0 \to 0.4 \to 0.32 \to 0.336 \to 0.3328 \to 0.3334$ . Converges nicely to 1/3.

The general formula: if the bare iteration has effective derivative $g^{'} (x^{*}) = m$ with $∣ m ∣ > 1$ , the damped iteration with mixing $α \in (0, 1)$ has effective derivative $α m + (1 - α)$ . Choose $α$ small enough to push this into $(- 1, 1)$ , then make it as large as possible for fastest convergence.

In SCF practice, $α \sim 0.3 - 0.5$ is a typical safe value for difficult systems. DIIS does this adaptively, using past iterates to project out the slowly-converging direction altogether.

P.5 mean-field Ising self-consistency

The mean-field Ising model has self-consistent magnetization $m = tanh (β J m)$ where $β J$ is the dimensionless coupling.

(a) At $β J = 1.5$ , iterate from $m_{0} = 0.5$ . What does it converge to?

(b) At $β J = 0.5$ , iterate from $m_{0} = 0.5$ . What does it converge to?

(c) The transition happens at $β J = 1$ (the mean-field $T_{c}$ ). Show this by examining the slope of $g (m) = tanh (β J m)$ at $m = 0$ .

Find the analogue: same fixed-point iteration as the worked example, but the function depends on a physical parameter $β J$ . The qualitative behavior changes at a critical value — exactly the same way HFB pairing has a critical $G_{c}$ below which $Δ = 0$ is the only fixed point.

show answer

(a) $β J = 1.5$ : iterate $m \to tanh (1.5 m)$ from $m_{0} = 0.5$ . Converges to $m^{*} \approx 0.859$ . Non-trivial fixed point — the system is in the symmetry-broken (ferromagnetic) phase.

(b) $β J = 0.5$ : converges to $m^{*} = 0$ . Disordered (paramagnetic) phase.

(c) $g^{'} (m) = β J \cdot (1 - tanh^{2} (β J m))$ . At $m = 0$ : $g^{'} (0) = β J$ . The trivial fixed point $m^{*} = 0$ is stable iff $∣ g^{'} (0) ∣ < 1$ , i.e., $β J < 1$ . At $β J = 1$ the stability transitions: above this threshold, the trivial fixed point becomes unstable and a non-trivial fixed point emerges. This is a bifurcation, the fixed-point version of a phase transition.

The HFB gap equation has the same structure (P.1 of the HFB exercise set derived $G_{c} = a /Ω$ ). Both phase transitions are special cases of "loss of stability of the trivial fixed point" — fixed-point iteration with an external parameter, crossing the rate threshold $∣ g^{'} ∣ = 1$ at the critical value.

P.6 SCF convergence diagnostics

Below are dE-per-iteration logs from two SCF runs:

Run A:
  iter 0    E = -2.7116    dE = -2.71e+00
  iter 1    E = -2.8151    dE = -1.04e-01
  iter 2    E = -2.8162    dE = -1.10e-03
  iter 3    E = -2.81625   dE = -1.48e-05
  iter 4    E = -2.81625   dE = -2.22e-07
  iter 5    E = -2.81625   dE = -3.39e-09

Run B:
  iter 0    E = -3.2       dE = -3.2e+00
  iter 1    E = -3.1       dE = +0.1e+00
  iter 2    E = -3.3       dE = -0.2e+00
  iter 3    E = -3.1       dE = +0.2e+00
  iter 4    E = -3.3       dE = -0.2e+00

(a) Which run is converging, and at what rate? Estimate the linear convergence factor $r$ .

(b) What's happening in Run B? Propose two interventions.

Find the analogue: Run A's pattern matches the worked example's: linear convergence, error shrinking by a constant factor per iteration. Run B's alternating-sign pattern is the same diagnostic as P.4: $∣ g^{'} ∣ > 1$ with a negative $g^{'}$ , producing the divergent oscillation.

show answer

(a) Run A is converging. Linear rate from successive ratios:

$∣ d E_{2} / d E_{1} ∣ = 1.10 \times 1 0^{- 3} /1.04 \times 1 0^{- 1} \approx 0.011$
$∣ d E_{3} / d E_{2} ∣ \approx 0.013$
$∣ d E_{4} / d E_{3} ∣ \approx 0.015$

Roughly $r \approx 0.013$ . Each iteration reduces the energy error by ~80x — fast linear convergence, consistent with a well-conditioned SCF (large HOMO-LUMO gap).

(b) Run B is oscillating. The energy alternates between -3.1 and -3.3 — the iteration has crossed into $∣ g^{'} ∣ > 1$ territory, like the worked-example divergence but stuck in a 2-cycle rather than blowing up. Likely cause: small HOMO-LUMO gap or near-degeneracy at the Fermi level.

Interventions: (i) Linear damping: replace the new density $P_{new}$ with $0.3 P_{new} + 0.7 P_{old}$ (or similar). Costs $O (basis size^{2})$ per iteration, easy to implement, sometimes works on its own. (ii) Level shifting: add a large positive shift $b$ to the unoccupied-orbital block of the Fock matrix to artificially open the HOMO-LUMO gap. After convergence, the orbitals are unchanged but the gap is now controlled by the user. (iii) DIIS (Pulay's Direct Inversion of the Iterative Subspace): build a history of error vectors and find the linear combination that minimizes their norm. Standard tool in production quantum-chemistry codes.

P.7 cross-domain self-consistency (Wien's displacement law)

Wien's displacement law states that the wavelength at which a blackbody's spectrum peaks satisfies $λ_{m a x} T = b$ for some constant $b$ . Deriving $b$ reduces to solving the transcendental equation $(5 - x) e^{x} = 5$ .

(a) Show that this is equivalent to the fixed-point equation $x = 5 (1 - e^{- x})$ .

(b) Iterate from $x_{0} = 4$ and find $x$ to 4 decimal places.

Find the analogue: transcendental equations from physics frequently reduce to fixed-point iterations. Wien's constant, Kepler's equation, the Saha equation in stellar atmospheres — all the same pattern as HF SCF, with the operator being scalar or low-dimensional rather than matrix-valued.

show answer

(a) $(5 - x) e^{x} = 5 \Rightarrow 5 - x = 5 e^{- x} \Rightarrow x = 5 (1 - e^{- x})$ . ✓

(b) Iterate $x \to 5 (1 - e^{- x})$ from $x_{0} = 4$ : $4 \to 5 (1 - e^{- 4}) = 4.908 \to 5 (1 - e^{- 4.908}) = 4.963 \to 4.965$ . Converges to $x^{*} \approx 4.9651$ .

From this, Wien's constant: $b = h c / (k_{B} \cdot 4.9651) \approx 2.898 \times 1 0^{- 3}$ m·K. Matches the tabulated value.

(c) It's a fixed-point iteration because $x = g (x)$ with $g (x) = 5 (1 - e^{- x})$ has $g^{'} (x^{*}) = 5 e^{- x^{*}} \approx 0.035$ , so $∣ g^{'} ∣ < 1$ and iteration converges. The connection to HF SCF: both are problems of the form "find $x$ such that $x = g (x)$ " where $g$ represents some self-referential physical law. In SCF, $g$ is "build new Fock matrix from current density." In Wien's law, $g$ is "where does the derivative of $x^{5} / (e^{x} - 1)$ vanish?" Same algorithm, same convergence analysis, vastly different domains.

The pattern is so common that the term "self-consistent" appears in: SCF (Hartree-Fock), DFT (Kohn-Sham), GW (many-body perturbation theory), CCSD (coupled-cluster, technically not fixed-point but iterative), Vlasov-Poisson (plasma physics), mean-field neural-network theory, fixed-point combinators in functional programming. The shared mathematical structure is the Banach fixed-point theorem and its variants.

Check problems

Check 1 derivation

Prove the local convergence theorem for fixed-point iteration:

If $g : R \to R$ is continuously differentiable and $x^{*}$ is a fixed point with $∣ g^{'} (x^{*}) ∣ < 1$ , then there is a neighborhood of $x^{*}$ such that the iteration $x_{k + 1} = g (x_{k})$ starting from any point in that neighborhood converges to $x^{*}$ , with linear rate $∣ g^{'} (x^{*}) ∣$ .

Specifically:

(a) Use the mean value theorem to derive $x_{k + 1} - x^{*} = g^{'} (ξ_{k}) (x_{k} - x^{*})$ for some $ξ_{k}$ .

(b) Use continuity of $g^{'}$ to argue that $g^{'} (ξ_{k}) \to g^{'} (x^{*})$ as $x_{k} \to x^{*}$ .

show solution sketch

(a) By the mean value theorem applied to $g$ on the interval between $x_{k}$ and $x^{*}$ :

$g (x_{k}) - g (x^{*}) = g^{'} (ξ_{k}) (x_{k} - x^{*})$ for some $ξ_{k}$ between $x_{k}$ and $x^{*}$ .

But $g (x_{k}) = x_{k + 1}$ and $g (x^{*}) = x^{*}$ (fixed-point property), so

$x_{k + 1} - x^{*} = g^{'} (ξ_{k}) (x_{k} - x^{*})$ ✓

(b) Pick a neighborhood $N_{δ} = (x^{*} - δ, x^{*} + δ)$ small enough that $∣ g^{'} (y) ∣ \leq r$ for all $y \in N_{δ}$ , where $r$ is some number with $∣ g^{'} (x^{*}) ∣ < r < 1$ . This is possible because $g^{'}$ is continuous and $∣ g^{'} (x^{*}) ∣ < 1$ .

If $x_{k} \in N_{δ}$ , then $ξ_{k} \in N_{δ}$ (it's between $x_{k}$ and $x^{*}$ ), so $∣ g^{'} (ξ_{k}) ∣ \leq r$ .

Therefore: $∣ x_{k + 1} - x^{*} ∣ \leq r \cdot ∣ x_{k} - x^{*} ∣$ .

(c) By induction, $∣ x_{k} - x^{*} ∣ \leq r^{k} \cdot ∣ x_{0} - x^{*} ∣$ , which goes to 0 as $k \to \infty$ since $r < 1$ .

Moreover, $∣ x_{k + 1} - x^{*} ∣/∣ x_{k} - x^{*} ∣ \leq r$ for all $k$ , and by continuity this ratio approaches $∣ g^{'} (x^{*}) ∣$ as $x_{k} \to x^{*}$ . So the asymptotic rate is exactly $∣ g^{'} (x^{*}) ∣$ . ∎

This proof generalizes to multi-dimensional fixed-point iterations on $R^{n}$ with the Jacobian $\nabla g (x^{*})$ replacing $g^{'} (x^{*})$ and "all eigenvalues of $\nabla g (x^{*})$ inside the unit disk" replacing $∣ g^{'} (x^{*}) ∣ < 1$ . This is what governs HF SCF convergence: the SCF iteration is locally a multidimensional fixed-point iteration on the density matrix, and its convergence rate is set by the largest-magnitude eigenvalue of the corresponding Jacobian — which depends on the orbital structure (HOMO-LUMO gap especially).

Check 2 articulation

In 150-250 words, articulate the relationship between three quantities in Hartree-Fock SCF:

The energy difference between consecutive iterations, $Δ E_{k} = E_{k} - E_{k - 1}$ .
The density-matrix change between iterations, $∥ P_{k} - P_{k - 1} ∥$ .
The HOMO-LUMO gap of the converged solution.

Explain why "small $Δ E$ " is not a sufficient convergence criterion in practice, why density-based criteria are more reliable, and how the gap controls the asymptotic convergence rate. Connect your explanation to the fixed-point theory from C1.

show solution sketch

SCF energy is variational in the density matrix: $E [P]$ has a local minimum at the converged density $P^{*}$ , so $\partial E / \partial P = 0$ there. Near convergence, $Δ E_{k} = E_{k} - E_{k - 1}$ is second-order in $∥ P_{k} - P^{*} ∥$ — the energy can be tiny while the density is still substantially off. This is a real problem: $∣Δ E ∣ < 1 0^{- 6}$ hartree might correspond to $∥Δ P ∥ \sim 1 0^{- 3}$ , which still produces wrong orbital coefficients for downstream uses (gradients, transition densities, embedding). Production codes always check both: $∣Δ E ∣ < 1 0^{- 7}$ and $∥Δ P ∥ < 1 0^{- 6}$ .

The HOMO-LUMO gap controls the asymptotic rate. The SCF iteration is, near convergence, a linear fixed-point iteration on the density: $P_{k + 1} - P^{*} \approx J (P_{k} - P^{*})$ where $J$ is the SCF Jacobian. The eigenvalues of $J$ depend on inverse orbital-energy differences: $(ϵ_{i} - ϵ_{a})^{- 1}$ where $i$ indexes occupied orbitals and $a$ indexes virtuals. The smallest difference — the HOMO-LUMO gap — produces the largest-magnitude Jacobian eigenvalue and the slowest converging component. Systems with large gaps (closed-shell organic molecules) converge in 5-10 iterations; systems with small gaps (transition metals, near-degenerate states) need 50+ or fail to converge without DIIS.

This is exactly the C1 result applied to the matrix case. The convergence rate $r = ∣ g^{'} (x^{*}) ∣$ generalizes to "spectral radius of the SCF Jacobian," and the SCF Jacobian's spectrum is what links convergence behavior to the system's electronic structure.