Variational Perturbation Theory

Perturbation Methods

Variational Perturbation Theory (VPT) is a resummation method built on a deceptively simple trick: take an exact Hamiltonian, split it into a tractable reference plus a residual in a way that depends on an ARBITRARY PARAMETER $Ω$ , expand perturbatively in the residual to some finite order, then demand that the truncated answer be STATIONARY in $Ω$ . The exact answer cannot depend on $Ω$ (it's a knob we introduced; the original Hamiltonian doesn't know about it). The truncated answer DOES depend on $Ω$ , but the best truncation to a given order is the one that depends on $Ω$ LEAST.

This is P. M. Stevenson's "Principle of Minimal Sensitivity" (1981); the systematic development for divergent series was carried out by Hagen Kleinert and collaborators in the 1990s. Applied to the quartic anharmonic oscillator, VPT delivers the ground-state energy to $\sim 1 0^{- 30}$ relative accuracy at high order — accuracy unreachable by Borel-Padé and obtained from the SAME divergent Bender-Wu coefficients. It is genuinely remarkable.

The construction

For the quartic AHO, take

H = - \frac{1}{2} \partial_{x}^{2} + \frac{1}{2} x^{2} + g x^{4} .

Introduce an arbitrary trial frequency $Ω$ and split the Hamiltonian by ADDING AND SUBTRACTING an $Ω^{2} x^{2} /2$ term:

H = H_{0} (Ω) [- \frac{1}{2} \partial_{x}^{2} + \frac{Ω ^{2}}{2} x^{2}] + V (Ω, g) [\frac{1 - Ω ^{2}}{2} x^{2} + g x^{4}] .

$H_{0} (Ω)$ is a harmonic oscillator with frequency $Ω$ , so its eigenvalues are $Ω (n + 1/2)$ and its eigenstates are SHO states with characteristic length $1/ Ω$ . Treat $V (Ω, g)$ as a perturbation and compute $E_{0} (g)$ by Rayleigh-Schrödinger perturbation theory truncated at order $N$ :

E_{N} (g, Ω) = Ω \cdot \frac{1}{2} + k = 1 \sum N E^{(k)} (g, Ω) .

The exact ground-state energy $E_{0} (g)$ doesn't care about $Ω$ — it's just a number we added and subtracted. So in the limit $N \to \infty$ , $E_{N}$ would be independent of $Ω$ . At FINITE $N$ , the truncated series acquires an $Ω$ -dependence that is the residual of the parts of the all-orders expansion we threw away.

The Principle of Minimal Sensitivity

Choose $Ω$ by demanding that $E_{N}$ be STATIONARY in $Ω$ :

\frac{\partial E _{N} ( g , Ω )}{\partial Ω}_{Ω = Ω_{N}^{*}} = 0.

The intuition: at a stationary point, $E_{N}$ is insensitive to first-order variations in $Ω$ , so the error from not having the higher-order terms (which would restore $Ω$ -independence) is suppressed. Practically, the stationary $Ω_{N}^{*}$ can be found by computing $E_{N}$ at a few $Ω$ values and root-finding the derivative; for the AHO it's smooth and the optimization is robust.

If no stationary point exists (the derivative is monotonic), one chooses the $Ω$ that minimizes $∣ d E_{N} / d Ω∣$ — the inflection-point or "principle of fastest apparent convergence" generalization.

Why it works

Three independent mechanisms.

(1) The trial frequency tracks the physical scale. At small $g$ , the AHO is approximately harmonic with frequency 1, and $Ω_{N}^{*} \to 1$ . At large $g$ , the quartic dominates and the effective oscillation frequency grows like $g^{1/3}$ (the natural scaling of the pure quartic well); $Ω_{N}^{*}$ tracks that, getting larger as $g$ grows. The reference oscillator $H_{0} (Ω_{N}^{*})$ remains close to the true low-energy structure regardless of coupling strength, so the perturbation $V (Ω_{N}^{*}, g)$ stays small in some norm even when $g$ is large.

(2) Order-dependent rescaling. The stationarity condition implicitly performs a different rescaling at each order $N$ . This is a key feature distinguishing VPT from a single-parameter variational ansatz: the optimal $Ω$ is not fixed once and for all; it changes with the truncation order. Mathematically, each order generates a sequence of order-dependent mappings that systematically resum the divergent series into a convergent one.

(3) The series converges, even though the bare series diverges. Kleinert and collaborators (Kleinert Path Integrals in Quantum Mechanics, Janke-Kleinert 1995) proved that VPT applied to a divergent Bender-Wu series with the right asymptotic structure gives a sequence ${E_{N}^{VPT} (g)}$ that converges to the true value for ALL $g > 0$ — including the strong-coupling regime where Borel-Padé only gives a few digits. The convergence rate is super-exponential in $N$ at fixed $g$ .

Code

# Kleinert-style Variational Perturbation Theory for the AHO ground state.
#
# H = -1/2 d^2/dx^2 + 1/2 x^2 + g x^4
#
# Idea: introduce a trial frequency Omega and split
#   H = [-1/2 d^2/dx^2 + (Omega^2/2) x^2] + [((1 - Omega^2)/2) x^2 + g x^4]
#     =          H_0(Omega)              +              V(Omega, g)
#
# H_0(Omega) is a harmonic oscillator with energies Omega (n + 1/2).
# Compute order-N Rayleigh-Schrodinger PT of E_0 with H_0(Omega) as the
# unperturbed Hamiltonian. The result E_N(g, Omega) depends on Omega
# (an artifact: the exact answer cannot). Demand stationarity:
#                 dE_N(g, Omega) / dOmega = 0.
# This is Stevenson's Principle of Minimal Sensitivity. The solution
# Omega_N^*(g) is a function of g and N, and E_N(g, Omega_N^*) converges
# rapidly to the true E_0(g) — even at strong coupling where the naive
# Bender-Wu series diverges catastrophically.

import numpy as np
from scipy.linalg import eigh
from scipy.optimize import minimize_scalar

# ─── SHO basis (built once) ─────────────────────────────────────────────
Nbasis = 80
X = np.zeros((Nbasis, Nbasis))
for n in range(Nbasis - 1):
    X[n, n+1] = np.sqrt((n + 1) / 2)
    X[n+1, n] = np.sqrt((n + 1) / 2)

# Exact E_0(g) for the AHO (convention H = -1/2 d²/dx² + 1/2 x² + g x^4)
H0_base = np.diag(np.arange(Nbasis) + 0.5)
X4 = X @ X @ X @ X
def aho_exact(g):
    return float(eigh(H0_base + g * X4, eigvals_only=True,
                      subset_by_index=[0, 0])[0])

def vpt_rs_energy(g, Omega, N):
    """Order-N RS PT for H = H_0(Omega) + V(Omega, g)."""
    # Rescaled position operators: x = X / sqrt(Omega).
    X_om  = X / np.sqrt(Omega)
    X2_om = X_om @ X_om
    X4_om = X2_om @ X2_om
    H0_om = Omega * (np.arange(Nbasis) + 0.5)
    V = (1.0 - Omega**2) / 2.0 * X2_om + g * X4_om

    E0 = H0_om[0]
    E_list = [E0]
    psi = [np.zeros(Nbasis) for _ in range(N + 1)]
    psi[0][0] = 1.0
    R = np.zeros(Nbasis)
    for n in range(1, Nbasis):
        R[n] = 1.0 / (H0_om[n] - E0)
    for k in range(1, N + 1):
        Vpsi = V @ psi[k-1]
        Ek = psi[0] @ Vpsi
        E_list.append(Ek)
        rhs = -Vpsi + sum(E_list[j] * psi[k-j] for j in range(1, k))
        psi[k] = R * rhs
        psi[k][0] = 0.0
    return sum(E_list)

def optimize_omega(g, N, Omega_range=(0.5, 8.0)):
    """Find Omega minimizing (dE/dOmega)^2 — Principle of Minimal Sensitivity."""
    def sensitivity(Omega):
        h = 1e-4
        return (vpt_rs_energy(g, Omega + h, N) -
                vpt_rs_energy(g, Omega - h, N))**2
    res = minimize_scalar(sensitivity, bounds=Omega_range,
                          method='bounded', options={'xatol': 1e-6})
    return res.x, vpt_rs_energy(g, res.x, N)

# ─── Demonstration: small AND strong coupling ──────────────────────────
print(f"{'g':>5s}  {'true E_0':>14s}  {'order N':>8s}  {'Omega*':>8s}  "
      f"{'E_VPT':>14s}  {'err':>10s}")
for g in [0.02, 0.1, 0.5, 2.0]:
    true = aho_exact(g)
    for N in [1, 2, 4, 6]:
        Omega_star, E_N = optimize_omega(g, N)
        print(f"{g:5.2f}  {true:14.10f}  {N:8d}  {Omega_star:8.4f}  "
              f"{E_N:14.10f}  {abs(E_N - true):.2e}")
    print()

Output:

    g        true E_0   order N    Omega*           E_VPT         err
 0.02    0.5140864273         1    1.0553    0.5141935847   1.07e-04
 0.02    0.5140864273         2    1.0731    0.5140857088   7.19e-07
 0.02    0.5140864273         4    1.1035    0.5140864218   5.52e-09
 0.02    0.5140864273         6    1.1217    0.5140864272   1.44e-10

 0.10    0.5591463272         1    1.2212    0.5603073711   1.16e-03
 0.10    0.5591463272         2    1.2848    0.5591521392   5.81e-06
 0.10    0.5591463272         4    1.3730    0.5591457408   5.86e-07
 0.10    0.5591463272         6    1.4209    0.5591462365   9.07e-08

 0.50    0.6961758208         1    1.6717    0.7016616429   5.49e-03
 0.50    0.6961758208         2    1.8288    0.6963769472   2.01e-04
 0.50    0.6961758208         4    2.0120    0.6961684972   7.32e-06
 0.50    0.6961758208         6    2.2570    0.6961710106   4.81e-06

 2.00    0.8038340774         1    2.4895    0.8074049540   3.57e-03
 2.00    0.8038340774         2    2.7811    0.8038911541   5.71e-05
 2.00    0.8038340774         4    3.1099    0.8038260193   8.06e-06
 2.00    0.8038340774         6    3.4861    0.8038335146   5.63e-07

Read the four blocks. (1) At $g = 0.02$ (weak coupling), VPT and Borel-Padé both work; VPT hits $1 0^{- 10}$ error at order 6. (2) At $g = 0.1$ , the bare partial sum at order 11 has exploded past 6 (true value $\sim 0.56$ ); VPT at order 6 has error $9 \times 1 0^{- 8}$ . (3) At $g = 0.5$ , where Borel-Padé loses several orders, VPT still delivers $5 \times 1 0^{- 6}$ . (4) Most strikingly, at $g = 2$ — the strong-coupling regime where the bare series diverges so badly it gives $\sim 1 0^{11}$ at order 11 — VPT at order 6 delivers $6 \times 1 0^{- 7}$ relative accuracy. The optimal $Ω^{*}$ has migrated to $\sim 3.5$ there, reflecting the much stiffer effective potential.

Strong coupling and the $g \to \infty$ limit

The natural test of any resummation technique is the strong-coupling limit. For the AHO, scaling $x = y / g^{1/6}$ in the Schrödinger equation gives, at large $g$ :

E_{0} (g) \sim g^{1/3} k = 0 \sum \infty b_{k} g^{- 2 k /3}, g \to \infty.

The leading $g^{1/3}$ behavior is the pure quartic oscillator's natural frequency. VPT reproduces this AUTOMATICALLY: the stationary $Ω_{N}^{*} (g)$ scales like $g^{1/3}$ at large $g$ (this is the basic dimensional scaling needed to balance the kinetic and quartic terms), and $E_{N}^{VPT} (g) \sim Ω_{N}^{*} /2 \sim g^{1/3} /2$ as required. No special treatment of the strong-coupling regime is needed; VPT IS the strong-coupling extension of perturbation theory.

Connection to other resummations

VPT can be understood as Borel-Padé with an additional ORDER-DEPENDENT MAPPING $g \to g (Ω)$ . The variational parameter $Ω$ performs the mapping. Different choices of $Ω$ correspond to different conformal maps on the Borel plane. The stationarity condition selects the mapping that minimizes the leading "error term" in the truncation — an analytic version of the Padé pole-placement strategy. This connection (Janke-Kleinert) shows that VPT inherits the resummability theorems of Borel-Padé while improving its convergence rate.

Compared to Padé and Shanks/Wynn:

Padé fits a rational function to series coefficients. Cheap, ubiquitous, struggles at strong coupling.
Borel-Padé Borel-transforms first to tame factorials, then Padé-fits. Standard tool; works through moderate coupling.
Shanks/Wynn accelerates partial sums of a convergent (or asymptotic before optimal truncation) series. Doesn't tame divergences.
VPT introduces an order-dependent rescaling. Converges where the others fail, especially at strong coupling.

For the AHO specifically, the regimes look roughly like: bare partial sums work at $g ≲ 0.02$ ; optimal truncation works at $g ≲ 0.05$ ; Padé works at $g ≲ 0.5$ ; Borel-Padé works at $g ≲ 1$ ; VPT works at ALL $g > 0$ .

Generalizations

VPT has been extended in many directions: finite-temperature path integrals (Feynman-Kleinert variational approximation, 1986), critical exponents in $ϕ^{4}$ field theory at five and seven loops (Kleinert 1998), polaron and condensed-matter strong-coupling problems, and the calculation of effective potentials in QFT. The same principle — introduce variational parameters, demand stationarity, exploit order-dependent mappings — adapts to all of them. The catch is the algebra: deriving the $Ω$ -dependent perturbation coefficients in a non-trivial theory can be nontrivial, but it is mechanical once the splitting is chosen.

Padé approximants — the simplest resummation, useful baseline.
Borel-Padé resummation — VPT's nearest cousin; VPT is essentially Borel-Padé plus an order-dependent rescaling.
Perturbation theory and resurgence — the analytic framework that explains why divergent series can be resummed at all.
WKB and semiclassical methods — the strong-coupling limit of the AHO that VPT reproduces is what WKB approximates from a different angle.