Separation of Variables

Pdes

Separation of variables is the workhorse analytical technique for linear PDEs on simple domains. The idea is almost shockingly simple: GUESS that the solution factors as $u (x, t) = X (x) T (t)$ (or a product in more variables), substitute, and observe that the PDE separates into ordinary differential equations — one for each independent variable — coupled through a separation constant. Solve the resulting eigenvalue problem in space, the matching ODE in time, and SUM the family of products. The resulting series — Fourier, Bessel, Legendre, etc., depending on the geometry — is the analytical solution.

The technique works whenever the equation is LINEAR with constant or separable coefficients, the domain is the right shape (interval, rectangle, disk, ball, sphere, cylinder), and the boundary conditions are HOMOGENEOUS (or can be made so by subtracting an easy auxiliary solution). Inhomogeneous source terms can be handled by Duhamel's principle once the homogeneous problem is solved. The two ingredients are STURM-LIOUVILLE THEORY for the spatial eigenvalue problem and ELEMENTARY ODEs for the time evolution.

The basic recipe

Consider the heat equation on the unit interval with Dirichlet boundary conditions:

u_{t} = u_{xx}, x \in (0, 1), t > 0; u (0, t) = u (1, t) = 0; u (x, 0) = u_{0} (x) .

Step 1: separate. Assume $u (x, t) = X (x) T (t)$ . Substitute:

X (x) T^{'} (t) = X^{''} (x) T (t) ⟹ \frac{T ^{'} ( t )}{T ( t )} = \frac{X ^{''} ( x )}{X ( x )} .

The left side depends only on $t$ , the right only on $x$ . The only way for them to be equal as functions of independent variables is to both be CONSTANT. Call that constant $- λ$ (sign chosen for forthcoming convenience):

\frac{X ^{''} ( x )}{X ( x )} = - λ, \frac{T ^{'} ( t )}{T ( t )} = - λ .

Step 2: solve the spatial eigenvalue problem. The boundary conditions translate to $X (0) = X (1) = 0$ . The ODE $X^{''} + λ X = 0$ on $[0, 1]$ with Dirichlet BCs is a STURM-LIOUVILLE problem whose eigenvalues and eigenfunctions are:

λ_{n} = (nπ)^{2}, X_{n} (x) = sin (nπ x), n = 1, 2, 3, \dots

Step 3: solve the time ODE. For each $λ_{n}$ , the time ODE is $T_{n}^{'} = - λ_{n} T_{n}$ , with solution $T_{n} (t) = e^{- λ_{n} t}$ .

Step 4: superpose. By linearity, any sum of the product solutions $X_{n} (x) T_{n} (t)$ is also a solution. The general solution is therefore

u (x, t) = n = 1 \sum \infty b_{n} sin (nπ x) e^{- (nπ)^{2} t} .

Step 5: match initial data. At $t = 0$ we need $\sum_{n} b_{n} sin (nπ x) = u_{0} (x)$ — i.e. the Fourier-sine series of $u_{0}$ . Using orthogonality of the sine eigenfunctions:

b_{n} = 2 \int_{0}^{1} u_{0} (x) sin (nπ x) d x .

Done. The solution to the heat equation is uniquely determined by the initial profile, decomposed onto the eigenfunctions of the spatial operator, with each mode decaying exponentially in time at its own rate.

What's actually going on: Sturm-Liouville theory

The reason separation works is that the spatial differential operator (here $- d^{2} / d x^{2}$ ) is SELF-ADJOINT under the standard $L^{2}$ inner product, with appropriate boundary conditions. Self-adjoint operators have a complete orthonormal basis of eigenfunctions, and any function can be expanded in that basis. The PDE in time then decouples into independent ODEs for each Fourier coefficient.

More generally, the spatial operator can be any Sturm-Liouville problem

- \frac{d}{d x} [p (x) X^{'}] + q (x) X = λ w (x) X,

with positive weight $w (x) > 0$ and appropriate boundary conditions. The classical theorem: such a problem has a countably infinite set of real eigenvalues $λ_{1} < λ_{2} < \dots \to \infty$ with eigenfunctions orthogonal under the $w$ -weighted inner product, and the eigenfunctions are COMPLETE in $L^{2} ([a, b]; w d x)$ . Every reasonable function on the interval has a convergent eigenfunction expansion.

The classical orthogonal families — Fourier sines/cosines, Legendre polynomials, Hermite, Chebyshev, Bessel functions — are all Sturm-Liouville eigenfunctions of specific operators and weights. Spherical harmonics arise from separating the Laplacian in spherical coordinates: the angular dependence becomes an eigenvalue problem whose solutions are $Y_{ℓ}^{m} (θ, ϕ)$ . Bessel functions arise from cylindrical coordinates. EACH coordinate system that separates a particular PDE has its own family of special functions; the family is determined entirely by the spatial operator and the geometry.

Code: Fourier solution vs finite differences

A concrete check. With initial condition $u_{0} (x) = x (1 - x)$ on $[0, 1]$ , the Fourier coefficients integrate cleanly:

b_{n} = 2 \int_{0}^{1} x (1 - x) sin (nπ x) d x = {8/ (nπ)^{3} 0 n odd n even.

So the analytical solution is

u (x, t) = odd n \sum \infty \frac{8}{( nπ ) ^{3}} sin (nπ x) e^{- (nπ)^{2} t} .

The code computes this sum, compares against a finite-difference reference solution, and tracks the leading-mode estimate $(8/ π^{3}) sin (π x) e^{- π^{2} t}$ (which dominates at late time, when higher modes have died).

# Separation of variables for the heat equation on [0, 1].
#
# u_t = u_xx,  u(0, t) = u(1, t) = 0,  u(x, 0) = x (1 - x).
#
# Separation ansatz: u(x, t) = X(x) T(t) yields X''/X = T'/T = -lambda.
# Spatial eigenvalue problem with Dirichlet BCs: X_n(x) = sin(n pi x),
# lambda_n = (n pi)^2.  Temporal ODE: T_n(t) = exp(-lambda_n t).
#
# Fourier coefficients for u_0(x) = x(1 - x):
#   b_n = 2 int_0^1 x(1-x) sin(n pi x) dx
#       = 8 / (n pi)^3  if n odd,  0 if n even.
#
# Compare the Fourier series against a finite-difference reference.

import numpy as np

# Fourier-series solution
def fourier_heat(x, t, M=50):
    u = np.zeros_like(x)
    for n in range(1, 2*M, 2):                  # odd n only
        b_n = 8.0 / (n * np.pi)**3
        u += b_n * np.sin(n * np.pi * x) * np.exp(-(n * np.pi)**2 * t)
    return u

# Finite-difference reference (FTCS, Dirichlet)
N  = 101
xg = np.linspace(0, 1, N)
dx = xg[1] - xg[0]
u0 = xg * (1 - xg)
def heat_fd(u_init, T, r=0.4):
    dt = r * dx**2
    nsteps = int(np.ceil(T / dt))
    dt = T / nsteps
    r = dt / dx**2
    u = u_init.copy()
    for _ in range(nsteps):
        u_new = u.copy()
        u_new[1:-1] = u[1:-1] + r * (u[2:] - 2*u[1:-1] + u[:-2])
        u_new[0] = u_new[-1] = 0.0
        u = u_new
    return u

# Run several times; compare to FD and to a one-mode estimate.
print(f"u_t = u_xx, u_0(x) = x(1-x), Dirichlet BCs.")
print(f"{'t':>8s}  {'Fourier u(0.5)':>18s}  {'FD u(0.5)':>14s}  "
      f"{'|diff|':>10s}  {'1-mode est':>14s}")
for t in [0.001, 0.01, 0.05, 0.1, 0.5]:
    u_f  = fourier_heat(xg, t, M=50)
    u_fd = heat_fd(u0, t)
    diff = np.max(np.abs(u_f - u_fd))
    one_mode = 8.0 / np.pi**3 * np.exp(-np.pi**2 * t)
    print(f"  t={t:.3f}  {u_f[N//2]:18.6f}  {u_fd[N//2]:14.6f}  "
          f"{diff:10.2e}  {one_mode:14.6f}")

Output:

u_t = u_xx, u_0(x) = x(1-x), Dirichlet BCs.
       t      Fourier u(0.5)       FD u(0.5)      |diff|      1-mode est
  t=0.001            0.248000        0.248000    5.66e-06        0.255478
  t=0.010            0.230002        0.230002    5.65e-06        0.233764
  t=0.050            0.157403        0.157395    8.43e-06        0.157516
  t=0.100            0.096162        0.096151    1.09e-05        0.096163
  t=0.500            0.001856        0.001855    1.05e-06        0.001856

Three things to read off. (1) Fourier and FD agree to $\sim 1 0^{- 5}$ across all sampled times — the small disagreement is FD discretization error (the FD solution has $O (d x^{2})$ error at every step, which accumulates over time-stepping; the Fourier-series solution is analytical and limited only by the truncation at $M = 50$ modes, which is effectively exact for this smooth initial condition). (2) The one-mode estimate is OFF by ~3% at $t = 0.001$ but is essentially EXACT by $t = 0.05$ . This is the central pedagogical content: higher Fourier modes die exponentially faster than lower ones (rate $\sim n^{2}$ ), so the long-time behavior is dominated by the first mode. (3) The amplitude drops by a factor of $e^{- π^{2} \cdot 0.5} \approx 7 \times 1 0^{- 3}$ by $t = 0.5$ — the system has essentially reached its equilibrium (zero, since both boundary values are zero).

The smoothing property — built into Fourier

Even discontinuous initial data become smooth instantly. Consider an initial condition with a discontinuity: its Fourier coefficients decay like $1/ n$ at large $n$ (slow decay; this is the Gibbs phenomenon's origin). At any $t > 0$ , each $b_{n}$ gets multiplied by $e^{- (nπ)^{2} t}$ — Gaussian decay in $n$ , which kills the high frequencies faster than any algebraic rate. So $u (x, t)$ for $t > 0$ is $C^{\infty}$ regardless of how rough $u_{0}$ was. This is the analytical face of the heat equation's smoothing property — the same fact established geometrically by the Gaussian heat-kernel Green's function.

Where it breaks: non-separable problems

Separation of variables requires three structural ingredients: linearity, separable coefficients, and a geometry compatible with the spatial operator. Generic 2D Laplace on an arbitrary curved boundary, for instance, does NOT separate in any standard coordinate system, and you must use Green's functions or numerical methods instead. The art of classical mathematical physics was in finding new coordinate systems and special functions in which previously-intractable separations could be performed (paraboloidal coordinates for the hydrogen Stark effect, oblate spheroidal for diffraction from a disc, etc.). The Schrödinger equation for hydrogen separates in spherical coordinates; for the helium atom it doesn't separate at all, which is why electronic-structure theory exists.

Where it generalizes: spectral methods

Separation of variables IS the analytical version of a SPECTRAL METHOD. Numerical spectral methods represent the solution in a basis of the natural eigenfunctions for the geometry (Fourier modes on periodic domains, Chebyshev on intervals, spherical harmonics on the sphere) and time-step the Fourier coefficients. The exponential convergence in $N$ that spectral methods enjoy for smooth solutions is the numerical face of the analytical fact that the truncation error of an eigenfunction expansion decays FASTER than any polynomial when the function is $C^{\infty}$ . Boyd's Chebyshev and Fourier Spectral Methods is the canonical reference; we'll add a dedicated spectral methods page once the foundations are in place.

The three classical equations — heat, wave, Laplace all separate in the same way on rectangular domains.
Green's functions — the eigenfunction expansion of the Green's function is the Sturm-Liouville completeness statement in disguise.
Infinite square well — the Sturm-Liouville problem here is the spatial part of separation for the time-independent Schrödinger equation; same eigenfunctions $sin (nπ x / L)$ arise.
WKB — the semiclassical limit of the same eigenvalue problem when the potential varies slowly.
Finite elements — a numerical generalization of expansion in basis functions, but with LOCAL piecewise-polynomial basis functions instead of GLOBAL Sturm-Liouville eigenfunctions.