Method of Characteristics

Pdes

The method of characteristics turns a first-order PDE into a family of ODEs along special curves in space-time. For LINEAR first-order equations, the solution is constant along each characteristic and the entire problem reduces to "follow the curves backwards from the point you care about." For QUASILINEAR equations, the characteristics still exist but their slopes depend on the solution itself — and the resulting curves can INTERSECT, producing shocks. This is the cleanest analytical setting for understanding how nonlinearity breaks the smoothness of PDE solutions.

Linear first-order PDEs

Consider

a (x, t) u_{x} + b (x, t) u_{t} = c (x, t, u) .

Parametrize a curve $(x (s), t (s))$ in the $x$ - $t$ plane by an arc-length-like parameter $s$ . The total derivative of $u$ along the curve is $d u / d s = (d x / d s) u_{x} + (d t / d s) u_{t}$ . If we CHOOSE the curve so that $d x / d s = a$ and $d t / d s = b$ , the PDE collapses to an ORDINARY differential equation along the curve:

\frac{d u}{d s} = c (x, t, u) .

These are the CHARACTERISTIC equations. Three ODEs in $(x, t, u)$ as functions of $s$ , parameterized by where the curve starts at the initial-data surface. To solve the PDE: pick a point $(x^{*}, t^{*})$ where you want $u$ ; trace the characteristic backwards until it hits the initial-data surface; the initial value of $u$ there gets transported (modified by the integral of $c$ along the curve) up to $(x^{*}, t^{*})$ .

The transport equation

The simplest example. The linear transport (advection) equation is

u_{t} + c u_{x} = 0,

with constant speed $c$ and initial data $u (x, 0) = u_{0} (x)$ . The characteristic equations are $d x / d s = c$ , $d t / d s = 1$ , $d u / d s = 0$ . So $x = x_{0} + c t$ (straight lines of slope $c$ in the $x$ - $t$ plane), and $u$ is CONSTANT along each characteristic. The solution is just translation:

u (x, t) = u_{0} (x - c t) .

The geometric content: information moves to the right at speed $c$ . The DOMAIN OF DEPENDENCE of $(x, t)$ is the single point $x - c t$ on the initial data — finite speed of propagation is built in. The transport equation is the cleanest hyperbolic equation; everything else in the analytical theory builds on it.

Quasilinear equations: characteristics with state-dependent slope

Now let the coefficient $a$ in the PDE depend on the solution itself. A quasilinear scalar first-order equation has the form

u_{t} + a (u) u_{x} = 0.

The characteristic equations become $d x / d s = a (u)$ , $d t / d s = 1$ , $d u / d s = 0$ . The third equation still says $u$ is constant along each characteristic. The FIRST equation says the slope of each characteristic equals $a (u_{0} (x_{0}))$ — a constant determined by where the characteristic started. So the characteristics are STILL STRAIGHT LINES, but each one has a DIFFERENT slope determined by the initial value at its starting point.

This is the root of all hyperbolic nonlinearity. Where $u_{0} (x)$ increases in the direction of motion, the characteristics FAN OUT; where $u_{0}$ decreases, they CONVERGE and eventually intersect. At the intersection point the solution becomes multivalued — at which point the classical PDE has broken down and we need WEAK SOLUTIONS.

Burgers' equation and shock formation

Take $a (u) = u$ :

u_{t} + u u_{x} = 0, u (x, 0) = u_{0} (x) .

The inviscid Burgers equation. Each characteristic is a straight line in the $x$ - $t$ plane: $x = x_{0} + u_{0} (x_{0}) t$ . Two characteristics from $x_{1} < x_{2}$ with $u_{0} (x_{1}) > u_{0} (x_{2})$ have the faster one starting LEFT of the slower one and CATCHING UP. They meet at some finite time.

The time of first intersection is computed implicitly. The characteristic from $x_{0}$ is $x = x_{0} + u_{0} (x_{0}) t$ . If $u_{0}^{'} (x_{0}) < 0$ , neighboring characteristics from $x_{0} + d x$ have nearby slope $u_{0} (x_{0}) + u_{0}^{'} (x_{0}) d x$ — slower than the one from $x_{0}$ — so they intersect at

t^{*} = - \frac{1}{u _{0}^{'} ( x _{0} )}, (assuming u_{0}^{'} (x_{0}) < 0) .

The first (earliest) shock forms at the value of $x_{0}$ that MINIMIZES $t^{*}$ — equivalently MAXIMIZES $∣ u_{0}^{'} (x_{0}) ∣$ with the right sign. For the initial condition $u_{0} (x) = sin (2 π x)$ , the most-negative slope is $u_{0}^{'} (0) = 2 π$ evaluated at the right zero... actually let me redo: $u_{0}^{'} (x) = 2 π cos (2 π x)$ , which is most negative at $x = 1/2$ with value $- 2 π$ . So $t^{*} = 1/ (2 π) \approx 0.159$ .

At and after $t = t^{*}$ , the smooth classical solution no longer exists — characteristics overlap, the would-be solution is multivalued, and we need a WEAK solution: a function (typically with a discontinuity) that satisfies the integral form of the conservation law.

The Rankine-Hugoniot jump condition

Write Burgers in conservation form: $u_{t} + (u^{2} /2)_{x} = 0$ . Suppose a shock at position $x_{s} (t)$ separates left value $u_{L}$ and right value $u_{R}$ . The integral form of the conservation law (no fluxes across an infinitesimal control volume around the shock) forces:

\frac{d x _{s}}{d t} = \frac{f ( u _{L} ) - f ( u _{R} )}{u _{L} - u _{R}} = \frac{u _{L} + u _{R}}{2}, for f (u) = u^{2} /2.

The shock speed is the AVERAGE of the values on either side — for Burgers specifically. The general statement $d x_{s} / d t = [f] / [u]$ (jump of flux over jump of conserved variable) is the Rankine-Hugoniot relation, applicable to all scalar conservation laws.

The entropy condition

Rankine-Hugoniot allows MULTIPLE weak solutions for the same initial data — both "compressive" shocks (where characteristics converge into the shock) and "expansive" shocks (where they diverge from it). The physical solution is the one satisfying an entropy condition:

u_{L} \geq u_{R} (Lax-Oleinik for Burgers) .

Characteristics must FLOW INTO the shock from both sides, not out of it. Equivalently: the shock must be a limit of the vanishing-viscosity solution to $u_{t} + u u_{x} = ϵ u_{xx}$ as $ϵ \to 0^{+}$ . The viscous regularization picks out the entropy-admissible shock; the smooth-but-violating ones are mathematical ghosts. Physically: shocks are irreversible, like the heat equation's time-asymmetry.

Code: advection and Burgers in flight

# Method of characteristics — two demonstrations.
#   (1) Linear advection u_t + c u_x = 0 — pulse just translates.
#   (2) Burgers' equation  u_t + u u_x = 0 — characteristics intersect,
#       and a shock forms at t = 1/(2 pi) for initial data sin(2 pi x).

import numpy as np

# ─── Linear advection: u_t + c u_x = 0, periodic ───────────────────────
# Exact solution: u(x, t) = u_0(x - c t).
# Numerical: first-order upwind FD (c > 0).
N = 401
x = np.linspace(0, 1, N, endpoint=False)
dx = x[1] - x[0]
u0_adv = np.exp(-((x - 0.3) / 0.05)**2)

def advection_solve(u_init, T, c=1.0, cfl=0.7):
    dt = cfl * dx / abs(c)
    nsteps = int(np.ceil(T / dt))
    dt = T / nsteps
    r  = c * dt / dx
    u  = u_init.copy()
    for _ in range(nsteps):
        u_new = u.copy()
        u_new[1:] = u[1:] - r * (u[1:] - u[:-1])
        u_new[0]  = u[0]  - r * (u[0]  - u[-1])
        u = u_new
    return u

T = 0.4
u_adv  = advection_solve(u0_adv, T, c=1.0)
exact  = np.exp(-((x - 0.7) / 0.05)**2)     # u_0(x - 1.0 * 0.4)
print(f"Linear advection (c = 1, T = {T}):")
print(f"  Initial pulse centred at x = 0.3 → exact centre x = 0.7")
print(f"  Peak position in numerical solution: x = {x[np.argmax(u_adv)]:.3f}")
print(f"  Peak value (exact = 1.0): {u_adv.max():.4f}  "
      f"(upwind dissipation damps it)")

# ─── Burgers' equation: u_t + u u_x = 0, periodic ──────────────────────
# Initial data u_0(x) = sin(2 pi x).
# Characteristic from (x_0, 0) is the line x = x_0 + u_0(x_0) t.
# Slope of characteristic depends on initial value → characteristics intersect
# when 1 + u_0'(x_0) t = 0 at the steepest-descent point of u_0.
# For sin(2 pi x), u_0' = 2 pi cos(2 pi x), most negative at x_0 = 0
# with value -2 pi → first shock at t* = 1/(2 pi) ~ 0.1592.

def burgers_solve(u_init, T, cfl=0.4):
    """Lax-Friedrichs for the conservative form (u^2/2)_x."""
    dt = cfl * dx / max(np.max(np.abs(u_init)), 1e-6)
    nsteps = int(np.ceil(T / dt))
    dt = T / nsteps
    u = u_init.copy()
    for _ in range(nsteps):
        F = 0.5 * u**2
        F_avg = 0.5 * (F[:-1] + F[1:]) - 0.5 * (dx / dt) * (u[1:] - u[:-1])
        u_new = u.copy()
        u_new[1:-1] = u[1:-1] - (dt / dx) * (F_avg[1:] - F_avg[:-1])
        # Periodic flux at the seam
        F_seam = 0.5 * (F[-1] + F[0]) - 0.5 * (dx / dt) * (u[0] - u[-1])
        u_new[0]  = u[0]  - (dt / dx) * (F_avg[0] - F_seam)
        u_new[-1] = u[-1] - (dt / dx) * (F_seam   - F_avg[-1])
        u = u_new
    return u

u0_b = np.sin(2 * np.pi * x)
print(f"\nBurgers u_t + u u_x = 0 with u_0(x) = sin(2 pi x):")
print(f"  Theoretical shock time t* = 1/(2 pi) = {1/(2*np.pi):.4f}")
print(f"  Numerical |du/dx|_max at successive times:")
for T in [0.05, 0.10, 0.15, 0.20]:
    u_b = burgers_solve(u0_b, T)
    grad_max = np.max(np.abs(np.diff(u_b)) / dx)
    state = "pre-shock " if T < 1/(2*np.pi) else "post-shock"
    print(f"    t = {T:.2f}:  |du/dx|_max = {grad_max:6.1f}   ({state})")

Output:

Linear advection (c = 1, T = 0.4):
  Initial pulse centred at x = 0.3 → exact centre x = 0.7
  Peak position in numerical solution: x = 0.701
  Peak value (exact = 1.0): 0.8995  (upwind dissipation damps it)

Burgers u_t + u u_x = 0 with u_0(x) = sin(2 pi x):
  Theoretical shock time t* = 1/(2 pi) = 0.1592
  Numerical |du/dx|_max at successive times:
    t = 0.05:  |du/dx|_max =    9.0   (pre-shock )
    t = 0.10:  |du/dx|_max =   15.4   (pre-shock )
    t = 0.15:  |du/dx|_max =   37.5   (pre-shock )
    t = 0.20:  |du/dx|_max =  106.3   (post-shock)

The linear advection test confirms the characteristic prediction at the integer level (peak ends up where the characteristic says it should), but the upwind scheme adds numerical diffusion that damps the peak from 1.0 to 0.90 over $T = 0.4$ . This is a well-known feature of first-order upwind discretization; it is the price paid for stability, and dedicated hyperbolic-PDE schemes (Lax-Wendroff, MUSCL, WENO) use higher-order reconstruction to control it.

The Burgers run is more dramatic. The maximum gradient grows monotonically with time, exactly as the characteristic-intersection analysis predicts: at $t = 0.05$ it's 9, at $t = 0.10$ it's 15, at $t = 0.15$ (just before $t^{*} = 0.159$ ) it's 37, and at $t = 0.20$ (just after) it has jumped to 106 — the Lax-Friedrichs scheme is now resolving a near-discontinuous shock. Higher-resolution numerical schemes can sharpen the shock further; the key qualitative phenomenon (smoothness breaks down at the predicted time) is captured cleanly.

Where this leads

The method of characteristics is the analytical foundation for hyperbolic PDEs in general. Two big extensions:

Systems of conservation laws. $\partial_{t} u + \partial_{x} f (u) = 0$ for vector-valued $u$ . The characteristics generalize to families of CHARACTERISTIC FIELDS associated to the eigenvalues of the Jacobian $\partial f / \partial u$ . Examples: shallow-water equations, Euler equations of compressible flow, MHD. The Riemann problem (decay of a single jump discontinuity) is the building block of all upwind numerical methods.
Eikonal and Hamilton-Jacobi equations. $∣\nabla u ∣^{2} = n (x)^{2}$ in geometric optics, $u_{t} + H (\nabla u, x) = 0$ in classical mechanics and optimal control. Characteristics are the BICHARACTERISTIC CURVES (rays in optics, classical trajectories in mechanics). The method of characteristics is the bridge between PDE and ODE descriptions: characteristics for a Hamilton-Jacobi equation are precisely the integral curves of the corresponding Hamiltonian flow.

The three classical equations — the wave equation as the prototypical hyperbolic example; characteristics there are straight lines of slope $\pm c$ .
WKB and semiclassical methods — WKB is the method of characteristics applied to the eikonal equation; the rays are the characteristic curves.
Differential equations — the ODE machinery used to integrate along characteristics.
Green's functions — for linear equations, the same job is done by integrating the Green's function against the initial data.