“Know how to solve every problem that has been solved.” “What I cannot create, I do not understand.” — Richard Feynman

Convergence: how fast does FEM get there?

Finite Elements

What you need to know first (3 concepts, 3 layers)

The requisite-knowledge inventory for this page, bottom-up: the primitives at the base, combined upward until you reach what this page assumes. Skim the layers you already own; start wherever the ground gets unfamiliar.

- base: 1D Mesh
- L1: Weak Form
- L2: 1D Finite Elements
↳ you are here

Chapters 3 and 5 each ended on the same observation: doubling $N$ roughly quartered the error. That's the headline theorem of finite-element analysis. This chapter states it cleanly, measures it on the visualizers we built, and shows what the log-log plot looks like when the theorem holds.

The theorem in one line

For piecewise-linear elements on a quasi-uniform mesh of size $h$ , solving an elliptic problem with a smooth solution:

$$\| u - u_{h} \|_{L^{2}} \leq C h^{2} \| u \|_{H^{2}}$$

The error in the $L^{2}$ norm is bounded by a constant times $h^{2}$. Halve $h$ and the error drops by a factor of $2^{2} = 4$. Quarter it and the error drops by a factor of 16. The rate is the slope on a log-log plot of error vs. $h$.

The $C$ in the theorem hides everything that isn't $h$: domain shape, problem regularity, the constant factor in the basis functions. None of those scale with the mesh, so on log-log axes they only set the intercept of the line and leave its slope at 2.
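Taking logarithms makes the slope explicit. As a sketch, assume the bound is sharp, so that the error behaves like $C h^{2}$ (an assumption: the theorem itself only gives an upper bound). The symbols $e_{1}, e_{2}$ (errors measured at mesh sizes $h_{1}, h_{2}$) and $p$ (observed rate) are introduced here for illustration:

$$\log \| u - u_{h} \|_{L^{2}} \approx \log C + 2 \log h$$

$$p \approx \frac{\log (e_{1} / e_{2})}{\log (h_{1} / h_{2})}$$

With $h_{1} = 2 h_{2}$ and $e_{1} = 4 e_{2}$, this gives $p = \log 4 / \log 2 = 2$.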

Where the 2 comes from

The exponent matches the polynomial degree of the basis plus one. P1 (linear) elements: 1 + 1 = 2. P2 (quadratic): 2 + 1 = 3. Higher-order elements converge faster, but each unknown costs more work — the trade is built into the choice of element.
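Written out for elements of polynomial degree $p$, the standard estimate takes the following form. It assumes the solution is smooth enough ($u \in H^{p+1}$) and that the problem has the usual elliptic regularity; the symbol $p$ is introduced here for the degree:

$$\| u - u_{h} \|_{L^{2}} \leq C h^{p+1} \| u \|_{H^{p+1}}$$

Setting $p = 1$ recovers the $h^{2}$ bound stated earlier.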

The intuition: a linear basis can interpolate a smooth function with pointwise error $O (h^{2})$. The Galerkin solution $u_{h}$ is the projection of $u$ onto the P1 space in a problem-specific norm, the energy norm, and in that norm the projection can't be worse than the interpolant. So the error inherits an interpolation rate.

This is Céa's lemma in disguise. The full statement says $u_{h}$ is the best approximation of $u$ in the energy norm. That gives the energy-norm rate directly, which is $O (h)$ for P1 elements. The $L^{2}$ rate is one order higher, and the extra power of $h$ comes from a duality argument (the Aubin-Nitsche trick), not from Céa's lemma alone.
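The interpolation half of that argument is easy to check numerically. The sketch below is not the page's visualizer code. It assumes the domain $[0, 1]$ and the test function $\sin (π x)$, and the helper name `interpolation_max_error` is made up for illustration. It measures the largest pointwise gap between a smooth function and its piecewise-linear interpolant as the mesh is refined.

```python
import numpy as np

def interpolation_max_error(u, n_elements, samples_per_element=50):
    """Max pointwise error of the piecewise-linear interpolant of u on [0, 1]."""
    nodes = np.linspace(0.0, 1.0, n_elements + 1)
    # Sample densely so the error between nodes is seen, not just at the nodes
    x = np.linspace(0.0, 1.0, n_elements * samples_per_element + 1)
    # np.interp draws straight lines through the nodal values
    u_interp = np.interp(x, nodes, u(nodes))
    return np.max(np.abs(u(x) - u_interp))

u = lambda x: np.sin(np.pi * x)  # assumed smooth test function
errors = [interpolation_max_error(u, n) for n in (8, 16, 32, 64)]
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors)
print(ratios)  # ratios near 4 indicate O(h^2) interpolation error
```

Each doubling of the element count should cut the maximum error by roughly a factor of 4, which is the $O (h^{2})$ interpolation rate the intuition relies on.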

Measuring it

Build a manufactured solution: pick $u (x)$ first, then compute $f = - u^{''}$ as the forcing. The analytic solution is then known exactly, so no model error is mixed into the discretization error. Solve at several values of $N$, compute the $L^{2}$ error against the analytic $u$, and plot.

For a piecewise-linear FEM solution, the $L^{2}$ error over an element of width $h$ is a quadrature problem. Use 4-point Gauss-Legendre per element in 1D, centroid rule per triangle in 2D — overkill for our smooth tests but cheap enough that we don't need to think about it.
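The whole 1D measurement can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the code behind the visualizers: the domain is $[0, 1]$ with $u (0) = u (1) = 0$, the manufactured solution is $\sin (π x)$, the load vector uses 2-point Gauss quadrature, and the function names `solve_poisson_p1` and `l2_error` are hypothetical.

```python
import numpy as np

def solve_poisson_p1(f, n_elements):
    """P1 Galerkin solution of -u'' = f on [0, 1] with u(0) = u(1) = 0 (assumed BCs)."""
    nodes = np.linspace(0.0, 1.0, n_elements + 1)
    h = 1.0 / n_elements
    n_int = n_elements - 1  # number of interior unknowns
    # Stiffness matrix for hat functions: (1/h) * tridiag(-1, 2, -1)
    K = (2.0 * np.eye(n_int) - np.eye(n_int, k=1) - np.eye(n_int, k=-1)) / h
    # Load vector, integrated with 2-point Gauss on each element (an assumption)
    gp = np.array([-1.0, 1.0]) / np.sqrt(3.0)
    F = np.zeros(n_elements + 1)
    for e in range(n_elements):
        a, b = nodes[e], nodes[e + 1]
        x = 0.5 * (a + b) + 0.5 * h * gp
        F[e] += 0.5 * h * np.sum(f(x) * (b - x) / h)      # left hat function
        F[e + 1] += 0.5 * h * np.sum(f(x) * (x - a) / h)  # right hat function
    u_h = np.zeros(n_elements + 1)  # boundary values stay at zero
    u_h[1:-1] = np.linalg.solve(K, F[1:-1])
    return nodes, u_h

def l2_error(u_exact, nodes, u_h, n_gauss=4):
    """L2 norm of u_exact - u_h, using n_gauss-point Gauss-Legendre per element."""
    xi, w = np.polynomial.legendre.leggauss(n_gauss)  # rule on [-1, 1]
    total = 0.0
    for e in range(len(nodes) - 1):
        a, b = nodes[e], nodes[e + 1]
        half = 0.5 * (b - a)
        x = 0.5 * (a + b) + half * xi
        # Evaluate the piecewise-linear solution at the quadrature points
        uh_x = u_h[e] + (u_h[e + 1] - u_h[e]) * (x - a) / (b - a)
        total += half * np.sum(w * (u_exact(x) - uh_x) ** 2)
    return np.sqrt(total)

# Manufactured solution: pick u first, then f = -u''
u_exact = lambda x: np.sin(np.pi * x)
f = lambda x: np.pi**2 * np.sin(np.pi * x)

errors = []
for n in (8, 16, 32, 64):
    nodes, u_h = solve_poisson_p1(f, n)
    errors.append(l2_error(u_exact, nodes, u_h))
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print(errors)
print(rates)  # observed rate between successive meshes
```

The dense `np.eye` assembly keeps the sketch short; a sparse tridiagonal solver would be the natural choice at larger $N$.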

See it

Switch between 1D and 2D, and between problems. The empirical slope is computed by linear regression on the log-log data. The dashed reference line is the theoretical $O (h^{2})$ slope.

[Interactive plot: log-log $L^{2}$ error vs. $h$, with selectors for dimension and problem. Readout: empirical slope ≈ 2.00, theoretical slope = 2.00.]

Things to read off the plot:

- The data points sit on a straight line in log-log axes. That's the visual signature of a power-law convergence rate.
- The empirical slope, measured by least-squares fit on the computed errors, lands within rounding of 2.00. That's the theorem holding empirically.
- The empirical fit (solid accent line) and the reference $O (h^{2})$ line (dashed gray) are parallel: the constant $C$ differs between problems but the rate doesn't.
- For the higher-frequency 1D problem ($\sin (2 π x)$), the absolute error at fixed $N$ is much larger than for $\sin (π x)$. The mesh has to resolve more oscillation. But the slope is still 2; only the constant moves.
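The slope fit itself is a one-line regression once the errors are in hand. The sketch below uses synthetic data that follows an exact power law, $0.3 h^{2}$, chosen only to confirm that the fit recovers a known exponent. Those numbers are not measurements from the visualizer, and `empirical_slope` is a hypothetical helper name.

```python
import numpy as np

def empirical_slope(h, err):
    """Least-squares fit of log(err) = log(C) + p * log(h); returns (p, C)."""
    p, log_c = np.polyfit(np.log(h), np.log(err), 1)
    return p, np.exp(log_c)

# Synthetic, exact power-law data (NOT measured errors), used to check the fit
h = 1.0 / np.array([8, 16, 32, 64])
err = 0.3 * h**2

slope, C = empirical_slope(h, err)
print(slope, C)
```

On real data the same call takes the measured mesh sizes and errors. The fitted intercept gives the problem-dependent constant and the fitted slope gives the rate.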

What about the energy norm?

A second standard norm for FEM error is the energy norm:

$$\| u - u_{h} \|_{E}^{2} = \int (u^{'} - u_{h}^{'})^{2} \, dx$$

This penalises errors in the gradient rather than in the value. The convergence rate is one lower: $\| u - u_{h} \|_{E} = O (h)$. The reason: the gradient of a P1 function is constant on each element, so it can only match the smoothly varying $u^{'}$ to within $O (h)$ across an element. Squaring that pointwise error, integrating over the domain, and taking the square root leaves $O (h)$ in the integrated norm.
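The first-order rate can be seen with the same quadrature machinery. As a stand-in for $u_{h}$, this sketch uses the nodal interpolant of $\sin (π x)$ on $[0, 1]$, which has the same piecewise-constant gradient structure. That substitution, the domain, and the helper name `energy_error` are assumptions made for illustration.

```python
import numpy as np

def energy_error(du_exact, nodes, u_nodal, n_gauss=4):
    """Energy-norm error: sqrt of the integral of (u' - u_h')^2 over the mesh."""
    xi, w = np.polynomial.legendre.leggauss(n_gauss)  # rule on [-1, 1]
    total = 0.0
    for e in range(len(nodes) - 1):
        a, b = nodes[e], nodes[e + 1]
        half = 0.5 * (b - a)
        x = 0.5 * (a + b) + half * xi
        # A piecewise-linear function has a constant gradient on each element
        slope = (u_nodal[e + 1] - u_nodal[e]) / (b - a)
        total += half * np.sum(w * (du_exact(x) - slope) ** 2)
    return np.sqrt(total)

u = lambda x: np.sin(np.pi * x)
du = lambda x: np.pi * np.cos(np.pi * x)  # exact derivative of u

errors = []
for n in (8, 16, 32, 64):
    nodes = np.linspace(0.0, 1.0, n + 1)
    errors.append(energy_error(du, nodes, u(nodes)))
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print(errors)
print(rates)  # rates near 1 indicate O(h) in the energy norm
```

Doubling the element count should roughly halve the energy-norm error, one order slower than the $L^{2}$ rate.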

Different norms see different rates. The $L^{2}$ rate is what most people quote because it's what you see when you compare $u_{h}$ to $u$ by eye — the values match. The energy norm matters in problems where derivatives are physically meaningful (stresses in elasticity, fluxes in heat transfer).

Where this is going

Two natural extensions live in the next chapters. The heat equation adds time, and the convergence rate in time is governed by the time-stepping scheme: forward Euler is first-order in $\Delta t$, Crank-Nicolson is second-order. Linear elasticity swaps the scalar PDE for a vector PDE; the same $O (h^{2})$ convergence rate carries over for displacement, with $O (h)$ for stresses (the energy-norm story repeating).