Convergence: how fast does FEM get there?
Finite Elements
Chapters 3 and 5 each ended on the same observation: doubling the mesh resolution roughly quartered the error. That's not a coincidence. It's the headline theorem of finite-element analysis. This chapter states it cleanly, measures it on the visualizers we built, and shows what the log-log plot looks like when the theorem holds.
The theorem in one line
For piecewise-linear elements on a quasi-uniform mesh of size $h$, solving an elliptic problem with a smooth solution:

$$\|u - u_h\|_{L^2} \le C\,h^2$$

The error in the $L^2$ norm is bounded by a constant times $h^2$. Halve $h$ and the error drops by a factor of 4. Quarter it and the error drops by a factor of 16. The rate is the slope on a log-log plot of error vs. $h$.
The $C$ in the theorem hides everything that isn't $h$: domain shape, problem regularity, the constant factor in the basis functions. None of those scale with the mesh, so on log-log axes they only move the line up or down; the slope stays at 2.
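The factor-of-4 arithmetic can be turned around to estimate the exponent from two measurements. The sketch below assumes the error behaves like $C h^p$; the helper name `observed_order` and the constant `C = 0.7` are made up for illustration, and the errors are synthetic rather than measured.

```python
import math

def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Estimate p in err ~ C * h**p from two (h, error) pairs."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)

# Synthetic illustration only: errors generated from err = C * h**2
# with an arbitrary constant C. Real data would come from a solver.
C = 0.7
h1, h2 = 0.1, 0.05
p = observed_order(h1, C * h1**2, h2, C * h2**2)
print(f"observed order: {p:.3f}")
```

Because $C$ cancels in the ratio, the estimate does not depend on the constant, which is the same reason the constant only shifts the log-log line vertically.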
Where the 2 comes from
The exponent matches the polynomial degree of the basis plus one. P1 (linear) elements: 1 + 1 = 2. P2 (quadratic): 2 + 1 = 3. Higher-order elements converge faster, but each unknown costs more work — the trade is built into the choice of element.
The intuition: a linear basis can interpolate a smooth function with $O(h^2)$ error pointwise. The Galerkin solution $u_h$ is the projection of $u$ onto the P1 space in a problem-specific norm; that projection can't be worse than the best interpolation. So the error inherits the interpolation rate.
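This is Céa's lemma in disguise. The full statement says $u_h$ is the best approximation of $u$ in the energy norm; up to constants, the energy norm and the $H^1$ norm see the same convergence rate for our problem. The extra power of $h$ in the $L^2$ bound comes from a separate duality argument (the Aubin-Nitsche trick).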
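The interpolation rate behind this argument is easy to check numerically. The following is a minimal sketch, assuming the test function $\sin(\pi x)$ on $[0, 1]$ (chosen here for illustration, not taken from the earlier chapters); `interp_max_error` is a hypothetical helper name.

```python
import numpy as np

def interp_max_error(n_elements, u=lambda x: np.sin(np.pi * x)):
    """Max sampled error of the piecewise-linear interpolant of u on [0, 1]."""
    nodes = np.linspace(0.0, 1.0, n_elements + 1)
    # Sample 20 points per element; np.interp evaluates the
    # piecewise-linear interpolant through the nodal values.
    x = np.linspace(0.0, 1.0, 20 * n_elements + 1)
    return np.max(np.abs(u(x) - np.interp(x, nodes, u(nodes))))

errors = {n: interp_max_error(n) for n in (8, 16, 32, 64)}
# Each ratio compares a mesh with its once-refined version (h halved).
ratios = [errors[n] / errors[2 * n] for n in (8, 16, 32)]
print(ratios)
```

If the pointwise error is $O(h^2)$, each ratio should approach 4 as the mesh is refined.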
Measuring it
Build a manufactured solution: pick $u$ first, then compute $f$ as the forcing by applying the differential operator to it. Now the exact solution is known analytically, so there's no model error to confuse with the discretization error. Solve at several values of $h$, compute the $L^2$ error against the analytic $u$, and plot.
For a piecewise-linear FEM solution, the $L^2$ error over an element of width $h$ is a quadrature problem. Use 4-point Gauss-Legendre per element in 1D and the centroid rule per triangle in 2D. The 1D rule is overkill for our smooth tests but cheap enough that we don't need to think about it.
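The whole measurement loop fits in a short script. This is a sketch under stated assumptions, not the code behind the visualizers: the model problem $-u'' = f$ on $(0, 1)$ with zero boundary values, the manufactured solution $u = \sin(\pi x)$, and the helper names `solve_poisson_p1` and `l2_error` are all chosen here for illustration.

```python
import numpy as np

def solve_poisson_p1(n, f):
    """P1 FEM for -u'' = f on (0, 1), u(0) = u(1) = 0, n uniform elements."""
    h = 1.0 / n
    nodes = np.linspace(0.0, 1.0, n + 1)
    m = n - 1  # number of interior unknowns
    # Stiffness matrix on interior nodes: tridiagonal (-1, 2, -1) / h.
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    # Load vector: integrate f times each hat function, 2-point Gauss per element.
    gp, gw = np.polynomial.legendre.leggauss(2)
    b = np.zeros(n + 1)
    for e in range(n):
        for xi, w in zip(gp, gw):
            t = 0.5 * (xi + 1.0)            # local coordinate in [0, 1]
            x = nodes[e] + h * t
            b[e] += 0.5 * h * w * f(x) * (1.0 - t)   # left hat function
            b[e + 1] += 0.5 * h * w * f(x) * t       # right hat function
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(K, b[1:-1])
    return nodes, u

def l2_error(nodes, u_h, u_exact, n_gauss=4):
    """L2 norm of u_exact - u_h, using n_gauss-point Gauss-Legendre per element."""
    gp, gw = np.polynomial.legendre.leggauss(n_gauss)
    total = 0.0
    for e in range(len(nodes) - 1):
        h = nodes[e + 1] - nodes[e]
        for xi, w in zip(gp, gw):
            t = 0.5 * (xi + 1.0)
            x = nodes[e] + h * t
            uh = (1.0 - t) * u_h[e] + t * u_h[e + 1]  # linear on the element
            total += 0.5 * h * w * (u_exact(x) - uh) ** 2
    return np.sqrt(total)

# Manufactured solution (an assumption for this sketch): pick u, derive f = -u''.
u_exact = lambda x: np.sin(np.pi * x)
f = lambda x: np.pi**2 * np.sin(np.pi * x)

ns = [8, 16, 32, 64]
errs = [l2_error(*solve_poisson_p1(n, f), u_exact) for n in ns]
rates = [np.log2(errs[i] / errs[i + 1]) for i in range(len(errs) - 1)]
print(rates)
```

Each entry of `rates` compares a mesh with its once-refined version, so values close to 2 are what the theorem predicts. The dense solve keeps the sketch short; a banded or sparse solver would be the natural choice for larger meshes.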
See it
Switch between 1D and 2D, and between problems. The empirical slope is computed by linear regression on the log-log data. The dashed reference line is the theoretical slope.
Things to read off the plot:
- The data points sit on a straight line in log-log axes — that's the visual signature of a power-law convergence rate.
- The empirical slope, measured by least-squares fit on the computed errors, lands within rounding of 2.00. That's the theorem holding empirically.
- The empirical fit (solid accent line) and the reference line (dashed gray) are parallel — the constant differs between problems but the rate doesn't.
- For the higher-frequency 1D problem, the absolute error at fixed $h$ is much larger than for the lower-frequency one. The mesh has to resolve more oscillation. But the slope is still 2; only the constant moves.
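The empirical slope described above can be computed with an ordinary least-squares fit in log-log space. This is a minimal sketch: the function name `fit_loglog_slope` is hypothetical, and the data are synthetic (an exact $h^2$ law with small made-up wiggles), not errors read from the visualizer.

```python
import numpy as np

def fit_loglog_slope(h, err):
    """Least-squares fit of log(err) = slope * log(h) + log(C)."""
    slope, intercept = np.polyfit(np.log(h), np.log(err), 1)
    return slope, np.exp(intercept)

# Synthetic data for illustration: err = 0.4 * h**2 times small perturbations.
h = np.array([1 / 8, 1 / 16, 1 / 32, 1 / 64])
err = 0.4 * h**2 * np.array([1.02, 0.99, 1.01, 0.98])

slope, C = fit_loglog_slope(h, err)
print(f"slope = {slope:.2f}, constant = {C:.2f}")
```

The fitted slope is the convergence rate and the fitted constant sets the vertical position of the line, which is why two problems can share a slope while sitting at different heights.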
What about the energy norm?
A second standard norm for FEM error is the energy norm, written here in its $H^1$ form:

$$\|u - u_h\|_E^2 = \int_\Omega \left( |\nabla (u - u_h)|^2 + (u - u_h)^2 \right) dx$$

This penalises errors in the gradient as well as in the value. The convergence rate is one lower: $O(h)$. The reason: the basis function gradients are constant on each element, so the gradient error is $O(h)$ at element scale, accumulating to $O(h)$ in the integrated norm.
Different norms see different rates. The $L^2$ rate is what most people quote because it's what you see when you compare $u_h$ to $u$ by eye: the values match. The energy norm matters in problems where derivatives are physically meaningful (stresses in elasticity, fluxes in heat transfer).
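The gap between the two rates shows up even without a solver. The sketch below uses the piecewise-linear interpolant of $\sin(\pi x)$ as a stand-in for $u_h$ (an assumption made to keep the example short) and measures both the $L^2$ error and the $L^2$ norm of the gradient error, which is the part of the energy norm that dominates. The helper name `interpolant_errors` is hypothetical.

```python
import numpy as np

def interpolant_errors(n, u, du, n_gauss=4):
    """Return (L2 error, L2 error of the gradient) for the P1 interpolant of u."""
    nodes = np.linspace(0.0, 1.0, n + 1)
    vals = u(nodes)
    h = 1.0 / n
    gp, gw = np.polynomial.legendre.leggauss(n_gauss)
    l2_sq, grad_sq = 0.0, 0.0
    for e in range(n):
        slope = (vals[e + 1] - vals[e]) / h      # constant gradient on the element
        x = nodes[e] + 0.5 * h * (gp + 1.0)      # Gauss points mapped to the element
        uh = vals[e] + slope * (x - nodes[e])
        l2_sq += 0.5 * h * np.sum(gw * (u(x) - uh) ** 2)
        grad_sq += 0.5 * h * np.sum(gw * (du(x) - slope) ** 2)
    return np.sqrt(l2_sq), np.sqrt(grad_sq)

u = lambda x: np.sin(np.pi * x)
du = lambda x: np.pi * np.cos(np.pi * x)

ns = [8, 16, 32, 64]
results = [interpolant_errors(n, u, du) for n in ns]
l2_rates = [np.log2(results[i][0] / results[i + 1][0]) for i in range(len(ns) - 1)]
grad_rates = [np.log2(results[i][1] / results[i + 1][1]) for i in range(len(ns) - 1)]
print("L2 rates:  ", l2_rates)
print("grad rates:", grad_rates)
```

The value error should show rates near 2 and the gradient error rates near 1, matching the one-order drop described above.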
Where this is going
Two natural extensions live in the next chapters. The heat equation adds time, and the convergence rate in time is governed by the time-stepping scheme: forward Euler is first-order in $\Delta t$, Crank-Nicolson is second-order. Linear elasticity swaps the scalar PDE for a vector PDE; the same $O(h^2)$ convergence rate carries over for displacement, with $O(h)$ for stresses (the energy-norm story repeating).
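The time-stepping orders can be previewed on a scalar test problem before any PDE is involved. This sketch assumes the decay equation $y' = -\lambda y$ with $y(0) = 1$ as a stand-in for a single mode of the heat equation; the function name `final_error` and the parameter choices are illustrative only.

```python
import math

def final_error(n_steps, scheme, lam=1.0, T=1.0):
    """Error at time T for y' = -lam * y, y(0) = 1, using n_steps equal steps."""
    dt = T / n_steps
    if scheme == "forward_euler":
        growth = 1.0 - lam * dt
    elif scheme == "crank_nicolson":
        growth = (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Each step multiplies y by the same growth factor.
    return abs(growth**n_steps - math.exp(-lam * T))

def temporal_order(scheme, n_steps=20):
    """Observed order from halving the time step once."""
    return math.log2(final_error(n_steps, scheme) / final_error(2 * n_steps, scheme))

order_fe = temporal_order("forward_euler")
order_cn = temporal_order("crank_nicolson")
print(f"forward Euler: {order_fe:.2f}, Crank-Nicolson: {order_cn:.2f}")
```

The same halve-and-compare measurement used for $h$ applies to $\Delta t$: an observed order near 1 for forward Euler and near 2 for Crank-Nicolson.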