Convergence: how fast does FEM get there?

Finite Elements

Chapters 3 and 5 each ended on the same observation: doubling the mesh resolution $N$ roughly quartered the error. That's not a coincidence; it's the headline theorem of finite-element analysis. This chapter states it cleanly, measures it on the visualizers we built, and shows what the log-log plot looks like when the theorem holds.


The theorem in one line

For piecewise-linear elements on a quasi-uniform mesh of size $h$, solving an elliptic problem with a smooth solution $u$, the finite-element solution $u_h$ satisfies:

$$\|u - u_h\|_{L^2} \le C\,h^2$$

The error in the $L^2$ norm is bounded by a constant times $h^2$. Halve $h$ and the error drops by a factor of 4. Quarter it and the error drops by a factor of 16. The rate is the slope on a log-log plot of error vs. $h$.

The $C$ in the theorem hides everything that isn't $h$: domain shape, problem regularity, the constant factor in the basis functions. None of those scale with the mesh, so on log-log axes they only shift the line up or down; the slope stays at 2.
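Written out, and assuming the bound is attained closely enough that the error behaves like $e(h) \approx C h^2$ (the symbol $e(h)$ is introduced here only for illustration), the two claims above are a line of arithmetic each:

```latex
\frac{e(h/2)}{e(h)} \approx \frac{C\,(h/2)^2}{C\,h^2} = \frac{1}{4},
\qquad
\log e(h) \approx \log C + 2 \log h .
```

The second form is why the constant only sets the intercept of the log-log line while the exponent sets its slope.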


Where the 2 comes from

The exponent matches the polynomial degree of the basis plus one. P1 (linear) elements: 1 + 1 = 2. P2 (quadratic): 2 + 1 = 3. Higher-order elements converge faster, but each unknown costs more work — the trade is built into the choice of element.

The intuition: a linear basis can interpolate a smooth function with error $O(h^2)$ pointwise. The Galerkin solution $u_h$ is the projection of $u$ onto the P1 space in a problem-specific norm; that projection can't be worse than the best interpolation. So the error inherits the interpolation rate.
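The interpolation half of that argument is easy to check numerically. The sketch below is an illustration only: the test function $\sin(\pi x)$, the unit interval, and the helper name `max_interp_error` are assumptions, not part of the visualizers. It samples the gap between the function and its piecewise-linear interpolant and compares successive mesh refinements.

```python
import math

def max_interp_error(n, samples_per_cell=50):
    """Largest sampled |u - I_h u| for u(x) = sin(pi x) on n equal cells of [0, 1]."""
    h = 1.0 / n
    worst = 0.0
    for i in range(n):
        x0 = i * h
        u0 = math.sin(math.pi * x0)
        u1 = math.sin(math.pi * (x0 + h))
        for k in range(samples_per_cell + 1):
            t = k / samples_per_cell            # local coordinate in [0, 1]
            interp = (1.0 - t) * u0 + t * u1    # linear interpolant at this point
            exact = math.sin(math.pi * (x0 + t * h))
            worst = max(worst, abs(exact - interp))
    return worst

errors = {n: max_interp_error(n) for n in (8, 16, 32, 64)}
# Each ratio compares a mesh with the one that has half the cell width.
ratios = [errors[n] / errors[2 * n] for n in (8, 16, 32)]
for n, r in zip((8, 16, 32), ratios):
    print(f"n={n:3d} -> n={2 * n:3d}: error ratio {r:.3f}")
```

Ratios close to 4 are what an $O(h^2)$ interpolation error predicts.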

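This is Céa's lemma in disguise. The full statement says $u_h$ is the best approximation of $u$ from the P1 space in the energy norm; up to constants, the energy norm and the $H^1$ norm see the same convergence rate for our problem. The extra power of $h$ in the $L^2$ norm comes from a separate duality argument (the Aubin-Nitsche trick).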
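As a sketch of that chain of inequalities, assume a symmetric, coercive problem so that Galerkin projection is exactly the energy-norm best approximation. The symbols $V_h$ (the P1 space), $I_h u$ (the nodal interpolant of $u$), and $\|\cdot\|_E$ (the energy norm) are notation introduced here for illustration:

```latex
\|u - u_h\|_E
  \;=\; \min_{v_h \in V_h} \|u - v_h\|_E
  \;\le\; \|u - I_h u\|_E
  \;\le\; C\,h\,|u|_{H^2} .
```

The first step is the best-approximation property, the second holds because $I_h u$ is one particular member of $V_h$, and the third is the standard interpolation estimate for gradients of P1 interpolants. The power of $h$ here is 1 rather than 2 because the energy norm measures gradients.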


Measuring it

Build a manufactured solution: pick $u$ first, then compute $f$ as the forcing by substituting $u$ into the PDE. Now the analytic solution is exact, so there's no model error to confuse the discretization error. Solve at several values of $h$, compute the $L^2$ error against the analytic $u$, plot.
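A minimal sketch of the recipe in 1D, under assumptions that are not taken from the visualizers: the model problem is $-u'' = f$ on $(0, 1)$ with zero boundary values, the manufactured solution is $u = \sin(\pi x)$ (so $f = \pi^2 \sin(\pi x)$), the load vector uses Simpson's rule per element, and the function names are hypothetical.

```python
import math

def solve_poisson_p1(n):
    """P1 FEM for -u'' = f on (0, 1), u(0) = u(1) = 0, on n >= 2 equal elements.
    Returns nodal values including the two boundary zeros."""
    h = 1.0 / n
    f = lambda x: math.pi ** 2 * math.sin(math.pi * x)  # forcing for u = sin(pi x)

    # Load vector: integrate f * hat function with Simpson's rule on each element.
    b = [0.0] * (n + 1)
    for e in range(n):
        xl, xr = e * h, (e + 1) * h
        fm = f(0.5 * (xl + xr))
        b[e] += h / 6.0 * (f(xl) + 2.0 * fm)       # hat of the left node
        b[e + 1] += h / 6.0 * (2.0 * fm + f(xr))   # hat of the right node

    # Stiffness matrix is tridiagonal, (1/h) * [-1, 2, -1]; solve with the Thomas algorithm.
    m = n - 1                      # number of interior unknowns
    diag = [2.0 / h] * m
    off = -1.0 / h
    rhs = b[1:n]
    for i in range(1, m):          # forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return [0.0] + u + [0.0]

def max_nodal_error(n):
    """Largest nodal gap between the FEM solution and the manufactured solution."""
    u_h = solve_poisson_p1(n)
    return max(abs(u_h[i] - math.sin(math.pi * i / n)) for i in range(n + 1))

for n in (8, 16, 32):
    print(n, max_nodal_error(n))
```

In this particular 1D setting the nodal values come out very accurate, so the discretization error lives mostly between the nodes. That is why the error should be measured as an integral over each element rather than read off at the nodes.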

For a piecewise-linear FEM solution, the $L^2$ error over an element of width $h$ is a quadrature problem. Use 4-point Gauss-Legendre per element in 1D, centroid rule per triangle in 2D; that is overkill for our smooth tests but cheap enough that we don't need to think about it.
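A sketch of the 1D case, assuming a uniform mesh on $[0, 1]$. The function name `l2_error` is hypothetical, and the nodal values fed to it are the exact values of $\sin(\pi x)$ at the nodes, used as a stand-in for a FEM solution so the example stays self-contained.

```python
import math

# 4-point Gauss-Legendre nodes and weights on the reference interval [-1, 1].
GL4_NODES = (-0.8611363115940526, -0.3399810435848563,
             0.3399810435848563, 0.8611363115940526)
GL4_WEIGHTS = (0.3478548451374538, 0.6521451548625461,
               0.6521451548625461, 0.3478548451374538)

def l2_error(nodal_values, exact):
    """L2 norm of (exact - u_h) on [0, 1], with u_h piecewise linear on a uniform mesh."""
    n = len(nodal_values) - 1
    h = 1.0 / n
    total = 0.0
    for e in range(n):
        xl = e * h
        ul, ur = nodal_values[e], nodal_values[e + 1]
        for xi, w in zip(GL4_NODES, GL4_WEIGHTS):
            t = 0.5 * (xi + 1.0)               # map [-1, 1] to [0, 1] inside the element
            u_h = (1.0 - t) * ul + t * ur      # piecewise-linear value at the quadrature point
            diff = exact(xl + t * h) - u_h
            total += 0.5 * h * w * diff * diff  # 0.5 * h is the Jacobian of the mapping
    return math.sqrt(total)

exact = lambda x: math.sin(math.pi * x)
errs = []
for n in (8, 16, 32, 64):
    nodal = [exact(i / n) for i in range(n + 1)]  # stand-in for FEM nodal values
    errs.append(l2_error(nodal, exact))
ratios = [errs[i] / errs[i + 1] for i in range(len(errs) - 1)]
print(ratios)
```

Any list of nodal values on a uniform mesh can be passed in place of the stand-in.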


See it

Switch between 1D and 2D, and between problems. The empirical slope is computed by linear regression on the log-log data. The dashed reference line is the theoretical slope.
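The slope fit itself is ordinary least squares on the logarithms. The sketch below is illustrative: the helper name `loglog_slope` is hypothetical, and the data are synthetic values generated from $0.6\,h^2$ rather than measurements from the plot.

```python
import math

def loglog_slope(hs, errors):
    """Least-squares slope of log(error) against log(h)."""
    xs = [math.log(h) for h in hs]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Synthetic data for illustration only: errors that follow 0.6 * h**2 exactly.
hs = [1.0 / n for n in (4, 8, 16, 32, 64, 128)]
errors = [0.6 * h ** 2 for h in hs]
slope = loglog_slope(hs, errors)
print(f"empirical slope = {slope:.2f}")
```

With real measurements the coarsest meshes often sit off the line, so it can be worth fitting only the finer half of the data and checking that the slope is stable.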

[Interactive plot: $L^2$ error against mesh size $h$ on log-log axes for N = 4, 8, 16, 32, 64, 128, with a dashed $O(h^2)$ reference line. Controls: dimension, problem. Readout: empirical slope ≈ 2.00, theoretical slope = 2.00.]

Things to read off the plot:


What about the energy norm?

A second standard norm for FEM error is the energy norm:

$$\|u - u_h\|_E = \left( \int_\Omega |\nabla (u - u_h)|^2 + (u - u_h)^2 \, dx \right)^{1/2}$$

This penalises errors in the gradient as well as in the value. The convergence rate is one lower: $O(h)$. The reason: the basis function gradients are constants per element, so the gradient error is $O(h)$ at element scale, accumulating to $O(h)$ in the integrated norm.
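The drop in rate can be seen on the gradient part of the error alone. The sketch below assumes the same illustrative setup as a 1D smooth test would use ($u = \sin(\pi x)$ on $[0, 1]$, uniform mesh) and uses the nodal interpolant as a stand-in for the FEM solution; the function name `gradient_error` is hypothetical.

```python
import math

# 4-point Gauss-Legendre nodes and weights on [-1, 1].
NODES = (-0.8611363115940526, -0.3399810435848563,
         0.3399810435848563, 0.8611363115940526)
WEIGHTS = (0.3478548451374538, 0.6521451548625461,
           0.6521451548625461, 0.3478548451374538)

def gradient_error(n):
    """sqrt of the integral of (u' - u_h')**2 for u = sin(pi x) and its P1 interpolant."""
    h = 1.0 / n
    total = 0.0
    for e in range(n):
        xl, xr = e * h, (e + 1) * h
        # The piecewise-linear function has one constant gradient per element.
        slope = (math.sin(math.pi * xr) - math.sin(math.pi * xl)) / h
        for xi, w in zip(NODES, WEIGHTS):
            x = xl + 0.5 * (xi + 1.0) * h
            diff = math.pi * math.cos(math.pi * x) - slope
            total += 0.5 * h * w * diff * diff
    return math.sqrt(total)

grad_errs = [gradient_error(n) for n in (8, 16, 32, 64)]
grad_ratios = [grad_errs[i] / grad_errs[i + 1] for i in range(len(grad_errs) - 1)]
print(grad_ratios)
```

Ratios close to 2 per halving of $h$ correspond to first-order convergence, one order below the rate seen for the values.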

Different norms see different rates. The $L^2$ rate is what most people quote because it's what you see when you compare $u_h$ to $u$ by eye: the values match. The energy norm matters in problems where derivatives are physically meaningful (stresses in elasticity, fluxes in heat transfer).


Where this is going

Two natural extensions live in the next chapters. The heat equation adds time, and the convergence rate in time is governed by the time-stepping scheme: forward Euler is first-order in $\Delta t$, Crank-Nicolson is second-order. Linear elasticity swaps the scalar PDE for a vector PDE; the same $O(h^2)$ convergence rate carries over for displacement, with $O(h)$ for stresses (the energy-norm story repeating).