Roofline Model

What You'll Learn

Mental Model

The roofline model shows that kernel performance is limited by either peak compute throughput or memory bandwidth.

Which limit applies depends on operational intensity (operations per byte).

Operational Intensity

Operational intensity = Operations / Bytes transferred

Examples:
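As a rough illustration, the sketch below computes operational intensity for two common kernels. The function names are hypothetical. It assumes 4-byte (float32) elements and an idealized memory system in which each array element is transferred exactly once; real kernels usually move more data than this, so measured intensities tend to be lower.

```python
def operational_intensity(flops: float, bytes_moved: float) -> float:
    """Operations per byte transferred to or from memory."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def vector_add_intensity(n: int, bytes_per_elem: int = 4) -> float:
    # c[i] = a[i] + b[i]: one add per element, two loads and one store.
    flops = n
    bytes_moved = 3 * n * bytes_per_elem
    return operational_intensity(flops, bytes_moved)


def matmul_intensity(n: int, bytes_per_elem: int = 4) -> float:
    # C = A @ B for n x n matrices: 2 * n^3 operations (multiply + add).
    # Assumption: A and B are each read once and C is written once.
    flops = 2 * n**3
    bytes_moved = 3 * n * n * bytes_per_elem
    return operational_intensity(flops, bytes_moved)


if __name__ == "__main__":
    print(f"vector add:       {vector_add_intensity(1_000_000):.4f} ops/byte")
    print(f"matmul (n=1024):  {matmul_intensity(1024):.2f} ops/byte")
```

Under these assumptions, vector addition stays at 1/12 operations per byte regardless of size, while the matrix multiply intensity grows with n (n/6 for float32), which is why the two kernels tend to land on opposite sides of the roofline.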

Compute Roof vs Bandwidth Roof

Bandwidth roof: Performance = Bandwidth × Operational Intensity
Compute roof: Performance = Peak FLOPS (flat line)

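The roofline is the minimum of these two limits. The two roofs meet where Operational Intensity = Peak FLOPS / Bandwidth. Low intensity (below that point) → bandwidth-bound. High intensity (above it) → compute-bound.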
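The two roofs and their minimum can be sketched in a few lines. The function names and the machine numbers (10 TFLOP/s peak, 500 GB/s bandwidth) are illustrative assumptions, not figures for any specific device.

```python
def attainable_performance(intensity: float, peak_flops: float, bandwidth: float) -> float:
    """Roofline: the lower of the bandwidth roof and the compute roof.

    intensity  : operations per byte
    peak_flops : compute roof, in operations per second
    bandwidth  : memory bandwidth, in bytes per second
    """
    bandwidth_roof = bandwidth * intensity  # sloped line
    compute_roof = peak_flops               # flat line
    return min(bandwidth_roof, compute_roof)


def crossover_intensity(peak_flops: float, bandwidth: float) -> float:
    """Intensity at which the two roofs meet."""
    return peak_flops / bandwidth


if __name__ == "__main__":
    # Hypothetical machine, chosen only for illustration.
    PEAK = 10e12       # 10 TFLOP/s
    BANDWIDTH = 500e9  # 500 GB/s

    print("roofs meet at", crossover_intensity(PEAK, BANDWIDTH), "ops/byte")
    for oi in (0.25, 20.0, 100.0):
        perf = attainable_performance(oi, PEAK, BANDWIDTH)
        print(f"intensity {oi:>6}: at most {perf / 1e9:.0f} GFLOP/s")
```

On this hypothetical machine the roofs meet at 20 operations per byte, so a kernel at 0.25 operations per byte is capped by bandwidth, while one at 100 is capped by peak compute.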

How to Classify Kernels

From measured data:

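  1. Measure kernel performance (FLOPS): operations executed divided by runtime
  2. Measure memory traffic (bytes transferred) and the bandwidth achieved
  3. Calculate operational intensity (operations / bytes transferred)
  4. Compare to theoretical roofs
  5. Classify as bandwidth-bound if the intensity falls below the point where the two roofs meet, otherwise compute-bound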
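The steps above can be sketched as a single function. This is a minimal illustration, assuming the operation count, bytes transferred, and runtime come from a profiler or hardware counters, and that the peak and bandwidth values come from the hardware specification or a microbenchmark. All names and numbers here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RooflineResult:
    intensity: float      # operations per byte
    achieved: float       # measured operations per second
    roof: float           # attainable operations per second at this intensity
    fraction_of_roof: float
    bound: str            # "bandwidth-bound" or "compute-bound"


def classify_kernel(flops_executed: float, bytes_moved: float, runtime_s: float,
                    peak_flops: float, bandwidth: float) -> RooflineResult:
    # Step 1: measured performance.
    achieved = flops_executed / runtime_s
    # Steps 2-3: operational intensity from measured memory traffic.
    intensity = flops_executed / bytes_moved
    # Step 4: compare against the theoretical roofs.
    roof = min(peak_flops, bandwidth * intensity)
    # Step 5: classify by which roof is the lower one at this intensity.
    if bandwidth * intensity < peak_flops:
        bound = "bandwidth-bound"
    else:
        bound = "compute-bound"
    return RooflineResult(intensity, achieved, roof, achieved / roof, bound)


if __name__ == "__main__":
    # Hypothetical measurements for a streaming kernel on a hypothetical machine.
    result = classify_kernel(flops_executed=1e9, bytes_moved=12e9, runtime_s=0.03,
                             peak_flops=10e12, bandwidth=500e9)
    print(result.bound, f"{result.fraction_of_roof:.0%} of roof")
```

The fraction of the roof achieved is a useful secondary signal: a kernel far below its roof may have a bottleneck that the simple two-roof model does not capture, such as latency or poor occupancy.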

Checklist