Branch Prediction Basics
What You'll Learn
- Why branches are expensive when mispredicted
- How branch prediction works
- The cost of pipeline flushes
- Predictable vs unpredictable code patterns
Mental Model
CPUs use pipelines—instructions are broken into stages and multiple instructions are "in flight" at once. When the CPU encounters a branch (if/else, loop), it doesn't know which path to take until the condition is evaluated. But it can't wait—that would stall the pipeline.
So the CPU guesses which path to take. If it guesses wrong, the pipeline must be flushed and restarted from the correct path. This is expensive—typically 10-20 cycles on modern CPUs.
Why Branches Are Expensive When Mispredicted
When a branch is mispredicted:
- The CPU has been executing instructions from the wrong path
- Those instructions must be discarded (they're in the pipeline but shouldn't execute)
- The pipeline must be flushed
- Execution must restart from the correct path
- All the work done speculatively is wasted
This mispredict penalty is typically 10-20 cycles, depending on pipeline depth. On a 4 GHz CPU, that's 2.5-5 nanoseconds of wasted time per misprediction.
Mispredict Penalty Intuition
Think of it like this: the CPU pipeline is like an assembly line. When you realize you've been building the wrong product, you have to:
- Stop the line
- Clear out all the partially-built wrong products
- Restart with the correct product
- Wait for the line to fill up again
The deeper the pipeline, the more work is wasted, and the longer it takes to recover.
How Branch Prediction Works
Modern CPUs use sophisticated predictors:
- Static prediction: Simple heuristics (e.g., "backward branches are usually taken")
- Dynamic prediction: History-based (e.g., "this branch was taken the last 3 times")
- Pattern-based: Recognizes repeating patterns (e.g., "taken, not taken, taken, not taken...")
- Branch target buffer: Caches where branches go
Modern predictors achieve 95%+ accuracy on typical code. But when they fail, the penalty is severe.
Predictable vs Unpredictable Code
Predictable Patterns
- Loops with consistent iteration counts
- Branches that are almost always taken or almost never taken
- Patterns that repeat (e.g., every 4th iteration)
- Sorted data (enables predictable comparisons)
Unpredictable Patterns
- Random branches (50/50 probability)
- Data-dependent branches on unsorted data
- Complex patterns that don't fit predictor history
- Branches that change direction frequently
Speculation
When the CPU predicts a branch, it speculates—it starts executing instructions from the predicted path before knowing if the prediction is correct. This is why mispredictions are expensive: all that speculative work must be discarded.
However, speculation is essential for performance. Without it, the CPU would stall on every branch, waiting for the condition to be evaluated. Modern CPUs can have 100+ instructions in flight, many of them speculative.
Pipeline Flush
When a misprediction is detected (the branch condition is finally evaluated and the prediction was wrong), the CPU must:
- Mark all speculative instructions as invalid
- Flush the pipeline stages containing those instructions
- Restart fetching from the correct branch target
- Wait for the pipeline to fill again
This is why mispredictions are so expensive—you lose all the work that was in progress.
Common Pitfalls
- Assuming all branches are equal: Predictable branches have near-zero cost
- Over-optimizing: Most branches are predictable; only optimize the hot ones
- Ignoring data layout: Sorting data can make branches predictable
- Micro-benchmark artifacts: Small loops may be fully unrolled, eliminating branches
Tooling Upgrades (Optional)
Linux: perf stat
perf stat -e branches,branch-misses ./your_program shows branch miss rate.
branch-misses / branches gives the misprediction rate.
macOS: Instruments
Instruments "Counters" template can show branch prediction statistics on supported hardware.
Windows: ETW
ETW can capture branch prediction events. Use Windows Performance Analyzer to analyze branch behavior.
Checklist
- ✓ Understand why branches are expensive when mispredicted
- ✓ Know the typical mispredict penalty (10-20 cycles)
- ✓ Understand the difference between predictable and unpredictable branches
- ✓ Ready to measure branch prediction effects in experiments