Branch Prediction Basics

What You'll Learn

Why branches are expensive when mispredicted
How branch prediction works
The cost of pipeline flushes
Predictable vs unpredictable code patterns

Mental Model

CPUs use pipelines—instructions are broken into stages and multiple instructions are "in flight" at once. When the CPU encounters a branch (if/else, loop), it doesn't know which path to take until the condition is evaluated. But it can't wait—that would stall the pipeline.

So the CPU guesses which path to take. If it guesses wrong, the pipeline must be flushed and restarted from the correct path. This is expensive—typically 10-20 cycles on modern CPUs.

Why Branches Are Expensive When Mispredicted

When a branch is mispredicted:

The CPU has been executing instructions from the wrong path
Those instructions must be discarded (they're in the pipeline but shouldn't execute)
The pipeline must be flushed
Execution must restart from the correct path
All the work done speculatively is wasted

This mispredict penalty is typically 10-20 cycles, depending on pipeline depth. On a 4 GHz CPU, that's 2.5-5 nanoseconds of wasted time per misprediction.

Mispredict Penalty Intuition

Think of it like this: the CPU pipeline is like an assembly line. When you realize you've been building the wrong product, you have to:

Stop the line
Clear out all the partially-built wrong products
Restart with the correct product
Wait for the line to fill up again

The deeper the pipeline, the more work is wasted, and the longer it takes to recover.

How Branch Prediction Works

Modern CPUs use sophisticated predictors:

Static prediction: Simple heuristics (e.g., "backward branches are usually taken")
Dynamic prediction: History-based (e.g., "this branch was taken the last 3 times")
Pattern-based: Recognizes repeating patterns (e.g., "taken, not taken, taken, not taken...")
Branch target buffer: Caches where branches go

Modern predictors achieve 95%+ accuracy on typical code. But when they fail, the penalty is severe.

Predictable vs Unpredictable Code

Predictable Patterns

Loops with consistent iteration counts
Branches that are almost always taken or almost never taken
Patterns that repeat (e.g., every 4th iteration)
Sorted data (enables predictable comparisons)

Unpredictable Patterns

Random branches (50/50 probability)
Data-dependent branches on unsorted data
Complex patterns that don't fit predictor history
Branches that change direction frequently

Speculation

When the CPU predicts a branch, it speculates—it starts executing instructions from the predicted path before knowing if the prediction is correct. This is why mispredictions are expensive: all that speculative work must be discarded.

However, speculation is essential for performance. Without it, the CPU would stall on every branch, waiting for the condition to be evaluated. Modern CPUs can have 100+ instructions in flight, many of them speculative.

Pipeline Flush

When a misprediction is detected (the branch condition is finally evaluated and the prediction was wrong), the CPU must:

Mark all speculative instructions as invalid
Flush the pipeline stages containing those instructions
Restart fetching from the correct branch target
Wait for the pipeline to fill again

This is why mispredictions are so expensive—you lose all the work that was in progress.

Common Pitfalls

Assuming all branches are equal: Predictable branches have near-zero cost
Over-optimizing: Most branches are predictable; only optimize the hot ones
Ignoring data layout: Sorting data can make branches predictable
Micro-benchmark artifacts: Small loops may be fully unrolled, eliminating branches

Tooling Upgrades (Optional)

Linux: perf stat

perf stat -e branches,branch-misses ./your_program shows branch miss rate. branch-misses / branches gives the misprediction rate.

macOS: Instruments

Instruments "Counters" template can show branch prediction statistics on supported hardware.

Windows: ETW

ETW can capture branch prediction events. Use Windows Performance Analyzer to analyze branch behavior.

Checklist

✓ Understand why branches are expensive when mispredicted
✓ Know the typical mispredict penalty (10-20 cycles)
✓ Understand the difference between predictable and unpredictable branches
✓ Ready to measure branch prediction effects in experiments