Cache Conflicts & Thrashing
What You'll Learn
- How cache set conflicts cause thrashing
- Why some access patterns are "cache-friendly"
- How associativity affects conflict behavior
- Measuring conflict misses through timing
Mental Model
When multiple memory addresses map to the same cache set, they compete for the same slots. If the cache is not fully associative, one address can evict another, causing repeated cache misses. This is called thrashing.
Experiment 1: Friendly Stride
#include <chrono>
#include <iostream>
#include <vector>
using Clock = std::chrono::high_resolution_clock;
using Duration = std::chrono::nanoseconds;
double measure_stride(std::vector<int>& data, size_t stride) {
volatile int sum = 0;
const int iterations = 1000000;
auto start = Clock::now();
for (int i = 0; i < iterations; ++i) {
size_t idx = (i * stride) % data.size();
sum += data[idx];
}
auto end = Clock::now();
auto elapsed = std::chrono::duration_cast<Duration>(end - start);
return static_cast<double>(elapsed.count()) / iterations;
}
int main() {
std::vector<int> data(1024 * 1024, 1); // 4 MB
// Friendly stride (cache line size)
double friendly = measure_stride(data, 64);
std::cout << "Friendly stride (64): " << friendly << " ns/access\n";
// Conflict stride (prime number that causes conflicts)
double conflict = measure_stride(data, 1024 * 17); // Large prime stride
std::cout << "Conflict stride: " << conflict << " ns/access\n";
return 0;
} Experiment 2: Prime Stride Pattern
Prime-number strides can cause many addresses to map to the same cache sets:
// Try different stride values
for (size_t stride = 64; stride <= 1024 * 1024; stride *= 2) {
double time = measure_stride(data, stride);
std::cout << "Stride " << stride << ": " << time << " ns/access\n";
} What to Measure
- ns/access: Time per access for different stride patterns
- Runtime explosion: When stride causes conflicts
- Stride vs performance: Plot to see conflict patterns
Expected Shape of Results
You should see:
- Friendly stride: Fast (~1-3 ns), good cache utilization
- Conflict stride: Slow (~50-200 ns), constant cache misses
- Pattern: Some strides cause dramatic slowdowns
Interpretation
Set conflicts: When multiple addresses map to the same cache set, they compete. In a direct-mapped cache (1-way), only one can be cached. In N-way associative caches, N addresses can coexist, but more than N causes evictions.
Associativity: Higher associativity reduces conflicts. A 4-way associative cache can hold 4 addresses that map to the same set before conflicts occur.
Common Pitfalls
- Prefetching: Hardware prefetchers can mask some conflict effects
- Working set size: Conflicts are more visible when working set exceeds cache
- Measurement noise: Use median of many runs
Tooling Upgrades (Optional)
Linux: perf stat
perf stat -e cache-references,cache-misses ./your_program shows conflict miss rates.
Checklist
- ✓ Measured friendly vs conflict stride performance
- ✓ Identified stride patterns that cause conflicts
- ✓ Understood set conflicts and associativity