Lock Contention
What You'll Learn
- Mutex vs spin vs atomic operations
- How contention affects throughput
- Why contention causes collapse
- Measuring synchronization overhead
Experiment: Mutex vs Spin vs Atomic
#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// steady_clock is monotonic, so measured intervals are not skewed by
// wall-clock adjustments (high_resolution_clock may alias system_clock).
using Clock = std::chrono::steady_clock;
using Duration = std::chrono::nanoseconds;

// Average wall-clock nanoseconds per increment when all threads share one mutex.
double benchmark_mutex(int num_threads) {
    std::mutex mtx;
    int counter = 0;
    const int iterations = 1000000;

    auto start = Clock::now();
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&]() {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(mtx);  // held for one increment
                counter++;
            }
        });
    }
    for (auto& th : threads) th.join();
    auto end = Clock::now();

    auto elapsed = std::chrono::duration_cast<Duration>(end - start);
    return static_cast<double>(elapsed.count()) / (num_threads * iterations);
}

// Average wall-clock nanoseconds per increment on a shared atomic counter.
double benchmark_atomic(int num_threads) {
    std::atomic<int> counter{0};
    const int iterations = 1000000;

    auto start = Clock::now();
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&]() {
            for (int i = 0; i < iterations; ++i) {
                counter++;  // atomic read-modify-write, no lock
            }
        });
    }
    for (auto& th : threads) th.join();
    auto end = Clock::now();

    auto elapsed = std::chrono::duration_cast<Duration>(end - start);
    return static_cast<double>(elapsed.count()) / (num_threads * iterations);
}

int main() {
    std::cout << "Threads\tMutex (ns)\tAtomic (ns)\n";
    // Thread counts 1, 2, 4, 8.
    for (int threads = 1; threads <= 8; threads *= 2) {
        double mutex_time = benchmark_mutex(threads);
        double atomic_time = benchmark_atomic(threads);
        std::cout << threads << "\t" << mutex_time << "\t" << atomic_time << "\n";
    }
    return 0;
}
What to Measure
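- Throughput: Operations per second vs thread count (the program prints nanoseconds per operation; aggregate throughput is 1e9 divided by that figure)
- Tail latency: P95/P99 latency of a single lock-and-increment (optional)
- Contention collapse: The thread count at which throughput stops scaling or starts to fall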
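The experiment program reports only an average in nanoseconds per operation. The following is a minimal sketch of how the other two quantities could be derived; the helper names `ops_per_second`, `percentile`, and `sample_mutex_latencies` are hypothetical and not part of the original listing. It assumes nearest-rank percentiles and times each lock-and-increment individually, which adds clock overhead to every sample, so the absolute numbers are only indicative.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Convert average nanoseconds per operation into operations per second.
double ops_per_second(double ns_per_op) {
    return 1e9 / ns_per_op;
}

// Nearest-rank percentile of an ascending-sorted sample; p is in [1, 100].
double percentile(const std::vector<double>& sorted, int p) {
    assert(!sorted.empty() && p >= 1 && p <= 100);
    std::size_t rank = (static_cast<std::size_t>(p) * sorted.size() + 99) / 100;
    return sorted[rank - 1];
}

// Hypothetical helper: time each lock + increment separately and return all
// per-operation latencies (in nanoseconds), sorted ascending.
std::vector<double> sample_mutex_latencies(int num_threads, int samples_per_thread) {
    using SteadyClock = std::chrono::steady_clock;
    std::mutex mtx;
    long counter = 0;
    // One sample buffer per thread, so recording a sample needs no extra locking.
    std::vector<std::vector<double>> per_thread(num_threads);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&, t]() {
            auto& out = per_thread[t];
            out.reserve(samples_per_thread);
            for (int i = 0; i < samples_per_thread; ++i) {
                auto begin = SteadyClock::now();
                {
                    std::lock_guard<std::mutex> lock(mtx);
                    ++counter;
                }
                auto end = SteadyClock::now();
                out.push_back(
                    std::chrono::duration<double, std::nano>(end - begin).count());
            }
        });
    }
    for (auto& th : threads) th.join();

    std::vector<double> all;
    for (const auto& v : per_thread) all.insert(all.end(), v.begin(), v.end());
    std::sort(all.begin(), all.end());
    return all;
}
```

A possible use is to call `sample_mutex_latencies(threads, 100000)` for each thread count and print `percentile(samples, 95)` and `percentile(samples, 99)` next to `ops_per_second(mutex_time)`.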
Expected Shape of Results
You should see:
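- Single thread: Mutex and atomic are close, with atomic usually somewhat cheaper (on most implementations an uncontended mutex is locked in user space with a few atomic instructions)
- Multiple threads: Atomic faster (it never blocks, while a contended mutex may put waiting threads to sleep in the kernel and wake them again)
- High contention: Both degrade because every thread updates the same memory location; the mutex degrades more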
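The experiment title also names a spin lock, which the listing does not include. Below is a sketch of a third benchmark in the same style as the mutex and atomic ones. The `SpinLock` class and `benchmark_spin` function are hypothetical additions, built on `std::atomic_flag` as a simple test-and-set lock. It assumes that yielding while waiting is acceptable; a pure busy-wait would drop the `yield()` call.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical minimal test-and-set spin lock (not from the original listing).
class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value: true means someone else holds it.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // assumption: be polite when oversubscribed
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Average wall-clock nanoseconds per increment when all threads share one spin lock.
double benchmark_spin(int num_threads, int iterations = 1000000) {
    SpinLock spin;
    long counter = 0;

    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&]() {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<SpinLock> lock(spin);  // SpinLock has lock()/unlock()
                ++counter;
            }
        });
    }
    for (auto& th : threads) th.join();
    auto end = std::chrono::steady_clock::now();

    // Sanity check: the lock must not have lost any increments.
    assert(counter == static_cast<long>(num_threads) * iterations);

    double elapsed_ns = std::chrono::duration<double, std::nano>(end - start).count();
    return elapsed_ns / (static_cast<double>(num_threads) * iterations);
}
```

To compare all three, `benchmark_spin(threads)` can be called in the `main` loop alongside the other two and printed as an extra column. Where it lands relative to the mutex depends on the core count and on how long waiters spin, so no particular ordering is assumed here.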
Interpretation
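Contention collapse: When many threads compete for the same lock, only one of them can be inside the critical section at a time, so the rest spend their time waiting. Adding threads then adds waiters rather than useful work, and throughput flattens or falls because threads are blocked, not computing.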
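One rough way to see how much waiting is going on is to count how often a thread finds the lock already taken. The sketch below uses a hypothetical `measure_contention` function and `ContentionStats` struct, neither of which is in the original listing. It tries `try_lock()` first and falls back to a blocking `lock()` on failure. The C++ standard allows `try_lock()` to fail spuriously, so the count is an estimate rather than an exact figure.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical result type for this sketch.
struct ContentionStats {
    long acquisitions;  // total successful lock acquisitions
    long contended;     // acquisitions whose first try_lock() attempt failed
};

ContentionStats measure_contention(int num_threads, int iterations) {
    std::mutex mtx;
    long counter = 0;                // protected by mtx
    std::atomic<long> contended{0};  // summed once per thread at the end

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&]() {
            long local_contended = 0;  // thread-local tally, no sharing in the hot loop
            for (int i = 0; i < iterations; ++i) {
                if (!mtx.try_lock()) {
                    ++local_contended;  // lock was busy (or try_lock failed spuriously)
                    mtx.lock();         // now wait for it
                }
                ++counter;
                mtx.unlock();
            }
            contended.fetch_add(local_contended, std::memory_order_relaxed);
        });
    }
    for (auto& th : threads) th.join();

    return ContentionStats{counter, contended.load()};
}
```

Dividing `contended` by `acquisitions` for 1, 2, 4, and 8 threads gives a contention rate that can be set beside the timing table from the experiment.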
Checklist
- ✓ Measured mutex vs atomic performance
- ✓ Observed contention effects
- ✓ Understood why contention causes collapse