⚠️ Chapter 5: Sync or Chaos! — The Danger of Parallel Execution
💡 Story: Imagine 1,000 soldiers all trying to write their name in the same logbook at the exact same time. Scribbles, overwritten entries, chaos! That's a race condition: multiple threads access the same memory location without coordination, and at least one of them writes to it.
🔴 The Classic Race Condition:
// BUG: Multiple threads incrementing a counter
__global__ void countBuggy(int* counter) {
    // All threads do this simultaneously:
    // 1. Read counter value (say it's 5)
    // 2. Add 1 to local copy (6)
    // 3. Write back to memory (6)
    *counter = *counter + 1; // THIS IS A RACE CONDITION!
    // Problem: If 4 threads read '5' before ANY of them writes '6',
    // they ALL compute '6' and write '6'. Net result: counter = 6
    // WRONG! Should be 9 (was 5 + 4 increments)
}
🧠 Why race conditions happen:
// Simulating the race condition on the host (single-threaded, deterministic):
int simulateRace() {
    int counter = 0;
    // Step 1: all 4 threads read counter=0 before any of them writes
    int t0_reads = counter; // 0 (Thread 0 reads)
    int t1_reads = counter; // 0 (Thread 1 reads at the same time!)
    int t2_reads = counter; // 0
    int t3_reads = counter; // 0
    // Step 2: each thread adds 1 to its stale copy and writes it back
    counter = t0_reads + 1; // Thread 0 writes 1
    counter = t1_reads + 1; // Thread 1 writes 1 (NOT 2! It read the old value 0)
    counter = t2_reads + 1; // Thread 2 writes 1
    counter = t3_reads + 1; // Thread 3 writes 1
    return counter; // Final counter = 1 (WRONG! Should be 4)
}
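🛡️ Three ways to fix race conditions:
1) atomicAdd: the read-modify-write happens as one indivisible step, so no increment is lost
2) __syncthreads + reduction: threads in a block combine partial results in stages, with a barrier between stages
3) Avoid sharing: each thread writes to its own location, so no two threads touch the same memory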
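Two of these fixes can be sketched on the host in plain C++. This is only an illustration under assumptions, not CUDA code: `std::atomic::fetch_add` stands in for `atomicAdd`, `std::thread` stands in for GPU threads, and joining the threads stands in for the barrier before the final sum. The function names `countWithAtomic` and `countWithPrivateSlots` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Fix 1 (host analogue of atomicAdd): all threads increment one shared
// counter, but each increment is a single indivisible read-modify-write.
int countWithAtomic(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1);  // no other thread can interleave here
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}

// Fix 3 (avoid sharing): each thread writes only to its own slot, and the
// slots are summed once every thread has finished.
int countWithPrivateSlots(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    std::vector<int> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&partial, t, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                ++partial[t];  // slot t is touched by thread t only
            }
        });
    }
    for (auto& w : workers) w.join();  // all writes are complete after join
    return std::accumulate(partial.begin(), partial.end(), 0);
}
```

With 4 threads doing one increment each, both functions return 4. In a real kernel the first approach would correspond to `atomicAdd(counter, 1)`. Depending on the toolchain, building this sketch may require the `-pthread` flag.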
📋 Instructions
Simulate a counter under a race condition. Compare 4 threads incrementing it without synchronization against the correct approach, where each increment completes before the next one begins, and print both results:
```
=== Race Condition Demo ===
--- WITHOUT atomic (simulated race) ---
Thread 0 reads counter=0, writes 1
Thread 1 reads counter=0, writes 1
Thread 2 reads counter=0, writes 1
Thread 3 reads counter=0, writes 1
Final counter (WRONG): 1
--- WITH atomic (correct) ---
counter after 4 increments: 4
```
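The code is already written! Run it to see how a race condition causes wrong results. The key insight: in the worst-case interleaving, every thread reads the original value before any thread writes back, so they all compute the same new value and all but one increment is lost. On real hardware the interleaving is not fixed, so with 4 unsynchronized increments the final value can land anywhere from 1 to 4.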
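For readers who want to see the idea outside the exercise, here is a minimal sketch of such a simulation in plain C++. It is not the lesson's provided code: the function names `simulateRacyIncrements` and `simulateSerializedIncrements` are hypothetical, and the "threads" are simulated in a single real thread so the worst-case interleaving is reproduced deterministically.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Worst-case interleaving: every simulated thread reads the counter
// before any simulated thread writes its result back.
int simulateRacyIncrements(int start, int num_threads) {
    assert(num_threads > 0);
    int counter = start;
    std::vector<int> local(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        local[t] = counter;          // phase 1: all threads read the same value
    }
    for (int t = 0; t < num_threads; ++t) {
        counter = local[t] + 1;      // phase 2: each writes its stale copy + 1
        std::printf("Thread %d reads counter=%d, writes %d\n",
                    t, local[t], counter);
    }
    return counter;                  // only one increment survives
}

// Serialized increments: each read-modify-write finishes before the next
// one starts, which is the ordering an atomic operation guarantees.
int simulateSerializedIncrements(int start, int num_threads) {
    assert(num_threads > 0);
    int counter = start;
    for (int t = 0; t < num_threads; ++t) {
        counter = counter + 1;       // reads the value the previous step wrote
    }
    return counter;
}
```

Starting from 0 with 4 simulated threads, the racy version returns 1 and the serialized version returns 4. Starting from 5, they return 6 and 9, matching the numbers in the kernel comments earlier in the chapter.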