⚡ Chapter 1, Part 3: The Power of Doing Everything at Once
💡 Story: Imagine counting every grain of sand on a beach. Alone, it takes 1,000 years. But if you split the beach into 10,000 sections and assign one person to each section, it takes about five weeks (1,000 years ÷ 10,000 ≈ 36.5 days). That's parallelism!
Two types of problems:
🧮 Amdahl's Law — The theoretical speedup from parallelization:
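Written out, a common form of the law is the following. The symbols are names chosen here for illustration: p is the fraction of the work that can be parallelized, N is the number of workers, and S(N) is the resulting speedup.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

The limit shows that the sequential fraction 1 - p alone caps the achievable speedup, no matter how many workers are added.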
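If 90% of your program is parallelizable and it runs on 1,000 GPU cores instead of 1 CPU core (assuming each core is equally fast), your speedup is: 1 / (0.1 + 0.9/1000) ≈ 9.9x. Even with 1,000 workers, the 10% sequential part bottlenecks you: no number of cores can push the speedup past 1 / 0.1 = 10x. This is why shrinking the sequential part matters just as much as optimizing the parallelizable code!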
#include <stdio.h>

/* Amdahl's Law: speedup for a given serial fraction and core count. */
float amdahl(float serial_frac, int cores) {
    return 1.0f / (serial_frac + (1.0f - serial_frac) / (float)cores);
}

int main(void) {
    printf("Amdahl's Law Speedup Examples:\n");
    /* Same core count each time; only the serial fraction changes. */
    printf("10%% serial, 1000 cores: %.2fx\n", amdahl(0.10f, 1000));
    printf("1%% serial, 1000 cores: %.2fx\n", amdahl(0.01f, 1000));
    printf("0%% serial, 1000 cores: %.2fx\n", amdahl(0.00f, 1000));
    printf("50%% serial, 1000 cores: %.2fx\n", amdahl(0.50f, 1000));
    return 0;
}
amdahl(0.10, 1000) = 9.91
amdahl(0.01, 1000) = 90.99
amdahl(0.50, 1000) = 2.00