CUDA Programming The GPU Universe
💡
Exercise 2

GPU vs CPU 10 XP Easy

Ctrl+Enter Run Ctrl+S Save

🥊 The Ultimate Showdown: CPU vs GPU

💡 Story Time: Imagine you run a bakery. A CPU is like having 1 master chef — incredibly skilled, can do anything, handles complex decisions. A GPU is like having 10,000 helper elves — each does one simple thing, but together they bake MILLIONS of cookies at once!

Here's what makes them different:

  • 🧠 CPU — Few powerful cores (4-64), high clock speed (4-5 GHz), big cache, great for complex sequential logic
  • GPU — Thousands of small cores (3,000-20,000+), lower clock speed (1-2 GHz), optimized for parallel math
  • 📊 CPU cores — Each can do different tasks independently
  • 🔀 GPU cores — Groups (called Warps) run the SAME instruction on different data simultaneously
  • 💾 Memory — GPU has its own fast VRAM (e.g., 80GB on an A100)
// SERIAL (CPU-style): Add two arrays one by one void addArrayCPU(float* a, float* b, float* c, int n) { for (int i = 0; i < n; i++) { // One at a time: 0, 1, 2, ... c[i] = a[i] + b[i]; // 1,000,000 iterations = slow! } } // PARALLEL (GPU-style): ALL threads add at the SAME time! __global__ void addArrayGPU(float* a, float* b, float* c) { int i = threadIdx.x + blockIdx.x * blockDim.x; c[i] = a[i] + b[i]; // Thread 0 does c[0], Thread 1 does c[1]... } // ALL happen simultaneously!

Key CUDA vocabulary:

  • 🖥️ Host — The CPU and its memory (RAM)
  • 🎮 Device — The GPU and its memory (VRAM)
  • ⚙️ Kernel — A function that runs on the GPU
  • 🔀 Thread — One individual GPU worker
  • 📦 Warp — A group of 32 threads that run in lockstep
  • 🏗️ SM (Streaming Multiprocessor) — A cluster of GPU cores

📊 Performance comparison on matrix multiply (4096×4096):

  • 🐌 CPU (single thread): ~200 seconds
  • 🚗 CPU (16 cores, parallelized): ~15 seconds
  • 🚀 GPU (NVIDIA A100): ~0.05 seconds — 4000x faster!

This is why AI training lives on GPUs. Training GPT-4 on CPUs would take thousands of years. On GPU clusters? Months.

📋 Instructions
Study the comparison below and fill in the missing values by printing the correct CPU vs GPU specs: ``` === CPU vs GPU Comparison === CPU Cores: 16 GPU Cores: 10496 CPU Clock: 3.8 GHz GPU Clock: 1.4 GHz CPU Memory: 64 GB RAM GPU Memory: 80 GB VRAM Best for CPU: Complex sequential logic Best for GPU: Parallel math operations ```
Use printf() for each line. The first line is already started. For the header use printf("=== CPU vs GPU Comparison ===\n"); then print each spec line.
main.py
Hi! I'm Rex 👋
Output
Ready. Press ▶ Run or Ctrl+Enter.