⚙️ Chapter 1, Part 4: Your CUDA Toolbox
💡 Think of the CUDA Toolkit as your army headquarters. It gives you everything you need: the weapons (compilers), the maps (documentation), the training manuals (libraries), and the spy satellite (profiler).
The CUDA Toolkit contains:
- nvcc, the CUDA compiler driver (the weapons)
- The CUDA Runtime and Driver APIs, plus GPU-accelerated libraries such as cuBLAS, cuFFT, and Thrust (the training manuals)
- Documentation and code samples (the maps)
- Profiling and debugging tools such as Nsight Systems, Nsight Compute, and cuda-gdb (the spy satellite)
// File: hello_cuda.cu
// Compile with: nvcc hello_cuda.cu -o hello_cuda
// Run with: ./hello_cuda
#include <stdio.h>
#include <cuda_runtime.h> // CUDA Runtime API header

// __global__ means: "this function runs on the GPU"
__global__ void helloKernel() {
    printf("Hello from GPU thread %d!\n", threadIdx.x);
}

int main() {
    // <<<blocks, threads_per_block>>>
    helloKernel<<<1, 5>>>(); // 1 block, 5 threads
    cudaDeviceSynchronize(); // Wait for GPU to finish
    return 0;
}
How nvcc compiles your code: nvcc splits each .cu file in two. Host code (main and everything else that runs on the CPU) is handed off to your regular C++ compiler, such as gcc or MSVC. Device code (the __global__ functions) is compiled to PTX, a portable GPU assembly language, and then to machine code (SASS) for specific GPU architectures. Both halves are bundled into a single executable, a "fat binary".
🎮 Checking your GPU: Run nvidia-smi in your terminal to see your GPU model, driver version, and memory usage. This is the 'health check' for your GPU.
// Query GPU properties at runtime
#include <cuda_runtime.h>
#include <stdio.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0); // Device 0 = first GPU
    printf("GPU Name: %s\n", prop.name);
    printf("Total VRAM: %zu MB\n", prop.totalGlobalMem / (1024 * 1024));
    printf("Compute Capability: %d.%d\n", prop.major, prop.minor);
    printf("Number of SMs: %d\n", prop.multiProcessorCount);
    printf("Max Threads/Block: %d\n", prop.maxThreadsPerBlock);
    return 0;
}