⚙️ Chapter 2, Part 2: The Three CUDA Qualifiers
💡 Story: Think of your CUDA code as a two-army operation. The CPU army is at base camp (Host). The GPU army is deployed on the battlefield (Device). Soldiers can only be given orders that match where they're stationed!
CUDA has 3 function qualifiers that tell the compiler WHERE a function runs and WHO can call it:
// 1. __global__: runs on the GPU, launched from the CPU with <<<...>>>
//    (GPU code can also launch kernels via dynamic parallelism on devices that support it)
// This is what you use for kernels!
__global__ void myKernel(int* data, int n) {
    int i = threadIdx.x;   // a valid global index only for a single-block launch
    if (i < n) data[i] *= 2;
}
// 2. __device__: runs on the GPU, called from GPU code (kernels or other __device__ functions)
// Helper functions used inside kernels
__device__ int square(int x) {
    return x * x;
}
// 3. __host__: runs on the CPU, called from the CPU (the default, same as normal C)
__host__ void prepareData() {
    // This is just regular CPU code
}
// You can combine __host__ and __device__ for functions that work on BOTH:
__host__ __device__ float clamp(float val, float lo, float hi) {
    return val < lo ? lo : (val > hi ? hi : val);
}
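To see the combined form in action, here is a minimal sketch that calls one `__host__ __device__` function from both armies. The names `clampValue` and `clampKernel` are made up for this illustration (they are not part of the lesson), and the sketch assumes a machine with a working CUDA device and a single-block launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: compiled once for the CPU and once for the GPU.
__host__ __device__ float clampValue(float val, float lo, float hi) {
    return val < lo ? lo : (val > hi ? hi : val);
}

// Kernel: each thread clamps one element (single-block launch assumed).
__global__ void clampKernel(float* data, int n, float lo, float hi) {
    int i = threadIdx.x;
    if (i < n) data[i] = clampValue(data[i], lo, hi);   // GPU-side call
}

int main() {
    // CPU-side call of the very same function.
    printf("host:   %.1f\n", clampValue(7.5f, 0.0f, 5.0f));

    const int n = 4;
    float h[n] = {-2.0f, 1.0f, 3.0f, 9.0f};
    float* d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        printf("cudaMalloc failed (no usable GPU?)\n");
        return 1;
    }
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    clampKernel<<<1, n>>>(d, n, 0.0f, 5.0f);
    // This blocking copy waits for the kernel to finish first.
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);

    printf("device:");
    for (int i = 0; i < n; ++i) printf(" %.1f", h[i]);
    printf("\n");
    return 0;
}
```

If a GPU is available, the host line should show 5.0 and the device line 0.0 1.0 3.0 5.0. Writing the helper once and tagging it for both sides avoids keeping two copies of the same logic in sync.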
// Complete example: CPU calls kernel, kernel calls __device__ helper
#include <stdio.h>

__device__ int doubled(int x) {                   // GPU helper function
    return x * 2;
}

__global__ void processKernel(int* arr, int n) {  // GPU kernel
    int i = threadIdx.x;
    if (i < n) {
        arr[i] = doubled(arr[i]);                 // Kernel calls __device__ function
    }
}

int main() {                                      // CPU code
    // ... allocate memory, copy data ...
    processKernel<<<1, 5>>>(/* args */);          // CPU calls kernel
    cudaDeviceSynchronize();
    return 0;
}
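The example above leaves out the memory handling, so here is one way those parts might be filled in. This is only a sketch: the wrapper `runDoubling`, the buffer names, and the error handling are assumptions made for illustration, and it assumes `n` fits in a single block (at most 1024 threads on current GPUs).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ int doubled(int x) {                   // GPU helper function
    return x * 2;
}

__global__ void processKernel(int* arr, int n) {  // GPU kernel
    int i = threadIdx.x;
    if (i < n) arr[i] = doubled(arr[i]);
}

// Hypothetical host wrapper: doubles hostArr in place.
// Returns false if any CUDA call reports an error.
bool runDoubling(int* hostArr, int n) {
    int* devArr = nullptr;
    size_t bytes = n * sizeof(int);
    if (cudaMalloc(&devArr, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(devArr, hostArr, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        processKernel<<<1, n>>>(devArr, n);               // CPU launches the kernel
        ok = cudaGetLastError() == cudaSuccess            // launch errors
          && cudaDeviceSynchronize() == cudaSuccess       // errors during execution
          && cudaMemcpy(hostArr, devArr, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(devArr);
    return ok;
}

int main() {
    int arr[5] = {1, 2, 3, 4, 5};
    if (!runDoubling(arr, 5)) {
        printf("CUDA error\n");
        return 1;
    }
    for (int i = 0; i < 5; ++i) printf("%d ", arr[i]);    // 2 4 6 8 10 on success
    printf("\n");
    return 0;
}
```

Note the direction of every call: `main` and `runDoubling` are host code, only the `<<<...>>>` launch crosses to the GPU, and `doubled` is reached only from inside the kernel.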
⚠️ Return types: __global__ kernels MUST return void. They can't return values; instead, they write results into device memory, which the CPU then copies back (for example with cudaMemcpy).
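As a rough illustration of "returning" through memory, the sketch below has a kernel add up an array into a single int that the CPU then copies back. The names `sumKernel` and `sumOnGpu` are hypothetical, and the use of `atomicAdd` with a single-block launch is just one simple choice for a small array, not a tuned reduction.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The kernel returns void; its "return value" travels through *out.
__global__ void sumKernel(const int* in, int n, int* out) {
    int i = threadIdx.x;
    if (i < n) atomicAdd(out, in[i]);   // every thread adds into one device int
}

// Hypothetical host wrapper: stores the sum in *result.
// Returns false if any CUDA call reports an error.
bool sumOnGpu(const int* hostIn, int n, int* result) {
    int* devIn = nullptr;
    int* devOut = nullptr;
    size_t bytes = n * sizeof(int);

    bool ok = cudaMalloc(&devIn, bytes) == cudaSuccess
           && cudaMalloc(&devOut, sizeof(int)) == cudaSuccess
           && cudaMemcpy(devIn, hostIn, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemset(devOut, 0, sizeof(int)) == cudaSuccess;   // start the sum at 0
    if (ok) {
        sumKernel<<<1, n>>>(devIn, n, devOut);
        ok = cudaGetLastError() == cudaSuccess
          // The blocking copy waits for the kernel, then reads the result back.
          && cudaMemcpy(result, devOut, sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(devIn);    // cudaFree on a null pointer does nothing
    cudaFree(devOut);
    return ok;
}

int main() {
    int values[4] = {1, 2, 3, 4};
    int total = 0;
    if (!sumOnGpu(values, 4, &total)) {
        printf("CUDA error\n");
        return 1;
    }
    printf("sum = %d\n", total);   // sum = 10 on success
    return 0;
}
```

The host-side wrapper is an ordinary function, so it is free to return a value; only the `__global__` kernel itself is restricted to void.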
📋 Instructions
Fill in the CUDA qualifiers to make this program work correctly. Print a table of the qualifier rules:
```
=== CUDA Function Qualifiers ===
__global__ : Runs on GPU, called from CPU
__device__ : Runs on GPU, called from GPU
__host__ : Runs on CPU, called from CPU
Return type of __global__: void
Can CPU call __device__? NO
Can GPU call __host__? NO
```
Just add printf() for each remaining rule. The pattern is printf("qualifier : description\n");