CUDA, Supercomputing for the Masses: Part 12

May 14, 2009
Prior to CUDA 2.2, CUDA kernels could not access host system memory directly. For that reason, CUDA programmers used the design pattern introduced in Part 1 and Part 2:

   1. Move data to the GPU.
   2. Perform the calculation on the GPU.
   3. Move the result(s) from the GPU to the host.
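The three steps above can be sketched roughly as follows. This is a minimal illustration, not code from the series: the kernel incrementKernel, the helper copyComputeCopy, and the array size are all hypothetical names and values chosen for the example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void incrementKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: runs the three-step pattern on n floats and
// returns true if every CUDA call succeeded and the result is correct.
bool copyComputeCopy(int n)
{
    size_t bytes = n * sizeof(float);
    float *h = (float *)malloc(bytes);
    if (!h) return false;
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *d = NULL;
    if (cudaMalloc((void **)&d, bytes) != cudaSuccess) { free(h); return false; }

    // Step 1: move data to the GPU.
    bool ok = cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // Step 2: perform the calculation on the GPU.
    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        incrementKernel<<<blocks, threads>>>(d, n);
        ok = cudaGetLastError() == cudaSuccess;
    }

    // Step 3: move the result from the GPU to the host.
    // This blocking copy also waits for the kernel to finish.
    ok = ok && cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    // Verify on the host that each element was incremented.
    for (int i = 0; ok && i < n; ++i) ok = (h[i] == (float)i + 1.0f);

    cudaFree(d);
    free(h);
    return ok;
}

int main()
{
    printf("copy-compute-copy %s\n", copyComputeCopy(1024) ? "succeeded" : "failed");
    return 0;
}
```

Note that the kernel only ever sees the device pointer d; the host buffer h is reachable from the GPU only through the two explicit cudaMemcpy calls.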

This paradigm has changed: CUDA 2.2 introduces new APIs that allow host memory to be mapped into the device address space, so that kernels can access it directly. Mapped host memory is allocated with a new function called cudaHostAlloc (or cuMemHostAlloc in the CUDA driver API).
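A minimal sketch of what mapped host memory might look like in practice is shown below. Only cudaHostAlloc is named in the text above; the cudaHostAllocMapped flag, cudaSetDeviceFlags, cudaHostGetDevicePointer, and the canMapHostMemory device property are taken from the CUDA runtime API, while the kernel and helper names are hypothetical. Treat it as an illustration under those assumptions, on a device that supports mapping.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void incrementKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: increments n floats that live in mapped host
// memory, with no explicit cudaMemcpy. Returns true on success.
bool mappedIncrement(int n)
{
    // Request host-memory mapping before any other runtime call
    // creates a context on the device.
    if (cudaSetDeviceFlags(cudaDeviceMapHost) != cudaSuccess) return false;

    // Not every device can map host memory, so check first.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return false;
    if (!prop.canMapHostMemory) return false;

    // Allocate page-locked host memory that is mapped into the
    // device address space.
    size_t bytes = n * sizeof(float);
    float *h = NULL;
    if (cudaHostAlloc((void **)&h, bytes, cudaHostAllocMapped) != cudaSuccess)
        return false;
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    // Obtain the device-side pointer that refers to the same memory.
    float *d = NULL;
    bool ok = cudaHostGetDevicePointer((void **)&d, h, 0) == cudaSuccess;

    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        incrementKernel<<<blocks, threads>>>(d, n);
        ok = cudaGetLastError() == cudaSuccess;
        // Wait for the kernel before reading results on the host.
        // (Toolkits of the CUDA 2.2 era spelled this cudaThreadSynchronize.)
        ok = ok && cudaDeviceSynchronize() == cudaSuccess;
    }

    // The host reads the results directly from the same buffer.
    for (int i = 0; ok && i < n; ++i) ok = (h[i] == (float)i + 1.0f);

    cudaFreeHost(h);
    return ok;
}

int main()
{
    printf("mapped-memory increment %s\n", mappedIncrement(1024) ? "succeeded" : "failed");
    return 0;
}
```

Compared with the copy-compute-copy pattern, the explicit transfers disappear, but every kernel access to the buffer now goes to host memory, so explicit synchronization before the host reads the results becomes the programmer's responsibility.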

Read full story at DDJ.
