
cudaStreamSynchronize vs cudaDeviceSynchronize vs cudaThreadSynchronize: Barrier Synchronization in CUDA

2015-03-04 18:43



These are all barriers. Barriers prevent code execution beyond the barrier until some condition is met.

cudaDeviceSynchronize() blocks the CPU/host thread that called it until the GPU (the current device) has finished processing all previously requested CUDA tasks (kernels, data copies, etc.).
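To make the barrier concrete, here is a minimal sketch, assuming a device and driver that support managed memory. The kernel `fill_doubles` and the helper `device_sync_demo` are hypothetical names made up for this illustration. The kernel launch returns to the host immediately, so the host only reads the buffer after cudaDeviceSynchronize() has returned.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: writes 2 * i into out[i].
__global__ void fill_doubles(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * i;
}

// Hypothetical helper: returns the sum of the buffer as read by the host
// after the barrier, or -1 if any CUDA call fails.
long long device_sync_demo(int n) {
    int *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) return -1;

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    fill_doubles<<<blocks, threads>>>(data, n);  // asynchronous: returns right away

    // Barrier: the host thread waits here until all previously issued
    // GPU work on the current device has finished.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(data);
        return -1;
    }

    // Only now is it safe for the host to read what the kernel wrote.
    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += data[i];
    cudaFree(data);
    return sum;
}
```

Calling `device_sync_demo(1000)` from a host `main` should return 999000 on a working GPU. Without the barrier, the host loop could read the buffer while the kernel is still running.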

cudaThreadSynchronize(), as you've discovered, is just a deprecated version of cudaDeviceSynchronize(). Deprecated just means that it still works for now, but it's recommended not to use it (use cudaDeviceSynchronize() instead), and in the future it may become unsupported. But cudaThreadSynchronize() and cudaDeviceSynchronize() are basically identical.
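Since both functions take no arguments and return a cudaError_t, migrating is usually a one-line change. A small sketch, where `wait_for_gpu` is a hypothetical wrapper name used only for this example:

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: waits for all outstanding GPU work on the current device.
cudaError_t wait_for_gpu() {
    // Legacy code would have written: return cudaThreadSynchronize();
    return cudaDeviceSynchronize();  // same shape: no arguments, returns cudaError_t
}
```

On a machine with a working CUDA device, `wait_for_gpu()` is expected to return cudaSuccess when no earlier GPU work has failed.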

cudaStreamSynchronize() is similar to the above two functions, but it blocks the CPU host thread only until the GPU has finished processing all previously requested CUDA tasks that were issued in the referenced stream. So cudaStreamSynchronize() takes a stream (a cudaStream_t handle) as its only parameter. CUDA tasks issued in other streams may or may not be complete when CPU code execution continues beyond this barrier.
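The per-stream behavior can be sketched with two streams. This is an illustration under assumptions, not a definitive pattern: the kernel `set_value` and the helper `stream_sync_demo` are hypothetical names, and pinned host buffers are used so that the cudaMemcpyAsync calls can actually run asynchronously.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: sets every element to `value`.
__global__ void set_value(int *out, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

// Hypothetical helper: returns true if both host buffers hold the expected
// values once their own stream has been synchronized.
bool stream_sync_demo(int n) {
    const size_t bytes = n * sizeof(int);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    cudaStream_t s1 = nullptr, s2 = nullptr;
    int *d1 = nullptr, *d2 = nullptr;  // device buffers
    int *h1 = nullptr, *h2 = nullptr;  // pinned host buffers

    bool ok = cudaStreamCreate(&s1) == cudaSuccess
           && cudaStreamCreate(&s2) == cudaSuccess
           && cudaMalloc(&d1, bytes) == cudaSuccess
           && cudaMalloc(&d2, bytes) == cudaSuccess
           && cudaMallocHost(&h1, bytes) == cudaSuccess
           && cudaMallocHost(&h2, bytes) == cudaSuccess;

    if (ok) {
        // Queue a kernel and a device-to-host copy in each stream.
        // All four calls return to the host without waiting.
        set_value<<<blocks, threads, 0, s1>>>(d1, n, 1);
        cudaMemcpyAsync(h1, d1, bytes, cudaMemcpyDeviceToHost, s1);
        set_value<<<blocks, threads, 0, s2>>>(d2, n, 2);
        cudaMemcpyAsync(h2, d2, bytes, cudaMemcpyDeviceToHost, s2);

        // Barrier on s1 only: h1 is valid now; h2 may or may not be ready.
        ok = cudaStreamSynchronize(s1) == cudaSuccess;
        for (int i = 0; ok && i < n; ++i) ok = (h1[i] == 1);

        // Always wait for s2 before reading or freeing its buffers.
        ok = (cudaStreamSynchronize(s2) == cudaSuccess) && ok;
        for (int i = 0; ok && i < n; ++i) ok = (h2[i] == 2);
    }

    if (h1) cudaFreeHost(h1);
    if (h2) cudaFreeHost(h2);
    cudaFree(d1);
    cudaFree(d2);
    if (s1) cudaStreamDestroy(s1);
    if (s2) cudaStreamDestroy(s2);
    return ok;
}
```

A single cudaDeviceSynchronize() in place of the two stream barriers would also be correct here, but it would make the host wait for both streams before touching either buffer.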