
CAFFE Experiment Study Notes (3): SSD (Single Shot MultiBox Detector)

2016-07-06 20:40

1. cuda-7.5 is installed, but nvcc -V reports version 5.5

http://blog.csdn.net/xuezhisdc/article/details/48651003
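A version mismatch like this usually means an older nvcc sits earlier on PATH than the cuda-7.5 one. That cause is an assumption here, not something the notes confirm. The sketch below lists every match in lookup order so the winning copy is visible; the function name find_on_path is made up for illustration.

```shell
#!/bin/sh
# find_on_path NAME [PATHLIST]
# Print every executable file called NAME found in PATHLIST (default: $PATH),
# in lookup order. The first line printed is the one the shell would run.
find_on_path() {
    name=$1
    list=${2:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $list; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}
```

Calling find_on_path nvcc before and after editing /etc/profile shows whether /usr/local/cuda-7.5/bin/nvcc has moved to the top of the list.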

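1) Add /usr/local/cuda-7.0/bin to the PATH environment variable, so that the CUDA executables can be invoked from any directory.

2) Add /usr/local/cuda-7.0/lib64 to the LD_LIBRARY_PATH environment variable, so that it is searched for shared libraries. With this in place, later builds of the CUDA Samples and OpenCV no longer fail with library-not-found errors.

(The referenced post uses CUDA 7.0; the configuration below uses the cuda-7.5 paths instead.)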

Step 1: Append the following lines to the end of /etc/profile, save the file, then run source /etc/profile so that the configuration takes effect.

PATH=/usr/local/cuda-7.5/bin:$PATH

export PATH

LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64

export LD_LIBRARY_PATH

ruyunli@ruyunli-All-Series:~$ echo $PATH

/usr/local/cuda-7.5/bin:/home/ruyunli/anaconda2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

ruyunli@ruyunli-All-Series:~$ echo $LD_LIBRARY_PATH

/usr/local/cuda-7.5/lib64
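The /etc/profile lines above overwrite LD_LIBRARY_PATH and prepend to PATH again every time the file is sourced. A minimal sketch of a variant that keeps existing entries and avoids duplicates follows. The helper prepend_path and the variable name CUDA_HOME are assumptions introduced for illustration; only the cuda-7.5 prefix comes from these notes.

```shell
#!/bin/sh
# prepend_path LIST DIR
# Print LIST with DIR placed in front, unless DIR is already an entry.
# An empty LIST yields just DIR, with no trailing colon (an empty entry
# in LD_LIBRARY_PATH is read as the current directory).
prepend_path() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *) printf '%s\n' "${dir}${list:+:$list}" ;;
    esac
}

CUDA_HOME=/usr/local/cuda-7.5   # hypothetical variable; prefix taken from the notes
PATH=$(prepend_path "$PATH" "$CUDA_HOME/bin")
LD_LIBRARY_PATH=$(prepend_path "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")
export PATH LD_LIBRARY_PATH
```

If the directory is already listed, it is left where it is and not moved to the front, so an older toolkit that appears earlier in PATH still wins and has to be removed or reordered by hand.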

# Optionally, download your own cudnn; requires registration.

if [ -f "cudnn-7.0-linux-x64-v4.0-prod.tgz" ] ; then

tar -xvf cudnn-7.0-linux-x64-v4.0-prod.tgz

sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64

sudo cp cuda/include/cudnn.h /usr/local/cuda/include

fi

# Need to put cuda on the linker path. This may not be the best way, but it works.

sudo sh -c "echo '/usr/local/cuda/lib64' > /etc/ld.so.conf.d/cuda_hack.conf"

sudo ldconfig /usr/local/cuda/lib64
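After copying the header, it can be worth confirming which cuDNN version actually landed in /usr/local/cuda/include. The sketch below reads the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros from a cudnn.h-style header. It assumes the header defines them directly, as the releases of this era do; later cuDNN releases keep them in a separate version header. The function name is hypothetical.

```shell
#!/bin/sh
# cudnn_header_version FILE
# Print MAJOR.MINOR.PATCHLEVEL as defined in a cudnn.h-style header.
# Exits non-zero if the file has no CUDNN_MAJOR definition.
cudnn_header_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            if (major == "") exit 1
            print major "." minor "." patch
        }
    ' "$1"
}
```

Usage, with the install path from the script above: cudnn_header_version /usr/local/cuda/include/cudnn.h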

http://buptldy.github.io/2016/04/09/2016-04-09-Deepin%20CUDA%E5%AE%89%E8%A3%85%E5%8F%8AKeras%E4%BD%BF%E7%94%A8GPU%E6%A8%A1%E5%BC%8F%E8%BF%90%E8%A1%8C/

Device 0: "GeForce GTX 980 Ti"

CUDA Driver Version / Runtime Version 8.0 / 7.5

CUDA Capability Major/Minor version number: 5.2

Total amount of global memory: 6075 MBytes (6369837056 bytes)

(22) Multiprocessors, (128) CUDA Cores/MP: 2816 CUDA Cores

GPU Max Clock rate: 1190 MHz (1.19 GHz)

Memory Clock rate: 3505 Mhz

Memory Bus Width: 384-bit

L2 Cache Size: 3145728 bytes

Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)

Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers

Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 49152 bytes

Total number of registers available per block: 65536

Warp size: 32

Maximum number of threads per multiprocessor: 2048

Maximum number of threads per block: 1024

Max dimension size of a thread block (x,y,z): (1024, 1024, 64)

Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)

Maximum memory pitch: 2147483647 bytes

Texture alignment: 512 bytes

Concurrent copy and kernel execution: Yes with 2 copy engine(s)

Run time limit on kernels: No

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Alignment requirement for Surfaces: Yes

Device has ECC support: Disabled

Device supports Unified Addressing (UVA): Yes

Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0

Compute Mode:

< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "GeForce GTX 970"

CUDA Driver Version / Runtime Version 8.0 / 7.5

CUDA Capability Major/Minor version number: 5.2

Total amount of global memory: 4034 MBytes (4229758976 bytes)

(13) Multiprocessors, (128) CUDA Cores/MP: 1664 CUDA Cores

GPU Max Clock rate: 1317 MHz (1.32 GHz)

Memory Clock rate: 3505 Mhz

Memory Bus Width: 256-bit

L2 Cache Size: 1835008 bytes

Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)

Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers

Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 49152 bytes

Total number of registers available per block: 65536

Warp size: 32

Maximum number of threads per multiprocessor: 2048

Maximum number of threads per block: 1024

Max dimension size of a thread block (x,y,z): (1024, 1024, 64)

Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)

Maximum memory pitch: 2147483647 bytes

Texture alignment: 512 bytes

Concurrent copy and kernel execution: Yes with 2 copy engine(s)

Run time limit on kernels: Yes

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Alignment requirement for Surfaces: Yes

Device has ECC support: Disabled

Device supports Unified Addressing (UVA): Yes

Device PCI Domain ID / Bus ID / location ID: 0 / 5 / 0

Compute Mode:

< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

> Peer access from GeForce GTX 980 Ti (GPU0) -> GeForce GTX 970 (GPU1) : No

> Peer access from GeForce GTX 970 (GPU1) -> GeForce GTX 980 Ti (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 7.5, NumDevs = 2, Device0 = GeForce GTX 980 Ti, Device1 = GeForce GTX 970
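The full deviceQuery dump is long, and for a quick check only a few fields matter. The following sketch condenses it to one line per GPU (index, name, compute capability, global memory). It assumes the line layout shown in the output above; summarize_devicequery is a name made up for this example.

```shell
#!/bin/sh
# summarize_devicequery: read deviceQuery text on stdin and print one
# tab-separated line per GPU: index, name, compute capability, memory.
summarize_devicequery() {
    awk '
        /^ *Device [0-9]+: "/ {
            idx = $2; sub(/:$/, "", idx)
            name = $0; sub(/^[^"]*"/, "", name); sub(/".*$/, "", name)
        }
        /CUDA Capability Major\/Minor version number:/ { cc = $NF }
        /Total amount of global memory:/ {
            # The number sits in the field just before the "MBytes" unit.
            for (i = 1; i <= NF; i++) if ($i == "MBytes") mem = $(i - 1)
            printf "%s\t%s\t%s\t%s MBytes\n", idx, name, cc, mem
        }
    '
}
```

Assuming the compiled sample binary is in the current directory, it can be used as ./deviceQuery | summarize_devicequery.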

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.