
Ubuntu: Installing CUDA 9.0 + cuDNN 7.1 Alongside CUDA 8.0

2018-06-15 16:16
Copyright notice: This is the blogger's original article; reposting without permission is prohibited. https://blog.csdn.net/lovebyz/article/details/80704800

To use all of the algorithms in the TensorFlow Object Detection API, I wanted to upgrade CUDA to support tf-gpu 1.5+ (tf-gpu versions below 1.5 only work with CUDA 8.0). But my existing projects are all based on tf-gpu 1.4, so I installed cuda-9.0 while keeping cuda-8.0 in place.


System information:

byz@ubuntu:~$ nvidia-smi        <-- (the driver was already installed when cuda-8 was set up, so answer No at the driver prompt during installation below)


First, stop the X session. For a lightdm-based desktop:

sudo /etc/init.d/lightdm stop
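If your machine runs a different display manager (an assumption about your setup; gdm is common on stock Ubuntu GNOME), stop that service instead. A minimal sketch:

# Stop whichever display manager is active; service names here are assumptions.
sudo systemctl stop lightdm     # systemd systems
sudo /etc/init.d/gdm stop       # SysV-style systems running gdm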

1. Download CUDA

Download the installer from NVIDIA, choosing the runfile variant. It's best not to use the deb method, which fails fairly easily.
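For reference, a sketch of fetching the runfile from the command line. The URL is my assumption of NVIDIA's CUDA 9.0 archive location at the time and may have moved, so verify it on the download page first:

# Assumed archive URL for the CUDA 9.0 runfile; check before use.
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run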


2. Create a cuda_9 directory and put the file downloaded in step 1 into it

mkdir cuda_9

3. Make the installer executable

chmod 777 cuda_9.0.176_384.81_linux-run
4. Run the installer
./cuda_9.0.176_384.81_linux-run

5. Answer the prompts in order

-----------------
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: n        <-- (do not install the driver)

Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-9.0 ]:

/usr/local/cuda-9.0 is not writable.
Do you wish to run the installation with 'sudo'?
(y)es/(n)o: y

Please enter your password:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: n        <-- (the symlink can be created after installation, pointing at cuda-9.0; see step 7)

Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /home/byz ]: /home/byz/app/cuda_9        <-- (the directory created earlier)

Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
Installing the CUDA Samples in /home/byz/app/cuda_9 ...
Copying samples to /home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-9.0
Samples:  Installed in /home/byz/app/cuda_9

Please make sure that
 -   PATH includes /usr/local/cuda-9.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_2684.log
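Incidentally, the warning above shows the runfile also takes command-line flags, so in principle the same choices can be scripted. A hedged sketch only; the exact flag spellings are assumptions, so check ./cuda_9.0.176_384.81_linux-run --help first:

# Unattended install: toolkit + samples, no driver (flag names assumed; verify with --help).
sudo ./cuda_9.0.176_384.81_linux-run -silent -toolkit -samples -samplespath=/home/byz/app/cuda_9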

6. Edit the shell configuration file

vim .bashrc
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
export LIBRARY_PATH="/usr/local/cuda/lib64:$LIBRARY_PATH"
source .bashrc
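A quick sanity check that the current shell picked up the new entries:

# Each grep should print a line containing /usr/local/cuda
echo "$PATH" | tr ':' '\n' | grep cuda
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | grep cuda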


7. Switch from cuda-8.0 to cuda-9.0

sudo rm -rf /usr/local/cuda
sudo ln -s /usr/local/cuda-9.0 /usr/local/cuda

(Switching back to 8.0 later is the same recipe with the other target; see the sketch below.)
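To verify the switch, and to flip back when the old projects need CUDA 8.0:

ls -l /usr/local/cuda        # should show: cuda -> /usr/local/cuda-9.0
nvcc --version               # should report: release 9.0, V9.0.176

# Switching back to CUDA 8.0:
sudo rm -rf /usr/local/cuda
sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda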

8. Enter the CUDA 9.0 samples directory and make all the examples

byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples$ make
The tail of the build output should look something like this:

...
cp simpleCUFFT_2d_MGPU ../../bin/x86_64/linux/release
make[1]: Leaving directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/simpleCUFFT_2d_MGPU'
make[1]: Entering directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/MC_EstimatePiP'
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o main.o -c src/main.cpp
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o piestimator.o -c src/piestimator.cu
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o test.o -c src/test.cpp
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o MC_EstimatePiP main.o piestimator.o test.o -lcurand
mkdir -p ../../bin/x86_64/linux/release
cp MC_EstimatePiP ../../bin/x86_64/linux/release
make[1]: Leaving directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/MC_EstimatePiP'
make[1]: Entering directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/randomFog'
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -Xcompiler -fpermissive -gencode arch=compute_30,code=compute_30 -o randomFog.o -c randomFog.cpp
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -Xcompiler -fpermissive -gencode arch=compute_30,code=compute_30 -o rng.o -c rng.cpp
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=compute_30 -o randomFog randomFog.o rng.o -L/usr/lib/nvidia-387 -lGL -lGLU -lX11 -lglut -lcurand
mkdir -p ../../bin/x86_64/linux/release
cp randomFog ../../bin/x86_64/linux/release
make[1]: Leaving directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/randomFog'
Finished building CUDA samples
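As an aside, building every sample takes a while. Each sample directory has its own Makefile, so you can also build just the tool used in the next step:

cd 1_Utilities/deviceQuery && make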

9. Test whether the installation succeeded

byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples$ cd 1_Utilities/deviceQuery

byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples$ ll
total 132
drwxrwxr-x 12 byz byz  4096 Jun 15 14:12 ./
drwxrwxr-x  3 byz byz  4096 Jun 15 12:53 ../
drwxr-xr-x 50 byz byz  4096 Jun 15 12:53 0_Simple/
drwxr-xr-x  7 byz byz  4096 Jun 15 12:53 1_Utilities/
drwxr-xr-x 12 byz byz  4096 Jun 15 12:53 2_Graphics/
drwxr-xr-x 22 byz byz  4096 Jun 15 12:53 3_Imaging/
drwxr-xr-x 10 byz byz  4096 Jun 15 12:53 4_Finance/
drwxr-xr-x 10 byz byz  4096 Jun 15 12:53 5_Simulations/
drwxr-xr-x 34 byz byz  4096 Jun 15 12:53 6_Advanced/
drwxr-xr-x 38 byz byz  4096 Jun 15 12:53 7_CUDALibraries/
drwxrwxr-x  3 byz byz  4096 Jun 15 14:12 bin/
drwxr-xr-x  6 byz byz  4096 Jun 15 12:53 common/
-rw-r--r--  1 byz byz 81262 Jun 15 12:53 EULA.txt
-rw-r--r--  1 byz byz  2652 Jun 15 12:53 Makefile

byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 8 CUDA Capable device(s)

Device 0: "TITAN Xp"
  CUDA Driver Version / Runtime Version          9.1 / 9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 12190 MBytes (12782075904 bytes)
  (30) Multiprocessors, (128) CUDA Cores/MP:     3840 CUDA Cores
  GPU Max Clock rate:                            1582 MHz (1.58 GHz)
  Memory Clock rate:                             5705 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 3145728 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z):  (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 4 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

... (the same block is printed for each of the 8 GPUs; Device 1 through Device 7 are identical apart from the Bus ID) ...
> Peer access from TITAN Xp (GPU6) -> TITAN Xp (GPU7) : Yes
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU0) : No
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU1) : No
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU2) : No
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU3) : No
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU4) : Yes
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU5) : Yes
> Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU6) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.0, NumDevs = 8
Result = PASS
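Optionally, bandwidthTest (which lives next to deviceQuery in 1_Utilities) is the other standard sanity check:

cd ../bandwidthTest && make && ./bandwidthTest
# should also finish with: Result = PASS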

Next: clone an environment and install tf 1.8

conda create --name tf14 --clone py35

This clones the existing py35 environment, keeping the tf-gpu 1.4 setup available as tf14. Then install tf-gpu 1.8:

(py35) byz@ubuntu:~/app$ pip install tensorflow_gpu-1.8.0-cp35-cp35m-manylinux1_x86_64.whl 

A quick test run shows the cuDNN version is wrong; it fails as follows:

>>> import tensorflow
Traceback (most recent call last):
  File "/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/byz/anaconda3/envs/py35/lib/python3.5/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/byz/anaconda3/envs/py35/lib/python3.5/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcudnn.so.7: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems
for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.
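The runtime is asking for libcudnn.so.7, which the CUDA 9.0 toolkit does not ship. Confirming it is indeed missing:

ls /usr/local/cuda/lib64/ | grep cudnn    # prints nothing until cuDNN is installed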

10. Install cuDNN 7.1

Download it from: https://developer.nvidia.com/rdp/cudnn-download

The package to download is

cuDNN v7.1 Library for Linux

Do not download

cuDNN v7.1 Developer Library for Ubuntu16.04 Power8 (Deb)

because that one is for POWER8 processors, not amd64.

Extract and install:

(py35) byz@ubuntu:~/app$ tar -xvf cudnn-9.0-linux-x64-v7.1.tgz
cuda/include/cudnn.h
cuda/NVIDIA_SLA_cuDNN_Support.txt
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.7        <-- (the file that was missing above)
cuda/lib64/libcudnn.so.7.1.4
cuda/lib64/libcudnn_static.a

After extracting, copy the files into the corresponding CUDA directories:

sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
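If the library still isn't found at import time (for example, in a session where LD_LIBRARY_PATH doesn't cover /usr/local/cuda/lib64), refreshing the loader cache for that directory may help:

sudo ldconfig /usr/local/cuda/lib64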
Run TF again, and the error is gone:

(py35) byz@ubuntu:~/app$ python
Python 3.5.5 |Anaconda, Inc.| (default, Mar 12 2018, 23:12:44)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
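As an optional further check that the GPUs are actually visible to TensorFlow (device_lib is part of the tf 1.x Python API):

python -c "from tensorflow.python.client import device_lib; print([d.name for d in device_lib.list_local_devices()])"
# should list /device:CPU:0 plus one /device:GPU:N entry per card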
And with that, it's all set!

A final note (on migrating tf 1.4 projects to the new setup):


ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

Fix:

byz@ubuntu:/usr/local/cuda/lib64$ sudo ln -s libcudart.so.9.0.176 libcudart.so.8.0

That is, make a symlink that points the 8.0 name at the 9.0 runtime.
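A quick check that the link resolves (note this is a hack: it points the 8.0 soname at the 9.0 runtime, which is not guaranteed to be ABI-compatible, though it worked here):

ls -l /usr/local/cuda/lib64/libcudart.so.8.0
# libcudart.so.8.0 -> libcudart.so.9.0.176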

