Installing Nvidia CUDA on Ubuntu 14.04 for Linux GPU Computing

2016-08-01 16:20

https://www.quantstart.com/articles/Installing-Nvidia-cuda-on-Ubuntu-14-04-for-Linux-GPU-Computing

In this article I am going to discuss how to install the Nvidia CUDA toolkit for carrying out high-performance computing (HPC) with an Nvidia Graphics Processing Unit (GPU). CUDA is the industry standard for working with GPU-HPC. In a previous article Valerio Restocchi showed us how to install Nvidia CUDA on a Mac OS X system. In this article I am going to describe the same procedure but carry it out under the latest version of Ubuntu, namely 14.04.

Installation and Testing

The first task is to make sure that you have the GNU compiler collection (GCC) tools installed. This is carried out by installing the build-essential package:

sudo apt-get install build-essential
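It can be worth confirming that the compiler tools are actually on the PATH before moving on. The following is a minimal sketch rather than part of the original procedure. The function name missing_tools is hypothetical, and the tool list (gcc, g++ and make, all of which build-essential depends on) is my own choice of what to check:

```shell
# Print the name of each listed tool that cannot be found on the PATH.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# gcc, g++ and make are all dependencies of build-essential.
missing_tools gcc g++ make
```

If the last command prints nothing, all three tools were found.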

I'll assume that you have a 64-bit system for the remainder of the article. The next step is to download the specific DEB package for the 64-bit version of CUDA for Ubuntu 14.04. I placed this in my home Downloads directory:

cd ~/Downloads
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_6.5-14_amd64.deb

The following commands will install CUDA 6.5:

sudo dpkg -i cuda-repo-ubuntu1404_6.5-14_amd64.deb
sudo apt-get update
sudo apt-get install cuda

We also need to add the following lines to our .bash_profile file in our home directory, in order to obtain the required compilation tools on our PATH:

export PATH=/usr/local/cuda-6.5/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64:$LD_LIBRARY_PATH

Remember to make sure that the terminal has access to these variables:

source ~/.bash_profile
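One side effect of the plain export lines above is that every time the profile is sourced, the CUDA directories are prepended again. In addition, if LD_LIBRARY_PATH starts out empty the result ends in a colon, and the dynamic linker treats an empty entry as the current directory. The following sketch shows one possible way to avoid both. The helper name prepend_once is hypothetical and is not part of CUDA or Ubuntu:

```shell
# Prepend a directory to a colon-separated list unless it is already present.
# $1 = directory to add, $2 = current list; prints the resulting list.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;           # already present: leave unchanged
        *)        printf '%s\n' "$1${2:+:$2}" ;;  # prepend; no stray colon if list is empty
    esac
}

PATH=$(prepend_once /usr/local/cuda-6.5/bin "$PATH")
LD_LIBRARY_PATH=$(prepend_once /usr/local/cuda-6.5/lib64 "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

These lines could be used in .bash_profile in place of the two export lines, assuming the same /usr/local/cuda-6.5 installation directory.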

Before proceeding to test the GPU cards we will ensure that the drivers are correctly installed. The following line will provide us with the driver version:

cat /proc/driver/nvidia/version

The output on my system is as follows:

NVRM version: NVIDIA UNIX x86_64 Kernel Module  331.89  Tue Jul  1 13:30:18 PDT 2014
GCC version:  gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
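If you want the driver version on its own, for instance to record it in a log, it can be pulled out of the NVRM line. This is a sketch that assumes the line has the same layout as the output above, with the version number directly after "Kernel Module". The function name nvrm_version is hypothetical:

```shell
# Print the driver version found in an NVRM version line on standard input.
# Assumes the layout "NVRM version: ... Kernel Module  <version>  <date>".
nvrm_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# Only read the file if the driver has created it.
if [ -r /proc/driver/nvidia/version ]; then
    nvrm_version < /proc/driver/nvidia/version
fi
```

With the output shown above as input, this prints 331.89.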

Check the version of the Nvidia CUDA compiler:

nvcc -V

The output on my system is as follows:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Thu_Jul_17_21:41:27_CDT_2014
Cuda compilation tools, release 6.5, V6.5.12
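A quick way to confirm that the nvcc on the PATH belongs to the toolkit we just installed is to compare the reported release with the expected one. The sketch below assumes the "Cuda compilation tools, release X.Y," line format shown above. The function name nvcc_release_is is hypothetical:

```shell
# Succeed only if the `nvcc -V` text on standard input reports release $1.
nvcc_release_is() {
    release=$(sed -n 's/^Cuda compilation tools, release \([0-9][0-9.]*\),.*/\1/p')
    [ "$release" = "$1" ]
}

# Only run the check when nvcc can be found.
if command -v nvcc >/dev/null 2>&1; then
    if nvcc -V | nvcc_release_is 6.5; then
        echo "nvcc reports release 6.5"
    else
        echo "nvcc reports a different release; check your PATH"
    fi
fi
```

A mismatch here usually suggests that another CUDA installation appears earlier on the PATH.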

In order to check that the installation was successful we are going to compile the CUDA samples, test that we can query the GPU device and ascertain its bandwidth. In the command below, change <target_directory> to your preferred installation location for the samples:

cuda-install-samples-6.5.sh <target_directory>

Change to the <target_directory>/NVIDIA_CUDA-6.5_Samples directory and run the make command:

cd <target_directory>/NVIDIA_CUDA-6.5_Samples
make

This will take some time. Once complete we can run the deviceQuery program to test if we can communicate with the GPU:

cd bin/x86_64/linux/release
./deviceQuery

I have two GPU cards in SLI configuration on my system and so I've only shown the output for the first device:

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "GeForce GTX 780 Ti"
CUDA Driver Version / Runtime Version          6.5 / 6.5
CUDA Capability Major/Minor version number:    3.5
Total amount of global memory:                 3072 MBytes (3220897792 bytes)
(15) Multiprocessors, (192) CUDA Cores/MP:     2880 CUDA Cores
GPU Clock rate:                                1084 MHz (1.08 GHz)
Memory Clock rate:                             3500 Mhz
Memory Bus Width:                              384-bit
L2 Cache Size:                                 1572864 bytes
Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
Total amount of constant memory:               65536 bytes
Total amount of shared memory per block:       49152 bytes
Total number of registers available per block: 65536
Warp size:                                     32
Maximum number of threads per multiprocessor:  2048
Maximum number of threads per block:           1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch:                          2147483647 bytes
Texture alignment:                             512 bytes
Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
Run time limit on kernels:                     Yes
Integrated GPU sharing Host Memory:            No
Support host page-locked memory mapping:       Yes
Alignment requirement for Surfaces:            Yes
Device has ECC support:                        Disabled
Device supports Unified Addressing (UVA):      Yes
Device PCI Bus ID / PCI location ID:           1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

..
..

> Peer access from GeForce GTX 780 Ti (GPU0) -> GeForce GTX 780 Ti (GPU1) : Yes
> Peer access from GeForce GTX 780 Ti (GPU1) -> GeForce GTX 780 Ti (GPU0) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 2, Device0 = GeForce GTX 780 Ti, Device1 = GeForce GTX 780 Ti
Result = PASS

The final line is the most important. It states that the test was successful as we received a "PASS". We also want to check the bandwidth to our GPU. We can run the bandwidthTest command:

./bandwidthTest

The output on my system is as follows:

[CUDA Bandwidth Test] - Starting...
Running on...

Device 0: GeForce GTX 780 Ti
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes)	Bandwidth(MB/s)
33554432			6308.7

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes)	Bandwidth(MB/s)
33554432			6464.2

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes)	Bandwidth(MB/s)
33554432			264346.8

Result = PASS

As before, the final line is the most important. It states that the test was successful as we received a "PASS".
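Since both deviceQuery and bandwidthTest report their outcome in a "Result = PASS" line, the check can be scripted rather than read by eye. The following is a minimal sketch, assuming it is run from the bin/x86_64/linux/release directory used above. The function names sample_passed and run_samples are hypothetical:

```shell
# Succeed only if the text on standard input contains a "Result = PASS" line.
sample_passed() {
    grep '^Result = PASS' >/dev/null
}

# Run each named sample from the current directory and report its outcome.
# A sample that is missing or cannot start is reported as FAIL.
run_samples() {
    for sample in "$@"; do
        if "./$sample" 2>/dev/null | sample_passed; then
            echo "$sample: PASS"
        else
            echo "$sample: FAIL"
        fi
    done
}
```

Calling run_samples deviceQuery bandwidthTest then prints one line per sample.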

That concludes the installation and testing of the Nvidia CUDA toolkit! You should now be able to follow Valerio's second tutorial on creating a "Hello World!" for CUDA.

I found the following articles helpful when installing CUDA on my system as I initially had issues with my Nvidia driver:

http://www.r-tutor.com/gpu-computing/cuda-installation/cuda6.5-ubuntu
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#abstract
http://askubuntu.com/questions/451672/installing-and-testing-cuda-in-ubuntu-14-04

Michael Halls-Moore
Mike is the founder of QuantStart and has been involved in the quantitative finance industry for the last five years, primarily as a quant developer and later as a quant trader consulting for hedge funds.
