
Configuring C-MEX and CUDA for MATLAB

2015-07-30 14:33

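Step 1: Install Visual Studio 2012 (or later)

Step 2: Install CUDA 7.0

After installation, two new environment variables appear in the system: CUDA_PATH and CUDA_PATH_V7_0. Next, add the following environment variables as well: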

CUDA_SDK_PATH = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.0
CUDA_LIB_PATH = %CUDA_PATH%\lib\x64
CUDA_BIN_PATH = %CUDA_PATH%\bin
CUDA_SDK_BIN_PATH = %CUDA_SDK_PATH%\bin\win64
CUDA_SDK_LIB_PATH = %CUDA_SDK_PATH%\common\lib\x64

Then append the following to the end of the system variable PATH:

;%CUDA_LIB_PATH%;%CUDA_BIN_PATH%;%CUDA_SDK_LIB_PATH%;%CUDA_SDK_BIN_PATH%;

Note: CUDA must be installed after Visual Studio.
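Before moving on, the variables from this step can be checked from a small program. The following is a minimal sketch, assuming the variable names listed above; `missingVariables` and `expectedCudaVariables` are hypothetical helpers written for illustration and are not part of CUDA or MATLAB. A newly created variable is only visible to processes started after it was set.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Return the names from `names` that are unset or empty in the
// environment of the current process. Hypothetical helper.
std::vector<std::string> missingVariables(const std::vector<std::string>& names)
{
    std::vector<std::string> missing;
    for (const std::string& name : names) {
        const char* value = std::getenv(name.c_str());
        if (value == nullptr || *value == '\0') {
            missing.push_back(name);
        }
    }
    return missing;
}

// The variables this step expects to exist (names taken from the list above).
std::vector<std::string> expectedCudaVariables()
{
    return {"CUDA_PATH", "CUDA_SDK_PATH", "CUDA_LIB_PATH", "CUDA_BIN_PATH",
            "CUDA_SDK_BIN_PATH", "CUDA_SDK_LIB_PATH"};
}
```

Calling `missingVariables(expectedCudaVariables())` from a `main` function and printing the returned names shows which entries still need to be added.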

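Step 3: Install MATLAB R2014 (or another version)

Step 4:
Configure C-MEX by entering the following command in the MATLAB Command Window:

    mex -setup

The output shows that MATLAB has automatically selected the VC 2012 C compiler for compiling C code.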

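Step 5:
Test whether C-MEX works

1. Create a new folder in the current folder (or anywhere else), for example "HelloMex", then double-click to enter it.

2. Create a new MATLAB script and save it as a .cpp file named hellomex.cpp.

Note: set the file type to "All Files (*.*)" when saving.

3. Enter the following code in hellomex.cpp and save it: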

#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mexPrintf("Hello, mex!\n");
}

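4. Enter the following command in the MATLAB Command Window to compile hellomex.cpp:

    mex hellomex.cpp

MATLAB reports that the MEX file was compiled successfully, and the target file "hellomex.mexw64" appears in the current folder.

5. Call it from MATLAB. A MEX target file is invoked by its file name:

    hellomex

This completes the test of a simple C-MEX call from MATLAB.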

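Step 6:
Test whether CUDA works

1. Enter a command in the MATLAB Command Window that invokes nvcc, for example:

    system('nvcc --version')

If nvcc responds, the CUDA environment is configured correctly, that is, the nvcc compiler is installed on the system.

If the system reports that nvcc cannot be found, CUDA is not installed or its bin directory is not on PATH.

2. Create a new script, enter the following code, and save it as "AddVectors.h":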

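#ifndef ADDVECTORS_H
#define ADDVECTORS_H

// Adds the host vectors A and B element-wise on the GPU and stores the
// result in C. All three arrays hold `size` floats.
extern void addVectors(float* A, float* B, float* C, int size);

#endif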

3. Create a new script, enter the following code, and save it as "AddVectors.cu" (.cu denotes CUDA source code):

#include <stdio.h>
#include "AddVectors.h"

// Each thread adds one pair of elements. The kernel indexes by threadIdx.x
// only, so it must be launched as a single block.
__global__ void addVectorsMask(float* a, float* b, float* c, int size)
{
    int i = threadIdx.x;
    if (i >= size) return;
    c[i] = a[i] + b[i];
}

// Helper function for using CUDA to add vectors in parallel.
void addVectors(float* a, float* b, float* c, int size)
{
    float *dev_a = 0;
    float *dev_b = 0;
    float *dev_c = 0;
    cudaError_t cudaStatus;

    // Choose which GPU to run on, change this on a multi-GPU system.
    cudaStatus = cudaSetDevice(0);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice failed! Do you have a CUDA-capable GPU installed?");
        goto Error;
    }

    // Allocate GPU buffers for three vectors (two input, one output).
    cudaStatus = cudaMalloc((void**)&dev_c, size * sizeof(float));
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed!");
        goto Error;
    }
    cudaStatus = cudaMalloc((void**)&dev_a, size * sizeof(float));
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed!");
        goto Error;
    }
    cudaStatus = cudaMalloc((void**)&dev_b, size * sizeof(float));
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed!");
        goto Error;
    }

    // Copy input vectors from host memory to GPU buffers.
    cudaStatus = cudaMemcpy(dev_a, a, size * sizeof(float), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed!");
        goto Error;
    }
    cudaStatus = cudaMemcpy(dev_b, b, size * sizeof(float), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed!");
        goto Error;
    }

    // Launch a kernel on the GPU with one thread for each element.
    // A single block is used, so size must not exceed the device's
    // per-block thread limit. The argument order matches the kernel
    // signature: two inputs first, then the output.
    addVectorsMask<<<1, size>>>(dev_a, dev_b, dev_c, size);

    // Check for any errors launching the kernel.
    cudaStatus = cudaGetLastError();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "addVectorsMask launch failed: %s\n", cudaGetErrorString(cudaStatus));
        goto Error;
    }

    // cudaDeviceSynchronize waits for the kernel to finish, and returns
    // any errors encountered during the launch.
    cudaStatus = cudaDeviceSynchronize();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSynchronize returned error code %d after launching addVectorsMask!\n", cudaStatus);
        goto Error;
    }

    // Copy output vector from GPU buffer to host memory.
    cudaStatus = cudaMemcpy(c, dev_c, size * sizeof(float), cudaMemcpyDeviceToHost);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed!");
        goto Error;
    }

Error:
    // Release the device buffers on both the success and the failure path.
    cudaFree(dev_c);
    cudaFree(dev_a);
    cudaFree(dev_b);
}
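The launch above uses a single block with one thread per element, which only works while the vector fits within the per-block thread limit of the device (1024 threads on GPUs of this generation). Longer vectors need a grid of several blocks. The following host-side sketch shows the usual ceiling-division arithmetic and emulates the resulting indexing on the CPU; `blocksFor` and `emulateAdd` are hypothetical names, and the block size passed in is an illustrative choice, not something the original post prescribes.

```cpp
#include <vector>

// Number of blocks needed so that blocks * threadsPerBlock >= size
// (integer ceiling division). Assumes size >= 0 and threadsPerBlock > 0.
int blocksFor(int size, int threadsPerBlock)
{
    return (size + threadsPerBlock - 1) / threadsPerBlock;
}

// CPU emulation of a multi-block launch. Every (block, thread) pair forms
// the global index that a kernel would compute as
// blockIdx.x * blockDim.x + threadIdx.x, and skips indices past the end.
// Assumes b holds at least as many elements as a.
std::vector<float> emulateAdd(const std::vector<float>& a,
                              const std::vector<float>& b,
                              int threadsPerBlock)
{
    const int size = static_cast<int>(a.size());
    std::vector<float> c(a.size(), 0.0f);
    const int blocks = blocksFor(size, threadsPerBlock);
    for (int block = 0; block < blocks; ++block) {
        for (int thread = 0; thread < threadsPerBlock; ++thread) {
            const int i = block * threadsPerBlock + thread;
            if (i >= size) continue;  // same bounds guard as in the kernel
            c[i] = a[i] + b[i];
        }
    }
    return c;
}
```

In the CUDA version, the same idea would mean computing the index inside the kernel from blockIdx.x, blockDim.x, and threadIdx.x, and launching with blocksFor(size, threadsPerBlock) blocks instead of one.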

4. Create a new script file, enter the following code, and save it as "AddVectors.cpp":

#include "mex.h"
#include "AddVectors.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nrhs != 2)
        mexErrMsgTxt("Invalid number of input arguments");
    if (nlhs != 1)
        mexErrMsgTxt("Invalid number of outputs");
    // Reject the call unless both inputs are single precision.
    if (!mxIsSingle(prhs[0]) || !mxIsSingle(prhs[1]))
        mexErrMsgTxt("Input vector data type must be single");

    int numRowsA = (int)mxGetM(prhs[0]);
    int numColsA = (int)mxGetN(prhs[0]);
    int numRowsB = (int)mxGetM(prhs[1]);
    int numColsB = (int)mxGetN(prhs[1]);

    if (numRowsA != numRowsB || numColsA != numColsB)
        mexErrMsgTxt("Invalid size. The sizes of the two vectors must be the same");

    int minSize = (numRowsA < numColsA) ? numRowsA : numColsA;
    int maxSize = (numRowsA > numColsA) ? numRowsA : numColsA;

    // A row or column vector has exactly one dimension equal to 1.
    if (minSize != 1)
        mexErrMsgTxt("Invalid size. The vector must be one-dimensional");

    float* A = (float*)mxGetData(prhs[0]);
    float* B = (float*)mxGetData(prhs[1]);

    // Create the output vector with the same shape as the inputs.
    plhs[0] = mxCreateNumericMatrix(numRowsA, numColsB, mxSINGLE_CLASS, mxREAL);
    float* C = (float*)mxGetData(plhs[0]);

    addVectors(A, B, C, maxSize);
}
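The size checks in the gateway function can be read as three small rules: both inputs have the same shape, the smaller dimension is exactly 1, and the larger dimension is the element count passed on to addVectors. The sketch below restates those rules as standalone functions so they can be tried without MATLAB; `Shape` and the function names are hypothetical and stand in for the values that mxGetM and mxGetN return.

```cpp
#include <algorithm>
#include <cstddef>

// Rows and columns of a matrix, as mxGetM / mxGetN would report them.
struct Shape {
    std::size_t rows;
    std::size_t cols;
};

// True when both shapes agree in rows and columns.
bool sameShape(const Shape& a, const Shape& b)
{
    return a.rows == b.rows && a.cols == b.cols;
}

// True for a row or column vector: the smaller dimension must be exactly 1,
// which also rejects empty inputs such as 0-by-1.
bool isVector(const Shape& s)
{
    return std::min(s.rows, s.cols) == 1;
}

// Number of elements to process: the larger dimension of the vector.
std::size_t elementCount(const Shape& s)
{
    return std::max(s.rows, s.cols);
}

// Combined rule applied to the two inputs.
bool inputsAccepted(const Shape& a, const Shape& b)
{
    return sameShape(a, b) && isVector(a);
}
```

Under these rules a 1-by-5 row vector cannot be added to a 5-by-1 column vector, even though both hold five elements, because the shapes differ.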

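5. Compile the ".cu" file with the nvcc compiler, for example:

    system('nvcc -c AddVectors.cu')

After compilation, an ".obj" object file appears in the current folder.

This step may fail with an error saying that the compiler 'cl.exe' cannot be found.

The cause is that the path to the VC compiler "cl.exe" has not been added to the system variable "PATH". Find that path and add it.

"cl.exe" is located in "VC\bin" under the Visual Studio installation directory. Append this path to the end of the system variable "PATH".

Note: for the environment variable to take effect, restart the computer or log off and back on.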

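6. Compile the MEX file and link it against the generated CUDA object file with the command:

    mex AddVectors.cpp AddVectors.obj -lcudart -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\lib\x64"

The Command Window reports "MEX completed successfully", and the generated file "AddVectors.mexw64" appears in the current directory.

7. Call it from MATLAB with two single-precision vectors of the same size and one output argument, for example:

    C = AddVectors(single([1 2 3]), single([4 5 6]))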
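One way to gain confidence in the GPU result is to compare it with a plain CPU sum of the same inputs. The following is a minimal sketch of such a check; `addReference` and `maxAbsDiff` are hypothetical helpers, and how the GPU output reaches this code (for example, by copying the MEX result into a std::vector) is left as an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference: element-wise sum of two vectors.
// Assumes b holds at least as many elements as a.
std::vector<float> addReference(const std::vector<float>& a,
                                const std::vector<float>& b)
{
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// Largest absolute element-wise difference between two vectors.
// Assumes y holds at least as many elements as x.
float maxAbsDiff(const std::vector<float>& x, const std::vector<float>& y)
{
    float worst = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        worst = std::max(worst, std::fabs(x[i] - y[i]));
    }
    return worst;
}
```

A large value from `maxAbsDiff` when comparing the GPU output with `addReference` would point to a problem in the kernel launch or in the host-device copies rather than in the MEX gateway.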
