VMware BitFusion 再探二(功能测试)
接上一篇文章:VMware BitFusion 初探一(环境搭建)
1. 测试步骤
主要测试步骤:
• Create a VM
• Enable VM for Bitfusion
• Install Bitfusion Client
• Install CUDA 10.0
• Install CuDNN 7
• Install python3, if needed (CentOS)
• Install TensorFlow 1.13.1
• Install TensorFlow benchmarks (branch cnn_tf_v1.13_compatible)
• Run TensorFlow benchmarks
2. Enable VM for Bitfusion
创建一台CentOS7 虚机,不要开机,右击虚机启用bitfusion
选择For a client.
3. Install Bitfusion Client
将CentOS7开机,然后执行以下命令
[code]# Install bitfusion client $ yum install -y epel-release $ rpm --import https://packages.vmware.com/bitfusion/vmware.bitfusion.key $ yum install -y https://packages.vmware.com/bitfusion/centos/7/bitfusion-client-centos7-2.0.0-11.x86_64.rpm # Add user to bitfusion group $ usermod -aG bitfusion root # Confirm user belongs to bitfusion group $ groups root bitfusion # Test Bitfusion $ bitfusion list_gpus - server 0 [10.10.10.10:56001]: running 0 tasks |- GPU 0: free memory 15109 MiB / 15109 MiB |- GPU 1: free memory 15109 MiB / 15109 MiB |- GPU 2: free memory 15109 MiB / 15109 MiB
安装完成后,自动注册到bitfusion server中;
4. 安装CUDA
CUDA is the NVIDA library allows programmatic access to their GPUs. It will be used by the TensorFlow benchmarks.
[code]$ mkdir bitfusion $ cd bitfusion # install cuda-repo $ wget -P /etc/yun.repos.d/ https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo $ yum clean all $ yum install -y cuda-10-0
获取显卡设备信息
[code]$ bitfusion run --num_gpus 1 nvidia-smi Requested resources: Server List: 10.10.10.101:56001 Client idle timeout: 0 min Wed Jul 29 12:07:35 2020 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 450.51.06 Driver Version: 440.64.00 CUDA Version: 10.2 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |===============================+======================+======================| | 0 Tesla T4 Off | 00000000:04:00.0 Off | 0 | | N/A 28C P8 9W / 70W | 0MiB / 15109MiB | 0% Default | | | | ERR! | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=============================================================================| | No running processes found | +-----------------------------------------------------------------------------+
5. 安装CuDNN
CuDNN is the Deep Neural Network library from NVIDIA. The TensorFlow benchmarks you will run later will require this library too.
需要到 https://developer.nvidia.com/cudnn
创建账号并下载 libcudnn7并安装
[code]$ sudo rpm -ivh libcudnn7-7.6.5.32-1.cuda10.0.x86_64.rpm Preparing... ################################# [100%] Updating / installing... 1: libcudnn7-7.6.5.32-1.cuda10.0 ################################# [100%] $ sudo ldconfig # update libraries list $ ldconfig -p | grep cudnn # to see if it is installed libcudnn.so.7 (libc6,x86-64) => /lib64/libcudnn.so.7
6. 安装Python3和TensorFlow
[code]$ yum install python3 $ pip3 install tensorflow-gpu==1.13.1
7. 安装TensorFlow Benchmarks
The benchmarks are open source ML applications designed to test performance on the TensorFlow framework.
[code]$ cd ~/bitfusion $ git clone https://github.com/tensorflow/benchmarks.git $ cd benchmarks $ git branch -a * master remotes/origin/HEAD -> origin/master remotes/origin/cnn_tf_v1.10_compatible ... remotes/origin/cnn_tf_v1.13_compatible ... $ git checkout cnn_tf_v1.13_compatible Branch cnn_tf_v1.13_compatible set up to track remote branch cnn_tf_v1.13_compatible from origin. Switched to a new branch ‘cnn_tf_v1.13_compatible’ $ git branch * cnn_tf_v1.13_compatible master
8. 执行BitFunsion测试
在没有GPU的情况下执行TensorFlow测试,测试结果如下,平均处理图片为每秒805.71张。
[code]$ python3 ./benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py ... Running warm up Done warm up Step Img/sec total_loss 1 images/sec: 805.8 +/- 0.0 (jitter = 0.0) 14.299 10 images/sec: 803.0 +/- 3.3 (jitter = 4.5) 14.299 20 images/sec: 803.4 +/- 2.5 (jitter = 6.5) 14.299 30 images/sec: 806.4 +/- 2.1 (jitter = 9.4) 14.299 40 images/sec: 803.1 +/- 2.7 (jitter = 11.2) 14.298 50 images/sec: 801.5 +/- 2.8 (jitter = 10.2) 14.298 60 images/sec: 802.6 +/- 2.4 (jitter = 8.9) 14.299 70 images/sec: 804.4 +/- 2.1 (jitter = 9.0) 14.299 80 images/sec: 805.7 +/- 1.9 (jitter = 8.7) 14.298 90 images/sec: 806.3 +/- 1.7 (jitter = 8.5) 14.298 100 images/sec: 806.9 +/- 1.6 (jitter = 8.1) 14.298 ---------------------------------------------------------------- total images/sec: 805.71 ----------------------------------------------------------------
通过BitFusion来执行TensorFlow测试,分配1个GPU显存资源,测试结果如下,平均处理图片为每秒7013.81张。
[code]# bitfusion allocation 1 gpu $ bitfusion run --num_gpus 1 -- python3 ./benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py ... Done warm up Step Img/sec total_loss 1 images/sec: 5947.0 +/- 0.0 (jitter = 0.0) 14.299 10 images/sec: 6015.8 +/- 19.5 (jitter = 51.8) 14.299 20 images/sec: 6091.0 +/- 20.4 (jitter = 120.5) 14.299 30 images/sec: 6111.4 +/- 15.0 (jitter = 62.6) 14.299 40 images/sec: 6126.5 +/- 12.5 (jitter = 57.8) 14.298 50 images/sec: 6226.8 +/- 73.0 (jitter = 66.4) 14.298 60 images/sec: 6497.5 +/- 115.6 (jitter = 92.3) 14.299 70 images/sec: 6705.9 +/- 122.1 (jitter = 121.1) 14.299 80 images/sec: 6874.3 +/- 120.4 (jitter = 242.2) 14.298 90 images/sec: 7014.0 +/- 116.0 (jitter = 374.4) 14.298 100 images/sec: 7133.7 +/- 111.3 (jitter = 404.1) 14.298 ---------------------------------------------------------------- total images/sec: 7013.81 ----------------------------------------------------------------
9. GPU分片测试
在BitFusion管理页面中为Client显示为0.5个GPU
执行测试命令,会提示错误
[code]$ bitfusion run --num_gpus 1 -- python3 ./benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py Error requesting gpus: Error starting dispatcher: Error sending heartbeat: Error when sending cluster session information: ErrorOverQuota: client 18e87c5 allocation over quota: 0.50 quota, 1.00 allocated Error starting dispatcher: Error sending heartbeat: Error when sending cluster session information: ErrorOverQuota: client 18e87c5 allocation over quota: 0.50 quota, 1.00 allocated
修改为0.5个分片,运行成功
[code]$ bitfusion run --num_gpus 1 --partial 0.5 -- python3 ./benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py Requested resources: Server List: 10.10.10.11:56001 Client idle timeout: 1 min ... pciBusID: 0000:00:00.0 totalMemory: 7.38GiB freeMemory: 6.99GiB ... 2020-07-29 18:25:16.604751: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally Done warm up Step Img/sec total_loss 1 images/sec: 5998.6 +/- 0.0 (jitter = 0.0) 14.299 10 images/sec: 6005.8 +/- 12.4 (jitter = 32.3) 14.299 20 images/sec: 6009.2 +/- 7.4 (jitter = 35.3) 14.299 30 images/sec: 6001.9 +/- 7.1 (jitter = 40.8) 14.299 40 images/sec: 6123.6 +/- 89.9 (jitter = 50.5) 14.298 50 images/sec: 6436.6 +/- 130.8 (jitter = 66.8) 14.298 60 images/sec: 6667.9 +/- 132.9 (jitter = 113.8) 14.299 70 images/sec: 6849.3 +/- 127.8 (jitter = 209.3) 14.299 80 images/sec: 6996.1 +/- 120.6 (jitter = 486.3) 14.299 90 images/sec: 7131.8 +/- 116.1 (jitter = 406.1) 14.298 100 images/sec: 7289.4 +/- 117.6 (jitter = 449.9) 14.298 ---------------------------------------------------------------- total images/sec: 7165.57 ----------------------------------------------------------------
10. 测试总结
优点
- 使用简单,Client调用bitfusion GPU资源时和语言无关,直接在程序前加上: bitfusion run --num_gpus {n} --partial {n} -- 即可。
- GPU共享,可以供多台VM通过网络调用GPU计算资源,Client用完GPU资源后就会立即释放,其他Client可以继续使用。
- 可以将GPU内存划分为任意大小不同的切片,然后分配给不同的客户端以供同时使用。
- 不需要NVIDIA许可。
缺点
- 目前仅支持 RHEL/Centos 7, Ubuntu 18.04/16.04
- vmware fusion pro 10安装详细步骤
- vmware fusion pro 10破解版
- Mac虚拟机 VMware Fusion Pro 11破解版 附注册机序列号
- VMWARE FUSION PRO 11 FOR MAC(VM虚拟机)中文扩展版 V11.0.2(10952296)永久密钥版
- Mac电脑双系统怎么弄呢?VMware Fusion Pro 11 for Mac(VM虚拟机)完美解决你的需求!
- 通过VMware Fusion模拟cisco ASA/N7K等,再通过socat映射模拟串口进行配置
- VMware Fusion Professional 10
- [置顶] VMware Fusion Pro 8 for Mac 启动出现“内部错误” 解决方案
- VMware Fusion Pro 11.0.3 Extended Edition 中文扩展特别版 Mac 优秀的虚拟机
- Oracle Fusion Applications (11.1.8) Media Pack and Oracle Application Development Framework 11g (11.1.1.7.2) for Microsoft Windows x64 (64-bit)
- VMware-Fusion-7.0.0-2103067 Pro SN:序列号+ 百度云下载地址
- Vmware Fusion Pro安装Windows7系统
- 苹果虚拟机VMware Fusion Pro中文版
- Running a 64-bit VMware image on a 32-bit machine
- VMware-Fusion-7.0.0-2103067 Pro SN:序列号+ 百度云下载地址
- vmware fusion NAT下的端口,ip配置
- vmware fusion pro 10破解版
- vmware fusion pro 11 for mac 破解版永久激活方法
- VMWare虚拟机NAT模式下static IP(适用有mac vmvare fusion)
- Running a 64-bit VMware image on a 32-bit machine