TensorLayer Official Chinese Documentation 1.7.4: CLI – Command Line Interface
2018-03-10 16:40
CLI - Command Line Interface

The tensorlayer.cli module provides a command-line tool for some common tasks.

tl train

(Alpha release - usage might change later)

The tensorlayer.cli.train module provides the tl train subcommand.
It helps the user bootstrap a TensorFlow/TensorLayer program for distributed training
using multiple GPU cards or CPUs on a computer.
You first need to set the CUDA_VISIBLE_DEVICES environment variable to tell
tl train which GPUs are available. If CUDA_VISIBLE_DEVICES is not given,
tl train will try its best to discover all available GPUs.
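To illustrate how the variable is interpreted, here is a minimal sketch of parsing CUDA_VISIBLE_DEVICES; this is an illustration only, not tl train's actual implementation:

```python
import os

def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of GPU id strings.

    Returns None when the variable is unset, meaning the caller
    should fall back to discovering all available GPUs.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [v.strip() for v in value.split(",") if v.strip()]

print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0,1"}))  # → ['0', '1']
```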
In distributed training, each TensorFlow program needs a TF_CONFIG environment variable to describe
the cluster. It also needs a master daemon to monitor all trainers.
tl train is responsible for automatically managing these two tasks.
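For reference, TF_CONFIG is a JSON document describing the cluster and the role of the current process. The sketch below shows the general shape of such a variable; the host/port values and task assignment are illustrative, not what tl train actually generates:

```python
import json

# Illustrative TF_CONFIG for a cluster with one parameter server
# and two worker trainers; each launched process gets its own
# "task" entry describing its role in the cluster.
tf_config = {
    "cluster": {
        "ps": ["localhost:2222"],
        "worker": ["localhost:2223", "localhost:2224"],
    },
    "task": {"type": "worker", "index": 0},  # role of this process
}

# The variable is passed to each process as a JSON string.
encoded = json.dumps(tf_config)
print(encoded)
```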
Usage

tl train [-h] [-p NUM_PSS] [-c CPU_TRAINERS] <file> [args [args ...]]

# example of using GPU 0 and 1 for training mnist
CUDA_VISIBLE_DEVICES="0,1" tl train example/tutorial_mnist_distributed.py
# example of using CPU trainers for inception v3
tl train -c 16 example/tutorial_imagenet_inceptionV3_distributed.py
# example of using GPU trainers for inception v3 with customized arguments
# as CUDA_VISIBLE_DEVICES is not given, tl would try to discover all available GPUs
tl train example/tutorial_imagenet_inceptionV3_distributed.py -- --batch_size 16
Parameters

file: python file path.
NUM_PSS: The number of parameter servers.
CPU_TRAINERS: The number of CPU trainers. It is recommended that NUM_PSS + CPU_TRAINERS <= cpu count.
args: Any parameter after -- would be passed to the python program.
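The cpu-count recommendation above can be checked programmatically before launching a job; a minimal sketch, where the helper name and the example values are hypothetical:

```python
import multiprocessing

def within_cpu_budget(num_pss, cpu_trainers):
    """Check the recommendation NUM_PSS + CPU_TRAINERS <= cpu count."""
    return num_pss + cpu_trainers <= multiprocessing.cpu_count()

# e.g. 1 parameter server and 3 CPU trainers on this machine
print(within_cpu_budget(1, 3))
```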
Notes

A parallel training program requires multiple parameter servers
to help parallel trainers exchange intermediate gradients.
The best number of parameter servers is often proportional to the
size of your model as well as the number of CPUs available.
You can control the number of parameter servers using the -p parameter.

If you have a single computer with many CPUs, you can use the -c parameter
to enable CPU-only parallel training.

The reason we do not support GPU-CPU co-training is that GPUs and
CPUs run at different speeds; using them together in training would
incur stragglers.