TensorFlow ConfigProto & inter_/intra_op_parallelism_threads: Notes
2017-05-08 15:51
What is TensorFlow's ConfigProto for?
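tf.ConfigProto is typically used when creating a session, to configure that session. The options include:
a) Logging device placement: to find out which devices your operations and tensors are assigned to, create the session with log_device_placement set to True: sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)). This prints the device (CPU or GPU) on which each operation runs.
Also: you can manually choose which operations run on the CPU or the GPU, e.g. with tf.device('/cpu:0'):, which sets the device context to cpu0; every operation created inside that context runs on cpu0.
b) To avoid errors when a device you specified does not exist, set allow_soft_placement to True in the session config. TensorFlow will then automatically pick an existing, supported device to run the operation.
c) config = tf.ConfigProto()
config.gpu_options.allow_growth = True: allocates only a small amount of GPU memory at first and grows the allocation on demand. Because the memory is never released, this can lead to fragmentation.
d) gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
config=tf.ConfigProto(gpu_options=gpu_options): sets the fraction of each GPU's memory that the process may use; 0.7 here means 70% (0.4 would mean 40%).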
References: http://blog.csdn.Net/u012436149/article/details/53837651
http://wiki.jikexueyuan.com/project/tensorflow-zh/how_tos/using_gpu.html
e) intra_op_parallelism_threads and inter_op_parallelism_threads: choose how many cores to use
Source: http://blog.csdn.net/h_jlwg6688/article/details/65441723?locationNum=12&fps=1
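Options a) through d) can be combined in a single config. The following is a minimal sketch, assuming the 1.x-style API is reachable as tensorflow.compat.v1 (on the 1.x releases these notes were written for, a plain "import tensorflow as tf" plays the same role). The tensor names and values are illustrative only.

```python
# Assumption: TF 1.x-style API via the compat module (TF >= 1.14 or TF 2.x).
import tensorflow.compat.v1 as tf

# Only relevant on TF 2.x: Session-based graphs require eager mode to be off.
tf.disable_eager_execution()

# c) + d): grow GPU memory on demand, capped at 70% of each GPU's memory.
gpu_options = tf.GPUOptions(
    allow_growth=True,
    per_process_gpu_memory_fraction=0.7,
)

# a) + b): log where each op is placed, and fall back to an available
# device if the requested one does not exist.
config = tf.ConfigProto(
    log_device_placement=True,
    allow_soft_placement=True,
    gpu_options=gpu_options,
)

# Manually pin these ops to the first CPU device.
with tf.device('/cpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name='a')
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]], name='b')
    c = tf.matmul(a, b, name='c')

with tf.Session(config=config) as sess:
    # The placement log goes to stderr; the result is printed to stdout.
    print(sess.run(c))
```

On a machine without a GPU the gpu_options settings should simply have no effect, so the sketch can still be tried on CPU only.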
=========================================================================================================
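Multi-threading settings
When initializing tf.ConfigProto(), we can also set intra_op_parallelism_threads and inter_op_parallelism_threads to control how many threads are used to run ops in parallel. The two differ as follows:
intra_op_parallelism_threads controls parallelism inside a single op.
When an op is a single operation that can be parallelized internally, such as matrix multiplication or reduce_sum, intra_op_parallelism_threads sets the number of threads available for that internal parallelism. "intra" means "within".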
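inter_op_parallelism_threads controls parallelism between multiple ops.
When there are several ops that are independent of each other, with no direct path connecting them in the graph, TensorFlow tries to compute them in parallel using a thread pool whose size is set by inter_op_parallelism_threads.
Source: http://blog.csdn.net/rockingdingo/article/details/55652662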
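As a rough illustration of the two settings, the sketch below builds one internally parallelizable op and two ops with no path between them. It assumes the 1.x-style API via tensorflow.compat.v1; the thread counts (4 and 2) and the tensor shapes are arbitrary choices for the example, not recommendations.

```python
# Assumption: TF 1.x-style API via the compat module (TF >= 1.14 or TF 2.x).
import tensorflow.compat.v1 as tf

# Only relevant on TF 2.x: Session-based graphs require eager mode to be off.
tf.disable_eager_execution()

config = tf.ConfigProto(
    intra_op_parallelism_threads=4,  # threads available inside a single op
    inter_op_parallelism_threads=2,  # threads for running independent ops
)

x = tf.random_normal([1000, 1000])
y = tf.random_normal([1000, 1000])

# A single op that can be parallelized internally (intra-op).
product = tf.matmul(x, y)

# Two ops with no path between them, so they may run concurrently (inter-op).
total_x = tf.reduce_sum(x)
total_y = tf.reduce_sum(y)

with tf.Session(config=config) as sess:
    sess.run([product, total_x, total_y])
```

Whether these values actually speed anything up depends on the graph and the machine, so they are best treated as knobs to measure rather than fixed settings.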
=========================================================================================================
TensorFlow: inter- and intra-op parallelism configuration
Can somebody please explain the following TensorFlow terms: inter_op_parallelism_threads and intra_op_parallelism_threads, or please provide links to the right source of explanation?
1 Answer
The inter_op_parallelism_threads and intra_op_parallelism_threads options are documented in the source of the tf.ConfigProto protocol buffer. These options configure two thread pools used by TensorFlow to parallelize execution, as the comments describe:
    // The execution of an individual op (for some op types) can be
    // parallelized on a pool of intra_op_parallelism_threads.
    // 0 means the system picks an appropriate number.
    int32 intra_op_parallelism_threads = 2;

    // Nodes that perform blocking operations are enqueued on a pool of
    // inter_op_parallelism_threads available in each process.
    //
    // 0 means the system picks an appropriate number.
    //
    // Note that the first Session created in the process sets the
    // number of threads for all future sessions unless use_per_session_threads is
    // true or session_inter_op_thread_pool is configured.
    int32 inter_op_parallelism_threads = 5;
There are several possible forms of parallelism when running a TensorFlow graph, and these options provide some control over multi-core CPU parallelism:
If you have an operation that can be parallelized internally, such as matrix multiplication (tf.matmul()) or a reduction (e.g. tf.reduce_sum()), TensorFlow will execute it by scheduling tasks in a thread pool with intra_op_parallelism_threads threads. This configuration option therefore controls the maximum parallel speedup for a single operation. Note that if you run multiple operations in parallel, these operations will share this thread pool.
If you have many operations that are independent in your TensorFlow graph (because there is no directed path between them in the dataflow graph), TensorFlow will attempt to run them concurrently, using a thread pool with inter_op_parallelism_threads threads. If those operations have a multithreaded implementation, they will (in most cases) share the same thread pool for intra-op parallelism.
Finally, both configuration options take a default value of 0, which means "the system picks an appropriate number." Currently, this means that each thread pool will have one thread per CPU core in your machine.
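Source: http://stackoverflow.com/questions/41233635/tensorflow-inter-and-intra-op-parallelism-configuration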
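The two-pool arrangement described in the answer can be mimicked with Python's standard library. This is only a toy model, not TensorFlow's scheduler: the function names (resolve_threads, parallel_sum, run_independent_ops) are made up for the illustration, and because of the GIL it shows the structure of the scheduling rather than any real speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_threads(requested):
    """Mimic the '0 means the system picks' rule: one thread per CPU core."""
    return requested if requested > 0 else (os.cpu_count() or 1)


def parallel_sum(values, intra_pool, chunks):
    """One 'op' that splits its own work across the shared intra-op pool."""
    size = max(1, -(-len(values) // chunks))  # ceiling division
    parts = [values[i:i + size] for i in range(0, len(values), size)]
    return sum(intra_pool.map(sum, parts))


def run_independent_ops(inputs, intra_threads=0, inter_threads=0):
    """Run one independent 'op' per input on the inter-op pool.

    Every op shares the same intra-op pool, as the answer describes.
    """
    n_intra = resolve_threads(intra_threads)
    n_inter = resolve_threads(inter_threads)
    with ThreadPoolExecutor(n_intra) as intra_pool, \
            ThreadPoolExecutor(n_inter) as inter_pool:
        futures = [
            inter_pool.submit(parallel_sum, values, intra_pool, n_intra)
            for values in inputs
        ]
        return [f.result() for f in futures]


results = run_independent_ops(
    [list(range(100)), list(range(1000))],
    intra_threads=2,
    inter_threads=2,
)
print(results)
```

Tasks in the intra-op pool never wait on other tasks, so the ops running on the inter-op pool cannot deadlock while waiting for their chunks.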