Word2vec multithreading (TensorFlow)
2015-12-16 20:17
workers = []
for _ in xrange(opts.concurrent_steps):
    t = threading.Thread(target=self._train_thread_body)
    t.start()
    workers.append(t)
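word2vec.py uses multiple threads.
It is commonly said that multithreading in Python is effectively single-threaded.
This comes from Python's design: its memory management is not thread-safe, so the GIL (global interpreter lock) lets only one thread execute Python code at a time.
Here, however, the threads call into C++ code internally,
so multithreading still has a real effect.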
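A minimal sketch of this effect, not taken from word2vec.py. It assumes `time.sleep` as a stand-in for a native call that releases the GIL while it runs, and the helper names `blocking_call` and `run_workers` are hypothetical. Four threads each block for 0.2 s, and the total wall time should stay well below the 0.8 s that strictly serial execution would need.

```python
import threading
import time


def blocking_call(seconds):
    # Stand-in for a native (C/C++) call: time.sleep releases the GIL while waiting.
    time.sleep(seconds)


def run_workers(num_threads, seconds):
    """Start num_threads threads that each block for `seconds`; return elapsed wall time."""
    workers = []
    start = time.perf_counter()
    for _ in range(num_threads):
        t = threading.Thread(target=blocking_call, args=(seconds,))
        t.start()
        workers.append(t)
    # Wait for every worker before measuring the elapsed time.
    for t in workers:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_workers(4, 0.2)
    print("elapsed: %.2fs (serial execution would need about 0.80s)" % elapsed)
```

If `blocking_call` were replaced by a CPU-bound loop written in pure Python, the threads would take turns holding the GIL and the speedup would largely disappear.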
Inside the class that implements word2vec's skipgram operator,
conflicts between threads accessing shared state are resolved with a lock (mutex):
mutex mu_;
random::PhiloxRandom philox_ GUARDED_BY(mu_);
random::SimplePhilox rng_ GUARDED_BY(mu_);
int32 current_epoch_ GUARDED_BY(mu_) = -1;
int64 total_words_processed_ GUARDED_BY(mu_) = 0;
int32 example_pos_ GUARDED_BY(mu_);
int32 label_pos_ GUARDED_BY(mu_);
int32 label_limit_ GUARDED_BY(mu_);
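My impression is that the operator itself still executes serially, one thread at a time,
because of the lock,
while the batch computation that follows runs in parallel.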
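The locking pattern described above can be sketched in Python. This is an illustration only, not the TensorFlow C++ implementation: `BatchSource`, `next_batch`, `worker`, and `run` are hypothetical names, and a plain sum stands in for the real batch computation. The position counter is guarded by a lock, so handing out batches is serialized, while the per-batch work happens outside that lock.

```python
import threading


class BatchSource:
    """Hands out consecutive slices of `data`; the cursor is guarded by a lock."""

    def __init__(self, data, batch_size):
        self._data = data
        self._batch_size = batch_size
        self._pos = 0                 # guarded by self._mu
        self._mu = threading.Lock()

    def next_batch(self):
        # Only one thread at a time may advance the cursor.
        with self._mu:
            start = self._pos
            self._pos = min(start + self._batch_size, len(self._data))
            end = self._pos
        return self._data[start:end]  # empty list once the data is exhausted


def worker(source, results, results_mu):
    while True:
        batch = source.next_batch()
        if not batch:
            break
        partial = sum(batch)          # per-batch work, done outside the source lock
        with results_mu:
            results.append(partial)


def run(data, batch_size, num_threads):
    source = BatchSource(data, batch_size)
    results, results_mu = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(source, results, results_mu))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each example is handed out exactly once regardless of how the threads interleave. In pure Python the `sum` call still runs under the GIL, so this sketch shows only the structure; the actual parallelism in the word2vec case comes from the work being done in C++.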
def _train_thread_body(self):
    initial_epoch, = self._session.run([self._epoch])
    while True:
        _, epoch = self._session.run([self._train, self._epoch])
        if epoch != initial_epoch:
            break
(words, counts, words_per_epoch, self._epoch, self._words, examples,
 labels) = word2vec.skipgram(filename=opts.train_data,
                             batch_size=opts.batch_size,
                             window_size=opts.window_size,
                             min_count=opts.min_count,
                             subsample=opts.subsample)
The threading lock only affects Python code. If your thread is waiting for disk I/O or if it is calling C functions (e.g. via math library) you can ignore the GIL.
You may be able to use the async pattern to get around threading limits. Can you supply more information about what your program actually does?
I have issues with the technical accuracy of the video linked. David Beazley has done many well respected talks about the GIL at various Pycons. You can find them on pyvideo.org.
Source: <https://www.reddit.com/r/Python/comments/3s0vg9/is_my_multithreaded_python_program_doomed/>