TensorFlow Study Diary 6
2017-07-19 16:48
1. tf.test.main
Explanation: main(argv=None): Runs all unit tests.
2. tf.test.TestCase
Explanation:
import tensorflow as tf


class SquareTest(tf.test.TestCase):

    def testSquare(self):
        # test_session() installs a default session, which x.eval() needs.
        with self.test_session():
            x = tf.square([2, 3])
            self.assertAllEqual(x.eval(), [4, 9])


if __name__ == '__main__':
    # Run every test case defined in this module.
    tf.test.main()
Note: TensorFlow provides a convenience class inheriting from unittest.TestCase which adds methods relevant to TensorFlow tests.
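3. Weight Decay and Momentum in SGD
Explanation: The tunable parameters of SGD include the learning rate, weight decay, momentum, and learning rate decay. Two of them are described here:
(1)Weight Decay
Note: Weight decay penalizes large weights, which helps keep the model from overfitting.
(2)Momentum
Note: Momentum lets SGD make faster progress in flat regions of the loss surface.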
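The two terms can be sketched in a few lines of plain Python. This is only an illustration under assumptions of my own: the function name sgd_step, the default hyperparameters, and the toy quadratic loss are hypothetical, and weight decay is written as an L2 term added to the gradient, which is one common formulation.

```python
def sgd_step(weights, grads, velocity, lr=0.1, momentum=0.9, weight_decay=1e-4):
    """One SGD update with L2 weight decay and classical momentum."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w      # weight decay: pull each weight toward zero
        v = momentum * v - lr * g     # momentum: keep part of the previous step
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity


# Toy problem (assumed for illustration): minimize f(w) = 0.5 * w^2, so grad = w.
weights, velocity = [1.0], [0.0]
for _ in range(200):
    grads = list(weights)
    weights, velocity = sgd_step(weights, grads, velocity)
print(weights[0])
```

With momentum set to 0 the update reduces to plain SGD, which makes it easy to compare how quickly the two variants shrink the weight on this toy loss.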
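4. Visual Machine Learning Platforms
Explanation:
(1)Alibaba PAI 2.0
With this platform, Taobao search ranks results based on item and user features. By using parameter servers, Taobao can spread a model with on the order of ten billion features across dozens or even hundreds of parameter servers, which removes the scale bottleneck.
(2)Tencent DI-X
It is mainly used for game churn prediction, user label propagation, and ad click behavior prediction. Taking user behavior prediction as an example, with DI-X one can drag and drop a BRNN Encoder model (bidirectional recurrent neural network encoder), extract base features from the behavior sequences of the user and of the user's circle of friends, and then train a Stacked Auto-Encoder model. This takes full advantage of the RNN's characteristics and yields more accurate behavior predictions than conventional models.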
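5. Generative Adversarial Network (GAN) [2]
Explanation: A generative adversarial network mainly addresses the problem of how to learn to produce new samples from training samples. The generative model learns the distribution of the training samples: if the training samples are images it generates similar images, and if they are sentences it generates similar sentences. The discriminative model is a binary classifier that judges whether an input sample is real data or a generated sample.
[Figure not reproduced: structure of a generative adversarial network.]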
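The interplay between the two models is usually written as a minimax objective. The formula below is the standard formulation rather than something stated in these notes; the symbols G (generator), D (discriminator), p_data (data distribution), and p_z (noise prior) are the conventional ones and are assumed here.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make V large by scoring real samples near 1 and generated samples near 0, while the generator tries to make V small by producing samples that the discriminator scores as real.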
6. tf.train.latest_checkpoint
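Explanation: The usual pattern for restoring the most recent checkpoint is:
(1)saver = tf.train.Saver()
(2)chkpt_fname = tf.train.latest_checkpoint(output_path)
(3)saver.restore(sess, chkpt_fname)
Note: tf.train.latest_checkpoint returns None if no checkpoint is found, so check chkpt_fname before calling saver.restore.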
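7. Frequency Spectrum
Explanation: "Spectrum" is short for frequency spectral density; it is the distribution curve over frequency. A complex oscillation can be decomposed into harmonic oscillations of different amplitudes and frequencies, and the plot of those amplitudes ordered by frequency is called the spectrum. The spectrum moves the study of a signal from the time domain into the frequency domain, which gives a more intuitive view. The spectrum of a complex mechanical vibration is called a mechanical vibration spectrum, that of an acoustic vibration a sound spectrum, that of a light vibration an optical spectrum, and that of an electromagnetic vibration an electromagnetic spectrum.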
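To make the decomposition concrete, here is a small sketch that computes an amplitude spectrum with a naive discrete Fourier transform. The function name amplitude_spectrum and the test signal (a 3 Hz tone plus a weaker 10 Hz tone, sampled 64 times over one second) are assumptions chosen for illustration; in practice an FFT library would be used instead.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Return the one-sided amplitude spectrum of a real signal via a naive DFT."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid of k cycles per window.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        # Scale so a sinusoid of amplitude A shows up as A in its frequency bin.
        is_edge_bin = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 / n if is_edge_bin else 2.0 / n
        amplitudes.append(abs(coeff) * scale)
    return amplitudes


# Hypothetical signal: 64 samples over one second, so bin k corresponds to k Hz.
N = 64
signal = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 10 * t / N)
          for t in range(N)]
spectrum = amplitude_spectrum(signal)
peaks = [k for k, a in enumerate(spectrum) if a > 0.1]
print(peaks)  # [3, 10]
```

In the time domain the signal looks like an irregular wave; in the frequency domain it is just two spikes, with heights close to 1.0 and 0.5.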
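8. Saving and Loading a Graph
Explanation: Save the graph with tf.train.write_graph(); the output contains only the graph structure, not the weights. Then load it with tf.import_graph_def().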
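9. Queues in TensorFlow
Explanation:
(1)FIFOQueue: creates a first-in, first-out queue.
(2)RandomShuffleQueue: creates a randomized queue whose dequeue operations return elements in random order.
Note: RandomShuffleQueue is very important when TensorFlow uses asynchronous computation. tf.train.QueueRunner is the queue runner and tf.train.Coordinator is the coordinator; by default, all queue runners are added to the graph's tf.GraphKeys.QUEUE_RUNNERS collection.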
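The difference in dequeue order can be shown without TensorFlow. The two classes below are hypothetical stand-ins written for this note, not the TensorFlow API: they ignore capacity limits, blocking, and threads, and only contrast the order in which elements come out.

```python
import random
from collections import deque


class FifoSketch:
    """Dequeues elements in the order they were enqueued."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        return self._items.popleft()


class RandomShuffleSketch:
    """Dequeues a uniformly random element from those currently buffered."""

    def __init__(self, seed=None):
        self._items = []
        self._rng = random.Random(seed)

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        # Swap a random element to the end, then pop it in O(1).
        i = self._rng.randrange(len(self._items))
        self._items[i], self._items[-1] = self._items[-1], self._items[i]
        return self._items.pop()


fifo, shuffled = FifoSketch(), RandomShuffleSketch(seed=0)
for value in range(5):
    fifo.enqueue(value)
    shuffled.enqueue(value)

fifo_order = [fifo.dequeue() for _ in range(5)]
shuffled_order = [shuffled.dequeue() for _ in range(5)]
print(fifo_order)      # [0, 1, 2, 3, 4]
print(shuffled_order)  # some permutation of 0..4
```

Random dequeue order is what lets a training loop see examples in a different order from the one in which reader threads produced them.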
10. Word2Vec
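Explanation: Word2vec is a two-layer neural network for processing text. Its input is a text corpus and its output is a set of vectors: feature vectors for the words in that corpus. Although Word2vec is not a deep neural network, it converts text into a numerical form that deep neural networks can work with. It has two main modes, CBOW (Continuous Bag of Words) and Skip-Gram. CBOW predicts the target word (for example, "Beijing") from the surrounding sentence (for example, "The capital of China is __"). Skip-Gram does the opposite and predicts the surrounding context from the target word. CBOW is better suited to small datasets, while Skip-Gram performs better on large corpora.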
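The two modes differ mainly in how training examples are cut from a sentence. The sketch below only builds those examples and does not train any vectors; the helper names and the window size of 1 are assumptions made for illustration.

```python
def context_window(tokens, i, window):
    """Return the words within `window` positions of tokens[i], excluding it."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens, window=1):
    # CBOW: (context words) -> target word
    return [(context_window(tokens, i, window), target)
            for i, target in enumerate(tokens)]


def skipgram_examples(tokens, window=1):
    # Skip-Gram: target word -> each context word, one pair at a time
    return [(target, context)
            for i, target in enumerate(tokens)
            for context in context_window(tokens, i, window)]


sentence = "the capital of china is beijing".split()
cbow = cbow_examples(sentence)
skipgram = skipgram_examples(sentence)
print(cbow[-1])      # (['is'], 'beijing')
print(skipgram[:3])  # [('the', 'capital'), ('capital', 'the'), ('capital', 'of')]
```

CBOW produces one example per word, while Skip-Gram produces one example per (word, neighbor) pair, so Skip-Gram yields more training pairs from the same text.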
11. tf.estimator.Estimator and tf.estimator.EstimatorSpec
Explanation:
(1)class tf.estimator.Estimator: Estimator class to train and evaluate TensorFlow models.
(2)class tf.estimator.EstimatorSpec: Ops and objects returned from a model_fn and passed to Estimator. EstimatorSpec fully defines the model to be run by Estimator.
12. tf.logging.set_verbosity(tf.logging.INFO)
Explanation: Sets the logging threshold to INFO so that training logs are shown on the console.
13. tf.estimator.inputs.numpy_input_fn
Explanation: numpy_input_fn(x, y=None, batch_size=128, num_epochs=1, shuffle=None, queue_capacity=1000, num_threads=1)
(1)x: dict of numpy array object.
(2)y: numpy array object. None if absent.
(3)batch_size: Integer, size of batches to return.
(4)num_epochs: Integer, number of epochs to iterate over data. If None will run forever.
(5)shuffle: Boolean, if True shuffles the queue. Avoid shuffle at prediction time.
(6)queue_capacity: Integer, size of queue to accumulate.
(7)num_threads: Integer, number of threads used for reading and enqueueing. In order to have predicted and
repeatable order of reading and enqueueing, such as in prediction and evaluation mode, num_threads should be 1.
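The meaning of batch_size, num_epochs, and shuffle can be illustrated with a plain generator. This is a simplified stand-in and not how numpy_input_fn works internally: it ignores the queue and the reader threads, and the real function may form batches differently at epoch boundaries. The function name batches and the sample data are hypothetical.

```python
import random


def batches(x, y, batch_size=128, num_epochs=1, shuffle=False, seed=None):
    """Yield (features, labels) batches; a simplified stand-in, not the real queue."""
    rng = random.Random(seed)
    n = len(y)
    epoch = 0
    # num_epochs=None means "keep cycling through the data forever".
    while num_epochs is None or epoch < num_epochs:
        order = list(range(n))
        if shuffle:
            rng.shuffle(order)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            features = {name: [column[i] for i in idx] for name, column in x.items()}
            labels = [y[i] for i in idx]
            yield features, labels
        epoch += 1


x = {"age": [21, 35, 48, 52, 60]}  # hypothetical feature column
y = [0, 0, 1, 1, 1]
result = list(batches(x, y, batch_size=2, num_epochs=2))
print(len(result))  # 6 batches: sizes 2, 2, 1 in each of the 2 epochs
print(result[0])    # ({'age': [21, 35]}, [0, 0])
```

As in the parameter list above, x is a dict of feature columns and y is the label array; shuffling is left off here so the output order is repeatable.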
14. tf.estimator.inputs.pandas_input_fn
Explanation: pandas_input_fn(x, y=None, batch_size=128, num_epochs=1, shuffle=None, queue_capacity=1000, num_threads=1, target_column='target')
(1)x: pandas DataFrame object.
(2)y: pandas Series object. None if absent.
(3)batch_size: int, size of batches to return.
(4)num_epochs: int, number of epochs to iterate over data. If not None, read attempts that would exceed this
value will raise OutOfRangeError.
(5)shuffle: bool, whether to read the records in random order.
(6)queue_capacity: int, size of the read queue. If None, it will be set roughly to the size of x.
(7)num_threads: Integer, number of threads used for reading and enqueueing. In order to have predicted and
repeatable order of reading and enqueueing, such as in prediction and evaluation mode, num_threads should be 1.
(8)target_column: str, name to give the target column y.
15. tf.estimator.Estimator.train and tf.estimator.Estimator.predict
Explanation:
(1)train(input_fn, hooks=None, steps=None, max_steps=None): Trains a model given training data input_fn.
(2)predict(input_fn, predict_keys=None, hooks=None, checkpoint_path=None): Returns predictions for given features.
16. tf.estimator.Estimator.evaluate
Explanation: evaluate(input_fn, steps=None, hooks=None, checkpoint_path=None, name=None): Evaluates the model given evaluation data input_fn.
(1)input_fn: Input function returning a tuple of: features - Dictionary of string feature name to Tensor or
SparseTensor. labels - Tensor or dictionary of Tensor with labels.
(2)steps: Number of steps for which to evaluate model. If None, evaluates until input_fn raises an end-of-input
exception.
(3)hooks: List of SessionRunHook subclass instances. Used for callbacks inside the evaluation call.
(4)checkpoint_path: Path of a specific checkpoint to evaluate. If None, the latest checkpoint in model_dir is used.
(5)name: Name of the evaluation if user needs to run multiple evaluations on different data sets, such as on
training data vs test data. Metrics for different evaluations are saved in separate folders, and appear separately in
tensorboard.
17. tf.contrib.learn.datasets
Explanation: Dataset utilities and synthetic/reference datasets.
18. tf.metrics.accuracy
Explanation: accuracy(labels, predictions, weights=None, metrics_collections=None, updates_collections=None, name=None): Calculates how often predictions matches labels.
(1)labels: The ground truth values, a Tensor whose shape matches predictions.
(2)predictions: The predicted values, a Tensor of any shape.
(3)weights: Optional Tensor whose rank is either 0, or the same rank as labels, and must be broadcastable to
labels (i.e., all dimensions must be either 1, or the same as the corresponding labels dimension).
(4)metrics_collections: An optional list of collections that accuracy should be added to.
(5)updates_collections: An optional list of collections that update_op should be added to.
(6)name: An optional variable_scope name.
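The quantity being computed is a weighted fraction of matching positions. The sketch below shows that arithmetic for a single batch in plain Python; the function name weighted_accuracy is hypothetical, and unlike tf.metrics.accuracy it does not accumulate totals across batches through an update op.

```python
def weighted_accuracy(labels, predictions, weights=None):
    """Weighted fraction of positions where predictions equal labels."""
    if weights is None:
        weights = [1.0] * len(labels)
    matched = sum(w for l, p, w in zip(labels, predictions, weights) if l == p)
    total = sum(weights)
    return matched / total if total else 0.0


labels = [1, 0, 1, 1]
predictions = [1, 1, 1, 0]
print(weighted_accuracy(labels, predictions))                # 0.5
print(weighted_accuracy(labels, predictions, [1, 0, 1, 0]))  # 1.0
```

A weight of 0 masks an example out entirely, which is why the second call reports 1.0 even though two predictions are wrong.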
19. tf.metrics.auc
Explanation: auc(labels, predictions, weights=None, num_thresholds=200, metrics_collections=None, updates_collections=None, curve='ROC', name=None): Computes the approximate AUC via a Riemann sum.
(1)labels: A Tensor whose shape matches predictions. Will be cast to bool.
(2)predictions: A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].
(3)weights: Optional Tensor whose rank is either 0, or the same rank as labels, and must be broadcastable to
labels (i.e., all dimensions must be either 1, or the same as the corresponding labels dimension).
(4)num_thresholds: The number of thresholds to use when discretizing the roc curve.
(5)metrics_collections: An optional list of collections that auc should be added to.
(6)updates_collections: An optional list of collections that update_op should be added to.
(7)curve: Specifies the name of the curve to be computed, 'ROC' [default] or 'PR' for the Precision-Recall-curve.
(8)name: An optional variable_scope name.
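The idea of discretizing the ROC curve with a fixed set of thresholds can be sketched as follows. This is a simplified illustration rather than TensorFlow's implementation: it handles only the 'ROC' curve, ignores weights, and the threshold spacing and the small padding eps are assumptions.

```python
def approximate_auc(labels, predictions, num_thresholds=200):
    """Approximate ROC AUC by sweeping thresholds and summing trapezoids."""
    eps = 1e-7
    # Evenly spaced thresholds in (0, 1), padded so both curve endpoints appear.
    inner = [(i + 1) / (num_thresholds - 1) for i in range(num_thresholds - 2)]
    thresholds = [-eps] + inner + [1.0 + eps]

    positives = sum(1 for l in labels if l)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for l, p in zip(labels, predictions) if l and p > t)
        fp = sum(1 for l, p in zip(labels, predictions) if not l and p > t)
        tpr = tp / positives if positives else 0.0
        fpr = fp / negatives if negatives else 0.0
        points.append((fpr, tpr))

    # Thresholds increase, so fpr decreases from 1 to 0 along `points`.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x0 - x1) * (y0 + y1) / 2.0
    return area


labels = [0, 0, 1, 1]
predictions = [0.1, 0.4, 0.35, 0.8]
print(approximate_auc(labels, predictions))  # 0.75
```

More thresholds give a finer approximation of the curve; with too few, predictions that fall between the same pair of thresholds cannot be told apart.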
20. tf.one_hot
Explanation: one_hot(indices, depth, on_value=None, off_value=None, axis=None, dtype=None, name=None): Returns a one-hot tensor.
(1)indices: A Tensor of indices.
(2)depth: A scalar defining the depth of the one hot dimension.
(3)on_value: A scalar defining the value to fill in output when indices[j] = i. (default: 1)
(4)off_value: A scalar defining the value to fill in output when indices[j] != i. (default: 0)
(5)axis: The axis to fill (default: -1, a new inner-most axis).
(6)dtype: The data type of the output tensor.
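For a flat list of indices and the default axis, the result can be reproduced in plain Python. The function below is a sketch for illustration only: it returns nested lists rather than a tensor and does not support the axis or dtype arguments.

```python
def one_hot(indices, depth, on_value=1, off_value=0):
    """One-hot encode a flat list of indices along a new innermost axis."""
    return [[on_value if i == index else off_value for i in range(depth)]
            for index in indices]


print(one_hot([0, 2, -1, 1], depth=3))
# [[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0]]
```

An index outside the range [0, depth), such as -1 here, produces a row filled entirely with off_value.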
References:
[1] TensorFlow Python: http://docs.w3cub.com/tensorflow~python/
[2] zhangqianhui/AdversarialNetsPapers: https://github.com/zhangqianhui/AdversarialNetsPapers