[Deep Learning] A First Look at TensorFlow (Python)
2017-10-16 10:34
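Overview
TensorFlow is a programming system that represents computations as graphs. The nodes of a graph are called ops (short for operations). An op takes zero or more Tensors, performs some computation, and produces zero or more Tensors. Each Tensor is a typed multi-dimensional array. For example, a mini-batch of images can be represented as a 4-D array of floating-point numbers with dimensions [batch, height, width, channels].
A TensorFlow graph only describes a computation. To actually compute anything, the graph must be launched in a Session. A Session places the graph's ops onto devices such as CPUs or GPUs and provides methods to execute them. These methods return the tensors produced by the ops: as numpy ndarray objects in Python, and as tensorflow::Tensor instances in C and C++.
Basic concepts:
Computations are represented as graphs.
Graphs are executed in the context of a Session.
Data is represented as tensors.
State is maintained with Variables.
Feeds and fetches move data into and out of arbitrary operations.
Official installation guide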
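Graphs and Sessions
Creating a Graph and Running It in a Session
The following code creates a graph:

import tensorflow as tf

x = tf.Variable(5, name='x')
y = tf.Variable(2, name='y')
f = x*x*y + y + 10

This code builds the computation graph but does not run any computation. To evaluate the graph, open a TensorFlow Session and use it to initialize the variables and evaluate f:

sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
print(sess.run(f))
sess.close()

With many variables, sess.run() gets repeated over and over. Instead, we can use a with block to set the default session:

with tf.Session() as sess:
    x.initializer.run()  # equivalent to tf.get_default_session().run(x.initializer)
    y.initializer.run()
    result = f.eval()    # equivalent to tf.get_default_session().run(f)
    print(result)
# the session is closed automatically at the end of the with block

The code above initializes each variable by hand. We can also call global_variables_initializer() to initialize all variables. It does not perform the initialization right away; it creates a node in the graph that initializes every variable when it is run:

init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()  # this is where the variables actually get initialized
    result = f.eval()
    print(result)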
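Managing Graphs
All the code above uses the default graph. To run code in a separate graph, create one yourself:

import tensorflow as tf

x1 = tf.Variable(1)
print(x1.graph is tf.get_default_graph())  # True

graph = tf.Graph()  # a separate Graph
with graph.as_default():
    x2 = tf.Variable(2)

print(x2.graph is tf.get_default_graph())  # False

Lifecycle of a Node Value
A variable's life begins when its initializer is run and ends when the session is closed. The values of other nodes are not kept between graph runs, so evaluating y and z in two separate calls evaluates w and x twice, while fetching both in a single run evaluates them once:

import tensorflow as tf

w = tf.constant(3)
x = w + 2
y = x + 3
z = x + 4

# w and x are evaluated twice
with tf.Session() as sess:
    print(y.eval())
    print(z.eval())

# w and x are evaluated once
with tf.Session() as sess:
    y_eval, z_eval = sess.run([y, z])
    print(y_eval)
    print(z_eval)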
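The difference between the two sessions above can be mimicked in plain Python. The sketch below is a toy model, not TensorFlow code: the Node class, the run() function and the eval_count counter are hypothetical names invented for this illustration. It assumes that intermediate values are cached only for the duration of one run() call, which is the behaviour the listing above relies on.

```python
# Toy model of graph evaluation (not TensorFlow): a node's value is cached
# only for the duration of a single run() call.

class Node:
    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn            # function applied to the input values
        self.inputs = inputs    # upstream nodes
        self.eval_count = 0     # how many times this node has been computed


def run(fetches):
    """Evaluate the requested nodes, sharing intermediate values within this call."""
    cache = {}

    def evaluate(node):
        if node.name not in cache:
            args = [evaluate(dep) for dep in node.inputs]
            cache[node.name] = node.fn(*args)
            node.eval_count += 1
        return cache[node.name]

    return [evaluate(node) for node in fetches]


def build():
    """Build the same small graph as in the listing: w -> x -> (y, z)."""
    w = Node("w", lambda: 3)
    x = Node("x", lambda a: a + 2, (w,))
    y = Node("y", lambda a: a + 3, (x,))
    z = Node("z", lambda a: a + 4, (x,))
    return w, x, y, z


# Two separate runs: x is computed once per run.
w, x, y, z = build()
separate = run([y]) + run([z])
separate_x_evals = x.eval_count

# One run that fetches both y and z: x is computed once in total.
w, x, y, z = build()
together = run([y, z])
together_x_evals = x.eval_count

print(separate, "x evaluated", separate_x_evals, "times")
print(together, "x evaluated", together_x_evals, "time")
```

Both variants return the same values for y and z; only the amount of repeated work differs, which is why fetching several nodes in one sess.run() call is preferable.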
Example: Linear Regression with TensorFlow
Computing θ with the Normal Equation
For linear regression we use the Normal Equation: $\theta = (X^{T} \cdot X)^{-1} \cdot X^{T} \cdot y$
We use the california_housing dataset from sklearn to demonstrate:

import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_with_bias = np.c_[np.ones([m, 1]), housing.data]  # prepend a bias column of ones

X = tf.constant(housing_data_with_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # (X^T * X)^-1 * X^T * y

with tf.Session() as sess:
    theta_value = theta.eval()

print(theta_value)

Output:

[[ -3.74651413e+01]
 [  4.35734153e-01]
 [  9.33829229e-03]
 [ -1.06622010e-01]
 [  6.44106984e-01]
 [ -4.25131839e-06]
 [ -3.77322501e-03]
 [ -4.26648885e-01]
 [ -4.40514028e-01]]
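The Normal Equation used above can be checked with NumPy alone on a small synthetic dataset. This is only a sketch under stated assumptions: the data, the seed, the sizes and true_theta are made up for the illustration and have nothing to do with the housing data. The comparison against np.linalg.lstsq is added here as a sanity check, since it solves the same least-squares problem without forming the inverse explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, illustrative only
m, n = 200, 3                             # hypothetical sample and feature counts
features = rng.normal(size=(m, n))
true_theta = np.array([[4.0], [3.0], [-2.0], [0.5]])   # bias term first (made up)

X = np.c_[np.ones((m, 1)), features]      # prepend the bias column, as in the listing
y = X @ true_theta + 0.01 * rng.normal(size=(m, 1))    # targets with a little noise

# Normal Equation: theta = (X^T X)^-1 X^T y
theta_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Same least-squares problem solved without an explicit inverse
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta_normal.ravel())
print(theta_lstsq.ravel())
```

On well-conditioned data the two results agree closely; when $X^{T} X$ is close to singular, the explicit inverse is the less stable of the two.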
Implementing Gradient Descent
Next we use gradient descent instead of the Normal Equation:

import tensorflow as tf
import numpy as np
import numpy.random as rnd
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import fetch_california_housing
from datetime import datetime

scaler = StandardScaler()
housing = fetch_california_housing()
m, n = housing.data.shape
scale_housing_data = scaler.fit_transform(housing.data)  # gradient descent needs scaled features
scaled_housing_data_plus_bias = np.c_[np.ones([m, 1]), scale_housing_data]

# ### Compute the gradients (Batch) ###
tf.reset_default_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')

# gradients = 2/m * tf.matmul(tf.transpose(X), error)  # ① compute the gradients manually
# training_op = tf.assign(theta, theta - gradients * learning_rate)

# gradients = tf.gradients(mse, [theta])[0]  # ② let autodiff compute the gradients
# training_op = tf.assign(theta, theta - gradients * learning_rate)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers work too
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    # saver.restore(sess, 'my_model_final.ckpt')  # to resume from a checkpoint instead of initializing
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
            save_path = saver.save(sess, '/tmp/my_model.ckpt')  # periodic checkpoint
        sess.run(training_op)

    best_theta = theta.eval()
    save_path = saver.save(sess, "my_model_final.ckpt")

print("Best theta:")
print(best_theta)
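Computing the Gradients Manually

gradients = 2/m * tf.matmul(tf.transpose(X), error)  # ① compute the gradients manually
training_op = tf.assign(theta, theta - gradients * learning_rate)

tf.random_uniform() creates a node that generates a tensor of random values, here of shape [n+1, 1] with values drawn uniformly between -1.0 and 1.0.
tf.assign() creates a node that assigns a new value to a variable. In "① compute the gradients manually" we use it to implement the gradient descent step $\theta^{(\text{next step})} = \theta - \eta \nabla_{\theta} \mathrm{MSE}(\theta)$, where $\eta$ is the learning rate.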
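The same update can be written in NumPy to make each step explicit. This is a minimal sketch rather than the TensorFlow listing: the synthetic data, the seed, true_theta and the learning rate of 0.1 are assumptions chosen so that the loop converges quickly. The gradient line uses $\nabla_{\theta} \mathrm{MSE}(\theta) = \frac{2}{m} X^{T} (X\theta - y)$, matching variant ①.

```python
import numpy as np

rng = np.random.default_rng(42)           # fixed seed, illustrative only
m, n = 500, 2                             # hypothetical sample and feature counts
X = np.c_[np.ones((m, 1)), rng.normal(size=(m, n))]   # bias column + roughly standardized features
true_theta = np.array([[2.0], [-1.0], [0.5]])         # made-up parameters
y = X @ true_theta                        # noise-free targets

learning_rate = 0.1                       # eta (assumed value for this sketch)
n_epochs = 500
theta = rng.uniform(-1.0, 1.0, size=(n + 1, 1))       # random start, like tf.random_uniform

mse_history = []
for epoch in range(n_epochs):
    error = X @ theta - y                 # y_pred - y
    mse_history.append(float(np.mean(error ** 2)))
    gradients = 2 / m * X.T @ error       # gradient of the MSE with respect to theta
    theta = theta - learning_rate * gradients          # theta(next step) = theta - eta * gradient

print("first MSE:", mse_history[0])
print("last MSE:", mse_history[-1])
print(theta.ravel())
```

Because the targets are noise-free, theta should end up very close to true_theta; with unscaled features the same learning rate could diverge, which is why the TensorFlow listing scales the data first.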
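Gradient Descent with autodiff
Deriving the gradients by hand works for linear regression, but with deep neural networks the code becomes long and error-prone. Symbolic differentiation could find the partial-derivative equations for us; TensorFlow instead offers autodiff, which computes the gradients automatically. The key change is:

gradients = tf.gradients(mse, [theta])[0]  # ② let autodiff compute the gradients
training_op = tf.assign(theta, theta - gradients * learning_rate)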
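One way to gain confidence in a hand-derived gradient is to compare it with a numerical estimate. The sketch below is not autodiff and is not how tf.gradients() works internally; it uses central finite differences, a different and much slower technique, purely as a check of the formula $\frac{2}{m} X^{T} (X\theta - y)$. The function names and the random data are hypothetical.

```python
import numpy as np


def mse(theta, X, y):
    """Mean squared error of the linear model X @ theta."""
    error = X @ theta - y
    return float(np.mean(error ** 2))


def analytic_gradient(theta, X, y):
    """Hand-derived gradient of the MSE: 2/m * X^T (X theta - y)."""
    m = X.shape[0]
    return 2 / m * X.T @ (X @ theta - y)


def numerical_gradient(f, theta, eps=1e-6):
    """Central finite differences, one parameter at a time (slow, for checking only)."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step.flat[i] = eps
        grad.flat[i] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad


rng = np.random.default_rng(1)            # fixed seed, illustrative only
X = np.c_[np.ones((50, 1)), rng.normal(size=(50, 3))]
y = rng.normal(size=(50, 1))
theta = rng.uniform(-1.0, 1.0, size=(4, 1))

g_analytic = analytic_gradient(theta, X, y)
g_numeric = numerical_gradient(lambda t: mse(t, X, y), theta)
max_diff = float(np.max(np.abs(g_analytic - g_numeric)))
print("largest difference:", max_diff)
```

Finite differences need two function evaluations per parameter, so they are only practical as a test; computing all partial derivatives efficiently is the job autodiff takes over.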
Gradient Descent with an Optimizer
TensorFlow provides a number of optimizers out of the box. Our code uses tf.train.GradientDescentOptimizer(); other optimizers, such as tf.train.MomentumOptimizer(), can be swapped in. The code is:

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers work too
training_op = optimizer.minimize(mse)
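To give an idea of what switching to a momentum optimizer changes, here is a NumPy sketch of the classical momentum update: gradients are accumulated with a decay factor, and the parameters step along the accumulation. To my understanding this is the rule tf.train.MomentumOptimizer is documented to follow when Nesterov momentum is off, but treat that as an assumption. The helper momentum_step and the synthetic data are hypothetical.

```python
import numpy as np


def momentum_step(theta, accumulation, gradients, learning_rate, momentum):
    """One classical momentum update; momentum=0 reduces to plain gradient descent."""
    accumulation = momentum * accumulation + gradients
    theta = theta - learning_rate * accumulation
    return theta, accumulation


rng = np.random.default_rng(7)            # fixed seed, illustrative only
m = 300
X = np.c_[np.ones((m, 1)), rng.normal(size=(m, 2))]
true_theta = np.array([[1.0], [2.0], [-3.0]])   # made-up parameters
y = X @ true_theta                        # noise-free targets

theta = np.zeros((3, 1))
accumulation = np.zeros_like(theta)
for _ in range(1000):
    gradients = 2 / m * X.T @ (X @ theta - y)   # gradient of the MSE
    theta, accumulation = momentum_step(
        theta, accumulation, gradients, learning_rate=0.01, momentum=0.25
    )

print(theta.ravel())
```

The learning rate and momentum values mirror the ones in the listing above; the number of iterations is arbitrary.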
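Saving and Restoring Models
Create a Saver node once all variables have been defined, call its save() method to write a checkpoint, and call restore() to load one:

saver = tf.train.Saver()
[...]
save_path = saver.save(sess, '/tmp/my_model.ckpt')
[...]
saver.restore(sess, 'my_model_final.ckpt')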
Mini-batch Gradient Descent: Feeding Data Step by Step
To implement Mini-batch Gradient Descent, X and y must be replaced with the next mini-batch at every iteration. The simplest way to do this is with tf.placeholder():

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

At each iteration, pass the data through the feed_dict parameter:

X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

The complete code:

# Reuses tf, np, rnd, m, n, housing, learning_rate and scaled_housing_data_plus_bias from the listing above
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")  # None for a dimension means "any size"
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()

rnd.seed(42)

def fetch_batch(epoch, batch_index, batch_size):
    rnd.seed(epoch * n_batches + batch_index)  # same batch for the same step on every run
    indices = rnd.randint(m, size=batch_size)  # random row indices, sampled with replacement
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

print("Best theta:")
print(best_theta)
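The feeding loop above can be reproduced without TensorFlow to see what each step does with its mini-batch. The following NumPy sketch keeps the structure of fetch_batch() (a seed derived from the global step, indices sampled with replacement) but runs on made-up data; X_all, y_all, true_theta and the learning rate of 0.05 are assumptions for this illustration, not values from the listing.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, illustrative only
m, n = 1000, 3                            # hypothetical sample and feature counts
X_all = np.c_[np.ones((m, 1)), rng.normal(size=(m, n))]
true_theta = np.array([[0.5], [1.0], [-2.0], [3.0]])   # made-up parameters
y_all = X_all @ true_theta                # noise-free targets

batch_size = 100
n_batches = int(np.ceil(m / batch_size))
n_epochs = 10
learning_rate = 0.05                      # assumed value for this sketch


def fetch_batch(epoch, batch_index, batch_size):
    """Return a random mini-batch; the seed depends only on the global step."""
    batch_rng = np.random.default_rng(epoch * n_batches + batch_index)
    indices = batch_rng.integers(0, m, size=batch_size)   # sampled with replacement
    return X_all[indices], y_all[indices]


theta = np.zeros((n + 1, 1))
initial_mse = float(np.mean((X_all @ theta - y_all) ** 2))

for epoch in range(n_epochs):
    for batch_index in range(n_batches):
        X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
        error = X_batch @ theta - y_batch
        gradients = 2 / batch_size * X_batch.T @ error    # gradient on this batch only
        theta = theta - learning_rate * gradients

final_mse = float(np.mean((X_all @ theta - y_all) ** 2))
print("MSE before:", initial_mse, "after:", final_mse)
```

Seeding from epoch * n_batches + batch_index means a given step always sees the same rows, which makes runs reproducible; it also means some rows may never be visited, since sampling is done with replacement.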
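Visualization with TensorBoard
First, define the log directory and its name. Including a timestamp gives every run its own directory:

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

Then add the following code:

mse_summary = tf.summary.scalar('MSE', mse)
summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

The first line creates a node in the graph that evaluates the MSE and writes it to a summary (a TensorBoard-compatible binary log string). The second line creates a tf.summary.FileWriter(), which writes all summaries to log files in the log directory.
Finally, use add_summary() to update the log file. The code is: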
# Reuses tf, np, datetime, m, n and fetch_batch() from the listings above
tf.reset_default_graph()

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

learning_rate = 0.01

X = tf.placeholder(tf.float32, shape=(None, n+1), name='X')
y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')

with tf.name_scope('loss') as scope:  # group the loss ops under one name scope
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name='mse')

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()

mse_summary = tf.summary.scalar('MSE', mse)
summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:  # log the MSE every 10 mini-batches
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                summary_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

summary_writer.flush()
summary_writer.close()

print("Best theta:")
print(best_theta)
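Start TensorBoard from a terminal, pointing --logdir at the directory the FileWriter wrote to (tf_logs in the listing above; the session below was recorded with ./logs):

$ tensorboard --logdir ./logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006)
...

The graph can now be viewed in a browser at http://127.0.0.1:6006.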
Name Scopes, Modularity, and Sharing Variables
Name Scopes
A complex model can easily contain a great many nodes, which clutters the graph. Name scopes let us group related nodes together:

with tf.name_scope('loss') as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")

print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse

Modularity
Take a look at the following code:

tf.reset_default_graph()

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")
b1 = tf.Variable(0.0, name="bias1")
b2 = tf.Variable(0.0, name="bias2")

linear1 = tf.add(tf.matmul(X, w1), b1, name="linear1")
linear2 = tf.add(tf.matmul(X, w2), b2, name="linear2")

relu1 = tf.maximum(linear1, 0, name="relu1")
relu2 = tf.maximum(linear1, 0, name="relu2")  # Oops, cut&paste error! Did you spot it?

output = tf.add_n([relu1, relu2], name="output")

That code is seriously ugly, isn't it? It is repetitive and already hides a copy-and-paste bug. When the same operations are needed many times, it is better to modularize them:

tf.reset_default_graph()

def relu(X):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, 0, name="max")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
summary_writer = tf.summary.FileWriter("logs/relu2", tf.get_default_graph())
Sharing Variables
If we need a shared variable, what are our options? Consider the following approaches. The first is to create the variable up front and pass it to the function as a parameter. This becomes painful when many shared variables are needed.

tf.reset_default_graph()

def relu(X, threshold):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

threshold = tf.Variable(0.0, name="threshold")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]
output = tf.add_n(relus, name="output")

Another option is to keep the shared variables in a class or a dictionary, or to set the shared variable as an attribute of relu() on its first call:

tf.reset_default_graph()

def relu(X):
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):
            relu.threshold = tf.Variable(0.0, name="threshold")  # created on the first call only
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, relu.threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

TensorFlow's Approach
TensorFlow handles shared variables with get_variable(), which either creates the variable or reuses an existing one. Which of the two happens is controlled by the enclosing variable_scope(): by default the variable is created, and an exception is raised if the name is already taken; with reuse=True the existing variable is reused, and an exception is raised if it does not exist.

tf.reset_default_graph()

def relu(X):
    with tf.variable_scope("relu", reuse=True):  # reuse the variable created below
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):  # create the shared variable once
    threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu6", tf.get_default_graph())
summary_writer.close()

In the code above, the shared variable is defined outside the relu() function. The following code moves it inside:

import tensorflow as tf

n_features = 3

def relu(X):
    with tf.variable_scope("relu"):
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("", default_name="") as scope:
    first_relu = relu(X)     # create the shared variable
    scope.reuse_variables()  # then reuse it
    relus = [first_relu] + [relu(X) for i in range(4)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu8", tf.get_default_graph())
summary_writer.close()
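The create-then-reuse pattern in the last listing can be illustrated with a small registry in plain Python. This is a toy model and not TensorFlow's implementation: the VariableStore class, its get_variable() signature and the dictionary used as a stand-in for a variable are all hypothetical. It assumes the semantics described above: creating a name twice is an error, and so is reusing a name that was never created.

```python
class VariableStore:
    """Toy registry with create-or-reuse semantics (not TensorFlow code)."""

    def __init__(self):
        self._variables = {}

    def get_variable(self, name, initial_value=0.0, reuse=False):
        exists = name in self._variables
        if reuse:
            if not exists:
                raise ValueError("Variable %s does not exist" % name)
            return self._variables[name]
        if exists:
            raise ValueError("Variable %s already exists" % name)
        # A plain dict stands in for a variable object in this sketch.
        self._variables[name] = {"name": name, "value": initial_value}
        return self._variables[name]


store = VariableStore()


def relu(x, reuse):
    """Scalar stand-in for the relu() above: max(x, shared threshold)."""
    threshold = store.get_variable("relu/threshold", 0.0, reuse=reuse)
    return max(x, threshold["value"]), threshold


first_output, first_threshold = relu(-1.5, reuse=False)   # creates the shared variable
second_output, second_threshold = relu(2.0, reuse=True)   # reuses the same object

print(first_output, second_output)
print(first_threshold is second_threshold)
```

Raising an error in both mismatch cases is what catches accidental sharing and accidental duplication early, instead of silently creating a second threshold.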