
[Deep Learning] A First Look at TensorFlow (Python)

2017-10-16 10:34

Overview

TensorFlow is a programming system that represents computations as graphs. The nodes in a graph are called ops (short for operations). An op takes zero or more Tensors, performs some computation, and produces zero or more Tensors. Each Tensor is a typed multi-dimensional array. For example, a mini-batch of images can be represented as a four-dimensional array of floats whose dimensions are [batch, height, width, channels].
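To make that concrete, the snippet below is a minimal sketch of my own (the tensor name "images" and the sizes are only illustrative): a hypothetical mini-batch of 32 RGB images of 28×28 pixels held in one 4-D tensor.

import tensorflow as tf

# a hypothetical mini-batch: 32 images, 28x28 pixels, 3 color channels
images = tf.zeros([32, 28, 28, 3], dtype=tf.float32, name="images")
print(images.shape)  # (32, 28, 28, 3)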

A TensorFlow graph describes a computation, but to actually compute anything the graph must be launched in a Session. The session places the graph's ops onto devices such as CPUs or GPUs and provides methods to execute them. These methods run the ops and return the resulting tensors: in Python the returned tensor is a NumPy ndarray; in C and C++ it is a tensorflow::Tensor instance.

Basic concepts:

Computations are represented as graphs.

Graphs are executed in a context called a Session.

Data is represented as tensors.

State is maintained with Variables.

Feed and fetch operations can supply data to, or retrieve results from, arbitrary operations in the graph (a minimal sketch follows).
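As a quick taste of feed and fetch (a minimal sketch of my own; the placeholder a and the multiply op are only illustrative), a value is fed into the graph at run time and the result of an op is fetched back:

import tensorflow as tf

a = tf.placeholder(tf.float32)  # will be fed at run time
b = a * 3.0                     # will be fetched below

with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: 2.0}))  # 6.0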

Official installation guide

Graphs and Sessions

Creating a graph and running it in a session

The following code builds a graph:

import tensorflow as tf

x = tf.Variable(5, name='x')
y = tf.Variable(2, name='y')
f = x*x*y + y + 10


The code above builds the computation graph, but it does not perform any computation. To evaluate the graph, you need to open a TensorFlow Session and use it to initialize the variables and evaluate f:

sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
print(sess.run(f))
sess.close()


If there are many variables, sess.run() ends up being called over and over. Instead, we can use a with block to set a default session:

with tf.Session() as sess:
    x.initializer.run()  # equivalent to tf.get_default_session().run(x.initializer)
    y.initializer.run()
    result = f.eval()    # equivalent to calling tf.get_default_session().run(f)
    print(result)


The code above initializes each variable by hand. We can instead use global_variables_initializer() to initialize all variables at once (it does not run the initialization immediately; it only creates an op that does so when executed):

init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    result = f.eval()
    print(result)


Managing graphs

All of the code above uses the default graph. If you need to run code in an independent graph, you can create one yourself:

import tensorflow as tf

x1 = tf.Variable(1)
print(x1.graph is tf.get_default_graph())  # True

graph = tf.Graph()  # an independent graph
with graph.as_default():
    x2 = tf.Variable(2)

print(x2.graph is tf.get_default_graph())  # False


Lifecycle of node values

A variable's value lives from its initialization until the session is closed. Other node values are dropped between graph runs, so in the first session below w and x are evaluated twice (once for y and once for z), while in the second they are evaluated only once:

import tensorflow as tf

w = tf.constant(3)
x = w + 2
y = x + 3
z = x + 4

# w and x are evaluated twice
with tf.Session() as sess:
    print(y.eval())
    print(z.eval())

# w and x are evaluated only once
with tf.Session() as sess:
    y_eval, z_eval = sess.run([y, z])
    print(y_eval)
    print(z_eval)


Example: Linear Regression with TensorFlow

Computing θ with the Normal Equation

For linear regression, θ can be computed directly with the Normal Equation:

θ = (Xᵀ · X)⁻¹ · Xᵀ · y

We use the california_housing dataset from sklearn for the demonstration. The code is as follows:

import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_with_bias = np.c_[np.ones([m, 1]), housing.data]

X = tf.constant(housing_data_with_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # (X^T * X)^-1 * X^T * y

with tf.Session() as sess:
    theta_value = theta.eval()
    print(theta_value)


Output:

[[ -3.74651413e+01]
[  4.35734153e-01]
[  9.33829229e-03]
[ -1.06622010e-01]
[  6.44106984e-01]
[ -4.25131839e-06]
[ -3.77322501e-03]
[ -4.26648885e-01]
[ -4.40514028e-01]]


Implementing Gradient Descent

Next, we replace the Normal Equation above with gradient descent. Gradient descent needs the input features to be scaled first, which is why the code uses StandardScaler:

import tensorflow as tf
import numpy as np
import numpy.random as rnd
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import fetch_california_housing
from datetime import datetime

scaler = StandardScaler()
housing = fetch_california_housing()
m, n = housing.data.shape
scale_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones([m, 1]), scale_housing_data]

# ### Batch Gradient Descent ###
tf.reset_default_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
# gradients = 2/m * tf.matmul(tf.transpose(X), error)   # ① compute gradients manually
# training_op = tf.assign(theta, theta - gradients * learning_rate)

# gradients = tf.gradients(mse, [theta])[0]             # ② compute gradients with autodiff
# training_op = tf.assign(theta, theta - gradients * learning_rate)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers can be used
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    # saver.restore(sess, 'my_model_final.ckpt')
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
            save_path = saver.save(sess, '/tmp/my_model.ckpt')
        sess.run(training_op)

    best_theta = theta.eval()
    save_path = saver.save(sess, "my_model_final.ckpt")

print("Best theta:")
print(best_theta)


Computing the gradients manually

gradients = 2/m * tf.matmul(tf.transpose(X), error)    # ① compute gradients manually
training_op = tf.assign(theta, theta - gradients * learning_rate)


tf.random_uniform() generates a tensor filled with random values.

tf.assign() assigns a new value to a variable. In variant ① ("compute gradients manually") it implements the update step θ(next step) = θ − η · ∇θ MSE(θ), where η is the learning rate.

Gradient descent with autodiff

Computing the gradients by hand works here, but for a deep neural network the code would quickly become long and error-prone. Instead, we can let TensorFlow work out the partial derivatives automatically (autodiff). The main change is the following:



gradients = tf.gradients(mse, [theta])[0]             # ② compute gradients with autodiff
training_op = tf.assign(theta, theta - gradients * learning_rate)
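To see tf.gradients() in isolation, here is a small check of my own (independent of the housing code): the derivative of a*a + 2*a with respect to a is 2a + 2, so at a = 3.0 it should evaluate to 8.0.

import tensorflow as tf

a = tf.Variable(3.0, name="a")
f = a * a + 2 * a
grad = tf.gradients(f, [a])[0]  # df/da = 2a + 2

with tf.Session() as sess:
    sess.run(a.initializer)
    print(sess.run(grad))  # 8.0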


Gradient descent with an optimizer

TensorFlow also provides a range of built-in optimizers. The code above uses tf.train.GradientDescentOptimizer(); other optimizers such as tf.train.MomentumOptimizer() can be used instead. The code is as follows:

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers can be used
training_op = optimizer.minimize(mse)


Saving and restoring a model

saver = tf.train.Saver()
[...]
save_path = saver.save(sess, '/tmp/my_model.ckpt')
[...]
saver.restore(sess, 'my_model_final.ckpt')
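For completeness, restoring the saved parameters in a fresh session could look like the sketch below (it assumes the theta variable and the my_model_final.ckpt checkpoint produced by the training code above):

with tf.Session() as sess:
    saver.restore(sess, "my_model_final.ckpt")  # no need to run the init op
    best_theta_restored = theta.eval()
    print(best_theta_restored)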


Mini-batch Gradient Descent: feeding data step by step

To implement Mini-batch Gradient Descent, X and y need to be replaced with a new batch at every iteration. The simplest way to do this is with tf.placeholder():

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")


At each iteration, the batch data is supplied through the feed_dict parameter:

X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
sess.run(training_op, feed_dict={X: X_batch, y: y_batch})


The full code is as follows:

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")  # specifying None for a dimension means "any size"
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

rnd.seed(42)

def fetch_batch(epoch, batch_index, batch_size):
    rnd.seed(epoch * n_batches + batch_index)
    indices = rnd.randint(m, size=batch_size)
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

print("Best theta:")
print(best_theta)


Visualization with TensorBoard

First, define the log directory and file name:

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)


Then add the following code:

mse_summary = tf.summary.scalar('MSE', mse)
summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())


The first line creates a node in the graph that evaluates the MSE and writes it to a summary (a TensorBoard-compatible binary log string). The second line creates a tf.summary.FileWriter, which writes the summaries to log files in the given log directory.

Finally, call add_summary() to write each summary to the log file. The full code is as follows:

tf.reset_default_graph()

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

n_epochs = 100
learning_rate = 0.01

X = tf.placeholder(tf.float32, shape=(None, n+1), name='X')
y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
with tf.name_scope('loss') as scope:  # name scope (see below)
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name='mse')
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

mse_summary = tf.summary.scalar('MSE', mse)
summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                summary_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

summary_writer.flush()
summary_writer.close()

print("Best theta:")
print(best_theta)


Start TensorBoard from the terminal:

(tensorflow) ➜  ch09 git:(master) ✗ tensorboard --logdir tf_logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006) ...


You can now view the graph and the MSE curve in a browser at http://127.0.0.1:6006.

Name Scopes, Modularity, and Sharing Variables

Name Scopes

In a complex model it is easy to end up with a huge number of nodes, and the graph becomes cluttered. To keep it readable, we use name scopes to group related nodes, as follows:

with tf.name_scope('loss') as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")

print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse


Modularity

Consider the following code:

tf.reset_default_graph()

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")
b1 = tf.Variable(0.0, name="bias1")
b2 = tf.Variable(0.0, name="bias2")

linear1 = tf.add(tf.matmul(X, w1), b1, name="linear1")
linear2 = tf.add(tf.matmul(X, w2), b2, name="linear2")

relu1 = tf.maximum(linear1, 0, name="relu1")
relu2 = tf.maximum(linear1, 0, name="relu2")  # Oops, cut&paste error! Did you spot it?

output = tf.add_n([relu1, relu2], name="output")


The code above is quite ugly (and note the copy-and-paste bug on the relu2 line). When the same operation has to be repeated many times, it is better to make the code modular:

tf.reset_default_graph()

def relu(X):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, 0, name="max")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu2", tf.get_default_graph())


Sharing Variables

What can we do when a variable needs to be shared between several parts of the graph? Consider the following options:

Create it first, then pass it into the function as a parameter. This becomes painful when many variables have to be shared.

tf.reset_default_graph()

def relu(X, threshold):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

threshold = tf.Variable(0.0, name="threshold")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]
output = tf.add_n(relus, name="output")


Store the shared variables in a class or a dictionary (a dictionary-based sketch follows the next block), or set the shared variable as an attribute of the relu() function the first time it is called:

tf.reset_default_graph()

def relu(X):
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):
            relu.threshold = tf.Variable(0.0, name="threshold")
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, relu.threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
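The dictionary option mentioned above could look like the following sketch (my own illustration, not from the original; the shared_vars dict is an arbitrary name):

tf.reset_default_graph()

# shared variables live in a plain Python dict that every relu() call reads from
shared_vars = {"threshold": tf.Variable(0.0, name="threshold")}

def relu(X):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, shared_vars["threshold"], name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")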


TensorFlow's solution

TensorFlow handles shared variables with get_variable(): the variable is created if it does not yet exist, and reused if it does. Whether get_variable() creates or reuses is controlled by variable_scope():

tf.reset_default_graph()

def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu6", tf.get_default_graph())
summary_writer.close()
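To isolate the create-versus-reuse behaviour of get_variable(), here is a minimal sketch of my own (the scope name "demo" and variable name "w" are arbitrary):

tf.reset_default_graph()

with tf.variable_scope("demo"):              # first use: the variable is created
    w1 = tf.get_variable("w", shape=(), initializer=tf.constant_initializer(1.0))

with tf.variable_scope("demo", reuse=True):  # second use: the same variable is reused
    w2 = tf.get_variable("w")

print(w1 is w2)  # True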


The shared variable above is defined outside the relu() function. The following code moves it inside the function:

import tensorflow as tf

n_features = 3

def relu(X):
    with tf.variable_scope("relu"):
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("", default_name="") as scope:
    first_relu = relu(X)     # create the shared variable
    scope.reuse_variables()  # then reuse it
    relus = [first_relu] + [relu(X) for i in range(4)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu8", tf.get_default_graph())
summary_writer.close()