
[Deep Learning] A First Look at TensorFlow (Python)

2017-10-16 10:34

Overview

TensorFlow is a programming system that represents computations as graphs. The nodes in a graph are called ops (short for operations). An op takes zero or more Tensors, performs some computation, and produces zero or more Tensors. Each Tensor is a typed multi-dimensional array. For example, a mini-batch of images can be represented as a four-dimensional array of floats whose dimensions are [batch, height, width, channels].
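To make that concrete, the snippet below is a minimal sketch of my own (the tensor name "images" and the sizes are only illustrative): a hypothetical mini-batch of 32 RGB images of 28×28 pixels held in one 4-D tensor.

import tensorflow as tf

# a hypothetical mini-batch: 32 images, 28x28 pixels, 3 color channels
images = tf.zeros([32, 28, 28, 3], dtype=tf.float32, name="images")
print(images.shape)  # (32, 28, 28, 3)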

A TensorFlow graph describes a computation, but to actually compute anything the graph must be launched in a Session. The session places the graph's ops onto devices such as CPUs or GPUs and provides methods to execute them. These methods run the ops and return the resulting tensors: in Python the returned tensor is a NumPy ndarray; in C and C++ it is a tensorflow::Tensor instance.

Basic concepts:

Computations are represented as graphs.

Graphs are executed in a context called a Session.

Data is represented as tensors.

State is maintained with Variables.

Feed and fetch operations can supply data to, or retrieve results from, arbitrary operations in the graph (a minimal sketch follows).
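As a quick taste of feed and fetch (a minimal sketch of my own; the placeholder a and the multiply op are only illustrative), a value is fed into the graph at run time and the result of an op is fetched back:

import tensorflow as tf

a = tf.placeholder(tf.float32)  # will be fed at run time
b = a * 3.0                     # will be fetched below

with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: 2.0}))  # 6.0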

Official installation guide

Graphs and Sessions

Creating a graph and running it in a session

The following code builds a graph:

import tensorflow as tf

x = tf.Variable(5, name='x')
y = tf.Variable(2, name='y')
f = x*x*y + y + 10


The code above builds the computation graph, but it does not perform any computation. To evaluate the graph, you need to open a TensorFlow Session and use it to initialize the variables and evaluate f:

sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
print(sess.run(f))
sess.close()


If there are many variables, sess.run() ends up being called over and over. Instead, we can use a with block to set a default session:

with tf.Session() as sess:
    x.initializer.run()  # equivalent to tf.get_default_session().run(x.initializer)
    y.initializer.run()
    result = f.eval()    # equivalent to calling tf.get_default_session().run(f)
    print(result)


The code above initializes each variable by hand. We can instead use global_variables_initializer() to initialize all variables at once (it does not run the initialization immediately; it only creates an op that does so when executed):

init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    result = f.eval()
    print(result)


Managing graphs

All of the code above uses the default graph. If you need to run code in an independent graph, you can create one yourself:

import tensorflow as tf

x1 = tf.Variable(1)
print(x1.graph is tf.get_default_graph())  # True

graph = tf.Graph()  # an independent graph
with graph.as_default():
    x2 = tf.Variable(2)

print(x2.graph is tf.get_default_graph())  # False


Lifecycle of node values

A variable's value lives from its initialization until the session is closed. Other node values are dropped between graph runs, so in the first session below w and x are evaluated twice (once for y and once for z), while in the second they are evaluated only once:

import tensorflow as tf

w = tf.constant(3)
x = w + 2
y = x + 3
z = x + 4

# w and x are evaluated twice
with tf.Session() as sess:
    print(y.eval())
    print(z.eval())

# w and x are evaluated only once
with tf.Session() as sess:
    y_eval, z_eval = sess.run([y, z])
    print(y_eval)
    print(z_eval)


Example: Linear Regression with TensorFlow

Computing θ with the Normal Equation

For linear regression, θ can be computed directly with the Normal Equation:

θ = (Xᵀ · X)⁻¹ · Xᵀ · y

We use the california_housing dataset from sklearn for the demonstration. The code is as follows:

import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_with_bias = np.c_[np.ones([m, 1]), housing.data]

X = tf.constant(housing_data_with_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # (X^T * X)^-1 * X^T * y

with tf.Session() as sess:
    theta_value = theta.eval()
    print(theta_value)


Output:

[[ -3.74651413e+01]
[  4.35734153e-01]
[  9.33829229e-03]
[ -1.06622010e-01]
[  6.44106984e-01]
[ -4.25131839e-06]
[ -3.77322501e-03]
[ -4.26648885e-01]
[ -4.40514028e-01]]


Implementing Gradient Descent

Next, we replace the Normal Equation above with gradient descent. Gradient descent needs the input features to be scaled first, which is why the code uses StandardScaler:

import tensorflow as tf
import numpy as np
import numpy.random as rnd
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import fetch_california_housing
from datetime import datetime

scaler = StandardScaler()
housing = fetch_california_housing()
m, n = housing.data.shape
scale_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones([m, 1]), scale_housing_data]

# ### Batch Gradient Descent ###
tf.reset_default_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
# gradients = 2/m * tf.matmul(tf.transpose(X), error)   # ① compute gradients manually
# training_op = tf.assign(theta, theta - gradients * learning_rate)

# gradients = tf.gradients(mse, [theta])[0]             # ② compute gradients with autodiff
# training_op = tf.assign(theta, theta - gradients * learning_rate)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers can be used
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    # saver.restore(sess, 'my_model_final.ckpt')
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
            save_path = saver.save(sess, '/tmp/my_model.ckpt')
        sess.run(training_op)

    best_theta = theta.eval()
    save_path = saver.save(sess, "my_model_final.ckpt")

print("Best theta:")
print(best_theta)


Computing the gradients manually

gradients = 2/m * tf.matmul(tf.transpose(X), error)    # ① compute gradients manually
training_op = tf.assign(theta, theta - gradients * learning_rate)


tf.random_uniform() generates a tensor filled with random values.

tf.assign() assigns a new value to a variable. In variant ① ("compute gradients manually") it implements the update step θ(next step) = θ − η · ∇θ MSE(θ), where η is the learning rate.

Gradient descent with autodiff

Computing the gradients by hand works here, but for a deep neural network the code would quickly become long and error-prone. Instead, we can let TensorFlow work out the partial derivatives automatically (autodiff). The main change is the following:



gradients = tf.gradients(mse, [theta])[0]             # ② compute gradients with autodiff
training_op = tf.assign(theta, theta - gradients * learning_rate)
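To see tf.gradients() in isolation, here is a small check of my own (independent of the housing code): the derivative of a*a + 2*a with respect to a is 2a + 2, so at a = 3.0 it should evaluate to 8.0.

import tensorflow as tf

a = tf.Variable(3.0, name="a")
f = a * a + 2 * a
grad = tf.gradients(f, [a])[0]  # df/da = 2a + 2

with tf.Session() as sess:
    sess.run(a.initializer)
    print(sess.run(grad))  # 8.0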


Gradient descent with an optimizer

TensorFlow also provides a range of built-in optimizers. The code above uses tf.train.GradientDescentOptimizer(); other optimizers such as tf.train.MomentumOptimizer() can be used instead. The code is as follows:

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
# optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers can be used
training_op = optimizer.minimize(mse)


Saving and restoring a model

saver = tf.train.Saver()
[...]
save_path = saver.save(sess, '/tmp/my_model.ckpt')
[...]
saver.restore(sess, 'my_model_final.ckpt')
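For completeness, restoring the saved parameters in a fresh session could look like the sketch below (it assumes the theta variable and the my_model_final.ckpt checkpoint produced by the training code above):

with tf.Session() as sess:
    saver.restore(sess, "my_model_final.ckpt")  # no need to run the init op
    best_theta_restored = theta.eval()
    print(best_theta_restored)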


Mini-batch Gradient Descent: feeding data step by step

To implement Mini-batch Gradient Descent, X and y need to be replaced with a new batch at every iteration. The simplest way to do this is with tf.placeholder():

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")


At each iteration, the batch data is supplied through the feed_dict parameter:

X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
sess.run(training_op, feed_dict={X: X_batch, y: y_batch})


The full code is as follows:

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")  # specifying None for a dimension means "any size"
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

rnd.seed(42)

def fetch_batch(epoch, batch_index, batch_size):
    rnd.seed(epoch * n_batches + batch_index)
    indices = rnd.randint(m, size=batch_size)
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

print("Best theta:")
print(best_theta)


Visualization with TensorBoard

First, define the log directory and file name:

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)


Then add the following code:

mse_summary = tf.summary.scalar('MSE', mse)
summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())


The first line creates a node in the graph that evaluates the MSE and writes it to a summary (a TensorBoard-compatible binary log string). The second line creates a tf.summary.FileWriter, which writes the summaries to log files in the given log directory.

Finally, call add_summary() to write each summary to the log file. The full code is as follows:

tf.reset_default_graph()

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

n_epochs = 100
learning_rate = 0.01

X = tf.placeholder(tf.float32, shape=(None, n+1), name='X')
y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1, seed=42), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
with tf.name_scope('loss') as scope:  # name scope (see below)
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name='mse')
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

mse_summary = tf.summary.scalar('MSE', mse)
summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                summary_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

summary_writer.flush()
summary_writer.close()

print("Best theta:")
print(best_theta)


Start TensorBoard from the terminal:

(tensorflow) ➜  ch09 git:(master) ✗ tensorboard --logdir tf_logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006) ...


You can now view the graph and the MSE curve in a browser at http://127.0.0.1:6006.

Name Scopes, Modularity, and Sharing Variables

Name Scopes

In a complex model it is easy to end up with a huge number of nodes, and the graph becomes cluttered. To keep it readable, we use name scopes to group related nodes, as follows:

with tf.name_scope('loss') as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")

print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse


Modularity

Consider the following code:

tf.reset_default_graph()

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")
b1 = tf.Variable(0.0, name="bias1")
b2 = tf.Variable(0.0, name="bias2")

linear1 = tf.add(tf.matmul(X, w1), b1, name="linear1")
linear2 = tf.add(tf.matmul(X, w2), b2, name="linear2")

relu1 = tf.maximum(linear1, 0, name="relu1")
relu2 = tf.maximum(linear1, 0, name="relu2")  # Oops, cut&paste error! Did you spot it?

output = tf.add_n([relu1, relu2], name="output")


The code above is quite ugly (and note the copy-and-paste bug on the relu2 line). When the same operation has to be repeated many times, it is better to make the code modular:

tf.reset_default_graph()

def relu(X):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, 0, name="max")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu2", tf.get_default_graph())


Sharing Variables

What can we do when a variable needs to be shared between several parts of the graph? Consider the following options:

Create it first, then pass it into the function as a parameter. This becomes painful when many variables have to be shared.

tf.reset_default_graph()

def relu(X, threshold):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

threshold = tf.Variable(0.0, name="threshold")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]
output = tf.add_n(relus, name="output")


Store the shared variables in a class or a dictionary (a dictionary-based sketch follows the next block), or set the shared variable as an attribute of the relu() function the first time it is called:

tf.reset_default_graph()

def relu(X):
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):
            relu.threshold = tf.Variable(0.0, name="threshold")
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, relu.threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
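The dictionary option mentioned above could look like the following sketch (my own illustration, not from the original; the shared_vars dict is an arbitrary name):

tf.reset_default_graph()

# shared variables live in a plain Python dict that every relu() call reads from
shared_vars = {"threshold": tf.Variable(0.0, name="threshold")}

def relu(X):
    with tf.name_scope("relu"):
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, shared_vars["threshold"], name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")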


TensorFlow's solution

TensorFlow handles shared variables with get_variable(): the variable is created if it does not yet exist, and reused if it does. Whether get_variable() creates or reuses is controlled by variable_scope():

tf.reset_default_graph()

def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu6", tf.get_default_graph())
summary_writer.close()
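To isolate the create-versus-reuse behaviour of get_variable(), here is a minimal sketch of my own (the scope name "demo" and variable name "w" are arbitrary):

tf.reset_default_graph()

with tf.variable_scope("demo"):              # first use: the variable is created
    w1 = tf.get_variable("w", shape=(), initializer=tf.constant_initializer(1.0))

with tf.variable_scope("demo", reuse=True):  # second use: the same variable is reused
    w2 = tf.get_variable("w")

print(w1 is w2)  # True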


The shared variable above is defined outside the relu() function. The following code moves it inside the function:

import tensorflow as tf

n_features = 3

def relu(X):
    with tf.variable_scope("relu"):
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        w_shape = int(X.get_shape()[1]), 1
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        linear = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(linear, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("", default_name="") as scope:
    first_relu = relu(X)     # create the shared variable
    scope.reuse_variables()  # then reuse it
    relus = [first_relu] + [relu(X) for i in range(4)]
output = tf.add_n(relus, name="output")

summary_writer = tf.summary.FileWriter("logs/relu8", tf.get_default_graph())
summary_writer.close()