Getting Started with Deep Learning in TensorFlow -- 01. Fundamentals

2018-03-28 11:25

1.1 Installing TensorFlow

1.1.1 Installing the GPU version

Be sure to choose matching versions of TensorFlow, CUDA, and cuDNN. The versions chosen here are:

(1) TensorFlow 1.3.0

pip install --upgrade https://mirrors.tuna.tsinghua.edu.cn/tensorflow/windows/gpu/tensorflow_gpu-1.3.0rc0-cp35-cp35m-win_amd64.whl

Links to many TensorFlow versions are listed at: https://github.com/tensorflow/tensorflow/pull/8212/files

To check the TensorFlow version:

python -c "import tensorflow as tf; print(tf.__version__)"  # for Python 2

python3 -c "import tensorflow as tf; print(tf.__version__)"  # for Python 3

(2) cuda_8.0.61_windows.exe

To check the CUDA version: nvcc --version

(3) cudnn-8.0-windows7-x64-v6.0

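1.1.2 Installing the CPU version

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ https://mirrors.tuna.tsinghua.edu.cn/tensorflow/windows/cpu/tensorflow-1.1.0-cp35-cp35m-win_amd64.whl

1.1.3 Uninstalling TensorFlow

1. Activate the tensorflow environment: activate tensorflow

2. Run: pip uninstall tensorflow (the GPU wheel installed above is named tensorflow-gpu, so for it run pip uninstall tensorflow-gpu)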

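1.2 TensorFlow design philosophy

TensorFlow translates literally as "tensor flow". Its design philosophy shows up mainly in two respects:

1. TensorFlow is regarded as symbolic programming. Programming styles can be divided into imperative and symbolic. Torch is a typical imperative framework, Caffe and MXNet mix the two styles, and TensorFlow is symbolic. Symbolic programming involves a good deal of embedding and optimization, which makes programs harder to understand and debug but can make them run faster. A symbolic program generally defines its variables first and then builds a dataflow graph from them.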

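2. Every computation in TensorFlow is defined in a dataflow graph (also called the network structure graph), and defining the graph is completely separate from running it. The graph specifies the computational relationships among the variables. Finally, the graph has to be compiled, but at that point it is still an empty shell with no actual data in it, like a river channel that has been dug but holds no water yet.

In TensorFlow, a graph runs only inside a session. Once a session is started, data can be fed into the nodes and computation can proceed; once the session is closed, no further computation is possible. To continue the analogy: the graph is the river channel, the data flow is the water, and the session is the dam. Water flows through the channel only while the dam is open, and stops once the dam is closed.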

import tensorflow as tf
# Build the graph
a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])
c = a * b
# Create a session
sess = tf.Session()
# Evaluate c
print(sess.run(c))
sess.close()
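
The separation between defining a graph and running it can be imitated in a few lines of plain Python. The sketch below is not TensorFlow code and says nothing about how TensorFlow is implemented. Node, constant, and MiniSession are hypothetical names invented for this illustration. It only shows that building the expression records structure, while numbers appear when run() is called.

```python
# A plain-Python imitation of "define first, run later".
# Node, constant and MiniSession are hypothetical helpers, not TensorFlow classes.

class Node:
    """A graph node: either a constant or a multiplication of two nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const" or "mul"
        self.inputs = inputs  # upstream nodes (the edges of the graph)
        self.value = value    # only set for constants

    def __mul__(self, other):
        # Building the expression records a node; nothing is computed yet.
        return Node("mul", (self, other))


def constant(values):
    return Node("const", value=list(values))


class MiniSession:
    """Evaluates nodes on demand, like opening the dam."""

    def __init__(self):
        self.closed = False

    def run(self, node):
        if self.closed:
            raise RuntimeError("session is closed")
        if node.op == "const":
            return node.value
        left, right = (self.run(n) for n in node.inputs)
        return [l * r for l, r in zip(left, right)]

    def close(self):
        self.closed = True


a = constant([1.0, 2.0])
b = constant([3.0, 4.0])
c = a * b                 # graph definition only; c holds no numbers

sess = MiniSession()
result = sess.run(c)      # the computation happens here
print(result)             # [3.0, 8.0]
sess.close()
```

After close(), a further call to sess.run(c) raises an error in this sketch, which mirrors the point that no computation is possible once the session is closed.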

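1.3 Computational efficiency of Python and TensorFlow

Before TensorFlow, Python code relied on NumPy for heavy numerical computation. NumPy is written in C, with some Fortran, and calls matrix libraries such as OpenBLAS and MKL, so it is very efficient. However, the result of every operation has to be returned to Python, and moving data between languages can introduce considerable latency. TensorFlow also calls matrix libraries such as OpenBLAS and MKL, but by defining a computation graph it executes all operations outside Python and does not need to pass the result of each operation back to Python, so it is more efficient.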

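1.4 Programming model

1.4.1 Computation flow

TensorFlow computes with dataflow graphs. A typical dataflow graph is shown in the figure below.

The figure illustrates how TensorFlow runs. It contains input, reshape, a ReLU layer, a Logit layer, softmax, cross-entropy, gradient, and SGD modules, which together make up a simple neural network model. The computation starts from the input, which is reshaped and then propagated forward layer by layer. The ReLU layer has two variables, the weight matrix W and the bias b. The reshaped input is multiplied by W at the MatMul node, the product is added to b by the BiasAdd operation to give the weighted input, and the ReLU activation function applies a nonlinearity to produce the output of the ReLU layer. The Logit layer also has two variables, a weight matrix W and a bias b. The output of the previous layer is multiplied by W at the MatMul node and the product is added to b by BiasAdd to give the weighted input. Softmax then turns this result into a probability distribution over the classes, and cross-entropy measures how far that distribution is from the one given by the sample's class label. Next the gradients are computed, which requires all of the weights and biases, the layer inputs and outputs, the derivative of the activation function, and the cross-entropy result. Finally comes SGD training, which works from the top of the graph downward, computing and updating the parameters of each layer in turn.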
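
The forward pass described above can be traced by hand on a tiny example. The following is a minimal sketch in plain Python, not TensorFlow code. The layer sizes, weights, input, and label are all made up for illustration, and the helper names (matmul_bias, relu, softmax, cross_entropy) are hypothetical.

```python
import math


def matmul_bias(x, W, b):
    """MatMul followed by BiasAdd for a single input vector: x @ W + b."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def relu(v):
    return [max(0.0, t) for t in v]


def softmax(v):
    m = max(v)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(t - m) for t in v]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    return -sum(y * math.log(p) for p, y in zip(probs, one_hot))


# Reshape: flatten a 2x2 "image" into a vector of length 4.
image = [[1.0, 0.0], [0.5, 2.0]]
x = [pixel for row in image for pixel in row]

# ReLU layer: 4 inputs -> 3 hidden units (made-up weights).
W1 = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1], [0.2, 0.2, 0.2], [-0.3, 0.1, 0.5]]
b1 = [0.0, 0.1, -0.1]
hidden = relu(matmul_bias(x, W1, b1))

# Logit layer: 3 hidden units -> 2 classes (made-up weights).
W2 = [[0.5, -0.5], [0.3, 0.8], [-0.2, 0.1]]
b2 = [0.0, 0.0]
logits = matmul_bias(hidden, W2, b2)

probs = softmax(logits)
label = [0.0, 1.0]                      # one-hot class label
loss = cross_entropy(probs, label)

# For softmax combined with cross-entropy, the gradient of the loss
# with respect to the logits is simply probs - label.
grad_logits = [p - y for p, y in zip(probs, label)]
print(probs, loss, grad_logits)
```

The gradient on the last line is the starting point for the backward pass. SGD would propagate it through W2, the ReLU, and W1 and then subtract a small multiple of each gradient from the corresponding parameter.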

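1.4.2 Basic components

The computation flow has been described in detail above, but newcomers may not know what these pieces are. The introduction below should make them clear.

1.4.2.1 Graph

A graph here means a dataflow graph, which specifies the computational relationships among the variables. If the data flow is water, the dataflow graph is the river channel. Every formula used in the computation must be defined in the graph beforehand, that is, all the data and mathematical operations must be "declared". And just as a channel has to be dug with special tools, the formulas in a dataflow graph have to be defined with TensorFlow's own syntax.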

import tensorflow as tf
var1 = tf.constant([[1.0]])
var2 = tf.constant([[2.0]])
out = tf.matmul(var1,var2)

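1.4.2.2 Session

A session is like a dam: water can flow through the channel only after the dam has been built and its gates opened. The first step in launching a graph is therefore to create a session with sess = tf.Session(). Opening the gates corresponds to sess.run(). Which channel the water should flow into is specified by the argument to run(), for example sess.run([out]). Do not forget to close the dam at the end with sess.close(). The whole process looks like this: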

sess = tf.Session()
print('%0.2f' % sess.run(out)[0, 0])  # out is a 1x1 matrix
sess.close()

Using with tf.Session() as sess: in place of the session definition above makes the explicit close unnecessary.

with tf.Session() as sess:
    print('%0.2f' % sess.run(out)[0, 0])

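Sessions also involve the concepts of "feed", "fetch", and "extend", which will be covered when they are needed.

tf.InteractiveSession() creates a TensorFlow session for interactive contexts. Unlike a regular session, an interactive session installs itself as the default session, so operations no longer have to be passed to the session's run function and can instead be executed through their own run method. A typical program defines an interactive session sess and then initializes the global variables, either the usual way with sess.run(tf.global_variables_initializer()) or directly with tf.global_variables_initializer().run(). It then defines the optimizer step train_step, which can likewise be run in two ways: through the session, with sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}), or through its own run method, with train_step.run({x: batch_xs, y_: batch_ys}).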

# x, y_, cross_entropy, batch_xs and batch_ys are assumed to be defined elsewhere
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
train_step = tf.train.GradientDescentOptimizer(0.3).minimize(cross_entropy)
train_step.run({x: batch_xs, y_: batch_ys})

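Note that eval can be used in place of run, but eval() is available only on tf.Tensor objects, that is, on the outputs of an Operation. An Operation that produces no output, such as train_step, has to be executed with its own run() method or with Session.run(), neither of which has this restriction.

cross_entropy.eval({x: batch_xs, y_: batch_ys})  # cross_entropy is a tf.Tensor, so eval() is available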

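1.4.2.3 Edges

TensorFlow edges express two kinds of relationship: data dependencies and control dependencies. Solid edges represent data dependencies and carry data, that is, tensors.

Control dependencies are usually drawn as dashed edges. They control the order in which operations run and are used to enforce happens-before relationships. No data flows along such an edge, but the source node must finish executing before the destination node starts. Examples follow:

1.4.2.3.1 Data dependency: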

import tensorflow as tf
x = tf.placeholder(tf.int32, shape=[], name='x')
y = tf.Variable(2, dtype=tf.int32)
out = x * y
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3):
        print('output:', sess.run(out, feed_dict={x: 1}))
# Output
# output: 2
# output: 2
# output: 2

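1.4.2.3.2 Control dependency:

import tensorflow as tf
x = tf.placeholder(tf.int32, shape=[], name='x')
y = tf.Variable(2, dtype=tf.int32)
assign_op = tf.assign(y, y + 1)  # or: assign_op = tf.assign_add(y, 1)
# Every op created inside this block runs only after assign_op has finished
with tf.control_dependencies([assign_op]):
    out = x * y
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3):
        print('output:', sess.run(out, feed_dict={x: 1}))
# Output (as given in the original post; the exact values depend on
# when y is read relative to assign_op)
# output: 2
# output: 3
# output: 4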

Or:

import tensorflow as tf
x = tf.placeholder(tf.int32, shape=[], name='x')
y = tf.Variable(2, dtype=tf.int32)
update_y = y.assign(y + 1)  # same effect as tf.assign(y, y + 1)
out = x * y
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3):
        # Fetch the update together with out so that y changes on every call
        print('output:', sess.run([update_y, out], feed_dict={x: 1}))
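
The happens-before rule behind control edges can be sketched without TensorFlow. The mini-executor below is a hypothetical illustration, not TensorFlow's scheduler: each op lists its data inputs and its control inputs, and an op runs only after all of them have finished. In this sketch the multiplication reads y at the moment it runs, so its results are not meant to reproduce the output of the TensorFlow snippets above.

```python
# Hypothetical mini-executor, not TensorFlow's scheduler.
def run(ops, target):
    """Execute `target` after all of its data and control inputs."""
    finished = set()

    def visit(name):
        if name in finished:
            return
        op = ops[name]
        # Both kinds of edge require the source op to finish first.
        for source in op["data"] + op["control"]:
            visit(source)
        op["fn"]()
        finished.add(name)

    visit(target)


def build(with_control_edge):
    state = {"x": None, "y": 2, "out": None}

    def feed_x():
        state["x"] = 1

    def assign_y():
        state["y"] += 1

    def multiply():
        state["out"] = state["x"] * state["y"]

    ops = {
        "x": {"fn": feed_x, "data": [], "control": []},
        "assign_y": {"fn": assign_y, "data": [], "control": []},
        "out": {
            "fn": multiply,
            "data": ["x"],  # data edge: the value of x flows into out
            # control edge: carries no value, only enforces ordering
            "control": ["assign_y"] if with_control_edge else [],
        },
    }
    return state, ops


def three_runs(with_control_edge):
    state, ops = build(with_control_edge)
    results = []
    for _ in range(3):
        run(ops, "out")
        results.append(state["out"])
    return results


plain = three_runs(with_control_edge=False)
ordered = three_runs(with_control_edge=True)
print(plain)    # [2, 2, 2]
print(ordered)  # [3, 4, 5]
```

Without the control edge, assign_y is never reached from out and y stays at 2. With it, every evaluation of out first triggers the assignment.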

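1.4.2.4 Operations

Operations correspond to the nodes of the graph. They represent mathematical computations as well as the points where input starts and output ends.

import tensorflow as tf
k1 = tf.Variable([1, 2, 3], name='k1')
k2 = tf.Variable([2, 3, 4])
# With numpy imported as np, the following gives the same result:
# k1 = tf.Variable(np.array([1, 2, 3]), name='k1')
# k2 = tf.Variable(np.array([2, 3, 4]))
out = k1 * k2  # element-wise multiplication
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(out))
# Output
# [ 2  6 12]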

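1.4.2.5 Variables

Variables can be classified as static or dynamic according to whether their dimensions can change. A variable has a fixed place in the graph and does not flow the way an ordinary tensor does. Static variables are mainly variable tensors and constant tensors: variable tensors are created with tf.Variable and constant tensors with tf.constant. Dynamic variables are placeholder tensors, which are created with tf.placeholder. The main differences are as follows:

tf.Variable:

Stores long-lived values that can be updated, chiefly trainable quantities such as a model's weights (W) or biases. Because its value can change while the program runs, it must be initialized.

weights = tf.Variable(tf.truncated_normal([IMAGE_PIXELS, hidden1_units], stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))), name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]), name='biases')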

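tf.placeholder:

Stores data whose size is not fixed but which must be "declared" in advance. As the name suggests, a placeholder reserves a fixed position that is filled in later. It is mainly used to receive a variable number of training samples: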

input = tf.placeholder(tf.float32, shape=[None, 120])

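The keyword "shape=" in the second argument may be omitted. None means the number of input rows is unrestricted, and 120 means each row is a 120-dimensional vector. Unlike a Variable, a placeholder needs no initial value. Its value is supplied at run time through the feed_dict argument of Session.run(). feed_dict is a dictionary (map) that must give a value for every placeholder used in the computation.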
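
What a shape of [None, 120] admits can be checked with a small plain-Python sketch. This is not TensorFlow's own shape checking. The helpers shape_of and is_compatible, and the use of the string "input" as a dictionary key, are assumptions made for illustration only.

```python
def shape_of(batch):
    """Shape of a rectangular list of rows, as [rows, columns]."""
    return [len(batch), len(batch[0]) if batch else 0]


def is_compatible(actual, declared):
    """A declared dimension of None matches any size."""
    return len(actual) == len(declared) and all(
        d is None or d == a for a, d in zip(actual, declared))


declared = [None, 120]  # any number of rows, 120 values per row

small_batch = [[0.0] * 120 for _ in range(2)]
large_batch = [[0.0] * 120 for _ in range(64)]
wrong_width = [[0.0] * 100 for _ in range(2)]

# The dictionary maps each placeholder to the data fed into it.
feed_dict = {"input": small_batch}

print(is_compatible(shape_of(feed_dict["input"]), declared))  # True
print(is_compatible(shape_of(large_batch), declared))         # True
print(is_compatible(shape_of(wrong_width), declared))         # False
```

Batches of 2 rows and of 64 rows both fit the declared shape, while rows of the wrong width do not.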

tf.get_variable():

The main purpose of tf.get_variable() is to let variables be shared or reused. It is used in almost the same way as tf.Variable(), with the following differences:

(1) tf.get_variable() requires a variable name, whereas tf.Variable() does not. The difference is visible in the two signatures: the name parameter of tf.get_variable() has no default value.

tf.Variable(initial_value=None, trainable=True, collections=None, validate_shape=True, caching_device=None, name=None, variable_def=None, dtype=None, expected_shape=None, import_scope=None)
tf.get_variable(name, shape=None, dtype=None, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True, custom_getter=None)

(2) The scope name set by tf.name_scope has no effect on variables created with tf.get_variable inside that scope.

import tensorflow as tf
with tf.variable_scope('v_scope'):
    with tf.name_scope('n_scope'):
        x = tf.Variable([1], name='x')
        y = tf.get_variable('x', shape=[1], dtype=tf.int32)
        z = x + y
print(x.name, y.name, z.name)
# Output
# v_scope/n_scope/x:0 v_scope/x:0 v_scope/n_scope/add:0

(3) By default, when a variable created with tf.Variable has a name that is already taken, the system renames it automatically. When a variable created with tf.get_variable has a name that is already taken, the system raises an error.

import tensorflow as tf
w1 = tf.Variable(1, name="w1")
w2 = tf.Variable(2, name="w1")
print(w1.name)
print(w2.name)
# Output
# w1:0
# w1_1:0

import tensorflow as tf
w1 = tf.get_variable(name="w1", initializer=1)
w2 = tf.get_variable(name="w1", initializer=2)
# Error message
# ValueError: Variable w1 already exists, disallowed. Did
# you mean to set reuse=True in VarScope?

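1.4.2.6 Sharing or reusing variables

But if the main purpose of tf.get_variable() is to let variables be shared or reused, how are duplicate names supposed to be handled?

Method 1:

Set the reuse parameter of tf.variable_scope to True. Setting reuse=True on a variable scope has no effect on tf.Variable, because tf.Variable creates a new variable object every time.

import tensorflow as tf
# The first pass creates the variables, so reuse is left at its default
with tf.variable_scope("scope1"):
    w1 = tf.get_variable("w1", shape=[])
    w2 = tf.Variable(0.0, name="w2")
# The second pass reuses scope1/w1; tf.Variable still creates a new object
with tf.variable_scope("scope1", reuse=True):
    w1_1 = tf.get_variable("w1", shape=[])
    w2_1 = tf.Variable(1.0, name="w2")
print(w1 is w1_1, w2 is w2_1)
# Output
# True False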

Method 2: call tf.get_variable_scope().reuse_variables() inside the current tf.variable_scope.

import tensorflow as tf
with tf.variable_scope('scope'):
    v1 = tf.get_variable('var', [1])
    tf.get_variable_scope().reuse_variables()
    v2 = tf.get_variable('var', [1])
print(v1 is v2)
# Output
# True
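
The reuse rule can be pictured as a lookup table keyed by variable name. The sketch below is a simplified plain-Python model, not TensorFlow's variable store. VariableStore and its get_variable method are hypothetical, and a reuse flag on the call stands in for the reuse setting of the scope.

```python
class VariableStore:
    """Hypothetical name -> variable table, loosely modelled on variable scopes."""

    def __init__(self):
        self._variables = {}

    def get_variable(self, name, initial=0.0, reuse=False):
        exists = name in self._variables
        if reuse and not exists:
            raise ValueError("Variable %s does not exist" % name)
        if not reuse and exists:
            raise ValueError("Variable %s already exists" % name)
        if not exists:
            self._variables[name] = {"name": name, "value": initial}
        return self._variables[name]


store = VariableStore()
v1 = store.get_variable("scope/var", initial=1.0)   # created
v2 = store.get_variable("scope/var", reuse=True)    # looked up, not created
print(v1 is v2)  # True

try:
    store.get_variable("scope/var")                 # same name without reuse
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

Asking for an existing name without reuse is rejected, and asking with reuse returns the very same object, which is what makes sharing possible.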

For a graph that contains variables, all global variables must be initialized before the session runs the graph:

sess.run(tf.global_variables_initializer())

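1.4.2.7 Scopes

Scopes are introduced with name_scope. Writing with tf.name_scope('conv1') as scope: causes every Variable created inside the scope to be named conv1/xxx automatically, which helps tell the components of different convolutional layers apart. If a name is given explicitly, xxx is that name. If not, xxx is the type of the object: a Variable is named Variable and a constant is named Const. The example below defines a function print_name that prints the name and shape of a tensor. As explained above, the scope name set by tf.name_scope has no effect on variables created with tf.get_variable inside the scope.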

import tensorflow as tf
def print_name(t):
    print(t.op.name, ' ', t.get_shape().as_list())
with tf.name_scope('Scope') as scope:
    k1 = tf.Variable([1], name='k1')
    print_name(k1)
    k2 = tf.Variable([2, 3])
    print_name(k2)
    k3 = tf.constant([2, 3])
    print_name(k3)
# Output
# Scope/k1   [1]
# Scope/Variable   [2]
# Scope/Const   [2]
# Entering a scope with the same name again yields a uniquified prefix
with tf.name_scope('Scope') as scope:
    k1 = tf.Variable([1], name='k1')
    print_name(k1)
    k2 = tf.Variable([2, 3])
    print_name(k2)
    k3 = tf.constant([2, 3])
    print_name(k3)
# Output
# Scope_1/k1   [1]
# Scope_1/Variable   [2]
# Scope_1/Const   [2]
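
The renaming seen above (Scope, then Scope_1) amounts to keeping a counter per requested name. The following is a simplified plain-Python sketch of that idea. NameRegistry is a hypothetical class, and TensorFlow's real naming logic handles more cases than this.

```python
class NameRegistry:
    """Hands out a name unchanged the first time and with a numeric suffix afterwards."""

    def __init__(self):
        self._counts = {}

    def unique(self, name):
        count = self._counts.get(name, 0)
        self._counts[name] = count + 1
        return name if count == 0 else "%s_%d" % (name, count)


registry = NameRegistry()

first_scope = registry.unique("Scope")
second_scope = registry.unique("Scope")
print(first_scope, second_scope)  # Scope Scope_1

# Two unnamed variables inside the first scope.
var_a = registry.unique(first_scope + "/Variable")
var_b = registry.unique(first_scope + "/Variable")
print(var_a, var_b)  # Scope/Variable Scope/Variable_1
```

Names are only ever made unique within one registry, which corresponds to names being unique within one graph.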

To finish, here is a small example that pulls together some of the points above:

import tensorflow as tf
import numpy as np
x = tf.Variable(np.array([[1, 2, 3], [4, 5, 6]]), dtype=np.float32)
assign_op = tf.assign(x, x + 1.0)  # or: assign_op = tf.assign_add(x, 1)
cost = tf.reduce_sum(x ** 2)
train_step = tf.train.AdamOptimizer(0.3).minimize(cost)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(x))
print(sess.run(cost))
# Minimize the cost with 1000 Adam steps
for i in range(1000):
    sess.run(train_step)
print(sess.run(x))
print(sess.run(cost))
# Redefine cost so that evaluating it first triggers assign_op
with tf.control_dependencies([assign_op]):
    cost = tf.reduce_sum(x ** 2)
print(sess.run(x))
print(sess.run(cost))
sess.close()
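
The optimization in this example can be followed by hand with ordinary gradient descent. The sketch below is plain Python and deliberately swaps Adam for the simpler update x <- x - lr * 2x, so its intermediate values will differ from what AdamOptimizer produces. The learning rate 0.3 is taken from the example and the step count of 20 is an arbitrary choice.

```python
def cost(matrix):
    """Sum of squares of all entries, like tf.reduce_sum(x ** 2)."""
    return sum(v * v for row in matrix for v in row)


def gradient_step(matrix, learning_rate):
    # The derivative of v**2 with respect to v is 2*v.
    return [[v - learning_rate * 2.0 * v for v in row] for row in matrix]


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
initial_cost = cost(x)
print(initial_cost)  # 91.0

for _ in range(20):
    x = gradient_step(x, 0.3)

final_cost = cost(x)
print(final_cost)
```

With a learning rate of 0.3 every entry is multiplied by 0.4 on each step, so x shrinks toward zero and the cost approaches its minimum of 0.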
