PyTorch 60-Minute Introductory Tutorial

2017-05-22 13:16

1 Tensors

from __future__ import print_function
import torch
x = torch.Tensor(5, 3)  # construct an uninitialized 5x3 matrix
x = torch.rand(5, 3)  # construct a randomly initialized 5x3 matrix

0.9643  0.2740  0.9700
0.2375  0.8547  0.1793
0.2462  0.8887  0.0271
0.8668  0.6014  0.9562
0.8588  0.3883  0.3741
[torch.FloatTensor of size 5x3]

x.size()

torch.Size([5, 3])

y = torch.rand(5,3)

0.3458  0.1517  0.1397
0.6764  0.6408  0.0139
0.6116  0.4172  0.8836
0.9197  0.6072  0.0751
0.7214  0.0613  0.4052
[torch.FloatTensor of size 5x3]

x+y

1.3101  0.4257  1.1097
0.9139  1.4955  0.1932
0.8578  1.3060  0.9108
1.7865  1.2086  1.0312
1.5802  0.4495  0.7793
[torch.FloatTensor of size 5x3]

torch.add(x,y)

1.3101  0.4257  1.1097
0.9139  1.4955  0.1932
0.8578  1.3060  0.9108
1.7865  1.2086  1.0312
1.5802  0.4495  0.7793
[torch.FloatTensor of size 5x3]

z = x+y

1.3101  0.4257  1.1097
0.9139  1.4955  0.1932
0.8578  1.3060  0.9108
1.7865  1.2086  1.0312
1.5802  0.4495  0.7793
[torch.FloatTensor of size 5x3]

result = torch.Tensor(5, 3)  # allocate an output tensor
torch.add(x, y, out=result)  # write the sum into the preallocated tensor
result

1.3101  0.4257  1.1097
0.9139  1.4955  0.1932
0.8578  1.3060  0.9108
1.7865  1.2086  1.0312
1.5802  0.4495  0.7793
[torch.FloatTensor of size 5x3]

y.add_(x)  # add x to y in place

# Note: every operation that modifies a tensor in place has a trailing underscore '_' in its name.
# For example, x.copy_(y) and x.t_() both change x.
y

1.3101  0.4257  1.1097
0.9139  1.4955  0.1932
0.8578  1.3060  0.9108
1.7865  1.2086  1.0312
1.5802  0.4495  0.7793
[torch.FloatTensor of size 5x3]
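
To make the underscore convention concrete, here is a minimal sketch contrasting `add` with `add_`. The tensor values and the names `z`, `w`, and `y_before` are illustrative and not from the original notes.

```python
import torch

x = torch.ones(2, 2)
y = torch.zeros(2, 2)

z = y.add(x)          # out-of-place: returns a new tensor, y is left untouched
y_before = y.clone()  # snapshot of y taken before the in-place call

w = y.add_(x)         # in-place: y itself is overwritten with y + x

print(y_before)  # still all zeros
print(y)         # now all ones
```

The in-place call also returns the modified tensor, so `w` refers to the same storage as `y`.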

x[:1]

0.9643  0.2740  0.9700
[torch.FloatTensor of size 1x3]

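2 Converting between Tensors and NumPy arrays

Note that a Torch Tensor and the NumPy array converted from it (or to it) share the same underlying memory, so modifying one also modifies the other.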

a = torch.ones(5)
b = a.numpy()
a,b

(
1
1
1
1
1
[torch.FloatTensor of size 5],
array([ 1.,  1.,  1.,  1.,  1.], dtype=float32))

import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
a,b

(array([ 1.,  1.,  1.,  1.,  1.]),
1
1
1
1
1
[torch.DoubleTensor of size 5])
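
The conversions above do not show the shared storage in action. The following sketch, with illustrative variable names, modifies one side in place and checks the other side for a CPU tensor.

```python
import numpy as np
import torch

a = torch.ones(5)
b = a.numpy()            # b views the same memory as a, no copy is made
a.add_(1)                # in-place update on the tensor
print(b)                 # the NumPy array reflects the change

c = np.ones(5)
d = torch.from_numpy(c)  # d shares the buffer of c
np.add(c, 1, out=c)      # in-place update on the array
print(d)                 # the tensor reflects the change
```

An out-of-place operation such as `a = a + 1` would instead bind `a` to a new tensor and leave `b` unchanged.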

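# Apart from CharTensor, every tensor type can be moved between the CPU and the GPU.
# Use the .cuda() method to move a Tensor to the GPU.
# The addition runs on the GPU only when CUDA is available.
if torch.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    x + y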

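3 Neural networks in PyTorch

Every neural network in PyTorch is built on the autograd package.

autograd provides automatic differentiation. It is a define-by-run framework: backpropagation is defined by how your code actually runs, so every iteration can be different.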

from torch.autograd import Variable
x = Variable(torch.ones(2,2),requires_grad = True)
y = x + 2
y.creator  # renamed to y.grad_fn in later PyTorch releases

<torch.autograd._functions.basic_ops.AddConstant at 0x3dad3f0>

z = y*y*3
out = z.mean()
out

Variable containing:
27
[torch.FloatTensor of size 1]

out.backward()

x.grad

Variable containing:
4.5000  4.5000
4.5000  4.5000
[torch.FloatTensor of size 2x2]

The derivation is at the link below: out is a function of x, and x.grad is the derivative of that function evaluated at x = 1.

https://zhuanlan.zhihu.com/p/25572330
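
The value 4.5 can also be checked by hand. Since out = (1/4) * sum_i 3 * (x_i + 2)^2, the derivative with respect to each x_i is (3/2) * (x_i + 2), which is 4.5 at x_i = 1. The sketch below compares that formula with autograd. It assumes a newer PyTorch release (0.4 or later) in which tensors accept `requires_grad` directly, so no `Variable` wrapper is needed. The name `analytic` is illustrative.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
out = (y * y * 3).mean()
out.backward()

# Hand-derived gradient: d(out)/dx_i = (3/2) * (x_i + 2)
analytic = 1.5 * (x.detach() + 2)

print(x.grad)
print(torch.allclose(x.grad, analytic))
```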

x = torch.randn(3)
# print(x)
x = Variable(x, requires_grad = True)
y = x * 2
while y.data.norm() < 1000:
    y = y * 2
    # print(y)
gradients = torch.FloatTensor([0.1, 1.0, 0.0001])
y.backward(gradients)
x.grad

Variable containing:
204.8000
2048.0000
0.2048
[torch.FloatTensor of size 3]

What does y.backward(gradients) do here?
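
One way to read the call: y is a vector rather than a scalar, so backward needs a vector v of the same shape, and it accumulates the vector-Jacobian product of v with dy/dx into x.grad. In the run above y = 2048 * x, so the Jacobian is 2048 times the identity and x.grad is simply gradients * 2048, which matches the printed 204.8, 2048, 0.2048. The sketch below repeats the experiment with fixed inputs instead of torch.randn so the result is reproducible. It assumes a newer PyTorch release without `Variable`, and the names `v`, `doublings`, and `scale` are illustrative.

```python
import torch

# Fixed values instead of torch.randn(3), chosen only for reproducibility
x = torch.tensor([1.0, -2.0, 0.5], requires_grad=True)

y = x * 2
doublings = 1
while y.detach().norm() < 1000:
    y = y * 2
    doublings += 1

v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)  # accumulates v^T * (dy/dx) into x.grad

# y = scale * x, so dy/dx = scale * I and x.grad should equal scale * v
scale = 2.0 ** doublings
print(x.grad)
print(scale * v)
```

For a scalar output such as a loss, backward() needs no argument, which is equivalent to passing a weight of 1.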

y, y.data, y.data.norm()

(Variable containing:
1546.1327
-304.6176
642.7925
[torch.FloatTensor of size 3],
1546.1327
-304.6176
642.7925
[torch.FloatTensor of size 3], 1701.9108254060725)

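4 Neural networks

Neural networks can be constructed with the torch.nn package.

Now that you have had a first look at autograd: nn builds on autograd to define models and differentiate them.

An nn.Module contains the layers of a network, together with a forward(input) method that returns the output.

For example, consider a network that classifies images of digits.

A typical training procedure for a neural network is:

Define a neural network that has learnable parameters (weights)

Iterate over a dataset of inputs:

Process each input through the network

Compute the loss (how far the output is from being correct)

Propagate gradients back into the network's parameters

Update the weights of the network

A simple update rule is typically used: weight = weight - learning_rate * gradient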
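
The update step moves each weight against its gradient, which is why the learning-rate term is subtracted. As a minimal sketch in plain Python, with no PyTorch involved, here is gradient descent on a hypothetical one-parameter loss (w - 3)^2. The loss, the starting point, and the learning rate are all illustrative choices.

```python
def grad(w):
    # Derivative of the hypothetical loss (w - 3)^2 with respect to w
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1
for step in range(100):
    w = w - learning_rate * grad(w)  # step against the gradient

print(w)  # close to the minimizer w = 3
```

Adding the gradient term instead would move w away from the minimum at every step.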

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5) # 1 input image channel, 6 output channels, 5x5 square convolution kernel
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1   = nn.Linear(16*5*5, 120) # an affine operation: y = Wx + b
        self.fc2   = nn.Linear(120, 84)
        self.fc3   = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2)) # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv2(x)), 2) # If the size is a square you can only specify a single number
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:] # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
net

Net (
(conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
(conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
(fc1): Linear (400 -> 120)
(fc2): Linear (120 -> 84)
(fc3): Linear (84 -> 10)
)

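x = x.view(-1, self.num_flat_features(x)) flattens x: every dimension except the batch dimension is collapsed into one, so the result can be fed to the fully connected layers.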
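
The flattened size 16*5*5 in fc1 follows from the 32x32 input used with this network. The sketch below traces the spatial size through the two convolution and pooling stages with the standard output-size formulas. The helper names `conv_out` and `pool_out` are hypothetical and not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output width of a convolution along one spatial dimension
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window):
    # Max pooling whose stride equals the window size (the max_pool2d default)
    return size // window

size = 32
size = pool_out(conv_out(size, 5), 2)  # conv1: 32 -> 28, pool: 28 -> 14
size = pool_out(conv_out(size, 5), 2)  # conv2: 14 -> 10, pool: 10 -> 5

flat_features = 16 * size * size       # 16 channels from conv2
print(size, flat_features)
```

This agrees with the `Linear (400 -> 120)` line in the printed network.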

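You only need to define the forward function; backward is generated automatically.

You can use any Tensor operation inside forward.

The learnable parameters of a model are returned by net.parameters().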

params = list(net.parameters())
print(len(params))
print(params[0].size()) # conv1's .weight

input = Variable(torch.randn(1, 1, 32, 32))
out = net(input)

10
torch.Size([6, 1, 5, 5])

out

Variable containing:
0.0648  0.0148  0.0333  0.0013  0.0563 -0.0156  0.0543  0.1504 -0.0774 -0.0231
[torch.FloatTensor of size 1x10]

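A recap of what we have seen so far:

torch.Tensor - A multi-dimensional array.

autograd.Variable - Wraps a Tensor and records the history of operations applied to it. It has the same API as a Tensor, plus additions such as backward(), and it also holds the gradient with respect to the tensor.

nn.Module - A neural network module. A convenient way to encapsulate parameters, with helpers for moving them to the GPU and for saving and loading.

nn.Parameter - A kind of Variable that is automatically registered as a parameter when it is assigned as an attribute of a Module.

autograd.Function - Implements the forward and backward definitions of an autograd operation. Every Variable operation creates at least one Function node, which is connected to the functions that created the Variable and so records its history.

What we have covered so far:

Defining a neural network.

Processing an input and calling backward.

Still left:

Computing the loss.

Updating the weights of the network.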

5 Computing the gradient of each parameter

output = net(input)
target = Variable(torch.range(1, 10))  # a dummy target, for example
criterion = nn.MSELoss()
loss = criterion(output, target)
loss

Variable containing:
38.2952
[torch.FloatTensor of size 1]

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d

-> view -> linear -> relu -> linear -> relu -> linear

-> MSELoss

-> loss

print(loss.creator) # MSELoss
print(loss.creator.previous_functions[0][0]) # Linear
print(loss.creator.previous_functions[0][0].previous_functions[0][0]) # ReLU

<torch.nn._functions.thnn.auto.MSELoss object at 0x31dd6878>
<torch.nn._functions.linear.Linear object at 0x31dd6790>
<torch.nn._functions.thnn.auto.Threshold object at 0x31dd66a8>

# Now call loss.backward() and compare conv1's bias gradient before and after the backward pass
net.zero_grad() # zero the gradient buffers of all parameters
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
loss.backward()
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)

conv1.bias.grad before backward
None
conv1.bias.grad after backward
Variable containing:
0.2046
0.0389
-0.0529
-0.0108
-0.0941
-0.0869
[torch.FloatTensor of size 6]

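The first layer has exactly 6 bias values, and a gradient has now been computed for each of them, so the parameters can be updated with a fixed learning rate. Automatic gradient computation of this kind seems to be a large part of what makes deep learning frameworks so useful.

loss.backward() computes the gradients for every layer; the update step itself is still missing.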

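6 Updating the parameters

The simplest update rule is stochastic gradient descent (SGD):

weight = weight - learning_rate * gradient

It can be written in a few lines of plain Python:

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)

The same update is available through torch.optim: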

import torch.optim as optim
# create your optimizer
optimizer = optim.SGD(net.parameters(),lr = 0.01)

# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output,target)
loss.backward()
optimizer.step()# Does the update
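
The snippet above shows a single iteration. As a rough end-to-end sketch, the loop below repeats those four calls on a toy regression problem and records the loss. The `nn.Linear` model is a hypothetical stand-in for Net, and the data, seed, learning rate, and step count are illustrative. It assumes a newer PyTorch release in which plain tensors can be passed to a module.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(3, 1)                    # hypothetical stand-in for Net
inputs = torch.randn(16, 3)
targets = inputs.sum(dim=1, keepdim=True)  # toy target: sum of the features

criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(50):
    optimizer.zero_grad()                  # clear gradients from the previous step
    loss = criterion(model(inputs), targets)
    loss.backward()                        # compute gradients of the loss
    optimizer.step()                       # apply the SGD update
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Calling zero_grad() at the start of every iteration matters because backward() accumulates into the existing gradient buffers rather than overwriting them.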

Summary

Input:

input = Variable(torch.randn(1, 1, 32, 32))

Output:

out = net(input)

Network structure:

class Net(nn.Module):
    def __init__(self): ...
    def forward(self, x): ...

backward is generated automatically.

Update:

optimizer = optim.SGD(net.parameters(), lr=0.01)
optimizer.zero_grad()
loss.backward()
optimizer.step()  # does the update

Reference:

https://zhuanlan.zhihu.com/p/25572330
