
PyTorch | From NumPy to PyTorch: Implementing a Neural Network

2020-03-05 20:02

A Two-Layer Neural Network in NumPy

A fully connected ReLU network with one hidden layer and no bias, trained to predict y from x using a squared loss.

This implementation uses NumPy alone to compute the forward pass, the loss, and the backward pass (backpropagation).

$N$ — number of samples; $D_{in}$ — input dimension; $H$ — hidden layer dimension; $D_{out}$ — output dimension

  • Forward pass:
    • $h = x w_1$, where $x$ is $N \times D_{in}$, $w_1$ is $D_{in} \times H$, and $h$ is $N \times H$
    • $h_{relu} = \max(0, h)$, where $h_{relu}$ is $N \times H$
    • $\hat{y} = h_{relu} w_2$, where $w_2$ is $H \times D_{out}$ and $\hat{y}$ is $N \times D_{out}$

  • Loss:
    • $L(\omega) = (\hat{y} - y)^2$

  • Backward pass:
    • $\frac{\partial L}{\partial \hat{y}} = 2(\hat{y} - y)$
    • $\frac{\partial L}{\partial \omega_2} = \frac{\partial \hat{y}}{\partial \omega_2}\frac{\partial L}{\partial \hat{y}} = h_{relu}^T \frac{\partial L}{\partial \hat{y}}$
    • $\frac{\partial L}{\partial h_{relu}} = \frac{\partial \hat{y}}{\partial h_{relu}}\frac{\partial L}{\partial \hat{y}} = \frac{\partial L}{\partial \hat{y}} \omega_2^T$
    • $\frac{\partial L}{\partial h} = \frac{\partial L}{\partial h_{relu}}$ where $h > 0$, and $\frac{\partial L}{\partial h} = 0$ where $h < 0$
    • $\frac{\partial L}{\partial \omega_1} = \frac{\partial h}{\partial \omega_1}\frac{\partial L}{\partial h} = x^T \frac{\partial L}{\partial h}$

    A NumPy ndarray is a plain n-dimensional array. It knows nothing about deep learning, gradients, or computation graphs; it is simply a data structure for numerical computation.

    import numpy as np
    import matplotlib.pyplot as plt

    # N - number of samples, D_in - input dim, H - hidden dim, D_out - output dim
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random training data
    x = np.random.randn(N, D_in)
    y = np.random.randn(N, D_out)

    # Initialize the weights w1 and w2
    w1 = np.random.randn(D_in, H)
    w2 = np.random.randn(H, D_out)

    # Learning rate
    learning_rate = 1e-6

    Loss = []
    for it in range(500):
        # Forward pass
        h = x.dot(w1)              # N * H
        h_relu = np.maximum(h, 0)  # N * H
        y_pred = h_relu.dot(w2)    # N * D_out

        # Compute loss
        loss = np.square(y_pred - y).sum()
        Loss.append(loss)
        if it % 49 == 0:
            print(it, loss)

        # Backward pass: compute the gradients by hand
        grad_y_pred = 2.0 * (y_pred - y)
        grad_w2 = h_relu.T.dot(grad_y_pred)
        grad_h_relu = grad_y_pred.dot(w2.T)
        grad_h = grad_h_relu.copy()
        grad_h[h < 0] = 0
        grad_w1 = x.T.dot(grad_h)

        # Update the weights w1 and w2
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2

    # Plot the training loss
    fig = plt.figure()
    ax = plt.subplot(111)
    ax.plot(Loss, lw=2)
    plt.savefig("result.png")
    plt.show()

    Output

    0 25842131.18766065
    49 9341.26313812292
    98 189.5463364244673
    147 6.111110996230143
    196 0.23069281196938063
    245 0.009375792322499452
    294 0.000398055380964109
    343 1.7414106449815095e-05
    392 7.791508421066646e-07
    441 3.5475745256511925e-08
    490 1.6372834073046531e-09

    PyTorch: Tensors and autograd

    One of PyTorch's key features is autograd: once the forward pass is defined and the loss is computed, PyTorch can automatically compute the gradients of all model parameters.

    A PyTorch Tensor represents a node in a computation graph. If x is a Tensor with x.requires_grad = True, then x.grad is another Tensor holding the gradient of some scalar value (usually the loss) with respect to x.
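    As a minimal sketch of this behaviour (a toy example, not part of the network above):

    import torch

    # A toy scalar computation: z = sum of all elements of 3 * x
    x = torch.ones(2, 2, requires_grad=True)
    z = (3 * x).sum()
    z.backward()     # autograd computes dz/dx and stores it in x.grad
    print(x.grad)    # every entry is 3, since dz/dx_ij = 3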

    import torch
    import matplotlib.pyplot as plt

    # N - number of samples, D_in - input dim, H - hidden dim, D_out - output dim
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random training data
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    # Initialize the weights w1 and w2
    w1 = torch.randn(D_in, H, requires_grad=True)
    w2 = torch.randn(H, D_out, requires_grad=True)

    # Learning rate
    learning_rate = 1e-6

    Loss = []
    for it in range(500):
        # Forward pass
        y_pred = x.mm(w1).clamp(min=0).mm(w2)

        # Compute loss (builds the computation graph)
        loss = (y_pred - y).pow(2).sum()
        Loss.append(loss.item())
        if it % 50 == 49:
            print(it, loss.item())

        # Backward pass
        loss.backward()

        # Update the weights w1 and w2, then reset their gradients
        with torch.no_grad():
            w1 -= learning_rate * w1.grad
            w2 -= learning_rate * w2.grad
            w1.grad.zero_()
            w2.grad.zero_()

    # Plot the training loss
    fig = plt.figure()
    ax = plt.subplot(111)
    ax.plot(Loss, lw=2)
    plt.savefig("result.png")
    plt.show()

    Output

    49 19870.130859375
    99 1072.4837646484375
    149 85.9421157836914
    199 7.828434467315674
    249 0.753280520439148
    299 0.0745907872915268
    349 0.007736038416624069
    399 0.0010647654999047518
    449 0.00025447571533732116
    499 9.513569966657087e-05

    PyTorch: nn

    This time we build the network with PyTorch's nn package. The forward pass still builds a computation graph through autograd, and PyTorch computes the gradients for us automatically; nn additionally provides ready-made layers and loss functions.
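    For instance (a small illustrative sketch, separate from the training script below), a torch.nn.Linear layer owns its own weight and bias parameters and computes x @ W.T + b:

    import torch

    layer = torch.nn.Linear(3, 2)      # maps 3 input features to 2 outputs
    print(layer.weight.shape)          # torch.Size([2, 3])
    print(layer.bias.shape)            # torch.Size([2])
    out = layer(torch.randn(5, 3))     # batch of 5 samples -> shape (5, 2)
    print(out.shape)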

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    # N - number of samples, D_in - input dim, H - hidden dim, D_out - output dim
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random training data
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    model = torch.nn.Sequential(
        torch.nn.Linear(D_in, H),   # w_1 * x + b_1
        torch.nn.ReLU(),
        torch.nn.Linear(H, D_out)
    )

    loss_fn = nn.MSELoss(reduction='sum')

    # Learning rate
    learning_rate = 1e-3

    Loss = []
    for it in range(500):
        # Forward pass
        y_pred = model(x)

        # Compute loss (builds the computation graph)
        loss = loss_fn(y_pred, y)
        Loss.append(loss.item())
        if it % 50 == 49:
            print(it, loss.item())

        # Clear old gradients, then run the backward pass
        model.zero_grad()
        loss.backward()

        # Update all model parameters with plain gradient descent
        with torch.no_grad():
            for param in model.parameters():
                param -= learning_rate * param.grad

    # Plot the training loss
    fig = plt.figure()
    ax = plt.subplot(111)
    ax.plot(Loss, lw=2)
    plt.savefig("result.png")
    plt.show()

    Output

    49 0.003269762033596635
    99 1.1983887588939979e-06
    149 5.244479295285487e-10
    199 1.957820407530453e-12
    249 1.967756903600848e-12
    299 1.74486575361954e-12
    349 1.9000298296517615e-12
    399 1.9209714114537535e-12
    449 2.1060768146813347e-12
    499 2.0324950490008264e-12

    PyTorch: optim

    This time we no longer update the model weights by hand; instead we use the optim package to update the parameters for us. optim provides implementations of common optimization algorithms, including SGD with momentum, RMSprop, Adam, and more.
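    As a minimal sketch of how these optimizers are constructed (the stand-in parameter list here is only for illustration):

    import torch

    params = [torch.randn(3, 3, requires_grad=True)]   # stand-in parameter list

    # A few interchangeable optimizers from torch.optim
    sgd = torch.optim.SGD(params, lr=1e-2, momentum=0.9)
    rmsprop = torch.optim.RMSprop(params, lr=1e-3)
    adam = torch.optim.Adam(params, lr=1e-3)

    # The usage pattern is the same for all of them:
    # optimizer.zero_grad(); loss.backward(); optimizer.step()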

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    # N - number of samples, D_in - input dim, H - hidden dim, D_out - output dim
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random training data
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    model = torch.nn.Sequential(
        torch.nn.Linear(D_in, H),   # w_1 * x + b_1
        torch.nn.ReLU(),
        torch.nn.Linear(H, D_out)
    )

    # Learning rate
    learning_rate = 1e-3

    loss_fn = nn.MSELoss(reduction='sum')
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    Loss = []
    for it in range(500):
        # Forward pass
        y_pred = model(x)

        # Compute loss (builds the computation graph)
        loss = loss_fn(y_pred, y)
        Loss.append(loss.item())
        if it % 50 == 49:
            print(it, loss.item())

        # Clear old gradients, run the backward pass, then update the parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Plot the training loss
    fig = plt.figure()
    ax = plt.subplot(111)
    ax.plot(Loss, lw=2)
    plt.savefig("result.png")
    plt.show()

    Output

    49 0.8909258842468262
    99 0.005603241268545389
    149 3.349817779962905e-05
    199 1.8448776017976343e-07
    249 9.901702791026423e-10
    299 1.6195045304812083e-11
    349 8.347061757063567e-12
    399 7.631120561846227e-12
    449 1.0616650787664828e-11
    499 9.00871651582369e-12

    PyTorch: Custom nn.Module

    We can also define the model as a class that inherits from nn.Module. Whenever you need a model more complex than what a Sequential container can express, define it as an nn.Module subclass.
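    For example, a module whose forward pass uses its input more than once (a hypothetical residual-style block, sketched here only for illustration) cannot be written as a plain Sequential:

    import torch

    class ResidualBlock(torch.nn.Module):
        def __init__(self, dim):
            super(ResidualBlock, self).__init__()
            self.linear = torch.nn.Linear(dim, dim)

        def forward(self, x):
            # skip connection: the output depends on x twice
            return x + self.linear(x).clamp(min=0)

    block = ResidualBlock(16)
    print(block(torch.randn(4, 16)).shape)   # torch.Size([4, 16])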

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    # N - number of samples, D_in - input dim, H - hidden dim, D_out - output dim
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random training data
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    class TwoLayerNet(torch.nn.Module):
        def __init__(self, D_in, H, D_out):
            super(TwoLayerNet, self).__init__()
            self.linear1 = torch.nn.Linear(D_in, H, bias=False)
            self.linear2 = torch.nn.Linear(H, D_out, bias=False)

        def forward(self, x):
            y_pred = self.linear2(self.linear1(x).clamp(min=0))
            return y_pred

    model = TwoLayerNet(D_in, H, D_out)
    loss_fn = nn.MSELoss(reduction='sum')
    learning_rate = 1e-3
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    Loss = []
    for it in range(500):
        # Forward pass
        y_pred = model(x)

        # Compute loss (builds the computation graph)
        loss = loss_fn(y_pred, y)
        Loss.append(loss.item())
        if it % 50 == 49:
            print(it, loss.item())

        # Clear old gradients, run the backward pass, then update the parameters
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Plot the training loss
    fig = plt.figure()
    ax = plt.subplot(111)
    ax.plot(Loss, lw=2)
    plt.savefig("result.png")
    plt.show()

    Output

    49 1.059552788734436
    99 0.005720105022192001
    149 3.2995118090184405e-05
    199 1.9062906631006626e-07
    249 1.0079914680716229e-09
    299 1.5386018847873828e-11
    349 8.505867017671864e-12
    399 9.710801607276665e-12
    449 9.59173278997083e-12
    499 1.120120142472647e-11
