PyTorch | From NumPy to PyTorch: Implementing a Neural Network
A Two-Layer Neural Network in NumPy
A fully connected ReLU network with one hidden layer and no bias, trained to predict y from x using a squared loss.
This implementation uses NumPy alone to compute the forward pass, the loss, and backpropagation.
$N$: batch size; $D_{in}$: input dimension; $H$: hidden dimension; $D_{out}$: output dimension.

Forward pass:
- $h = x w_1$, with $x \in \mathbb{R}^{N \times D_{in}}$ and $w_1 \in \mathbb{R}^{D_{in} \times H}$, so $h \in \mathbb{R}^{N \times H}$
- $h_{relu} = \max(0, h)$, so $h_{relu} \in \mathbb{R}^{N \times H}$
- $\hat{y} = h_{relu} w_2$, with $w_2 \in \mathbb{R}^{H \times D_{out}}$, so $\hat{y} \in \mathbb{R}^{N \times D_{out}}$

Loss:
- $L(\omega) = \sum (\hat{y} - y)^2$

Backward pass:
- $\frac{\partial L}{\partial \hat{y}} = 2(\hat{y} - y)$
- $\frac{\partial L}{\partial \omega_2} = \frac{\partial \hat{y}}{\partial \omega_2}\frac{\partial L}{\partial \hat{y}} = h_{relu}^T \frac{\partial L}{\partial \hat{y}}$
- $\frac{\partial L}{\partial h_{relu}} = \frac{\partial \hat{y}}{\partial h_{relu}}\frac{\partial L}{\partial \hat{y}} = \frac{\partial L}{\partial \hat{y}} \omega_2^T$
- $\frac{\partial L}{\partial h} = \frac{\partial L}{\partial h_{relu}}$ where $h > 0$, and $\frac{\partial L}{\partial h} = 0$ where $h < 0$
- $\frac{\partial L}{\partial \omega_1} = \frac{\partial h}{\partial \omega_1}\frac{\partial L}{\partial h} = x^T \frac{\partial L}{\partial h}$
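A quick way to sanity-check these formulas is to compare the analytic gradient against a central finite-difference estimate. Below is a minimal sketch on tiny random data (the dimensions and seed are illustrative choices, not from the original code):

```python
import numpy as np

np.random.seed(0)
N, D_in, H, D_out = 4, 5, 3, 2
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

def loss(w1):
    h = x.dot(w1)
    y_pred = np.maximum(h, 0).dot(w2)
    return np.square(y_pred - y).sum()

# Analytic gradient of the loss w.r.t. w1, exactly as derived above
h = x.dot(w1)
grad_y_pred = 2.0 * (np.maximum(h, 0).dot(w2) - y)
grad_h = grad_y_pred.dot(w2.T)
grad_h[h < 0] = 0
grad_w1 = x.T.dot(grad_h)

# Central finite-difference estimate for a single entry, w1[0, 0]
eps = 1e-6
w1_plus = w1.copy();  w1_plus[0, 0] += eps
w1_minus = w1.copy(); w1_minus[0, 0] -= eps
numeric = (loss(w1_plus) - loss(w1_minus)) / (2 * eps)

print(grad_w1[0, 0], numeric)  # the two numbers should agree to several digits
```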
A NumPy ndarray is a plain n-dimensional array. It knows nothing about deep learning, gradients, or computation graphs; it is simply a data structure for numerical computation.
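The two ndarray operations the implementation below leans on are the matrix product (dot) and the elementwise maximum. A minimal sketch:

```python
import numpy as np

a = np.random.randn(2, 3)
b = np.random.randn(3, 4)
print(a.dot(b).shape)    # (2, 4): matrix product
print(np.maximum(a, 0))  # elementwise max with 0, i.e. ReLU
```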
```python
import numpy as np
import matplotlib.pyplot as plt

# N: batch size; D_in: input dimension; H: hidden dimension; D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random training data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

# Initialize the weights w1 and w2
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

# Learning rate
learning_rate = 1e-6
Loss = []
for it in range(500):
    # Forward pass
    h = x.dot(w1)               # N * H
    h_relu = np.maximum(h, 0)   # N * H
    y_pred = h_relu.dot(w2)     # N * D_out

    # Compute loss
    loss = np.square(y_pred - y).sum()
    Loss.append(loss)
    if it % 49 == 0:
        print(it, loss)

    # Backward pass: compute the gradients
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update the weights w1 and w2
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

fig = plt.figure()
ax = plt.subplot(111)
ax.plot(Loss, lw=2)
plt.savefig("result.png")
plt.show()
```
Output
```
0 25842131.18766065
49 9341.26313812292
98 189.5463364244673
147 6.111110996230143
196 0.23069281196938063
245 0.009375792322499452
294 0.000398055380964109
343 1.7414106449815095e-05
392 7.791508421066646e-07
441 3.5475745256511925e-08
490 1.6372834073046531e-09
```
PyTorch: Tensor and autograd
A key feature of PyTorch is autograd: once the forward pass is defined and the loss is computed, PyTorch can automatically compute the gradient of the loss with respect to every model parameter.
A PyTorch Tensor represents a node in a computation graph. If x is a Tensor with x.requires_grad = True, then x.grad is another Tensor holding the gradient of some scalar value (usually the loss) with respect to x.
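As a minimal sketch of this mechanism (the values here are illustrative): for $y = \sum x^2$, calling y.backward() fills x.grad with $\partial y / \partial x = 2x$:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()  # forward pass builds the computation graph
y.backward()        # autograd fills x.grad with dy/dx
print(x.grad)       # tensor([2., 4., 6.])
```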
```python
import torch
import matplotlib.pyplot as plt

# N: batch size; D_in: input dimension; H: hidden dimension; D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random training data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Initialize the weights w1 and w2
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

# Learning rate
learning_rate = 1e-6
Loss = []
for it in range(500):
    # Forward pass
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute loss (builds the computation graph)
    loss = (y_pred - y).pow(2).sum()
    Loss.append(loss.item())
    if it % 50 == 49:
        print(it, loss.item())

    # Backward pass
    loss.backward()

    # Update the weights w1 and w2
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()

fig = plt.figure()
ax = plt.subplot(111)
ax.plot(Loss, lw=2)
plt.savefig("result.png")
plt.show()
```
Output
```
49 19870.130859375
99 1072.4837646484375
149 85.9421157836914
199 7.828434467315674
249 0.753280520439148
299 0.0745907872915268
349 0.007736038416624069
399 0.0010647654999047518
449 0.00025447571533732116
499 9.513569966657087e-05
```
PyTorch: nn
This time we build the network with PyTorch's nn library, which provides higher-level building blocks such as layers and loss functions. Autograd still constructs the computation graph and computes the gradients for us.
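One difference from the hand-written network above: nn.Linear includes a bias term by default, so each layer computes $xW^T + b$. A quick sketch to confirm the parameter shapes:

```python
import torch

layer = torch.nn.Linear(1000, 100)
print(layer.weight.shape)  # torch.Size([100, 1000])
print(layer.bias.shape)    # torch.Size([100])
```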
```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# N: batch size; D_in: input dimension; H: hidden dimension; D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random training data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),  # w_1 * x + b_1
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

loss_fn = nn.MSELoss(reduction='sum')

# Learning rate
learning_rate = 1e-3
Loss = []
for it in range(500):
    # Forward pass
    y_pred = model(x)

    # Compute loss (builds the computation graph)
    loss = loss_fn(y_pred, y)
    Loss.append(loss.item())
    if it % 50 == 49:
        print(it, loss.item())

    model.zero_grad()
    # Backward pass
    loss.backward()

    # Update all model parameters
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

fig = plt.figure()
ax = plt.subplot(111)
ax.plot(Loss, lw=2)
plt.savefig("result.png")
plt.show()
```
Output
```
49 0.003269762033596635
99 1.1983887588939979e-06
149 5.244479295285487e-10
199 1.957820407530453e-12
249 1.967756903600848e-12
299 1.74486575361954e-12
349 1.9000298296517615e-12
399 1.9209714114537535e-12
449 2.1060768146813347e-12
499 2.0324950490008264e-12
```
PyTorch: optim
This time, instead of updating the model weights by hand, we use the optim package. It provides implementations of many common optimization algorithms, including SGD with momentum, RMSprop, and Adam.
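Swapping optimizers is a one-line change. A minimal sketch (the model here is a stand-in, and the hyperparameters are illustrative, not tuned):

```python
import torch

model = torch.nn.Linear(10, 1)  # stand-in model; any nn.Module works

# Each constructor takes an iterable of parameters plus algorithm-specific options
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```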
```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# N: batch size; D_in: input dimension; H: hidden dimension; D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random training data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),  # w_1 * x + b_1
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# Learning rate
learning_rate = 1e-3
loss_fn = nn.MSELoss(reduction='sum')
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

Loss = []
for it in range(500):
    # Forward pass
    y_pred = model(x)

    # Compute loss (builds the computation graph)
    loss = loss_fn(y_pred, y)
    Loss.append(loss.item())
    if it % 50 == 49:
        print(it, loss.item())

    optimizer.zero_grad()
    # Backward pass
    loss.backward()

    # Update all model parameters
    optimizer.step()

fig = plt.figure()
ax = plt.subplot(111)
ax.plot(Loss, lw=2)
plt.savefig("result.png")
plt.show()
```
Output
```
49 0.8909258842468262
99 0.005603241268545389
149 3.349817779962905e-05
199 1.8448776017976343e-07
249 9.901702791026423e-10
299 1.6195045304812083e-11
349 8.347061757063567e-12
399 7.631120561846227e-12
449 1.0616650787664828e-11
499 9.00871651582369e-12
```
PyTorch: Custom nn Module
We can also define a model as a class that inherits from nn.Module. This is the way to go when the model is more complex than what Sequential can express. Note that the layers below are created with bias=False, matching the bias-free network we started with.
```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# N: batch size; D_in: input dimension; H: hidden dimension; D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random training data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H, bias=False)
        self.linear2 = torch.nn.Linear(H, D_out, bias=False)

    def forward(self, x):
        y_pred = self.linear2(self.linear1(x).clamp(min=0))
        return y_pred

model = TwoLayerNet(D_in, H, D_out)
loss_fn = nn.MSELoss(reduction='sum')
learning_rate = 1e-3
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

Loss = []
for it in range(500):
    # Forward pass
    y_pred = model(x)

    # Compute loss (builds the computation graph)
    loss = loss_fn(y_pred, y)
    Loss.append(loss.item())
    if it % 50 == 49:
        print(it, loss.item())

    optimizer.zero_grad()
    # Backward pass
    loss.backward()

    # Update all model parameters
    optimizer.step()

fig = plt.figure()
ax = plt.subplot(111)
ax.plot(Loss, lw=2)
plt.savefig("result.png")
plt.show()
```
Output
```
49 1.059552788734436
99 0.005720105022192001
149 3.2995118090184405e-05
199 1.9062906631006626e-07
249 1.0079914680716229e-09
299 1.5386018847873828e-11
349 8.505867017671864e-12
399 9.710801607276665e-12
449 9.59173278997083e-12
499 1.120120142472647e-11
```