PyTorch Parameter Initialization Methods
2017-10-26 20:32
torch.nn.init
torch.nn.init.calculate_gain(nonlinearity, param=None)
Return the recommended gain value for the given nonlinearity function. The values are as follows:
nonlinearity | gain
---|---
linear | 1
conv{1,2,3}d | 1
sigmoid | 1
tanh | 5/3
relu | √2
leaky_relu | √(2/(1+negative_slope²))
Parameters:
- nonlinearity – the nonlinear function (nn.functional name)
- param – optional parameter for the nonlinear function
>>> gain = nn.init.calculate_gain('leaky_relu')
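The gains in the table above can be checked directly. A minimal sketch (note that `calculate_gain` keeps this same name and signature in current PyTorch, and `leaky_relu` assumes its default `negative_slope` of 0.01 when `param` is omitted):

```python
import math

import torch.nn as nn

# Recommended gains from the table above
tanh_gain = nn.init.calculate_gain('tanh')          # 5/3
relu_gain = nn.init.calculate_gain('relu')          # √2
leaky_gain = nn.init.calculate_gain('leaky_relu')   # √(2/(1+0.01²))

print(tanh_gain, relu_gain, leaky_gain)
```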
torch.nn.init.uniform(tensor, a=0, b=1)
Fills the input Tensor or Variable with values drawn from the uniform distribution U(a, b).
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- a – the lower bound of the uniform distribution
- b – the upper bound of the uniform distribution
>>> w = torch.Tensor(3, 5)
>>> nn.init.uniform(w)
torch.nn.init.normal(tensor, mean=0, std=1)
Fills the input Tensor or Variable with values drawn from the normal distribution N(mean, std).
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- mean – the mean of the normal distribution
- std – the standard deviation of the normal distribution
>>> w = torch.Tensor(3, 5)
>>> nn.init.normal(w)
torch.nn.init.constant(tensor, val)
Fills the input Tensor or Variable with the value val.
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- val – the value to fill the tensor with
>>> w = torch.Tensor(3, 5)
>>> nn.init.constant(w, 0.3)
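In practice these fillers are usually combined with `Module.apply` to initialize a whole network at once. A minimal sketch — note that current PyTorch renames the in-place fillers with a trailing underscore (`normal_`, `constant_`), and the network layout and std value here are arbitrary illustrations:

```python
import torch
import torch.nn as nn

def init_weights(m):
    # Normal fill for Linear weights, constant fill for biases
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)
        nn.init.constant_(m.bias, 0.0)

net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
net.apply(init_weights)  # runs init_weights on every submodule recursively
```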
torch.nn.init.eye(tensor)
Fills the 2-dimensional input Tensor or Variable with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.
Parameters:
- tensor – a 2-dimensional torch.Tensor or autograd.Variable
>>> w = torch.Tensor(3, 5)
>>> nn.init.eye(w)
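The identity-preserving behaviour can be seen by pairing an identity weight with a zero bias in a square Linear layer. A sketch using the newer underscore names (`eye_`, `constant_`); the layer size is arbitrary:

```python
import torch
import torch.nn as nn

lin = nn.Linear(5, 5)
nn.init.eye_(lin.weight)          # identity weight matrix
nn.init.constant_(lin.bias, 0.0)  # zero bias

x = torch.randn(2, 5)
y = lin(x)
print(torch.allclose(x, y))  # True – inputs pass through unchanged
```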
torch.nn.init.dirac(tensor)
Fills the {3, 4, 5}-dimensional input Tensor or Variable with the Dirac delta function. Preserves the identity of the inputs in Convolutional layers, where as many input channels are preserved as possible.
Parameters:
- tensor – a {3, 4, 5}-dimensional torch.Tensor or autograd.Variable
>>> w = torch.Tensor(3, 16, 5, 5)
>>> nn.init.dirac(w)
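The Dirac-filled kernel makes a same-padding convolution act as the identity on its input channels. A sketch with the newer `dirac_` name; the layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 16, kernel_size=3, padding=1, bias=False)
nn.init.dirac_(conv.weight)  # delta kernel per matching in/out channel pair

x = torch.randn(1, 16, 8, 8)
y = conv(x)
print(torch.allclose(x, y, atol=1e-6))  # True – channels pass through unchanged
```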
torch.nn.init.xavier_uniform(tensor, gain=1)
Fills the input Tensor or Variable with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks” - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from U(−a, a) where a = gain × √(2/(fan_in + fan_out)) × √3. Also known as Glorot initialisation.
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- gain – an optional scaling factor
>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_uniform(w, gain=nn.init.calculate_gain('relu'))
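The bound a can be checked numerically. A sketch using the newer `xavier_uniform_` name; for a 2-dimensional 3×5 tensor, fan_out = 3 and fan_in = 5:

```python
import math

import torch
import torch.nn as nn

w = torch.empty(3, 5)  # fan_out = 3, fan_in = 5
gain = nn.init.calculate_gain('relu')
nn.init.xavier_uniform_(w, gain=gain)

# every entry lies inside U(−a, a) with a = gain × √(2/(fan_in+fan_out)) × √3
a = gain * math.sqrt(2.0 / (5 + 3)) * math.sqrt(3.0)
print(bool(w.abs().max() <= a))  # True
```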
torch.nn.init.xavier_normal(tensor, gain=1)
Fills the input Tensor or Variable with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks” - Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from N(0, std) where std = gain × √(2/(fan_in + fan_out)). Also known as Glorot initialisation.
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- gain – an optional scaling factor
>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_normal(w)
torch.nn.init.kaiming_uniform(tensor, a=0, mode='fan_in')
Fills the input Tensor or Variable with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from U(−bound, bound) where bound = √(2/((1 + a²) × fan_in)) × √3. Also known as He initialisation.
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- a – the negative slope of the rectifier used after this layer (0 for ReLU by default)
- mode – either 'fan_in' (default) or 'fan_out'. Choosing fan_in preserves the magnitude of the variance of the weights in the forward pass. Choosing fan_out preserves the magnitudes in the backwards pass.
>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_uniform(w, mode='fan_in')
torch.nn.init.kaiming_normal(tensor, a=0, mode='fan_in')
Fills the input Tensor or Variable with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from N(0, std) where std = √(2/((1 + a²) × fan_in)). Also known as He initialisation.
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- a – the negative slope of the rectifier used after this layer (0 for ReLU by default)
- mode – either 'fan_in' (default) or 'fan_out'. Choosing fan_in preserves the magnitude of the variance of the weights in the forward pass. Choosing fan_out preserves the magnitudes in the backwards pass.
>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_normal(w, mode='fan_out')
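A typical use is initializing a convolutional layer followed by ReLU, with mode='fan_out' to keep the backward-pass variance stable. A sketch using the newer `kaiming_normal_` name, which also accepts an explicit `nonlinearity` argument instead of only a slope `a`; the layer sizes here are arbitrary:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)
# He init for weights of a ReLU layer; fan_out preserves gradient variance
nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')
nn.init.constant_(conv.bias, 0.0)
```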
torch.nn.init.orthogonal(tensor, gain=1)
Fills the input Tensor or Variable with a (semi) orthogonal matrix, as described in “Exact solutions to the nonlinear dynamics of learning in deep linear neural networks” - Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened.
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable, where n >= 2
- gain – optional scaling factor
>>> w = torch.Tensor(3, 5)
>>> nn.init.orthogonal(w)
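For a wide matrix, semi-orthogonality means the rows come out orthonormal, which is easy to verify. A sketch using the newer `orthogonal_` name:

```python
import torch
import torch.nn as nn

w = torch.empty(3, 5)
nn.init.orthogonal_(w)

# rows of a 3×5 (semi-)orthogonal matrix are orthonormal: W · Wᵀ = I
print(torch.allclose(w @ w.t(), torch.eye(3), atol=1e-5))  # True
```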
torch.nn.init.sparse(tensor, sparsity, std=0.01)
Fills the 2D input Tensor or Variable as a sparse matrix, where the non-zero elements will be drawn from the normal distribution N(0, 0.01), as described in “Deep learning via Hessian-free optimization” - Martens, J. (2010).
Parameters:
- tensor – an n-dimensional torch.Tensor or autograd.Variable
- sparsity – the fraction of elements in each column to be set to zero
- std – the standard deviation of the normal distribution used to generate the non-zero values
>>> w = torch.Tensor(3, 5)
>>> nn.init.sparse(w, sparsity=0.1)
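The per-column sparsity can be checked directly. A sketch using the newer `sparse_` name; the 10×4 shape and 0.2 sparsity are arbitrary:

```python
import torch
import torch.nn as nn

w = torch.empty(10, 4)
nn.init.sparse_(w, sparsity=0.2)  # zero out 20% of each column

zeros_per_column = (w == 0).sum(dim=0)
print(zeros_per_column)  # 2 zeros in each of the 4 columns
```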