您的位置：首页 > 其它

keras中文文档笔记9——关于keras层

2017-08-19 10:47 489 查看

概述

所有的Keras层对象都有如下方法：

layer.get_weights()：返回层的权重（numpy array）

layer.set_weights(weights)：从numpy array中将权重加载到该层中，要求numpy array的形状与layer.get_weights()的形状相同

layer.get_config()：返回当前层配置信息的字典，层也可以借由配置信息重构

layer = Dense(32)
config = layer.get_config()
reconstructed_layer = Dense.from_config(config)

或者：

from keras import layers

config = layer.get_config()
layer = layers.deserialize({'class_name': layer.__class__.__name__,'config': config})

如果层仅有一个计算节点（即该层不是共享层），则可以通过下列方法获得输入张量、输出张量、输入数据的形状和输出数据的形状：

layer.input

layer.output

layer.input_shape

layer.output_shape

如果该层有多个计算节点（参考层计算节点和共享层）。可以使用下面的方法：

layer.get_input_at(node_index)

layer.get_output_at(node_index)

layer.get_input_shape_at(node_index)

layer.get_output_shape_at(node_index)

常用层

Dense层

Dense就是常用的全连接层，所实现的运算是output =activation(dot(input, kernel)+bias)。

keras.layers.core.Dense(units,
activation=None,
use_bias=True,
kernel_initializer='glorot_uniform', bias_initializer='zeros',
kernel_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
bias_constraint=None)

如果本层的输入数据的维度大于2，则会先被压为与kernel相匹配的大小。

这里是一个使用示例：

# as first layer in a sequential model:
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)

# after the first layer, you don't need to specify
# the size of the input anymore:
model.add(Dense(32))

参数：

units：大于0的整数，代表该层的输出维度。

activation：激活函数，为预定义的激活函数名（参考激活函数），或逐元素（element-wise）的Theano函数。如果不指定该参数，将不会使用任何激活函数（即使用线性激活函数：a(x)=x）

use_bias: 布尔值，是否使用偏置项

kernel_initializer：权值初始化方法，为预定义初始化方法名的字符串，或用于初始化权重的初始化器。参考initializers

bias_initializer：权值初始化方法，为预定义初始化方法名的字符串，或用于初始化权重的初始化器。参考initializers

kernel_regularizer：施加在权重上的正则项，为Regularizer对象

bias_regularizer：施加在偏置向量上的正则项，为Regularizer对象

activity_regularizer：施加在输出上的正则项，为Regularizer对象

kernel_constraints：施加在权重上的约束项，为Constraints对象

bias_constraints：施加在偏置上的约束项，为Constraints对象

输入：

形如(nb_samples, …, input_shape[1])的nD张量，最常见的情况为(nb_samples, input_dim)的2D张量

输出：

形如(nb_samples, …, units)的nD张量，最常见的情况(nb_samples, output_dim)的2D张量

Activation层

keras.layers.core.Activation(activation)

激活层对一个层的输出施加激活函数。

Dropout层

keras.layers.core.Dropout(rate, noise_shape=None, seed=None)

为输入数据施加Dropout。Dropout将在训练过程中每次更新参数时随机断开一定百分比（rate）的输入神经元，Dropout层用于防止过拟合。

参数：

rate：0~1的浮点数，控制需要断开的神经元的比例

noise_shape：整数张量，为将要应用在输入上的二值Dropout mask的shape，例如你的输入为(batch_size, timesteps, features)，并且你希望在各个时间步上的Dropout mask都相同，则可传入noise_shape=(batch_size, 1, features)。

seed：整数，使用的随机数种子

Flatten层

keras.layers.core.Flatten()

Flatten层用来将输入“压平”，即把多维的输入一维化，常用在从卷积层到全连接层的过渡。Flatten不影响batch的大小。

例子：

model = Sequential()
model.add(Convolution2D(64, 3, 3,
border_mode='same',
input_shape=(3, 32, 32)))
# now: model.output_shape == (None, 64, 32, 32)

model.add(Flatten())
# now: model.output_shape == (None, 65536)

Reshape层

keras.layers.core.Reshape(target_shape)

Reshape层用来将输入shape转换为特定的shape。

例子：

# as first layer in a Sequential model
model = Sequential()
model.add(Reshape((3, 4), input_shape=(12,)))
# now: model.output_shape == (None, 3, 4)
# note: `None` is the batch dimension

# as intermediate layer in a Sequential model
model.add(Reshape((6, 2)))
# now: model.output_shape == (None, 6, 2)

# also supports shape inference using `-1` as dimension
model.add(Reshape((-1, 2, 2)))
# now: model.output_shape == (None, 3, 2, 2)

Permute层

keras.layers.core.Permute(dims)

Permute层将输入的维度按照给定模式进行重排，例如，当需要将RNN和CNN网络连接时，可能会用到该层。

参数：

dims：整数tuple，指定重排的模式，不包含样本数的维度。重拍模式的下标从1开始。例如（2，1）代表将输入的第二个维度重拍到输出的第一个维度，而将输入的第一个维度重排到第二个维度

例子

model = Sequential()
model.add(Permute((2, 1), input_shape=(10, 64)))
# now: model.output_shape == (None, 64, 10)
# note: `None` is the batch dimension

RepeatVector层

keras.layers.core.RepeatVector(n)

RepeatVector层将输入重复n次

例子

model = Sequential()
model.add(Dense(32, input_dim=32))
# now: model.output_shape == (None, 32)
# note: `None` is the batch dimension

model.add(RepeatVector(3))
# now: model.output_shape == (None, 3, 32)

Lambda层

keras.layers.core.Lambda(function, output_shape=None, mask=None, arguments=None)

本函数用以对上一层的输出施以任何Theano/TensorFlow表达式

例子

# add a x -> x^2 layer
model.add(Lambda(lambda x: x ** 2))

# add a layer that returns the concatenation
# of the positive part of the input and
# the opposite of the negative part

def antirectifier(x):
x -= K.mean(x, axis=1, keepdims=True)
x = K.l2_normalize(x, axis=1)
pos = K.relu(x)
neg = K.relu(-x)
return K.concatenate([pos, neg], axis=1)

def antirectifier_output_shape(input_shape):
shape = list(input_shape)
assert len(shape) == 2  # only valid for 2D tensors
shape[-1] *= 2
return tuple(shape)

model.add(Lambda(antirectifier,
output_shape=antirectifier_output_shape))

ActivityRegularizer层

keras.layers.core.ActivityRegularization(l1=0.0, l2=0.0)

经过本层的数据不会有任何变化，但会基于其激活值更新损失函数值

参数

Masking层

keras.layers.core.Masking(mask_value=0.0)

使用给定的值对输入的序列信号进行“屏蔽”，用以定位需要跳过的时间步

对于输入张量的时间步，即输入张量的第1维度（维度从0开始算，见例子），如果输入张量在该时间步上都等于mask_value，则该时间步将在模型接下来的所有层（只要支持masking）被跳过（屏蔽）。

如果模型接下来的一些层不支持masking，却接受到masking过的数据，则抛出异常。

例子

考虑输入数据x是一个形如(samples,timesteps,features)的张量，现将其送入LSTM层。因为你缺少时间步为3和5的信号，所以你希望将其掩盖。这时候应该：

赋值x[:,3,:] = 0.，x[:,5,:] = 0.

在LSTM层之前插入mask_value=0.的Masking层

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(timesteps, features)))
model.add(LSTM(32))

卷积层

Conv1D层

keras.layers.convolutional.Conv1D(filters, kernel_size, strides=1, padding='valid', dilation_rate=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

一维卷积层（即时域卷积），用以在一维输入信号上进行邻域滤波。当使用该层作为首层时，需要提供关键字参数input_shape。例如(10,128)代表一个长为10的序列，序列中每个信号为128向量。而(None, 128)代表变长的128维向量序列。

该层生成将输入信号与卷积核按照单一的空域（或时域）方向进行卷积。如果use_bias=True，则还会加上一个偏置项，若activation不为None，则输出为经过激活函数的输出。

【Tips】可以将Convolution1D看作Convolution2D的快捷版，对例子中（10，32）的信号进行1D卷积相当于对其进行卷积核为（filter_length, 32）的2D卷积。

Conv2D层

keras.layers.convolutional.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

二维卷积层，即对图像的空域卷积。该层对二维输入进行滑动窗卷积，当使用该层作为第一层时，应提供input_shape参数。例如input_shape = (128,128,3)代表128*128的彩色RGB图像（data_format=’channels_last’）

SeparableConv2D层

keras.layers.convolutional.SeparableConv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, depth_multiplier=1, activation=None, use_bias=True, depthwise_initializer='glorot_uniform', pointwise_initializer='glorot_uniform', bias_initializer='zeros', depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None, depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None)

该层是在深度方向上的可分离卷积。

可分离卷积首先按深度方向进行卷积（对每个输入通道分别卷积），然后逐点进行卷积，将上一步的卷积结果混合到输出通道中。参数depth_multiplier控制了在depthwise卷积（第一步）的过程中，每个输入通道信号产生多少个输出通道。

直观来说，可分离卷积可以看做讲一个卷积核分解为两个小的卷积核，或看作Inception模块的一种极端情况。

当使用该层作为第一层时，应提供input_shape参数。例如input_shape = (3,128,128)代表128*128的彩色RGB图像

Conv2DTranspose层

keras.layers.convolutional.Conv2DTranspose(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

该层是转置的卷积操作（反卷积）。需要反卷积的情况通常发生在用户想要对一个普通卷积的结果做反方向的变换。例如，将具有该卷积层输出shape的tensor转换为具有该卷积层输入shape的tensor。同时保留与卷积层兼容的连接模式。

当使用该层作为第一层时，应提供input_shape参数。例如input_shape = (3,128,128)代表128*128的彩色RGB图像

Conv3D层

keras.layers.convolutional.Conv3D(filters, kernel_size, strides=(1, 1, 1), padding='valid', data_format=None, dilation_rate=(1, 1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

三维卷积对三维的输入进行滑动窗卷积，当使用该层作为第一层时，应提供input_shape参数。例如input_shape = (3,10,128,128)代表对10帧128*128的彩色RGB图像进行卷积。数据的通道位置仍然有data_format参数指定。

Cropping1D层

keras.layers.convolutional.Cropping1D(cropping=(1, 1))

在时间轴（axis1）上对1D输入（即时间序列）进行裁剪

Cropping2D层

keras.layers.convolutional.Cropping2D(cropping=((0, 0), (0, 0)), data_format=None)

对2D输入（图像）进行裁剪，将在空域维度，即宽和高的方向上裁剪

Cropping3D层

keras.layers.convolutional.Cropping3D(cropping=((1, 1), (1, 1), (1, 1)), data_format=None)

对2D输入（图像）进行裁剪

UpSampling1D层

keras.layers.convolutional.UpSampling1D(size=2)

在时间轴上，将每个时间步重复length次

UpSampling2D层

keras.layers.convolutional.UpSampling2D(size=(2, 2), data_format=None)

将数据的行和列分别重复size[0]和size[1]次

UpSampling3D层

keras.layers.convolutional.UpSampling3D(size=(2, 2, 2), data_format=None)

将数据的三个维度上分别重复size[0]、size[1]和ize[2]次

本层目前只能在使用Theano为后端时可用

ZeroPadding1D层

keras.layers.convolutional.ZeroPadding1D(padding=1)

对1D输入的首尾端（如时域序列）填充0，以控制卷积以后向量的长度

ZeroPadding2D层

keras.layers.convolutional.ZeroPadding2D(padding=(1, 1), data_format=None)

对2D输入（如图片）的边界填充0，以控制卷积以后特征图的大小

ZeroPadding3D层

keras.layers.convolutional.ZeroPadding3D(padding=(1, 1, 1), data_format=None)

将数据的三个维度上填充0

本层目前只能在使用Theano为后端时可用

池化层

MaxPooling1D层

keras.layers.pooling.MaxPooling1D(pool_size=2, strides=None, padding='valid')

对时域1D信号进行最大值池化

参数

pool_size：整数，池化窗口大小

strides：整数或None，下采样因子，例如设2将会使得输出shape为输入的一半，若为None则默认值为pool_size。

padding：‘valid’或者‘same’

输入shape

形如（samples，steps，features）的3D张量

输出shape

形如（samples，downsampled_steps，features）的3D张量

MaxPooling2D层

keras.layers.pooling.MaxPooling2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None)

为空域信号施加最大值池化

参数

pool_size：整数或长为2的整数tuple，代表在两个方向（竖直，水平）上的下采样因子，如取（2，2）将使图片在两个维度上均变为原长的一半。为整数意为各个维度值相同且为该数字。

strides：整数或长为2的整数tuple，或者None，步长值。

border_mode：‘valid’或者‘same’

data_format：字符串，“channels_first”或“channels_last”之一，代表图像的通道维的位置。该参数是Keras 1.x中image_dim_ordering，“channels_last”对应原本的“tf”，“channels_first”对应原本的“th”。以128x128的RGB图像为例，“channels_first”应将数据组织为（3,128,128），而“channels_last”应将数据组织为（128,128,3）。该参数的默认值是~/.keras/keras.json中设置的值，若从未设置过，则为“channels_last”。

输入shape

‘channels_first’模式下，为形如（samples，channels, rows，cols）的4D张量

‘channels_last’模式下，为形如（samples，rows, cols，channels）的4D张量

输出shape

‘channels_first’模式下，为形如（samples，channels, pooled_rows,

pooled_cols）的4D张量

‘channels_last’模式下，为形如（samples，pooled_rows,

pooled_cols，channels）的4D张量

MaxPooling3D层

keras.layers.pooling.MaxPooling3D(pool_size=(2, 2, 2), strides=None, padding='valid', data_format=None)

为3D信号（空域或时空域）施加最大值池化

本层目前只能在使用Theano为后端时可用

AveragePooling1D层

keras.layers.pooling.AveragePooling1D(pool_size=2, strides=None, padding='valid')

对时域1D信号进行平均值池化

AveragePooling2D层

keras.layers.pooling.AveragePooling2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None)

为空域信号施加平均值池化

AveragePooling3D层

keras.layers.pooling.AveragePooling3D(pool_size=(2, 2, 2), strides=None, padding='valid', data_format=None)

为3D信号（空域或时空域）施加平均值池化

本层目前只能在使用Theano为后端时可用

GlobalMaxPooling1D层

keras.layers.pooling.GlobalMaxPooling1D()

对于时间信号的全局最大池化

GlobalAveragePooling1D层

keras.layers.pooling.GlobalAveragePooling1D()

为时域信号施加全局平均值池化

GlobalMaxPooling2D层

keras.layers.pooling.GlobalMaxPooling2D(dim_ordering='default')

为空域信号施加全局最大值池化

GlobalAveragePooling2D层

keras.layers.pooling.GlobalAveragePooling2D(dim_ordering='default')

为空域信号施加全局平均值池化

局部连接层LocallyConnceted

LocallyConnected1D层

keras.layers.local.LocallyConnected1D(filters, kernel_size, strides=1, padding='valid', data_format=None, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

LocallyConnected1D层与Conv1D工作方式类似，唯一的区别是不进行权值共享。即施加在不同输入位置的滤波器是不一样的。

LocallyConnected2D层

keras.layers.local.LocallyConnected2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

LocallyConnected2D层与Convolution2D工作方式类似，唯一的区别是不进行权值共享。即施加在不同输入patch的滤波器是不一样的，当使用该层作为模型首层时，需要提供参数input_dim或input_shape参数。参数含义参考Convolution2D。

例子

# apply a 3x3 unshared weights convolution with 64 output filters on a 32x32 image
# with `data_format="channels_last"`:
model = Sequential()
model.add(LocallyConnected2D(64, (3, 3), input_shape=(32, 32, 3)))
# now model.output_shape == (None, 30, 30, 64)
# notice that this layer will consume (30*30)*(3*3*3*64) + (30*30)*64 parameters

# add a 3x3 unshared weights convolution on top, with 32 output filters:
model.add(LocallyConnected2D(32, (3, 3)))
# now model.output_shape == (None, 28, 28, 32)

循环层Recurrent

Recurrent层

keras.layers.recurrent.Recurrent(return_sequences=False, go_backwards=False, stateful=False, unroll=False, implementation=0)

这是循环层的抽象类，请不要在模型中直接应用该层（因为它是抽象类，无法实例化任何对象）。请使用它的子类LSTM，GRU或SimpleRNN。

所有的循环层（LSTM,GRU,SimpleRNN）都服从本层的性质，并接受本层指定的所有关键字参数。

例子

# as the first layer in a Sequential model
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
# now model.output_shape == (None, 32)
# note: `None` is the batch dimension.

# the following is identical:
model = Sequential()
model.add(LSTM(32, input_dim=64, input_length=10))

# for subsequent layers, no need to specify the input size:
model.add(LSTM(16))

# to stack recurrent layers, you must use return_sequences=True
# on any recurrent layer that feeds into another recurrent layer.
# note that you only need to specify the input size on the first layer.
model = Sequential()
model.add(LSTM(64, input_dim=64, input_length=10, return_sequences=True))
model.add(LSTM(32, return_sequences=True))
model.add(LSTM(10))
SimpleRNN层

keras.layers.recurrent.SimpleRNN(units, activation='tanh', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)

全连接RNN网络，RNN的输出会被回馈到输入

GRU层

keras.layers.recurrent.GRU(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)

门限循环单元（详见参考文献）

LSTM层

keras.layers.recurrent.LSTM(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)

Keras长短期记忆模型

嵌入层 Embedding

Embedding层

keras.layers.embeddings.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None)

嵌入层将正整数（下标）转换为具有固定大小的向量，如[[4],[20]]->[[0.25,0.1],[0.6,-0.2]]

Embedding层只能作为模型的第一层

例子

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# the model will take as input an integer matrix of size (batch, input_length).
# the largest integer (i.e. word index) in the input should be no larger than 999 (vocabulary size).
# now model.output_shape == (None, 10, 64), where None is the batch dimension.

input_array = np.random.randint(1000, size=(32, 10))

model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
assert output_array.shape == (32, 10, 64)

Merge层

Merge层提供了一系列用于融合两个层或两个张量的层对象和方法。以大写首字母开头的是Layer类，以小写字母开头的是张量的函数。小写字母开头的张量函数在内部实际上是调用了大写字母开头的层。

Add

keras.layers.merge.Add()

将Layer that adds a list of inputs.

该层接收一个列表的同shape张量，并返回它们的和，shape不变。

Multiply

keras.layers.merge.Multiply()

该层接收一个列表的同shape张量，并返回它们的逐元素积的张量，shape不变。

Average

keras.layers.merge.Average()

该层接收一个列表的同shape张量，并返回它们的逐元素均值，shape不变。

Maximum

keras.layers.merge.Maximum()

该层接收一个列表的同shape张量，并返回它们的逐元素最大值，shape不变。

Concatenate

keras.layers.merge.Concatenate(axis=-1)

该层接收一个列表的同shape张量，并返回它们的按照给定轴相接构成的向量。

Dot

keras.layers.merge.Dot(axes, normalize=False)

计算两个tensor中样本的张量乘积。例如，如果两个张量a和b的shape都为（batch_size, n），则输出为形如（batch_size,1）的张量，结果张量每个batch的数据都是a[i,:]和b[i,:]的矩阵（向量）点积。

add

add(inputs)

Add层的函数式包装

multiply

multiply(inputs)

Multiply的函数包装

average

average(inputs)

Average的函数包装

maximum

maximum(inputs)

Maximum的函数包装

concatenate

concatenate(inputs, axis=-1))

Concatenate的函数包装

dot

dot(inputs, axes, normalize=False)

Dot的函数包装

高级激活层Advanced Activation

LeakyReLU层

keras.layers.advanced_activations.LeakyReLU(alpha=0.3)

LeakyRelU是修正线性单元（Rectified Linear Unit，ReLU）的特殊版本，当不激活时，LeakyReLU仍然会有非零输出值，从而获得一个小梯度，避免ReLU可能出现的神经元“死亡”现象。即，f(x)=alpha * x for x < 0, f(x) = x for x>=0

PReLU层

keras.layers.advanced_activations.PReLU(alpha_initializer='zeros', alpha_regularizer=None, alpha_constraint=None, shared_axes=None)

该层为参数化的ReLU（Parametric ReLU），表达式是：f(x) = alpha * x for x < 0, f(x) = x for x>=0，此处的alpha为一个与xshape相同的可学习的参数向量。

ELU层

keras.layers.advanced_activations.ELU(alpha=1.0)

ELU层是指数线性单元（Exponential Linera Unit），表达式为：该层为参数化的ReLU（Parametric ReLU），表达式是：f(x) = alpha * (exp(x) - 1.) for x < 0, f(x) = x for x>=0

ThresholdedReLU层

keras.layers.advanced_activations.ThresholdedReLU(theta=1.0)

该层是带有门限的ReLU，表达式是：f(x) = x for x > theta,f(x) = 0 otherwise

规范化BatchNormalization

BatchNormalization层

keras.layers.normalization.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', moving_mean_initializer='zeros', moving_variance_initializer='ones', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None)

该层在每个batch上将前一层的激活值重新规范化，即使得其输出数据的均值接近0，其标准差接近1。

噪声层Noise

GaussianNoise层

keras.layers.noise.GaussianNoise(stddev)

为数据施加0均值，标准差为stddev的加性高斯噪声。该层在克服过拟合时比较有用，你可以将它看作是随机的数据提升。高斯噪声是需要对输入数据进行破坏时的自然选择。

因为这是一个起正则化作用的层，该层只在训练时才有效。

GaussianDropout层

keras.layers.noise.GaussianDropout(rate)

为层的输入施加以1为均值，标准差为sqrt(rate/(1-rate)的乘性高斯噪声。

因为这是一个起正则化作用的层，该层只在训练时才有效。

包装器Wrapper

TimeDistributed包装器

keras.layers.wrappers.TimeDistributed(layer)

该包装器可以把一个层应用到输入的每一个时间步上

Bidirectional包装器

keras.layers.wrappers.Bidirectional(layer, merge_mode='concat', weights=None)

双向RNN包装器

例子

model = Sequential()
model.add(Bidirectional(LSTM(10, return_sequences=True), input_shape=(5, 10)))
model.add(Bidirectional(LSTM(10)))
model.add(Dense(5))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

编写自己的层

对于简单的定制操作，我们或许可以通过使用layers.core.Lambda层来完成。但对于任何具有可训练权重的定制层，你应该自己来实现。

这里是一个Keras2的层应该具有的框架结构(如果你的版本更旧请升级)，要定制自己的层，你需要实现下面三个方法

build(input_shape)：这是定义权重的方法，可训练的权应该在这里被加入列表`self.trainable_weights中。其他的属性还包括self.non_trainabe_weights（列表）和self.updates（需要更新的形如（tensor, new_tensor）的tuple的列表）。你可以参考BatchNormalization层的实现来学习如何使用上面两个属性。这个方法必须设置self.built = True，可通过调用super([layer],self).build()实现

call(x)：这是定义层功能的方法，除非你希望你写的层支持masking，否则你只需要关心call的第一个参数：输入张量

compute_output_shape(input_shape)：如果你的层修改了输入数据的shape，你应该在这里指定shape变化的方法，这个函数使得Keras可以做自动shape推断

from keras import backend as K
from keras.engine.topology import Layer
import numpy as np

class MyLayer(Layer):

def __init__(self, output_dim, **kwargs):
self.output_dim = output_dim
super(MyLayer, self).__init__(**kwargs)

def build(self, input_shape):
# Create a trainable weight variable for this layer.
self.kernel = self.add_weight(shape=(input_shape[1], self.output_dim),
initializer='uniform',
trainable=True)
super(MyLayer, self).build(input_shape)  # Be sure to call this somewhere!

def call(self, x):
return K.dot(x, self.kernel)

def compute_output_shape(self, input_shape):
return (input_shape[0], self.output_dim)

现存的Keras层代码可以为你的实现提供良好参考，阅读源代码吧！

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航