Weight Initialization for GRU and LSTM in TensorFlow
2017-10-18 08:47
GRU and LSTM Weight Initialization
When writing a model, you sometimes want the RNN's weight matrices initialized in a
particular way, such as Xavier or orthogonal initialization. In that case, all you need is:
```python
cell = LSTMCell if self.args.use_lstm else GRUCell
# tf.variable_scope requires a scope name; "birnn" is added here
# (the original snippet omitted it). The key part is the initializer.
with tf.variable_scope("birnn", initializer=tf.orthogonal_initializer()):
    input = tf.nn.embedding_lookup(embedding, questions_bt)
    cell_fw = MultiRNNCell(cells=[cell(hidden_size) for _ in range(num_layers)])
    cell_bw = MultiRNNCell(cells=[cell(hidden_size) for _ in range(num_layers)])
    outputs, last_states = tf.nn.bidirectional_dynamic_rnn(cell_bw=cell_bw,
                                                           cell_fw=cell_fw,
                                                           dtype="float32",
                                                           inputs=input,
                                                           swap_memory=True)
```
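As a side note on what orthogonal initialization actually produces, here is a minimal NumPy sketch (the function name and shapes are illustrative, not from the post): take a random Gaussian matrix, QR-decompose it, and keep the orthonormal factor.

```python
import numpy as np

def orthogonal_init(shape, gain=1.0, seed=0):
    """Sketch of an orthogonal initializer: QR-decompose a random
    Gaussian matrix and keep the orthonormal factor Q."""
    rng = np.random.RandomState(seed)
    a = rng.normal(0.0, 1.0, shape)
    q, r = np.linalg.qr(a)
    # Fix the sign ambiguity of the QR decomposition for determinism.
    q *= np.sign(np.diag(r))
    return gain * q

w = orthogonal_init((4, 4))
# Columns are orthonormal, so w.T @ w is (numerically) the identity.
print(np.allclose(w.T @ w, np.eye(4)))  # prints True
```

An orthogonal weight matrix preserves vector norms under multiplication, which is why it is a popular choice for recurrent weights.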
Why does this work? Tracing the TensorFlow 1.1.0 source: `bidirectional_dynamic_rnn` builds each direction inside a nested scope (shown here for the forward direction), so whatever is set on the enclosing scope applies inside it:

```python
with vs.variable_scope("fw") as fw_scope:
    output_fw, output_state_fw = dynamic_rnn(
        cell=cell_fw, inputs=inputs, sequence_length=sequence_length,
        initial_state=initial_state_fw, dtype=dtype,
        parallel_iterations=parallel_iterations, swap_memory=swap_memory,
        time_major=time_major, scope=fw_scope)
```
`dynamic_rnn` in turn hands the cell over to `_dynamic_rnn_loop`:

```python
(outputs, final_state) = _dynamic_rnn_loop(
    cell,
    inputs,
    state,
    parallel_iterations=parallel_iterations,
    swap_memory=swap_memory,
    sequence_length=sequence_length,
    dtype=dtype)
```
Inside the loop, the cell is finally invoked:

```python
call_cell = lambda: cell(input_t, state)
```
For a GRU this calls `GRUCell.__call__`, where the gate and candidate weights are created through `_linear`:

```python
def __call__(self, inputs, state, scope=None):
    """Gated recurrent unit (GRU) with nunits cells."""
    with _checked_scope(self, scope or "gru_cell", reuse=self._reuse):
        with vs.variable_scope("gates"):  # Reset gate and update gate.
            # We start with bias of 1.0 to not reset and not update.
            value = sigmoid(_linear(
                [inputs, state], 2 * self._num_units, True, 1.0))
            r, u = array_ops.split(
                value=value, num_or_size_splits=2, axis=1)
        with vs.variable_scope("candidate"):
            c = self._activation(_linear([inputs, r * state],
                                         self._num_units, True))
        new_h = u * state + (1 - u) * c
    return new_h, new_h
```
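The gate arithmetic in `__call__` can be reproduced in plain NumPy. This is a sketch with made-up shapes, where `_linear`'s internally created variables are replaced by explicit weight and bias arguments:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, w_gates, b_gates, w_cand, b_cand):
    """One GRU step mirroring GRUCell.__call__: reset/update gates from
    a single 2*num_units linear map, candidate from [x, r * h]."""
    num_units = h.shape[-1]
    value = sigmoid(np.concatenate([x, h], axis=-1) @ w_gates + b_gates)
    r, u = value[..., :num_units], value[..., num_units:]
    c = np.tanh(np.concatenate([x, r * h], axis=-1) @ w_cand + b_cand)
    return u * h + (1 - u) * c

input_size, num_units = 3, 4
x = np.ones((1, input_size))
h = np.zeros((1, num_units))
w_gates = np.zeros((input_size + num_units, 2 * num_units))
b_gates = np.ones(2 * num_units)  # gate bias of 1.0, as in the source
w_cand = np.zeros((input_size + num_units, num_units))
b_cand = np.zeros(num_units)
new_h = gru_step(x, h, w_gates, b_gates, w_cand, b_cand)
```

With all-zero weights the candidate `c` is zero and the state stays at zero, which just checks the plumbing; real weights would be created (and initialized) by `_linear`.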
`_linear` creates the weight matrix with `vs.get_variable` and no explicit initializer, so it picks up whatever initializer the enclosing variable_scope provides; only the bias gets an explicit constant initializer:

```python
with vs.variable_scope(scope) as outer_scope:
    weights = vs.get_variable(
        _WEIGHTS_VARIABLE_NAME, [total_arg_size, output_size], dtype=dtype)
    # ... some code
    with vs.variable_scope(outer_scope) as inner_scope:
        inner_scope.set_partitioner(None)
        biases = vs.get_variable(
            _BIAS_VARIABLE_NAME, [output_size],
            dtype=dtype,
            initializer=init_ops.constant_initializer(bias_start, dtype=dtype))
```
Our tests confirm that with nested variable_scopes, if the inner scope defines no initializer, the outer scope's initializer is used. So the conclusion follows:
In the TensorFlow 1.1.0 implementations of these two RNN variants, it is enough to wrap the call in a variable_scope that carries an initializer; the weights will then be initialized that way.
However, neither LSTM nor GRU exposes an initializer for the biases (although an initial bias value can apparently be supplied, e.g. the `bias_start` argument that `_linear` receives).
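The scope-inheritance rule we tested can be sketched outside TensorFlow with a toy model of variable_scope (all names here are illustrative, not TensorFlow API):

```python
class Scope:
    """Toy model of variable_scope initializer inheritance: a scope
    without its own initializer falls back to its parent's."""
    def __init__(self, parent=None, initializer=None):
        self.parent = parent
        self._initializer = initializer

    @property
    def initializer(self):
        if self._initializer is not None:
            return self._initializer
        return self.parent.initializer if self.parent else None

orthogonal = object()        # stand-in for tf.orthogonal_initializer()
outer = Scope(initializer=orthogonal)
inner = Scope(parent=outer)  # no initializer of its own
# The inner scope inherits the outer scope's initializer,
# which is why wrapping the cell call in one scope is enough.
print(inner.initializer is orthogonal)  # prints True
```

This mirrors why setting the initializer once, on the outermost scope, reaches every `get_variable` call nested beneath it.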
Original post: http://cairohy.github.io/2017/05/05/ml-coding-summarize/Tensorflow%E4%B8%ADGRU%E5%92%8CLSTM%E7%9A%84%E6%9D%83%E9%87%8D%E5%88%9D%E5%A7%8B%E5%8C%96/