How are the parameter counts of an RNN and an LSTM calculated?
Environment:
python version: 3.7.4
pip version: 19.0.3
numpy version: 1.19.4
matplotlib version: 3.3.3
tensorflow version: 1.14.0
keras version: 2.1.5
Code:
from keras.layers import SimpleRNN
from keras.models import Model
from keras import Input
inputs = Input((None, 5))
simple_rnn = SimpleRNN(4)
output = simple_rnn(inputs)  # output shape: (batch_size, 4)
model = Model(inputs, output)
model.summary()
Output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) (None, None, 5) 0
_________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 4) 40
=================================================================
Total params: 40
Trainable params: 40
Non-trainable params: 0
_________________________________________________________________
Q: How is the param count of 40 for the simple_rnn_1 layer calculated?
1. Each input feature is fully connected to every recurrent unit, so the input-to-hidden connections contribute 4 x 5 = 20 parameters;
2. Each recurrent unit feeds its output back to every unit in the layer, so the recurrent connections contribute 4 x 4 = 16 parameters;
3. Each recurrent unit also has one bias, adding another 4 x 1 = 4 parameters.
Formula: (units x units) + (units x inputs) + units_bias
where:
- units is the number of recurrent units;
- inputs is the input dimension;
- units_bias is the number of bias terms, which equals the number of units.
In total: 20 + 16 + 4 = 40 parameters.
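The arithmetic above can be sanity-checked in plain Python, without Keras. The helper name `rnn_param_count` is ours, chosen only for illustration:

```python
def rnn_param_count(units, input_dim):
    """Parameters of a SimpleRNN layer:
    recurrent kernel (units x units) + input kernel (units x input_dim) + bias (units)."""
    return units * units + units * input_dim + units

print(rnn_param_count(4, 5))  # matches the 40 reported by model.summary()
```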
Calculating LSTM parameters
Code:
from keras.layers import LSTM,Input
from keras.models import Model
inputs = Input(shape=(None, 5))  # Input() is used to instantiate a Keras tensor.
x = LSTM(4, return_sequences=True)(inputs)
model = Model(inputs, x)
model.summary()
summary output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, None, 5) 0
_________________________________________________________________
lstm_1 (LSTM) (None, None, 4) 160
=================================================================
Total params: 160
Trainable params: 160
Non-trainable params: 0
_________________________________________________________________
Q: How is the param count of 160 for the lstm_1 layer calculated?
A:
Compared with a simple RNN, each LSTM cell has three gates: the input (update) gate, the forget gate, and the output gate.
- Input gate: one set of weights and a bias;
- Forget gate: one set of weights and a bias;
- Output gate: one set of weights and a bias;
- plus the tanh candidate state: one more set of weights and a bias.
So each LSTM cell carries four sets of parameters where an RNN cell carries one, and the LSTM parameter count is the RNN count multiplied by 4.
The RNN has 20 + 16 + 4 = 40 parameters;
the LSTM therefore has 40 x 4 = 160 parameters.
Formula: [(units x units) + (units x inputs) + units_bias] x 4
where:
- units is the number of recurrent units;
- inputs is the input dimension;
- units_bias is the number of bias terms, which equals the number of units.
REFERENCE:
- simple_rnn
- how-to-calculate-the-number-of-parameters-of-an-lstm-network
- long-short-term-memory-lstm-KXoay
