How to calculate the number of parameters of an RNN and an LSTM?

Posted by yaohong on Monday, November 23, 2020


How to calculate the number of parameters of an RNN and an LSTM?

Environment:

python version: 3.7.4
pip version: 19.0.3
numpy version: 1.19.4
matplotlib version: 3.3.3
tensorflow version: 1.14.0
keras version: 2.1.5

The code is as follows:

from keras.layers import SimpleRNN
from keras.models import Model
from keras import Input

inputs = Input((None, 5))      # variable-length sequences with 5 features per step
simple_rnn = SimpleRNN(4)      # 4 recurrent units

output = simple_rnn(inputs)    # the output has shape (None, 4): (batch, units)
model = Model(inputs, output)
model.summary()

Output:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_4 (InputLayer)         (None, None, 5)           0         
_________________________________________________________________
simple_rnn_1 (SimpleRNN)     (None, 4)                 40        
=================================================================
Total params: 40
Trainable params: 40
Non-trainable params: 0
_________________________________________________________________

How is the parameter count of 40 for simple_rnn_1 calculated?

1. Every input feature is fully connected to every neuron in the recurrent layer, so the input-to-recurrent weights contribute 4 × 5 = 20 parameters;

2. Every neuron in the recurrent layer feeds its output back to every neuron in the same layer, so the recurrent weights contribute 4 × 4 = 16 parameters;

3. Every neuron in the recurrent layer has one bias, adding another 4 × 1 = 4 parameters.

The formula is: (units × units) + (units × inputs) + units_bias

where units is the number of recurrent neurons, inputs is the number of input features, and units_bias is the number of bias values, which equals the number of neurons.

This gives 20 + 16 + 4 = 40 parameters in total.
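
As a quick sanity check, here is a minimal sketch (reusing the model built in the SimpleRNN example above) that evaluates the formula directly and compares it against the shapes of the layer's weight arrays:

# Evaluate the formula for units = 4, inputs = 5.
units, inputs = 4, 5
print(units * units + units * inputs + units)  # 40 = 16 + 20 + 4

# The same breakdown can be read off the layer itself:
# kernel (input -> hidden), recurrent_kernel (hidden -> hidden), and bias.
for w in model.layers[-1].get_weights():
    print(w.shape)  # (5, 4), (4, 4), (4,)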

LSTM parameter calculation

Code:

from keras.layers import LSTM, Input
from keras.models import Model

inputs = Input(shape=(None, 5))             # Input() is used to instantiate a Keras tensor.
x = LSTM(4, return_sequences=True)(inputs)  # 4 units, return the full output sequence
model = Model(inputs, x)
model.summary()

Summary output:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, 5)           0         
_________________________________________________________________
lstm_1 (LSTM)                (None, None, 4)           160       
=================================================================
Total params: 160
Trainable params: 160
Non-trainable params: 0
_________________________________________________________________

Q: How is the parameter count of 160 for the lstm_1 layer in the summary calculated?

A:

(Figure: LSTM gates)

Compared with a simple RNN, each LSTM cell has three gates: an update gate, a forget gate, and an output gate.

Update gate: one set of weights (w) and a bias;
Output gate: one set of weights (w) and a bias;
Forget gate: one set of weights (w) and a bias;
plus one set of weights (w) and a bias for the tanh candidate (cell state) term.

In other words, each LSTM cell carries four such sets of parameters where a simple RNN cell carries only one, so the LSTM parameter count is the RNN parameter count multiplied by 4.

The RNN has 20 + 16 + 4 = 40 parameters;

so the LSTM has 40 × 4 = 160 parameters.
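
One way to see these four groups concretely is to inspect the weight arrays Keras stores for the LSTM layer: the parameters of the four gates are concatenated along the last axis, so each array is four times as wide as its SimpleRNN counterpart. A small sketch, reusing the model from the LSTM example above:

# Inspect the LSTM layer's weights (reuses the model from the LSTM code above).
# Keras concatenates the four gates' parameters along the last axis.
kernel, recurrent_kernel, bias = model.layers[-1].get_weights()
print(kernel.shape)            # (5, 16)  = inputs x (4 * units)
print(recurrent_kernel.shape)  # (4, 16)  = units  x (4 * units)
print(bias.shape)              # (16,)    = 4 * units
print(kernel.size + recurrent_kernel.size + bias.size)  # 160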

The formula is: [(units × units) + (units × inputs) + units_bias] × 4

where units is the number of recurrent neurons, inputs is the number of input features, and units_bias is the number of bias values, which equals the number of neurons.
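
Evaluating the formula for this example (a minimal sketch; model here is the LSTM model built above) gives the same 160 reported by model.summary():

# Evaluate the LSTM formula for units = 4, inputs = 5.
units, inputs = 4, 5
lstm_params = 4 * (units * units + units * inputs + units)
print(lstm_params)           # 160
print(model.count_params())  # 160, matches the summary above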

REFERENCE: simple_rnn

how-to-calculate-the-number-of-parameters-of-an-lstm-network

long-short-term-memory-lstm-KXoay
