LSTM cell with layer normalization and recurrent dropout.
```r
layer_norm_lstm_cell(
  object,
  units,
  activation = "tanh",
  recurrent_activation = "sigmoid",
  use_bias = TRUE,
  kernel_initializer = "glorot_uniform",
  recurrent_initializer = "orthogonal",
  bias_initializer = "zeros",
  unit_forget_bias = TRUE,
  kernel_regularizer = NULL,
  recurrent_regularizer = NULL,
  bias_regularizer = NULL,
  kernel_constraint = NULL,
  recurrent_constraint = NULL,
  bias_constraint = NULL,
  dropout = 0,
  recurrent_dropout = 0,
  norm_gamma_initializer = "ones",
  norm_beta_initializer = "zeros",
  norm_epsilon = 0.001,
  ...
)
```
Argument | Description |
---|---|
object | Model or layer object |
units | Positive integer, dimensionality of the output space. |
activation | Activation function to use. Default: hyperbolic tangent (`tanh`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`). |
recurrent_activation | Activation function to use for the recurrent step. Default: sigmoid (`sigmoid`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`). |
use_bias | Boolean, whether the layer uses a bias vector. |
kernel_initializer | Initializer for the `kernel` weights matrix, used for the linear transformation of the inputs. |
recurrent_initializer | Initializer for the `recurrent_kernel` weights matrix, used for the linear transformation of the recurrent state. |
bias_initializer | Initializer for the bias vector. |
unit_forget_bias | Boolean. If TRUE, add 1 to the bias of the forget gate at initialization. Setting it to TRUE also forces `bias_initializer = "zeros"`. This is recommended in [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). |
kernel_regularizer | Regularizer function applied to the `kernel` weights matrix. |
recurrent_regularizer | Regularizer function applied to the `recurrent_kernel` weights matrix. |
bias_regularizer | Regularizer function applied to the bias vector. |
kernel_constraint | Constraint function applied to the `kernel` weights matrix. |
recurrent_constraint | Constraint function applied to the `recurrent_kernel` weights matrix. |
bias_constraint | Constraint function applied to the bias vector. |
dropout | Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. |
recurrent_dropout | Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. |
norm_gamma_initializer | Initializer for the layer normalization gain initial value. |
norm_beta_initializer | Initializer for the layer normalization shift initial value. |
norm_epsilon | Float, the epsilon value for normalization layers. |
... | Additional named arguments passed on to layer creation. |
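For illustration, a minimal construction sketch; assuming the tfaddons package is loaded, and that calling the function without `object` returns a standalone cell instance (the specific argument values below are illustrative only):

```r
library(tfaddons)

# Layer-normalized LSTM cell: 64 output units, 10% recurrent dropout,
# default layer-normalization gain/shift initializers, epsilon of 1e-3.
cell <- layer_norm_lstm_cell(
  units = 64,
  recurrent_dropout = 0.1,
  norm_epsilon = 1e-3
)
```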
A tensor
This class adds layer normalization and recurrent dropout to an LSTM cell. The layer normalization implementation follows "Layer Normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton (https://arxiv.org/abs/1607.06450) and is applied before the internal nonlinearities. Recurrent dropout follows "Recurrent Dropout without Memory Loss" by Stanislau Semeniuta, Aliaksei Severyn, and Erhardt Barth (https://arxiv.org/abs/1603.05118).
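As a sketch of how such a cell is typically consumed, the example below wraps it in a generic RNN layer inside a small sequential model; the choice of `keras::layer_rnn()` as the wrapper, the input shape, and the unit counts are assumptions for illustration, not something prescribed by this page:

```r
library(keras)
library(tfaddons)

# Hypothetical setup: sequences of 20 time steps with 8 features each.
model <- keras_model_sequential() %>%
  layer_rnn(
    # Layer-normalized LSTM cell with 32 units and 10% recurrent dropout
    cell = layer_norm_lstm_cell(units = 32, recurrent_dropout = 0.1),
    input_shape = c(20, 8)
  ) %>%
  layer_dense(units = 1)

summary(model)
```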