LSTM cell with layer normalization and recurrent dropout.

layer_norm_lstm_cell(
  object,
  units,
  activation = "tanh",
  recurrent_activation = "sigmoid",
  use_bias = TRUE,
  kernel_initializer = "glorot_uniform",
  recurrent_initializer = "orthogonal",
  bias_initializer = "zeros",
  unit_forget_bias = TRUE,
  kernel_regularizer = NULL,
  recurrent_regularizer = NULL,
  bias_regularizer = NULL,
  kernel_constraint = NULL,
  recurrent_constraint = NULL,
  bias_constraint = NULL,
  dropout = 0,
  recurrent_dropout = 0,
  norm_gamma_initializer = "ones",
  norm_beta_initializer = "zeros",
  norm_epsilon = 0.001,
  ...
)
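
As a sketch of how the cell is typically used (assuming the keras and tfaddons packages are attached; the shapes and units below are purely illustrative), it can be wrapped in `layer_rnn()` to build a recurrent layer:

  library(keras)
  library(tfaddons)

  # Illustrative input: 10 timesteps, 8 features per step
  model <- keras_model_sequential() %>%
    layer_rnn(
      cell = layer_norm_lstm_cell(units = 64, recurrent_dropout = 0.2),
      input_shape = c(10, 8)
    ) %>%
    layer_dense(units = 1)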

Arguments

object

Model or layer object.

units

Positive integer, dimensionality of the output space.

activation

Activation function to use. Default: hyperbolic tangent (`tanh`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`).

recurrent_activation

Activation function to use for the recurrent step. Default: sigmoid (`sigmoid`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`).

use_bias

Boolean, whether the layer uses a bias vector.

kernel_initializer

Initializer for the `kernel` weights matrix, used for the linear transformation of the inputs.

recurrent_initializer

Initializer for the `recurrent_kernel` weights matrix, used for the linear transformation of the recurrent state.

bias_initializer

Initializer for the bias vector.

unit_forget_bias

Boolean. If TRUE, add 1 to the bias of the forget gate at initialization. Setting it to TRUE will also force `bias_initializer = "zeros"`. This is recommended in [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf).

kernel_regularizer

Regularizer function applied to the `kernel` weights matrix.

recurrent_regularizer

Regularizer function applied to the `recurrent_kernel` weights matrix.

bias_regularizer

Regularizer function applied to the bias vector.

kernel_constraint

Constraint function applied to the `kernel` weights matrix.

recurrent_constraint

Constraint function applied to the `recurrent_kernel` weights matrix.

bias_constraint

Constraint function applied to the bias vector.

dropout

Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.

recurrent_dropout

Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.

norm_gamma_initializer

Initializer for the layer normalization gain (`gamma`).

norm_beta_initializer

Initializer for the layer normalization shift (`beta`).

norm_epsilon

Float, the epsilon value for normalization layers.

...

Additional keyword arguments passed to the layer on creation.

Value

A tensor

Details

This class adds layer normalization and recurrent dropout to an LSTM unit. The layer normalization implementation is based on "Layer Normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton (https://arxiv.org/abs/1607.06450) and is applied before the internal nonlinearities. Recurrent dropout is based on "Recurrent Dropout without Memory Loss" by Stanislau Semeniuta, Aliaksei Severyn, and Erhardt Barth (https://arxiv.org/abs/1603.05118).
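For reference, the normalization applied inside each gate follows the standard layer normalization formula, controlled here by `norm_gamma_initializer`, `norm_beta_initializer`, and `norm_epsilon`. Below is a minimal sketch of that computation on a single activation vector (illustrative only, not the internal TensorFlow implementation):

  layer_norm <- function(x, gamma, beta, epsilon = 0.001) {
    mu     <- mean(x)           # mean over the feature dimension
    sigma2 <- mean((x - mu)^2)  # variance over the feature dimension
    gamma * (x - mu) / sqrt(sigma2 + epsilon) + beta
  }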