Group normalization layer
```r
layer_group_normalization(
  object,
  groups = 2,
  axis = -1,
  epsilon = 0.001,
  center = TRUE,
  scale = TRUE,
  beta_initializer = "zeros",
  gamma_initializer = "ones",
  beta_regularizer = NULL,
  gamma_regularizer = NULL,
  beta_constraint = NULL,
  gamma_constraint = NULL,
  ...
)
```
| Argument | Description |
|---|---|
| object | Model or layer object. |
| groups | Integer, the number of groups for Group Normalization. Can be in the range [1, N], where N is the input dimension. The input dimension must be divisible by the number of groups. |
| axis | Integer, the axis that should be normalized. |
| epsilon | Small float added to the variance to avoid dividing by zero. |
| center | If TRUE, add an offset of beta to the normalized tensor. If FALSE, beta is ignored. |
| scale | If TRUE, multiply by gamma. If FALSE, gamma is not used. |
| beta_initializer | Initializer for the beta weight. |
| gamma_initializer | Initializer for the gamma weight. |
| beta_regularizer | Optional regularizer for the beta weight. |
| gamma_regularizer | Optional regularizer for the gamma weight. |
| beta_constraint | Optional constraint for the beta weight. |
| gamma_constraint | Optional constraint for the gamma weight. |
| ... | Additional parameters passed on to the underlying layer. |
Returns a tensor.
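A minimal usage sketch (assuming the keras R package with a TensorFlow backend; the architecture and shapes here are illustrative, not part of the reference). Note that the 4 groups evenly divide the 32 channels produced by the convolution:

```r
library(keras)

# Small convolutional model using group normalization.
# groups = 4 evenly divides the 32 output channels of the convolution.
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3),
                input_shape = c(28, 28, 1)) %>%
  layer_group_normalization(groups = 4) %>%
  layer_activation("relu") %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")
```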
Group Normalization divides the channels into groups and computes the mean and variance within each group for normalization. Empirically, its accuracy is more stable than batch normalization across a wide range of small batch sizes, provided the learning rate is adjusted linearly with batch size.

Relation to Layer Normalization: if the number of groups is set to 1, this operation becomes identical to Layer Normalization.

Relation to Instance Normalization: if the number of groups is set to the input dimension (i.e., the number of groups equals the number of channels), this operation becomes identical to Instance Normalization.
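A short sketch of the two limiting cases above, assuming a hypothetical 16-channel input; only the `groups` argument changes:

```r
library(keras)

inputs <- layer_input(shape = c(8, 8, 16))

# groups = 1: all 16 channels form a single group,
# equivalent to Layer Normalization.
ln_like <- inputs %>% layer_group_normalization(groups = 1)

# groups = 16: one channel per group,
# equivalent to Instance Normalization.
in_like <- inputs %>% layer_group_normalization(groups = 16)
```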