Moving Average

optimizer_moving_average(
  optimizer,
  sequential_update = TRUE,
  average_decay = 0.99,
  num_updates = NULL,
  name = "MovingAverage",
  clipnorm = NULL,
  clipvalue = NULL,
  decay = NULL,
  lr = NULL
)

Arguments

optimizer

A string or a tf$keras$optimizers$Optimizer instance that will be used to compute and apply gradients.

sequential_update

Logical. If FALSE, the moving average is computed at the same time as the model is updated, which may cause benign data races. If TRUE, the moving average is updated after the gradient updates.

average_decay

Numeric. Decay used to maintain the moving averages of the trained variables.

num_updates

Optional count of the number of updates applied to variables.

name

Optional name for the operations created when applying gradients. Defaults to "MovingAverage".

clipnorm

Clips gradients by norm.

clipvalue

Clips gradients by value.

decay

Included for backward compatibility, to allow inverse time decay of the learning rate.

lr

Included for backward compatibility; using learning_rate instead is recommended.

Value

Optimizer for use with `keras::compile()`

Details

Optimizer that computes a moving average of the variables. Empirically, using the moving average of a deep network's trained parameters has been found to work better than using the trained parameters directly. This optimizer computes that moving average and swaps the variables at save time, so that any code outside the training loop uses the averaged values by default instead of the original ones.
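To make the idea concrete, here is a minimal base R sketch of an exponential moving average of weights, written under assumptions rather than taken from the package source. The helper `ema_update()`, the toy weight trajectory, and the choice to initialise the average at the starting weights are all hypothetical; the update rule shown is the standard form `average = decay * average + (1 - decay) * value`, which is assumed here to match what `average_decay` controls.

```r
# Hypothetical helper (not part of the package): one moving-average step.
ema_update <- function(average, value, decay = 0.99) {
  decay * average + (1 - decay) * value
}

# Toy "training loop": the weights follow a made-up trajectory.
weights <- c(0, 0)
average <- weights  # assumption: the average starts at the initial weights

for (step in 1:100) {
  # Stand-in for a gradient update applied by the wrapped optimizer.
  weights <- weights + c(0.1, -0.05)
  # Update the average after the weight update (the sequential case).
  average <- ema_update(average, weights, decay = 0.99)
}

# "Swap at save time": expose the averaged values instead of the raw ones.
saved_weights <- average
```

With a decay close to 1 the average changes slowly, so `saved_weights` lags behind `weights`; a smaller decay tracks the raw weights more closely at the cost of less smoothing.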

Examples

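if (FALSE) {
  library(tensorflow)
  library(tfaddons)

  # The learning rate value is illustrative only.
  opt <- tf$keras$optimizers$SGD(learning_rate = 0.01)
  # Wrap the base optimizer so that a moving average of the weights is kept.
  opt <- optimizer_moving_average(opt)
}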