Moving Average
optimizer_moving_average(
  optimizer,
  sequential_update = TRUE,
  average_decay = 0.99,
  num_updates = NULL,
  name = "MovingAverage",
  clipnorm = NULL,
  clipvalue = NULL,
  decay = NULL,
  lr = NULL
)
Argument | Description |
---|---|
optimizer | str or tf$keras$optimizers$Optimizer that will be used to compute and apply gradients. |
sequential_update | Logical. If FALSE, computes the moving average at the same time as the model is updated, which may introduce benign data races. If TRUE, updates the moving average after the gradient updates. |
average_decay | Float. Decay used to maintain the moving averages of trained variables. |
num_updates | Optional count of the number of updates applied to variables. |
name | Optional name for the operations created when applying gradients. Defaults to "MovingAverage". |
clipnorm | Clip gradients by norm. |
clipvalue | Clip gradients by value. |
decay | Included for backward compatibility, to allow inverse time decay of the learning rate. |
lr | Included for backward compatibility; using learning_rate instead is recommended. |
Optimizer for use with `keras::compile()`
Optimizer that computes a moving average of the variables. Empirically, evaluating a deep network with the moving average of its trained parameters has often been found to give better results than using the final trained parameters directly. This optimizer computes that moving average and swaps it into the variables at save time, so that any code outside the training loop uses the averaged values by default instead of the original ones.
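To make the averaging concrete, the following base R sketch applies an exponential moving average to successive values of a single weight. It assumes the usual update rule `shadow = decay * shadow + (1 - decay) * value`; the helper `ema_update()` and the sample values are hypothetical and are not part of tfaddons, and the wrapped optimizer may differ in details such as how `num_updates` adjusts the decay.

```r
# Hypothetical helper (not part of tfaddons): one exponential moving average step.
# Assumes the rule shadow = decay * shadow + (1 - decay) * value.
ema_update <- function(shadow, value, average_decay = 0.99) {
  average_decay * shadow + (1 - average_decay) * value
}

# Made-up values that one trained weight takes after successive updates.
weights <- c(1.0, 2.0, 3.0, 4.0)

# Start the average at the first value, then fold in each later value.
shadow <- weights[1]
for (w in weights[-1]) {
  shadow <- ema_update(shadow, w, average_decay = 0.5)
}

print(shadow)  # 3.125, which lags behind the latest raw value of 4
```

A decay close to 1, such as the default `average_decay = 0.99`, makes the average change slowly and smooths out step-to-step noise; the small decay of 0.5 is used here only so the effect is visible after a few steps.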
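if (FALSE) {
  library(tensorflow)
  library(tfaddons)

  learning_rate = 0.01  # example value; choose one suited to your model
  # Wrap a base optimizer so that a moving average of the variables is kept
  opt = tf$keras$optimizers$SGD(learning_rate)
  opt = optimizer_moving_average(opt)
}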