Lookahead mechanism
lookahead_mechanism(
  optimizer,
  sync_period = 6,
  slow_step_size = 0.5,
  name = "Lookahead",
  clipnorm = NULL,
  clipvalue = NULL,
  decay = NULL,
  lr = NULL
)
Argument | Description |
---|---|
optimizer | The original optimizer that will be used to compute and apply the gradients. |
sync_period | An integer. The synchronization period of lookahead. Set it to a positive value to enable the lookahead mechanism. |
slow_step_size | A floating point value. The ratio for updating the slow weights. |
name | Optional name for the operations created when applying gradients. Defaults to "Lookahead". |
clipnorm | Clips gradients by norm. |
clipvalue | Clips gradients by value. |
decay | Included for backward compatibility to allow inverse time decay of the learning rate. |
lr | Included for backward compatibility; use learning_rate instead. |
Optimizer for use with `keras::compile()`
The mechanism was proposed by Michael R. Zhang et al. in the paper [Lookahead Optimizer: k steps forward, 1 step back](https://arxiv.org/abs/1907.08610v1). The optimizer iteratively updates two sets of weights. The inner optimizer chooses the search directions and updates the "fast weights" at every step, while the "slow weights" are updated every k steps (k is `sync_period`) in the direction of the fast weights, after which the two sets of weights are synchronized. This method improves learning stability and lowers the variance of the inner optimizer.
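As a hedged restatement, the update can be written in the notation of the cited paper. The symbols are the paper's, not part of this package's API: \phi denotes the slow weights, \theta the fast weights, A the inner optimizer's update, L the loss, and d a mini-batch. Reading k as `sync_period` and \alpha as `slow_step_size` is an interpretation of the argument descriptions above.

$$
\theta_{t,0} = \phi_{t-1}, \qquad
\theta_{t,i} = \theta_{t,i-1} + A(L, \theta_{t,i-1}, d), \quad i = 1, \dots, k
$$

$$
\phi_t = \phi_{t-1} + \alpha \, (\theta_{t,k} - \phi_{t-1})
$$

With \alpha = 1 the slow weights simply follow the fast weights. With \alpha = 0.5 (the default `slow_step_size`) they move halfway toward the point reached after k inner steps.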
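if (FALSE) {
  library(tensorflow)
  library(tfaddons)

  # Wrap a Keras SGD optimizer with the lookahead mechanism.
  # The learning rate of 0.01 is only an illustrative value.
  opt <- tf$keras$optimizers$SGD(learning_rate = 0.01)
  opt <- lookahead_mechanism(opt)
}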
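To make the roles of `sync_period` and `slow_step_size` concrete, the following is a minimal plain-R sketch of the lookahead idea wrapped around ordinary gradient descent on a toy quadratic. It is not the tfaddons implementation and does not use TensorFlow. The names `lookahead_sgd`, `grad_fn`, `w_init`, `n_steps` and `target` are hypothetical, chosen only for this illustration.

```r
# Plain-R sketch of the lookahead update (assumption: inner optimizer is SGD).
# This is an illustration only, not the code used by lookahead_mechanism().
lookahead_sgd <- function(grad_fn, w_init, n_steps = 60,
                          learning_rate = 0.1,
                          sync_period = 6, slow_step_size = 0.5) {
  slow <- w_init  # slow weights, updated once every sync_period steps
  fast <- w_init  # fast weights, updated by the inner optimizer every step
  for (step in seq_len(n_steps)) {
    # Inner optimizer step (plain gradient descent) on the fast weights.
    fast <- fast - learning_rate * grad_fn(fast)
    if (step %% sync_period == 0) {
      # Move the slow weights part of the way toward the fast weights,
      # then restart the fast weights from the new slow weights.
      slow <- slow + slow_step_size * (fast - slow)
      fast <- slow
    }
  }
  slow
}

# Toy objective f(w) = sum((w - target)^2), whose gradient is 2 * (w - target).
target <- c(1, -2)
grad_fn <- function(w) 2 * (w - target)

w_final <- lookahead_sgd(grad_fn, w_init = c(0, 0))
print(w_final)
```

On this toy problem the returned slow weights should end up close to `target`. Lowering `slow_step_size` makes each synchronization more conservative, and raising `sync_period` lets the inner optimizer explore further between synchronizations.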