Lookahead mechanism

lookahead_mechanism(
  optimizer,
  sync_period = 6,
  slow_step_size = 0.5,
  name = "Lookahead",
  clipnorm = NULL,
  clipvalue = NULL,
  decay = NULL,
  lr = NULL
)

Arguments

optimizer

The original optimizer that will be used to compute and apply the gradients.

sync_period

An integer. The synchronization period of lookahead. Enable the lookahead mechanism by setting it to a positive value.

slow_step_size

A floating point value. The ratio for updating the slow weights.

name

Optional name for the operations created when applying gradients. Defaults to "Lookahead".

clipnorm

Clip gradients by norm.

clipvalue

Clip gradients by value.

decay

Included for backward compatibility to allow time inverse decay of the learning rate.

lr

Included for backward compatibility; it is recommended to use `learning_rate` instead.

Value

Optimizer for use with `keras::compile()`

Details

The mechanism was proposed by Michael R. Zhang et al. in the paper [Lookahead Optimizer: k steps forward, 1 step back](https://arxiv.org/abs/1907.08610v1). The optimizer iteratively maintains two sets of weights: the search directions for the "fast weights" are chosen by the inner optimizer, while the "slow weights" are updated every k steps in the direction taken by the fast weights, at which point the two sets are synchronized. This method improves learning stability and lowers the variance of the inner optimizer.
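
To make the update rule concrete, the following is a minimal plain-R sketch of the slow/fast weight bookkeeping. It is an illustration only, not the package implementation; the quadratic objective and the plain SGD inner step are assumptions made for the example.

# Illustration only: Lookahead bookkeeping on f(w) = sum(w^2) with a plain SGD inner step
sync_period    <- 6
slow_step_size <- 0.5
lr             <- 0.1

slow_weights <- fast_weights <- rep(1, 5)   # toy parameter vector

for (step in 1:60) {
  # fast (inner) step: one SGD update using the gradient 2 * w
  fast_weights <- fast_weights - lr * 2 * fast_weights
  if (step %% sync_period == 0) {
    # every sync_period steps the slow weights move a fraction of the way
    # toward the fast weights, and the fast weights restart from there
    slow_weights <- slow_weights + slow_step_size * (fast_weights - slow_weights)
    fast_weights <- slow_weights
  }
}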

Examples

if (FALSE) {
  # wrap an existing Keras optimizer with the lookahead mechanism
  opt <- tf$keras$optimizers$SGD(learning_rate = 0.01)
  opt <- lookahead_mechanism(opt)
}
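
The returned optimizer can then be passed to `keras::compile()`; a sketch along those lines, assuming `model` is an already-defined keras model and the chosen loss and metrics are placeholders:

if (FALSE) {
  library(tensorflow)
  library(keras)
  opt <- lookahead_mechanism(tf$keras$optimizers$SGD(learning_rate = 0.01))
  model %>% compile(
    optimizer = opt,
    loss = "categorical_crossentropy",
    metrics = "accuracy"
  )
}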