
tensorflow - How to do cyclic momentum just like cyclic learning rate?

I want the momentum to cycle from 0.95 to 0.85. I have seen Keras implementations where you can call K.set_value(self.model.optimizer.momentum, value) in a callback. However, in TensorFlow's SGD optimizer there seems to be no momentum attribute I can set this way. I'm using TensorFlow 2.2.

Will the code below work? How can I tell if it works?

class momsec(tf.keras.callbacks.Callback):

    def __init__(self, initial_mom, maximal_mom, step_size):
        self.initial_mom = initial_mom
        self.maximal_mom = maximal_mom
        self.step_size = step_size  # was never stored, but __call__ and get_config read it
        self.step = 0               # was never initialized, but __call__ reads it

    def __call__(self):
        with tf.name_scope("CyclicalMom"):
            initial_mom = tf.convert_to_tensor(
                self.initial_mom, name="initial_mom")
            dtype = initial_mom.dtype
            maximal_mom = tf.cast(self.maximal_mom, dtype)
            step_size = tf.cast(self.step_size, dtype)
            # triangular wave: x goes 1 -> 0 -> 1 over each cycle of 2 * step_size steps
            cycle = tf.floor(1 + self.step / (2 * step_size))
            x = tf.abs(self.step / step_size - 2 * cycle + 1)
            mom = initial_mom + (
                maximal_mom - initial_mom
            ) * tf.maximum(tf.cast(0, dtype), (1 - x))
            return mom

    def get_config(self):
        return {
            "initial_mom": self.initial_mom,
            "maximal_mom": self.maximal_mom,
            "step_size": self.step_size,
        }  # the dict was never closed in the original

opt = tf.keras.optimizers.SGD(learning_rate=lr, momentum=mom, nesterov=True)

You can do this by passing a tf.Variable as the momentum and assigning new values to it during training:

import tensorflow as tf
mom = tf.Variable(0.2)
opt = tf.optimizers.SGD(learning_rate=0.001, momentum=mom)
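Putting the two pieces together (the triangular schedule from the question and the tf.Variable trick), a minimal sketch of a full cyclic-momentum callback could look like this. The class name CyclicMomentum and the use of on_train_batch_begin are my choices for illustration, not an established API, and passing a Variable as momentum relies on TF 2.x storing it as a hyperparameter (newer Keras 3 releases require a plain float):

```python
import tensorflow as tf

class CyclicMomentum(tf.keras.callbacks.Callback):
    """Cycles momentum between initial_mom and maximal_mom on a triangular schedule."""

    def __init__(self, momentum_var, initial_mom=0.95, maximal_mom=0.85, step_size=2000):
        super().__init__()
        self.momentum_var = momentum_var  # the tf.Variable handed to the optimizer
        self.initial_mom = initial_mom
        self.maximal_mom = maximal_mom
        self.step_size = step_size        # steps per half-cycle
        self.step = 0

    def schedule(self, step):
        # same triangular wave as in the question: x goes 1 -> 0 -> 1 per cycle,
        # so momentum goes initial -> maximal -> initial over 2 * step_size steps
        cycle = tf.floor(1 + step / (2 * self.step_size))
        x = tf.abs(step / self.step_size - 2 * cycle + 1)
        return self.initial_mom + (
            self.maximal_mom - self.initial_mom
        ) * tf.maximum(0.0, 1 - x)

    def on_train_batch_begin(self, batch, logs=None):
        # write the scheduled value back into the Variable the optimizer reads
        self.momentum_var.assign(self.schedule(float(self.step)))
        self.step += 1

mom = tf.Variable(0.95)
opt = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=mom, nesterov=True)
cb = CyclicMomentum(mom, initial_mom=0.95, maximal_mom=0.85, step_size=2000)
# model.fit(x, y, callbacks=[cb])
```

To check that it actually works, read the variable between batches (e.g. print float(mom.numpy()) in the callback): it should start at 0.95, reach 0.85 after step_size batches, and return to 0.95 after 2 * step_size.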