NAdam Optimizer
NAdam is short for Nesterov-accelerated Adam. It uses Nesterov momentum for the gradient update, rather than the vanilla momentum used by Adam.
Syntax: tf.keras.optimizers.Nadam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, name='Nadam', **kwargs)

Parameters:
- learning_rate: Rate at which the algorithm updates the parameters. Tensor or float. Default value is 0.001.
- beta_1: Exponential decay rate for the 1st moment estimates. Constant float tensor or float. Default value is 0.9.
- beta_2: Exponential decay rate for the 2nd moment estimates. Constant float tensor or float. Default value is 0.999.
- epsilon: Small value used to maintain numerical stability. Float. Default value is 1e-07.
- name: Optional name for the operation.
- **kwargs: Keyword arguments of variable length.
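A minimal usage sketch follows; the model architecture, input shape, and loss below are illustrative assumptions, not part of the NAdam API itself:

```python
import tensorflow as tf

# Instantiate NAdam with its default hyperparameters, shown explicitly.
optimizer = tf.keras.optimizers.Nadam(
    learning_rate=0.001,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
)

# Illustrative model: a small classifier (the architecture is an assumption).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# The optimizer is passed to compile(); training then uses NAdam updates.
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```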
Advantages:
- Gives better results on loss surfaces with high curvature or with noisy gradients.
- Tends to learn faster than vanilla Adam.
Disadvantage: May sometimes fail to converge to an optimal solution.
Optimizers in TensorFlow
Optimizers are algorithms that reduce the loss (error) by tuning a model's parameters and weights, thereby minimizing the loss function and helping the model reach better accuracy faster.
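As a concrete sketch of what "tuning parameters to decrease loss" means in practice, the snippet below uses an optimizer to minimize a toy quadratic loss; the variable, loss function, learning rate, and step count are all illustrative assumptions:

```python
import tensorflow as tf

# A single trainable parameter, initialized away from the optimum (assumption).
w = tf.Variable(5.0)

# Any tf.keras optimizer works here; NAdam is used to match this article.
optimizer = tf.keras.optimizers.Nadam(learning_rate=0.1)

for step in range(100):
    with tf.GradientTape() as tape:
        # Toy loss: (w - 3)^2, minimized at w = 3 (an illustrative choice).
        loss = (w - 3.0) ** 2
    grads = tape.gradient(loss, [w])
    # The optimizer adjusts w in the direction that reduces the loss.
    optimizer.apply_gradients(zip(grads, [w]))

print(w.numpy())  # should move close to 3.0
```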