Optimizers
The AdaBelief optimizer.
The Adadelta optimizer.
The ADAptive Nesterov momentum algorithm (Adan).
The Adafactor optimizer.
The Adagrad optimizer.
The Adam optimizer.
Adam with weight decay regularization.
A variant of the Adam optimizer that uses the infinity norm.
Adamax with weight decay regularization.
The AMSGrad optimizer.
The Frobenius matched gradient descent (Fromage) optimizer.
The LAMB optimizer.
The LARS optimizer.
L-BFGS optimizer.
The Lion optimizer.
The NAdam optimizer.
NAdamW optimizer, implemented as part of the AdamW optimizer.
A variant of SGD with added noise.
NovoGrad optimizer.
An Optimistic Gradient Descent optimizer.
The Optimistic Adam optimizer.
SGD with Polyak step-size.
The Rectified Adam optimizer.
A flexible RMSProp optimizer.
A canonical Stochastic Gradient Descent optimizer.
A variant of SGD using only the signs of the gradient components.
A variant of SGD using signs of the components of an EMA of the gradient.
The SM3 optimizer.
The Yogi optimizer.
The Rprop optimizer.
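
Three of the summaries above describe their update rule directly: canonical SGD, the variant that uses only the signs of the gradient components, and the variant that uses the signs of an EMA of the gradient. As a rough illustration, these rules could be written out in plain Python as below. This is a minimal sketch, not the library's implementation: the function names (`sgd_step`, `sign_sgd_step`, `sign_ema_step`), the toy quadratic objective, and the exact EMA form `beta * m + (1 - beta) * g` are all assumptions made for the example.

```python
from typing import List, Tuple


def sign(v: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of v."""
    return float((v > 0) - (v < 0))


def sgd_step(params: List[float], grads: List[float], lr: float) -> List[float]:
    """Plain SGD: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


def sign_sgd_step(params: List[float], grads: List[float], lr: float) -> List[float]:
    """SGD variant that keeps only the sign of each gradient component."""
    return [p - lr * sign(g) for p, g in zip(params, grads)]


def sign_ema_step(
    params: List[float],
    grads: List[float],
    ema: List[float],
    lr: float,
    beta: float,
) -> Tuple[List[float], List[float]]:
    """SGD variant that steps along the sign of an EMA of the gradient.

    The EMA form used here is an assumption for illustration.
    Returns the updated parameters and the updated EMA state.
    """
    new_ema = [beta * m + (1.0 - beta) * g for m, g in zip(ema, grads)]
    new_params = [p - lr * sign(m) for p, m in zip(params, new_ema)]
    return new_params, new_ema


def quadratic_grad(params: List[float], target: List[float]) -> List[float]:
    """Gradient of the hypothetical objective sum((p - t)^2)."""
    return [2.0 * (p - t) for p, t in zip(params, target)]


# One step of each rule from the same starting point.
params = [1.0, -2.0]
grads = quadratic_grad(params, [0.0, 0.0])
print(sgd_step(params, grads, lr=0.1))
print(sign_sgd_step(params, grads, lr=0.1))
print(sign_ema_step(params, grads, ema=[0.0, 0.0], lr=0.1, beta=0.9)[0])
```

The sign-based rules move every coordinate by exactly `lr` per step regardless of gradient magnitude, which is why the two sign variants agree on the first step here while plain SGD takes a step proportional to the gradient.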