Optimizers#

Factories#

eztorch.optimizers.optimizer_factory(name, initial_lr, model, batch_size=None, num_steps_per_epoch=None, layer_decay_lr=None, keys_without_decay=[], exclude_wd_norm=False, exclude_wd_bias=False, scaler=None, params={}, divide_wd_by_lr=False, scheduler=None, multiply_lr=1.0, multiply_parameters=[])[source]#

Optimizer factory to build optimizers and optionally an attached scheduler.

Parameters:
  • name – Name of the optimizer used to retrieve the optimizer constructor from the _OPTIMIZERS dict.

  • initial_lr – Initial learning rate.

  • model – Model to optimize.

  • batch_size (optional) – Batch size for the input of the model.
    Default: None

  • num_steps_per_epoch (optional) – Number of steps per epoch. Useful for some schedulers.
    Default: None

  • layer_decay_lr (optional) – Layer-wise learning rate decay factor.
    Default: None

  • keys_without_decay (optional) – Keys used to filter parameters out of weight decay.
    Default: []

  • exclude_wd_norm (optional) – If True, exclude normalization layers from weight decay regularization.
    Default: False

  • exclude_wd_bias (optional) – If True, exclude bias parameters from weight decay regularization.
    Default: False

  • scaler (optional) – Scaling rule for the initial learning rate.
    Default: None

  • params (optional) – Parameters for the optimizer constructor.
    Default: {}

  • divide_wd_by_lr (optional) – If True, divide the weight decay by the value of the learning rate.
    Default: False

  • scheduler (optional) – Scheduler config.
    Default: None

  • multiply_lr (optional) – Factor by which to multiply the learning rate. Applied to the scheduler as well.
    Default: 1.0

  • multiply_parameters (optional) – Parameters whose learning rate is multiplied by multiply_lr.
    Default: []

Returns:

The optimizer with its optional scheduler.
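
A minimal usage sketch, assuming "sgd" is a key registered in the _OPTIMIZERS dict and that, with scheduler=None (the default), the call returns just the optimizer:

    import torch.nn as nn

    from eztorch.optimizers import optimizer_factory

    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    # "sgd" is assumed to be registered in _OPTIMIZERS; momentum and weight
    # decay are forwarded to the optimizer constructor through `params`.
    optimizer = optimizer_factory(
        name="sgd",
        initial_lr=0.1,
        model=model,
        batch_size=256,
        num_steps_per_epoch=500,
        exclude_wd_norm=True,   # normalization layers get no weight decay
        exclude_wd_bias=True,   # bias parameters get no weight decay
        params={"momentum": 0.9, "weight_decay": 1e-4},
    )
    # Passing a scheduler config would also build the attached scheduler.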

eztorch.optimizers.optimizer_factory_two_groups(name, initial_lr1, initial_lr2, model1, model2, batch_size=None, num_steps_per_epoch=None, exclude_wd_norm=False, exclude_wd_bias=False, scaler=None, params={}, scheduler=None)[source]#

Optimizer factory to build an optimizer for two groups of parameters and optionally an attached scheduler.

Parameters:
  • name – Name of the optimizer used to retrieve the optimizer constructor from the _OPTIMIZERS dict.

  • initial_lr1 – Initial learning rate for model 1.

  • initial_lr2 – Initial learning rate for model 2.

  • model1 – Model 1 to optimize.

  • model2 – Model 2 to optimize.

  • batch_size (optional) – Batch size for the input of the models.
    Default: None

  • num_steps_per_epoch (optional) – Number of steps per epoch. Useful for some schedulers.
    Default: None

  • exclude_wd_norm (optional) – If True, exclude normalization layers from weight decay regularization.
    Default: False

  • exclude_wd_bias (optional) – If True, exclude bias parameters from weight decay regularization.
    Default: False

  • scaler (optional) – Scaling rule for the initial learning rate.
    Default: None

  • params (optional) – Parameters for the optimizer constructor.
    Default: {}

  • scheduler (optional) – Scheduler config.
    Default: None

Returns:

The optimizer with its optional scheduler.
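
A minimal usage sketch for two parameter groups; the backbone/head split and the "sgd" key are illustrative assumptions:

    import torch.nn as nn

    from eztorch.optimizers import optimizer_factory_two_groups

    backbone = nn.Linear(128, 64)  # hypothetical model 1
    head = nn.Linear(64, 10)       # hypothetical model 2

    # One optimizer, two parameter groups with their own learning rates:
    # a small rate for the backbone and a larger one for the head.
    optimizer = optimizer_factory_two_groups(
        name="sgd",        # assumed to be registered in _OPTIMIZERS
        initial_lr1=0.05,  # learning rate for model 1 (backbone)
        initial_lr2=0.5,   # learning rate for model 2 (head)
        model1=backbone,
        model2=head,
        params={"momentum": 0.9},
    )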

Custom Optimizers#

LARS#

class eztorch.optimizers.LARS(params, lr=0, weight_decay=0, momentum=0.9, trust_coefficient=0.001)[source]#

LARS optimizer that applies no rate scaling or weight decay to parameters that are at most one-dimensional (e.g., biases and normalization weights).

References:
  LARS: Large Batch Training of Convolutional Networks (You et al., 2017).

Parameters:
  • params – Parameters to optimize.

  • lr (optional) – Learning rate of the optimizer.
    Default: 0

  • weight_decay (optional) – Weight decay to apply.
    Default: 0

  • momentum (optional) – Momentum for optimization.
    Default: 0.9

  • trust_coefficient (optional) – LARS trust coefficient.
    Default: 0.001
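
A minimal usage sketch through the standard PyTorch optimizer interface. LARS rescales each multi-dimensional parameter's step by a trust ratio on the order of trust_coefficient * ||w|| / ||update|| (You et al., 2017):

    import torch
    import torch.nn as nn

    from eztorch.optimizers import LARS

    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    # Matches the documented constructor; parameters that are <= 1D
    # (biases, norm weights) receive no rate scaling or weight decay.
    optimizer = LARS(model.parameters(), lr=0.3, weight_decay=1e-6,
                     momentum=0.9, trust_coefficient=0.001)

    # One standard training step.
    x = torch.randn(32, 128)
    y = torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()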