flowvision.scheduler

class flowvision.scheduler.Scheduler(optimizer: oneflow.optim.optimizer.Optimizer, param_group_field: str, noise_range_t=None, noise_type='normal', noise_pct=0.67, noise_std=1.0, noise_seed=None, initialize: bool = True)[source]

Parameter scheduler base class, borrowed from pytorch-image-models (timm). A scheduler base class that can be used to schedule any optimizer parameter group. Unlike the built-in PyTorch schedulers, this is intended to be consistently called:

  • At the END of each epoch, before incrementing the epoch count, to calculate next epoch’s value

  • At the END of each optimizer update, after incrementing the update count, to calculate next update’s value

The schedulers built on this base should remain as stateless as possible (for simplicity). This family of schedulers avoids the confusing meaning of ‘last_epoch’ and -1 values for special behaviour: all epoch and update counts must be tracked in the training code and explicitly passed to the schedulers on the corresponding step or step_update call. Based on ideas from the fairseq and AllenNLP learning rate schedulers.
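
Since the epoch and update counts live in the training code, they are passed to the scheduler explicitly. The sketch below shows the intended call pattern with one of the subclasses in this module; it assumes the timm-style step(epoch) / step_update(num_updates) methods described above, and the per-batch training work is elided.

    import oneflow as flow
    from flowvision.scheduler import CosineLRScheduler

    model = flow.nn.Linear(16, 4)
    optimizer = flow.optim.SGD(model.parameters(), lr=0.1)
    scheduler = CosineLRScheduler(optimizer, t_initial=90, warmup_t=5, warmup_lr_init=1e-6)

    num_updates = 0
    steps_per_epoch = 100  # placeholder; normally len(train_loader)
    for epoch in range(90):
        for _ in range(steps_per_epoch):
            # forward / backward / optimizer.step() would go here
            num_updates += 1
            scheduler.step_update(num_updates)  # end of update, after incrementing the update count
        scheduler.step(epoch + 1)               # end of epoch, computes the next epoch's value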

class flowvision.scheduler.CosineLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, t_mul: float = 1.0, lr_min: float = 0.0, decay_rate: float = 1.0, warmup_t=0, warmup_lr_init=0, warmup_prefix=False, cycle_limit=0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42)[source]

Cosine learning rate decay with restarts (SGDR), borrowed from timm. This is described in the paper https://arxiv.org/abs/1608.03983.

Inspiration from https://github.com/allenai/allennlp/blob/master/allennlp/training/learning_rate_schedulers/cosine.py

Parameters
  • optimizer – The optimizer to be used for the training process

  • t_initial – The number of epochs in the initial cycle, e.g. 50 or 100.

  • t_mul – Multiplier applied to the cycle length after each restart; controls the SGDR schedule annealing.

  • lr_min – The minimum learning rate to use during scheduling; the learning rate never goes below this value. Defaults to 0.0.

  • decay_rate – When 0 < decay_rate < 1., the learning rate at every restart is decayed to lr * decay_rate. For example, with decay_rate=0.5 the learning rate after the first restart is half the initial lr.

  • warmup_t – Defines the number of warmup epochs.

  • warmup_lr_init – The initial learning rate during warmup.
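
A minimal usage sketch (the warmup and cycle settings are illustrative, not defaults): construct the scheduler on top of an oneflow optimizer and advance it once per epoch.

    import oneflow as flow
    from flowvision.scheduler import CosineLRScheduler

    optimizer = flow.optim.SGD(flow.nn.Linear(16, 4).parameters(), lr=0.1)

    # 100-epoch cosine decay to lr_min, with a 5-epoch linear warmup starting at 1e-6
    scheduler = CosineLRScheduler(
        optimizer,
        t_initial=100,
        lr_min=1e-5,
        warmup_t=5,
        warmup_lr_init=1e-6,
    )

    for epoch in range(100):
        # ... train for one epoch ...
        scheduler.step(epoch + 1)
        print(epoch, optimizer.param_groups[0]["lr"])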

class flowvision.scheduler.LinearLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, lr_min_rate: float, warmup_t=0, warmup_lr_init=0.0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

Linear warmup and linear decay scheduler

Inspiration from https://github.com/microsoft/Swin-Transformer/blob/main/lr_scheduler.py

Parameters
  • optimizer – The optimizer to be used for the training process

  • t_initial – The total number of training epochs, e.g. 50 or 100.

  • lr_min_rate – The minimum learning rate factor to use during scheduling. The learning rate never goes below lr * lr_min_rate.

  • warmup_t – Defines the number of warmup epochs.

  • warmup_lr_init – The initial learning rate during warmup.
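
A construction sketch with illustrative values, loosely following the Swin Transformer recipe linked above; the scheduler is then advanced once per epoch with step(epoch + 1) exactly as in the CosineLRScheduler example.

    import oneflow as flow
    from flowvision.scheduler import LinearLRScheduler

    optimizer = flow.optim.AdamW(flow.nn.Linear(16, 4).parameters(), lr=1e-3)

    # 20-epoch warmup, then linear decay so that the final lr is 0.01 * base lr
    scheduler = LinearLRScheduler(
        optimizer,
        t_initial=300,
        lr_min_rate=0.01,
        warmup_t=20,
        warmup_lr_init=5e-7,
    )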

class flowvision.scheduler.StepLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, decay_t: float, decay_rate: float = 1.0, warmup_t=0, warmup_lr_init=0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

Step LR scheduler. Decays the learning rate of each parameter group by decay_rate every decay_t steps.

Parameters
  • optimizer – The optimizer to be used for the training process

  • decay_t – Period of learning rate decay.

  • decay_rate – Multiplicative factor of learning rate decay. Default: 1.0.

  • warmup_t – Defines the number of warmup epochs.

  • warmup_lr_init – The initial learning rate during warmup.
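
For instance, the classic "divide the learning rate by 10 every 30 epochs" schedule could be sketched as follows (values are illustrative):

    import oneflow as flow
    from flowvision.scheduler import StepLRScheduler

    optimizer = flow.optim.SGD(flow.nn.Linear(16, 4).parameters(), lr=0.1)

    # lr = 0.1 for epochs 0-29, 0.01 for epochs 30-59, 0.001 for epochs 60-89, ...
    scheduler = StepLRScheduler(optimizer, decay_t=30, decay_rate=0.1)

    for epoch in range(90):
        # ... train for one epoch ...
        scheduler.step(epoch + 1)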

class flowvision.scheduler.MultiStepLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, decay_t: List[int], decay_rate: float = 1.0, warmup_t=0, warmup_lr_init=0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

MultiStep LR scheduler. Decays the learning rate of each parameter group by decay_rate once the number of steps reaches one of the milestones in decay_t.

Parameters
  • optimizer – The optimizer to be used for the training process

  • decay_t – List of epoch indices. Must be increasing.

  • decay_rate – Multiplicative factor of learning rate decay. Default: 1.0.

  • warmup_t – Defines the number of warmup epochs.

  • warmup_lr_init – The initial learning rate during warmup.
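
A sketch of an ImageNet-style schedule that decays by 10x at fixed epoch milestones (values are illustrative):

    import oneflow as flow
    from flowvision.scheduler import MultiStepLRScheduler

    optimizer = flow.optim.SGD(flow.nn.Linear(16, 4).parameters(), lr=0.1)

    # lr = 0.1 before epoch 30, 0.01 before epoch 60, 0.001 before epoch 80, 0.0001 afterwards
    scheduler = MultiStepLRScheduler(optimizer, decay_t=[30, 60, 80], decay_rate=0.1)

    for epoch in range(90):
        # ... train for one epoch ...
        scheduler.step(epoch + 1)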

class flowvision.scheduler.PolyLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, power: float = 0.5, lr_min: float = 0.0, cycle_mul: float = 1.0, cycle_decay: float = 1.0, cycle_limit: int = 1, warmup_t=0, warmup_lr_init=0, warmup_prefix=False, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, k_decay=1.0, initialize=True)[source]

Polynomial LR scheduler with warmup, noise, and k-decay. The k-decay option is based on "k-decay: A New Method For Learning Rate Schedule" (https://arxiv.org/abs/2004.05909).

Parameters
  • optimizer – The optimizer to be used for the training process

  • t_initial – The number of epochs in the initial cycle, e.g. 50 or 100.

  • power – The power of the polynomial. Defaults to 0.5.

  • lr_min – The minimum learning rate to use during scheduling; the learning rate never goes below this value. Defaults to 0.0.

  • warmup_t – Defines the number of warmup epochs.

  • warmup_lr_init – The initial learning rate during warmup.
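
A construction sketch with illustrative values (the cycle and k_decay options keep their defaults); stepping once per epoch works the same way as in the examples above.

    import oneflow as flow
    from flowvision.scheduler import PolyLRScheduler

    optimizer = flow.optim.SGD(flow.nn.Linear(16, 4).parameters(), lr=0.1)

    # Polynomial decay with exponent 0.9 over 100 epochs, after a 5-epoch warmup
    scheduler = PolyLRScheduler(
        optimizer,
        t_initial=100,
        power=0.9,
        lr_min=1e-5,
        warmup_t=5,
        warmup_lr_init=1e-6,
    )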

class flowvision.scheduler.TanhLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, lb: float = -7.0, ub: float = 3.0, lr_min: float = 0.0, cycle_mul: float = 1.0, cycle_decay: float = 1.0, cycle_limit: int = 1, warmup_t=0, warmup_lr_init=0, warmup_prefix=False, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

Hyperbolic-tangent decay with restarts. This is described in the paper https://arxiv.org/abs/1806.01593.
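
A construction sketch with illustrative values (the lb/ub tanh bounds keep their defaults); the scheduler is advanced once per epoch like the others in this module.

    import oneflow as flow
    from flowvision.scheduler import TanhLRScheduler

    optimizer = flow.optim.SGD(flow.nn.Linear(16, 4).parameters(), lr=0.1)

    # Tanh-shaped decay over 100 epochs with a 5-epoch warmup
    scheduler = TanhLRScheduler(
        optimizer,
        t_initial=100,
        lr_min=1e-5,
        warmup_t=5,
        warmup_lr_init=1e-6,
    )

    for epoch in range(100):
        # ... train for one epoch ...
        scheduler.step(epoch + 1)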