flowvision.scheduler
class flowvision.scheduler.Scheduler(optimizer: oneflow.optim.optimizer.Optimizer, param_group_field: str, noise_range_t=None, noise_type='normal', noise_pct=0.67, noise_std=1.0, noise_seed=None, initialize: bool = True)[source]

Parameter scheduler base class, borrowed from pytorch-image-models. A base class that can be used to schedule any optimizer parameter group. Unlike the built-in PyTorch schedulers, this is intended to be consistently called:

- at the END of each epoch, before incrementing the epoch count, to calculate the next epoch's value;
- at the END of each optimizer update, after incrementing the update count, to calculate the next update's value.

Schedulers built on this base should try to remain as stateless as possible (for simplicity). This family of schedulers avoids the confusion around the meaning of 'last_epoch' and -1 values for special behaviour in the built-in schedulers. All epoch and update counts must be tracked in the training code and explicitly passed to the scheduler on the corresponding step or step_update call.
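The call pattern described above can be sketched with a toy stand-in; ToyScheduler below is a hypothetical placeholder used only to illustrate where step and step_update fit in a training loop, not part of flowvision:

```python
# Illustrative sketch of the intended call pattern. ToyScheduler only
# records how it is called; a real flowvision scheduler would compute a
# parameter value from the counts passed in.
class ToyScheduler:
    def __init__(self):
        self.epoch_calls = []
        self.update_calls = []

    def step(self, epoch):
        # Called at the END of each epoch, before the training code
        # increments its own epoch counter, with the next epoch index.
        self.epoch_calls.append(epoch)

    def step_update(self, num_updates):
        # Called at the END of each optimizer update, after the training
        # code has incremented its own update counter.
        self.update_calls.append(num_updates)


scheduler = ToyScheduler()
num_updates = 0
for epoch in range(2):           # 2 epochs
    for _ in range(3):           # 3 optimizer updates per epoch
        # ... forward/backward and optimizer.step() would go here ...
        num_updates += 1
        scheduler.step_update(num_updates)
    scheduler.step(epoch + 1)
```

Note that the scheduler itself stores no epoch or update count; both are owned by the training loop and passed in explicitly, which is the statelessness the base class is after.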
class flowvision.scheduler.CosineLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, t_mul: float = 1.0, lr_min: float = 0.0, decay_rate: float = 1.0, warmup_t=0, warmup_lr_init=0, warmup_prefix=False, cycle_limit=0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42)[source]

Cosine decay with restarts, borrowed from timm. This is described in the paper https://arxiv.org/abs/1608.03983.

Inspiration from https://github.com/allenai/allennlp/blob/master/allennlp/training/learning_rate_schedulers/cosine.py
- Parameters
optimizer – The optimizer to be used for the training process.
t_initial – The number of epochs in the initial cycle, e.g. 50 or 100.
t_mul – Multiplier applied to the cycle length at each restart, controlling the SGDR schedule annealing.
lr_min – The minimum learning rate to use during scheduling; the learning rate never goes below this value. Defaults to 0.0.
decay_rate – When 0 < decay_rate < 1, at every restart the learning rate is decayed to a new value equal to lr * decay_rate. For example, with decay_rate=0.5 the new learning rate becomes half the previous one.
warmup_t – Defines the number of warmup epochs.
warmup_lr_init – The initial learning rate during warmup.
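The basic shape of the schedule (linear warmup followed by a single cosine anneal from the base rate down to lr_min) can be sketched in plain Python. This is an illustrative reimplementation of the formula from the SGDR paper, not the flowvision source, and it omits restarts, decay_rate, and noise; the function name cosine_lr is ours:

```python
import math


def cosine_lr(t, base_lr, t_initial, lr_min=0.0, warmup_t=0, warmup_lr_init=0.0):
    """Sketch of one cosine cycle with optional linear warmup."""
    if t < warmup_t:
        # Linear warmup from warmup_lr_init up to base_lr.
        return warmup_lr_init + t * (base_lr - warmup_lr_init) / warmup_t
    # Cosine anneal from base_lr down to lr_min over the remaining steps.
    progress = (t - warmup_t) / max(1, t_initial - warmup_t)
    return lr_min + 0.5 * (base_lr - lr_min) * (1 + math.cos(math.pi * progress))
```

For example, with base_lr=0.1 and t_initial=100, the rate starts at 0.1, passes through 0.05 at the halfway point, and reaches lr_min at t=100.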
class flowvision.scheduler.LinearLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, lr_min_rate: float, warmup_t=0, warmup_lr_init=0.0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

Linear warmup and linear decay scheduler.

Inspiration from https://github.com/microsoft/Swin-Transformer/blob/main/lr_scheduler.py
- Parameters
optimizer – The optimizer to be used for the training process.
t_initial – The total number of epochs, e.g. 50 or 100.
lr_min_rate – The minimum learning rate factor to use during scheduling; the learning rate never goes below lr * lr_min_rate.
warmup_t – Defines the number of warmup epochs.
warmup_lr_init – The initial learning rate during warmup.
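The linear warmup + linear decay rule can be sketched in plain Python. This is an illustrative reimplementation under the parameter meanings above, not the flowvision source, and it omits the noise options; the name linear_lr is ours:

```python
def linear_lr(t, base_lr, t_initial, lr_min_rate, warmup_t=0, warmup_lr_init=0.0):
    """Sketch: linear warmup, then linear decay toward base_lr * lr_min_rate."""
    if t < warmup_t:
        # Linear warmup from warmup_lr_init up to base_lr.
        return warmup_lr_init + t * (base_lr - warmup_lr_init) / warmup_t
    # Linear decay from base_lr down to base_lr * lr_min_rate at t_initial.
    lr_floor = base_lr * lr_min_rate
    progress = (t - warmup_t) / (t_initial - warmup_t)
    return base_lr - (base_lr - lr_floor) * progress
```

With base_lr=0.1, t_initial=100, and lr_min_rate=0.01, the rate decays from 0.1 at t=0 to 0.001 at t=100.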
class flowvision.scheduler.StepLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, decay_t: float, decay_rate: float = 1.0, warmup_t=0, warmup_lr_init=0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

Step LR scheduler. Decays the learning rate of each parameter group by decay_rate every decay_t steps.
- Parameters
optimizer – The optimizer to be used for the training process.
decay_t – Period of learning rate decay.
decay_rate – Multiplicative factor of learning rate decay. Default: 1.0.
warmup_t – Defines the number of warmup epochs.
warmup_lr_init – The initial learning rate during warmup.
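The step decay rule reduces to multiplying the base rate by decay_rate once per completed period of decay_t steps. A plain-Python sketch of that formula (an illustrative reimplementation, not the flowvision source; step_lr is our name):

```python
def step_lr(t, base_lr, decay_t, decay_rate, warmup_t=0, warmup_lr_init=0.0):
    """Sketch: multiply base_lr by decay_rate every decay_t steps."""
    if t < warmup_t:
        # Linear warmup from warmup_lr_init up to base_lr.
        return warmup_lr_init + t * (base_lr - warmup_lr_init) / warmup_t
    # Number of whole decay periods elapsed determines the exponent.
    return base_lr * decay_rate ** (t // decay_t)
```

With base_lr=0.1, decay_t=30, and decay_rate=0.1, the rate is 0.1 for steps 0-29, 0.01 for steps 30-59, 0.001 for steps 60-89, and so on.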
class flowvision.scheduler.MultiStepLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, decay_t: List[int], decay_rate: float = 1.0, warmup_t=0, warmup_lr_init=0, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

MultiStep LR scheduler. Decays the learning rate of each parameter group by decay_rate once the step count reaches each of the milestones in decay_t.
- Parameters
optimizer – The optimizer to be used for the training process.
decay_t – List of epoch indices. Must be increasing.
decay_rate – Multiplicative factor of learning rate decay. Default: 1.0.
warmup_t – Defines the number of warmup epochs.
warmup_lr_init – The initial learning rate during warmup.
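Since decay_t is an increasing list, the number of decays applied at step t is simply the number of milestones that are <= t, which a bisection gives directly. A plain-Python sketch of that rule (an illustrative reimplementation without warmup or noise, not the flowvision source; multistep_lr is our name):

```python
import bisect


def multistep_lr(t, base_lr, decay_t, decay_rate):
    """Sketch: decay by decay_rate once per milestone in the increasing list decay_t."""
    # bisect_right counts how many milestones have been reached by step t.
    num_decays = bisect.bisect_right(decay_t, t)
    return base_lr * decay_rate ** num_decays
```

With base_lr=0.1, decay_t=[30, 60], and decay_rate=0.1, the rate is 0.1 before step 30, 0.01 from step 30, and 0.001 from step 60.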
class flowvision.scheduler.PolyLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, power: float = 0.5, lr_min: float = 0.0, cycle_mul: float = 1.0, cycle_decay: float = 1.0, cycle_limit: int = 1, warmup_t=0, warmup_lr_init=0, warmup_prefix=False, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, k_decay=1.0, initialize=True)[source]

Polynomial LR scheduler with warmup, noise, and k-decay. The k-decay option is based on "k-decay: A New Method For Learning Rate Schedule", https://arxiv.org/abs/2004.05909.
- Parameters
optimizer – The optimizer to be used for the training process.
t_initial – The total number of epochs, e.g. 50 or 100.
power – The power of the polynomial. Defaults to 0.5.
lr_min – The minimum learning rate to use during scheduling; the learning rate never goes below this value. Defaults to 0.0.
warmup_t – Defines the number of warmup epochs.
warmup_lr_init – The initial learning rate during warmup.
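A single cycle of the polynomial schedule with the k-decay exponent can be sketched in plain Python. This is an illustrative reimplementation of the formula from the k-decay paper (without warmup, cycles, or noise), not the flowvision source; poly_lr is our name:

```python
def poly_lr(t, base_lr, t_initial, power=0.5, lr_min=0.0, k_decay=1.0):
    """Sketch: polynomial decay from base_lr to lr_min with k-decay exponent."""
    # k_decay reshapes the time axis: fraction of the cycle elapsed in t^k terms.
    frac = (t ** k_decay) / (t_initial ** k_decay)
    # Polynomial of the remaining fraction, scaled between lr_min and base_lr.
    return lr_min + (base_lr - lr_min) * (1 - frac) ** power
```

With base_lr=0.1, t_initial=100, and the default power=0.5, the rate starts at 0.1, is 0.05 at t=75 (since sqrt(0.25) = 0.5), and reaches lr_min at t=100; k_decay=1 leaves the time axis unchanged, while larger k holds the rate higher for longer.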
class flowvision.scheduler.TanhLRScheduler(optimizer: oneflow.optim.optimizer.Optimizer, t_initial: int, lb: float = -7.0, ub: float = 3.0, lr_min: float = 0.0, cycle_mul: float = 1.0, cycle_decay: float = 1.0, cycle_limit: int = 1, warmup_t=0, warmup_lr_init=0, warmup_prefix=False, t_in_epochs=True, noise_range_t=None, noise_pct=0.67, noise_std=1.0, noise_seed=42, initialize=True)[source]

Hyperbolic-tangent decay with restarts. This is described in the paper https://arxiv.org/abs/1806.01593.
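One cycle of the tanh schedule sweeps the tanh argument linearly from the lower bound lb to the upper bound ub, producing a smooth S-shaped decay from the base rate toward lr_min. A plain-Python sketch of that formula (an illustrative reimplementation without restarts, warmup, or noise, not the flowvision source; tanh_lr is our name):

```python
import math


def tanh_lr(t, base_lr, t_initial, lb=-7.0, ub=3.0, lr_min=0.0):
    """Sketch of one hyperbolic-tangent decay cycle."""
    tr = t / t_initial  # progress through the cycle, in [0, 1]
    # Interpolate the tanh argument from lb (start) to ub (end); tanh(lb) is
    # near -1 so the cycle starts near base_lr, tanh(ub) is near +1 so it
    # ends near lr_min.
    return lr_min + 0.5 * (base_lr - lr_min) * (1 - math.tanh(lb * (1 - tr) + ub * tr))
```

With the defaults lb=-7 and ub=3, the rate stays close to base_lr early in the cycle, then drops steeply toward lr_min as the tanh argument crosses zero.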