|`start_lr`| float | The learning rate at the start of training (after warmup). |
-|`stop_lr`| float | The learning rate at the end of training. **Mutually exclusive** with `stop_lr_ratio`. When `decay_rate` is explicitly set, this serves as the minimum learning rate. |
-|`stop_lr_ratio`| float | The ratio of `stop_lr` to `start_lr`. `stop_lr = start_lr * stop_lr_ratio`. **Mutually exclusive** with `stop_lr`. |
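
The relationship between `stop_lr` and `stop_lr_ratio` described in the rows above can be sketched as follows. This is a minimal illustration, not the project's implementation: the helper name `resolve_stop_lr` is hypothetical, and raising an error when neither option is given is an assumption (the rows only state that the two are mutually exclusive).

```python
from typing import Optional


def resolve_stop_lr(
    start_lr: float,
    stop_lr: Optional[float] = None,
    stop_lr_ratio: Optional[float] = None,
) -> float:
    """Return the final learning rate from either stop_lr or stop_lr_ratio.

    Hypothetical helper for illustration only.
    """
    if stop_lr is not None and stop_lr_ratio is not None:
        # The two options are documented as mutually exclusive.
        raise ValueError("stop_lr and stop_lr_ratio are mutually exclusive")
    if stop_lr is not None:
        return stop_lr
    if stop_lr_ratio is not None:
        # Documented relation: stop_lr = start_lr * stop_lr_ratio
        return start_lr * stop_lr_ratio
    # Assumption: this sketch requires one of the two to be set.
    raise ValueError("set one of stop_lr or stop_lr_ratio")
```

For example, `resolve_stop_lr(1e-3, stop_lr_ratio=0.01)` gives a final learning rate of roughly `1e-5`.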
|`warmup_steps`| int | 0 | Number of steps for warmup. Learning rate increases linearly from `warmup_start_factor * start_lr` to `start_lr`. Mutually exclusive with `warmup_ratio`. |
-|`warmup_ratio`| float | None | Ratio of warmup steps to total training steps. `warmup_steps = int(warmup_ratio * num_steps)`. Mutually exclusive with `warmup_steps`.|
+|`warmup_ratio`| float | None | Ratio of warmup steps to total training steps. `warmup_steps = int(warmup_ratio * numb_steps)`. Mutually exclusive with `warmup_steps`. |
|`warmup_start_factor`| float | 0.0 | Factor for initial warmup learning rate. Warmup starts from `warmup_start_factor * start_lr`. |
|`scale_by_worker`| str | "linear" | How to alter learning rate in parallel training. Options: `"linear"`, `"sqrt"`, `"none"`. |
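
The warmup rows above can be read as the following sketch. It is a hedged illustration under stated assumptions, not the library's code: the function names `resolve_warmup_steps` and `warmup_lr` are hypothetical, and treating a nonzero `warmup_steps` together with `warmup_ratio` as an error is an assumption drawn from the "mutually exclusive" wording. `scale_by_worker` is left out because the table does not define the scaling formulas.

```python
from typing import Optional


def resolve_warmup_steps(
    numb_steps: int,
    warmup_steps: int = 0,
    warmup_ratio: Optional[float] = None,
) -> int:
    """Derive the number of warmup steps (hypothetical helper)."""
    if warmup_ratio is not None:
        if warmup_steps:
            # Assumption: setting both is rejected.
            raise ValueError("warmup_steps and warmup_ratio are mutually exclusive")
        # Documented relation: warmup_steps = int(warmup_ratio * numb_steps)
        return int(warmup_ratio * numb_steps)
    return warmup_steps


def warmup_lr(
    step: int,
    start_lr: float,
    warmup_steps: int,
    warmup_start_factor: float = 0.0,
) -> float:
    """Linear warmup from warmup_start_factor * start_lr up to start_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return start_lr
    progress = step / warmup_steps
    factor = warmup_start_factor + (1.0 - warmup_start_factor) * progress
    return start_lr * factor
```

With `numb_steps=1000` and `warmup_ratio=0.25`, warmup lasts 250 steps; halfway through, at step 125 with the default `warmup_start_factor`, the learning rate is about half of `start_lr`.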
-|`decay_steps`| int | 5000 | Interval (in steps) at which learning rate decays. If `decay_steps` exceeds the total decay steps (`num_steps - warmup_steps`) and `decay_rate` is not provided, it will be automatically adjusted to a sensible default. |
-|`decay_rate`| float | None | Explicit decay rate. If not provided, computed from `start_lr` and `stop_lr`. |
-|`smooth`| bool | false | If `true`, use smooth exponential decay. If `false`, stepped decay. |
+|`decay_steps`| int | 5000 | Interval (in steps) at which learning rate decays. If `decay_steps` exceeds the total decay steps (`numb_steps - warmup_steps`) and `decay_rate` is not provided, it will be automatically adjusted to a sensible default. |
+|`decay_rate`| float | None | Explicit decay rate. If not provided, computed from `start_lr` and `stop_lr`. |
+|`smooth`| bool | false | If `true`, use smooth exponential decay. If `false`, stepped decay. |
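
To make the interaction of `decay_steps`, `decay_rate`, and `smooth` concrete, here is a minimal sketch of an exponential decay schedule. Several parts are assumptions rather than documented behavior: the function name `exp_decay_lr` is hypothetical, and the formula used when `decay_rate` is omitted (choosing the rate so that the learning rate reaches `stop_lr` at the last step) is one plausible reading of "computed from `start_lr` and `stop_lr`". The automatic adjustment of an oversized `decay_steps` is not modelled, since the table does not say what the adjusted value is.

```python
from typing import Optional


def exp_decay_lr(
    step: int,
    start_lr: float,
    stop_lr: float,
    numb_steps: int,
    warmup_steps: int = 0,
    decay_steps: int = 5000,
    decay_rate: Optional[float] = None,
    smooth: bool = False,
) -> float:
    """Learning rate after warmup under exponential decay (illustrative)."""
    total_decay_steps = numb_steps - warmup_steps
    t = max(step - warmup_steps, 0)  # steps elapsed since warmup ended
    explicit_rate = decay_rate is not None
    if decay_rate is None:
        # Assumption: pick the rate so that lr equals stop_lr at the final step.
        decay_rate = (stop_lr / start_lr) ** (decay_steps / total_decay_steps)
    # smooth: continuous exponent; stepped: exponent changes every decay_steps.
    exponent = t / decay_steps if smooth else t // decay_steps
    lr = start_lr * decay_rate ** exponent
    # With an explicit decay_rate, stop_lr acts as the minimum learning rate.
    return max(lr, stop_lr) if explicit_rate else lr
```

With `smooth=False` the value stays constant within each `decay_steps` interval and drops at the interval boundary; with `smooth=True` it decreases a little on every step. Both variants agree at exact multiples of `decay_steps`.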