Description
In the R package, lgb.train() and lightgbm() expose a keyword argument nrounds, an integer indicating how many boosting rounds should be performed.
{lightgbm} uses the value of that argument to set num_iterations, unless num_iterations or one of its aliases (https://lightgbm.readthedocs.io/en/latest/Parameters.html#num_iterations) is provided in the keyword argument params.
LightGBM/R-package/R/lgb.train.R
Lines 111 to 115 in 798dc1d
    params <- lgb.check.wrapper_param(
        main_param_name = "num_iterations"
        , params = params
        , alternative_kwarg_value = nrounds
    )
Since nrounds is not a supported alias for num_iterations in LightGBM, it is not possible to override the default value of nrounds in lgb.train() or lightgbm() by passing nrounds as part of the list in params.
As @mikemahoney218 noted in #4226 (comment), this is confusing behavior. Everywhere else in the R and Python packages, LightGBM treats values in params as higher-precedence than those passed through keyword arguments.
This behavior also adds friction to hyperparameter tuning (and will add even more once the suggestions from #4226 are fully implemented), as it makes nrounds a training parameter that cannot be altered by altering the list in params.
nrounds should be added as an alias for num_iterations in LightGBM.
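To make the precedence rule concrete, here is a minimal sketch in plain R. It is not the actual implementation of lgb.check.wrapper_param(). The function resolve_wrapper_param() and its arguments are hypothetical names used only for illustration, and the keyword-argument value of 100 is assumed from the round count reported in this issue. The sketch shows why nrounds in params is ignored while it is missing from the alias list, and respected once it is added.

```r
# Hypothetical, simplified stand-in for wrapper-parameter resolution.
# This is an illustration only, not the code in {lightgbm}.
resolve_wrapper_param <- function(main_param_name, params, aliases, alternative_kwarg_value) {
    # names in params that are recognized aliases of the main parameter
    found <- names(params)[names(params) %in% aliases]
    # params takes precedence; fall back to the keyword argument otherwise
    value <- if (length(found) > 0L) params[[found[1L]]] else alternative_kwarg_value
    # keep the entries that are not aliases, then store the value under the main name
    out <- params[!(names(params) %in% aliases)]
    out[[main_param_name]] <- value
    return(out)
}

# alias list as quoted in this issue, before and after the proposed change
aliases_before <- c(
    "num_iterations", "num_iteration", "n_iter", "num_tree", "num_trees"
    , "num_round", "num_rounds", "num_boost_round", "n_estimators", "max_iter"
)
aliases_after <- c(aliases_before, "nrounds")

user_params <- list(nrounds = 17L, objective = "regression")
nrounds_kwarg <- 100L  # assumed keyword-argument value for this sketch

before <- resolve_wrapper_param("num_iterations", user_params, aliases_before, nrounds_kwarg)
after <- resolve_wrapper_param("num_iterations", user_params, aliases_after, nrounds_kwarg)

print(before$num_iterations)  # keyword-argument value wins: nrounds is not recognized
print(after$num_iterations)   # value from params wins: nrounds is now an alias
```

With the shorter alias list, the nrounds entry is left in params as an unrecognized key and the keyword-argument value is used. With nrounds in the list, the value from params takes precedence, matching how the other aliases behave.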
Reproducible example
library(lightgbm)
data(agaricus.train, package = "lightgbm")
dtrain <- lightgbm::lgb.Dataset(
    agaricus.train$data
    , label = agaricus.train$label
)
bst <- lightgbm::lgb.train(
    params = list(
        "nrounds" = 17
        , "objective" = "regression"
    )
    , data = dtrain
)
# should be 17, but 100 boosting rounds were performed
bst$current_iter()
# [1] 100
How to fix this
Add nrounds to the list of aliases for num_iterations at
LightGBM/R-package/R/aliases.R
Lines 108 to 119 in 798dc1d
    , "num_iterations" = c(
        "num_iterations"
        , "num_iteration"
        , "n_iter"
        , "num_tree"
        , "num_trees"
        , "num_round"
        , "num_rounds"
        , "num_boost_round"
        , "n_estimators"
        , "max_iter"
    )
To ensure that other interfaces to LightGBM besides the R package respect this parameter alias, update the relevant C++ code and documentation. See https://github.com/microsoft/LightGBM/pull/4637/files for an example of which files should be changed. Do not edit docs/Parameters.rst directly. Instead, run python helpers/parameter_generator.py from the root of the repo after updating files in include/ and src/.
Add a unit test to https://github.com/microsoft/LightGBM/blob/798dc1d4191b93fd34797d62b79c66cd95209406/R-package/tests/testthat/test_basic.R which confirms that {lightgbm} respects nrounds passed through params.
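As a starting point, such a test might look like the sketch below. It reuses the reproducible example from this issue and assumes {testthat} and a build of {lightgbm} that already includes the alias change. The test name is only a suggestion, and the test is expected to fail on a build without the fix.

```r
library(testthat)
library(lightgbm)

test_that("lgb.train() respects nrounds passed through params", {
    data(agaricus.train, package = "lightgbm")
    dtrain <- lightgbm::lgb.Dataset(
        agaricus.train$data
        , label = agaricus.train$label
    )
    # nrounds is passed only through params, not as a keyword argument
    bst <- lightgbm::lgb.train(
        params = list(
            nrounds = 17L
            , objective = "regression"
        )
        , data = dtrain
    )
    # once nrounds is an alias for num_iterations, exactly 17 rounds should run
    expect_equal(bst$current_iter(), 17L)
})
```

A similar case for lightgbm() would cover the other entry point mentioned above.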
Additional Comments
@StrikerRUS @Laurae2 please let me know if you disagree with this idea or have any additional thoughts to add.