[python-package] simplify eval result printing#6749
Conversation
python-package/lightgbm/callback.py
Outdated
| return f"{value[0]}'s {value[1]}: {value[2]:g}" | ||
| else: | ||
| raise ValueError("Wrong metric value") | ||
| eval_name, metric_name, eval_result, *_ = value |
There was a problem hiding this comment.
What do you think of using similar names to the ones here
LightGBM/python-package/lightgbm/basic.py
Line 5215 in 53e0ddf
I find the first one a bit confusing, since it's the name of the validation set. Maybe something like
dataset_name, metric_name, metric_value, *_ = value?
There was a problem hiding this comment.
You're totally right... I think eval_name and eval_result are confusing. I love your suggested names, agree we should standardize on those (here and in future PRs like this).
There was a problem hiding this comment.
updated in 6b63377
I'll do more of this renaming in future PRs.
| else: | ||
| return f"{value[0]}'s {value[1]}: {value[2]:g}" | ||
| else: | ||
| raise ValueError("Wrong metric value") |
There was a problem hiding this comment.
We're loosing this error, not sure if it can happen but may be worth adding a check at the start like:
if len(value) not in [4, 5]:
raise ValueError("Wrong metric value")There was a problem hiding this comment.
Honestly, I don't think this is a very helpful error message. "Wrong metric value" doesn't really describe the problem here, and you'll end up walking the stacktrace to find this point in the code anyway.
Originally in this PR, I was thinking we could just keep this simple and let Python's "too many values to unpack" or "not enough values to unpack" tuple-unpacking errors convey that information. But thinking about it more... removing this exception means that you could now implement a custom metric function that returns tuples with more than 5 elements and LightGB would silently accept it. I think it's valuable to continue preventing that, to reserve the 6th, 7th, etc. elements for future LightGBM-internal purposes.
I'll put back an exception here using logic like you suggested... but changing the message to something a bit more helpful.
There was a problem hiding this comment.
Alright so I tried to add an error message here, and then write a test to check that it's raised... and I couldn't find a public code path that'd allow a tuple with too few or too many tuples to get to this point.
Here's what I tried:
import lightgbm as lgb
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=10_000, n_features=3)
def constant_metric(preds, train_data):
# returns 4 elements (should be 3)
return ("error", 0.0, False, "too-much")
dtrain = lgb.Dataset(X, label=y).construct()
dvalid = dtrain.create_valid(X, label=y)
bst = lgb.train(
params={
"objective": "regression"
},
train_set=dtrain,
feval=[constant_metric],
valid_sets=[dvalid],
valid_names=["valid0"]
)That gets stopped earlier:
[LightGBM] [Info] Total Bins 765
[LightGBM] [Info] Number of data points in the train set: 10000, number of used features: 3
[LightGBM] [Info] Start training from score -0.471765
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/engine.py", line 329, in train
evaluation_result_list.extend(booster.eval_valid(feval))
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/basic.py", line 4445, in eval_valid
return [
^
File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/basic.py", line 4448, in <listcomp>
for item in self.__inner_eval(self.name_valid_sets[i - 1], i, feval)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/basic.py", line 5214, in __inner_eval
eval_name, val, is_higher_better = feval_ret
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 3)
Here:
LightGBM/python-package/lightgbm/basic.py
Lines 5209 to 5215 in b33a12e
So I think that maybe this error message we're talking about was effectively unreachable.
And I don't think we should add a custom and slightly more informative error message there in Booster.__inner_eval():
- that code will raise a
ValueErrorif anything other than exactly a 3-item tuple (or list of such tuples) is returned... so no need to add more protection against tuples of other sizes - that part of the code is already kind of complicated
- that part of the code runs on every iteration, so adding an
if len() ..: raisewould add a bit of extra overhead to every iteration
Co-authored-by: José Morales <jmoralz92@gmail.com>
…tGBM into python/format-eval-result
StrikerRUS
left a comment
There was a problem hiding this comment.
LGTM! Nice refactoring!
|
Thanks! I'll merge this and then continue with #6748 soon. |
|
This pull request has been automatically locked since there has not been any recent activity since it was closed. |
Proposes a small refactor of
_format_eval_result(), used to populate log messages like this:LightGBM/python-package/lightgbm/callback.py
Lines 99 to 100 in 53e0ddf
This is a first step in the broader refactoring proposed in #6748 ... see that issue for more details.