[python] allow to register any custom logger (fixes #4783) by RustingSword · Pull Request #4880 · microsoft/LightGBM

RustingSword · 2021-12-11T13:28:06Z

Allow to register any custom logger, as long as it provides info, warning and error method. Refer to #4783.

ghost · 2021-12-11T13:28:18Z

All CLA requirements met.

jameslamb

Thanks very much for this!

I've added (fixes #4783) to the pull request title here so GitHub will automatically close #4783 if this is merged, and so the connection between this PR and that issue is clear to anyone reading the release notes (PR titles become release note items in this project, see here for example).

Can you please add a unit test covering this new behavior, to be sure it isn't accidentally broken by future refactorings? See the diff from https://github.com/microsoft/LightGBM/pull/3820/files#diff-57a0d4d593bc1ebac04a99d2e477a178c209de8c50e6cffdb5f9195b26dd6e68 for an example.

python-package/lightgbm/basic.py

RustingSword · 2021-12-12T05:27:19Z

Thanks for the thorough guidance! I considered updating the unit test, however not sure what to test. How about add test to make sure the method check is satisfied? Maybe something like this:

def test_register_invalid_logger():
    class LoggerWithoutInfoMethod:
        def warning(self, msg: str) -> None:
            print(msg)

        def error(self, msg: str) -> None:
            print(msg)

    class LoggerWithoutWarningMethod:
        def info(self, msg: str) -> None:
            print(msg)

        def error(self, msg: str) -> None:
            print(msg)

    class LoggerWithoutErrorMethod:
        def info(self, msg: str) -> None:
            print(msg)

        def warning(self, msg: str) -> None:
            print(msg)

    def _test_raise_type_error(invalid_logger) -> None:
        try:
            lgb.register_logger(invalid_logger)
        except Exception as err:
            assert isinstance(err, TypeError)
            assert str(err) == "Logger must provide 'info', 'warning' and 'error' method"

    _test_raise_type_error(LoggerWithoutInfoMethod())
    _test_raise_type_error(LoggerWithoutWarningMethod())
    _test_raise_type_error(LoggerWithoutErrorMethod())

jameslamb · 2021-12-12T05:42:40Z

not sure what to test

I think tests covering the following would give us confidence that this feature is working:

given a logger class that implements info(), warning(), and error() but does not inherit from logging.Logger, that class's logging methods are used by correctly and successfully by lightgbm
given a class that has properties info, warning, and error set to something non-callable (like integers), lightgbm throws the error introduced in this PR
given a class that is missing info(), warning(), or error(), lightgbm throws the error introduced in this PR

For tests that check for errors, please use pytest.raises() for that instead of implementing _test_raise_type_error(), see the following for an example.

LightGBM/tests/python_package_test/test_dask.py

Line 481 in 8066261

    
           with pytest.raises(ValueError, match="Could not parse host name from worker address '0.0.0.1:80'"):

RustingSword · 2021-12-12T05:57:04Z

given a logger class that implements info(), warning(), and error() but does not inherit from logging.Logger, that class's logging methods are used by correctly and successfully by lightgbm

Is there any way to directly test the logging utility in lightgbm, or do I need to test it like https://github.com/microsoft/LightGBM/pull/3820/files#diff-57a0d4d593bc1ebac04a99d2e477a178c209de8c50e6cffdb5f9195b26dd6e68 to go through the training process? If so, I'm afraid there will be a large portion of duplicate code, which requires further refactoring.

For tests that check for errors, please use pytest.raises() for that instead of implementing _test_raise_type_error()

OK, this is much better than my clumsy implementation.

StrikerRUS · 2021-12-12T17:46:44Z

python-package/lightgbm/basic.py

+    def error(self, msg: str) -> None:
+        warnings.warn(msg, stacklevel=3)


I guess this was a typo, right?

Suggested change

def error(self, msg: str) -> None:

warnings.warn(msg, stacklevel=3)

def error(self, msg: str) -> None:

warnings.error(msg, stacklevel=3)

Seems there's no error() in warnings? I seldom use this module, so correct me if I'm wrong.

Ah, indeed! I confused this with error() function from logging module. Sorry.
https://docs.python.org/3/library/logging.html#logging.Logger.error

For our dummy logger I think we can just raise an exception then, WDYT?

Since this is the default logger, in the future when error() is actually needed, we still need to implement it. So I'm inclined to leave a dummy yet usable implementation as default.

Why is a .error() method being added at all? Per your description in #4783

Since only info and warning are used to log messages, any logger class providing these two methods would suffice.

And I can confirm that such a method isn't used anywhere.

git grep -E "\.error"

I have a strong preference for not having any "well we might need this in the future" code in the package, as it increases the size of the package and amount of code to be maintained without affecting the user experience. I think this PR should not add a .error() implementation at all. @StrikerRUS what do you think?

OK, let's remove requirements to have .error() method in custom logger class in this PR. But please note that there will be a breaking change in case we'll want to require it in the future.

Removed .error() related code.

Thanks!

please note that there will be a breaking change in case we'll want to require it in the future

Yes, I understand. I think the case where the lightgbm library would want to emit an ERROR-level log but not raise an exception is unlikely to come up.

Even the logger on the C++ side does not have a method for ERROR-level messages: https://github.com/microsoft/LightGBM/blob/17d4e007852b4f0b34b211a674cb26e6751c98e2/include/LightGBM/utils/log.h.

python-package/lightgbm/basic.py

RustingSword · 2021-12-13T03:00:10Z

Is there any way to directly test the logging utility in lightgbm

I find that maybe I can directly use lgb._log_info(msg) to test logging functions?

Also I'm wondering whether we should export _log_info, _log_warning, _log_error as public methods, say as info, warning and error. So once we registered our own logger, we can use lgb.info(), lgb.warning(), lgb.error() in user code, to make the usage and logging result consistent.

tests/python_package_test/test_utilities.py

jameslamb · 2022-03-20T01:28:43Z

@RustingSword are you still interested in pursuing this PR? If you can update this to the latest master, I'd be happy to provide another review.

I'm sorry, I realized today that you were waiting for feedback from us on #4880 (comment).

Also I'm wondering whether we should export _log_info, _log_warning, _log_error as public methods, say as info, warning and error

I don't believe those methods should be exported or that calling LightGBM's logger like lgb.info() should be encouraged. I think the ability to pass in your own logger means that if you want to have other code that calls the same logger as LightGBM, you can instantiate that logger in your code and pass it into register_logger().

…to register_logger

RustingSword · 2022-03-20T02:37:19Z

@jameslamb I rebased on master. If we don't expose logging methods, then there's no direct way to test logger. Should I duplicate the tests used in test_register_logger ? Personally I think the test is coupled with other functionalities, such as training, which have been tested in other module, e.g. test_engine.py.

jameslamb · 2022-03-20T03:18:21Z

I rebased on master

Since this project uses squash merging, just using git merge master and pushing a merge commit is sufficient. This is also typically easier...something about the way you rebased has made GitHub think this PR is changing 100 files, which is going to make it really difficult to review.

Can you please remove those commits and instead merge master, so the diff only shows what this PR is changing? If you're not sure how to, please let me know and I'd be happy to update this branch for you.

If we don't expose logging methods, then there's no direct way to test logger

Since you are injecting a logger, I think it's possible to design one for a test which modifies a global variable, and then to test the content of those global variables.

pseudo-code showing what I mean INFO-level logs (which could be extended for WARN and FATAL).

import lightgbm as lgb

info_logs = []

class _TestLogger:
    def info(self, msg: str) -> None:
        global info_logs
        info_logs.append(msg)

lgb.register_logger(_TestLogger())

# directly invoke `lgb.basic._log_info()`
lgb.basic._log_info("hello")

assert info_logs == ["hello"]

Besides that, you could have one similar test that performs training with lgb.train(), and just check that info_logs is non-empty. That way, the test wouldn't be as tightly coupled to LightGBM's training logic, but would still check "lgb.train() actually uses the logger registered with register_logger()".

RustingSword · 2022-03-20T03:46:24Z

Sorry, I will do a force update on this branch to merge master instead of rebase.

The suggested test method is great, will implement later.

jameslamb · 2022-03-20T03:48:50Z

Sounds good, thanks! We are VERY sorry it took so long to respond to your questions. Thanks very much for still being willing to contribute!

…to register_logger

StrikerRUS · 2022-03-22T23:48:01Z

I don't believe those methods should be exported or that calling LightGBM's logger like lgb.info() should be encouraged. I think the ability to pass in your own logger means that if you want to have other code that calls the same logger as LightGBM, you can instantiate that logger in your code and pass it into register_logger().

I'm totally agree with this statement.

StrikerRUS · 2022-03-22T23:49:32Z

@RustingSword
Could you please fix one linting error?

./tests/python_package_test/test_utilities.py:147:1: E303 too many blank lines (3)

RustingSword · 2022-03-23T02:06:29Z

@RustingSword Could you please fix one linting error?
./tests/python_package_test/test_utilities.py:147:1: E303 too many blank lines (3)

Sorry, missed this one, fixed.

RustingSword · 2022-03-23T02:11:04Z

When I'm running pytest locally, the expected log messages in test_register_logger miss two lines at the beginning, which are

WARNING | categorical_feature in Dataset is overridden.
New categorical_feature is [1]

I'm not sure whether this should be updated, since there seems no related test error in the ci pipeline.

StrikerRUS · 2022-03-25T01:00:41Z

@RustingSword

When I'm running pytest locally ...

Are you using the latest version of LightGBM?
Refer to #4768.

RustingSword · 2022-03-25T05:33:20Z

@RustingSword

When I'm running pytest locally ...

Are you using the latest version of LightGBM? Refer to #4768.

I see, I'm using 3.3.2, and #4768 is not included per #4930.

jameslamb

Tests look great, thank you very much! Please just see my one remaining comment about the .error() methood.

jameslamb

looks great to me, thanks very much for the idea and the contribution!

StrikerRUS

@RustingSword Thank you so much for your work!

Just one nit about docstring style:

StrikerRUS · 2022-03-27T21:18:28Z

python-package/lightgbm/basic.py

+    info_method_name: str, optional (default="info")
+        Method used to log info messages.
+    warning_method_name: str, optional (default="warning")
+        Method used to log warning messages.


Suggested change

info_method_name: str, optional (default="info")

Method used to log info messages.

warning_method_name: str, optional (default="warning")

Method used to log warning messages.

info_method_name : str, optional (default="info")

Method used to log info messages.

warning_method_name : str, optional (default="warning")

Method used to log warning messages.

Fixed. Maybe we can add a corresponding linter rule? Or even better, use a unified formatter.

Nice idea! Do you know any one?
pydocstyle linter we use here doesn't catch such inconsistency.

LightGBM/.ci/test.sh

Line 81 in d163c2c

pydocstyle --convention=numpy --add-ignore=D105 --match-dir="^(?!^external_libs|test|example).*" --match="(?!^test_|setup).*\.py" . || exit -1

Seems this is a known issue with pydocstyle. I tried numpydoc as mentioned in the comment, however it is not satisfying either. First, numpydoc is an extension to sphinx, and the validation rules cannot be set in command line options when used as a standalone tool. Second, numpydoc is not intelligent enough. We can see this in above issue, where a single missing whitespace will result in multiple false positives.

smalltest.no:PR01:Parameters {'a'} not documented smalltest.no:PR02:Unknown parameters {'a: int'}

These two errors are due to parsing error caused by the missing whitespace before colon.

So, I don't have suggested tool suited for this 😞

Ah no problem, thanks very much for investigating!

RustingSword · 2022-03-28T02:33:58Z

@jameslamb @StrikerRUS Thank you both for your excellent guidance!

github-actions · 2023-08-23T00:22:34Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

[python] allow to register any custom logger

9e41c2a

RustingSword requested review from StrikerRUS, chivee, henry0312, hzy46, jameslamb, shiyu1994 and tongwu-sh as code owners December 11, 2021 13:28

jameslamb changed the title ~~[python] allow to register any custom logger~~ [python] allow to register any custom logger (fixes #4783) Dec 12, 2021

jameslamb requested changes Dec 12, 2021

View reviewed changes

python-package/lightgbm/basic.py Outdated Show resolved Hide resolved

jameslamb added the feature label Dec 12, 2021

StrikerRUS requested changes Dec 12, 2021

View reviewed changes

allow customizable logging method name; add unit test

69bae5a

RustingSword commented Dec 13, 2021

View reviewed changes

tests/python_package_test/test_utilities.py Outdated Show resolved Hide resolved

gongxudong and others added 3 commits March 20, 2022 10:30

[python] allow to register any custom logger

c935acd

allow customizable logging method name; add unit test

5dd3b1d

Merge branch 'register_logger' of github.com:RustingSword/LightGBM in…

0990709

…to register_logger

RustingSword requested review from Laurae2, btrotta, guolinke, huanzhang12 and jmoralez as code owners March 20, 2022 02:30

RustingSword added 3 commits March 20, 2022 20:02

Merge branch 'master' into register_logger

c61891e

update tests

7f744ec

Merge branch 'register_logger' of github.com:RustingSword/LightGBM in…

72888a8

…to register_logger

fix lint error

866ebfe

jameslamb requested review from jameslamb and removed request for chivee March 26, 2022 04:07

jameslamb requested changes Mar 26, 2022

View reviewed changes

remove unused method

d78c00c

jameslamb approved these changes Mar 27, 2022

View reviewed changes

StrikerRUS approved these changes Mar 27, 2022

View reviewed changes

fix docstring style

2df945b

StrikerRUS merged commit 60e72d5 into microsoft:master Mar 28, 2022

jameslamb mentioned this pull request Oct 7, 2022

[DO NOT MERGE] Release v3.3.3 #5525

Closed

40 tasks

github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023

		def error(self, msg: str) -> None:
		warnings.warn(msg, stacklevel=3)

Conversation

RustingSword commented Dec 11, 2021

Uh oh!

ghost commented Dec 11, 2021 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jameslamb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

RustingSword commented Dec 12, 2021

Uh oh!

jameslamb commented Dec 12, 2021

Uh oh!

RustingSword commented Dec 12, 2021

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

RustingSword commented Dec 13, 2021

Uh oh!

Uh oh!

jameslamb commented Mar 20, 2022

Uh oh!

RustingSword commented Mar 20, 2022

Uh oh!

jameslamb commented Mar 20, 2022

Uh oh!

RustingSword commented Mar 20, 2022

Uh oh!

jameslamb commented Mar 20, 2022

Uh oh!

StrikerRUS commented Mar 22, 2022

Uh oh!

StrikerRUS commented Mar 22, 2022

Uh oh!

RustingSword commented Mar 23, 2022

Uh oh!

RustingSword commented Mar 23, 2022

Uh oh!

StrikerRUS commented Mar 25, 2022

Uh oh!

RustingSword commented Mar 25, 2022

Uh oh!

jameslamb left a comment

Choose a reason for hiding this comment

Uh oh!

jameslamb left a comment

Choose a reason for hiding this comment

Uh oh!

StrikerRUS left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RustingSword commented Mar 28, 2022

Uh oh!

github-actions bot commented Aug 23, 2023

ghost commented Dec 11, 2021 •

edited by ghost

Loading