[python-package] ensure that all callbacks are pickleable #5080

@jameslamb

Summary

All callback functions should be serializable with pickle, cloudpickle, and joblib.

Motivation

As described in #5012, in some interfaces to LightGBM, it's necessary to broadcast LightGBM parameters or objects from the lightgbm Python library to multiple processes / machines. The primary mechanism for this in Python is to serialize objects with a library like pickle, cloudpickle, or joblib.

#5012 made lgb.callback.early_stopping serializable, for the benefit of lightgbm-ray. I expect that would also be necessary to use callbacks in lightgbm.dask, and for any other settings where users want to pickle/unpickle LightGBM objects.

Description

Functions in lightgbm.callback:

Tests similar to the following should be added for each of these functions, to ensure that they're pickleable and stay that way after future changes to this project.

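import lightgbm as lgb
import pytest

# pickle_obj and unpickle_obj are helper functions assumed to be defined in the test utilities

@pytest.mark.parametrize('serializer', ["pickle", "joblib", "cloudpickle"])
def test_early_stopping_callback_is_picklable(serializer, tmp_path):
    callback = lgb.early_stopping(stopping_rounds=5)
    tmp_file = tmp_path / "early_stopping.pkl"
    # write the callback to disk with the chosen serializer
    pickle_obj(
        obj=callback,
        filepath=tmp_file,
        serializer=serializer
    )
    # read it back and check that its configuration survived the round trip
    callback_from_disk = unpickle_obj(
        filepath=tmp_file,
        serializer=serializer
    )
    assert callback.stopping_rounds == callback_from_disk.stopping_rounds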
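
The test above relies on two helpers, pickle_obj and unpickle_obj, whose implementations are not shown in this issue. For readers who want to try the pattern outside the test suite, here is a minimal sketch of what such helpers might look like, assuming they simply dispatch on the serializer name. The bodies below are an illustration, not the project's actual code; joblib and cloudpickle are imported lazily so that the "pickle" path works with only the standard library.

```python
import pickle
from pathlib import Path


def pickle_obj(obj, filepath, serializer):
    """Write obj to filepath using the named serializer (hypothetical sketch)."""
    filepath = Path(filepath)
    if serializer == "pickle":
        with open(filepath, "wb") as f:
            pickle.dump(obj, f)
    elif serializer == "joblib":
        import joblib  # third-party, imported only when requested
        joblib.dump(obj, filepath)
    elif serializer == "cloudpickle":
        import cloudpickle  # third-party, imported only when requested
        with open(filepath, "wb") as f:
            cloudpickle.dump(obj, f)
    else:
        raise ValueError(f"Unrecognized serializer: {serializer}")


def unpickle_obj(filepath, serializer):
    """Read an object back from filepath using the named serializer (hypothetical sketch)."""
    filepath = Path(filepath)
    if serializer == "pickle":
        with open(filepath, "rb") as f:
            return pickle.load(f)
    elif serializer == "joblib":
        import joblib
        return joblib.load(filepath)
    elif serializer == "cloudpickle":
        import cloudpickle
        with open(filepath, "rb") as f:
            return cloudpickle.load(f)
    else:
        raise ValueError(f"Unrecognized serializer: {serializer}")
```

Keeping the serializer name as a plain string makes it easy to parametrize a single test over all three libraries.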

It's possible that just adding such tests will be enough, and that the remaining functions are already pickleable. But if not, then changes will need to be made to support this.
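
To illustrate the kind of change that might be needed, here is a small standalone sketch using only the standard library. It does not use LightGBM's code. It assumes a callback written as a closure (a function defined inside another function), which the standard pickle module cannot serialize because it stores functions by qualified name. Rewriting the callback as a class with a `__call__` method, with its state held in instance attributes, is one common way around this. All names below (make_closure_callback, PeriodicCallback, is_picklable) are hypothetical.

```python
import pickle


def make_closure_callback(period):
    """Closure-based callback: 'period' lives in the enclosing scope."""
    def _callback(iteration):
        return iteration % period == 0
    return _callback


class PeriodicCallback:
    """Class-based callback: 'period' lives in an instance attribute."""

    def __init__(self, period):
        self.period = period

    def __call__(self, iteration):
        return iteration % self.period == 0


def is_picklable(obj):
    """Return True if obj survives a pickle round trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


closure_cb = make_closure_callback(period=5)
class_cb = PeriodicCallback(period=5)

# Both callbacks behave the same when called...
same_behavior = closure_cb(10) == class_cb(10)

# ...but only the class-based one can be pickled with the standard library.
closure_ok = is_picklable(closure_cb)
class_ok = is_picklable(class_cb)
```

cloudpickle can often serialize closures by value, so a closure-based callback might pass the cloudpickle case and still fail with pickle and joblib. That is one reason to test all three serializers.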
