fix: Adjust LGBM_DatasetCreateFromSampledColumn to handle distributed data #5344
shiyu1994 merged 5 commits into microsoft:master
I believe we can take the opportunity to make breaking changes in the next major release and split the current
Sounds fine to me. Should I go ahead and make those changes? Also, does that mean this will get checked in when approved, or will it need to wait for some other condition related to incrementing the major version?
Next release will be
I think it's better to wait for other opinions on this (@guolinke, @shiyu1994).
I agree with @StrikerRUS; please go ahead.
Hey @StrikerRUS, @jameslamb, @shiyu1994, just wanted to gently ping on this. Is there anything else Scott should tackle before approval? Thanks so much for all of your help, time, and thoughtful review comments!
This PR adds a new API related to LGBM_DatasetCreateFromSampledColumn that splits the current single parameter for "num_total_data" into two: one for total local data and one for total distributed data. The new API is called LGBM_DatasetDistCreateFromSampledColumn, but that name can be changed. I kept the old one for back-compat.

See issue here: #5343
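
To make the distinction between the two counts concrete, here is a minimal sketch in plain Python. It does not call LightGBM. The names `RowCounts`, `row_counts_for_worker`, `num_local_row`, and `num_dist_row` are hypothetical and chosen only for illustration, since the description above does not state the final parameter names. It assumes each worker holds one partition and that the distributed total is the sum of all partition sizes.

```python
from typing import NamedTuple, Sequence


class RowCounts(NamedTuple):
    """The two values that replace the single "num_total_data" (hypothetical names)."""
    num_local_row: int  # rows held by this worker only
    num_dist_row: int   # rows across all workers


def row_counts_for_worker(partition_sizes: Sequence[int], rank: int) -> RowCounts:
    """Return the local and distributed row counts for the worker at `rank`.

    Assumption: partition_sizes[i] is the number of rows on worker i.
    """
    if not 0 <= rank < len(partition_sizes):
        raise IndexError(f"rank {rank} is out of range for {len(partition_sizes)} workers")
    return RowCounts(
        num_local_row=partition_sizes[rank],
        num_dist_row=sum(partition_sizes),
    )


# Three workers with unevenly sized partitions.
partition_sizes = [1000, 1500, 500]
counts = [row_counts_for_worker(partition_sizes, r) for r in range(len(partition_sizes))]
for rank, c in enumerate(counts):
    print(f"worker {rank}: local={c.num_local_row}, distributed={c.num_dist_row}")
```

With a single total, a worker in this setup could pass either 1000 or 3000 but not both, which is the ambiguity the split is meant to remove.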