Conversation
898acdb to
aa3bcff
Compare
jameslamb
left a comment
There was a problem hiding this comment.
Thanks for your interest in LightGBM. It'll be at least a few days until someone's able to review this, as we're currently focused on finishing a release (#6439 (comment)).
Every CI run on this PR will require a maintainer manually approving it, because you've never contributed here before. Sorry for the inconvenience, but GitHub introduced this as a security measure a few years ago and we've decided to leave it enabled. We do occasionally receive malicious pull requests trying to use our CI resources 🙃 |
|
Thanks for the feedback! Some CI does appear to run, though, and some of it exhibited IO failures the first time around which I was trying to address 🤷 . |
borchero
left a comment
There was a problem hiding this comment.
Nice, looking forward to this performance improvement! I left a few comments, esp. regarding the changes to the application code
Co-authored-by: Oliver Borchert <me@borchero.com>
Co-authored-by: Oliver Borchert <me@borchero.com>
borchero
left a comment
There was a problem hiding this comment.
Thanks for the updates, I think we're getting there! 😁
jameslamb
left a comment
There was a problem hiding this comment.
Thanks for this!
I'll dismiss my prior blocking review so it doesn't block merging. But I would like @shiyu1994 or @guolinke to also approve before we merge this. They might remember a reason that this copy was used in #1124.
@borchero reviewed and all my concerns were addressed
src/application/application.cpp
Outdated
| ncol = Common::StringToArray<int>(result_reader.Lines()[0], '\t').size(); | ||
| } | ||
| std::vector<int> pred_leaf; | ||
| pred_leaf.reserve(nrow * ncol); |
There was a problem hiding this comment.
Shall we use resize instead of reserve?
shiyu1994
left a comment
There was a problem hiding this comment.
Thanks for the contribution. This should be very helpful.
|
This pull request has been automatically locked since there has not been any recent activity since it was closed. |
The C-API exposes
LGBM_BoosterRefitwhich receives the predicted leaf indices as a flat buffer that is laid out in the shape ofnrow x ncol. TheBoosting::RefitTreefunction, on the other hand, expects these indices as a nested vector object (std::vector<std::vector<int>>). Creating this nested object requires various additional allocations and amounts to an entire copy of the initial buffer doubling the memory requirements.This PR changes the API of
Boosting::RefitTreeto take a pointer to a flat buffer ofint32s, just like the C-API, thus avoiding the copy. This assumes that this part of the API is not regarded as stable. The C-API remains unchanged.