[CUDA] CUDA Quantized Training (fixes #5606) #5933
Conversation
…htGBM into quantized-training
fix msvc compilation errors and warnings
…quantized training
enlarge allowed package size to 100M
@guolinke This is ready. Please check.
```cpp
select_features_by_node_(select_features_by_node),
cuda_hist_(cuda_hist) {
  InitFeatureMetaInfo(train_data);
  if (has_categorical_feature_ && config->use_quantized_grad) {
```
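The commit list later in this thread marks CUDA quantized training with categorical features as unsupported, so the guard above presumably bails out for that combination. A minimal sketch of what such a guard body could look like, assuming LightGBM's `Log::Fatal` helper (the exact message is made up, not the merged code):

```cpp
// Hypothetical guard body: refuse the unsupported combination up front
// rather than silently building incorrect quantized histograms.
if (has_categorical_feature_ && config->use_quantized_grad) {
  Log::Fatal("Quantized training on CUDA is not supported"
             " with categorical features yet.");
}
```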
@jameslamb I've enlarged the size limit for the distributed package to 100M, because this PR adds a few more templates, which increases the size of the compiled files. Do you think that's OK?
Thanks. For now, since we're not distributing these CUDA wheels on PyPI, I think it's OK. Let's not let it block this PR. But if we pursue shipping a fat wheel with CUDA support precompiled in the future (like we talked about in Slack), 100MB will be a problem. There are limits on PyPI for both individual file size and cumulative project size. I don't know the exact numbers, but shipping 100MB wheels would put us in range of hitting them, I think. See these discussions:
There are also other concerns with such large wheels, e.g. for people using function-as-a-service platforms like AWS Lambda. I'll open a new issue in the next few days to discuss publishing wheels with CUDA support.
I removed the … (see LightGBM/.github/release-drafter.yml, lines 3 to 15 at f175ceb).
* add quantized training (first stage)
* add histogram construction functions for integer gradients
* add stochastic rounding
* update docs
* fix compilation errors by adding template instantiations
* update files for compilation
* fix compilation of gpu version
* initialize gradient discretizer before share states
* add a test case for quantized training
* add quantized training for data distributed training
* Delete origin.pred
* Delete ifelse.pred
* Delete LightGBM_model.txt
* remove useless changes
* fix lint error
* remove debug loggings
* fix mismatch of vector and allocator types
* remove changes in main.cpp
* fix bugs with uninitialized gradient discretizer
* initialize ordered gradients in gradient discretizer
* disable quantized training with gpu and cuda
* fix msvc compilation errors and warnings
* fix bug in data parallel tree learner
* make quantized training test deterministic
* make quantized training in test case more accurate
* refactor test_quantized_training
* fix leaf splits initialization with quantized training
* check distributed quantized training result
* add cuda gradient discretizer
* add quantized training for CUDA version in tree learner
* remove cuda compute capability 6.1 and 6.2
* fix parts of gpu quantized training errors and warnings
* fix build-python.sh to install locally built version
* fix memory access bugs
* fix lint errors
* mark cuda quantized training on cuda with categorical features as unsupported
* rename cuda_utils.h to cuda_utils.hu
* enable quantized training with cuda
* fix cuda quantized training with sparse row data
* allow using global memory buffer in histogram construction with cuda quantized training
* recover build-python.sh
* enlarge allowed package size to 100M
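Several of these commits (the CUDA gradient discretizer and stochastic rounding) implement the core quantized-training idea: map float gradients onto a small integer grid and round up or down randomly so the quantized value is unbiased in expectation. A rough sketch of such a kernel; the name, signature, and precomputed random buffer are assumptions, not the merged code:

```cuda
#include <cstdint>

// Hypothetical discretization kernel: quantize float gradients to int8
// with stochastic rounding. inv_scale maps the gradient range onto the
// integer grid; rand01 holds precomputed uniform values in [0, 1).
__global__ void DiscretizeGradientsKernel(const float* gradients,
                                          const float* rand01,
                                          int8_t* quantized,
                                          const float inv_scale,
                                          const int num_data) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < num_data) {
    const float g = gradients[i] * inv_scale;  // position on the integer grid
    const float low = floorf(g);
    // Round up with probability equal to the fractional part, so that
    // the quantized gradient equals g in expectation.
    quantized[i] = static_cast<int8_t>(g - low > rand01[i] ? low + 1.0f : low);
  }
}
```

Because the rounding error has zero mean, histogram bin sums over many rows stay close to their full-precision values, which is what keeps the low-bit histograms usable for split finding.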
This pull request has been automatically locked because there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Fixes #5606.
Adds quantized training for the CUDA version.
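For context on what the feature enables, here is a minimal sketch of training with quantized gradients on CUDA through the C API. The dataset path is a placeholder, and passing `use_quantized_grad`/`num_grad_quant_bins` together with `device_type=cuda` assumes the CUDA path accepts the same parameter names as the existing CPU quantized training:

```cpp
#include <LightGBM/c_api.h>
#include <cstdio>

int main() {
  // Placeholder dataset file; any LightGBM-readable training file works.
  DatasetHandle train_data = nullptr;
  if (LGBM_DatasetCreateFromFile("train.txt", "header=false label=0",
                                 nullptr, &train_data) != 0) {
    std::fprintf(stderr, "failed to load dataset\n");
    return 1;
  }
  // Quantized training on the CUDA device; parameter names assumed to
  // match the CPU quantized-training options.
  const char* params =
      "objective=regression device_type=cuda "
      "use_quantized_grad=true num_grad_quant_bins=4";
  BoosterHandle booster = nullptr;
  LGBM_BoosterCreate(train_data, params, &booster);
  int finished = 0;
  for (int iter = 0; iter < 100 && !finished; ++iter) {
    LGBM_BoosterUpdateOneIter(booster, &finished);
  }
  LGBM_BoosterFree(booster);
  LGBM_DatasetFree(train_data);
  return 0;
}
```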