Feature/better defaults improves performance through better torch dataloader usage #83
Conversation
Commits:
- pack `resolve_data` and `batch_to_device`
- force y to be a vector
- add validation_data test with `num_workers`
- clean commented lines
- fix pretrain dataset has no `y`
- lighten the num_workers test
- better if else syntax
- add `resolve_data` tests
- secure empty values for cat_idx and cat_dims (sketched below)
- add tests for `resolve_data`
- …nto feature/better_defaults
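The "secure empty values" commit suggests normalizing missing categorical metadata before it reaches the dataset code. A hedged sketch of the idea (the helper name and defaults are hypothetical, not the package's actual code):

    # Hypothetical helper: coerce NULL categorical metadata to zero-length
    # integer vectors, so length() and seq_along() work without special cases.
    secure_cat_values <- function(cat_idx = NULL, cat_dims = NULL) {
      list(
        cat_idx  = if (is.null(cat_idx)) integer(0) else as.integer(cat_idx),
        cat_dims = if (is.null(cat_dims)) integer(0) else as.integer(cat_dims)
      )
    }

    secure_cat_values()                                        # both empty
    secure_cat_values(cat_idx = c(1, 3), cat_dims = c(4, 7))   # passthrough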
Codecov Report

|          | main | #83   | +/- |
|----------|------|-------|-----|
| Coverage | ?    | 0.00% |     |
| Files    | ?    | 10    |     |
| Lines    | ?    | 1054  |     |
| Branches | ?    | 0     |     |
| Hits     | ?    | 0     |     |
| Misses   | ?    | 1054  |     |
| Partials | ?    | 0     |     |

Continue to review the full report at Codecov.
- …e two twin models
- reduce early_stopping_tolerance in test as it was not adapted to the data (see the sketch below)
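A hedged illustration of that early-stopping commit (assuming `tabnet_fit()` forwards the early-stopping options to `tabnet_config()`; the data and values here are made up, not the test's actual numbers):

    library(tabnet)

    x <- data.frame(matrix(rnorm(1000), ncol = 10))
    y <- rnorm(100)

    # If early_stopping_tolerance is large relative to the loss scale of the
    # data, every epoch counts as "no improvement" and training stops at once;
    # reducing it lets early stopping behave as intended.
    fit <- tabnet_fit(
      x, y,
      epochs = 100,
      valid_split = 0.2,
      early_stopping_patience = 5,
      early_stopping_tolerance = 1e-4
    )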
dfalbel left a comment:
Looks great to me.
Unfortunately, the dataset/dataloader interface is adding a lot of overhead here. What would actually make it really fast is to get rid of it: datasets and dataloaders only make sense when `.getitem` is slow enough to compensate for their overhead, like reading an image from disk.
See also mlverse/torch#776
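To make the overhead point concrete, a minimal sketch (not code from this PR) contrasting the two routes over the same in-memory tensors:

    library(torch)

    x <- torch_randn(10000, 20)
    y <- torch_randn(10000)

    # Route 1: dataset/dataloader. Each batch pays .getitem dispatch and
    # collation costs, which dominate when items are cheap in-memory rows.
    ds <- tensor_dataset(x = x, y = y)
    dl <- dataloader(ds, batch_size = 256, shuffle = TRUE)
    coro::loop(for (b in dl) {
      # b$x and b$y are the collated batch tensors
    })

    # Route 2: plain tensor slicing, no dataloader machinery at all.
    n <- x$size(1)
    ord <- sample(n)  # 1-based R indices
    for (start in seq(1, n, by = 256)) {
      i <- ord[start:min(start + 255, n)]
      xb <- x[i, ]
      yb <- y[i]
    }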
    test-coverage:
      runs-on: ubuntu-latest
      env:
        GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
I think we need `TORCH_TEST=1` and `TORCH_INSTALL=1` so coverage is correct.
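For concreteness, the env block would then look something like this (a sketch; `TORCH_TEST=1` enables the torch-dependent tests and `TORCH_INSTALL=1` lets torch install its backend non-interactively on CI):

    test-coverage:
      runs-on: ubuntu-latest
      env:
        GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
        TORCH_TEST: 1
        TORCH_INSTALL: 1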
Thanks a lot for the push! (I must admit I'm lost in that domain.)
Sure! (I was a bit afraid of the image size.)

Co-authored-by: Daniel Falbel <[email protected]>
- align with Luz as a reference
This is an improvement step in usability and performance:

- better `batch_size` by default, for a 70% performance gain
- pack `resolve_data` into dataloader and dataset preprocessing, and thus fix "`tabnet_explain()` do not provides correct result" #82 (as per test result)

Conclusion: as expected, 4 workers is not enough to accelerate preprocessing. (Note: testing `num_workers` with GPU is limited here, because my GPU cannot afford more than two R processes in memory; see the sketch below.)
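As a torch-level illustration of that conclusion (a sketch, not the benchmark actually run here): with tensors already in RAM, worker processes add startup and inter-process transfer costs that a small `num_workers` cannot amortize.

    library(torch)

    ds <- tensor_dataset(x = torch_randn(50000, 20), y = torch_randn(50000))

    time_one_pass <- function(workers) {
      dl <- dataloader(ds, batch_size = 1024, num_workers = workers)
      system.time(coro::loop(for (b in dl) NULL))[["elapsed"]]
    }

    time_one_pass(0)  # load batches in the main process
    time_one_pass(4)  # 4 worker R processes; often no faster for in-memory data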