
@tomaarsen (Member)

Hello!

Pull Request overview

  • When pytest is used:
    • Always display test coverage.
    • Always display the 10 slowest tests.
  • Add passing data.py tests.
  • Add skipped tests for onnx that fail currently.
  • Ensure that model.onnx is always deleted, even on test failures.

Details

pytest improvements

This PR introduces `.coveragerc`, the recommended configuration file for `coverage`. It's fairly standard, except for the exclusion of `if TYPE_CHECKING:` branches: because these are never true at runtime, the `True` branch is never executed for users, so we can safely exclude these branches from coverage.
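
A minimal sketch of what such a configuration looks like (the actual file may contain additional settings; this just uses coverage.py's standard [run] and [report] sections):

[run]
source = setfit
branch = True

[report]
exclude_lines =
    pragma: no cover
    if TYPE_CHECKING: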

Additionally, setup.cfg is expanded with:

[tool:pytest]
testpaths = tests
addopts = --cov=setfit --durations=10

This means that running `pytest` is now equivalent to running `pytest --cov=setfit --durations=10 ./tests`.

  • The `--cov=setfit` option prints an ASCII coverage report at the end of the pytest run, like so:
    ---------- coverage: platform win32, python 3.10.8-final-0 -----------
    Name                                 Stmts   Miss Branch BrPart  Cover
    ----------------------------------------------------------------------
    src\setfit\__init__.py                   5      0      0      0   100%
    src\setfit\data.py                     114      0     54      0   100%
    src\setfit\exporters\__init__.py         0      0      0      0   100%
    src\setfit\exporters\onnx.py            95     22     34      8    72%
    src\setfit\exporters\utils.py            4      2      0      0    50%
    src\setfit\integrations.py              23      0      4      2    93%
    src\setfit\logging.py                  142     61     26      2    52%
    src\setfit\modeling.py                 313     50    102     20    81%
    src\setfit\pipeline.py                   9      9      0      0     0%
    src\setfit\trainer.py                  194     25     86     22    82%
    src\setfit\trainer_distillation.py      69     17     22      7    67%
    src\setfit\utils.py                     55     26      8      0    52%
    ----------------------------------------------------------------------
    TOTAL                                 1023    212    336     61    77%
    
  • The `--durations=10` option displays the 10 slowest tests, like so:
    ====================================================================================== slowest 10 durations ====================================================================================== 
    14.96s call     tests/exporters/test_onnx.py::test_export_onnx_sklearn_head
    9.91s call     tests/test_trainer.py::TrainerHyperParameterOptunaIntegrationTest::test_hyperparameter_search
    4.96s call     tests/test_trainer_distillation.py::DistillationSetFitTrainerTest::test_trainer_works_with_default_columns
    3.80s setup    tests/test_modeling.py::SetFitModelDifferentiableHeadTest::test_max_length_is_larger_than_max_acceptable_length
    3.69s call     tests/test_trainer.py::SetFitTrainerTest::test_raise_when_metric_value_is_invalid
    3.07s call     tests/test_trainer.py::SetFitTrainerTest::test_trainer_works_with_model_init
    2.97s call     tests/test_trainer.py::SetFitTrainerTest::test_trainer_support_callable_as_metric
    2.92s call     tests/test_trainer.py::SetFitTrainerMultilabelTest::test_trainer_multilabel_support_callable_as_metric
    2.66s call     tests/test_data.py::test_create_fewshot_splits_with_augmentation
    2.23s call     tests/test_trainer_distillation.py::DistillationSetFitTrainerTest::test_column_mapping_multilabel
    

Additionally, `pytest --cov-report=html` can be used to create an HTML report in a `coverage_report_html` folder.
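
One way to point the HTML report at that folder (a sketch; the PR may configure this differently) is coverage.py's standard [html] section in `.coveragerc`:

[html]
directory = coverage_report_html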

New tests

For `data.py` I have introduced several new tests. Although this code likely isn't frequently used by SetFit users, I wanted to understand it better, and what better way to understand code than to verify that it works correctly? 😄

For the ONNX tests I created a simple Trainer that trains a differentiable head, and then exported that model to ONNX in the same way as the existing test. The reasoning was that exporting a model was previously only tested for the sklearn head. I discovered that the tests fail regardless of the `out_features` passed to the differentiable head.

FAILED tests/exporters/test_onnx.py::test_export_onnx_torch_head[1] - ValueError: Using a target size (torch.Size([4])) that is different to the input size (torch.Size([4, 1])) is deprecated. Please ensure they have the same size.
FAILED tests/exporters/test_onnx.py::test_export_onnx_torch_head[2] - assert False
FAILED tests/exporters/test_onnx.py::test_export_onnx_torch_head[3] - assert False

With `out_features=1` there is an issue during training; with `out_features=2` and `3`, the following errors pop up, respectively:

            # Compare the results and ensure that we get the same predictions.
>           assert np.array_equal(onnx_preds, pytorch_preds)
E           assert False
E            +  where False = <function array_equal at 0x000001559F262EF0>(array([[7.2937156e-03, 9.9270624e-01],\n       [9.9964154e-01, 3.5840381e-04]], dtype=float32), array([1, 0], dtype=int64))
E            +    where <function array_equal at 0x000001559F262EF0> = np.array_equal
            # Compare the results and ensure that we get the same predictions.
>           assert np.array_equal(onnx_preds, pytorch_preds)
E           assert False
E            +  where False = <function array_equal at 0x000001559F262EF0>(array([[4.0040349e-04, 9.4026297e-01, 5.9336655e-02],\n       [8.8793278e-01, 9.9355258e-02, 1.2711961e-02]], dtype=float32), array([1, 0], dtype=int64))
E            +    where <function array_equal at 0x000001559F262EF0> = np.array_equal

As you can see, the PyTorch model predictions stay 1-dimensional: simply an array of class indices whose length is the number of samples to predict for. The ONNX predictions, however, are 2-dimensional with shape (num_samples, out_features), i.e. per-class probabilities. Regardless of whether this is an issue on the PyTorch side, the ONNX model should always be equivalent to the PyTorch model.
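
To make the mismatch concrete, here is a small standalone illustration using the arrays from the `out_features=2` failure above (purely illustrative; this is neither the test code nor the fix applied in this PR):

import numpy as np

# Values copied from the failure output above (out_features=2).
onnx_preds = np.array(
    [[7.2937156e-03, 9.9270624e-01],
     [9.9964154e-01, 3.5840381e-04]], dtype=np.float32)  # shape (num_samples, out_features)
pytorch_preds = np.array([1, 0], dtype=np.int64)          # shape (num_samples,)

# The shapes (and contents) differ, so the equality check in the test fails...
assert not np.array_equal(onnx_preds, pytorch_preds)
# ...and the outputs only line up after reducing the probabilities to class indices.
assert np.array_equal(onnx_preds.argmax(axis=-1), pytorch_preds)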

I've skipped these tests with the reason "ONNX exporting of SetFit model with Torch head not yet supported."

Additionally, I've wrapped the body of the ONNX tests in a try/finally, with the model.onnx file deleted in the finally clause. That way, the file gets deleted even when the test fails. A simplified sketch of this structure is shown below.
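
The overall shape of the skipped torch-head test looks roughly like this (a simplified sketch: `_train_export_and_predict` is a placeholder for the real test body, which trains a SetFit model with a differentiable head, exports it to ONNX, and runs both models):

import os

import numpy as np
import pytest


def _train_export_and_predict(out_features, output_path):
    """Placeholder for the real test body: train a SetFit model with a
    differentiable head with `out_features` outputs, export it to ONNX at
    `output_path`, and return (onnx_preds, pytorch_preds)."""
    raise NotImplementedError


@pytest.mark.skip(reason="ONNX exporting of SetFit model with Torch head not yet supported.")
@pytest.mark.parametrize("out_features", [1, 2, 3])
def test_export_onnx_torch_head(out_features):
    output_path = "model.onnx"
    try:
        onnx_preds, pytorch_preds = _train_export_and_predict(out_features, output_path)
        # Compare the results and ensure that we get the same predictions.
        assert np.array_equal(onnx_preds, pytorch_preds)
    finally:
        # Delete the exported file even when the assertions (or the export itself) fail.
        if os.path.exists(output_path):
            os.remove(output_path)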

All in all, this has improved the test coverage from 72% to 77%. It would increase further once the ONNX tests pass, and I've now identified some key code that is never executed in the tests, such as `SetFitBaseModel`, `SKLearnWrapper`, `SetFitPipeline`, and the majority of the `logging.py` code.

  • Tom Aarsen

Commits:
  • `pytest` now displays coverage stats and `pytest --cov-report html` produces a useful coverage report
  • However, these must be skipped currently, as they all fail!

@tomaarsen added the CI (Regarding the Continuous Integration) and tests (Regarding the test suite) labels on Dec 20, 2022.

tomaarsen and others added 2 commits on December 20, 2022:
  • For some reason, locally I get different warnings for ./make quality than the CI does.
@tomaarsen (Member, Author)

My local machine's isort and the CI's isort seem to disagree on what counts as "correctly formatted". I'll investigate.

@tomaarsen (Member, Author) commented on Dec 20, 2022

Resolved it: one of the pytest plugins I was experimenting with had created a folder named `datasets`, so isort considered `datasets` a local package. This caused it, for example, to be grouped alongside `from setfit import ...` rather than alongside `from sentence_transformers import ...`, as illustrated below.
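
For illustration (hypothetical imports, not actual SetFit code), the difference in grouping looks like this:

# With the stray `datasets` folder present, isort treated `datasets` as first-party:
from sentence_transformers import SentenceTransformer

from datasets import Dataset
from setfit import SetFitModel

# With the folder removed, `datasets` is grouped with the other third-party packages:
from datasets import Dataset
from sentence_transformers import SentenceTransformer

from setfit import SetFitModel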

Should be all set now.

@lewtun (Member) left a comment

Thanks for adding this feature and catching these issues with the ONNX exporter, @tomaarsen! The PR LGTM, so I'll go ahead and merge this :)

@lewtun merged commit 35c0511 into huggingface:main on Dec 28, 2022.
@tomaarsen deleted the enhancement/improve_tests branch on December 28, 2022 at 09:01.