Error while running classification regression test with DeiT-Tiny template #2567
Description
The following test case fails with an error:
- tests/regression/classification/test_classification.py::TestRegressionMultiClassClassification::test_otx_train_cls_incr[Custom_Image_Classification_DeiT-Tiny]
Use the command below to reproduce the issue on a local machine.
$ CI_DATA_ROOT=<absolute-path-to-ci-datasets> tox -vvv -e tests-cls-py310-pt1 -- tests/regression/classification/test_classification.py::TestRegressionMultiClassClassification::test_otx_train[Custom_Image_Classification_DeiT-Tiny] tests/regression/classification/test_classification.py::TestRegressionMultiClassClassification::test_otx_train_cls_incr[Custom_Image_Classification_DeiT-Tiny]
Error log
2023-10-19T05:44:33.8108903Z 2023-10-19 05:44:31,331 - mmcls - INFO - Epoch(val) [5][32] accuracy_top-1: 0.6125, accuracy_top-5: 0.9400, airplane accuracy: 0.7000, automobile accuracy: 0.6250, bird accuracy: 0.5750, cat accuracy: 0.3750, deer accuracy: 0.7000, dog accuracy: 0.5500, frog accuracy: 0.5250, horse accuracy: 0.6750, ship accuracy: 0.8250, truck accuracy: 0.5750, mean accuracy: 0.6125, accuracy: 0.6125, current_iters: 160
2023-10-19T05:44:33.8111028Z 2023-10-19 05:44:31,332 - mmcls - INFO - MemCacheHandlerBase uses 0 / 0 (0.0%) memory pool and store 0 items.
2023-10-19T05:44:33.8111701Z 2023-10-19 05:44:31,333 - mmcls - INFO -
2023-10-19T05:44:33.8112148Z Best Score: 0.6125, Current Score: 0.6125, Patience: 1 Count: 0
2023-10-19T05:44:33.8112617Z Process SpawnProcess-1:
2023-10-19T05:44:33.8112933Z Traceback (most recent call last):
2023-10-19T05:44:33.8119433Z File "/home/validation/actions-runner/_work/_tool/Python/3.10.13/x64/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
2023-10-19T05:44:33.8119937Z self.run()
2023-10-19T05:44:33.8120475Z File "/home/validation/actions-runner/_work/_tool/Python/3.10.13/x64/lib/python3.10/multiprocessing/process.py", line 108, in run
2023-10-19T05:44:33.8121113Z self._target(*self._args, **self._kwargs)
2023-10-19T05:44:33.8121841Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/utils/multi_gpu.py", line 269, in run_child_process
2023-10-19T05:44:33.8122436Z train_func()
2023-10-19T05:44:33.8123035Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/train.py", line 290, in train
2023-10-19T05:44:33.8123575Z task.train(
2023-10-19T05:44:33.8124217Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/task.py", line 213, in train
2023-10-19T05:44:33.8124830Z results = self._train_model(dataset)
2023-10-19T05:44:33.8125600Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/task.py", line 410, in _train_model
2023-10-19T05:44:33.8126298Z train_model(
2023-10-19T05:44:33.8126889Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/apis/train.py", line 233, in train_model
2023-10-19T05:44:33.8127455Z runner.run(data_loaders, cfg.workflow)
2023-10-19T05:44:33.8128128Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
2023-10-19T05:44:33.8128717Z epoch_runner(data_loaders[i], **kwargs)
2023-10-19T05:44:33.8129457Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/common/adapters/mmcv/runner.py", line 81, in train
2023-10-19T05:44:33.8130102Z self.run_iter(data_batch, train_mode=True, **kwargs)
2023-10-19T05:44:33.8131163Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
2023-10-19T05:44:33.8131871Z outputs = self.model.train_step(data_batch, self.optimizer,
2023-10-19T05:44:33.8132646Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/parallel/distributed.py", line 63, in train_step
2023-10-19T05:44:33.8133262Z output = self.module.train_step(*inputs[0], **kwargs[0])
2023-10-19T05:44:33.8134116Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 29, in train_step
2023-10-19T05:44:33.8134860Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:33.8135832Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 105, in train_step
2023-10-19T05:44:33.8136574Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:33.8137312Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 139, in train_step
2023-10-19T05:44:33.8137890Z losses = self(**data)
2023-10-19T05:44:34.5742421Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:34.5743463Z return forward_call(*input, **kwargs)
2023-10-19T05:44:34.5746123Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 149, in new_func
2023-10-19T05:44:34.5746887Z output = old_func(*new_args, **new_kwargs)
2023-10-19T05:44:34.5747809Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 83, in forward
2023-10-19T05:44:34.5748419Z return self.forward_train(img, **kwargs)
2023-10-19T05:44:34.5749327Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/custom_image_classifier.py", line 83, in forward_train
2023-10-19T05:44:34.5750118Z loss = self.head.forward_train(x, gt_label)
2023-10-19T05:44:34.5750921Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/heads/vision_transformer_head.py", line 122, in forward_train
2023-10-19T05:44:34.5751649Z losses = self.loss(cls_score, gt_label, **kwargs)
2023-10-19T05:44:34.5752532Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/heads/custom_vision_transformer_head.py", line 24, in loss
2023-10-19T05:44:34.5753340Z loss = self.compute_loss(cls_score, gt_label, feature=feature)
2023-10-19T05:44:34.5754077Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:34.5754669Z return forward_call(*input, **kwargs)
2023-10-19T05:44:34.5755469Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/losses/ib_loss.py", line 61, in forward
2023-10-19T05:44:34.5756240Z feature = torch.sum(torch.abs(feature), 1).reshape(-1, 1)
2023-10-19T05:44:34.5756622Z TypeError: abs(): argument 'input' (position 1) must be Tensor, not NoneType
2023-10-19T05:44:34.5756920Z Traceback (most recent call last):
2023-10-19T05:44:34.5757441Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/bin/otx", line 8, in <module>
2023-10-19T05:44:34.5757886Z sys.exit(main())
2023-10-19T05:44:34.5758476Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/cli.py", line 77, in main
2023-10-19T05:44:34.5759029Z results = globals()[f"otx_{name}"]()
2023-10-19T05:44:34.5759668Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/train.py", line 192, in main
2023-10-19T05:44:34.5760262Z return train(exit_stack)
2023-10-19T05:44:34.5760882Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/train.py", line 290, in train
2023-10-19T05:44:34.5761420Z task.train(
2023-10-19T05:44:34.5762077Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/task.py", line 213, in train
2023-10-19T05:44:34.5762681Z results = self._train_model(dataset)
2023-10-19T05:44:34.5763434Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/task.py", line 410, in _train_model
2023-10-19T05:44:34.5764105Z train_model(
2023-10-19T05:44:34.5764743Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/apis/train.py", line 233, in train_model
2023-10-19T05:44:34.5765310Z runner.run(data_loaders, cfg.workflow)
2023-10-19T05:44:34.5765989Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
2023-10-19T05:44:34.5766576Z epoch_runner(data_loaders[i], **kwargs)
2023-10-19T05:44:36.2499831Z
2023-10-19T05:44:36.2501500Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/common/adapters/mmcv/runner.py", line 81, in train
2023-10-19T05:44:36.2502702Z self.run_iter(data_batch, train_mode=True, **kwargs)
2023-10-19T05:44:36.2503929Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
2023-10-19T05:44:36.2505326Z outputs = self.model.train_step(data_batch, self.optimizer,
2023-10-19T05:44:36.2506559Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/parallel/distributed.py", line 63, in train_step
2023-10-19T05:44:36.2507599Z output = self.module.train_step(*inputs[0], **kwargs[0])
2023-10-19T05:44:36.2509023Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 29, in train_step
2023-10-19T05:44:36.2510281Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:36.2511711Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 105, in train_step
2023-10-19T05:44:36.2512971Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:36.2514154Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 139, in train_step
2023-10-19T05:44:36.2515125Z losses = self(**data)
2023-10-19T05:44:36.2516161Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:36.2517127Z return forward_call(*input, **kwargs)
2023-10-19T05:44:36.2518205Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 149, in new_func
2023-10-19T05:44:36.2519391Z 2023-10-19 05:44:31,333 | INFO : Balanced sampler will select balanced samples 32 times
2023-10-19T05:44:36.2520296Z 2023-10-19 05:44:34,659 | WARNING : Some of child processes are terminated abnormally. process exits.
2023-10-19T05:44:36.2520905Z output = old_func(*new_args, **new_kwargs)
2023-10-19T05:44:36.2522035Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 83, in forward
2023-10-19T05:44:36.2523017Z return self.forward_train(img, **kwargs)
2023-10-19T05:44:36.2524490Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/custom_image_classifier.py", line 83, in forward_train
2023-10-19T05:44:36.2525802Z loss = self.head.forward_train(x, gt_label)
2023-10-19T05:44:36.2527163Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/heads/vision_transformer_head.py", line 122, in forward_train
2023-10-19T05:44:36.2528266Z losses = self.loss(cls_score, gt_label, **kwargs)
2023-10-19T05:44:36.2529715Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/heads/custom_vision_transformer_head.py", line 24, in loss
2023-10-19T05:44:36.2531111Z loss = self.compute_loss(cls_score, gt_label, feature=feature)
2023-10-19T05:44:36.2532306Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:36.2533267Z return forward_call(*input, **kwargs)
2023-10-19T05:44:36.2534690Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/losses/ib_loss.py", line 61, in forward
2023-10-19T05:44:36.2536044Z feature = torch.sum(torch.abs(feature), 1).reshape(-1, 1)
2023-10-19T05:44:36.2536660Z TypeError: abs(): argument 'input' (position 1) must be Tensor, not NoneType
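Reading the traceback, `forward_train` in mmcls's `vision_transformer_head.py` calls `self.loss(cls_score, gt_label, **kwargs)` without a feature, so the `feature` that `custom_vision_transformer_head.py` passes on to `ib_loss.py` appears to be `None` by the time it reaches `torch.abs`. The sketch below is a torch-free stand-in for that failure mode, not the OTX code: `feature_l1_norms` and `head_loss` are hypothetical names, and plain nested lists replace tensors. It shows how a guard on the optional argument would turn the late `TypeError` into an explicit message.

```python
from typing import List, Optional, Sequence


def feature_l1_norms(
    feature: Optional[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """List-based stand-in for torch.sum(torch.abs(feature), 1).reshape(-1, 1).

    Hypothetical helper: it only mirrors the expression from the traceback.
    """
    if feature is None:
        # Fail early with a message that names the missing input,
        # instead of letting abs() raise a TypeError on None.
        raise ValueError(
            "this loss needs the feature vector, but the head passed feature=None"
        )
    # One L1 norm per sample, kept as a column (shape N x 1).
    return [[sum(abs(value) for value in row)] for row in feature]


def head_loss(cls_score, gt_label, feature=None):
    """Hypothetical head: forwards an optional feature to the loss."""
    return feature_l1_norms(feature)


if __name__ == "__main__":
    print(feature_l1_norms([[1.0, -2.0], [0.5, 0.5]]))
    try:
        # Mirrors a caller that never supplies a feature.
        head_loss([[0.1, 0.9]], [1])
    except ValueError as err:
        print(f"caught: {err}")
```

Whether the right fix is a guard like this, or making the head pass the feature through, is for the maintainers to decide; the sketch only illustrates where the `None` seems to come from.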