-
Notifications
You must be signed in to change notification settings - Fork 446
Closed
Description
Hello,
I'm trying to use the receipt for training the X-UMX model on the musdb18 dataset but the following exception raises after finishing the validation of the first epoch:
Traceback (most recent call last): File "/opt/tuni/worktmp/repos/asteroid/egs/musdb18/X-UMX/train.py", line 498, in <module> main(arg_dic, plain_args) File "/opt/tuni/worktmp/repos/asteroid/egs/musdb18/X-UMX/train.py", line 465, in main trainer.fit(system) File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 520, in fit call._call_and_handle_interrupt( File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs) File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 92, in launch return function(*args, **kwargs) File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 559, in _fit_impl self._run(model, ckpt_path=ckpt_path) File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 935, in _run results = self._run_stage() File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 978, in _run_stage self.fit_loop.run() File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run self.on_advance_end() File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 377, in on_advance_end self.epoch_loop.update_lr_schedulers("epoch", update_plateau_schedulers=not self.restarting) File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 300, in update_lr_schedulers self._update_learning_rates(interval=interval, update_plateau_schedulers=update_plateau_schedulers) File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 350, in _update_learning_rates call._call_lightning_module_hook( File "/worktmp/anaconda3/envs/py39torch20/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 142, in _call_lightning_module_hook output = fn(*args, **kwargs) TypeError: lr_scheduler_step() missing 1 required positional argument: 'metric'
I am using Pytorch 2.0.0 and Lightning 2.0.2. In order to make the model work with Pytorch 2.0.0 I've made some minor changes in the X-UMX model (see #662) and I've reduced the batch size to make the model fit in my local GPU, but I don't think I've made any changes that should affect the Lightning training.
Maybe this could be related to be using recent versions of Pytorch and Lightning? I have no experience working with Lightning.
Best regards,
David
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels