
[BUG] remove erroneous .detach() in TiDE multivariate prediction path#2208

Open
haoyu-haoyu wants to merge 1 commit into sktime:main from haoyu-haoyu:fix/tide-gradient-detach

Conversation

@haoyu-haoyu

Summary

In the TiDE model's multivariate target code path (line 333), prediction tensors are processed with:

prediction = [i.clone().detach().requires_grad_(True) for i in prediction]

This is incorrect because:

  1. .detach() disconnects the tensor from the computational graph
  2. .requires_grad_(True) on a detached tensor creates a new leaf — it does NOT reconnect to the encoder/decoder graph
  3. Gradients cannot flow back through the prediction during backpropagation
  4. The model effectively cannot learn for multivariate targets
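The failure mode above can be reproduced with a minimal sketch (hypothetical shapes, standalone tensors rather than the actual TiDE modules):

```python
import torch

# `w` stands in for an encoder parameter; `prediction` is produced by a
# differentiable op, so it is attached to the autograd graph.
w = torch.randn(4, 2, 3, requires_grad=True)
prediction = (w * 2).unbind(0)

# Buggy path: .detach() severs the graph; .requires_grad_(True) only turns
# each copy into a fresh leaf -- it does NOT reconnect it to `w`.
broken = [p.clone().detach().requires_grad_(True) for p in prediction]
loss = sum(b.sum() for b in broken)
loss.backward()

print(w.grad)                # None -- nothing flowed back to the "encoder"
print(broken[0].grad.shape)  # gradients stop at the detached leaves
```

After `backward()`, the detached leaves receive gradients but `w.grad` stays `None`, which is exactly why the model cannot learn on this path.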

Fix

- prediction = [i.clone().detach().requires_grad_(True) for i in prediction]
+ prediction = [i.clone().requires_grad_(True) for i in prediction]

.clone() creates a copy (preserving gradient connection), and .requires_grad_(True) ensures gradient tracking — without the graph-breaking .detach().
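A sketch of the fixed path, under the same hypothetical setup (note the clone already inherits `requires_grad=True` from the graph, so the explicit `.requires_grad_(True)` is redundant here, as the later discussion concludes):

```python
import torch

w = torch.randn(4, 2, 3, requires_grad=True)  # stand-in encoder parameter
prediction = (w * 2).unbind(0)

# Fixed path: .clone() is a differentiable copy, so the copies stay
# attached to the graph and gradients reach `w`.
fixed = [p.clone() for p in prediction]
loss = sum(f.sum() for f in fixed)
loss.backward()

print(w.grad is not None)  # True -- the "encoder" parameter gets gradients
print(torch.allclose(w.grad, torch.full_like(w, 2.0)))  # d(sum(2w))/dw = 2
```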

Test plan

  • TiDE model training converges for multivariate targets
  • pytest tests/ -k tide passes
  • Gradient flow verified: loss.backward() propagates to encoder parameters

@codecov

codecov bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@edbdeb4). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2208   +/-   ##
=======================================
  Coverage        ?   86.62%           
=======================================
  Files           ?      165           
  Lines           ?     9736           
  Branches        ?        0           
=======================================
  Hits            ?     8434           
  Misses          ?     1302           
  Partials        ?        0           
Flag Coverage Δ
cpu 86.62% <100.00%> (?)
pytest 86.62% <100.00%> (?)


@phoeenniixx phoeenniixx added the bug Something isn't working label Mar 19, 2026
@phoeenniixx phoeenniixx changed the title fix: remove erroneous .detach() in TiDE multivariate prediction path [BUG] remove erroneous .detach() in TiDE multivariate prediction path Mar 19, 2026
phoeenniixx
phoeenniixx previously approved these changes Mar 19, 2026
Member

@phoeenniixx phoeenniixx left a comment


Thanks a lot for catching this!
Although, should there be a .clone() at all? It takes up more memory, no?
What do you think? @haoyu-haoyu

Also FYI @PranavBhatP @fkiraly @agobbifbk

@haoyu-haoyu
Author

Thanks for the approval @phoeenniixx! Good question about .clone().

Looking at the code flow:

  • prediction.permute(2, 0, 1) creates a view (not a copy)
  • for i in prediction yields slices that are also views into the same memory
  • Without .clone(), these views share the underlying storage

Whether .clone() is needed depends on whether transform_output() does any in-place operations on the prediction tensors. If it does, modifying one slice could corrupt another since they share memory.

My recommendation: keep .clone() for safety since we can't easily guarantee no in-place ops downstream. The memory overhead is minimal (one extra copy of the prediction tensor), and it prevents subtle bugs if the code is later modified.

That said, if you'd prefer to remove it for memory efficiency, the minimal change would be:

prediction = [i.requires_grad_(True) for i in prediction]

This keeps the views (no copy) and just enables gradient tracking. Happy to update either way!
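The aliasing concern can be demonstrated in isolation (hypothetical shapes, plain tensors):

```python
import torch

# permute() returns a view, and iterating over it yields slices that all
# share the original storage.
x = torch.zeros(2, 3, 4)
views = list(x.permute(2, 0, 1))        # 4 views of shape (2, 3)
views[0].add_(1.0)                      # in-place write through one view...
print(bool(x[:, :, 0].eq(1.0).all()))   # True -- ...shows up in `x`

# .clone() gives each slice its own storage, so in-place ops stay local.
y = torch.zeros(2, 3, 4)
copies = [v.clone() for v in y.permute(2, 0, 1)]
copies[0].add_(1.0)
print(bool(y.eq(0.0).all()))            # True -- `y` is untouched
```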

@haoyu-haoyu
Author

After tracing the full call chain more carefully, I think you're right — .clone() is NOT needed here:

  1. transform_output() → loss.rescale_parameters() → encoder() all create new tensors (de-normalization is not in-place), so views are safe
  2. .requires_grad_(True) is also redundant — the views already inherit requires_grad=True from the autograd graph through permute()

The simplest correct code is just:

prediction = list(prediction)

Want me to update the PR to remove both .clone() and .requires_grad_(True)?
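Both claims can be checked with a small standalone sketch (hypothetical shapes, not the actual TiDE modules):

```python
import torch

# Views produced by permute() stay on the autograd graph, so converting
# the prediction to a list is enough by itself.
w = torch.randn(2, 3, 4, requires_grad=True)
out = w * 2                        # stands in for the raw model output
prediction = out.permute(2, 0, 1)  # a view, still attached to the graph

pred_list = list(prediction)
print(all(p.requires_grad for p in pred_list))  # True -- inherited

sum(p.sum() for p in pred_list).backward()
print(w.grad is not None)          # True -- gradients reach the parameter
```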

In the multivariate target code path, prediction tensors are cloned
with .detach().requires_grad_(True).  The .detach() call disconnects
the tensor from the computational graph, creating an isolated leaf
variable.  The subsequent .requires_grad_(True) enables gradient
tracking on this leaf but does NOT reconnect it to the encoder/decoder
graph — so gradients cannot flow back through the prediction during
backpropagation, effectively preventing the model from learning for
multivariate targets.

Fix: remove .detach() so .clone().requires_grad_(True) preserves the
gradient connection while still creating a copy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@haoyu-haoyu
Author

Updated — simplified to prediction = list(prediction) per your suggestion.

The .clone() and .requires_grad_(True) were both unnecessary:

  • Views from permute() already share the autograd graph
  • transform_output() → rescale_parameters() creates new tensors (no in-place ops)
  • Gradient tracking is inherited from the original model output

One line, zero overhead.
