fix dbt retry for microbatch models#12525

Merged
aahel merged 3 commits into main from fix/microbatch-model
Feb 25, 2026

@aahel aahel commented Feb 23, 2026

Resolves #11423

Problem

When a microbatch model fails during dbt run and dbt retry is executed later (e.g., days after), the retry uses the current date instead of the original failure date to compute which batches to run. This results in processing completely different batches than those that originally failed.

Root cause: In retry.py, the batch_map is only populated when a microbatch model had both successful and failed batches (len(result.batch_results.successful) != 0). When all batches fail (or the model was entirely skipped due to an upstream failure), the model is excluded from batch_map, so previous_batch_results is never set on the node. This causes get_batches() in run.py to fall through to normal batch computation, which uses get_invocation_started_at() — the current retry time, not the original run time.
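The filtering described above can be illustrated with a simplified sketch. This is not the actual code in retry.py: BatchResults and build_batch_map are hypothetical stand-ins, and only the `len(...successful) != 0` condition is taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BatchResults:
    """Hypothetical stand-in for the per-model batch results of a previous run."""
    successful: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def build_batch_map(results: Dict[str, BatchResults]) -> Dict[str, BatchResults]:
    # Mirrors the condition quoted above: a model is kept only if it had
    # at least one successful batch in the previous run.
    return {uid: br for uid, br in results.items() if len(br.successful) != 0}


previous_run = {
    "model.partially_failed": BatchResults(successful=["2025-03-18"], failed=["2025-03-19"]),
    "model.all_failed": BatchResults(failed=["2025-03-18", "2025-03-19"]),
}

batch_map = build_batch_map(previous_run)
# The all-failed model is missing from batch_map, so nothing tells the
# retry which batches it originally attempted.
```

Under these assumptions, only the partially failed model keeps its batch history; the fully failed one is recomputed from scratch on retry.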

Example from the issue: A model failed on 2025-03-21 with batches 03-18 through 03-21. When retried on 2025-03-25, it processed batches 03-22 through 03-25 instead — completely missing the original failed batches.
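The date arithmetic behind this example can be reproduced with a small sketch. It assumes daily batches and a lookback of 3, which is inferred from the four-day windows quoted above rather than stated in the issue; daily_batch_starts is a hypothetical helper, not a dbt function.

```python
from datetime import date, timedelta
from typing import List


def daily_batch_starts(reference_day: date, lookback: int) -> List[date]:
    """Return the daily batch dates from (reference_day - lookback) through reference_day."""
    return [reference_day - timedelta(days=offset) for offset in range(lookback, -1, -1)]


original_run = date(2025, 3, 21)  # day the model originally failed
retry_run = date(2025, 3, 25)     # day dbt retry was executed

failed_batches = daily_batch_starts(original_run, lookback=3)
recomputed_on_retry = daily_batch_starts(retry_run, lookback=3)

# With the retry date as reference, none of the originally failed batches are covered.
overlap = set(failed_batches) & set(recomputed_on_retry)
```

Anchoring the window at the original run date instead of the retry date yields the 03-18 through 03-21 batches again.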

Solution

Track the original invocation time from the run_results.json artifact metadata and use it during batch computation on retry.

Changes:

  1. core/dbt/task/retry.py: Extract invocation_started_at from the previous run's metadata and pass it to the RunTask as original_invocation_started_at. Also changed self.task_class == RunTask to issubclass(self.task_class, RunTask) so that BuildTask (which extends RunTask) also gets microbatch retry behavior.

  2. core/dbt/task/run.py: Added original_invocation_started_at attribute to RunTask. Updated MicrobatchModelRunner.get_microbatch_builder() to use self.parent_task.original_invocation_started_at as default_end_time when available, falling back to get_invocation_started_at() for normal (non-retry) runs.

  3. tests/unit/task/test_run.py: Added two tests verifying that the microbatch builder uses the original invocation time during retry and falls back to the current time during normal runs.
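The two behavioral changes listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged implementation: the classes are empty stand-ins for dbt's RunTask and BuildTask, get_invocation_started_at here simply returns the current UTC time, and default_end_time and gets_microbatch_retry are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Optional, Type


def get_invocation_started_at() -> datetime:
    # Stand-in for the helper named in the PR; here it just returns "now" in UTC.
    return datetime.now(timezone.utc)


class RunTask:
    """Stand-in task that may carry the start time of the run being retried."""

    def __init__(self, original_invocation_started_at: Optional[datetime] = None) -> None:
        self.original_invocation_started_at = original_invocation_started_at


class BuildTask(RunTask):
    """Stand-in for a task class that extends RunTask."""


def default_end_time(task: RunTask) -> datetime:
    # On retry, prefer the original run's start time; otherwise use the current invocation.
    if task.original_invocation_started_at is not None:
        return task.original_invocation_started_at
    return get_invocation_started_at()


def gets_microbatch_retry(task_class: Type[RunTask]) -> bool:
    # issubclass also matches subclasses, which an equality check against RunTask would miss.
    return issubclass(task_class, RunTask)


original = datetime(2025, 3, 21, 8, 0, tzinfo=timezone.utc)
retry_end = default_end_time(BuildTask(original_invocation_started_at=original))
normal_end = default_end_time(RunTask())
```

In this sketch a retry resolves to the original timestamp, while a normal run falls back to the current time, and BuildTask passes the subclass check even though `BuildTask == RunTask` is false.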

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

@aahel aahel requested a review from a team as a code owner February 23, 2026 10:24
@cla-bot cla-bot bot added the cla:yes label Feb 23, 2026
@github-actions

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

codecov bot commented Feb 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.44%. Comparing base (019ebdc) to head (392e6c5).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12525      +/-   ##
==========================================
- Coverage   91.45%   91.44%   -0.02%     
==========================================
  Files         203      203              
  Lines       25471    25497      +26     
==========================================
+ Hits        23294    23315      +21     
- Misses       2177     2182       +5     
Flag Coverage Δ
integration 88.30% <100.00%> (-0.03%) ⬇️
unit 65.44% <50.00%> (-0.03%) ⬇️


Components Coverage Δ
Unit Tests 65.44% <50.00%> (-0.03%) ⬇️
Integration Tests 88.30% <100.00%> (-0.03%) ⬇️

@aahel aahel self-assigned this Feb 23, 2026

@MichelleArk MichelleArk left a comment

Would also be great to include some functional tests for this change.

Example microbatch functional tests testing retry behavior: https://github.com/dbt-labs/dbt-mantle/blob/main/tests/functional/microbatch/test_microbatch.py#L651-L681

It would also be great to get a simple retry test for the build command with a microbatch model, given that build support was introduced in this PR as well!

@aahel aahel requested a review from MichelleArk February 24, 2026 09:17
@aahel aahel force-pushed the fix/microbatch-model branch from 3f9f4d8 to 392e6c5 Compare February 25, 2026 05:00
@aahel aahel added the backport 1.10.latest and backport 1.11.latest labels Feb 25, 2026
@aahel aahel merged commit e6da05f into main Feb 25, 2026
130 checks passed
@aahel aahel deleted the fix/microbatch-model branch February 25, 2026 05:28
@aahel aahel added and then removed the backport 1.10.latest and backport 1.11.latest labels Feb 25, 2026
github-actions bot pushed a commit that referenced this pull request Feb 25, 2026
* fix dbt retry for microbatch models

* addeed changelog

* added functional tests

(cherry picked from commit e6da05f)
aahel added a commit that referenced this pull request Feb 25, 2026
* fix dbt retry for microbatch models

* addeed changelog

* added functional tests

(cherry picked from commit e6da05f)

Co-authored-by: aahel <aahel.guha@dbtlabs.com>

Labels

backport 1.10.latest, backport 1.11.latest, cla:yes

Development

Successfully merging this pull request may close these issues.

[Bug] dbt retry should use the original failure date as reference for microbatch models
