Skip to content

fix(plan): return err if one-time instruction fails#59

Merged
Vicente-Cheng merged 1 commit intoharvester:harvester-devfrom
starbops:fix-9886
Feb 24, 2026
Merged

fix(plan): return err if one-time instruction fails#59
Vicente-Cheng merged 1 commit intoharvester:harvester-devfrom
starbops:fix-9886

Conversation

@starbops
Copy link
Member

@starbops starbops commented Jan 20, 2026

Problem:

Even if any of the one-time instructions do not complete successfully, the entire plan execution still ends successfully (the returned err is nil). This results in RancherD being unable to retry, and the joining nodes remain unjoined.

Solution:

Return an error when there are any one-time instruction execution failure so that RancherD can further retry.

Related Issue(s):

harvester/harvester#9886

Test plan:

See harvester/harvester#9886. It could be hard to reproduce/test whether the fix works.

Additional documentation or context

Copilot AI review requested due to automatic review settings January 20, 2026 07:50
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds error handling for one-time instruction failures in the plan execution. Previously, when one-time instructions failed during plan execution, the Apply method would not return an error, allowing execution to continue even though critical instructions had failed. This change explicitly checks the OneTimeApplySucceeded flag after plan execution and returns an error if one-time instructions failed.

Changes:

  • Added explicit error checking for one-time instruction execution failures in RunWithKubernetesVersion function

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Signed-off-by: Zespre Chang <zespre.chang@suse.com>
Copy link
Member

@wheatdog wheatdog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@irishgordo
Copy link

Will this be within v1.7.1 content/release?

@starbops
Copy link
Member Author

Will this be within v1.7.1 content/release?

Currently, no, as there is a workaround.

Copy link
Member

@brandboat brandboat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice catch!

not related to this PR, just a note when I skimmed the code, I found L45 is weird

OneTimeInstructionAttempts: 100,

This is attempts count, while we fixed the count to 100, the supposed to increase for every retry, though it's fine to fix it to 100 I think, this param only affects the log output and env var CATTLE_AGENT_ATTEMPT_NUMBER and seems there is no place invoke this env var.

@starbops
Copy link
Member Author

LGTM, nice catch!

not related to this PR, just a note when I skimmed the code, I found L45 is weird

OneTimeInstructionAttempts: 100,

This is attempts count, while we fixed the count to 100, the supposed to increase for every retry, though it's fine to fix it to 100 I think, this param only affects the log output and env var CATTLE_AGENT_ATTEMPT_NUMBER and seems there is no place invoke this env var.

Yeah, it looks like the number only serves as an ID.

@Vicente-Cheng Vicente-Cheng merged commit 66c89f3 into harvester:harvester-dev Feb 24, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants