
[v25.2.x] [CORE-13370] archival: Fence spillover command #27889

Merged
Lazin merged 1 commit into redpanda-data:v25.2.x from
Lazin:manual-backport-27714-v25.2.x-467
Oct 4, 2025


@Lazin Lazin commented Oct 3, 2025

Backport of PR #27714

Also, extract fence initialization into a method in the ntp_archiver to
avoid code duplication.

This changes the control flow in the 'apply_spillover' method.
Previously, the spillover loop did not stop on a replication error,
so the same error was repeated. The loop uses the manifest to create
a spillover manifest and replicates the command with the archival STM.
The replicate method waits until the command is applied and propagates
the error back to the loop. On error, the error was logged and the
loop continued. Since the state of the manifest didn't change, the loop
would produce the same manifest and the same command, causing a new
failure.

This commit breaks out of the loop if the spillover command can't be applied. This
guarantees forward progress.
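
The control-flow change described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `manifest_state`, `spillover_cmd`, `errc`, `replicate_fn`, and `apply_spillover_sketch` are hypothetical stand-ins, and the real archiver is asynchronous and works with the archival STM. The sketch only shows why breaking on a failed replicate matters when the manifest state is unchanged after a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins for illustration; not Redpanda's actual types.
struct manifest_state {
    std::vector<int64_t> segments; // base offsets of segments, oldest first
};

struct spillover_cmd {
    int64_t first_segment;
    int64_t last_segment;
};

enum class errc { success, replication_failed };

// Assumed shape of "replicate and wait until applied": returns the error
// produced when the command is applied.
using replicate_fn = std::function<errc(const spillover_cmd&)>;

struct spillover_result {
    int commands_applied = 0;
    int replicate_attempts = 0;
    errc last_error = errc::success;
};

// Spill `batch` segments at a time while enough segments remain.
inline spillover_result apply_spillover_sketch(
  manifest_state& m, std::size_t batch, const replicate_fn& replicate) {
    assert(batch > 0);
    spillover_result res;
    while (m.segments.size() >= batch) {
        // The command is derived purely from the current manifest state.
        spillover_cmd cmd{m.segments.front(), m.segments[batch - 1]};
        ++res.replicate_attempts;
        res.last_error = replicate(cmd);
        if (res.last_error != errc::success) {
            // The manifest did not change, so the next iteration would
            // build the same command and fail the same way. Stop here.
            break;
        }
        // Only a successfully applied command advances the manifest.
        m.segments.erase(
          m.segments.begin(),
          m.segments.begin() + static_cast<std::ptrdiff_t>(batch));
        ++res.commands_applied;
    }
    return res;
}
```

With a `replicate` callback that always fails, the sketch makes exactly one attempt and leaves the manifest untouched. Without the `break`, the same loop would never terminate, because the loop condition depends only on state that a failed command cannot change.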

Signed-off-by: Evgeny Lazin <4lazin@gmail.com>
(cherry picked from commit 35dc6d6)
@Lazin Lazin added this to the v25.2.x-next milestone Oct 3, 2025
@Lazin Lazin added the kind/backport PRs targeting a stable branch label Oct 3, 2025
@Lazin Lazin changed the title [v25.2.x] [CORE-13370] archival: Fence spillover command" [v25.2.x] [CORE-13370] archival: Fence spillover command Oct 3, 2025
@Lazin Lazin requested a review from oleiman October 3, 2025 13:01
@vbotbuildovich

CI test results

test results on build#73536
| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
|---|---|---|---|---|---|---|---|---|
| TopicDeleteCloudStorageTest | drop_lifecycle_marker_test | {"cloud_storage_type": 2} | integration | https://buildkite.com/redpanda/redpanda/builds/73536#0199aa5a-6539-4f24-add9-3d24d0f58da4 | FLAKY | 20/21 | upstream reliability is '95.91836734693877'. current run reliability is '95.23809523809523'. drift is 0.68027 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=TopicDeleteCloudStorageTest&test_method=drop_lifecycle_marker_test |


@oleiman oleiman left a comment


sorry, thought i approved this already

@Lazin Lazin merged commit d1ba702 into redpanda-data:v25.2.x Oct 4, 2025
18 checks passed
@tyson-redpanda tyson-redpanda modified the milestones: v25.2.x-next, v25.2.12 Dec 3, 2025

Labels

area/redpanda kind/backport PRs targeting a stable branch


4 participants