migrations: deflake data_migrations_api_test#28325
migrations: deflake data_migrations_api_test#28325joe-redpanda merged 1 commit intoredpanda-data:devfrom
Conversation
|
/ci-repeat 1 |
CI test resultstest results on build#75513
test results on build#75613
test results on build#76177
test results on build#76685
test results on build#77205
|
There was a problem hiding this comment.
Pull Request Overview
This PR fixes flakiness in the data_migrations_api_test by improving error handling and shutdown behavior in the data migrations API. The changes ensure that sleep aborted exceptions during shutdown are properly caught and converted to appropriate HTTP errors, and that HTTP requests don't hang during shutdown.
Key Changes:
- Added error handling for
sleep_abortedexceptions inlist_migrations()to map them tocluster::errc::shutting_down - Updated
list_migrations()return type to include error handling viaresult<>wrapper - Modified admin server endpoint to handle errors from
list_migrations()and convert them to proper HTTP responses
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/v/cluster/data_migration_frontend.h | Changed list_migrations() return type to include error result wrapper |
| src/v/cluster/data_migration_frontend.cc | Added try-catch for sleep_aborted exception and converted to error result |
| src/v/redpanda/admin/migrations.cc | Added error handling in admin endpoint to process errors from list_migrations() |
0d38bce to
da1bdf6
Compare
|
Rebase atop Michal's changes |
|
Not sure how the "stop" change fell off in the rebase, readded and testing |
50b2052 to
f2ca443
Compare
|
no changes, rebased onto dev to reduce my build times |
|
As far as I understand router needs to be requested to stop early, in controller shutdown input. That's for Admin API to be able to stop. See #28491 |
makes sense, left a comment. Should we still have a stop that awaits the completion of the shutdown of the contained routers? |
f2ca443 to
6d41416
Compare
6d41416 to
4242b7b
Compare
Invoke on instance can fail on shutdown with a sleep_aborted exception. Catch the exception and translate to an errc::shutting_down code such that the admin server can handle it correctly by changing it into a service unavailable response.
4242b7b to
28f67e4
Compare
| co_return cluster::errc::shutting_down; | ||
| } | ||
| vlog(dm_log.error, "unexpected exception on list_migrations {}", eptr); | ||
| throw; |
There was a problem hiding this comment.
Throw with no expression just rethrows the current exception
Am I missing something?
There was a problem hiding this comment.
was a nitpick, I thought since eptr was already captured in a local variable, we may aswell just rethrow that.
| co_return cluster::errc::shutting_down; | ||
| } | ||
| vlog(dm_log.error, "unexpected exception on list_migrations {}", eptr); | ||
| throw; |
There was a problem hiding this comment.
was a nitpick, I thought since eptr was already captured in a local variable, we may aswell just rethrow that.
|
/backport v25.3.x |
Myriad fixes for data_migration_api_test.
Backports Required
Release Notes
Improvements