Always rollback transaction when retrying#19372
Conversation
| try: | ||
| with opentracing.start_active_span("db.rollback"): | ||
| conn.rollback() | ||
| except self.engine.module.Error as e1: | ||
| transaction_logger.warning("[TXN EROLL] {%s} %s", name, e1) |
There was a problem hiding this comment.
Moved conn.rollback() outside of the if i < MAX_NUMBER_OF_RETRIES: condition so we always rollback, then decide whether to keep retrying or not.
| # no additional calls | ||
| self.assertEqual(txn_mocks.exception_callback.call_count, 5) |
There was a problem hiding this comment.
This count changed because we changed the logic to be MAX_NUMBER_OF_RETRIES and align with that instead of being willy nilly with N and off-by-one.
| # FIXME: Every transaction should have been rolled back, see | ||
| # https://github.com/element-hq/synapse/issues/19202 | ||
| # self.assertEqual(txn_mocks.rollback.call_count, 1) |
There was a problem hiding this comment.
Something to fixup in a follow-up PR
| # FIXME: Sanity check that every transaction is either committed or rolled back, | ||
| # see https://github.com/element-hq/synapse/issues/19202 | ||
| # transaction_count = after_callback.call_count + exception_callback.call_count | ||
| # self.assertEqual( | ||
| # transaction_count, | ||
| # commit_mock.call_count + rollback_mock.call_count, | ||
| # "We expect every transaction attempt to either commit or rollback. " | ||
| # f"Saw {transaction_count} transactions, but only {commit_mock.call_count} commits and {rollback_mock.call_count} rollbacks", | ||
| # ) |
There was a problem hiding this comment.
We can't uncomment this yet because of the test_exception_callback case below.
Something to fixup in a follow-up PR
reivilibre
left a comment
There was a problem hiding this comment.
just a suggestion, otherwise I'm happy with this, thanks for tidying as always
| N = 5 | ||
| MAX_NUMBER_OF_RETRIES = 5 | ||
| while True: | ||
| cursor = conn.cursor( |
There was a problem hiding this comment.
given what you've ended up with, I'd be very tempted to i += 1 at the start of the loop
this then means:
- no need to
+1for display - no need for the awkward
i < MAX_NUMBER_OF_RETRIES - 1— you can drop the- 1 - (minor) don't have multiple code paths doing the increment
However the pedant in me is saying this should be MAX_NUMBER_OF_ATTEMPTS since it also counts the first attempt
In fact, couldn't we just use a for attempt_number in range(1, MAX_NUMBER_OF_ATTEMPTS+1): loop?
There was a problem hiding this comment.
I've updated to for attempt_number in range(1, MAX_NUMBER_OF_ATTEMPTS+1): but there was a little hiccup with the linter not being able to tell that all code-paths either return or raise, so I had to add a little "unreachable" hint.
Let me know what you think of the trade-off. Looks decent 👍
We can still have most of the benefit if we go back to using a while True loop 🤷
There was a problem hiding this comment.
Happy with either, if you're happy with this then let's just merge like that
|
Thanks for the review @reivilibre 🦃 |
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [element-hq/synapse](https://github.com/element-hq/synapse) | minor | `1.145.0` → `1.146.0` | --- >⚠️ **Warning** > > Some dependencies could not be looked up. Check the Dependency Dashboard for more information. --- ### Release Notes <details> <summary>element-hq/synapse (element-hq/synapse)</summary> ### [`v1.146.0`](https://github.com/element-hq/synapse/releases/tag/v1.146.0) [Compare Source](element-hq/synapse@v1.145.0...v1.146.0rc1) ### Synapse 1.146.0 (2026-01-27) No significant changes since 1.146.0rc1. #### Deprecations and Removals - [MSC2697](matrix-org/matrix-spec-proposals#2697) (Dehydrated devices) has been removed, as the MSC is closed. Developers should migrate to [MSC3814](matrix-org/matrix-spec-proposals#3814). ([#​19346](element-hq/synapse#19346)) - Support for Ubuntu 25.04 (Plucky Puffin) has been dropped. Synapse no longer builds debian packages for Ubuntu 25.04. ### Synapse 1.146.0rc1 (2026-01-20) #### Features - Add a new config option [`enable_local_media_storage`](https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html#enable_local_media_storage) which controls whether media is additionally stored locally when using configured `media_storage_providers`. Setting this to `false` allows off-site media storage without a local cache. Contributed by Patrice Brend'amour [@​dr](https://github.com/dr).allgood. ([#​19204](element-hq/synapse#19204)) - Stabilise support for [MSC4312](matrix-org/matrix-spec-proposals#4312 `m.oauth` User-Interactive Auth stage for resetting cross-signing identity with the OAuth 2.0 API. The old, unstable name (`org.matrix.cross_signing_reset`) is now deprecated and will be removed in a future release. ([#​19273](element-hq/synapse#19273)) - Refactor Grafana dashboard to use `server_name` label (instead of `instance`). ([#​19337](element-hq/synapse#19337)) #### Bugfixes - Fix joining a restricted v12 room locally when no local room creator is present but local users with sufficient power levels are. Contributed by [@​nexy7574](https://github.com/nexy7574). ([#​19321](element-hq/synapse#19321)) - Fixed parallel calls to `/_matrix/media/v1/create` being ratelimited for appservices even if `rate_limited: false` was set in the registration. Contributed by [@​tulir](https://github.com/tulir) @​ Beeper. ([#​19335](element-hq/synapse#19335)) - Fix a bug introduced in 1.61.0 where a user's membership in a room was accidentally ignored when considering access to historical state events in rooms with the "shared" history visibility. Contributed by Lukas Tautz. ([#​19353](element-hq/synapse#19353)) - [MSC4140](matrix-org/matrix-spec-proposals#4140): Store the JSON content of scheduled delayed events as text instead of a byte array. This fixes the inability to schedule a delayed event with non-ASCII characters in its content. ([#​19360](element-hq/synapse#19360)) - Always rollback database transactions when retrying (avoid orphaned connections). ([#​19372](element-hq/synapse#19372)) - Fix `InFlightGauge` typing to allow upgrading to `prometheus_client` 0.24. ([#​19379](element-hq/synapse#19379)) #### Updates to the Docker image - Add [Prometheus HTTP service discovery](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config) endpoint for easy discovery of all workers when using the `docker/Dockerfile-workers` image (see the [*Metrics* section of our Docker testing docs](docker/README-testing.md#metrics)). ([#​19336](element-hq/synapse#19336)) #### Improved Documentation - Remove docs on legacy metric names (no longer in the codebase since 2022-12-06). ([#​19341](element-hq/synapse#19341)) - Clarify how the estimated value of room complexity is calculated internally. ([#​19384](element-hq/synapse#19384)) #### Internal Changes - Add an internal `cancel_task` API to the task scheduler. ([#​19310](element-hq/synapse#19310)) - Tweak docstrings and signatures of `auth_types_for_event` and `get_catchup_room_event_ids`. ([#​19320](element-hq/synapse#19320)) - Replace usage of deprecated `assertEquals` with `assertEqual` in unit test code. ([#​19345](element-hq/synapse#19345)) - Drop support for Ubuntu 25.04 'Plucky Puffin', add support for Ubuntu 25.10 'Questing Quokka'. ([#​19348](element-hq/synapse#19348)) - Revert "Add an Admin API endpoint for listing quarantined media ([#​19268](element-hq/synapse#19268))". ([#​19351](element-hq/synapse#19351)) - Bump `mdbook` from 0.4.17 to 0.5.2 and remove our custom table-of-contents plugin in favour of the new default functionality. ([#​19356](element-hq/synapse#19356)) - Replace deprecated usage of PyGitHub's `GitRelease.title` with `.name` in release script. ([#​19358](element-hq/synapse#19358)) - Update the Element logo in Synapse's README to be an absolute URL, allowing it to render on other sites (such as PyPI). ([#​19368](element-hq/synapse#19368)) - Apply minor tweaks to v1.145.0 changelog. ([#​19376](element-hq/synapse#19376)) - Update Grafana dashboard syntax to use the latest from importing/exporting with Grafana 12.3.1. ([#​19381](element-hq/synapse#19381)) - Warn about skipping reactor metrics when using unknown reactor type. ([#​19383](element-hq/synapse#19383)) - Add support for reactor metrics with the `ProxiedReactor` used in worker Complement tests. ([#​19385](element-hq/synapse#19385)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0Mi42OS4yIiwidXBkYXRlZEluVmVyIjoiNDIuNjkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiaW1hZ2UiXX0=--> Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/3533 Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net> Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
Always rollback transaction (connection) when retrying.
Previously, because
conn.rollback()was inside theif i < MAX_NUMBER_OF_RETRIES:condition, it never rolled back on the final retry.Part of #19202
There are other problems mentioned in #19202 but this is a nice standalone change.
Dev notes
See uncommitted transactions:
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.