@demmer
Member

@demmer demmer commented Jan 27, 2025

Description

Note: this is an RFC draft PR, soliciting feedback on the idea and the proposed implementation.

Adds a new vtgate flag --default-multi-shard-autocommit which, as the name implies, opts the query engine into using multi-shard autocommit semantics by default, even if the plan does not contain the query directive MULTI_SHARD_AUTOCOMMIT.

The aim of this change is to help protect Slack and other large-scale Vitess adopters that use autocommit from inadvertently incurring expensive extra round trips on scatter DMLs.

As a general rule, Slack avoids scatter DMLs, but on the occasions when they are needed, we prefer the semantics of the MULTI_SHARD_AUTOCOMMIT=1 query directive (in which each shard executes its own autocommit DML) over the default behavior (in which each shard runs the DML in a transaction and vtgate issues a second round trip to commit).

We generally apply this as a directive in the application code, but that doesn't prevent mistakes where a developer forgets to apply the directive.
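For readers unfamiliar with the directive, here is a minimal sketch of what "applying it in application code" can look like. The helper `withMultiShardAutocommit` is hypothetical (it is not part of Vitess or of this PR), and it assumes the directive is written in the `/*vt+ ... */` comment form placed right after the DML verb.

```go
package main

import "strings"

// multiShardAutocommitDirective is the query directive discussed in this PR,
// written as a Vitess-style comment directive (assumed form).
const multiShardAutocommitDirective = "/*vt+ MULTI_SHARD_AUTOCOMMIT=1 */"

// withMultiShardAutocommit is a hypothetical application-side helper that
// inserts the directive after the first keyword of a DML statement.
// It does no SQL parsing; it only splits on the first space.
func withMultiShardAutocommit(dml string) string {
	trimmed := strings.TrimSpace(dml)
	verb, rest, found := strings.Cut(trimmed, " ")
	if !found {
		// Nothing follows the verb, so there is nowhere sensible to put the directive.
		return trimmed
	}
	return verb + " " + multiShardAutocommitDirective + " " + rest
}
```

Calling `withMultiShardAutocommit("delete from t where c = 1")` yields `delete /*vt+ MULTI_SHARD_AUTOCOMMIT=1 */ from t where c = 1`. Any call site that skips the helper silently falls back to the default behavior, which is the mistake this PR is trying to guard against.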

This PR adds an option to change the default behavior so that this is opted in by default. It should not affect any semantics on explicit transactions, nor should it affect any deployments that don't enable the feature.
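As a rough illustration of the opt-in nature of the flag, the sketch below parses a boolean `--default-multi-shard-autocommit` flag that defaults to false. It uses the standard library `flag` package and a hypothetical function name purely for illustration; it is not the flag registration code from this PR, and the help text is an assumption.

```go
package main

import "flag"

// parseDefaultMultiShardAutocommit is a hypothetical stand-in for flag
// registration. It reports whether the flag was enabled on the command line.
func parseDefaultMultiShardAutocommit(args []string) (bool, error) {
	fs := flag.NewFlagSet("vtgate-sketch", flag.ContinueOnError)
	enabled := fs.Bool(
		"default-multi-shard-autocommit",
		false, // off by default, so existing deployments are unaffected
		"use multi-shard autocommit semantics for DMLs even without the query directive (assumed help text)",
	)
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}
```

With no arguments the function returns false, and with `--default-multi-shard-autocommit` it returns true, matching the claim that deployments which do not enable the feature see no change.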

Related Issue(s)

None.

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

Adds a new vtgate flag --default-multi-shard-autocommit which, as the name
implies, opts the query engine into using multi-shard autocommit semantics by
default, even if the plan does not contain the query directive
MULTI_SHARD_AUTOCOMMIT.

Signed-off-by: Michael Demmer <[email protected]>
@vitess-bot
Contributor

vitess-bot bot commented Jan 27, 2025

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsBackportReason If backport labels have been applied to a PR, a justification is required NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Jan 27, 2025
@github-actions github-actions bot added this to the v22.0.0 milestone Jan 27, 2025
@codecov

codecov bot commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 72.72727% with 3 lines in your changes missing coverage. Please review.

Project coverage is 67.69%. Comparing base (de33a39) to head (34774af).
Report is 229 commits behind head on main.

Files with missing lines | Patch % | Lines
go/vt/vtgate/vtgate.go | 0.00% | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17635      +/-   ##
==========================================
+ Coverage   67.68%   67.69%   +0.01%     
==========================================
  Files        1586     1586              
  Lines      255647   255654       +7     
==========================================
+ Hits       173034   173075      +41     
+ Misses      82613    82579      -34     

@harshit-gangal
Member

How about we use transaction_mode for this? Currently it can be single, multi, twopc.

@demmer
Member Author

demmer commented Jan 28, 2025 via email

@github-actions
Contributor

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period. and removed Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period. labels Feb 28, 2025
@demmer
Member Author

demmer commented Mar 5, 2025

@harshit-gangal @deepthi This PR sat kinda idle. Is this something that we feel ok about moving forward?

I'm happy to work on the test conflicts if so.


 func (dml *DML) execMultiShard(ctx context.Context, primitive Primitive, vcursor VCursor, rss []*srvtopo.ResolvedShard, queries []*querypb.BoundQuery) (*sqltypes.Result, error) {
-	autocommit := (len(rss) == 1 || dml.MultiShardAutocommit) && vcursor.AutocommitApproval()
+	autocommit := (len(rss) == 1 || vcursor.DefaultMultiShardAutocommit() || dml.MultiShardAutocommit) && vcursor.AutocommitApproval()
Member

dml.MultiShardAutocommit should be authoritative over the default, so if we want to override a default of, say, true with an explicit false, we have to account for that.

Member

@harshit-gangal harshit-gangal Mar 17, 2025

We would need to store something like an enum or a pointer to a bool in the plan to know whether this is set or not.
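To make the concern concrete: with a plain bool OR'ed against the default, an explicit `MULTI_SHARD_AUTOCOMMIT=0` cannot turn the behavior off once the flag is on. A minimal sketch of the pointer-bool alternative follows. All names here are hypothetical and this is not code from the PR; a nil pointer is assumed to mean "directive not present in the query".

```go
package main

// resolveMultiShardAutocommit returns the effective setting. The directive,
// when present (non-nil), is authoritative; otherwise the flag default applies.
func resolveMultiShardAutocommit(directive *bool, flagDefault bool) bool {
	if directive != nil {
		return *directive
	}
	return flagDefault
}

// useAutocommit mirrors the shape of the condition in the diff above:
// a single-shard DML, or an effective multi-shard autocommit setting,
// combined with autocommit approval (assumed to come from the session).
func useAutocommit(numShards int, directive *bool, flagDefault, approved bool) bool {
	return (numShards == 1 || resolveMultiShardAutocommit(directive, flagDefault)) && approved
}
```

Under this sketch, a three-shard DML with the flag enabled and no directive uses autocommit, while the same DML carrying an explicit false directive does not. An enum with unset, true, and false values would express the same three states without a pointer.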

Comment on lines +118 to +120

// DefaultMultiShardAutocommit will opt into autocommit semantics even for multi shard DMLs
DefaultMultiShardAutocommit bool
Member

nit: This is not used by the executor; it is only passed to the vcursor. We can set it on the vcursor config directly.

@harshit-gangal
Member

> @harshit-gangal @deepthi This PR sat kinda idle. Is this something that we feel ok about moving forward?
>
> I'm happy to work on the test conflicts if so.

@demmer Sorry for the delay. I have reviewed the code now.

@frouioui frouioui modified the milestones: v22.0.0, v23.0.0 Apr 1, 2025
@github-actions
Contributor

github-actions bot commented May 4, 2025

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period. label May 4, 2025
@github-actions
Contributor

This PR was closed because it has been stale for 7 days with no activity.

Labels

NeedsBackportReason If backport labels have been applied to a PR, a justification is required NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period.

3 participants