Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flag to select tx throttler tablet type #12174

Merged

Conversation

timvaillancourt
Copy link
Contributor

@timvaillancourt timvaillancourt commented Jan 26, 2023

Description

This PR allows the user to specify which tablet type(s) should be monitored by the vttablet Transaction Throttler (Design Doc). The motivation for this is to allow rdonly tablets to be included in throttling

This involves the addition of the --tx-throttler-tablet-types flag (a comma-separated list of tablet types) with the default of replica for backwards-compatability. Only replica and/or rdonly are supported

This resolves this code comment in tx_throttler.go:

// TODO(erez): If this becomes necessary, we can add a configuration option that would
// determine whether we consider RDONLY tablets here, as well.

vttablet --help output:

tim@Tims-MacBook-Pro vitess % ./vttablet --help 2>&1 | grep tx-throttler-tablet-types
      --tx-throttler-tablet-types strings                                A comma-separated list of tablet types. Only tablets of this type are monitored for replication lag by the transaction throttler. Supported types are replica and/or rdonly. (default replica)

Backports to v14 + v15 would be much appreciated if they merge cleanly

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
@deepthi deepthi added NeedsWebsiteDocsUpdate What it says release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: Query Serving and removed release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) labels Jan 27, 2023
@deepthi
Copy link
Member

deepthi commented Jan 27, 2023

Looks like our bot isn't running right now, otherwise there would have been a comment like this #12138 (comment)

Relevant points:

  • New flags should use dashes, not underscores
  • New flags need to be documented on the website

Re backports, we do them only for regressions / production blocking bugs. This doesn't seem like either. If that isn't right and you can provide a justification for the back port, I'm happy to discuss.

@timvaillancourt
Copy link
Contributor Author

timvaillancourt commented Jan 27, 2023

@deepthi thanks for the feedback, updating the PR based on this 👍

  • New flags should use dashes, not underscores

The majority of the existing tx throttler flags are using underscores, would it make sense for me to update those also or is an inconsistency acceptable?

Re backports, we do them only for regressions / production blocking bugs. This doesn't seem like either.

Good point. This does not meet the criteria for backporting 👍

Signed-off-by: Tim Vaillancourt <[email protected]>
@timvaillancourt
Copy link
Contributor Author

timvaillancourt commented Jan 27, 2023

The majority of the existing tx throttler flags are using underscores, would it make sense for me to update those also or is an inconsistency acceptable?

For now I have updated the new flag only to dashes. It appears the existing flags have dash-based aliases with help like: Synonym to -tx_throttler_healthcheck_cells 👍

@timvaillancourt
Copy link
Contributor Author

Website docs PR: vitessio/website#1368

Signed-off-by: Tim Vaillancourt <[email protected]>
Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good to me. The only thing I'd ask is to add a unit test that actually tests with multiple tablet types.
e2e tests are expensive so I'm not going to ask for one. Can you confirm that you are able to run this locally and get the expected behavior?
Website PR is looking good.

@deepthi
Copy link
Member

deepthi commented Jan 31, 2023

The majority of the existing tx throttler flags are using underscores, would it make sense for me to update those also or is an inconsistency acceptable?

For now I have updated the new flag only to dashes. It appears the existing flags have dash-based aliases with help like: Synonym to -tx_throttler_healthcheck_cells 👍

You did the right thing. We eventually want to move all flags to be only dashes, but we have a long way to go until we get there. It will take a couple more releases (at least).

Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
@timvaillancourt
Copy link
Contributor Author

@ajm188 good catches thanks, added those suggestions 👍

@harshit-gangal
Copy link
Member

I have a question here.

The purpose of the tx throttler is to avoid creating new transactions as the existing replicas are lagging and the new transactions would not be able to commit as they would not receive semi-sync acks.

What purpose we would solve to throttler on rdonly tablet types. primary tablets do not expect semi-sync acks from those tablets. We would be throttling user traffic when they can actually able to write without causing delays to normal working.

@timvaillancourt
Copy link
Contributor Author

What purpose we would solve to throttler on rdonly tablet types. primary tablets do not expect semi-sync acks from those tablets. We would be throttling user traffic when they can actually able to write without causing delays to normal working.

@harshit-gangal while I don't recommend it for most, in Slack's current use-case we read from rdonly tablets in some situations and "care" about the staleness of those reads from an application consistency perspective (vs durability)

@harshit-gangal
Copy link
Member

What purpose we would solve to throttler on rdonly tablet types. primary tablets do not expect semi-sync acks from those tablets. We would be throttling user traffic when they can actually able to write without causing delays to normal working.

@harshit-gangal while I don't recommend it for most, in Slack's current use-case we read from rdonly tablets in some situations and "care" about the staleness of those reads from an application consistency perspective (vs durability)

have you explored the flag discovery_high_replication_lag_minimum_serving? The purpose is to remove unhealthy tablets from serving traffic if they exceed the lag tolerance irrespective of not meeting the minimum tablets serving criteria.

@timvaillancourt
Copy link
Contributor Author

timvaillancourt commented May 12, 2023

have you explored the flag discovery_high_replication_lag_minimum_serving? The purpose is to remove unhealthy tablets from serving traffic if they exceed the lag tolerance irrespective of not meeting the minimum tablets serving criteria.

@harshit-gangal yes, we're using this to remove unhealthy replicas in prod, however losing availability to replicas causes other undesireable problems

The goal we have with including RDONLY is to reduce the rate of writes (tell low-priority clients to backoff) when we are exceeding the rate tablets can replicate to avoid both lag and loss of availability of these tablet types

cc @ejortegau / @shlomi-noach for more thoughts

@shlomi-noach
Copy link
Contributor

The discussion took many turns by now, but I wish to comment about the use of RDONLY and semi-sync etc. I think this flag is in line with existing tablet throttler flag --throttle_tablet_types.
It's also worth pointing out that semi-sync has no strict dependency on lag. It is possible for all semi-sync replicas to ACK all binlog events, and also have minutes of lag. So on that respect, too, there is no problem imO with including RDONLY tablets.

@harshit-gangal harshit-gangal merged commit c40623a into vitessio:main May 16, 2023
@timvaillancourt timvaillancourt deleted the tx-throttler-tablet-types branch May 16, 2023 10:22
timvaillancourt added a commit to slackhq/vitess that referenced this pull request May 16, 2023
* Add flag to select tx throttler tablet type

Signed-off-by: Tim Vaillancourt <[email protected]>

* REPLICA and/or RDONLY only

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update flag help msg

Signed-off-by: Tim Vaillancourt <[email protected]>

* Lowercase types in help/doc

Signed-off-by: Tim Vaillancourt <[email protected]>

* Help update

Signed-off-by: Tim Vaillancourt <[email protected]>

* fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* No underscores in flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix merge

Signed-off-by: Tim Vaillancourt <[email protected]>

* PR suggestion, consolidate config logic

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* Use topoproto.TabletTypeListFlag to handle flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix unit test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* improve test

Signed-off-by: Tim Vaillancourt <[email protected]>

* pr suggestions

Signed-off-by: Tim Vaillancourt <[email protected]>

* go fmt

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
Co-authored-by: Andrew Mason <[email protected]>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request Jun 30, 2023
timvaillancourt added a commit to slackhq/vitess that referenced this pull request Jul 5, 2023
…let type (vitessio#12174) (#95)

* Add flag to select tx throttler tablet type (vitessio#12174)

Signed-off-by: Tim Vaillancourt <[email protected]>

* Remove mistaken git add

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request Apr 16, 2024
* Add flag to select tx throttler tablet type

Signed-off-by: Tim Vaillancourt <[email protected]>

* REPLICA and/or RDONLY only

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update flag help msg

Signed-off-by: Tim Vaillancourt <[email protected]>

* Lowercase types in help/doc

Signed-off-by: Tim Vaillancourt <[email protected]>

* Help update

Signed-off-by: Tim Vaillancourt <[email protected]>

* fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* No underscores in flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix merge

Signed-off-by: Tim Vaillancourt <[email protected]>

* PR suggestion, consolidate config logic

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* Use topoproto.TabletTypeListFlag to handle flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix unit test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* improve test

Signed-off-by: Tim Vaillancourt <[email protected]>

* pr suggestions

Signed-off-by: Tim Vaillancourt <[email protected]>

* go fmt

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
Co-authored-by: Andrew Mason <[email protected]>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request May 14, 2024
* Add flag to select tx throttler tablet type

Signed-off-by: Tim Vaillancourt <[email protected]>

* REPLICA and/or RDONLY only

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update flag help msg

Signed-off-by: Tim Vaillancourt <[email protected]>

* Lowercase types in help/doc

Signed-off-by: Tim Vaillancourt <[email protected]>

* Help update

Signed-off-by: Tim Vaillancourt <[email protected]>

* fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* No underscores in flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix merge

Signed-off-by: Tim Vaillancourt <[email protected]>

* PR suggestion, consolidate config logic

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* Use topoproto.TabletTypeListFlag to handle flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix unit test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* improve test

Signed-off-by: Tim Vaillancourt <[email protected]>

* pr suggestions

Signed-off-by: Tim Vaillancourt <[email protected]>

* go fmt

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
Co-authored-by: Andrew Mason <[email protected]>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request May 16, 2024
…pt. 2 (#350)

* Add priority support to transaction throttler (vitessio#12662)

* Add support for criticality query directive, and have TxThrottler respect that

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Remove unused variable

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix CI pipeline

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Address PR comments.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Make linter happy & add extra test cases.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Address PR comments.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix circular import

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Address PR comments:

* Invalid criticality in query directive fails the query.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Address PR comments:

* Renamed criticality to priority.
* Change error handling when parsing the priority from string to integer.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Add missing piece of code that got lost during merge conflict resolution

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix vtadmin.js

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Address PR comments

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix unit tests (I think)

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Invert polarity of priority values

With this change, queries with PRIORITY=0 never get throttled, whereas those
with PRIORITY=100 always do (provided there is contention).

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix flag e2e test

Signed-off-by: Eduardo J. Ortega U <[email protected]>

---------

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Add flag to select tx throttler tablet type (vitessio#12174)

* Add flag to select tx throttler tablet type

Signed-off-by: Tim Vaillancourt <[email protected]>

* REPLICA and/or RDONLY only

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update flag help msg

Signed-off-by: Tim Vaillancourt <[email protected]>

* Lowercase types in help/doc

Signed-off-by: Tim Vaillancourt <[email protected]>

* Help update

Signed-off-by: Tim Vaillancourt <[email protected]>

* fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* No underscores in flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix merge

Signed-off-by: Tim Vaillancourt <[email protected]>

* PR suggestion, consolidate config logic

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* Use topoproto.TabletTypeListFlag to handle flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix unit test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* improve test

Signed-off-by: Tim Vaillancourt <[email protected]>

* pr suggestions

Signed-off-by: Tim Vaillancourt <[email protected]>

* go fmt

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
Co-authored-by: Andrew Mason <[email protected]>

* txthrottler: further code cleanup (vitessio#12902)

* txthrottler: further code cleanup

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix bad merge resolution

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>

* TxThrottler support for transactions outside BEGIN/COMMIT (vitessio#13040)

* TxThrottler support for transactions outside BEGIN/COMMIT

This change allows the TxThrottler to throttle requests sent outside of
explicit transactions (i.e. explicit BEGIN/COMMIT blocks) when configured to do
so via a new config flag. Otherwise, it preserves the current/default behavior
of only throttling transactions inside BEGIN/COMMIT.

In addition, when this flag is passed, and because the call to throttle is done
in a context in which the execution plan is already known, this change uses the
plan type to make sure that throttling is triggered only when the query being
executed is INSERT/UPDATE/DELETE/LOAD, so that SELECTs and others no longer get
throttled unnecessarily, as they do not contribute to increasing replication
lag, which is what the TxThrottler aims at controlling.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix e2e flag tests & TxThrottler unit test

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Throttle auto-commit statements in QueryExecutor instead of TxPool

This changes where we call the transaction throttler:

1. Statements in `BEGIN/COMMIT` blocks keep being throttled in
   `TabletServer.begin()`.
2. Additionally, throttling is added in QueryExecutor.execAutocommit() and
   `QueryExecutor.execAsTransaction()`.

We also change things so that throttling in this new case is not opt-in via
configuration flag but instead is the new and only behavior.

Finally, we remove some previously added changes to example scripts that had
been added with the intention of testing and are not part of the PR.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Adds test cases for QueryExecutor.Execute() with TxThrottle throttling

To make unit testing simple here, we separated the interface and implementation
of the TxThrottle, and simply used a mock implementation of the interface in
the tests.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Add note on new TxThrottler behavior in v17 changelog

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Fix PR number in changelog entry for TxThrottler behavior change.

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* Address PR comments

Signed-off-by: Eduardo J. Ortega U <[email protected]>

---------

Signed-off-by: Eduardo J. Ortega U <[email protected]>

* txthrottler: verify config at vttablet startup, consolidate funcs (vitessio#13115)

* txthrottler: verify config at vttablet startup, consolidate funcs

Signed-off-by: Tim Vaillancourt <[email protected]>

* Use explicit dest in prototext.Unmarshal

Signed-off-by: Tim Vaillancourt <[email protected]>

* Use for loop for TestVerifyTxThrottlerConfig

Signed-off-by: Tim Vaillancourt <[email protected]>

* Cleanup test

Signed-off-by: Tim Vaillancourt <[email protected]>

* Fix go vet complaint

Signed-off-by: Tim Vaillancourt <[email protected]>

* Add back synonym flag

Signed-off-by: Tim Vaillancourt <[email protected]>

* Update go/vt/vttablet/tabletserver/txthrottler/tx_throttler.go

Co-authored-by: Shlomi Noach <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>

* Address staticcheck linter error

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
Co-authored-by: Shlomi Noach <[email protected]>

* gofumpt

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Eduardo J. Ortega U <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Co-authored-by: Eduardo J. Ortega U <[email protected]>
Co-authored-by: Andrew Mason <[email protected]>
Co-authored-by: Shlomi Noach <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants