
Tracking issue: VReplication throttling #7362

Closed

shlomi-noach opened this issue Jan 24, 2021 · 7 comments

Comments

@shlomi-noach

We wish to introduce throttling in VReplication to avoid overwhelming the databases with reads/writes.

VReplication's current behavior is greedy: if it can read table data, it reads it and pushes it downstream. If it can read binary logs, it does the same. If it can pull either table data or binary logs from upstream, it will, and it writes them to the database.

In the current design, the source engine and the target engine throttle one another at the rate of their respective databases' capabilities. For example, assume the source is a replica tablet and the target is a primary tablet. The target tablet requests to pull data from the source. The source reads from the replica and pushes downstream; the target intercepts and writes to the backend primary MySQL. If the backend primary MySQL is too busy, that pushes back on the target tablet, which in turn stops processing events from upstream, which in turn throttles reading from the MySQL replica. Conversely, if the source replica is slow to respond, that dictates the pace of writes to the target primary.
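As an illustration only (not VReplication's actual code), here is a minimal Go sketch of this mutual backpressure, modeling the source-to-target stream as a bounded channel: the reader naturally stalls whenever the writer falls behind, with no explicit throttling logic involved.

```go
package main

import (
	"fmt"
	"time"
)

// event stands in for a row batch or binlog event flowing from source to target.
type event struct{ seq int }

func main() {
	// A small buffered channel models the stream between the source and
	// target engines. When the writer (target) is slow, the buffer fills
	// and the reader (source) blocks on send: the mutual backpressure
	// described above.
	stream := make(chan event, 4)

	// Source side: read as fast as upstream allows, push downstream.
	go func() {
		for i := 0; i < 20; i++ {
			stream <- event{seq: i} // blocks once the buffer is full
		}
		close(stream)
	}()

	// Target side: apply at the pace of the backend primary MySQL.
	for ev := range stream {
		time.Sleep(50 * time.Millisecond) // simulate a busy primary
		fmt.Println("applied event", ev.seq)
	}
}
```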

However, on both source and target sides, operations are aggressive on the MySQL servers. The engines will read as fast as the replica allows, or write as fast as the target primary allows. This, unfortunately, does not take into consideration the overall health of the source and target shards. We want to avoid overwhelming source/target shards so as to keep them healthy.

Specifically, we want to throttle writes on the target when those writes generate replication lag on the target shard, and throttle reads from the source replica when that replica itself is lagging. This not only keeps the shards in a healthy state, it also makes the cut-over safer and quicker, because at any point in time the lag between source and target is known to be small.

To that effect, we first need to be able to throttle on a lagging replica (the source side); our existing tablet throttling only throttles based on writes to a cluster's primary.
We will introduce a lag-based "check-self" throttle check on all tablets, sketched below.
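To make the idea concrete, here is a minimal, hypothetical sketch of what a lag-based self check could look like: an HTTP endpoint where the tablet compares its own replication lag against a threshold. The threshold value, the lag source, and the response shape are all placeholders here, not the actual Vitess implementation:

```go
package main

import (
	"fmt"
	"net/http"
)

// lagThreshold and replicationLagSeconds are hypothetical stand-ins; a real
// tablet would read its lag from its own health/replication status stream.
const lagThreshold = 5.0 // seconds

func replicationLagSeconds() float64 { return 3.5 }

// checkSelf answers whether this tablet, by its own measurement, is lagging:
// HTTP 200 means "proceed", 429 means "back off".
func checkSelf(w http.ResponseWriter, r *http.Request) {
	lag := replicationLagSeconds()
	if lag > lagThreshold {
		w.WriteHeader(http.StatusTooManyRequests)
	} else {
		w.WriteHeader(http.StatusOK)
	}
	fmt.Fprintf(w, `{"Value": %.1f, "Threshold": %.1f}`, lag, lagThreshold)
}

func main() {
	http.HandleFunc("/throttler/check-self", checkSelf)
	http.ListenAndServe(":8080", nil)
}
```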

Some work has begun, illustrated in the comments below.

cc @rohit-nayak-ps

@shlomi-noach

#7319 introduces a /throttler/check-self check, where each tablet (primary, replica, any) runs its own throttler mechanism to check its own lag. This also applies to primaries, as illustrated in the PR comment.
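For illustration, a sketch of how a client (say, the VReplication source) might consult such a check before doing more work; the address, retry interval, and treat-errors-as-throttle policy are assumptions, not the merged behavior:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// throttled reports whether the tablet at addr asks callers to back off.
// Treating any non-200 answer (or an error) as "throttle" errs on the safe
// side; addr and the retry interval here are illustrative.
func throttled(addr string) bool {
	resp, err := http.Get(addr + "/throttler/check-self")
	if err != nil {
		return true
	}
	defer resp.Body.Close()
	return resp.StatusCode != http.StatusOK
}

func main() {
	for throttled("http://127.0.0.1:8080") {
		time.Sleep(time.Second) // wait before attempting more reads/writes
	}
	fmt.Println("ok to proceed")
}
```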

@shlomi-noach

#7324 extends #7319 and adds source-side throttling based on /throttler/check-self.

@shlomi-noach

shlomi-noach commented Jan 27, 2021

Code-wise this is complete now that #7324 and #7364 are merged.

Documentation pending.

@shlomi-noach

Documentation: vitessio/website#689

@shlomi-noach

Code and documentation are merged. Functionality is complete for now, and there is no plan for further work at this time.

@mattlord

mattlord commented Sep 25, 2024

@shlomi-noach I'm going to close this as done for now. We can re-open it if needed. In that case, can you clarify what's left in relation to this issue? Thanks!

@shlomi-noach

This was definitely done.
