Tracking issue: VReplication throttling #7362
#7319 introduces a |
Code-wise this is complete now that #7324 and #7364 are merged. Documentation pending.
Documentation: vitessio/website#689
Code and documentation are merged. Functionality is complete for now, and there is no plan for further work at this time.
@shlomi-noach I'm going to close this as done for now. We can re-open it if needed. In that case, can you clarify what's left in relation to this issue? Thanks!
This was definitely done.
We wish to introduce throttling in VReplication to avoid overwhelming the databases with reads/writes.
VReplication's current behavior is greedy: if it can read table data, it will read it and push it downstream. If it can read binary logs, it will, and push them downstream. If it can pull either table data or binary logs from upstream, it will, and write them onto the database.
In the current design, the source engine and the target engine throttle one another, at the rate of their respective database capabilities. As an example, assume the source is a replica tablet and the target is a primary tablet. The target tablet requests to pull data from the source. The source reads from the replica and pushes downstream; the target intercepts and writes to the backend primary MySQL. If the backend primary MySQL is too busy, it pushes back on the target tablet, which in turn stops processing events from upstream, which in turn throttles reading from the MySQL replica. Conversely, if the source replica is slow to respond, that dictates the pace of writes to the target primary.
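The mutual throttling described above is ordinary backpressure. As a rough illustration only (this is not VReplication code; the `event` type, the `stream` function, and the sleep-based costs are all made up for the sketch), an unbuffered Go channel shows how the slower side sets the pace for both:

```go
package main

import (
	"fmt"
	"time"
)

// event stands in for a row or a binlog event (hypothetical type).
type event int

// stream moves n events from a simulated source to a simulated target.
// readCost and writeCost model the time each backend MySQL needs per event.
// It returns the number of events written and the total elapsed time.
func stream(n int, readCost, writeCost time.Duration) (int, time.Duration) {
	events := make(chan event) // unbuffered: a send blocks until the target receives
	start := time.Now()

	go func() {
		defer close(events)
		for i := 0; i < n; i++ {
			time.Sleep(readCost) // read from the source replica
			events <- event(i)   // blocks while the target is still writing
		}
	}()

	written := 0
	for range events {
		time.Sleep(writeCost) // write to the target primary
		written++
	}
	return written, time.Since(start)
}

func main() {
	_, slowTarget := stream(5, time.Millisecond, 20*time.Millisecond)
	_, slowSource := stream(5, 20*time.Millisecond, time.Millisecond)
	fmt.Println("slow target:", slowTarget)
	fmt.Println("slow source:", slowSource)
}
```

In both runs the total time is dominated by the slower side. Note that nothing in this loop looks at replication lag or any other measure of shard health, which is the gap this issue sets out to close.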
However, on both source and target sides, operations are aggressive on the MySQL servers. The engines will read as fast as the replica allows, or write as fast as the target primary allows. This, unfortunately, does not take into consideration the overall health of the source and target shards. We want to avoid overwhelming source/target shards so as to keep them healthy.
Specifically, we want to throttle writes on the target when those writes generate replication lag on the target shard. We want to throttle reads from the source replica when that replica itself is lagging. This will not only keep the shards in a healthy state, it will also make the cut-over safer and quicker, as at any point in time the lag between source and target is known to be small.
To that effect we first need to be able to throttle on a lagging replica (source side). Our existing table throttling only throttles on writes to a cluster's primary.
We will introduce a lag-based "check-self" throttle check on all tablets.
Some work has begun, illustrated in the next comments.
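To make the idea concrete, here is a minimal sketch of what a lag-based self check could look like in a read or write loop. It assumes a tablet can probe its own replication lag; `lagFunc`, `checkSelf`, `throttledRun`, and `decayingLag` are hypothetical names for illustration and are not the actual Vitess throttler API.

```go
package main

import (
	"fmt"
	"time"
)

// lagFunc reports the tablet's own replication lag (hypothetical probe).
type lagFunc func() time.Duration

// checkSelf passes when the tablet's own lag is at or below the threshold.
func checkSelf(lag lagFunc, threshold time.Duration) bool {
	return lag() <= threshold
}

// throttledRun executes each step only after checkSelf passes, sleeping
// for backoff after every failed check. It returns how many times it waited.
func throttledRun(steps []func(), lag lagFunc, threshold, backoff time.Duration) int {
	waits := 0
	for _, step := range steps {
		for !checkSelf(lag, threshold) {
			waits++
			time.Sleep(backoff)
		}
		step()
	}
	return waits
}

// decayingLag simulates a replica that starts at the given lag and
// recovers by dec on every probe, never going below zero.
func decayingLag(start, dec time.Duration) lagFunc {
	current := start
	return func() time.Duration {
		l := current
		current -= dec
		if current < 0 {
			current = 0
		}
		return l
	}
}

func main() {
	done := 0
	steps := []func(){func() { done++ }, func() { done++ }}
	lag := decayingLag(3*time.Second, time.Second)
	waits := throttledRun(steps, lag, time.Second, 10*time.Millisecond)
	fmt.Printf("steps=%d waits=%d\n", done, waits)
}
```

The same loop shape would apply on either side: the source checks its own replica before reading more rows or binlog events, and the target checks before applying the next batch of writes. The threshold and backoff values here are arbitrary.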
cc @rohit-nayak-ps