VStream API: allow aligning streams from different shards to minimize skews across the streams #7626

Merged
merged 6 commits into from
Mar 26, 2021



@rohit-nayak-ps commented Mar 6, 2021

Minimizing skew across shard streams in the VStream API

Description

When the VStream API streams from multiple shards, there are multiple sources of events: one primary or replica tablet for each shard in the provided filter. The rate at which a source streams events can depend on:

  • the replication lag on the source tablets (if a replica is selected)
  • the CPU load on the source tablet
  • possible network partitions or network delays

This can result in the events from some shards being well ahead of those from other shards. For example, if a row moves from a faster shard to a slower shard, we might see the delete event from the faster shard well before the insert from the slower one, so the row appears to go "invisible" for the duration of the skew. This can affect user experience in applications where these events are used to refresh a UI, for example.

For most applications, where VStream API events feed into change data capture systems for auditing or reporting purposes, these delays may be acceptable.

This PR adds a flag that the client can set to enable skew detection between the various streams. Once a skew is detected, events for streams that are ahead are held back until the skew reaches an acceptable level.

Skew Detection

Each vstreamer event (vevent) contains two timestamps: the time at which the database transaction occurred, and the current time on the source when the vevent was created. The difference between them tells us how far in the past the event we just received was created. We use this to determine which shard has the most recent event and which has the oldest. Note that for shards with no activity, vstreamer sends a heartbeat event every second. The transaction time for a heartbeat is the same as the current time on the source. (These heartbeats are not forwarded to the vstream, since they are synthetic vreplication events.)
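The age computation described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the `event` struct, its field names, and the helpers `ageSeconds` and `extremes` are all hypothetical, and both timestamps are assumed to be in seconds for simplicity.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a vevent. The field names are
// illustrative only and are not the real Vitess protobuf fields.
type event struct {
	Shard        string
	TxTimeSec    int64 // when the database transaction occurred (seconds)
	SourceNowSec int64 // source clock when the event was created (seconds)
}

// ageSeconds reports how far in the past the transaction occurred,
// measured entirely on the source's own clock.
func ageSeconds(e event) int64 {
	return e.SourceNowSec - e.TxTimeSec
}

// extremes returns the shard holding the most recent event (smallest age)
// and the shard holding the oldest event (largest age).
func extremes(latest []event) (mostRecent, oldest string) {
	if len(latest) == 0 {
		return "", ""
	}
	minE, maxE := latest[0], latest[0]
	for _, e := range latest[1:] {
		if ageSeconds(e) < ageSeconds(minE) {
			minE = e
		}
		if ageSeconds(e) > ageSeconds(maxE) {
			maxE = e
		}
	}
	return minE.Shard, maxE.Shard
}

func main() {
	latest := []event{
		// A heartbeat: transaction time equals the source's current time.
		{Shard: "-80", TxTimeSec: 1000, SourceNowSec: 1000},
		// A real transaction that committed 5 seconds before it was streamed.
		{Shard: "80-", TxTimeSec: 995, SourceNowSec: 1000},
	}
	mostRecent, oldest := extremes(latest)
	fmt.Printf("most recent: %s, oldest: %s\n", mostRecent, oldest)
}
```

Because both timestamps come from the same source, the subtraction does not depend on the clocks of different machines agreeing with each other.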

If the difference between the fastest and slowest streams is greater than a threshold, we declare that a skew has been detected. MySQL binlogs store the transaction timestamp in seconds. Also, on the vtgate serving the vstream, we adjust this time for clock skew between the vtgate and the source MySQL server. When the user sets the minimizeSkew flag, we want to keep the events across shards within the same second: the transaction timestamps should be within 1 second of each other. Fuzzy logic alert: to account for rounding of the transaction timestamp and the clock-skew adjustment, we set the threshold to 2 seconds instead of 1 second, so that we don't keep stalling the streams due to cumulative round-offs.
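As a rough sketch of the threshold check described above (not the implementation in this PR), the following assumes we keep the latest adjusted transaction timestamp per shard in a map. The names `adjustedTimestamp`, `detectSkew`, and `maxSkewSeconds` are hypothetical; only the 2-second threshold and the "greater than" comparison come from the description.

```go
package main

import (
	"fmt"
	"sort"
)

// maxSkewSeconds is the 2-second threshold from the description.
// The constant name is an assumption made for this sketch.
const maxSkewSeconds = 2

// adjustedTimestamp re-expresses a transaction time on the local (vtgate)
// clock: it computes how far in the past the transaction was on the source
// and subtracts that from the local time. All values are in seconds.
func adjustedTimestamp(localNowSec, sourceNowSec, txTimeSec int64) int64 {
	return localNowSec - (sourceNowSec - txTimeSec)
}

// detectSkew takes the latest adjusted timestamp per shard. It reports
// whether the spread between the fastest and slowest streams exceeds the
// threshold, and which shards are far enough ahead to be held back.
func detectSkew(latest map[string]int64) (skewed bool, ahead []string) {
	if len(latest) == 0 {
		return false, nil
	}
	first := true
	var minTs, maxTs int64
	for _, ts := range latest {
		if first || ts < minTs {
			minTs = ts
		}
		if first || ts > maxTs {
			maxTs = ts
		}
		first = false
	}
	if maxTs-minTs <= maxSkewSeconds {
		return false, nil
	}
	for shard, ts := range latest {
		if ts-minTs > maxSkewSeconds {
			ahead = append(ahead, shard)
		}
	}
	sort.Strings(ahead) // map iteration order is random; sort for stable output
	return true, ahead
}

func main() {
	localNow := int64(2000)
	latest := map[string]int64{
		"-80": adjustedTimestamp(localNow, 1000, 1000), // heartbeat, no lag
		"80-": adjustedTimestamp(localNow, 1003, 995),  // 8 seconds behind
	}
	skewed, ahead := detectSkew(latest)
	fmt.Println(skewed, ahead)
}
```

A caller would hold back events for the shards in `ahead` and re-run the check as new events arrive from the slower shards, releasing the held streams once the spread drops to the threshold or below.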

Some other changes:

  • Publish a vtgate-level stat to record the number of VStream API events that get delayed due to skew. It is also used to validate the logic in the unit tests.
  • The mock vstreamer used in the tests has been enhanced to also accept events from a channel, so that events from multiple source streams can be trickled in the specific order required for testing skew detection.
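To illustrate the idea of a channel-fed mock, here is a minimal sketch. It is not the mock in this PR: `mockEvent`, `mockStreamer`, and its `stream` method are hypothetical names, and the real test helper may be structured quite differently.

```go
package main

import "fmt"

// mockEvent is a hypothetical, simplified event used only in this sketch.
type mockEvent struct {
	Shard     string
	TxTimeSec int64
}

// mockStreamer is a hypothetical test double. It first replays any canned
// events and then forwards whatever the test pushes onto the channel,
// in exactly the order the test sent it.
type mockStreamer struct {
	canned []mockEvent
	events <-chan mockEvent // optional; nil means canned events only
}

// stream delivers events to send until the channel is closed or send fails.
func (m *mockStreamer) stream(send func(mockEvent) error) error {
	for _, e := range m.canned {
		if err := send(e); err != nil {
			return err
		}
	}
	if m.events == nil {
		return nil
	}
	for e := range m.events {
		if err := send(e); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// The test controls the interleaving across shards by choosing the
	// order in which it writes to the channel.
	ch := make(chan mockEvent, 3)
	ch <- mockEvent{Shard: "-80", TxTimeSec: 1000}
	ch <- mockEvent{Shard: "80-", TxTimeSec: 995}
	ch <- mockEvent{Shard: "-80", TxTimeSec: 1001}
	close(ch)

	m := &mockStreamer{events: ch}
	err := m.stream(func(e mockEvent) error {
		fmt.Println(e.Shard, e.TxTimeSec)
		return nil
	})
	if err != nil {
		fmt.Println("stream error:", err)
	}
}
```

Driving the mock from a channel lets a test decide when each shard's next event arrives, which is what skew detection needs in order to be exercised deterministically.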

Related Issue(s)

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Impacted Areas in Vitess

Components that this PR will affect:

  • Query Serving
  • VReplication
  • Cluster Management
  • Build/CI
  • VTAdmin

@rohit-nayak-ps changed the title from "Rn vstream skew" to "VStream API: allow aligning streams from different shards to minimize skews across the streams" Mar 6, 2021
@rohit-nayak-ps force-pushed the rn_vstream_skew branch 3 times, most recently from 62b2c86 to e29a218 March 12, 2021 07:42
@rohit-nayak-ps requested review from sougou, deepthi and harshit-gangal and removed request for sougou and deepthi March 12, 2021 09:12
@rohit-nayak-ps marked this pull request as ready for review March 12, 2021 09:12
@rohit-nayak-ps requested a review from systay as a code owner March 12, 2021 09:12
@aquarapid self-requested a review March 17, 2021 21:57

@sougou left a comment

While reviewing this code, I realized that VStream regroups a chunked transaction into a single packet. This could blow up memory if the transaction is huge (which we have seen in the past). So I think we'll have to change the logic to send chunks if we receive chunks.

But at the same time, we have to ensure that transactions are not fragmented, and that skew detection continues to work correctly. So a refactor may be needed.

We can do this later.
