Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VSCopy: Resume the copy phase consistently from given GTID and lastpk #11103

Merged
merged 20 commits into from
Nov 10, 2022

Conversation

yoheimuta
Copy link
Contributor

@yoheimuta yoheimuta commented Aug 26, 2022

Description

The current vstream API doesn't resume a copy phase given the below VGTID.
Instead, it starts replicating the table from the given GTID while ignoring the lastpk information.

vgtid {
  shard_gtids {
    keyspace: "test_sharded_keyspace"
    shard: "-80"
    gtid: "MySQL56/ff540f7a-1d4d-11ed-93b5-0242ac110002:1-28"
    table_p_ks {
      table_name: "t1"
      lastpk {
        rows {
          lengths: 4
          values: "1000"
 ...

This PR allows vstream consumers to deal with the VGTID event as an opaque offset.
Given the non-empty GTID and the lastpk, the API is supposed to do the following:

  1. Catch up the t1 table from "MySQL56/ff540f7a-1d4d-11ed-93b5-0242ac110002:1-28" to the nearly latest
  2. Copy the t1 table with the condition "pk > 1000"

Related Issue(s)

Discussion: https://vitess.slack.com/archives/C0PQY0PTK/p1661155731380559

Implementation

Modify uvstreamer Init

Allow a non-empty position when tablePKs are specified. In this case an initlal catchup+fastforward phase will run before restarting copy phase on incompletely copied tables. (reference: https://vitess.io/docs/design-docs/vreplication/life-of-a-stream/)

Change how we filter events

Since we now run a catchup phase when we restart table copies, it is possible that duplicate events can be sent for rows that are not already copied, since we don't yet have logic to compare primary keys. To minimise this we filter out events for tables whose copy phase has not yet started.

Test changes

  1. New test TestVStreamCopyResume: passes both tablePKs and a specific position to validate new logic
    https://github.com/vitessio/vitess/pull/11103/files?diff=unified&w=1#diff-c98b8f318aac471272bf650c7824713c9664b46895380acf322ec3498a2fe578R236

  2. Updated test TestVStreamCopyCompleteFlow to account for change in logic for selecting events to send

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation PR: Document VStream website#1216

Pending issues

  1. Filter out events for rows not copied, for tables whose primary keys can be compared in code (essentially those that our eval engine can compare). This is not critical, since clients have to handle duplicate events in any case.
  2. Batch the copy phase (like we do for vreplication) so that we don't end up with a large vreplication lag (and possibly rotated binlogs). We didn't expect this to be critical since we are just streaming events, rather than inserting rows on the target, like we do for vreplication. But it is possible that some clients do more cpu/IO-intensive operations than just sending them to event sinks.

@vitess-bot
Copy link
Contributor

vitess-bot bot commented Aug 26, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@mattlord mattlord added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: VReplication labels Aug 26, 2022
@mattlord mattlord self-assigned this Aug 26, 2022
@mattlord mattlord requested review from rohit-nayak-ps and removed request for systay, shlomi-noach and harshit-gangal August 29, 2022 13:55
Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need to find a way to implement this high level feature w/o changing the basic copy phase event filtering for all VStreams in uvstreamer (the unified vstreamer abstraction/primitive that everything else uses) as this changes key behavior for VReplication workflows and other things.

I also don't think that change should be necessary as we should be able to mirror the VReplication MoveTables/Reshard behavior here. @rohit-nayak-ps and I can discuss further.

@mattlord mattlord self-requested a review August 30, 2022 05:13
@yoheimuta
Copy link
Contributor Author

@rohit-nayak-ps Thank you for your updates!
I reviewed it, and looks good to me 👍

@mattlord mattlord added release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) and removed Do Not Merge labels Nov 7, 2022
Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and makes sense to me! I only had some very minor question/comments/suggestions.

I'll push a commit now to mention this new feature in the 16.0 release notes.

We should also add a new VStream page in the 16.0 VReplication docs that can reference the existing copy table design doc. There we can touch on what it is, what use cases it serves, how it works and how to use it. We don't have any docs on the VTGate VStream API feature today (only the somewhat dated design doc) — so many people don't even know that it exists. I can get a PR started for that and add a reference in the description.

go/vt/vttablet/tabletserver/vstreamer/uvstreamer.go Outdated Show resolved Hide resolved
go/vt/vttablet/tabletserver/vstreamer/uvstreamer.go Outdated Show resolved Hide resolved
@mattlord mattlord requested a review from deepthi as a code owner November 7, 2022 23:47
mattlord added a commit to vitessio/website that referenced this pull request Nov 8, 2022
And the new vstream copy resume work added in:
  vitessio/vitess#11103

Signed-off-by: Matt Lord <[email protected]>
Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My review comments were extremely minor -- really only comment related -- so I'm going to approve now that I pushed an update to the release notes and started on the docs work (which is now linked): vitessio/website#1216

Thank you so much for the time you spent on this @yoheimuta !

@mattlord mattlord removed the release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) label Nov 8, 2022
mattlord added a commit to vitessio/website that referenced this pull request Nov 8, 2022
And the new vstream copy resume work added in:
  vitessio/vitess#11103

Signed-off-by: Matt Lord <[email protected]>
Copy link
Contributor

@rohit-nayak-ps rohit-nayak-ps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Collaborator

@systay systay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nit-picks, but they are not stopping this PR

Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord requested a review from rsajwani as a code owner November 10, 2022 15:05
@mattlord mattlord merged commit e2e0699 into vitessio:main Nov 10, 2022
@yoheimuta yoheimuta deleted the vscopy-restart branch November 11, 2022 01:48
mattlord added a commit to vitessio/website that referenced this pull request Jan 7, 2023
* Document VStream

And the new vstream copy resume work added in:
  vitessio/vitess#11103

Signed-off-by: Matt Lord <[email protected]>

* Add more info

Signed-off-by: Matt Lord <[email protected]>

* Add VStreamFlags link

Signed-off-by: Matt Lord <[email protected]>

* Address review comments

Signed-off-by: Matt Lord <[email protected]>

* Document all RPC parameters and flags

Signed-off-by: Matt Lord <[email protected]>

* Various improvements after self review

Signed-off-by: Matt Lord <[email protected]>

Signed-off-by: Matt Lord <[email protected]>
timvaillancourt pushed a commit to slackhq/vitess that referenced this pull request May 27, 2024
…vitessio#11103)

* VSCopy: Demonstrate to fail a test case on which the vstream API is supposed to resume the copy phase consistently

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Resume the copy phase consistently from given GTID and lastpk

Signed-off-by: yoheimuta <[email protected]>

* Build out the unit test some more

Signed-off-by: Matt Lord <[email protected]>

* Update tests for new behavior

Signed-off-by: Matt Lord <[email protected]>

* Improve comments

Signed-off-by: Matt Lord <[email protected]>

* Limit uvstreamer changes and update test

Signed-off-by: Matt Lord <[email protected]>

* Revert uvstreamer test changes

Signed-off-by: Matt Lord <[email protected]>

* Revert all uvstream changes

Signed-off-by: Matt Lord <[email protected]>

* VCopy: Revert the last three commits

Signed-off-by: yoheimuta <[email protected]>

* VCopy: Add a new vstream type that allows picking up where we left off

Signed-off-by: yoheimuta <[email protected]>

* VCopy: Revert the unit test change

Signed-off-by: yoheimuta <[email protected]>

* VCopy: Fix the end-to-end CI test

Signed-off-by: yoheimuta <[email protected]>

* Update logic for setting up uvstreamer based on input vgtid/tablepks. Add more catchup events to test

Signed-off-by: Rohit Nayak <[email protected]>

* Refactor logic to decide if event is to be sent. Enhance unit and e2e tests.

Signed-off-by: Rohit Nayak <[email protected]>

* Don't send events for tables which we can identify as ones we haven't started copy for

Signed-off-by: Rohit Nayak <[email protected]>

* Minor changes after self-review

Signed-off-by: Rohit Nayak <[email protected]>

* Add vstream copy resume to release notes

Signed-off-by: Matt Lord <[email protected]>

* Address review comments

Signed-off-by: Matt Lord <[email protected]>

Signed-off-by: yoheimuta <[email protected]>
Signed-off-by: Matt Lord <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Co-authored-by: Matt Lord <[email protected]>
Co-authored-by: Rohit Nayak <[email protected]>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request May 28, 2024
* add vtgate flag that explicitly allows vstream copy (#125)

* fix fs.BoolVar

Signed-off-by: Tim Vaillancourt <[email protected]>

* VSCopy: Resume the copy phase consistently from given GTID and lastpk (vitessio#11103)

* VSCopy: Demonstrate to fail a test case on which the vstream API is supposed to resume the copy phase consistently

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Resume the copy phase consistently from given GTID and lastpk

Signed-off-by: yoheimuta <[email protected]>

* Build out the unit test some more

Signed-off-by: Matt Lord <[email protected]>

* Update tests for new behavior

Signed-off-by: Matt Lord <[email protected]>

* Improve comments

Signed-off-by: Matt Lord <[email protected]>

* Limit uvstreamer changes and update test

Signed-off-by: Matt Lord <[email protected]>

* Revert uvstreamer test changes

Signed-off-by: Matt Lord <[email protected]>

* Revert all uvstream changes

Signed-off-by: Matt Lord <[email protected]>

* VCopy: Revert the last three commits

Signed-off-by: yoheimuta <[email protected]>

* VCopy: Add a new vstream type that allows picking up where we left off

Signed-off-by: yoheimuta <[email protected]>

* VCopy: Revert the unit test change

Signed-off-by: yoheimuta <[email protected]>

* VCopy: Fix the end-to-end CI test

Signed-off-by: yoheimuta <[email protected]>

* Update logic for setting up uvstreamer based on input vgtid/tablepks. Add more catchup events to test

Signed-off-by: Rohit Nayak <[email protected]>

* Refactor logic to decide if event is to be sent. Enhance unit and e2e tests.

Signed-off-by: Rohit Nayak <[email protected]>

* Don't send events for tables which we can identify as ones we haven't started copy for

Signed-off-by: Rohit Nayak <[email protected]>

* Minor changes after self-review

Signed-off-by: Rohit Nayak <[email protected]>

* Add vstream copy resume to release notes

Signed-off-by: Matt Lord <[email protected]>

* Address review comments

Signed-off-by: Matt Lord <[email protected]>

Signed-off-by: yoheimuta <[email protected]>
Signed-off-by: Matt Lord <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Co-authored-by: Matt Lord <[email protected]>
Co-authored-by: Rohit Nayak <[email protected]>

* VSCopy: Send COPY_COMPLETED events when the copy operation is done (vitessio#11740)

* VSCopy: Demonstrate to fail a test case on which the vstream API sends new events showing copy completed

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Send new events when the copy operation is done

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Fix typo

Signed-off-by: yoheimuta <[email protected]>

* Initialize new map for the 'vstream * from' vtgate sql interface. Make vtadmin web protos

Signed-off-by: Rohit Nayak <[email protected]>

* VSCopy: Make TestVStreamCopyBasic fail fast to avoid the end2end timeout out

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: stop sharing the 't1' table among multiple test cases running concurrently

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: refactor the function signature to be clearer

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: refactor the VEvents sorter to be simpler

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: refactor to stop the sorter from including a fully copied event

Signed-off-by: yoheimuta <[email protected]>

Signed-off-by: yoheimuta <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Co-authored-by: Rohit Nayak <[email protected]>

* VSCopy: Enable to copy from all shards in either a specified keyspace or all keyspaces (vitessio#11909)

* VSCopy: Demonstrate to fail a test case on which the vstream API request doesn't include keyspace and shard

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Copy from all shards in all keyspaces by specifying only an empty gtid

Signed-off-by: yoheimuta <[email protected]>

* tests: Make TestRowCount stable regardless of the number of keyspaces

Signed-off-by: yoheimuta <[email protected]>

* tests: Cleanup TestCreateAndDropDatabase correctly to stop TestVStreamCopyWithoutKeyspaceShard from failing when running tests together

Signed-off-by: yoheimuta <[email protected]>

* tests: Tweak to fix a comment

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: fix the unit tests when the input vgtid with an empty gtid lacks either keyspace or shard

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Keyspace wildcard selection lines up with the table wildcard selection

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Tests the VCopy with multiple keyspaces and resharding

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Make TestVStreamCopyMultiKeyspaceReshard clearer to check if the streaming two keyspaces works even after reshard

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Return an invalid argument error if shards are unspecified and gtid is neither 'current' nor empty

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Add a test description about its purpose and target

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Remove duplicate literals in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Retain defaultReplicas variable in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Explain why we are setting Match to 'customer.*' in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Remove an unused VStreamFlag for the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Use sentence capitalization in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Verify that we didn't lose any events or get duplicates of the keyspace being reshareded in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Return a value instead of a pointer because there is no need to modify the value

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Add a comment describing what TestVStreamCopyFromAllKeyspacesAndAllShards is doing and why

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Add a comment describing why we expect these specific numbers of events from VStream API

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Tweak the test case name

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Make a utility function to sort COPY_COMPLETED events in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Replace the matcher with a simpler one in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Move the print debug call to the FailNow section in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Use require.NoError in new tests

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Use require instead of t.Fatalf in the test

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Apply the reviewer's suggestion to make the error message easier to read

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Add a comment noting what we're actually testing

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Correct the test comment and elaborate the special-case

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Tweak an error message and a comment

Signed-off-by: yoheimuta <[email protected]>

* VSCopy: Adjust to a change in the signature of a test function that was introduced in the main repository

Signed-off-by: yoheimuta <[email protected]>

---------

Signed-off-by: yoheimuta <[email protected]>

* attempt unit test fix

Signed-off-by: Tim Vaillancourt <[email protected]>

* update test error expected

Signed-off-by: Tim Vaillancourt <[email protected]>

---------

Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: yoheimuta <[email protected]>
Signed-off-by: Matt Lord <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Co-authored-by: pbibra <[email protected]>
Co-authored-by: yohei yoshimuta <[email protected]>
Co-authored-by: Matt Lord <[email protected]>
Co-authored-by: Rohit Nayak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: VReplication Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants