New MoveTables and Resharding Workflows (v2) #7225

Closed
rohit-nayak-ps opened this issue Dec 26, 2020 · 3 comments

@rohit-nayak-ps
Contributor

rohit-nayak-ps commented Dec 26, 2020

New MoveTables and Resharding Workflows (V2)

Experimental

At this time, the v2 workflow changes are being reviewed and tested. Some commands and options may be modified. Feedback, suggestions, and improvements are very welcome!

TL;DR

Over the last year and a half, VReplication has been used by a lot of users in several different environments, ranging from huge production-scale deployments to new users adopting Vitess using VReplication workflows. Their experience and feedback have suggested changes which we are incorporating as improved CLI workflows.

The current workflows for MoveTables and Reshard are somewhat confusing and fragmented. The vtctl commands related to these workflows are not tightly coupled, and some of their parameters are confusing.

The new workflows aim to create a uniform view so that users will always use a single MoveTables or Reshard command to refer to these workflows over their entire life cycle with sub-commands to navigate through each workflow. They also add/simplify operational aspects like following copy/replication progress and determining the current state of a workflow.

Additionally, SwitchReads and SwitchWrites are replaced by a more “correct” SwitchTraffic command, which operates at the level of tablet types instead of reads and writes. Traffic for each tablet type can be switched independently of the others; currently you need to run SwitchReads before SwitchWrites.

Motivation

There are issues with the current workflow-related commands

  • SwitchReads/SwitchWrites/DropSources appear to be standalone commands. But they only make sense in a particular state of a workflow

  • They behave differently depending on whether they are used with a MoveTables or a Resharding workflow

  • They don’t make sense for custom Materialize workflows

  • SwitchReads and SwitchWrites are incorrect terms: SwitchReads does not switch reads for primary tablet types whereas SwitchWrites switches reads as well as writes to the primary tablet.

    There are two reasons for supporting these separate commands:

    • In the early days of VReplication we had yet to gain full confidence in the core algorithm. Also the backing control plane was developing organically
    • Large production setups may prefer to migrate in steps for operational reasons (for specific tablet types and one cell at a time, say)
  • Reversing reads and writes is confusing: the -reverse flag, used to reverse replica/rdonly reads, is applied to the forward workflow, but to reverse writes we need to address the reverse workflow. Ideally the reverse workflow should be opaque.

Not enough visibility

  • MoveTables just starts a workflow and is expected to run to completion. While Workflow Show does give all details, one needs to look at different attributes like the message column, current state, gtid position to deduce whether it is running fine
  • As a consequence of not tracking the overall state of the workflow the available transitions ( ⇒ valid subcommands) are not immediately apparent. This is particularly important in the context of the upcoming UI/UX for VReplication in vtAdmin.

Functionality issues

  • Currently we require reads to be switched before writes. This limitation is not required by VReplication or by Vitess, so we can make the workflow more flexible by allowing reads and writes to be switched in any order

Design

State Machine

We have an implicit state machine today in a resharding workflow which is documented and partially validated in code. We propose to use an explicit state machine using the fsm package. In a previous iteration we had used a finite state machine to make the state explicit.

However, the fact that Vitess intrinsically supports switching reads for each tablet_type at the cell level meant that we would either have partial states or fall into the anti-pattern of requiring a separate set of states (potentially huge if the number of cells is high) for each deployment.

We now deduce the state as discussed below. The deduced state can be used both to display the current state to users and to support the upcoming vtAdmin UI being built for monitoring and managing workflows.

In the spirit of keeping the workflow state in Vitess (as opposed to storing it in the topo), we don’t store a workflow state separately but deduce it programmatically from different artifacts: information in the _vt tables such as vreplication/copy_state in all participating shards, the topo routing rules, and SrvKeyspace. While it seems wasteful to recompute this every time, in practice it is not inefficient since we only do this when the user invokes a CLI command. The other option is to cache the current state in the topo, but keeping it consistent is difficult due to possible races and edge cases.
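To make the idea of a deduced state concrete, here is a minimal sketch in Python. It is not the actual Vitess implementation: TrafficSnapshot, its fields, and deduce_state are hypothetical names, and the snapshot stands in for whatever would be read from the routing rules and SrvKeyspace. It only illustrates why a per-cell, per-tablet-type description is easier to compute on demand than to enumerate as explicit states.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Tablet types whose reads can be switched per cell (names taken from the proposal).
READ_TYPES = ("replica", "rdonly")


@dataclass
class TrafficSnapshot:
    """Hypothetical view of what routing rules / SrvKeyspace would tell us."""
    writes_switched: bool
    # cell name -> tablet types whose reads already point at the target
    reads_switched: Dict[str, List[str]] = field(default_factory=dict)
    # every cell that participates in the workflow
    all_cells: List[str] = field(default_factory=list)


def deduce_state(snap: TrafficSnapshot) -> str:
    """Build a human-readable state string from the snapshot, without storing it."""
    parts = []
    for tablet_type in READ_TYPES:
        cells = sorted(
            c for c in snap.all_cells
            if tablet_type in snap.reads_switched.get(c, [])
        )
        if not cells:
            continue
        if len(cells) == len(snap.all_cells):
            parts.append(f"all {tablet_type} reads switched")
        else:
            parts.append(f"{tablet_type} reads switched in cells: {','.join(cells)}")
    if snap.writes_switched:
        parts.append("writes switched")
    return "; ".join(parts) if parts else "nothing switched"


if __name__ == "__main__":
    partial = TrafficSnapshot(
        writes_switched=False,
        reads_switched={"zone1": ["replica"]},
        all_cells=["zone1", "zone2"],
    )
    print(deduce_state(partial))
```

Because the string is rebuilt from the snapshot on every call, there is no cached state that can drift out of sync with the routing rules.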

Subcommand Pattern

Instead of the different commands we have now (MoveTables/SwitchReads/SwitchWrites/DropSources), we bring all of them under a single umbrella command: MoveTables or Reshard, depending on which type of vreplication workflow is being used.

Each subcommand will take any additional parameters/flags that are relevant to it as options, in the form shown under Usage.

Single Switch/Reverse command

SwitchReads and SwitchWrites are replaced by a single pair of commands, SwitchTraffic and ReverseTraffic. This correctly reflects the fact that SwitchReads was switching traffic for replica and rdonly tablets and SwitchWrites for primary tablets.

To switch or reverse only some tablet types, specify them in the comma-separated -tablet_types parameter. Omitting -tablet_types switches all traffic, so a single command can switch a workflow completely forward or backward.
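The defaulting rule for -tablet_types could be sketched as follows. This is only an illustration in Python: parse_tablet_types and ALL_TABLET_TYPES are hypothetical names, the set of accepted values is taken from the examples in this issue (master, replica, rdonly), and rejecting unknown values is an assumption, not confirmed behavior.

```python
# Tablet types that appear in the examples of this proposal.
ALL_TABLET_TYPES = ("master", "replica", "rdonly")


def parse_tablet_types(csv_value: str = "") -> list:
    """Return the tablet types to switch; an empty value means all of them."""
    if not csv_value.strip():
        return list(ALL_TABLET_TYPES)
    requested = [t.strip().lower() for t in csv_value.split(",") if t.strip()]
    unknown = [t for t in requested if t not in ALL_TABLET_TYPES]
    if unknown:
        # Assumption: unknown types are an error rather than being ignored.
        raise ValueError(f"unknown tablet types: {', '.join(unknown)}")
    # Drop duplicates while keeping the order the user gave.
    return list(dict.fromkeys(requested))


if __name__ == "__main__":
    print(parse_tablet_types("replica,rdonly"))
    print(parse_tablet_types())
```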

Usage

MoveTables -options <SubCommand> <targetKs.workflow>

Some examples:

MoveTables is now Start
MoveTables -source <sourceKeyspace> -tables <tableSpecs> Start ks.wf1

SwitchReads is now SwitchTraffic
MoveTables -cells <cell1> -tablet_types replica,rdonly SwitchTraffic ks.wf1

SwitchWrites is now SwitchTraffic
MoveTables -tablet_types master SwitchTraffic ks.wf1

SwitchReads -reverse is now ReverseTraffic
MoveTables -tablet_types replica,rdonly ReverseTraffic ks.wf1

DropSources is now Complete
MoveTables Complete ks.wf1

New, to abort an unswitched workflow: Abort
MoveTables Abort ks.wf1

New, to get current state/progress: Show/Progress
MoveTables Show/Progress ks.wf1
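The mapping from the old commands to the new subcommands, and the "MoveTables -options <SubCommand> <targetKs.workflow>" shape, can be summarized in a small Python sketch. The helper build_command and the V1_TO_V2 table are hypothetical and exist only for illustration; the subcommand and flag names are the ones listed in the examples above, and nothing here calls vtctl.

```python
# Old command -> new subcommand, as listed in the examples of this proposal.
V1_TO_V2 = {
    "MoveTables": "Start",
    "SwitchReads": "SwitchTraffic",
    "SwitchWrites": "SwitchTraffic",
    "SwitchReads -reverse": "ReverseTraffic",
    "DropSources": "Complete",
}

WORKFLOW_TYPES = {"MoveTables", "Reshard"}
SUBCOMMANDS = {
    "Start", "SwitchTraffic", "ReverseTraffic",
    "Complete", "Abort", "Show", "Progress",
}


def build_command(workflow_type, subcommand, target, options=None):
    """Assemble arguments in the order: command, -options, subcommand, target."""
    if workflow_type not in WORKFLOW_TYPES:
        raise ValueError(f"unknown workflow type: {workflow_type}")
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    args = [workflow_type]
    for flag, value in (options or {}).items():
        args += [f"-{flag}", value]
    args += [subcommand, target]
    return args


if __name__ == "__main__":
    cmd = build_command(
        "MoveTables", V1_TO_V2["SwitchReads"], "ks.wf1",
        {"tablet_types": "replica,rdonly"},
    )
    print(" ".join(cmd))
```

Keeping the workflow name as the last argument means every step of the life cycle addresses the same ks.wf1, which is the point of the subcommand pattern.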

Notes

Existing Functionality

Except for the changes needed to make SwitchReads and SwitchWrites independent, the changes are essentially “cosmetic”: we will not be changing the underlying functionality implemented in the wrangler, traffic_switcher and materializer packages.

Other Improvements

We will take this opportunity to also implement some additional functionality

  • Show (approximate) progress of a workflow using the information_schema, last processed event timestamps and data in copy_state. For workflows in Copy state we get the total row counts and disk sizes of all target and source shards using the values in the information_schema.
  • Anticipating the move to the new vtctld implementation, we will use the subcommand pattern, which fits well with cobra, the library that implementation uses.
  • A new flag for DropSources/Complete to keep the data: the source tables or shards are not dropped and only the vreplication-related artifacts are cleaned up.
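As a rough illustration of the approximate progress calculation for a workflow in the Copy state, the sketch below compares per-table row counts on the target against the source. The function copy_progress and its inputs are hypothetical; in practice the counts would come from information_schema, where row counts for InnoDB tables are estimates, so the result is only approximate.

```python
def copy_progress(source_rows: dict, target_rows: dict) -> dict:
    """Return an approximate percent-copied figure per table.

    source_rows / target_rows map table name -> estimated row count,
    summed over all source or target shards.
    """
    progress = {}
    for table, src in source_rows.items():
        tgt = target_rows.get(table, 0)
        if src <= 0:
            # An empty source table has nothing left to copy.
            progress[table] = 100.0
        else:
            # Estimates can overshoot, so cap the figure at 100%.
            progress[table] = round(min(100.0, 100.0 * tgt / src), 1)
    return progress


if __name__ == "__main__":
    print(copy_progress({"customer": 1000, "orders": 0}, {"customer": 250}))
```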

Deprecation and Backward Compatibility

  • The new functionality will be backwards compatible with the current one
  • The new MoveTables/Reshard commands will be invoked if the -v2 flag is set
  • Once we feel we are feature-complete and the functionality is stable, we will deprecate the current MoveTables/Reshard commands behind a -v1 flag and then remove them along with the other commands like SwitchReads/SwitchWrites/DropSources.

Associated PRs:

@artemvovk

One note is that MoveTables doesn't support a customized source_expression (e.g. in the case where some historical data in the table does not fully comply with the new schema - MySQL allows invalid data to be stored in ENUM columns), so I've actually been using Materialize and combinations of SwitchReads and SwitchWrites to simulate a MoveTables workflow.

If the new version of MoveTables is able to handle that, it'd be wonderful.

@rohit-nayak-ps
Contributor Author

One note is that MoveTables doesn't support a customized source_expression (e.g. in case where some historical data in the table does not fully comply with new schema - MySQL allows invalid data to be stored in ENUM columns)

Valid point. Thanks for the input. We will plan to support this in the near future. Can you also provide examples of the kind of source expressions you use, so we can use those as test cases?

@artemvovk

here's one. The problem I ran into is that a lot of enum columns had '' values (due to mysql happily swallowing invalid enum value writes)

  selectSQL: "SELECT id, site_id, CASE scope WHEN '' THEN NULL ELSE scope END as scope, actor_id, CASE actor_type WHEN '' THEN NULL ELSE actor_type END as actor_type, verb, acted_id, CASE acted_type WHEN '' THEN NULL ELSE acted_type END as acted_type, meta, created_at, account_id FROM activities;"

  ddl: CREATE TABLE `activities` (`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `site_id` bigint(20) unsigned NOT NULL, `scope` enum('account_closure','hosted_page','merchant','recurring','token_api','transparent_post') DEFAULT NULL, `actor_id` bigint(20) unsigned DEFAULT NULL, `actor_type` enum('ApiKey','User','Services::MerchantIdentity::OauthClient') DEFAULT NULL, `verb` varchar(50) DEFAULT NULL, `acted_id` bigint(20) unsigned DEFAULT NULL, `acted_type` enum('Account','Charge','CouponRedemption','GiftCard','Invoice','Subscription','SubscriptionChange','Transaction') DEFAULT NULL, `meta` text, `created_at` datetime(6) NOT NULL, `account_id` bigint(20) unsigned DEFAULT NULL, `request_id` varchar(50) DEFAULT NULL, PRIMARY KEY (`id`), KEY `index_activities_on_account_id` (`account_id`), KEY `index_activities_on_acted_id_and_acted_type` (`acted_id`,`acted_type`), KEY `index_activities_on_created_at` (`created_at`), KEY `index_activites_on_site_id_and_acted_type` (`site_id`,`acted_type`), KEY `index_activities_on_site_id_and_actor_id` (`site_id`,`actor_id`), KEY `index_activities_on_site_id_and_created_at` (`site_id`,`created_at`), KEY `index_activities_on_request_id` (`request_id`)) ENGINE=InnoDB  DEFAULT CHARSET=utf8;
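For use as a test case, the pattern in the selectSQL above (map '' to NULL for every enum column, pass the other columns through) can be generated mechanically. The Python helper below is a hypothetical sketch, not part of Vitess; enum_safe_select and its parameters are made-up names, and it assumes the column names are trusted identifiers that need no quoting.

```python
def enum_safe_select(table, columns, enum_columns):
    """Build a SELECT that turns '' into NULL for the given enum columns."""
    exprs = []
    for col in columns:
        if col in enum_columns:
            # Same CASE form as in the selectSQL quoted in this comment.
            exprs.append(f"CASE {col} WHEN '' THEN NULL ELSE {col} END as {col}")
        else:
            exprs.append(col)
    return f"SELECT {', '.join(exprs)} FROM {table};"


if __name__ == "__main__":
    print(enum_safe_select(
        "activities",
        ["id", "site_id", "scope", "actor_id", "actor_type"],
        {"scope", "actor_type"},
    ))
```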
