RFC: VSchema based routing and resharding #4790

Open
sougou opened this issue Apr 7, 2019 · 1 comment

sougou commented Apr 7, 2019

Current design

VTGate performs routing using different methods:

  1. ServedFrom: If a keyspace is ‘served from’ another keyspace, then all queries that must go to that keyspace are sent to the source keyspace. This method is used for vertical resharding.
  2. Shard Map: There exists a shard map per tablet type. This is used during horizontal resharding. We start with all tablet types pointing to the source shard map. As we migrate each served type, the tablet types point to the target shard map.
  3. VSchema: The VSchema handles all other forms of intelligent routing:
    1. Figure out which keyspace to send a query to based on the table name.
    2. Identify the vindex to use, map it to a keyspace id and use the shard map to identify the shard to send the query to.
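To make the VSchema flow above concrete, here is a minimal sketch in Python (not actual Vitess code). The table-to-keyspace mapping, the `toy_vindex` hash and the two-shard map are all assumptions made up for illustration; a real vindex and shard map would differ.

```python
import hashlib
from typing import List, Tuple

# Hypothetical data: which keyspace owns each table (step 1).
TABLE_TO_KEYSPACE = {"customer": "commerce"}

# Hypothetical shard map: (shard name, start, end) key ranges as bytes.
# An empty bound means "unbounded", mirroring shard names like "-80" and "80-".
SHARD_MAP: List[Tuple[str, bytes, bytes]] = [
    ("-80", b"", b"\x80"),
    ("80-", b"\x80", b""),
]


def toy_vindex(value: int) -> bytes:
    """Stand-in for a vindex: maps a column value to an 8-byte keyspace id."""
    return hashlib.md5(str(value).encode()).digest()[:8]


def shard_for_keyspace_id(keyspace_id: bytes) -> str:
    """Find the shard whose key range covers the keyspace id (step 2)."""
    for name, start, end in SHARD_MAP:
        if keyspace_id >= start and (end == b"" or keyspace_id < end):
            return name
    raise LookupError("no shard covers this keyspace id")


def route(table: str, vindex_value: int) -> Tuple[str, str]:
    """Return (keyspace, shard) for a query on `table` filtered by a vindex column."""
    keyspace = TABLE_TO_KEYSPACE[table]
    return keyspace, shard_for_keyspace_id(toy_vindex(vindex_value))
```

Calling `route("customer", 42)` would return the keyspace `commerce` plus whichever of the two shards the hashed value falls into.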

Problem statement

The ServedFrom scheme is meant to only allow for one-time vertical splitting of a keyspace into two. This does not meet other growing requirements like:

  • Move a table from an unsharded keyspace to another.
  • Move a table from an unsharded keyspace to a sharded keyspace.
  • Reshard a table using different keys.

The ServedFrom approach also does not allow us to reverse replication after a master migration, because the model is not symmetric.

The Shard Map approach, by contrast, meets existing and foreseeable needs for now.

Requirements

The new design should not only address the above problems, but it should also accommodate the following new use cases:

  • Reshard from anything to anything, with the option to change the primary vindex.
  • Continue to allow rdonly, replica and master migration during resharding.
  • Reversible after a master migration.
  • Reference table: An unsharded table can be materialized into all shards of a sharded keyspace. If so, vtgate should know to use local joins when such tables are joined with the unsharded tables of the same keyspace.
  • ‘Smart’ use of vreplicated tables. If a table is vreplicated, then we should allow vtgate to figure out whether to use the source or target based on how the query is formed. For example, use the table with the most suitable sharding key.
  • Requirement coming from the point-in-time restore feature: If there are duplicate tables in the vschema, allow for specifying which one to use if the table name is not qualified (designate a default).
  • Desirable: Choose table to read from based on column select list. If there exists a materialized view for a table, and if the query can be re-written to use the materialized view, then do so.

Proposed design

The high-level proposal is to deprecate the ServedFrom approach in favor of implementing more versatile functionality at the VSchema level.

The current VSchema design works at the per-keyspace level. But the above requirements define interactions that go across keyspaces. Although it’s possible to find a way to express these within the scope of individual keyspaces, it will be better to extend the structure of a VSchema.

We will introduce the concept of RoutingRules. These rules will be global instead of being keyspace-specific. However, they will become part of the SrvVSchema when all the vschemas are combined for serving.

Studying the above requirements, we can see two orthogonal concepts emerging:

  1. A tablet type can dictate where to send a query for a table.
  2. Materialization rules can dictate where to send a query.

This can be represented as a map from a (table, tablet_type) pair to a list of keyspace-qualified tables. This mapping will be resolved to specific table pointers after the vschemas for all keyspaces are combined.

For example, a map for vertical resharding where rdonly has migrated will look like this:

t@rdonly: [target.t]
t: [source.t]

The entry without a tablet type (t, equivalent to t@*) is matched last.

By default, every unique table t in a keyspace ks will have the rule t: [ks.t].
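The lookup order described above could be sketched like this in Python. This is only an illustration, not the proposed implementation: the function name `resolve` and the `default_keyspace` parameter are assumptions, and treating `t` and `t@*` as the same wildcard key is my reading of the examples.

```python
from typing import Dict, List

RoutingRules = Dict[str, List[str]]


def resolve(rules: RoutingRules, table: str, tablet_type: str,
            default_keyspace: str) -> List[str]:
    """Return the keyspace-qualified candidates for (table, tablet_type)."""
    # 1. A rule for the specific tablet type wins.
    specific = rules.get(f"{table}@{tablet_type}")
    if specific is not None:
        return specific
    # 2. The wildcard rule is matched last ("t" and "t@*" treated as equivalent).
    for key in (table, f"{table}@*"):
        if key in rules:
            return rules[key]
    # 3. No rule at all: a unique table implicitly maps to its own keyspace.
    return [f"{default_keyspace}.{table}"]


# Vertical resharding where only rdonly has migrated.
rules = {"t@rdonly": ["target.t"], "t": ["source.t"]}
```

With these rules, `resolve(rules, "t", "rdonly", "source")` gives `["target.t"]`, while replica and master traffic still resolves to `["source.t"]`.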

In the case of a vreplicated table from ks1 to ks2, the rule will be t@*: [ks1.t, ks2.t]. This rule will mean that a reference to t can resolve to ks1.t or ks2.t, whichever is favorable. ks1 will be preferred by default.

Since the map and list are orthogonal, it’s possible to combine them like this:

t@master: [ks1.t]
t: [ks1.t, ks2.t]
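As a rough sketch of how the combined rules might be consumed: master traffic is pinned to ks1.t, while other tablet types get two candidates and pick the more favorable one. The notion of "favorable" used here (the candidate whose sharding column matches the query's filter column) and the `SHARDING_COLUMN` data are assumptions for illustration only.

```python
from typing import List, Optional

RULES = {"t@master": ["ks1.t"], "t": ["ks1.t", "ks2.t"]}

# Hypothetical: the column each candidate table is sharded by.
SHARDING_COLUMN = {"ks1.t": "user_id", "ks2.t": "order_id"}


def candidates(table: str, tablet_type: str) -> List[str]:
    """Tablet-type-specific rule first, then the wildcard rule."""
    return RULES.get(f"{table}@{tablet_type}", RULES.get(table, []))


def choose(table: str, tablet_type: str,
           filter_column: Optional[str] = None) -> str:
    """Pick one candidate; the first entry is preferred by default."""
    options = candidates(table, tablet_type)
    if not options:
        raise LookupError(f"no routing rule for {table}")
    for option in options:
        if filter_column is not None and SHARDING_COLUMN.get(option) == filter_column:
            return option
    return options[0]
```

Here a replica read filtered on `order_id` would go to ks2.t, whereas the same query against master stays on ks1.t because that rule offers only one option.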

Reference tables

The case of reference tables is different. This will need to be stored within the keyspace as metadata for the table (like sequences).

Transitioning state

We have to work from the principle that lockserver data cannot be relied upon for timely delivery. This means that workflows should use an alternate mechanism in situations where timeliness is required.

For example, while migrating masters (or writes), we have to first force readonly on the source. This is currently achieved by pushing tablet control records or blacklisted tables into the topo. Instead, we’ll reimplement this by directly writing this metadata into the relevant vttablets where the action will be taken.

The topo changes will be used only to transmit the rest of the transitions. In the case of write transitions, there is a period of exposure where we would have marked the source as readonly and the vtgates have not received the updated vschema. This is unavoidable. However, we can have the assurance that no spurious writes will go to the source. We’ll only be serving some transient errors.
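The ordering of a write migration can be illustrated with a small simulation. Everything in it is hypothetical (the `Tablet` and `Gate` classes, the rule value being a plain tablet name rather than a table list); it only aims to show why a vtgate with a stale vschema sees transient errors instead of writing to the source.

```python
from typing import Dict, List


class Tablet:
    """Toy stand-in for a vttablet that can be forced read-only directly."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.read_only = False
        self.applied: List[str] = []

    def execute_write(self, sql: str) -> str:
        if self.read_only:
            raise RuntimeError(f"{self.name} is read-only, retry later")
        self.applied.append(sql)
        return self.name


class Gate:
    """Toy stand-in for a vtgate holding a possibly stale copy of the rules."""

    def __init__(self, tablets: Dict[str, Tablet], rules: Dict[str, str]) -> None:
        self.tablets = tablets
        self.rules = dict(rules)

    def refresh(self, topo_rules: Dict[str, str]) -> None:
        self.rules = dict(topo_rules)

    def write(self, table: str, sql: str) -> str:
        return self.tablets[self.rules[f"{table}@master"]].execute_write(sql)


def migrate_writes(source: Tablet, topo_rules: Dict[str, str],
                   table: str, target_name: str) -> None:
    # Step 1: timely, written directly to the source tablet.
    source.read_only = True
    # Step 2: eventually consistent, propagated to vtgates through the topo.
    topo_rules[f"{table}@master"] = target_name
```

Between the two steps and the gate's next `refresh`, a `write` raises the read-only error; once the gate has refreshed, the same write lands on the target.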


sougou commented Apr 8, 2019

While drilling down on the design, I came across an additional use case: users that treat vitess as a multi-schema server connect to specific keyspaces. We have to make vertical splits work in such cases also. In order to accommodate this, we'll extend the routing rule key to be keyspace.table@tablet_type.

So, if a table t is migrating from ks1 to ks2, we can start with ks2.t: [ks1.t], and remove the entry once the migration is complete.
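A possible shape for the extended key lookup, sketched in Python. The function name and the fallback behavior when no rule matches (a qualified name resolves to itself) are assumptions, not something the design spells out.

```python
from typing import Dict, List, Optional


def resolve(rules: Dict[str, List[str]], table: str, tablet_type: str,
            keyspace: Optional[str] = None) -> List[str]:
    """Resolve keyspace.table@tablet_type, falling back to keyspace.table."""
    name = f"{keyspace}.{table}" if keyspace else table
    for key in (f"{name}@{tablet_type}", name):
        if key in rules:
            return rules[key]
    if keyspace:
        # Assumed fallback: with no rule, a qualified table is used as named.
        return [name]
    raise LookupError(f"no routing rule for unqualified table {table}")


# While t is migrating from ks1 to ks2, clients connected to ks2 are sent to ks1.
during_migration = {"ks2.t": ["ks1.t"]}
# Once the migration is complete, the entry is removed.
after_migration: Dict[str, List[str]] = {}
```

During the migration, `resolve(during_migration, "t", "master", keyspace="ks2")` yields `["ks1.t"]`; after the entry is removed, the same call against `after_migration` yields `["ks2.t"]`.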

Additionally, allowing a list of tables as the target will change how the vtgate optimizer works. Currently, the keyspace for a route is decided when the route is created. Now that a table can be in different keyspaces, we'll create routes with multiple routing options. As the plan evolves, we'll eliminate the ones that are not suitable. At the end, if more than one is left, we'll choose the first one.
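The pruning idea could look roughly like the following sketch. The `RouteOption` fields and the `can_accommodate` predicate are hypothetical stand-ins; the real planner would carry much more state per option.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RouteOption:
    table: str          # keyspace-qualified target table
    vindex_column: str  # hypothetical: the column this target is sharded by


def push_down(options: List[RouteOption],
              can_accommodate: Callable[[RouteOption], bool]) -> List[RouteOption]:
    """Keep only the options that can accommodate a pushed-down construct."""
    remaining = [o for o in options if can_accommodate(o)]
    if not remaining:
        # Only when no option is left does this become an error.
        raise ValueError("no route option can accommodate this construct")
    return remaining


def choose_final(options: List[RouteOption]) -> RouteOption:
    """At the end of planning, the first surviving option wins."""
    return options[0]


options = [RouteOption("ks1.t", "user_id"), RouteOption("ks2.t", "order_id")]
```

For example, a construct that needs the table to be sharded by `order_id` would leave only ks2.t in the list, while a construct that every option can handle leaves both, and ks1.t is chosen at the end.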

sougou added a commit to planetscale/vitess that referenced this issue Apr 23, 2019
This is the first part of the changes to implement vitessio#4790.
This part implements all the management functionality for
routing rules.

Signed-off-by: Sugu Sougoumarane <[email protected]>
sougou added a commit to planetscale/vitess that referenced this issue Apr 23, 2019
In this change, query routing takes into account the possibility
that there could be multiple target options for a given table. The
design for this is explained in vitessio#4790.

At a high level:
* VSchema.FindTableOrVindex function can return a list of
  tables instead of a single one.
* The route planbuilder creates multiple routeOptions, one
  for each table returned.
* All actions that affected the plan of a route are changed
  to update all routeOptions.
* If a particular routeOption cannot accommodate a pushed
  down construct, it's removed from the list. Previously,
  this was an error case. But if no options are left, then
  we return an error.
* If two routeOptions qualify for a merge of routes, then
  all other combinations that don't qualify are discarded.
  This is the case for joins, subqueries and unions.

More details:
vindexTable was renamed to the more appropriate vschemaTable.

In order to achieve this, a new routeOption data type was
introduced, and route was changed to contain a list of
routeOptions.

In symtab, tables used to point at the vschema table that
was used to build them. Since a table can now represent
multiple target tables, this field has been moved into
routeOption.

In symtab, columns used to contain a vindex member. Since
this can change depending on the target table, the routeOption
now contains a map of column to vindexes instead.

The routeOption also contains the vschemaTable. DMLs use
this information. Since DMLs have to be more deterministic
about the table they write to, they always choose the
first option.

At the beginning of the Wireup phase, we evaluate all existing
options and decide on the best available.

To be done:
When a table has multiple targets, the targets can have different
names than the original table. If so, the queries have to be
rewritten to address the new target tables. In order to do this,
each routeOption will contain a list of substitutions that will
be made during the Wireup phase.

Tests have to be written for the new flows.

Signed-off-by: Sugu Sougoumarane <[email protected]>
sougou changed the title from "RFP: VSchema based routing and resharding" to "RFC: VSchema based routing and resharding" May 4, 2019
@ajm188 ajm188 removed the P3 label Mar 9, 2023