Point in time recovery (PITR) #4886

Closed
deepthi opened this issue May 28, 2019 · 4 comments · Fixed by #6408

deepthi commented May 28, 2019

Feature Description

The ability to recover data in Vitess to a (past) point in time.
It should be possible to have multiple recovery requests active at the same time.
It should also be possible to recover across sharding actions, i.e., you should be able to recover to a time when there were two shards even though at present there are four.

Use Case(s)

  • accidental deletion of data
  • corruption of data due to application bugs

Preconditions

  • there should be a backup that was taken before the desired point in time (PIT)
  • there should be continuous binlogs from the backup time to the desired PIT

Proposed design

Create a new keyspace

We will add the ability to create a new type of keyspace called a Snapshot keyspace. The normal keyspaces will have the (unimaginative) type of Normal. When a snapshot keyspace is created, a time is assigned to it, which is the time you want to recover to. The time will be in UTC. Snapshot keyspaces are always created off a base keyspace.

Start VTTablets

VTTablets will be started with their -init_keyspace command-line parameter set to the newly created snapshot keyspace. They will look for the most recent usable backup of the base keyspace (for the shard given by -init_shard) that is earlier than the snapshot time and restore from it.
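
The selection rule above can be sketched in a few lines of shell. This is only an illustration, not the code vttablet runs: it assumes backup names start with a UTC timestamp of the form YYYY-MM-DD.HHMMSS followed by a tablet alias, the function name latest_backup_before is hypothetical, and it does not check whether a backup is actually usable (complete and restorable).

```shell
# Hypothetical helper: read backup names on stdin, one per line, and print
# the most recent one whose timestamp is strictly earlier than the snapshot
# time. Assumes names look like "YYYY-MM-DD.HHMMSS.<tablet-alias>" and that
# the snapshot time is given in the same "YYYY-MM-DD.HHMMSS" form (UTC).
latest_backup_before() {
    snapshot="$1"
    # Sorting byte-wise puts the timestamps in chronological order.
    LC_ALL=C sort | awk -v s="$snapshot" '
        {
            ts = substr($0, 1, 17)              # the "YYYY-MM-DD.HHMMSS" prefix
            if ((ts "") < (s "")) best = $0     # force string comparison
        }
        END { if (best != "") print best }
    '
}

# Example with made-up backup names:
printf '%s\n' \
    2019-05-30.000000.zone1-0000000100 \
    2019-05-28.010000.zone1-0000000100 \
    2019-05-29.120000.zone1-0000000101 |
    latest_backup_before 2019-05-29.204530
```

If no backup predates the snapshot time, the function prints nothing, which corresponds to the precondition that a backup must exist before the desired PIT.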

It is the responsibility of the operator to provide the correct -init_shard value while starting up vttablets.

The timestamp of the backup that was used for restoring will be saved in the local_metadata table along with the position (GTID) that was restored.

VTGate routing

VTGate will automatically exclude tablets belonging to snapshot keyspaces from query routing unless they are addressed explicitly, either with a USE statement (use ks) or with queries of the form select ... from ks.table.
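
As a rough illustration of the two ways to address a snapshot keyspace explicitly, the helper below prints the corresponding statements. The function name snapshot_queries and the table name customer are made up for the example; the keyspace name is the one used later in this thread.

```shell
# Hypothetical helper: print statements that address a snapshot keyspace
# explicitly, first via "use <keyspace>" and then via a keyspace-qualified
# table name.
snapshot_queries() {
    ks="$1"
    table="$2"
    printf 'use %s;\n' "$ks"
    printf 'select * from %s limit 10;\n' "$table"
    printf 'select * from %s.%s limit 10;\n' "$ks" "$table"
}

# Example (table name is an assumption):
snapshot_queries recovery_keyspace customer
```

The output can be piped into any MySQL client connected to VTGate. Unqualified queries issued without the use statement would not be routed to the snapshot keyspace.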

VSchema

The base keyspace's vschema will be copied over to the new snapshot keyspace as a default. If desired this can be overwritten by the operator. Care needs to be taken to set require_explicit_routing to true when modifying a snapshot keyspace's vschema.

Applying Binlogs

#6267

sverch commented Apr 30, 2020

@deepthi since #5160 is merged, does that mean that Vitess 4.0 (checking release notes) has the ability to do these keyspace snapshots? Is there any documentation on how to do that?

deepthi commented Apr 30, 2020

Yes, this is available in Vitess 4.0.
We don't actually have any documentation, so I'm just going to write some here.

To use this feature, you first create a SNAPSHOT keyspace.

vtctlclient ... CreateKeyspace -keyspace_type=SNAPSHOT -base_keyspace=original_keyspace -snapshot_time=2019-05-29T20:45:30+00:00 recovery_keyspace
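
Because the design expects the snapshot time in UTC, it may help to check the timestamp before running the command. The sketch below is an assumption-laden wrapper, not part of Vitess: both function names are hypothetical, it only checks the shape of the timestamp (not whether the date is valid), and it conservatively accepts only an explicit +00:00 or Z offset. It prints the CreateKeyspace arguments rather than running vtctlclient, since the connection flags are elided above.

```shell
# Hypothetical helper: succeed only for timestamps shaped like
# YYYY-MM-DDTHH:MM:SS with an explicit UTC offset (+00:00 or Z).
is_utc_timestamp() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]+00:00) return 0 ;;
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]Z) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the CreateKeyspace arguments shown above for a
# given base keyspace, snapshot time, and snapshot keyspace name.
snapshot_keyspace_args() {
    base="$1"
    snapshot_time="$2"
    name="$3"
    if ! is_utc_timestamp "$snapshot_time"; then
        echo "snapshot time must carry a UTC offset: $snapshot_time" >&2
        return 1
    fi
    printf '%s\n' "CreateKeyspace -keyspace_type=SNAPSHOT -base_keyspace=$base -snapshot_time=$snapshot_time $name"
}

# Example: prints the arguments to append after the vtctlclient connection flags.
snapshot_keyspace_args original_keyspace 2019-05-29T20:45:30+00:00 recovery_keyspace
```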

Then you bring up a vttablet with the following flags:

-init_keyspace recovery_keyspace -init_shard x-y -init_db_name_override vt_original_keyspace -disable_active_reparents -enable_replication_reporter=false

The -init_db_name_override flag is optional; it only has to be provided if it is also being used with the original_keyspace.
You need to have at least one backup taken before the snapshot_time for this to work.
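
When several shards need recovery tablets, the flag list can be generated per shard. The following is a minimal sketch under stated assumptions: the function name recovery_tablet_flags is hypothetical, the shard name -80 is only an example, and all other flags a vttablet needs to start are omitted.

```shell
# Hypothetical helper: print the recovery-specific vttablet flags listed
# above, one argument per line. The third argument (database name override)
# is optional and is emitted only when given.
recovery_tablet_flags() {
    keyspace="$1"
    shard="$2"
    db_override="${3:-}"
    printf '%s\n' -init_keyspace "$keyspace" -init_shard "$shard"
    if [ -n "$db_override" ]; then
        printf '%s\n' -init_db_name_override "$db_override"
    fi
    printf '%s\n' -disable_active_reparents -enable_replication_reporter=false
}

# Example: flags for one shard of the snapshot keyspace, with the override.
recovery_tablet_flags recovery_keyspace -80 vt_original_keyspace
```

The operator remains responsible for passing the correct shard for each tablet; the helper only assembles the arguments.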

That's it!
You can see how our endtoend tests do this at https://github.com/vitessio/vitess/blob/master/go/test/endtoend/recovery/recovery_util.go

I also have some scripts that I used for local testing that I can share.

deepthi commented Jul 18, 2020

Fixed.

deepthi closed this as completed Jul 18, 2020

deepthi commented Oct 7, 2020
