Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: test AOST restore in backup-restore/* roachtests #113534

Merged
merged 1 commit into from
Nov 2, 2023

Conversation

msbutler
Copy link
Collaborator

This patch allows the backup-restore driver to run and validate AOST restores from revision history backups. If the driver created a revision history backup, there's a 50% chance it will restore from an AOST between the full backup end time and the last incremental start time. A future patch will allow for AOST restores from non-revision history backups.

Epic: none

Release note: none

@msbutler msbutler added T-disaster-recovery backport-23.2.x Flags PRs that need to be backported to 23.2. labels Oct 31, 2023
@msbutler msbutler requested a review from adityamaru October 31, 2023 20:57
@msbutler msbutler self-assigned this Oct 31, 2023
@msbutler msbutler requested a review from a team as a code owner October 31, 2023 20:57
@msbutler msbutler requested review from DarrylWong and renatolabs and removed request for a team and DarrylWong October 31, 2023 20:57
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@msbutler
Copy link
Collaborator Author

I took this PR for a few test drives and it seems to work.

Copy link
Contributor

@renatolabs renatolabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Made an optional comment about failure reproducibility but otherwise looks good!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @msbutler)


pkg/cmd/roachtest/tests/mixed_version_backup.go line 1043 at r1 (raw file):

	// Choose a random aost between min and max
	restoreAOST := hlc.Timestamp{
		WallTime: rng.Int63n(max.WallTime-min.WallTime) + min.WallTime,

For consideration: instead of a completely arbitrary timestamp in this range (which could be large), consider dividing this duration in N parts (say, N=5) and picking the time at the beginning of one of these slices. I believe this should be pretty sufficient in terms of correctness coverage while preserving some sense of similarity of runs when the same rng is used.

With the current implementation, each run will pick a potentially wildly different restoreAOST (since the Int63n argument will be different) even if run with the same seed. Say a bug only happens when a timestamp close to the full backup time is picked: re-running the test with the same seed will not help focus on that scenario.


pkg/cmd/roachtest/tests/mixed_version_backup.go line 2254 at r1 (raw file):

	restoreStmt := fmt.Sprintf(
		"%s FROM LATEST IN '%s'%s %s",
		restoreCmd, bc.uri(), aostClause, optionsStr,

Nit: you don't need the aostClause variable, you can just call aostFor(bc.restoreAOST) here.

@renatolabs
Copy link
Contributor

While you have your hands in this test, maybe you can sneak in this request I had made previously 😄 Or maybe I'm the only one who misses that, hehe.

#112373 (comment)

@msbutler msbutler force-pushed the butler-test-restore-aost branch from 1970d70 to 1a7a717 Compare November 1, 2023 19:54
Copy link
Collaborator Author

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru and @renatolabs)


pkg/cmd/roachtest/tests/mixed_version_backup.go line 1043 at r1 (raw file):

Previously, renatolabs (Renato Costa) wrote…

For consideration: instead of a completely arbitrary timestamp in this range (which could be large), consider dividing this duration in N parts (say, N=5) and picking the time at the beginning of one of these slices. I believe this should be pretty sufficient in terms of correctness coverage while preserving some sense of similarity of runs when the same rng is used.

With the current implementation, each run will pick a potentially wildly different restoreAOST (since the Int63n argument will be different) even if run with the same seed. Say a bug only happens when a timestamp close to the full backup time is picked: re-running the test with the same seed will not help focus on that scenario.

ah very good call. fixed.


pkg/cmd/roachtest/tests/mixed_version_backup.go line 2254 at r1 (raw file):

Previously, renatolabs (Renato Costa) wrote…

Nit: you don't need the aostClause variable, you can just call aostFor(bc.restoreAOST) here.

done

Copy link
Contributor

@renatolabs renatolabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru)

This patch allows the backup-restore driver to run and validate AOST restores
from revision history backups. If the driver created a revision history backup,
there's a 50% chance it will restore from an AOST between the full backup end
time and the last incremental start time. A future patch will allow for AOST
restores from non-revision history backups.

Epic: none

Release note: none
@msbutler msbutler force-pushed the butler-test-restore-aost branch from 1a7a717 to acae46b Compare November 1, 2023 20:58
@msbutler
Copy link
Collaborator Author

msbutler commented Nov 2, 2023

TFTR!

bors r=renatolabs

@craig
Copy link
Contributor

craig bot commented Nov 2, 2023

Build succeeded:

@craig craig bot merged commit 4dad7f2 into cockroachdb:master Nov 2, 2023
3 checks passed
@renatolabs
Copy link
Contributor

blathers backport 23.1

Copy link

blathers-crl bot commented Nov 2, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from acae46b to blathers/backport-release-23.1-113534: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@renatolabs
Copy link
Contributor

Ah, it's missing #112356. @msbutler thoughts on backporting both to 23.1?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2. T-disaster-recovery
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants