Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: create readAsOfIterator for RESTORE #77281

Merged
merged 1 commit into from
May 25, 2022

Conversation

msbutler
Copy link
Collaborator

@msbutler msbutler commented Mar 2, 2022

Previously while ingesting SSTs in the restore processor,
processRestoreSpanEntry would manually skip delete tombstones and keys with
timestamps later than the AS OF SYSTEM TIME timestamp provided in the RESTORE
command. This PR wraps the restore processor's iterator with a new
readAsOfIterator which abstracts this logic away.

The readAsOfIterator wraps a SimpleMVCCIterator
ond only surfaces the latest valid key of a given MVCC key that is also below
the iterator's AOST timestamp, if set. Further, the iterator does not surface
delete tombstones, nor any MVCC keys shadowed by delete tombstones below the
AOST timestamp, if set.

Fixes #77276

Release note: None

@msbutler msbutler requested review from dt and erikgrinaker March 2, 2022 14:57
@msbutler msbutler requested a review from a team as a code owner March 2, 2022 14:57
@msbutler msbutler self-assigned this Mar 2, 2022
@msbutler msbutler requested a review from a team March 2, 2022 14:57
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@msbutler msbutler force-pushed the butler_multi_iter_asof_wrapper branch from 4b2ba3c to ea04166 Compare March 2, 2022 16:56
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach looks good, but it could use more tests for corner cases.

pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
@msbutler msbutler force-pushed the butler_multi_iter_asof_wrapper branch from ea04166 to 672746e Compare May 2, 2022 22:06
pkg/ccl/backupccl/restore_data_processor.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator_test.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Show resolved Hide resolved
@msbutler msbutler force-pushed the butler_multi_iter_asof_wrapper branch from 672746e to 8eac3fc Compare May 24, 2022 15:08
// (i.e. asOf, if set) and iterate to the next valid key at or below the caller's
// key that also obeys the iterator's constraints.
maxTime := f.asOf
if f.asOf.IsEmpty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this check isn't necessary, since an empty timestamp sorts before all other timestamps in MVCC keys -- passing f.asOf will do the right thing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, this is kinda subtle. When you pass a key with an empty timestamp to the underlying iterator's SeekGE, you're right, it will seek to the latest version of that key. So yes, we can remove this check, but only because we call the internal seek before doing any key comparisons.

We can't, for example, remove this isEmpty() check in advance because an empty hlcTimestamp.walltime defaults to 0.

} else if !f.asOf.IsEmpty() && f.asOf.Less(f.iter.UnsafeKey().Timestamp) {

Copy link
Contributor

@erikgrinaker erikgrinaker May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, MVCCKey and Timestamp order timestamps differently. The former orders them in descending order but with 0 first (which means that 3 is "less than" or "ordered before" 2, but 0 is before 3, i.e. 0,3,2,1), while Timestamp uses regular ascending order (0,1,2,3...). I suppose you get used to it, but definitely subtle and a bit of a footgun.

pkg/storage/read_as_of_iterator.go Show resolved Hide resolved
@erikgrinaker
Copy link
Contributor

PS: remember to squash the commits before merging.

Previously while ingesting SSTs in the restore processor,
`processRestoreSpanEntry` would manually skip delete tombstones and keys with
timestamps later than the AS OF SYSTEM TIME timestamp provided in the RESTORE
command. This PR wraps the restore processor's iterator with a new
readAsOfIterator which abstracts this logic away.

The readAsOfIterator wraps a SimpleMVCCIterator
ond only surfaces the latest valid key of a given MVCC key that is also below
the iterator's AOST timestamp, if set. Further, the iterator does not surface
delete tombstones, nor any MVCC keys shadowed by delete tombstones below the
AOST timestamp, if set.

Fixes cockroachdb#77276

Release note: None
@msbutler msbutler force-pushed the butler_multi_iter_asof_wrapper branch from 8eac3fc to cc71281 Compare May 25, 2022 16:14
@msbutler msbutler changed the title backupccl: create readAsOfMVCCIterator backupccl: create readAsOIterator May 25, 2022
@msbutler msbutler changed the title backupccl: create readAsOIterator backupccl: create readAsOfIterator for RESTORE May 25, 2022
@msbutler
Copy link
Collaborator Author

TFTR!!!

bors r = erikgrinaker

@msbutler
Copy link
Collaborator Author

bors r=erikgrinaker

@craig
Copy link
Contributor

craig bot commented May 25, 2022

Build succeeded:

@craig craig bot merged commit fa277e9 into cockroachdb:master May 25, 2022
@msbutler msbutler deleted the butler_multi_iter_asof_wrapper branch May 25, 2022 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

backupccl: create readAsOfMVCCIterator
4 participants