-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-22.2: roachtest: ignore duplicated events in fingerprint validator #89535
Merged
srosenberg
merged 2 commits into
release-22.2
from
blathers/backport-release-22.2-89332
Oct 10, 2022
Merged
release-22.2: roachtest: ignore duplicated events in fingerprint validator #89535
srosenberg
merged 2 commits into
release-22.2
from
blathers/backport-release-22.2-89332
Oct 10, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
To handle mixed-version CDC tests, the fingerprint validator had previously been updated to handle retries internally; this complicates the validator as the retry logic is only ever used in the mixed-version roachtests. This commit removes the retry logic from the validator and instead allows the caller to pass a `DBFunc` to be called whenever a database connection is needed after initialization. By passing a custom DBFunc, tests that need it (such as the mixed-versions roachtest) can pass a function that accounts for temporary unavailability of nodes. Specifically, we pass a function that blocks while nodes are being upgraded, to simplify reasoning of this test's behavior. In order to support that, we also change the signature of `NewFingerprintValidator` to return the actual concrete validator (which is now made public) instead of the `Validator` interface, following the generally-good approach of "accept interfaces, return structs" in Go. Release note: None
This commit updates the fingerprint validator (and its use in the `cdc/mixed-versions` test) to ignore duplicated events received by the validator. A previously implicit assumption of the validator is that any events that it receives are either not duplicated, or -- if they are duplicated -- they are within the previous resolved timestamp and the current resolved timestamp. However, that assumption is not justified by the changefeed guarantees and depends on how frequently `resolved` events are emitted and how often the changefeed checkpoints. In the specific case of the `cdc/mixed-versions` roachtest, it was possible for the changefeed to start from an old checkpoint (older than the last received `resolved` timestamp), causing it to re-emit old events that happened way before the previously observed resolved event. As a consequence, when the validator applies the update associated with that event, there is a mismatch with state of the original table as of the update's timestamp, as the fingerprint validator relies on the fact that updates are applied in order. To fix the issue, we now skip events that happen before the timestamp of the previous `resolved` event received. In addition, the caller can also tell the validator to verify that such out-of-order messages received by the validator have indeed been previously seen; if not, that would represent a violation of the changefeed's guarantees. Fixes: #87251. Release note: None
blathers-crl
bot
requested review from
shermanCRL
and removed request for
a team
October 6, 2022 21:35
blathers-crl
bot
force-pushed
the
blathers/backport-release-22.2-89332
branch
from
October 6, 2022 21:35
27027c8
to
8c4336d
Compare
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
|
blathers-crl
bot
requested review from
herkolategan,
renatolabs,
samiskin,
srosenberg and
a team
October 6, 2022 21:35
blathers-crl
bot
added
blathers-backport
This is a backport that Blathers created automatically.
O-robot
Originated from a bot.
labels
Oct 6, 2022
srosenberg
approved these changes
Oct 6, 2022
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
blathers-backport
This is a backport that Blathers created automatically.
O-robot
Originated from a bot.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport 2/2 commits from #89332 on behalf of @renatolabs.
/cc @cockroachdb/release
This commit updates the fingerprint validator (and its use in the
cdc/mixed-versions
test) to ignore duplicated events received by thevalidator.
A previously implicit assumption of the validator is that any events
that it receives are either not duplicated, or -- if they are
duplicated -- they are within the previous resolved timestamp and the
current resolved timestamp. However, that assumption is not justified
by the changefeed guarantees and depends on how frequently
resolved
events are emitted and how often the changefeed checkpoints.
In the specific case of the
cdc/mixed-versions
roachtest, it waspossible for the changefeed to start from an old checkpoint (older
than the last received
resolved
timestamp), causing it to re-emitold events that happened way before the previously observed resolved
event. As a consequence, when the validator applies the update
associated with that event, there is a mismatch with state of the
original table as of the update's timestamp, as the fingerprint
validator relies on the fact that updates are applied in order.
To fix the issue, we now skip events that happen before the timestamp
of the previous
resolved
event received. In addition, the caller canalso tell the validator to verify that such out-of-order messages
received by the validator have indeed been previously seen; if not,
that would represent a violation of the changefeed's guarantees.
Fixes: #87251.
Release note: None
Release justification: fix for roachtest