sql: fix bug whereby backfiller would drop spans on txn restart #54755
This bug was caught by testing with cockroachdb#54695. Before this fix, the test failed almost immediately; with it, the test no longer fails under stress. I'm open to suggestions on how to test this more generally.

Release note (bug fix): Fixed a rare bug which could cause index backfills to fail in the face of transaction restarts.
LGTM but I think someone on Bulk IO should take a look too.
What are the errors you can get from this? Does this ever fail at index validation?
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
(@miretskiy I just added cockroachdb/bulk-prs but feel free to reassign to someone else)
The main error I was seeing was:
Thanks. I wonder if the panic specifically is due to an inconsistency in in-memory state that goes away if the job restarts, and if there are subtler possible corruption bugs that don't manifest as an immediate crash in the backfiller. I think if that happens, we'd catch it in the validation step, but it would be ideal to confirm that.
I'm pretty sure the operation of subtracting spans ends up being idempotent, so we shouldn't have persistent correctness problems due to this one. The issue here is just that we might create an empty set of spans and then hit the panic upon retry. Given that I don't see any Sentry reports on this one, we might just be in luck.
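To make the idempotence argument above concrete, here is a minimal sketch in Go. It is not the backfiller's code: `span`, `subtract`, `runWithRetry`, and `checkpoint` are hypothetical names, and integer bounds stand in for real keys. It assumes the general failure shape is a closure that a transaction runner may invoke more than once, and shows one retry-safe way to write it: recompute the remaining spans from the unmodified input on every attempt.

```go
package main

import "errors"

// span is a hypothetical half-open range [start, end) with integer bounds.
type span struct{ start, end int }

// errRetry simulates a retryable transaction restart.
var errRetry = errors.New("simulated transaction restart")

// subtract returns the parts of todo that are not covered by done.
// Applying it twice with the same done span yields the same result.
func subtract(todo []span, done span) []span {
	var out []span
	for _, s := range todo {
		if done.end <= s.start || s.end <= done.start {
			out = append(out, s) // no overlap, keep as is
			continue
		}
		if s.start < done.start {
			out = append(out, span{s.start, done.start}) // left remainder
		}
		if done.end < s.end {
			out = append(out, span{done.end, s.end}) // right remainder
		}
	}
	return out
}

// runWithRetry is a stand-in for a transaction runner that may call fn
// several times before it succeeds.
func runWithRetry(attempts int, fn func(attempt int) error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(i); err == nil {
			return nil
		}
	}
	return err
}

// checkpoint records done as completed. The closure recomputes remaining
// from the unmodified todo on every attempt, so a restart cannot shrink
// the set twice or leave a half-updated result behind.
func checkpoint(todo []span, done span, failures int) ([]span, error) {
	var remaining []span
	err := runWithRetry(failures+1, func(attempt int) error {
		remaining = subtract(todo, done)
		if attempt < failures {
			return errRetry // simulated restart
		}
		return nil
	})
	if err != nil {
		return nil, err
	}
	return remaining, nil
}
```

Calling `checkpoint([]span{{0, 10}}, span{0, 4}, 2)` simulates two restarts before success and still leaves the single span `[4, 10)` to backfill.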
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @miretskiy)
bors r+
Build succeeded:
Fixes a bug introduced in cockroachdb#54755 whereby we'd always subtract from the original set of spans rather than from the updated set of spans, meaning that we could backfill the same span multiple times. Fixes cockroachdb#54775. Release note: None
54900: sql: fix a bug tracking spans in a backfill r=lucy-zhang a=ajwerner

Fixes a bug introduced in #54755 whereby we'd always subtract from the original set of spans rather than from the updated set of spans, meaning that we could backfill the same span multiple times.

Fixes #54775.

Release note: None

Co-authored-by: Andrew Werner <[email protected]>
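The difference between subtracting from the original set and subtracting from the updated set can be illustrated with a small Go sketch. This is not the code from either PR: `span`, `subtract`, `remainingFromOriginal`, and `remainingFromUpdated` are hypothetical names, and integer bounds stand in for real keys.

```go
package main

// span is a hypothetical half-open range [start, end) with integer bounds.
type span struct{ start, end int }

// subtract returns the parts of todo that are not covered by done.
func subtract(todo []span, done span) []span {
	var out []span
	for _, s := range todo {
		if done.end <= s.start || s.end <= done.start {
			out = append(out, s) // no overlap, keep as is
			continue
		}
		if s.start < done.start {
			out = append(out, span{s.start, done.start}) // left remainder
		}
		if done.end < s.end {
			out = append(out, span{done.end, s.end}) // right remainder
		}
	}
	return out
}

// remainingFromOriginal shows the buggy shape: every completed span is
// subtracted from the original set, so earlier progress is forgotten.
func remainingFromOriginal(original, completed []span) []span {
	todo := original
	for _, c := range completed {
		todo = subtract(original, c)
	}
	return todo
}

// remainingFromUpdated shows the corrected shape: every completed span is
// subtracted from the running result of the previous subtraction.
func remainingFromUpdated(original, completed []span) []span {
	todo := original
	for _, c := range completed {
		todo = subtract(todo, c)
	}
	return todo
}
```

With an original set of `[0, 10)` and completed spans `[0, 3)` and `[3, 6)`, `remainingFromOriginal` returns `[0, 3)` and `[6, 10)`, so `[0, 3)` would be backfilled a second time, while `remainingFromUpdated` returns only `[6, 10)`.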