ccl: TestTenantLogic_create_index timed out #88202
Comments
There's something fishy about this one. I say that because we log the side effect with an event logger before we wait for the new setting value. We don't see a log event for that reset in the test logs.
I rescind my analysis. We attach the logging to a different transaction than the one we use to write the cluster setting (for better or for worse, and most likely for worse).
The timeout here happens because the pausepoint was still set when the post-test cleanup went to drop the database.
Hi @ajwerner, please add branch-* labels to identify which branch(es) this release-blocker affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
There's something very fishy going on here. I've added more logging, and it appears that the rangefeed is just missing the deletion event outright. It looks like there's a catch-up scan that somehow misses it. I'm honestly not sure. The repro is pretty slow. I'll spend more time digging tomorrow, but it's worrying.
I think this might be related to […]. Right now I'm headed to the office and have it running with that disabled, but I'm realizing as I type this that I could just as well disable 1PC and see whether it fails readily.
I think this checks out. This is the only externally visible behavior that will differ with […]. Is that the only issue here? If so, I can add a testing knob which replaces rangefeed events for point-sized range tombstones with regular point tombstone events. Alternatively, we can skip the relevant tests when that parameter is enabled.
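For illustration, the testing-knob idea above could look roughly like the sketch below. It is not CockroachDB code: `Event`, `isPointSized`, `translate`, and `watcher` are hypothetical names, and it assumes a point-sized range tombstone covers exactly the span from a key to that key with a zero byte appended. The toy watcher ignores range events, so it only sees the reset once the event has been rewritten as a point tombstone.

```go
package main

import (
	"bytes"
	"fmt"
)

// Event is a hypothetical, simplified rangefeed event.
// It is not the real CockroachDB type.
type Event struct {
	Key    []byte // start key
	EndKey []byte // exclusive end key; nil for point events
	Value  []byte // nil means a deletion (tombstone)
}

// isPointSized reports whether ev is a range tombstone that covers
// exactly one key, assumed here to be the span [Key, Key+"\x00").
func isPointSized(ev Event) bool {
	if ev.EndKey == nil || ev.Value != nil {
		return false
	}
	next := append(append([]byte{}, ev.Key...), 0)
	return bytes.Equal(ev.EndKey, next)
}

// translate rewrites a point-sized range tombstone as a regular point
// tombstone and passes every other event through unchanged.
func translate(ev Event) Event {
	if isPointSized(ev) {
		return Event{Key: ev.Key}
	}
	return ev
}

// watcher is a toy setting watcher that only understands point events.
type watcher struct {
	settings map[string]string
}

// handle applies a point write or point tombstone and silently drops
// range events, mimicking the missed reset discussed in this thread.
func (w *watcher) handle(ev Event) {
	if ev.EndKey != nil {
		return
	}
	if ev.Value == nil {
		delete(w.settings, string(ev.Key))
		return
	}
	w.settings[string(ev.Key)] = string(ev.Value)
}

func main() {
	w := &watcher{settings: map[string]string{"some.setting": "true"}}
	reset := Event{Key: []byte("some.setting"), EndKey: []byte("some.setting\x00")}

	w.handle(reset) // range tombstone: the watcher never sees the reset
	fmt.Println("after raw event:", len(w.settings))

	w.handle(translate(reset)) // rewritten as a point tombstone
	fmt.Println("after translated event:", len(w.settings))
}
```

A wider range tombstone is deliberately left untouched by `translate`, since only the point-sized case has an equivalent point tombstone.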
It will mean that any attempt to reset a cluster setting might fail in the rare case that a catch-up scan occurs. This actually explains some other failures we saw related to enabling settings. Another option is to somehow make the settingwatcher observe the range deletion tombstones. I suspect that the spanconfig watcher will also be sad about missing entries.
Ok, I'll look into it, but may not get around to it until tomorrow.
Honestly, I'm inclined to do something hacky here, like avoiding this tombstone on system tables in the resolve intent code.
That's fine too.
On the staging build here:
Seems related to #87201, cc @ajwerner
Jira issue: CRDB-19728