-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: de-flake TestDistSQLFlowsVirtualTables #75638
sql: de-flake TestDistSQLFlowsVirtualTables #75638
Conversation
Fixes cockroachdb#75208 (speculatively). Wasn't able to repro the original failure, so here's a blind stab at fixing it. See [1] for the theory; I think it's possible for the query this test fires off to error out + get dequeued concurrently with when the assertions are made about it being enqueued. We should relax our assertions accordingly. [1]: github.com/cockroachdb/issues/75208#issuecomment-1023800242 Release note: None
Aside: I'm not really sure how #73876 caused this to flake more often. After staring at this test for hours I'm not really sure it's uncovering bugs in the span configs infra as much as shaking out latent flakiness in the test itself (which I still can't repro, 4hrs later). I'd be happy to hand off further investigation here to folks more familiar with this area if this PR ends up being insufficient. |
Actually I'm pretty sure it's that, this test has a pretty aggressive timeout for the query and the failing runs certainly take longer: cockroach/pkg/sql/crdb_internal_test.go Lines 594 to 595 in eb2a5cf
We could alternatively make the timeout less aggressive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I think this 1 second timeout is the cause of flakiness - in all failures in #75208 we see an output like
FAIL: TestDistSQLFlowsVirtualTables/MaxRunningFlows=0 (1.53s)
and the test run time is always greater than 1 second.
I'm also not sure how span config work caused this to flake more often, and I agree that it looks like the test problem, so apologies for pushing this flake to you. Thanks for figuring it out though!
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @arulajmani)
Thanks! bors r+ |
75638: sql: de-flake TestDistSQLFlowsVirtualTables r=irfansharif a=irfansharif Fixes #75208 (speculatively). Wasn't able to repro the original failure, so here's a blind stab at fixing it. See [this](github.com//issues/75208#issuecomment-1023800242) for the theory; I think it's possible for the query this test fires off to error out + get dequeued concurrently with when the assertions are made about it being enqueued. We should relax our assertions accordingly. Release note: None Co-authored-by: irfan sharif <[email protected]>
Build failed: |
bors r+ |
Build succeeded: |
Fixes #75208 (speculatively). Wasn't able to repro the original failure,
so here's a blind stab at fixing it. See this for the theory; I think
it's possible for the query this test fires off to error out + get
dequeued concurrently with when the assertions are made about it being
enqueued. We should relax our assertions accordingly.
Release note: None