Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

leaktest: exclude long running logging goroutines #93989

Merged
merged 1 commit into from
Dec 21, 2022

Conversation

nicktrav
Copy link
Collaborator

@nicktrav nicktrav commented Dec 20, 2022

The leaktest package detects potential goroutine leaks by snapshotting the set of goroutines running when leaktest.AfterTest(t) is called, returning a closure, and comparing the set of goroutines when the closure is called (typically defer'd).

A race condition was uncovered in #93849 whereby logging-related goroutines that are scheduled by an init function in pkg/util/logging can sometimes be spawned after the AfterTest function is run. When the test completes and the closure is run, the test fails due to a difference in the before / after goroutine snapshots.

This mode of failure is deemed to be a false-positive. The intention of the logging goroutines are that they live for the duration of the process. However, exactly when the goroutines scheduled in the init functions actually start run, and hence show up in the goroutine snapshots, is non-deterministic.

Exclude the logging goroutines from the leaktest checks to reduce the flakiness of tests.

Closes #93849.

Release note: None.

Epic: CRDB-20293

The `leaktest` package detects potential goroutine leaks by snapshotting
the set of goroutines running when `leaktest.AfterTest(t)` is called,
returning a closure, and comparing the set of goroutines when the
closure is called (typically `defer`'d).

A race condition was uncovered in cockroachdb#93849 whereby logging-related
goroutines that are scheduled by an `init` function in
`pkg/util/logging` can sometimes be spawned _after_ the `AfterTest`
function is run. When the test completes and the closure is run, the
test fails due to a difference in the before / after goroutine
snapshots.

This mode of failure is deemed to be a false-positive. The intention of
the logging goroutines are that they live for the duration of the
process. However, exactly _when_ the goroutines scheduled in the `init`
functions actually start run, and hence show up in the goroutine
snapshots, is non-deterministic.

Exclude the logging goroutines from the `leaktest` checks to reduce the
flakiness of tests.

Closes cockroachdb#93849.

Release note: None.
@nicktrav nicktrav requested review from srosenberg, a team and smg260 and removed request for a team December 20, 2022 17:24
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@srosenberg srosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Thanks for adding the exclusions; no more future flakes owing to these false positives.

Btw, the CI failure was already addressed; hopefully, it should pass on the next run; slack thread: https://cockroachlabs.slack.com/archives/C016CAD2HQ8/p1671566665663939?thread_ts=1671566463.649659&cid=C016CAD2HQ8

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @smg260)

@nicktrav
Copy link
Collaborator Author

TFTR!

bors r=srosenberg

@craig
Copy link
Contributor

craig bot commented Dec 21, 2022

Build succeeded:

@craig craig bot merged commit 8ae6026 into cockroachdb:master Dec 21, 2022
@nicktrav nicktrav deleted the nickt.ignore-flush-thread branch December 21, 2022 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: TestMVCCDeleteRangeConcurrentTxn failed
3 participants