Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: successful test retries evidently delete leftover tmpdir artifacts from previously failed runs #103042

Open
rickystewart opened this issue May 10, 2023 · 0 comments
Assignees
Labels
A-ci Continuous Integration C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf

Comments

@rickystewart
Copy link
Collaborator

rickystewart commented May 10, 2023

Internal ref.

This manifests as a failed tests saying you can find leftover test artifacts in some directory, when those artifacts cannot be found, specifically in cases where that test is successfully retried.

We do not have a root cause for this bug yet but my guess is that whatever code computes where these directories lives chooses the same directory for the re-run, and when the test succeeds those artifacts get wiped.

Jira issue: CRDB-27807

@rickystewart rickystewart added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf A-ci Continuous Integration labels May 10, 2023
rickystewart added a commit to rickystewart/cockroach that referenced this issue Jul 6, 2023
First of all, test retries don't even have the correct behavior: cockroachdb#103042
This means that a successfully-retried test tramples the logs of
previously-failed tests, which is very confusing and erases your ability
to debug the test.

Also, we are focusing on quality and wiping out flaky and skipped tests.
This to me suggests we should not be retrying tests to let already-flaky
tests through. Rather, we should be surfacing real failures immediately.

For both of these reasons I turn off test retries for unit tests on
`master` and release branches.

We keep it for `staging` so `bors` is unaffected.

Epic: none
Release note: None
craig bot pushed a commit that referenced this issue Jul 7, 2023
106206: prereqs: delete tests r=rail a=rickystewart

These tests have always been skipped under Bazel because the implementation doesn't work in a Bazel world due to the dependency on `"golang.org/x/tools/go/packages"`. Since the command is only useful/ used in `make`, which is going to be deleted shortly, just delete the tests rather than waste time getting it working.

Also this is the last `broken_in_bazel` test, so rip out all the corresponding
logic too.

Epic: none
Release note: None
Closes: #61924
Closes: #92814

106343: ci: don't do retries on tests in CI on master, release branches r=rail a=rickystewart

First of all, test retries don't even have the correct behavior: #103042
This means that a successfully-retried test tramples the logs of
previously-failed tests, which is very confusing and erases your ability
to debug the test.

Also, we are focusing on quality and wiping out flaky and skipped tests.
This to me suggests we should not be retrying tests to let already-flaky
tests through. Rather, we should be surfacing real failures immediately.

For both of these reasons I turn off test retries for unit tests on
`master` and release branches.

We keep it for `staging` so `bors` is unaffected.

Epic: none
Release note: None

106411: kvflowhandle: fix mutex leak r=irfansharif a=irfansharif

Fixes #106078.

We were forgetting to unlock the mutex on the error path.

Release note: None

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
rickystewart added a commit to rickystewart/cockroach that referenced this issue Aug 2, 2023
First of all, test retries don't even have the correct behavior: cockroachdb#103042
This means that a successfully-retried test tramples the logs of
previously-failed tests, which is very confusing and erases your ability
to debug the test.

Also, we are focusing on quality and wiping out flaky and skipped tests.
This to me suggests we should not be retrying tests to let already-flaky
tests through. Rather, we should be surfacing real failures immediately.

For both of these reasons I turn off test retries for unit tests on
`master` and release branches.

We keep it for `staging` so `bors` is unaffected.

Epic: none
Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ci Continuous Integration C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf
Projects
None yet
Development

No branches or pull requests

2 participants