ci: successful test retries evidently delete leftover tmpdir artifacts from previously failed runs #103042

rickystewart · 2023-05-10T17:08:57Z

This manifests as a failed test saying you can find leftover test artifacts in some directory, when those artifacts cannot be found. It happens specifically in cases where that test is successfully retried.

We do not have a root cause for this bug yet, but my guess is that whatever code computes where these directories live chooses the same directory for the re-run, and when the test succeeds those artifacts get wiped.

Jira issue: CRDB-27807

First of all, test retries don't even have the correct behavior: cockroachdb#103042

This means that a successfully-retried test tramples the logs of previously-failed tests, which is very confusing and erases your ability to debug the test.

Also, we are focusing on quality and wiping out flaky and skipped tests. This to me suggests we should not be retrying tests to let already-flaky tests through. Rather, we should be surfacing real failures immediately.

For both of these reasons I turn off test retries for unit tests on `master` and release branches. We keep it for `staging` so `bors` is unaffected.

Epic: none
Release note: None

106206: prereqs: delete tests r=rail a=rickystewart

These tests have always been skipped under Bazel because the implementation doesn't work in a Bazel world due to the dependency on `"golang.org/x/tools/go/packages"`. Since the command is only useful/used in `make`, which is going to be deleted shortly, just delete the tests rather than waste time getting it working.

Also this is the last `broken_in_bazel` test, so rip out all the corresponding logic too.

Epic: none
Release note: None
Closes: #61924
Closes: #92814

106343: ci: don't do retries on tests in CI on master, release branches r=rail a=rickystewart

First of all, test retries don't even have the correct behavior: #103042

This means that a successfully-retried test tramples the logs of previously-failed tests, which is very confusing and erases your ability to debug the test.

Also, we are focusing on quality and wiping out flaky and skipped tests. This to me suggests we should not be retrying tests to let already-flaky tests through. Rather, we should be surfacing real failures immediately.

For both of these reasons I turn off test retries for unit tests on `master` and release branches. We keep it for `staging` so `bors` is unaffected.

Epic: none
Release note: None

106411: kvflowhandle: fix mutex leak r=irfansharif a=irfansharif

Fixes #106078. We were forgetting to unlock the mutex on the error path.

Release note: None

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: irfan sharif <[email protected]>

rickystewart added the C-enhancement, T-dev-inf, and A-ci labels May 10, 2023

exalate-issue-sync bot assigned rickystewart Jun 2, 2023

rickystewart mentioned this issue Jul 6, 2023

ci: don't do retries on tests in CI on master, release branches #106343

Merged

rafiss mentioned this issue Jul 31, 2023

pkg/sql/logictest/tests/cockroach-go-testserver-upgrade-to-master/cockroach-go-testserver-upgrade-to-master_test: TestLogic_mixed_version_can_login failed #107778

Closed

rickystewart mentioned this issue Aug 1, 2023

release-23.1: ci: don't do retries on tests in CI on master, release branches #107951

Merged

rafiss mentioned this issue Aug 1, 2023

ccl/schemachangerccl: TestBackupMixedVersionElements_base_add_column failed #107811

Closed

exalate-issue-sync bot assigned liamgillies and unassigned rickystewart Sep 21, 2023

kenliu-crl assigned rickystewart and unassigned liamgillies Sep 21, 2023
