roachtest: streamline debug collection #36562

tbg · 2019-04-05T08:18:55Z

Inspired by 1.

The case in which a test hits an error and the case in which a test
times out were handled differently, and I had little confidence that
either was working as intended.

This simplifies things by not running the test in the main goroutine,
but reserving the main goroutine for deferring the debug and cleanup
actions while the test itself now runs in a child goroutine.

I tested this by running tpcc/nodes=3/w=headroom locally (where it
runs with one warehouse and only for a minute), both in the passing
case and in the case in which it fails with an artificially low timeout.

Release note: None

tbg · 2019-04-05T08:20:54Z

pkg/cmd/roachtest/test.go

+			}
+			// NB: c.destroyed is nil for cloned clusters (i.e. in subtests).
+			if !debugEnabled && c.destroyed != nil {
+				c.Destroy(ctx)


PS I'm unclear why there's this random c.Destroy call that only fires on a timeout. Perhaps someone can educate me. I thought destroying would be left to a higher power. Perhaps we can just stop in all cases and don't need to track the bool?

petermattis

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)

pkg/cmd/roachtest/test.go, line 1190 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

PS I'm unclear why there's this random c.Destroy call that only fires on a timeout. Perhaps someone can educate me. I thought destroying would be left to a higher power. Perhaps we can just stop in all cases and don't need to track the bool?

As I've mentioned to @andreimatei before, this code has evolved into the current mess and deserves to be rewritten, but there is a high probability of fallout from any changes.

pkg/cmd/roachtest/test.go, line 1230 at r1 (raw file):

		t.mu.Unlock()

		// Run the test itself in a goroutine. The main goroutine is in charge

Does this muck with the test.runner stacktrace magic? See test.decorate. Also, test.runnerID is supposed to be the ID of the goroutine running the test. See test.Status, and test.Progress. I suspect switching to running the test in a goroutine is going to have a bunch of fallout.
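
To make the goroutine-ID concern concrete, here is a minimal sketch. It assumes the ID is read by parsing the header line of runtime.Stack; how roachtest actually derives test.runnerID is not shown in this thread, and goroutineID is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"strconv"
)

// goroutineID parses the calling goroutine's ID from the header line of
// runtime.Stack, which has the form "goroutine 18 [running]:".
// This is a debugging hack used for illustration only.
func goroutineID() (int, error) {
	buf := make([]byte, 64)
	buf = buf[:runtime.Stack(buf, false)]
	fields := bytes.Fields(buf)
	if len(fields) < 2 {
		return 0, fmt.Errorf("unexpected stack header %q", buf)
	}
	return strconv.Atoi(string(fields[1]))
}

func main() {
	// ID recorded by the goroutine that sets up the test.
	parentID, err := goroutineID()
	if err != nil {
		panic(err)
	}

	// ID seen by the goroutine that actually runs the test body.
	ch := make(chan int, 1)
	go func() {
		id, _ := goroutineID()
		ch <- id
	}()
	childID := <-ch

	// Any check of the form "am I the runner?" that compares against the
	// parent's ID would no longer match inside the child goroutine.
	fmt.Println("same goroutine:", parentID == childID)
}
```

If the runner ID were recorded before the test body is handed to a new goroutine, it would have to be recorded from inside that goroutine instead.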

andreimatei

Let's not change the existing code any more. I'm trying to get my rewrite merged asap.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)

tbg requested a review from andreimatei April 5, 2019 08:19

tbg commented Apr 5, 2019

petermattis requested changes Apr 5, 2019

tbg mentioned this pull request Apr 19, 2019

roachtest: scrub/all-checks/tpcc/w=1000 failed #35986

Closed

andreimatei reviewed Apr 19, 2019

tbg mentioned this pull request Apr 23, 2019

roachtest: kv/splits/nodes=3/quiesce=true failed #36319

Closed

tbg added the X-noremind Bots won't notify about PRs with X-noremind label Jun 19, 2019

tbg closed this Jan 13, 2020
