roachtest: clearrange/checks=true failed #51711

Closed
cockroach-teamcity opened this issue Jul 22, 2020 · 2 comments
Labels
C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.

@cockroach-teamcity

(roachtest).clearrange/checks=true failed on provisional_202007220233_v20.2.0-alpha.2@d3119926d33d808c6384cf3e99a7f7435f395489:

	cluster.go:2553,clearrange.go:170,clearrange.go:33,test_runner.go:757: monitor failure: unexpected node event: 10: dead
		(1) attached stack trace
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2541
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2549
		  | main.runClearRange
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/clearrange.go:170
		  | main.registerClearRange.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/clearrange.go:33
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:757
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1373
		Wraps: (2) monitor failure
		Wraps: (3) unexpected node event: 10: dead
		Error types: (1) *withstack.withStack (2) *errutil.withMessage (3) *errors.errorString

	cluster.go:1571,context.go:135,cluster.go:1560,test_runner.go:826: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2107811-1595392378-26-n10cpu4 --oneshot --ignore-empty-nodes: exit status 1 1: 5203
		9: 4420
		4: 4414
		6: 4469
		5: 4627
		3: 4617
		7: 4545
		8: 4613
		10: dead
		2: 4505
		Error: UNCLASSIFIED_PROBLEM: 10: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  | main.glob..func13
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1115
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:266
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1808
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1373
		Wraps: (3) 3 safe details enclosed
		Wraps: (4) 10: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *safedetails.withSafeDetails (4) *errors.errorString
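For reading the monitor output above: each line pairs a node number with either a process ID or the word `dead`. A minimal sketch of picking out the dead nodes from output of that shape follows. The function name `deadNodes` is hypothetical and this is not roachprod's own parsing code; it only assumes the `<node>: <status>` layout seen in the log.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// deadNodes scans monitor-style output with one "<node>: <status>" pair per
// line and returns the node numbers whose status is exactly "dead".
// Lines whose prefix is not a number (such as "Error: ...") are skipped.
// Hypothetical helper for illustration only.
func deadNodes(output string) []int {
	var dead []int
	for _, line := range strings.Split(output, "\n") {
		node, status, ok := strings.Cut(strings.TrimSpace(line), ": ")
		if !ok {
			continue
		}
		id, err := strconv.Atoi(node)
		if err != nil {
			continue // not a node line
		}
		if status == "dead" {
			dead = append(dead, id)
		}
	}
	return dead
}

func main() {
	out := "1: 5203\n9: 4420\n10: dead\n2: 4505\nError: UNCLASSIFIED_PROBLEM: 10: dead"
	fmt.Println(deadNodes(out))
}
```

With the sample input in `main`, only node 10 is reported, matching the `10: dead` line in the log.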

Artifacts: /clearrange/checks=true
See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity added the labels branch-provisional_202007220233_v20.2.0-alpha.2, C-test-failure, O-roachtest, O-robot, and release-blocker on Jul 22, 2020
@cockroach-teamcity added this to the 20.2 milestone on Jul 22, 2020
@jbowens

jbowens commented Jul 23, 2020

Looks like both checks=false and checks=true failed with clock synchronization errors.

F200722 06:36:47.291829 29 server/server.go:300  [n3] clock synchronization error: this node is more than 500ms away from at least half of the known nodes (5 of 11 are within the offset)
goroutine 29 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x81f1c01, 0x7f7085bf4e98, 0x0, 0x7f7085bf4e98)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x81f0d60, 0x4, 0x1623ffac045b2b84, 0x1d, 0x76ccbaa, 0x10, 0x12c, 0xc00af76e10, 0x84, 0xc001c0ba58, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:248 +0xa61

https://teamcity.cockroachdb.com/repository/download/Cockroach_Nightlies_WorkloadNightly/2107811:id/clearrange/checks%3Dfalse/run_1/5.logs/cockroach.log
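The fatal message reads as a majority check: the node gives up when the peers whose clocks are within the allowed offset do not form a strict majority of the known nodes. A simplified sketch of that rule, inferred from the message text alone, is below. The names `majorityWithin` and `checkClockOffsets` are hypothetical, the 500ms limit is taken from the log, and the real check may also account for measurement uncertainty, which is left out here.

```go
package main

import (
	"fmt"
	"time"
)

// majorityWithin reports whether strictly more than half of the known nodes
// have clocks within the allowed offset. Hypothetical helper.
func majorityWithin(within, total int) bool {
	return total == 0 || within > total/2
}

// checkClockOffsets counts the peers whose absolute offset is at most
// maxOffset and returns an error when they are not a strict majority.
// Simplified illustration, not the CockroachDB implementation.
func checkClockOffsets(offsets []time.Duration, maxOffset time.Duration) error {
	within := 0
	for _, off := range offsets {
		if off < 0 {
			off = -off
		}
		if off <= maxOffset {
			within++
		}
	}
	if !majorityWithin(within, len(offsets)) {
		return fmt.Errorf("clock synchronization error: this node is more than %s away "+
			"from at least half of the known nodes (%d of %d are within the offset)",
			maxOffset, within, len(offsets))
	}
	return nil
}

func main() {
	// Five peers close to this node and six roughly a second away,
	// mirroring the "5 of 11" case in the log.
	var offsets []time.Duration
	for i := 0; i < 5; i++ {
		offsets = append(offsets, 10*time.Millisecond)
	}
	for i := 0; i < 6; i++ {
		offsets = append(offsets, 998*time.Millisecond)
	}
	if err := checkClockOffsets(offsets, 500*time.Millisecond); err != nil {
		fmt.Println(err)
	}
}
```

Under this reading, both reported cases fail the check: 5 of 11 and 4 of 9 are each one node short of a strict majority.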

W200722 06:17:38.179974 5921 server/node.go:747  [n4,summaries] health alerts detected: {Alerts:[{StoreID:0 Category:METRICS Description:liveness.heartbeatfailures Value:1}]}
W200722 06:17:38.567048 295 kv/txn.go:610  [n4,liveness-hb] failure aborting transaction: remote wall time is too far ahead (998.169895ms) to be trustworthy; abort caused by: remote wall time is too far ahead (998.699214ms) to be trustworthy
W200722 06:17:38.567118 295 kv/kvserver/node_liveness.go:538  [n4,liveness-hb] failed node liveness heartbeat: remote wall time is too far ahead (998.699214ms) to be trustworthy
W200722 06:17:38.677534 5950 sql/stmtdiagnostics/statement_diagnostics.go:122  [n4] error polling for statement diagnostics requests: stmt-diag-poll: remote wall time is too far ahead (998.884674ms) to be trustworthy
W200722 06:17:43.067019 295 kv/txn.go:610  [n4,liveness-hb] failure aborting transaction: remote wall time is too far ahead (998.113528ms) to be trustworthy; abort caused by: remote wall time is too far ahead (998.741985ms) to be trustworthy
W200722 06:17:43.067102 295 kv/kvserver/node_liveness.go:538  [n4,liveness-hb] failed node liveness heartbeat: remote wall time is too far ahead (998.741985ms) to be trustworthy
F200722 06:17:43.114656 38 server/server.go:300  [n4] clock synchronization error: this node is more than 500ms away from at least half of the known nodes (4 of 9 are within the offset)
goroutine 38 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x81f1c01, 0x7fa0b62c7ee8, 0x0, 0x7fa0b6447108)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8

https://teamcity.cockroachdb.com/repository/download/Cockroach_Nightlies_WorkloadNightly/2107811:id/clearrange/checks%3Dtrue/run_1/10.logs/cockroach.log
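The warnings leading up to the fatal error suggest a per-message guard as well: a timestamp from a remote node is rejected when it is further ahead of the local clock than the allowed offset. A minimal sketch of such a guard follows, assuming wall times expressed as nanoseconds and the same 500ms limit named in the fatal message. The function name `checkRemoteWallTime` is hypothetical and this is not the code from the CockroachDB tree.

```go
package main

import (
	"fmt"
	"time"
)

// checkRemoteWallTime rejects a remote wall time (in nanoseconds) that is
// ahead of the local wall time by more than maxOffset. Remote times that are
// behind the local clock are accepted. Illustrative sketch only.
func checkRemoteWallTime(localNanos, remoteNanos int64, maxOffset time.Duration) error {
	ahead := time.Duration(remoteNanos - localNanos)
	if ahead > maxOffset {
		return fmt.Errorf("remote wall time is too far ahead (%s) to be trustworthy", ahead)
	}
	return nil
}

func main() {
	local := time.Now().UnixNano()
	// A remote clock about 998.7ms ahead, as in the n4 log lines.
	remote := local + 998699214
	if err := checkRemoteWallTime(local, remote, 500*time.Millisecond); err != nil {
		fmt.Println(err)
	}
}
```

A skew of roughly 998ms is almost double the 500ms limit, so every liveness heartbeat on n4 would trip a guard like this until the clocks converge.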

@irfansharif

Let's chalk it up to an infra-flake.
