etcd-runner: fails on etcd-tester (election, no progress after 1 minute) #7891

Closed
gyuho opened this issue May 5, 2017 · 0 comments · Fixed by #7902

gyuho commented May 5, 2017

2017-05-05 20:38:04.251694 I | etcd-tester: [round#0 case#2] recovered failure
2017-05-05 20:38:04.251726 I | etcd-tester: [round#0 case#2] pausing the stressers...
2017-05-05 20:38:04.252160 I | etcd-tester: lease stresser "10.240.0.2:2379" is closed
2017-05-05 20:38:04.252322 I | etcd-tester: keyStresser "10.240.0.2:2379" is closed
2017-05-05 20:38:04.254374 I | etcd-tester: lease stresser "10.240.0.4:2379" is closed
2017-05-05 20:38:04.254689 I | etcd-tester: lease stresser "10.240.0.6:2379" is closed
2017-05-05 20:38:04.254839 I | etcd-tester: lease stresser "10.240.0.3:2379" is closed
2017-05-05 20:38:04.255294 I | etcd-tester: lease stresser "10.240.0.5:2379" is closed
2017-05-05 20:38:04.255727 I | etcd-tester: keyStresser "10.240.0.4:2379" is closed
2017-05-05 20:38:04.255819 I | etcd-tester: keyStresser "10.240.0.6:2379" is closed
2017-05-05 20:38:04.255859 I | etcd-tester: keyStresser "10.240.0.5:2379" is closed
2017-05-05 20:38:04.255906 I | etcd-tester: keyStresser "10.240.0.3:2379" is closed
2017-05-05 20:38:04.255930 I | etcd-tester: [round#0 case#2] paused stressers
2017-05-05 20:38:04.255938 I | etcd-tester: [round#0 case#2] wait until cluster is healthy
2017-05-05 20:38:04.283546 I | etcd-tester: [round#0 case#2] cluster is healthy
2017-05-05 20:38:04.283572 I | etcd-tester: [round#0 case#2] checking consistency and invariant of cluster
2017-05-05 20:38:10.325973 I | etcd-tester: [round#0 case#2] (/home/gyuho/go/bin/etcd-runner [election 1494016619491935753 --dial-timeout=10s --endpoints 10.240.0.4:2379 --total-client-connections=10 --rounds=0 --req-rate 100]) stderr 20:37:59.977209 no progress after 1 minute!
panic: no progress after 1 minute!

goroutine 1 [running]:
log.Panic(0xc420339ad8, 0x1, 0x1)
	/usr/local/go/src/log/log.go:322 +0xc0

etcd-tester --enable-pprof --etcd-runner=$etcd-runner --stresser=keys,lease,election-runner,watch-runner,lock-racer-runner,lease-runner --agent-endpoints="$(echo $AGENT_RPC_ENDPOINTS)"

@heyitsanthony heyitsanthony added this to the v3.2.0 milestone May 9, 2017
fanminshi added a commit to fanminshi/etcd that referenced this issue May 9, 2017
election runner can deadlock in atomic release().

suppose the election runner has two clients, A and B.
if A is the leader and B is a follower, B obtains the lock
for release() and waits for A to close(nextc), which signals
that the next round is ready. However, A can only close(nextc) if it
obtains the lock for release(); hence the deadlock.

this pr removes the atomicity of validate() and release() in global.go
and gives the responsibility of locking to each runner.

FIXES etcd-io#7891
yudai pushed a commit to yudai/etcd that referenced this issue Oct 5, 2017