Apply deadlocks in spite of timeout #145

Closed
hasKeef opened this issue Jul 28, 2016 · 3 comments

hasKeef commented Jul 28, 2016

I admittedly didn't check closed issues to see whether this has already been reported, but at:
https://github.com/hashicorp/raft/blob/master/raft.go#L306

The timeout is only honored while enqueuing the request to write a log. I ran into this with a custom transport I wrote, which seems to have issues because there are a lot of undocumented details that transports need to handle. Luckily the library is open source, so I could figure it out.

But this can also happen in real life: I can simulate it by taking down half of the cluster just as Apply is being called. The log then never reaches quorum, but rather than timing out, the future simply deadlocks. The timeout isn't forwarded on and used in the Error() call. Since the future can never reach quorum (for example, it was constructed when there were 8 nodes but now there are 3), the Error() call hangs forever. Even if the nodes are restored and we're back up to 8 nodes, it stays deadlocked because those failed applies are never retried.

So it would be good if deferError also took a timeout and honored it while waiting on the channel, returning an appropriate timeout error if it expires.

Contributor

ongardie commented Aug 12, 2016

Hi @hasKeef,

I think there are two issues here:

  1. The timeouts as defined are sort of lacking and misleading.
  2. The scenario you describe where the future will never return.

For this second one, does the issue still apply in the issue-84-integration branch (see issue #84)? We rewrote all the membership change stuff there, so maybe it's already fixed.


stale bot commented Jun 6, 2019

Hey there,
We wanted to check in on this request since it has been inactive for at least 90 days.
Have you reviewed the latest godocs?
If you think this is still an important issue in the latest version of the Raft library or
its documentation, please let us know and we'll keep it open for investigation.
If there is still no activity on this request in 30 days, we will go ahead and close it.
Thank you!

@stale stale bot added the waiting-reply label Jun 6, 2019

stale bot commented Jul 6, 2019

Hey there, this issue has been automatically closed because there hasn't been any activity for a while. If you are still experiencing problems, or still have questions, feel free to open a new one :+1:
