1. The timeouts as defined are somewhat lacking and misleading.
2. The scenario you describe, where the future will never return.
For this second one, does the issue still apply in the issue-84-integration branch (see issue #84)? We rewrote all the membership change stuff there, so maybe it's already fixed.
Hey there,
We wanted to check in on this request since it has been inactive for at least 90 days.
Have you reviewed the latest godocs?
If you think this is still an important issue in the latest version of the Raft library or its documentation, please let us know and we'll keep it open for investigation.
If there is still no activity on this request in 30 days, we will go ahead and close it.
Thank you!
Hey there, this issue has been automatically closed because there hasn't been any activity for a while. If you are still experiencing problems, or still have questions, feel free to open a new one :+1:
I admittedly didn't check closed issues to see if this was already reported somewhere, but at https://github.com/hashicorp/raft/blob/master/raft.go#L306:
The timeout is only honored while enqueuing the request to write a log. In my case, I wrote a custom transport that seems to have issues, since there are a lot of undocumented details that transports need to handle. Luckily it's open source, so I could figure it out.
But in real life, I can simulate this by taking down half of the cluster just as Apply is being called. The log then never reaches quorum, but rather than timing out, the whole future is just deadlocked. The timeout isn't forwarded on and used in the Error() call. Since the future can never reach quorum (for example, it was constructed when there were 8 nodes but now there are 3), the Error() call will just hang forever. Even if the nodes are restored and we're back up to 8 nodes, it stays deadlocked because those failed applies are never retried.
So it'd be good if deferError also took a timeout and honored it while waiting on the channel, returning an appropriate timeout error if it expires.