Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InmemTransport: respect timeout when sending #313

Merged
merged 1 commit into from
Mar 28, 2019

Conversation

babbageclunk
Copy link
Contributor

InmemTransport would block indefinitely when trying to make an RPC to a peer who has already shut down. This was causing some of Juju's tests for our raft machinery to hang sometimes when stopping raft nodes - see #312 for code that will reproduce the situation.

Guarding the send with a timeout (like there is on the receive) fixes it and seems like the right thing to do here.

@hashicorp-cla
Copy link

hashicorp-cla commented Mar 19, 2019

CLA assistant check
All committers have signed the CLA.

@babbageclunk
Copy link
Contributor Author

Hmm - I think the integration test that failed is flaky. If I run it in a loop against master it fails within 10 runs.

api.go Outdated
localAddr: localAddr,
logger: logger,
logs: logs,
protocolVersion: protocolVersion,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean to include this formatting change, the file got gofmt'd when I was adding debugging.

InmemTransport would block indefinitely when trying to make an RPC to a
peer who has already shut down. This was causing our tests using it to
hang sometimes when stopping raft nodes - guarding the send with a
timeout (like there is on the receive) resolves the problem.
babbageclunk added a commit to babbageclunk/juju that referenced this pull request Mar 19, 2019
This fixes the intermittent raftflag test hang which was caused by the
raft InmemTransport blocking indefinitely when trying to send an RPC and
the other end has stopped already. Proposed upstream at
hashicorp/raft#313, but using the juju/raft fork
until that is merged.
jujubot added a commit to juju/juju that referenced this pull request Mar 19, 2019
#9909

## Description of change

Updates hashicorp/raft dependency to use our own fork to respect timeouts when sending RPC requests (at least until our PR is merged upstream). See hashicorp/raft#313 for more detail.

## QA steps

Running the raftflag worker test under stress doesn't hang anymore.

## Documentation changes

None

## Bug reference

None
@schristoff
Copy link
Contributor

Thanks for fixing this odd timing bug in our testing system!

@schristoff schristoff merged commit 70c3e5d into hashicorp:master Mar 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants