fix: end stale outbound queue immediately on disconnect, auto retry outbound messages #3664
Merged: stringhandler merged 4 commits into tari-project:weatherwax from sdbondi:comms-messaging-outbound-clear-stale-asap on Jan 3, 2022
stringhandler
approved these changes
Dec 23, 2021
stringhandler
approved these changes
Jan 3, 2022
sdbondi
added a commit
to sdbondi/tari
that referenced
this pull request
Jan 7, 2022
* development: chore: remove moving lock.mdb (tari-project#3674) chore: merge weatherwax feat!: provide a compact form of TransactionInput (tari-project#3460) v0.22.1.1 v0.22.1 ci: add build step (tari-project#3678) fix: edge cases causing bans during header/block sync (tari-project#3661) fix: end stale outbound queue immediately on disconnect, retry outbound messages (tari-project#3664) feat: add search by commitment to explorer (tari-project#3668) feat: tari launchpad (tari-project#3671) feat: base_node switching for console_wallet when status is offline (tari-project#3639) feat: improve wallet recovery and scanning handling of reorgs (tari-project#3655) feat: add GRPC call to search for utxo via commitment hex (tari-project#3666) feat: custom_base_node in config (tari-project#3651) fix: return correct index for include_pruned_utxos = false (tari-project#3663)
sdbondi
added a commit
to sdbondi/tari
that referenced
this pull request
Jan 10, 2022
* development: feat: dibbler new genesis block with faucet utxos (tari-project#3688) ci: fix clippy warning on generated proto module (tari-project#3690) test: fix metadata signature cucumber (tari-project#3687) refactor!: clean up #testnet reset TODOs (tari-project#3682) feat(comms)!: add signature to peer identity to allow third party identity updates (tari-project#3629) chore: remove moving lock.mdb (tari-project#3674) chore: merge weatherwax v0.22.1.1 v0.22.1 ci: add build step (tari-project#3678) fix: edge cases causing bans during header/block sync (tari-project#3661) fix: end stale outbound queue immediately on disconnect, retry outbound messages (tari-project#3664) feat: add search by commitment to explorer (tari-project#3668) feat: tari launchpad (tari-project#3671) feat: base_node switching for console_wallet when status is offline (tari-project#3639) feat: improve wallet recovery and scanning handling of reorgs (tari-project#3655) feat: add GRPC call to search for utxo via commitment hex (tari-project#3666) feat: custom_base_node in config (tari-project#3651) fix: return correct index for include_pruned_utxos = false (tari-project#3663)
sdbondi
added a commit
to sdbondi/tari
that referenced
this pull request
Jan 11, 2022
* development: (28 commits) feat: covenants implementation (tari-project#3656) ci: add tor script to binaries bundle (tari-project#3689) chore: remove testnet reset todo in wallet (tari-project#3685) feat: dibbler new genesis block with faucet utxos (tari-project#3688) ci: fix clippy warning on generated proto module (tari-project#3690) test: fix metadata signature cucumber (tari-project#3687) refactor!: clean up #testnet reset TODOs (tari-project#3682) feat(comms)!: add signature to peer identity to allow third party identity updates (tari-project#3629) chore: remove moving lock.mdb (tari-project#3674) chore: merge weatherwax feat!: provide a compact form of TransactionInput (tari-project#3460) fix: allow 0-conf in blockchain db (tari-project#3680) v0.22.1.1 v0.22.1 ci: add build step (tari-project#3678) test: fix cucumber WalletQuery.feature (tari-project#3677) test: fix `wait for` step (tari-project#3673) fix: edge cases causing bans during header/block sync (tari-project#3661) perf(comms)!: optimise connection establishment (tari-project#3658) fix: end stale outbound queue immediately on disconnect, retry outbound messages (tari-project#3664) ...
aviator-app bot
pushed a commit
that referenced
this pull request
Jan 12, 2022
Description
---
- Handle the case where a connection that is disconnected due to a simultaneous dial can cause the new connection to be removed from connectivity state
- Remove messaging protocol timeouts; the messaging protocol will end correctly when disconnected (ref #3664)
- Remove the condition that a connection should have 0 substreams for reaping. A connection may only be reaped if its age is >= 20 minutes and there are 0 handles held for the connection
- Detect if the remote outbound stream is closed by reading from it (yamux requires reading to determine if the stream is still alive)
- Ensure the disconnect event can only be emitted once per peer connection
- Remove the delayed connection close and "will close" event

Motivation and Context
---
Observed a case where two nodes simultaneously dial and end up with no connections to each other due to a mis-timed peer disconnected event. The connection would have to wait for DHT connectivity to eventually redial to recover.

Inactivity timeouts for messaging were a patch for the outbound messaging staying open, but this is not properly detected and handled.

How Has This Been Tested?
---
- Existing tests updated
- Manually: two base nodes that previously "lost" their connection due to this bug
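The reaping rule in that commit message (age of at least 20 minutes and zero handles held, with the substream count no longer consulted) can be sketched as a simple predicate. This is only an illustration: `ConnectionInfo`, `MIN_REAP_AGE`, and `can_reap` are hypothetical names, not taken from the tari comms code.

```rust
use std::time::Duration;

/// Minimum connection age before reaping, per the commit message (20 minutes).
/// The constant name is an assumption made for this sketch.
const MIN_REAP_AGE: Duration = Duration::from_secs(20 * 60);

/// Hypothetical snapshot of the connection state relevant to reaping.
struct ConnectionInfo {
    age: Duration,
    handle_count: usize,
}

/// A connection may be reaped only if it is old enough and nothing holds a
/// handle to it. The number of open substreams is deliberately not checked.
fn can_reap(conn: &ConnectionInfo) -> bool {
    conn.age >= MIN_REAP_AGE && conn.handle_count == 0
}
```

Under this sketch, a 25-minute-old connection with one handle still held would not be reaped, regardless of how many substreams it has.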
Description
Motivation and Context
Due to yamux substream internals, the outbound stream does not indicate that it has ended until we attempt to write a message. As a result, the outbound stream remains active (i.e. available for sending) even though it cannot send a message. That message, and potentially any others that are queued up, will then be dropped. This is rectified by cutting off the outbound queue channel as soon as a connection is disconnected and 'rerouting' the queued messages (if any) to be retried. The retry attempts to reconnect to the peer and, failing that, drops the queued messages, since there is nothing more we can do in that case.
This increases messaging reliability.
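The general shape of this fix can be sketched with standard-library channels. This is a minimal illustration under stated assumptions, not the code from this PR: `OutboundMessage`, `end_stale_queue`, `retry_or_drop`, and the `redial` callback are hypothetical names, and `std::sync::mpsc` stands in for whatever channel type the comms layer actually uses.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical outbound message type used only in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct OutboundMessage {
    peer: String,
    body: String,
}

/// Called as soon as the connection is known to be disconnected.
/// Collects everything still queued, then drops the receiver so that any
/// later `send` on the stale queue fails immediately instead of being
/// accepted and silently lost.
fn end_stale_queue(queue: Receiver<OutboundMessage>) -> Vec<OutboundMessage> {
    let pending: Vec<OutboundMessage> = queue.try_iter().collect();
    drop(queue);
    pending
}

/// Retries the rerouted messages. `redial` (an assumption of this sketch)
/// tries to reconnect and returns a fresh outbound queue on success.
/// Returns how many messages were re-queued; if the redial fails, the
/// messages are dropped and 0 is returned.
fn retry_or_drop<F>(pending: Vec<OutboundMessage>, redial: F) -> usize
where
    F: FnOnce(&str) -> Option<Sender<OutboundMessage>>,
{
    let peer = match pending.first() {
        Some(msg) => msg.peer.clone(),
        None => return 0,
    };
    match redial(&peer) {
        Some(new_queue) => pending
            .into_iter()
            .map(|msg| new_queue.send(msg))
            .filter(Result::is_ok)
            .count(),
        None => 0,
    }
}
```

In this sketch the sender learns about the disconnect from the failed `send` rather than from a write on the underlying stream, which is the property the change is after. A production version would also need to handle messages that arrive between the drain and the drop.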
How Has This Been Tested?
Manually, by sending transaction and ping messages between wallets and base nodes that are banned or shut down.