Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(better_route_back): Use target peer if exists #3088

Closed
wants to merge 2 commits into from
Closed

Conversation

mfornet
Copy link
Member

@mfornet mfornet commented Aug 5, 2020

In routed message that uses hash to route message back, add also target peer to it, so message are routed optimally when the path to such peer is known.

This upgrade change layout of the message, so extra work is required to make it backward compatible.

Fixes #2989

Most of dropped messages happens because the combination of routed back heuristic and rebalancing strategy. This issue is solved in this PR.

Test plan

  1. Existing tests (rust+nightly) extensively cover routing mechanism.
  2. Is backward compatible

@gitpod-io
Copy link

gitpod-io bot commented Aug 5, 2020

Copy link
Collaborator

@bowenwang1996 bowenwang1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels to me that we are making this unnecessarily complex. It should be rare when we do not know the author of the message. If that is the case, we can afford to drop the message.

@SkidanovAlex
Copy link
Collaborator

@mfornet do we actually cover cases in nightly in which the sender of a message is not a known peer?
We definitely need targeted tests for this change.

@mfornet
Copy link
Member Author

mfornet commented Aug 6, 2020

@mfornet do we actually cover cases in nightly in which the sender of a message is not a known peer?
We definitely need targeted tests for this change.

Yes, it is both covered in simplified rust test (useful for debugging) and nightly test (those with blacklisted peers)

@mfornet
Copy link
Member Author

mfornet commented Aug 6, 2020

It feels to me that we are making this unnecessarily complex. It should be rare when we do not know the author of the message. If that is the case, we can afford to drop the message.

@bowenwang1996 I don't think we should discard route back mechanism. It is useful for new peers connecting to the network, thinking mostly that we have a protection mechanism against an adversary that creates millions of ghost identities, and even in this case, the routing mechanism is still working.

@bowenwang1996
Copy link
Collaborator

Okay, but it seems to me that for the messages that require route back, we can afford to lose some of them. I looked at the messages that need a response, and the only relevant ones right now are PartialEncodedChunkResponse and StateResponse. For both we have retry mechanism, so as long as the message loss happens only occasionally, I don't think it is a huge concern.

In routed message that uses hash to route message back,
add also target peer to it, so message are routed optimally
when the path to such peer is known.

WIP: This upgrade change layout of the message, so extra
work is required to make it backward compatible.

Test plan
=========
1. Existing tests (rust+nightly) extensively cover routing mechanism.
2. Is backward compatible
@mfornet
Copy link
Member Author

mfornet commented Aug 13, 2020

[discussion] This PR is not backward compatible atm, and making it backward compatibly will introduce a lot of complexity to the code, I propose to make this protocol update and after Phase 1.5 rethink how to make such big upgrades easily without messing the code as proposed in near/NEPs#95

@mfornet mfornet requested a review from bowenwang1996 August 13, 2020 23:11
@bowenwang1996
Copy link
Collaborator

We can postpone it for now unless we see that this is causing some real performance issues. However, if we just drop routeback completely it seems that it will be much easier to upgrade (the message format won't change). Can we upgrade in that way and later add routeback back if it is necessary.

@frol
Copy link
Collaborator

frol commented Sep 8, 2020

Should we close or merge this?

@stale
Copy link

stale bot commented Jul 1, 2021

This PR has been automatically marked as stale because it has not had recent activity in the 2 weeks.
It will be closed in 3 days if no further activity occurs.
Thank you for your contributions.

@stale stale bot added the S-stale label Jul 1, 2021
@bowenwang1996
Copy link
Collaborator

Assigning this to @pmnoxx since we have been seeing a lot of routeback not found error on mainnet

@pmnoxx pmnoxx removed their assignment Mar 7, 2022
@akhi3030 akhi3030 removed the S-pinned label Nov 28, 2022
@akhi3030
Copy link
Collaborator

Given the age of the PR, I think it is no longer active.

@akhi3030 akhi3030 closed this Nov 28, 2022
@Ekleog-NEAR Ekleog-NEAR deleted the fix-2989 branch March 29, 2024 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-network Area: Network
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid route back cache if possible
6 participants