[Discussion] Reject sync request if too many peers are syncing #3147

MaksymZavershynskyi · 2020-08-12T05:34:56Z

Motivation

When all network nodes are rebooted after an update, they all try to sync at the same time, which makes booting slow.

Proposed design 1

Node A should reject sync requests with structured error X when it already has more than Y nodes syncing from it. The number Y should be determined by benchmarking the syncing code. When node B receives structured error X during sync, it should try other nodes and retry, after some delay, the peers that returned structured error X. This will naturally line the nodes up into a queue.

This also makes such a network easier to monitor, since it will be easier to observe why a node has not synced yet.
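
As a rough illustration of design 1, here is a minimal sketch of both sides of the exchange. Everything in it is hypothetical and not taken from nearcore: `SyncError::TooManySyncingPeers` stands in for structured error X, `max_syncing` stands in for Y, and `SyncGate`, `try_admit`, `finish`, and `request_sync` are names made up for this example. It also models peers as in-process values instead of network connections, and it admits at most Y peers, which is one possible reading of "more than Y".

```rust
use std::collections::HashSet;
use std::time::Duration;

/// Hypothetical structured error ("X" in the proposal).
#[derive(Debug, PartialEq)]
pub enum SyncError {
    /// The serving node is busy; the requester may retry after this delay.
    TooManySyncingPeers { retry_after: Duration },
}

/// Node A's side: tracks which peers are currently syncing from it.
pub struct SyncGate {
    /// "Y" in the proposal; assumed to come from benchmarking.
    max_syncing: usize,
    /// Delay hint returned with every rejection (an assumption of this sketch).
    retry_after: Duration,
    syncing: HashSet<String>,
}

impl SyncGate {
    pub fn new(max_syncing: usize, retry_after: Duration) -> Self {
        SyncGate { max_syncing, retry_after, syncing: HashSet::new() }
    }

    /// Admit `peer` if it is already syncing or a slot is free;
    /// otherwise reject with the structured error.
    pub fn try_admit(&mut self, peer: &str) -> Result<(), SyncError> {
        if self.syncing.contains(peer) {
            return Ok(());
        }
        if self.syncing.len() >= self.max_syncing {
            return Err(SyncError::TooManySyncingPeers { retry_after: self.retry_after });
        }
        self.syncing.insert(peer.to_string());
        Ok(())
    }

    /// Free the slot once `peer` has finished syncing.
    pub fn finish(&mut self, peer: &str) {
        self.syncing.remove(peer);
    }
}

/// Node B's side: ask each candidate once. Returns the first peer that
/// accepts, or the shortest suggested delay if every peer is busy
/// (zero if there are no peers at all).
pub fn request_sync<'a>(
    me: &str,
    peers: &mut [(&'a str, SyncGate)],
) -> Result<&'a str, Duration> {
    let mut shortest: Option<Duration> = None;
    for (name, gate) in peers.iter_mut() {
        match gate.try_admit(me) {
            Ok(()) => return Ok(*name),
            Err(SyncError::TooManySyncingPeers { retry_after }) => {
                shortest = Some(shortest.map_or(retry_after, |d| d.min(retry_after)));
            }
        }
    }
    Err(shortest.unwrap_or(Duration::ZERO))
}
```

A caller on node B would sleep for the returned delay and call `request_sync` again. Nodes rejected by every peer end up waiting in turn, which is the queueing effect described above, and the rejection itself gives monitoring an explicit reason why a node has not synced yet.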

Proposed design 2

@evgenykuzyakov proposed that nearup could add a random delay before it starts the node. @nearmax's argument against it is that it will not work universally, e.g. it won't work when nodes are upgraded by the community and the NEAR Foundation does not have full control over when and how people start them. Besides, randomization is a heuristic that adds to the maintenance burden of the system.

bowenwang1996 · 2020-08-12T05:44:47Z

Let's not conflate nearup (which is a tool to manage the node) with the behavior of the node itself. Whatever we do with nearup should be separate from nearcore. As for syncing, since a node has a limited number of peers, the number of peers that are syncing is naturally limited. Also, other than state sync (for which we already have limits), syncing is not very resource intensive, so I don't think we need to impose extra restrictions, although I do agree that limiting the number of peers syncing is a way to prevent an eclipse attack.

MaksymZavershynskyi · 2020-08-12T05:51:43Z

> Let's not conflate nearup (which is a tool to manage the node) with the behavior of the node itself. Whatever we do with nearup should be separate from nearcore.

I agree, let's not add hacks to nearup, such as a randomized timer, to work around node issues. The node's inability to communicate efficiently with its peers and decide when and how to sync is a node issue.

MaksymZavershynskyi · 2020-08-12T05:54:00Z

I think there is some mutual misunderstanding here.

I see, you were talking about rolling release. Closing this issue.

MaksymZavershynskyi added the A-network Area: Network label Aug 12, 2020

MaksymZavershynskyi assigned bowenwang1996 and mfornet Aug 12, 2020

MaksymZavershynskyi closed this as completed Aug 12, 2020

weekly-digest bot mentioned this issue Aug 14, 2020

Weekly Digest (7 August, 2020 - 14 August, 2020) #3163
