Autosharding v1 #1846
@SionoiS @alrevuelta I was going through the peer manager code and noticed that there seems to be a limit of one connection per peer, as per libp2p.
Not sure if this is the right place to bring this up, because this issue exists with static sharding as well. Thoughts?
That is a good point, @chaitanyaprem! In theory it's possible; my question would be how often it happens in practice. I don't think we want multiple connections per peer. Although I'm not as familiar with this part of the code as @alrevuelta, so I'll have to investigate further.
Weekly Update
Yeah, even I couldn't think of a possible scenario, but I am guessing it would depend on the order in which peer connections get established. Since that order is hard to determine, this can happen on any node while starting up, restarting, or just during regular peer connection pruning.
Yup. The minimum we would need is to only connect to peers in the shards that we are interested in. We can know this using the ENR field indicating the shards the node is subscribed to. Note that this can be faked, but afaik we can double-check it once the connection is established.
Note that this has to be bypassed if someone configures the node manually; nimbus does something similar. Since subscriptions are dynamic (e.g. one can unsubscribe from the RPC) we need some pruning. E.g., we are subscribed to a given content topic (that maps to gossipsub topic X). Using the RPC we unsubscribe from that content topic and no longer have any active subscription on topic X. In that case we need to prune relay connections to that node (see the sketch below).
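A minimal sketch of this filter-and-prune logic, in Go for illustration (all types and names here are hypothetical, not the actual nwaku or go-waku API):

```go
// Hypothetical sketch: keep only peers whose advertised ENR shards intersect
// the shards we are subscribed to, and treat peers with no shared shard as
// candidates for pruning relay connections.
package main

import "fmt"

// Peer models the information assumed to be recoverable from a peer's ENR.
type Peer struct {
	ID     string
	Shards map[uint16]bool // shards advertised in the peer's ENR (may be faked)
}

// sharesShard reports whether the peer advertises at least one shard we want.
func sharesShard(p Peer, wanted map[uint16]bool) bool {
	for s := range wanted {
		if p.Shards[s] {
			return true
		}
	}
	return false
}

// pruneCandidates returns the peers we would disconnect once a topic
// unsubscription leaves us with no shared shard with them.
func pruneCandidates(peers []Peer, wanted map[uint16]bool) []Peer {
	var prune []Peer
	for _, p := range peers {
		if !sharesShard(p, wanted) {
			prune = append(prune, p)
		}
	}
	return prune
}

func main() {
	peers := []Peer{
		{ID: "A", Shards: map[uint16]bool{1: true}},
		{ID: "B", Shards: map[uint16]bool{7: true}},
	}
	// After unsubscribing from the content topic mapping to shard 7,
	// only shard 1 remains interesting; peer B becomes a prune candidate.
	wanted := map[uint16]bool{1: true}
	fmt.Println(pruneCandidates(peers, wanted))
}
```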
I don't see any reason to change this for now. In any case, imho this falls out of scope for sharding, so we can reassess it in another milestone. @SionoiS Can we add this peer management task to this milestone? Without it, autosharding won't really be complete. Also, as discussed during the offsite, we have the concept of "altruistic sharding" (we need a better name). In other words, your node will be connected to the pubsub topics you are interested in (derived from the content topics you are subscribed to) BUT also to (1?) extra random(?) shard that will most likely keep changing (this is the altruistic shard, because you are not really interested in it). This is to ensure good coverage of the shards. Still to be designed; a rough sketch follows. Adding it here #1892. Perhaps it should be part of this sharding milestone, @SionoiS?
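For illustration only, since this is explicitly still to be designed: a hypothetical sketch of picking one extra random "altruistic" shard on top of the shards a node is interested in (the shard count of 8 is an assumption, not taken from the spec):

```go
// Hypothetical sketch of the "altruistic shard" idea: subscribe to the shards
// derived from our content topics, plus one extra randomly chosen shard for
// network coverage. The extra shard could be rotated periodically.
package main

import (
	"fmt"
	"math/rand"
)

const numShards = 8 // assumed shard count

// shardsWithAltruistic returns the shards we care about plus one random
// extra shard we relay on altruistically.
func shardsWithAltruistic(interested map[uint16]bool) map[uint16]bool {
	out := make(map[uint16]bool, len(interested)+1)
	for s := range interested {
		out[s] = true
	}
	if len(out) >= numShards {
		return out // already on every shard, nothing extra to add
	}
	// Pick a random shard we are not already subscribed to.
	for {
		s := uint16(rand.Intn(numShards))
		if !out[s] {
			out[s] = true
			return out
		}
	}
}

func main() {
	fmt.Println(shardsWithAltruistic(map[uint16]bool{2: true, 5: true}))
}
```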
@alrevuelta Yes, I agree. If we don't fix this, autosharding can silently fail, which is very bad UX.
Nice to have but not required IMO. Maybe add another autosharding milestone for updates like this, fixes, and a "shard space" redesign?
Not just pruning; we may also have to relax/increase the maximum number of connections or relay connections based on the number of shards a node is subscribed to, or rebalance connections across shards (see the sketch below).
That is fine; this can be looked into at a later stage.
As @SionoiS suggested, this can also be taken up as part of a later milestone rather than this one.
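One hypothetical way to relax the limits, assuming the GossipSub target of 4-12 full-message peers per topic scales linearly with the number of subscribed shards (the numbers are assumptions, not nwaku configuration):

```go
// Hypothetical sketch: scale the relay connection budget with the number of
// subscribed shards, since GossipSub targets roughly 4-12 full-message peers
// per topic.
package main

import "fmt"

const (
	meshLow  = 4  // assumed lower bound of full-message peers per topic
	meshHigh = 12 // assumed upper bound of full-message peers per topic
)

// relayConnLimits returns a (min, max) relay connection budget for a node
// subscribed to n shards.
func relayConnLimits(n int) (int, int) {
	return n * meshLow, n * meshHigh
}

func main() {
	lo, hi := relayConnLimits(3)
	fmt.Printf("3 shards -> between %d and %d relay connections\n", lo, hi)
}
```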
Weekly Update
At the GossipSub level each topic should have 4-12 full-message peers and many more metadata-only peers. I didn't see a way in the code to distinguish between peers based on topic or type (full-message/metadata); a sketch of that distinction follows. Having a hard connection limit seems like a bad idea in this context. Relay is built on top of GossipSub, and I don't think we can really modify libp2p. I don't see a good way forward yet. I'm still 🤔
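For reference, a rough sketch of the distinction (hypothetical structures; this loosely mirrors what a GossipSub router tracks per topic, not the nim-libp2p API):

```go
// Hypothetical sketch: full-message peers are those in the topic mesh and
// receive complete messages; metadata-only peers are subscribed to the topic
// but outside the mesh, so we only exchange IHAVE/IWANT gossip with them.
package main

import "fmt"

type PeerID string

// TopicState loosely mirrors per-topic state in a GossipSub router.
type TopicState struct {
	Mesh   map[PeerID]bool // full-message peers (in the mesh)
	Subbed map[PeerID]bool // all peers known to be subscribed to the topic
}

// metadataOnly lists peers we only exchange gossip metadata with.
func metadataOnly(t TopicState) []PeerID {
	var out []PeerID
	for p := range t.Subbed {
		if !t.Mesh[p] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	t := TopicState{
		Mesh:   map[PeerID]bool{"A": true},
		Subbed: map[PeerID]bool{"A": true, "B": true, "C": true},
	}
	fmt.Println(metadataOnly(t)) // B and C are metadata-only peers
}
```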
My mistake, I mentioned 6-12; agreed that it has to be 4-12 full-message peers. For now, it may be OK to apply connection limits for full-message peerings. We may have to run larger node simulations to see what issues we face with the current approach and arrive at a number for metadata peers. @alrevuelta Thoughts?
I don't think we need to modify libp2p, but rather apply our limits on top of it.
I talked with @alrevuelta and we can indeed modify nim-libp2p to get easier access to the data we need. I'm still not sure about having a limit. GossipSub manages 4-12 full-message peers per topic; it was designed this way because this is what works best. Maybe limiting only metadata peers makes more sense?
Ah, I keep forgetting we are maintaining nim-libp2p. In that case yes; for go-waku we will have to implement it outside libp2p.
True; then for what other reason do we have this relay connection limit check? We need to understand the reason behind it. We also need to check whether gossipsub tears down connections beyond the specified limit of full-message peers, and verify the behaviour for fan-out as well.
This would definitely constrain resource usage; otherwise a node may get flooded with connections and be unable to protect itself.
I removed both the live subscription and shard peer management tasks. These tasks will be part of another milestone or epic.
Weekly Update achieved: Complete! FILTER, LIGHTPUSH and RFC merged.
Planned start date: 28 Jun 2023
Due date: 18 August 2023
Summary
Design and implementation of autosharding. Autosharding is the automatic assignment of content topics to shards.
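As a rough illustration of the idea, a hash-mod sketch of mapping a content topic's application and version to a shard (the exact mapping and shard count are defined by the RFC this milestone references; the values here are assumptions):

```go
// Hypothetical sketch of automatic content-topic-to-shard assignment: hash the
// content topic's application name and version, then take the result modulo
// the shard count, so every node maps the same content topic to the same shard.
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

const shardsPerCluster = 8 // assumed shard count

// shardFor derives a deterministic shard index from a content topic's
// application name and version fields.
func shardFor(application, version string) uint16 {
	h := sha256.Sum256([]byte(application + "/" + version))
	// Use the last 8 bytes of the digest as a big-endian integer.
	v := binary.BigEndian.Uint64(h[24:32])
	return uint16(v % shardsPerCluster)
}

func main() {
	// e.g. content topic /myapp/1/chatmessage/proto -> application "myapp", version "1"
	fmt.Println(shardFor("myapp", "1"))
}
```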
Acceptance Criteria
Tasks
RAID (Risks, Assumptions, Issues and Dependencies)
Autosharding will affect various other protocols and may have unintended consequences.
#1846 (comment)