Backpressure in Gossipsub #4667
Comments
Hello. I'd love to take this on |
That would be awesome! I'll assign you :) |
#4756 implements solution (1), more specifically tail dropping in the `EnabledHandler`'s `send_queue`. @divagant-martian raised concerns with this approach. After syncing with @divagant-martian, @jxs and @thomaseizinger, here is a rough summary.

Concerns with tail drop implemented in #4756: tail dropping treats all messages the same, so control and publish messages are dropped just as readily as forward messages. (Above, a publish message is authored by the local node whereas a forward message is published by another node but forwarded by the local node.)

Alternative solutions: there are many solutions to fulfill these requirements. Above all, I would like to keep added complexity low. Smarter dropping in the `ConnectionHandler` is one of them.
|
Thanks for writing this up @mxinden ! I recorded a related issue in #4781 on how we would go about reducing memory usage when encoding protobuf messages. There is some overlap in the solutions. Most importantly, both require us to pass down different messages from the behaviour to the handler. I think a good starting point would be to do that first. We could then fan-out (no pun intended) the work and tackle message dropping and memory-efficient encoding separately. |
I agree with the current solution. I think we shouldn't be dropping some control messages like GRAFT, PRUNE, SUBSCRIBE, UNSUBSCRIBE for the reasons that have been listed.

I know that we want to keep the complexity simple here, but my initial thoughts were to time-bound the forward messages and maybe the publish messages. I think if there are forward messages hanging around for > 300ms it's probably not worth forwarding them anyway (and we should prioritize newer messages rather than trying to get old ones out the door, which are now likely duplicates/waste).

There is also probably an optimisation we could do, but this one will probably add too much complexity. When we forward messages, we check that we haven't seen this message from the peers we want to forward to (this happens in the behaviour), because if we send it to these peers it's a waste; it just gets registered as a duplicate. Often there is a race between us receiving the message from multiple peers and trying to forward that message on. If a node waits a bit in the validation, it often doesn't have to forward it on, because it receives it from all its mesh peers. If we have the situation where a forward message is sitting in the send queue for a while, chances are we have since received it from the very peers we were going to forward it to. One solution could be that in the handler we re-check, just before sending, whether the message is still worth forwarding. Anyway.

This has become a pretty high priority for us, so I'm happy to help out and contribute here, but don't want to step on anyone's toes. Is someone already planning on implementing the proposed soln by @mxinden ? |
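A minimal sketch of the time-bound idea above, with hypothetical names (`QueuedForward` and `next_fresh_forward` are not actual gossipsub types): each queued forward message remembers when it was queued, and anything older than a threshold such as the 300 ms mentioned above is discarded instead of sent.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical queued forward message: the payload plus the time it was queued.
struct QueuedForward {
    queued_at: Instant,
    bytes: Vec<u8>,
}

/// Pop the next forward message that is still worth sending, discarding stale ones.
fn next_fresh_forward(
    queue: &mut VecDeque<QueuedForward>,
    max_age: Duration,
) -> Option<Vec<u8>> {
    while let Some(msg) = queue.pop_front() {
        if msg.queued_at.elapsed() <= max_age {
            return Some(msg.bytes);
        }
        // Stale: by now the peer has most likely received this message from
        // someone else, so sending it would only register as a duplicate.
    }
    None
}

fn main() {
    let mut queue = VecDeque::new();
    queue.push_back(QueuedForward { queued_at: Instant::now(), bytes: vec![1, 2, 3] });
    let fresh = next_fresh_forward(&mut queue, Duration::from_millis(300));
    assert!(fresh.is_some());
}
```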
Great, thanks for weighing in @AgeManning ! Regardless of which optimisations we implement, I think it is safe to say that we will need to change the interface between gossipsub's `NetworkBehaviour` and `ConnectionHandler` to pass down more specific message types. That allows the `ConnectionHandler` to make its own decisions about which messages to drop or prioritise.
This should be easy to do once we have separate queues. I'd probably hold off the time-bound optimisation for now and see what effects the others have.
This also seems very reasonable and easy to implement because we just have to check local state.
I don't know of anybody actively working on this. There is a related effort to reduce memory allocations in gossipsub which will need a similar refactoring in terms of changing the interface between behaviour and handler. cc @joshuef for visibility here. Would it be safe to say that we should always prioritize control messages over publishes? Then we could do the following: use the regular channel between behaviour and handler for control messages (i.e. the one that exists today). Then, in a separate channel (per connection), send publish and forward messages with backpressure. |
Yeah nice. Others may want to weigh in here, but it's not clear-cut which messages should get prioritised. There is a control message called IHAVE, and I think that one can be prioritised lower than anything else. As we are not going to be this fine-grained (i.e. individual control messages), my gut feeling would be the following prioritisation:
However, if we split up the GRAFT/PRUNE from the IHAVE/IWANT, I think it would go like:
|
The goal here is to reduce the memory footprint. Subscribe and control messages are tiny in size (correct me if I am wrong). What would be the benefit of prioritizing publish messages over subscribe and control messages? Why not always put subscribe and control first? They don't take up much memory. Given their size, they don't delay either publish or forward messages significantly. |
My thinking was that the most important thing, the one that we want to reduce the latency the most on, is published messages. We want to send those with very high priority. I agree that control messages are tiny and if they get queued we can probably group them in the handler into a single rpc message.
The handler pulls from the queues in this order and groups them, where possible, into a single RPC. I think the most important thing we need to address is the dropping strategy. I think we probably just need to drop forward messages and maybe have some cap on the published messages. The forward messages are time-sensitive, so I still think we should have some time-based component to it. A time-LRU cache or something. |
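To make the "pull in this order and group into a single RPC" idea concrete, here is a rough sketch (the `Outbound` enum and plain `VecDeque` queues are illustrative, not the real gossipsub types): the handler drains a priority queue first and only then the forward queue, batching whatever it takes into one outgoing RPC.

```rust
use std::collections::VecDeque;

/// Hypothetical message kinds, roughly mirroring the kinds discussed above.
#[derive(Debug)]
enum Outbound {
    Publish(Vec<u8>),
    Control(&'static str),
    Forward(Vec<u8>),
}

/// Build one outgoing RPC by draining the priority queue first and only then
/// the non-priority (forward) queue, up to `max_per_rpc` messages.
fn next_rpc(
    priority: &mut VecDeque<Outbound>,
    non_priority: &mut VecDeque<Outbound>,
    max_per_rpc: usize,
) -> Vec<Outbound> {
    let mut rpc = Vec::new();
    while rpc.len() < max_per_rpc {
        if let Some(msg) = priority.pop_front() {
            rpc.push(msg);
        } else if let Some(msg) = non_priority.pop_front() {
            rpc.push(msg);
        } else {
            break;
        }
    }
    rpc
}

fn main() {
    let mut priority = VecDeque::from([Outbound::Publish(vec![1]), Outbound::Control("GRAFT")]);
    let mut non_priority = VecDeque::from([Outbound::Forward(vec![2])]);
    let rpc = next_rpc(&mut priority, &mut non_priority, 16);
    println!("{rpc:?}");
}
```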
After #4811, @thomaseizinger and I discussed how we could then move to implement what has been discussed in this issue. If everybody agrees, I plan on submitting a PR that adds a new unidirectional direct communication channel between the `NetworkBehaviour` and each `ConnectionHandler`. This communication channel could be something like a bounded queue. If the communication channel queue between the behaviour and a handler becomes full, the peer is considered too slow and is dropped together with its queue. This won't affect other communication protocols, as the peer will only be removed from the list of channels in the gossipsub behaviour. CC @AgeManning @thomaseizinger @mxinden @joshuef feel free to correct/add/address anything I might have missed/miswritten. |
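A rough sketch of what that per-connection bounded channel could look like, assuming something like `futures::channel::mpsc` (the actual channel type and capacity would be decided in the PR): the behaviour tries to enqueue a message and, if the queue is full, treats the peer as too slow.

```rust
use futures::channel::mpsc;

fn main() {
    // One bounded channel per connection: the behaviour holds the sender,
    // the connection handler holds the receiver (capacity is illustrative).
    let (mut to_handler, _from_behaviour) = mpsc::channel::<Vec<u8>>(16);

    let message = vec![0u8; 32];
    match to_handler.try_send(message) {
        Ok(()) => { /* handler will pick the message up in its poll loop */ }
        Err(e) if e.is_full() => {
            // Queue full: under the proposal above, the peer is considered too
            // slow and the behaviour would drop this peer and its queue.
        }
        Err(_) => { /* handler side was dropped, channel disconnected */ }
    }
}
```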
Thanks for the write-up @jxs!
I am not familiar enough with gossipsub to say this for sure, but my take would have been: have the handler close the gossipsub streams of a peer that is too slow to keep up. I think that is a sensible behaviour. If the remote node is genuinely slow, closing the gossipsub streams will free up resources on both ends. If no other protocol is active as a result, the keep-alive algorithm will close the connection. If the remote node is malicious and purposely doesn't read from the stream, we'll close it eventually and refuse to speak to it going forward (on this connection). Handler and behaviour run in different tasks, meaning the only reason why the handler would not be reading from the channel is because the stream is too slow. The idea @jxs and myself were exploring is to directly hook up the channel to the outbound stream: the handler simply moves messages from the channel onto the stream. The channel already acts as a buffer, so there isn't any need to have another buffer within the handler. |
A couple of questions/notes about this:
|
Great question! I think the benefit is that we end up sending the most important messages first, meaning if it does fill up, the messages being dropped should only be unimportant ones. |
I don't think a priority queue, e.g. https://github.com/rmcgibbo/async-priority-channel, is the simplest solution to the problem at hand, namely to prevent the `send_queue` from growing unboundedly. Say that the priority queue is full. It has a limit of 10 messages. 5 of these messages are forward messages. Say that the local node wants to send a publish message. In such a case the forward messages would prevent the publish message from entering the queue to the `ConnectionHandler`. Instead I suggest the following: have two channels from the `NetworkBehaviour` to the `ConnectionHandler`:

- Control and publish message channel (aka. priority channel)
- Forward message channel (aka. non-priority channel)

Regarding time-sensitive forward messages: we can add a timer-based mechanism that drops forward messages which have been queued for too long. I recommend doing this as a follow-up. If I understand correctly, the most pressing issue is to bound the `send_queue`. |
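A sketch of the two-channel approach, under the assumption that something like `futures::channel::mpsc` is used (names such as `HandlerSender` are illustrative, not the actual gossipsub types): control and publish messages go through the priority channel and a full queue is surfaced to the caller, while forward messages go through the non-priority channel and are silently dropped when it is full.

```rust
use futures::channel::mpsc;

/// Hypothetical pair of senders held by the behaviour for one connection.
struct HandlerSender {
    priority: mpsc::Sender<Vec<u8>>,     // control + publish messages
    non_priority: mpsc::Sender<Vec<u8>>, // forward messages
}

impl HandlerSender {
    /// Publish and control messages must not be lost silently:
    /// surface a full priority queue to the caller.
    fn send_priority(&mut self, msg: Vec<u8>) -> Result<(), &'static str> {
        self.priority.try_send(msg).map_err(|_| "priority queue full")
    }

    /// Forward messages are best-effort: if the non-priority queue is full,
    /// the message is simply dropped.
    fn send_forward(&mut self, msg: Vec<u8>) {
        let _ = self.non_priority.try_send(msg);
    }
}

fn main() {
    let (priority, _rx_priority) = mpsc::channel(16);
    let (non_priority, _rx_non_priority) = mpsc::channel(16);
    let mut sender = HandlerSender { priority, non_priority };
    sender.send_forward(vec![1, 2, 3]);
    let _ = sender.send_priority(vec![4, 5, 6]);
}
```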
Thanks for the comments @divagant-martian @mxinden! I now see that the priority-channel was a detour 😅 The solution sketched above looks good, thank you @mxinden ! |
Hi Max, and thanks for the input!
Yeah, but that will also happen if we use two channels. If the peer doesn't make progress reading bytes off its stream, the channels to its handler will fill up all the same.
we can also achieve this with the priority channel
yeah also makes sense to me, I take it you agree that we also drop the peer and its queue in that case right? |
Two channels guarantee that forward messages don't take away capacity from publish messages. E.g. in the case of a single (priority) queue with a capacity of 10, one might have 10 forward messages, where an additional publish message would need to be dropped. In the case of two channels, each with a capacity of 5, forward messages cannot starve publish messages.
But with a single priority channel, forward messages might still starve publish messages. Say that the queue contains 10 forward messages; even though a new publish message has higher priority, it cannot enter the already-full queue.
I am undecided here and thus I suggest not dropping the peer as part of a first iteration of this GitHub issue. Networks might want to still support slow peers, as long as they don't exhaust the local node's memory (e.g. through filling up the `send_queue`). Under the assumption that the Gossipsub scoring mechanism is functional (which I think is not a safe assumption to make), I would expect slow peers to eventually be scored low and thus disconnected. Under the assumption that the Gossipsub scoring mechanism is not functional, I suggest simply emitting an event, leaving it up to the user whether to disconnect or not. |
Meanwhile, while trying to implement the channels solution, the smoke tests fail with a timeout. Update: my bad, I wasn't registering the channel to be woken on a new message, as I was not polling the receiver inside the handler's `poll`. I also discovered what looks like a separate issue along the way. |
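For context on the waker issue mentioned above, a minimal sketch (not the actual handler code): polling the channel's receiver with the task's `Context` is what registers the waker, so the handler is woken when the behaviour pushes a new message; receiving without polling registers nothing and the task can stall.

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::channel::mpsc;
use futures::Stream;

/// Inside something like the handler's `poll`: polling the receiver registers
/// the task's waker, so the handler is woken when a new message arrives.
fn poll_inbox(
    rx: &mut mpsc::Receiver<Vec<u8>>,
    cx: &mut Context<'_>,
) -> Poll<Option<Vec<u8>>> {
    Pin::new(rx).poll_next(cx)
}

fn main() {
    let (mut tx, mut rx) = mpsc::channel::<Vec<u8>>(8);
    tx.try_send(vec![42u8]).unwrap();
    let msg = futures::executor::block_on(futures::future::poll_fn(|cx| poll_inbox(&mut rx, cx)));
    assert_eq!(msg, Some(vec![42u8]));
}
```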
That sounds like a bug in either
Note that |
I just wanted to throw into the mix the possibility that it could be our node that is slow. I think so far the assumption has been slow peers. |
Right. So all things considered, it appears that we can solve the OOM issue by using two bounded channels. Initially, we can leave out the optimisation of dropping old forward messages. That can be added with a timer later. Initially, we can also not drop peers if the channel is full but simply return an error. I think that is a basic but coherent solution. Perhaps we can add some metrics as well to track the sizes of these channels. We could even go with a single channel to start with and only introduce the 2nd channel once we drop old forward messages. |
Previously, the `NetworkBehaviour` constructed a `proto::RPC` type and sent that into an unbounded queue to the `ConnectionHandler`. With such a broad type, the `ConnectionHandler` cannot do any prioritization on which messages to drop if a connection slows down. To enable more fine-grained handling of messages, we make the interface between `NetworkBehaviour` and `ConnectionHandler` more specific by introducing an `RpcOut` type that differentiates between `Publish`, `Forward`, `Subscribe`, etc. Related: #4667. Pull-Request: #4811.
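For illustration, a rough sketch of what such an `RpcOut` type might look like (the variants and payloads here are simplified guesses; #4811 has the actual definition):

```rust
/// Rough shape of the message type sent from the behaviour to the handler.
#[allow(dead_code)]
enum RpcOut {
    Publish(Vec<u8>), // authored by the local node
    Forward(Vec<u8>), // authored elsewhere, forwarded by the local node
    Subscribe(String),
    Unsubscribe(String),
    Graft(String),
    Prune(String),
}

/// With a message-level type like this, the handler can decide per variant
/// whether a message may be dropped when the connection slows down.
fn may_drop(msg: &RpcOut) -> bool {
    matches!(msg, RpcOut::Forward(_))
}

fn main() {
    assert!(may_drop(&RpcOut::Forward(vec![1, 2, 3])));
    assert!(!may_drop(&RpcOut::Graft("topic".to_string())));
}
```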
Is this issue considered done with #5595 merged? |
cc @jxs |
yup, thanks for raising this! #5595 addressed this |
Description
See #3078 for tracking issue on backpressure in rust-libp2p in general.
Today `EnabledHandler::send_queue` can grow unboundedly when the Gossipsub `NetworkBehaviour` sends messages faster than the Gossipsub `EnabledHandler` can process them.

rust-libp2p/protocols/gossipsub/src/handler.rs, lines 99 to 100 in ecdd0ff
I see two solutions:

1. Drop messages in `EnabledHandler` once `send_queue` reaches a certain size.
2. Introduce backpressure between `NetworkBehaviour` and `ConnectionHandler` and only forward from `NetworkBehaviour` to `ConnectionHandler` when the latter can handle another message. I.e. drop messages for a specific peer in `NetworkBehaviour` instead of, as suggested in (1), in `ConnectionHandler`.
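As an illustration of solution (1), a minimal sketch (the constant and function names are made up, not the actual implementation): the handler refuses new messages once `send_queue` has reached a configured cap.

```rust
use std::collections::VecDeque;

/// Illustrative cap; a real limit would be a configuration option.
const MAX_SEND_QUEUE_LEN: usize = 5_000;

/// Solution (1) in a nutshell: instead of letting the queue grow without bound,
/// tail-drop new messages once the queue has reached the cap.
fn enqueue(send_queue: &mut VecDeque<Vec<u8>>, msg: Vec<u8>) -> bool {
    if send_queue.len() >= MAX_SEND_QUEUE_LEN {
        return false; // message dropped
    }
    send_queue.push_back(msg);
    true
}

fn main() {
    let mut send_queue = VecDeque::new();
    assert!(enqueue(&mut send_queue, vec![0u8; 16]));
}
```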
Motivation
See #3078. Prevents past failures like #4572.
Requirements
Bound `send_queue`, i.e. make it impossible for it to grow unboundedly.
Open questions
No response
Are you planning to do it yourself in a pull request?
No