Graphsync Simplification Idea: one libp2p stream per request, no message queues (except for retries) #259
Labels
effort/weeks
Estimated to take multiple weeks
kind/discussion
Topical discussion; usually not changes to codebase
What
The message protocol format of go-graphsync combines multiple requests and responses into a single message on a single stream.
Perhaps it is time to reexamine the protocol and the rationale for combining requests and responses, and for having blocks live as a separate element from responses.
Why was it this way
Design rationale in practice
In Filecoin, data is almost always streamed to different blockstores (CAR files). This means that, ultimately, we almost never need to deduplicate across requests. In fact, we have a whole extension to graphsync just to prevent this.
Consequences
Packaging multiple requests and responses into a single message stream is the source of much goroutine complexity: as you execute a response in one thread, you have to send the data over to a per-peer message queue thread that is packaging up responses to go over the network. Meanwhile, on the other side, matching up responses and blocks with their associated requests is more complex than it needs to be.
Moreover, because message sending uses one queue per peer, our allocator backpressure system is per peer as well, when it might make more sense for it to be per request.
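To make the backpressure point concrete, here is a minimal sketch in Go. The `Budget` type and its methods are hypothetical and are not go-graphsync's actual allocator API; the sketch only illustrates one possible consequence of scoping: with a budget shared per peer, one request that fills it leaves nothing for a second request to the same peer, while per-request budgets keep the two independent.

```go
package main

import (
	"fmt"
	"sync"
)

// Budget is a hypothetical byte budget used for backpressure.
// It is an illustration only, not go-graphsync's allocator.
type Budget struct {
	mu   sync.Mutex
	free uint64
}

// NewBudget returns a budget with total bytes available.
func NewBudget(total uint64) *Budget {
	return &Budget{free: total}
}

// TryReserve claims n bytes if they are available and reports success.
func (b *Budget) TryReserve(n uint64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.free < n {
		return false
	}
	b.free -= n
	return true
}

// Release returns n bytes to the budget, e.g. once they are on the wire.
func (b *Budget) Release(n uint64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.free += n
}

func main() {
	// Per peer: requests A and B to the same peer share one budget.
	perPeer := NewBudget(1024)
	fmt.Println("A reserves:", perPeer.TryReserve(1024))
	fmt.Println("B reserves:", perPeer.TryReserve(1)) // B has to wait on A

	// Per request: each request owns its budget.
	reqA, reqB := NewBudget(1024), NewBudget(1024)
	fmt.Println("A reserves:", reqA.TryReserve(1024))
	fmt.Println("B reserves:", reqB.TryReserve(1)) // B is unaffected by A
}
```

Whether per-request budgets are the right trade-off would still depend on how the total memory per peer is capped, which this sketch does not address.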
The need to handle multiple requests and responses and match them up to blocks has been an ongoing source of difficulty for, and complaints from, implementors of graphsync in other languages.
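As a rough illustration of the alternative, the following Go sketch simulates one stream per request with `io.Pipe` standing in for a libp2p stream. The length-prefixed framing and the `writeBlocks`, `readBlocks`, and `fetch` helpers are assumptions made for this example, not the graphsync wire format. The point is only that when a stream carries exactly one request, every block read from it belongs to that request, so no request IDs or block-to-response matching are needed.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

// writeBlocks sends each block as a 4-byte big-endian length followed by
// its bytes, then closes the stream to signal the end of the response.
// (Hypothetical framing, chosen for this sketch.)
func writeBlocks(w io.WriteCloser, blocks [][]byte) error {
	defer w.Close()
	for _, blk := range blocks {
		var hdr [4]byte
		binary.BigEndian.PutUint32(hdr[:], uint32(len(blk)))
		if _, err := w.Write(hdr[:]); err != nil {
			return err
		}
		if _, err := w.Write(blk); err != nil {
			return err
		}
	}
	return nil
}

// readBlocks reads framed blocks until the stream is closed cleanly.
func readBlocks(r io.Reader) ([][]byte, error) {
	var out [][]byte
	for {
		var hdr [4]byte
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if err == io.EOF {
				return out, nil // responder closed the stream: request done
			}
			return nil, err
		}
		blk := make([]byte, binary.BigEndian.Uint32(hdr[:]))
		if _, err := io.ReadFull(r, blk); err != nil {
			return nil, err
		}
		out = append(out, blk)
	}
}

// fetch simulates a single request: it opens a dedicated stream, lets the
// responder write blocks straight to it, and reads them back in order.
func fetch(blocks [][]byte) ([][]byte, error) {
	r, w := io.Pipe()
	defer r.Close() // unblocks the writer if reading fails early
	go writeBlocks(w, blocks)
	return readBlocks(r)
}

func main() {
	got, err := fetch([][]byte{[]byte("root"), []byte("leaf")})
	if err != nil {
		panic(err)
	}
	for _, blk := range got {
		fmt.Println(string(blk))
	}
}
```

In this shape the responder writes directly to the stream from the goroutine executing the response, so there is no per-peer queue in between, and the stream's own flow control provides backpressure for that request.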
Why leave it as it is
It remains to be seen whether multiple requests/responses per message and the associated deduplication are more important for the IPFS use case, where data tends to land in a single blockstore.