[WIP] Work towards improving the duplicate blocks issue #8
Conversation
Force-pushed from 287e22f to ff672d6
Alright, got a few tests (more definitely needed) that show various fetching scenarios pretty well. I've also got a 'first candidate solution' that improves things pretty nicely in some cases, but in other cases it increases latency significantly. Before we try to micro-optimize to fix that latency bump, I want to get more test scenarios that also have different latencies.
Fun things to try:
I'm also curious to try out the 'multi-block-bitswap' branch from @taylormike.
The first (naive) fix I've written here just splits up the wants amongst the peers in the 'active' set, which is the set of peers we've received data from in this session. This does a decent job of load balancing, but it would be better to track which peers are faster and weight requests to them more heavily. The code also currently sends wants to all peers until we get some peers into the 'active' set. This is problematic when we request a whole bunch of highly available data all at once, as all the wants will go out before we are able to filter anything. We can probably put a limit on the number of live wants a session can have before it gets any active-peer information. A rough sketch of the splitting idea is below.
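Not part of the patch, just a toy sketch of the round-robin splitting described above; `peerID` and `splitWants` are made-up names, and the real session tracks much more state:

```go
package main

import "fmt"

type peerID string

// splitWants hands each wanted CID to one active peer in round-robin order.
// If we have no active peers yet, it returns nothing and the caller falls
// back to broadcasting, which is the behavior flagged above as problematic
// for large batches of highly available data.
func splitWants(wants []string, active []peerID) map[peerID][]string {
	out := make(map[peerID][]string, len(active))
	if len(active) == 0 {
		return out
	}
	for i, c := range wants {
		p := active[i%len(active)]
		out[p] = append(out[p], c)
	}
	return out
}

func main() {
	wants := []string{"cid1", "cid2", "cid3", "cid4", "cid5"}
	fmt.Println(splitWants(wants, []peerID{"peerA", "peerB"}))
}
```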
Force-pushed from ff672d6 to 5bf791a
@Stebalien wanna take a look at the set of test scenarios here? I could use some help brainstorming more.
I'm now looking into what happens when the blocks we are fetching aren't clustered onto the same peers (i.e. if one peer has the first item, they don't necessarily have the next). This is currently a pretty slow use case: fetching 100 blocks randomly interspersed across a set of ten peers takes ~50 seconds. My current idea is to keep track of the number of blocks we receive and the number of 'broadcasts' we send out, and if that ratio is too low, bump up the number of peers we send each want to until the ratio gets better. A really simple approach brought the time for this scenario down from 50 seconds to 30 seconds, which is pretty nice. Next I'm gonna look into how this affects more optimal use cases (it shouldn't actually affect them at all, because the ratio should never drop that low). A sketch of the heuristic is below.
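Roughly what that heuristic looks like, as a standalone sketch; the names and thresholds are illustrative, not the values in the branch:

```go
// sessionStats tracks the received-blocks-to-broadcasts ratio described above.
type sessionStats struct {
	blocksReceived int
	broadcasts     int
	peersPerWant   int // how many peers each want is currently sent to
}

const (
	minRatio        = 0.5 // illustrative threshold
	maxPeersPerWant = 8
)

// adjust widens peersPerWant while the ratio is poor (data is scattered
// across peers) and narrows it again once blocks arrive reliably, to keep
// duplicate blocks down.
func (s *sessionStats) adjust() {
	if s.broadcasts == 0 {
		return
	}
	ratio := float64(s.blocksReceived) / float64(s.broadcasts)
	switch {
	case ratio < minRatio && s.peersPerWant < maxPeersPerWant:
		s.peersPerWant++
	case ratio > 2*minRatio && s.peersPerWant > 1:
		s.peersPerWant--
	}
}
```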
Discovering something mildly annoying... it seems the requests are falling back to broadcast even in cases where they really shouldn't (like when everyone has every block). Gonna look into this...
Meh, that was a bug in my new code. Still seeing an unfortunate number of broadcasts, but it's more reasonable now.
Force-pushed from 8e5d6a9 to e70588a
Couple of nits while familiarizing myself with the code.
```
@@ -47,6 +48,8 @@ type impl struct {

	// inbound messages from the network are forwarded to the receiver
	receiver Receiver

	stats NetworkStats
```
I believe this needs to go at the top of the struct as it needs to be 64-bit aligned:
> The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.
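That quote is from the sync/atomic documentation. A quick illustration of why the field order matters; the counter names here are examples, not necessarily the fields in this PR:

```go
package bsnet

import "sync/atomic"

// NetworkStats holds counters updated with sync/atomic. On 32-bit platforms,
// 64-bit atomic operations require 64-bit alignment, and only the first word
// of an allocated struct is guaranteed to be aligned, so this field should
// come first in the enclosing struct.
type NetworkStats struct {
	MessagesSent  uint64
	MessagesRecvd uint64
}

type impl struct {
	stats NetworkStats // first field: keeps the uint64s 64-bit aligned

	// receiver and the other fields from the diff above would follow here.
}

func (bs *impl) messageSent() {
	atomic.AddUint64(&bs.stats.MessagesSent, 1)
}
```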
The compiler should really catch that for us...
It should, but it doesn't...
```
@@ -153,6 +174,10 @@ func (s *Session) interestedIn(c *cid.Cid) bool {
const provSearchDelay = time.Second * 10

func (s *Session) addActivePeer(p peer.ID) {
	if s.liveWantsLimit == broadcastLiveWantsLimit {
		s.liveWantsLimit = targetedLiveWantsLimit
	}
```
Not a fan of using "constant" variables like this... how about a `wantBudget()` function (or something like that) that returns `len(s.liveWants) - ((len(s.activePeers) > 0) ? targetedLiveWantsLimit : broadcastLiveWantsLimit)`.
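For reference, roughly what such a helper could look like in Go (which has no ternary operator), with the subtraction ordered so the budget bottoms out at zero; it assumes the `Session` fields and constants shown in the diff above:

```go
func (s *Session) wantBudget() int {
	live := len(s.liveWants)
	var budget int
	if len(s.activePeers) > 0 {
		budget = targetedLiveWantsLimit - live
	} else {
		budget = broadcastLiveWantsLimit - live
	}
	if budget < 0 {
		budget = 0
	}
	return budget
}
```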
yeah, totally.
Summarizing our offline discussion:
Without remembering anything about the DAG structure, about the only solution I can think of is to somehow split peers into "active" and "inactive" groups. That is, if a peer frequently doesn't have blocks we're looking for, we drop them to the inactive group; if we start running low on active peers, we start pulling in inactive peers.

One way to do this is to simply sort peers by some frecency (frequency/recency) metric that tracks how reliably they deliver blocks. We can then ask for blocks from the top N (where N can float based on how many duplicate blocks we appear to be receiving). This should work fine for, e.g., sequential access. However, it won't work well with, e.g., …

Ideally, we'd use the DAG structure. Looking at this structure, I'd predict that a peer is more likely to have a node if we know they have either sibling or uncle nodes, and less likely if we know they don't have sibling nodes.
In some cases, we could probably also learn from cousins, but I'm not sure there's anything we can learn from cousins that we can't learn from uncles without assuming a balanced tree. So, without changing the interfaces/protocol, we can probably keep a small, in-memory, shared table (cache?) mapping cids we may want (children of nodes we've fetched) to peers that likely have them (not sure how we'll keep track of/update how likely they are to have them). Given this information, I'd:

Basically, we'd:
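A sketch of what that shared table could look like, with stand-in string types instead of `*cid.Cid`/`peer.ID` and without the eviction or confidence tracking the comment leaves as open questions:

```go
package providerhints

import "sync"

// hints maps CIDs we may want (children of nodes we've already fetched) to
// peers that are likely to have them, because they sent us the parent.
type hints struct {
	mu     sync.Mutex
	likely map[string][]string // cid -> peer IDs (stand-ins for real types)
}

func newHints() *hints {
	return &hints{likely: make(map[string][]string)}
}

// addChildren records peer p as a plausible source for the children of a
// block it just sent us.
func (h *hints) addChildren(p string, children []string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, c := range children {
		h.likely[c] = append(h.likely[c], p)
	}
}

// likelyProviders returns peers to try first for c; an empty result means we
// have no hint and fall back to the normal broadcast/provider-search path.
func (h *hints) likelyProviders(c string) []string {
	h.mu.Lock()
	defer h.mu.Unlock()
	return append([]string(nil), h.likely[c]...)
}
```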
```go
	})
	t.Run("10Nodes-OnePeerPerBlock-UnixfsFetch", func(t *testing.T) {
		subtestDistributeAndFetch(t, 10, 100, onePeerPerBlock, unixfsFileFetch)
	})
```
We should probably have a test that looks more like reading entire files out of a large directory structure. I wonder if we should try building an actual dag.
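Something along these lines, reusing the subtest helper from the diff above; `unixfsDirectoryFetch` is hypothetical (it doesn't exist in this branch) and would read files out of a nested directory DAG rather than a single file:

```go
t.Run("10Nodes-OnePeerPerBlock-UnixfsDirectoryFetch", func(t *testing.T) {
	// same distribution as above, but the fetch side walks a (hypothetical)
	// large directory DAG instead of one file
	subtestDistributeAndFetch(t, 10, 1000, onePeerPerBlock, unixfsDirectoryFetch)
})
```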
Here are the results so far:
Several of the tests are slower now, which is mildly disconcerting. But they have fewer duplicate blocks and lower message counts across the board, which is a win in my book.
@whyrusleeping we should test this against a gx install (leaving this here so we don't forget, I'll try to do this as soon as I get a non-mobile internet connection).
Not sure if this would work or not; ignore it if it's silly. Would sorting peers into latency groups and then altering the behavior for low-latency peers (~<20 ms) to a 'Do you have?' instead of a 'Send if you have!' strategy work? While this doesn't solve the problem, it might help mitigate it. A rough sketch of the grouping is below.
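The grouping part on its own is simple; here's a toy sketch (the 'Do you have?' message itself would need a protocol extension that doesn't exist yet, so it isn't shown, and how latency is measured is left abstract):

```go
package latencygroups

import "time"

const lowLatencyThreshold = 20 * time.Millisecond

// partitionByLatency splits peers into low- and high-latency groups; the
// latencyOf function is a placeholder for whatever latency metric the host
// already tracks (e.g. ping measurements).
func partitionByLatency(peers []string, latencyOf func(string) time.Duration) (low, high []string) {
	for _, p := range peers {
		if latencyOf(p) < lowLatencyThreshold {
			low = append(low, p)
		} else {
			high = append(high, p)
		}
	}
	return low, high
}
```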
@bonekill Yeah, the idea of having a 'do you have' protocol extension is interesting. There is also the question of 'will you send this to me?', which I think is the more important one. Note that this is only really important in the case where peers don't have the data. In the optimal case, with many peers who all have the data, there is no reason to add that extra overhead.
Also, back to the original idea you are working on: you could use the incoming requests as well to build up the probability of nodes having data.
Force-pushed from 08464d3 to 5aa8341
@bonekill Yeah, @Stebalien and I were chatting about having a generalized 'who might have X' subsystem that could be informed by many different inputs. I think that will help significantly moving forward. In any case, I'd like to get this PR merged roughly as is. @Stebalien do you think you could give me a hand?
Force-pushed from 7f5f2f4 to 84a7035
If this is the correct way to use a session, I've tried this patch when listing Wikipedia on IPFS and, unfortunately, it still appeared to get 1:1 duplicate blocks and have a 2:1 upload-to-download speed ratio. We should do some more thorough tests to figure out what's actually going on. @hannahhoward?
Still not seeing any improvement.
Yes. Besides updating …
Force-pushed from 84a7035 to 5d4e4bb
Blocked on passing the new tests. The answer is probably to just skip them or relax them, but it still needs to pass them.
So, I finally got this to work by just bubbling up the changes (hopefully correctly; I didn't get any dupes). Unfortunately, the results are not encouraging. While pinning Wikipedia:
As for bandwidth, I tested 4 variants:
(All of these start off slightly higher, except variant 2, which starts off significantly higher, and then drop down.) So, maybe I've just been testing this wrong, but at the very minimum we need decent benchmarks showing some improvement before merging this.
I tend to agree. I think the first priority is real-world benchmarks, so we have a more systematic way to test whether we're making a real-world difference.

On the question of merging: I am also skeptical of a large change that adds complexity with a seemingly negative effect on transfer speed. I think once we have real-world benchmarks, it's worth revisiting this PR to see if there's something we're missing, because maybe there is just a small change needed to unlock the improvements.

I'm going to start looking at real-world benchmark tests today. I'd recommend we either close this PR (it's still archived and can be re-opened later) or add a WIP label to indicate it needs further work. Also, the tests in this PR are valuable on their own (I think) and I wonder if they should be split off.
I'll add a WIP.
I agree. Are you volunteering 😄?
Yes, what I meant with my previous (not very clear) comment was that the broadcast system (through the code at lines 209 to 218 in a2ce7a2) …
How long that time is will depend on how fast new (requested) blocks arrive, and that in turn depends heavily on the broadcast requests, since the targeted requests to specific peers through the duplicate-block system stay at small absolute values of 3-10, while the broadcast scales with the number of peers, an order of magnitude (or more) larger than the duplicate-block requests. This is not reflected in the local tests because, with the fake 10 ms delay in the code at lines 164 to 171 in a2ce7a2, …
I think that more insightful statistics (beyond the useful duplicate-block count) will be needed to understand the interaction between the two request systems, beyond black-box testing of download/upload bandwidth.
@schomatis Yeah, we should try ramping up the baseTickDelay and provSearchDelay values. We should also set up a more 'real world' test case, and try to get some real numbers and reproducibility out of them.
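Illustrative only (the values are guesses, and baseTickDelay's real default isn't shown in this thread), but this is the kind of ramp-up meant here:

```go
package sessiontuning

import "time"

// Values one might try for an internet-like test run, instead of the
// local-test defaults (provSearchDelay is time.Second * 10 in the diff above).
const (
	provSearchDelay = 60 * time.Second
	baseTickDelay   = 2 * time.Second
)
```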
- add a delay generator that simulates real-world latencies one might encounter on the internet
- modify the virtual network to accept different latencies for different peers, based on using NextWaitTime on the passed delay
- modify the dup_blocks_test subtestDistributeAndFetch to accept a custom delay
- add a real-world benchmark that simulates the kinds of problems one might encounter bitswapping with a long-lived session and a large swarm of peers with real-world latency distributions (which causes #8 not to function well in practice)
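A standalone sketch of what "real world latencies" can mean here; this is not the delay generator added in the commit, just an illustration of per-message variable delay with an occasional long tail:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// internetLatency samples a one-way delay per message: a 20-150 ms base plus
// an occasional long-tail spike, loosely imitating mixed-distance links.
func internetLatency(rng *rand.Rand) time.Duration {
	base := 20*time.Millisecond + time.Duration(rng.Int63n(int64(130*time.Millisecond)))
	if rng.Float64() < 0.05 { // ~5% of messages take a slow path
		base += time.Duration(rng.Int63n(int64(400 * time.Millisecond)))
	}
	return base
}

func main() {
	rng := rand.New(rand.NewSource(1))
	for i := 0; i < 5; i++ {
		fmt.Println(internetLatency(rng))
	}
}
```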
Bitswap currently has a bad problem of getting duplicate blocks. That is, if we ask for X and more than one of our peers has X, we will probably get X from each of those peers, wasting bandwidth.
In this branch, we will attempt to fix the problem (or at least, improve it).
The first thing I've added here is a test that shows the problem. From there, we can try to get the duplication factor as low as possible.
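The metric in question can be stated very simply; something like this (a standalone helper, not the actual test code in the branch):

```go
// dupFactor is total blocks received divided by unique blocks needed;
// 1.0 means no wasted bandwidth, 2.0 means every block arrived twice.
func dupFactor(received []string) float64 {
	unique := make(map[string]struct{}, len(received))
	for _, c := range received {
		unique[c] = struct{}{}
	}
	if len(unique) == 0 {
		return 0
	}
	return float64(len(received)) / float64(len(unique))
}
```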