Lots of duplicate packets are being sent #23
Comments
Thanks, this is very detailed. I did not know Wireshark was aware of uTP; that's helpful. If this implementation backs off in the presence of TCP in its current form, that would be coincidental; I've put no effort into ensuring that. I'll take a more detailed look at your findings tomorrow.
Thanks. As an addition: I've noticed that the background traffic is less affected when fewer out-of-order packets are being buffered:

```go
// Maximum out-of-order packets to buffer.
maxUnackedInbound = 64
maxUnackedSends   = 64
```

The lower these numbers are, the better the background traffic performs. The performance of the uTP transfer itself does not seem to be affected. I have to investigate this further though.
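For illustration, here is a minimal sketch of how such a cap behaves on the receive side; the `receiver` type and `deliver` function are illustrative names of mine, not the library's:

```go
package sketch

const maxUnackedInbound = 64 // cap on buffered out-of-order packets

type receiver struct {
	nextSeq uint16
	inbound map[uint16][]byte // out-of-order packets keyed by sequence number
}

func (r *receiver) deliver(seq uint16, payload []byte) bool {
	if seq == r.nextSeq {
		r.nextSeq++ // in-order packet: accept it directly
		return true
	}
	if len(r.inbound) >= maxUnackedInbound {
		// Buffer full: drop the packet so the peer has to resend it
		// later; a lower cap sheds load back onto the sender sooner.
		return false
	}
	r.inbound[seq] = payload
	return true
}
```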
I did some further investigation and added some counters to see how many packets are received and how many of those are duplicates. I noticed that my hack of skipping `ackSkipped` for empty extensions does not seem to matter much. What does seem to influence the number of duplicates is my hack in `ackSkipped` that resends less often, together with changing the `maxUnackedInbound`/`maxUnackedSends` constants. The number of duplicates I'm seeing (without the hacks) is somewhere from 3% up to 10% of the total packets being sent. With my hacks the number drops to around 1%.
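Such counters can be as simple as remembering which sequence numbers were already delivered; a hypothetical sketch, not the actual instrumentation used above:

```go
package sketch

// dupCounter tracks received packets and repeats by sequence number.
type dupCounter struct {
	seen       map[uint16]struct{}
	received   int
	duplicates int
}

func (d *dupCounter) observe(seq uint16) {
	d.received++
	if _, ok := d.seen[seq]; ok {
		d.duplicates++ // already delivered once: this is a resend
		return
	}
	d.seen[seq] = struct{}{}
}

// rate reports duplicates as a fraction of all received packets.
func (d *dupCounter) rate() float64 {
	if d.received == 0 {
		return 0
	}
	return float64(d.duplicates) / float64(d.received)
}
```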
Another update. I do not think that either hack from the initial post fixes the root cause of this problem. My current theory is that the initial sending of the packets is flooding the network, and as a result packets are maybe being dropped in batches and resent. Sleeping before the initial write avoids the flood. The hack looks like this:

```go
func (c *Conn) Write(p []byte) (n int, err error) {
	time.Sleep(c.resendTimeout()) // sleep before writing the initial message
	mu.Lock()
	defer mu.Unlock()
	for len(p) != 0 {
		...
```
Fixing the extension bytes shouldn't affect resends at all; it should only potentially improve compatibility with other uTP implementations out there. But it was helpful that you spotted this, and I've fixed it.

The amount of resending should be directly affected by the timeouts and the allowed unacked sends. However, decreasing the allowed number of unacked sends will decrease throughput when there is high latency. I believe the spec limits unacked sent packets by adjusting its window size; this implementation doesn't do that. It might be very effective to block writes, and thereby further unacked sends, when there are sends that haven't been acked in a timely fashion, since we track the usual latency.

I would guess that the issues you're describing all result from your OS being overwhelmed by the number of sends and failing huge batches of consecutive sends, which results in "resend storms". What OS/distribution/arch are you on? What Go version? What kind of data rates are you achieving?
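A sketch of what blocking writes on outstanding unacked sends could look like, using a condition variable around an illustrative `unackedSent` counter; none of these names are the library's actual API:

```go
package sketch

import "sync"

// conn blocks writers once too many sends are outstanding, instead of
// letting resend timers deal with an ever-growing unacked backlog.
type conn struct {
	mu          sync.Mutex
	cond        *sync.Cond
	unackedSent int
	maxUnacked  int
}

func newConn(maxUnacked int) *conn {
	c := &conn{maxUnacked: maxUnacked}
	c.cond = sync.NewCond(&c.mu)
	return c
}

// waitForSendWindow blocks until there is room for one more unacked send.
func (c *conn) waitForSendWindow() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for c.unackedSent >= c.maxUnacked {
		c.cond.Wait() // woken by onAck
	}
	c.unackedSent++
}

// onAck frees a window slot and wakes one blocked writer.
func (c *conn) onAck() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.unackedSent--
	c.cond.Signal()
}
```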
I concur with your guess. I will add another post to this topic with some of my findings comparing master against the sleep hack.

I'm currently testing uTP within a network emulator which emulates a small wifi network (maximum rate of 11Mbit/s) with 3 nodes. The nodes all run Ubuntu 14.04 x86_64 with Go version 1.8. With the current master branch of this repo I'm currently achieving around the maximum transfer rate (11Mbit/s) in my network.
I've compared the current master with the sleep hack. I'm doing the following test in the 3-node network I mentioned above; the maximum data rate is 11Mbit/s. Initially I tested how long it would take to transfer a 25MB file from node 3 to node 2 using secure copy (scp). This is my benchmark for the comparison.

**current master**

Then I tried how long it would take to do a 100MB file transfer over uTP on the current master, first on its own and next with the scp transfer running in the background.

**sleep hack**

Then I tried the same using the sleep hack.

**The conclusion**

The congestion control works better with the hack, but the hack also slows down the transfer when there is no background traffic. There is something strange in the measurements: it seems that more non-duplicate packets are received using the current master.

**graphs**

I've used Wireshark to graph the throughput for both scenarios; for the graphs I transfer a 25MB file using both versions. The four graphs cover:

**current master without scp**

**current master with scp**

**sleep hack without scp**

**sleep hack with scp**
I've played around with the code trying to come up with a more elegant solution than the sleep hack. I pushed something to my fork of the repository that works very well in my configuration. It needs more testing, and the test cases currently seem to deadlock. With these changes I get transfer speeds that are near the speeds of the current master while it still backs off when there is background traffic. It also reduced the number of duplicate packets to almost zero for me. Code is here: https://github.com/eelcocramer/utp/tree/sleep-hack
I think my guess about your OS is probably right on. Your code has the effect of slowing down the rate of sends, avoiding large numbers of consecutive packets being dropped. It does it by holding the lock, which isn't ideal. I think a solution that identifies the true rate limit would be more effective.
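One way to pace sends without holding the lock is a token bucket; a sketch using `golang.org/x/time/rate`, where the rate figure is a placeholder that a real implementation would adapt from observed loss or latency:

```go
package sketch

import (
	"context"

	"golang.org/x/time/rate"
)

// sendLimiter meters packets through a token bucket so the sender never
// bursts far past an estimated rate limit: ~1000 packets/s, burst of 64.
var sendLimiter = rate.NewLimiter(rate.Limit(1000), 64)

// pacedSend waits for a token before sending; only this writer blocks,
// not the whole connection.
func pacedSend(ctx context.Context, send func() error) error {
	if err := sendLimiter.Wait(ctx); err != nil {
		return err
	}
	return send()
}
```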
I don't understand why the OS should be the reason. Shouldn't the protocol be more gentle on the available resources (bandwidth, CPU)? I agree that my changes are not ideal, but the uTP spec makes exactly this point in its paragraph about congestion control.
I was under the assumption that `resendTimeout` tracks the usual latency. Currently this timeout value is only used for resends. Also, the transfer times seem not to be affected too much, but I will do some more testing on this. Do you have benchmark tests that you use to measure the throughput? Thanks.
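For reference, a latency-tracked resend timeout in the style of RFC 6298 might look like the sketch below; BEP 29 uses a similar formula with a 500 ms floor, but this is not the library's actual implementation:

```go
package sketch

import "time"

// rttTracker keeps a smoothed round-trip time plus a variance term.
type rttTracker struct {
	srtt   time.Duration // smoothed RTT
	rttvar time.Duration // RTT variance
}

// sample folds one measured round trip into the running estimates.
func (t *rttTracker) sample(rtt time.Duration) {
	if t.srtt == 0 {
		t.srtt = rtt
		t.rttvar = rtt / 2
		return
	}
	delta := t.srtt - rtt
	if delta < 0 {
		delta = -delta
	}
	t.rttvar = (3*t.rttvar + delta) / 4 // rttvar = 3/4 rttvar + 1/4 |srtt - rtt|
	t.srtt = (7*t.srtt + rtt) / 8       // srtt = 7/8 srtt + 1/8 rtt
}

// resendTimeout derives a timeout from the estimates, floored at 500 ms.
func (t *rttTracker) resendTimeout() time.Duration {
	rto := t.srtt + 4*t.rttvar
	if rto < 500*time.Millisecond {
		rto = 500 * time.Millisecond
	}
	return rto
}
```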
According to these Syncthing benchmarks there is plenty of room for improvement. The current version gives me:

*(benchmark results)*

My fork gives me:

*(benchmark results)*
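For what it's worth, a throughput benchmark harness in Go could look like the sketch below; `net.Pipe` stands in for a pair of connected uTP conns, whose setup depends on the library's API:

```go
package sketch_test

import (
	"io"
	"net"
	"testing"
)

// BenchmarkThroughput measures write throughput over any net.Conn pair.
// Run with: go test -bench=Throughput
func BenchmarkThroughput(b *testing.B) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	go io.Copy(io.Discard, server) // drain everything the writer sends

	buf := make([]byte, 64<<10) // 64 KiB per write
	b.SetBytes(int64(len(buf))) // makes the tool report MB/s
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := client.Write(buf); err != nil {
			b.Fatal(err)
		}
	}
}
```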
Just a quick note: one of the reasons this is not used in Syncthing is because it hogs everything that's available. We are trying to incorporate KCP, but that has issues with getting STUN to work with it.
Thanks. I'm seeing the same behaviour using this library with IPFS; that's my main reason for investigating this.
Take a look at a new library I'm working on: https://github.com/anacrolix/go-libutp. |
Thanks. I noticed that repository last week. I will give it a try.
Bear in mind that if you require cross-compiling, this makes things very difficult due to cgo.
I am trying to hunt down an issue I have with this library where I notice that the uTP transfer does not back off when there is TCP background traffic.
Tracing this issue I noticed that an extensive number of packets are being resent. I noticed this was primarily due to this call to `resend`, which is caused by this call to `ackSkipped`.

Using Wireshark I noticed that lots of ACK packets do have the Selective ACKs extension enabled but do not have an extension bitmask. According to the specification the bitmask length should be at least 32 bits, in multiples of 32 bits. There are also ACK packets that do have this set correctly.
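A minimal sketch of validating the extension per BEP 29; the `extension` struct is illustrative, not the library's type:

```go
package sketch

// extension models a uTP header extension as BEP 29 lays it out.
type extension struct {
	typ   byte
	bytes []byte // the selective ACK bitmask
}

const extSelectiveAck = 1 // extension type 1 is Selective ACK in BEP 29

// validSelectiveAck reports whether the bitmask is at least 32 bits
// long, in multiples of 32 bits; an empty bitmask like the ones seen
// in the Wireshark captures above violates that.
func validSelectiveAck(ext extension) bool {
	if ext.typ != extSelectiveAck {
		return false
	}
	return len(ext.bytes) >= 4 && len(ext.bytes)%4 == 0
}
```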
Now as a quick hack I've tried not calling `ackSkipped` when there are no bytes in the extension, and I've also changed `ackSkipped` to not resend the packets that often. For me this is dramatically decreasing the number of duplicates that are being sent.
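The code of those two hacks is reproduced only in spirit below; the types, thresholds, and call sites are illustrative stand-ins for the library's internals:

```go
package sketch

// outgoingSend is a stand-in for the library's per-packet send state.
type outgoingSend struct {
	acksSkipped int
	resend      func()
}

// Hack 1: skip skipped-ack processing entirely when the Selective ACK
// extension carries no bitmask bytes.
func handleSelectiveAck(bitmask []byte, onSkipped func(seq uint16)) {
	if len(bitmask) == 0 {
		return // empty bitmask: nothing was actually selectively acked
	}
	// ... walk the bitmask and call onSkipped for each gap ...
}

// Hack 2: resend only after a packet has been reported skipped a few
// times, rather than on every report.
func ackSkipped(s *outgoingSend) {
	s.acksSkipped++
	if s.acksSkipped%3 == 0 { // resend less eagerly
		s.resend()
	}
}
```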
However it does not fix my issue of uTP traffic failing to throttle back when there is TCP background traffic.
Does this make sense?