Improve data channel (SCTP) performance #62
Comments
Preparation in progress. I am going to use this tool. My focus is on measuring the performance of the SCTP protocol itself (not the bufferedAmount control). |
The first pass of a benchmark on pion/[email protected] in comparison with TCP on various RTT (using Network Link Conditioner). Conditions:
Obviously:
|
Further investigation revealed:
I attempted to increase the receive buffer size from 64KB (current) up to 1MB, but throughput did not improve beyond a buffer size of 128KB.
Pion/sctp uses delayed Ack to reduce the number of Acks. The downside is that, because the Ack is delayed, the perceived RTT becomes larger, which reduces overall throughput. To mitigate that, it uses "Immediate Ack" to request that the receiver generate an Ack immediately. But I found a bug that prevented the immediate Ack from being sent frequently enough. Fixing this (fix-1) improved the speed of cwnd growth a lot.
My test environment involves a WiFi network, and I am seeing packet loss constantly; sometimes a large number of packets are lost (large enough that the fast-recovery algorithm could not recover). When the T3-Rtx timer fires, Pion/sctp tries to retransmit unacked packets, but this was very slow due to a bug. With that fixed (fix-2), it was able to recover from massive loss about as fast as TCP seems to.
With both fix-1 and fix-2, Pion/sctp was able to yield close to the theoretical throughput:
Now I am comfortable with increasing the SCTP's receive buffer to 1MB. |
Num of sacks has been slightly increased. Relates to #62
Both fix-1 and fix-2 have landed in pion/sctp v1.7.0.
|
@Sean-Der @enobufs Context:
Our problem before 2.1.6 was the following:
We hoped that this release might improve the situation; however, it turns out that the Browser -> Pion issue remains. I could start a new issue, but I believe it might be related to your performance improvements, and maybe we just have to tune some "knobs" to get the browsers on board as well. If you need more information about our use case, I would be very happy to tell you more about it. Kind Regards, |
Thanks @chrisprobst so much for the feedback! I personally have not looked closely at the interaction between pion and browser, just yet. I will definitely look into it as soon as I can. Let me ask you a few questions...
|
Thanks @enobufs for your quick response!
Thanks for your time and investigating! Best, |
Ahhh I see! I have some idea about what is going on. Pion's delayed ack might have a problem. Great information! Thank you! |
This gives us hope! It would be extremely helpful for us to rely on this direction (Browser->Pion). If you need help / further information, please ask! :-) |
Dear @enobufs, do you have an idea how much time and effort this would take? We are really interested in this bugfix and would like to help. Is there a good starting point, or is it a small issue better solved by a core developer? |
@chrisprobst I got some time today and will let you know my findings at the end of the day. (sorry, I was sick for the last few days... now I am feeling fine!) |
@enobufs Oh I wish you the very best! Your help is really appreciated! |
@chrisprobst I was distracted by another (unrelated) bug I had to fix, but I got my data channel between pion and chrome working and began writing test cases. My time on weekdays is very limited, but I'm working on it, hoping to get something for you by this weekend. |
oops - was a wrong button. |
@chrisprobst |
My findings:
RFC 4960 Sec. 6.2:
TODO:
|
😱 that is an amazing graph, that is so exciting!!! Amazing @enobufs I can't wait to see what people think |
@chrisprobst |
@enobufs What great work, I will test it out for sure! A short question, because I am not too familiar with pion's release management: I was using the latest 2.1.6. I saw that there is now a 2.1.8. Can I use it, and will it contain the patch? And one more time @enobufs, great work! We appreciate it so much. Of course, we appreciate the work of all committers ;-). |
@chrisprobst just tagged We just run the whole test suite with updated libs (to make sure nothing regresses) so takes a little time to go across :) excited to hear how the testing goes! |
Very nice, thanks for the release. |
Great job @enobufs ! |
@enobufs @Sean-Der We did a lot of testing, we are happy with the results. No more delay, this solves a huge pain for us, really! We used the work-around to only match pion devices, because pion-pion always worked superb. This restriction can now be removed and our initial production tests show that it works as expected. Thanks guys! |
That is amazing news, @enobufs is really a rockstar :D @chrisprobst if you ever have ANYTHING that can be better I would love to know. I am sure people are hitting issues, but just not telling us. I am ready to jump on anything I can :) |
All Pion need is love! |
@enobufs @Sean-Der Thank you very much, we appreciate this offer a lot. For us, Pion is an enabling technology: because Go is so easy to cross-compile for mobile, there are simply no other options for WebRTC cross-device development. FYI: Pion will soon be used by a very large radio station in South America, and we expect a lot of devices to use Pion to listen to the radio every day. We will keep you posted. Thanks again for your hard work and a good product. Of course, this goes to all other committers as well. |
@chrisprobst could you share any results of using the latest Pion versions in your use-case? I have a window to update the versions I use in production and would love to know how much better things perform. |
I think this is done. Closing. |
This is related to #62 in which we removed the use of immediate ack (I) bit. This test became unstable as we no longer use immediate ack. Relates to pion/webrtc#1270
Motivation
SCTP (datachannel) performance is perceived to be very low, particularly over a real network with latency and limited bandwidth. No one appears to have measured it properly. We should identify the underlying causes of the slowness through correct measurement, then tackle them to improve performance.