
[WebRTC] investigate data-channels-flow-control example throughput performance issue #101

Open · Tracked by #1
rainliu opened this issue Sep 25, 2021 · 14 comments
Labels: benchmark (benchmark the performance / performance improvement), subcrate:data (for issues specific to the data crate)

Comments

rainliu (Member) commented Sep 25, 2021

Pion reaches more than 500 Mbps:

Peer Connection State has changed: connected (offerer)
Peer Connection State has changed: connected (answerer)
2021/09/25 13:27:29 OnOpen: data-824638619994. Start sending a series of 1024-byte packets as fast as it can
2021/09/25 13:27:29 OnOpen: data-824636958938. Start receiving data
2021/09/25 13:27:30 Throughput: 570.646 Mbps
2021/09/25 13:27:31 Throughput: 569.753 Mbps
2021/09/25 13:27:32 Throughput: 573.001 Mbps
2021/09/25 13:27:33 Throughput: 572.452 Mbps
2021/09/25 13:27:34 Throughput: 571.297 Mbps
2021/09/25 13:27:35 Throughput: 569.525 Mbps
2021/09/25 13:27:36 Throughput: 567.463 Mbps
...

but webrtc-rs only reaches around 13 Mbps:

Peer Connection State has changed: connected (offerer)
Peer Connection State has changed: connected (answerer)
OnOpen: data-1. Start sending a series of 1024-byte packets as fast as it can
OnOpen: data-1. Start receiving data
Throughput: 12.990 Mbps
Throughput: 13.698 Mbps
Throughput: 13.559 Mbps
Throughput: 13.345 Mbps
Throughput: 13.565 Mbps
Throughput: 13.582 Mbps
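
For reference, the Throughput line in these logs is, roughly, bytes received per second converted to Mbps. A minimal sketch of that kind of measurement (the shared counter, packet size, and reporting task below are illustrative, not the example's actual code):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;

// Report the receive rate once per second from a shared byte counter.
async fn report_throughput(total_bytes: Arc<AtomicUsize>) {
    let mut last = 0usize;
    let mut ticker = tokio::time::interval(Duration::from_secs(1));
    loop {
        ticker.tick().await;
        let now = total_bytes.load(Ordering::Relaxed);
        println!("Throughput: {:.3} Mbps", (now - last) as f64 * 8.0 / 1_000_000.0);
        last = now;
    }
}

#[tokio::main]
async fn main() {
    let total = Arc::new(AtomicUsize::new(0));

    // Stand-in for the receive path: in the real example, the data channel's
    // message handler would add the size of each received payload here.
    let rx_counter = total.clone();
    tokio::spawn(async move {
        loop {
            rx_counter.fetch_add(1024, Ordering::Relaxed);
            tokio::task::yield_now().await;
        }
    });

    report_throughput(total).await;
}
```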
rainliu (Member, Author) commented Sep 25, 2021

Building with cargo build --release --example data-channels-flow-control increases performance, but it is still not comparable to Pion.

./target/release/examples/data-channels-flow-control
Press ctrl-c to stop
Peer Connection State has changed: connected (offerer)
Peer Connection State has changed: connected (answerer)
OnOpen: data-1. Start sending a series of 1024-byte packets as fast as it can
OnOpen: data-1. Start receiving data
Throughput: 175.556 Mbps
Throughput: 106.104 Mbps
Throughput: 76.986 Mbps
Throughput: 61.450 Mbps
Throughput: 51.632 Mbps
Throughput: 44.797 Mbps
Throughput: 39.733 Mbps
Throughput: 35.619 Mbps
Throughput: 32.330 Mbps
Throughput: 29.491 Mbps
Throughput: 43.142 Mbps
Throughput: 48.350 Mbps
Throughput: 46.386 Mbps
Throughput: 44.221 Mbps
Throughput: 48.071 Mbps
Throughput: 55.550 Mbps
Throughput: 53.980 Mbps
Throughput: 52.263 Mbps

rainliu added the benchmark / performance labels on Sep 27, 2021
whans commented Oct 5, 2021

This is probably a Tokio performance limit.

Here are some other benchmarks:

goroutines: 3.22234675s total, 3.222346ms avg per iteration
rust_threads: 16.980509645s total, 16.980509ms avg per iteration
rust_tokio: 9.56997204s total, 9.569972ms avg per iteration
rust_tokio_block_in_place: 3.578928749s total, 3.578928ms avg per iteration

https://www.reddit.com/r/rust/comments/lg0a7b/benchmarking_tokio_tasks_and_goroutines/
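
For context, the linked post benchmarks simple file I/O per iteration (reading /dev/urandom and writing /dev/null, as pointed out further down this thread) from goroutines, plain Rust threads, and Tokio tasks. A rough sketch of the shape of its rust_tokio variant, with an illustrative buffer size and iteration count (not the post's exact code):

```rust
use std::time::Instant;
use tokio::fs::File;
use tokio::io::{AsyncReadExt, AsyncWriteExt};

// One iteration: read a chunk from /dev/urandom and write it to /dev/null.
async fn iteration(buf: &mut [u8]) {
    let mut src = File::open("/dev/urandom").await.unwrap();
    let mut dst = File::create("/dev/null").await.unwrap();
    src.read_exact(buf).await.unwrap();
    dst.write_all(buf).await.unwrap();
}

#[tokio::main]
async fn main() {
    let iters: u32 = 1_000;
    let mut buf = vec![0u8; 64 * 1024];

    let start = Instant::now();
    for _ in 0..iters {
        iteration(&mut buf).await;
    }
    let total = start.elapsed();
    println!("rust_tokio: {:?} total, {:?} avg per iteration", total, total / iters);
}
```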

vitdevelop (Contributor):

@rainliu I ran some long-running benchmarks comparing Go (Pion) and Rust; maybe this will help.


Go (Pion)
From the start of the benchmark, throughput grew up to 844 Mbps:

17:49:06 Throughput: 721.371 Mbps
17:49:07 Throughput: 727.991 Mbps
17:49:08 Throughput: 743.665 Mbps
...
17:49:39 Throughput: 842.728 Mbps
17:49:40 Throughput: 843.339 Mbps
17:49:41 Throughput: 843.672 Mbps
17:49:42 Throughput: 844.272 Mbps
17:49:43 Throughput: 844.782 Mbps
17:49:44 Throughput: 844.855 Mbps

after that, an error was thrown:
mux ERROR: 17:49:45 mux: ending readLoop dispatch error packetio.Buffer is full, discarding write

and throughput kept slowing down without settling at a stable point.
I stopped the benchmark at:

18:44:45 Throughput: 9.966 Mbps

Rust
From the start of the benchmark, throughput was:

Throughput: 229.521 Mbps
Throughput: 231.489 Mbps
Throughput: 231.780 Mbps
Throughput: 231.662 Mbps
Throughput: 231.965 Mbps

after that, it started to drop and reached its lowest point:

Throughput: 23.023 Mbps
Throughput: 22.849 Mbps
Throughput: 22.677 Mbps

then it slowly recovered, oscillating between 41.511 Mbps and 66.436 Mbps.


CPU/RAM

Go

RAM 22.304 MB
CPU ~1%

Rust

RAM 162 MB (at the point I stopped; memory was not being released)
CPU ~120%

vitdevelop (Contributor) commented Oct 5, 2021

> should be tokio performance limit
>
> some other benchmark
>
> goroutines: 3.22234675s total, 3.222346ms avg per iteration
> rust_threads: 16.980509645s total, 16.980509ms avg per iteration
> rust_tokio: 9.56997204s total, 9.569972ms avg per iteration
> rust_tokio_block_in_place: 3.578928749s total, 3.578928ms avg per iteration
>
> https://www.reddit.com/r/rust/comments/lg0a7b/benchmarking_tokio_tasks_and_goroutines/

@whans Those benchmarks were made on file I/O, not sockets, and Linux behaves differently for files and sockets.
If you run std's blocking file I/O inside an async block or task::block_in_place, you will get very fast numbers, because Linux uses read-ahead for files.
Also, the benchmarks were run against /dev/urandom and /dev/null, which are in-memory devices.

@rainliu I assume the example here uses a socket connection.
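
A minimal sketch of the block_in_place pattern described above, assuming blocking std file I/O against the same in-memory devices (requires Tokio's multi-threaded runtime; paths and sizes are illustrative):

```rust
use std::fs::File;
use std::io::{Read, Write};

#[tokio::main]
async fn main() {
    let mut buf = vec![0u8; 64 * 1024];

    // block_in_place runs blocking std I/O on the current worker thread
    // without stalling the rest of the runtime. Against in-memory devices
    // like /dev/urandom and /dev/null this looks extremely fast, which is
    // why such numbers say little about socket-bound workloads.
    tokio::task::block_in_place(|| {
        let mut src = File::open("/dev/urandom").unwrap();
        let mut dst = File::create("/dev/null").unwrap();
        src.read_exact(&mut buf).unwrap();
        dst.write_all(&buf).unwrap();
    });
}
```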

whans commented Oct 5, 2021

@vitdevelop thanks.
I compared std::net::UdpSocket vs tokio::net::UdpSocket:
std::net::UdpSocket is almost twice as fast as tokio::net::UdpSocket.
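
A rough sketch of the kind of comparison being described: a connected send/recv loop over loopback, once with std and once with Tokio (addresses, payload size, and iteration count are arbitrary; this is not a rigorous benchmark):

```rust
use std::time::Instant;

const ITERS: u32 = 100_000;
const PAYLOAD: &[u8] = &[0u8; 1024];

// Blocking loopback ping: send from a connected socket, receive on the peer.
fn bench_std() {
    let a = std::net::UdpSocket::bind("127.0.0.1:0").unwrap();
    let b = std::net::UdpSocket::bind("127.0.0.1:0").unwrap();
    a.connect(b.local_addr().unwrap()).unwrap();
    let mut buf = [0u8; 2048];
    let start = Instant::now();
    for _ in 0..ITERS {
        a.send(PAYLOAD).unwrap();
        b.recv_from(&mut buf).unwrap();
    }
    println!("std::net::UdpSocket:   {:?}", start.elapsed());
}

// The same loop through the async sockets on the Tokio runtime.
async fn bench_tokio() {
    let a = tokio::net::UdpSocket::bind("127.0.0.1:0").await.unwrap();
    let b = tokio::net::UdpSocket::bind("127.0.0.1:0").await.unwrap();
    a.connect(b.local_addr().unwrap()).await.unwrap();
    let mut buf = [0u8; 2048];
    let start = Instant::now();
    for _ in 0..ITERS {
        a.send(PAYLOAD).await.unwrap();
        b.recv_from(&mut buf).await.unwrap();
    }
    println!("tokio::net::UdpSocket: {:?}", start.elapsed());
}

#[tokio::main]
async fn main() {
    tokio::task::block_in_place(bench_std);
    bench_tokio().await;
}
```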

rainliu (Member, Author) commented Oct 6, 2021

Thanks @vitdevelop and @whans for the benchmarking.

Looks like we need some effort to profile the hotspots/bottlenecks.

whans commented Oct 6, 2021

Add tokio-console to check for scheduling issues:

https://github.com/tokio-rs/console
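
For anyone wanting to try it, the documented setup is roughly: initialize console-subscriber at startup and build with the tokio_unstable cfg so Tokio emits task instrumentation (crate versions below are illustrative):

```rust
// Cargo.toml (illustrative):
//   console-subscriber = "0.1"
//   tokio = { version = "1", features = ["full", "tracing"] }
//
// Build with the unstable cfg so Tokio emits task instrumentation:
//   RUSTFLAGS="--cfg tokio_unstable" cargo run --release

#[tokio::main]
async fn main() {
    // Starts the console_subscriber endpoint (default 127.0.0.1:6669)
    // that the `tokio-console` CLI connects to.
    console_subscriber::init();

    // ... rest of the example (peer connections, data channels, etc.) ...
}
```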

vitdevelop (Contributor):

> add tokio console to check the schedule issue
>
> https://github.com/tokio-rs/console

@whans thanks for tokio-console, awesome tool

I tried to check tasks' busy/idle times with tokio-console, but it would not connect to console_subscriber.
I then figured out that I can connect before the offer/answer exchange, so I added some tokio::time::sleep points to get a picture;
once the offer/answer exchange started executing, the console hung. The sleep points were:

  • 10 sec before create_offerer
  • 3 sec before create_answerer
  • 3 sec before create_offer and set_remote_description
  • 3 sec before create_answer and set_remote_description

Here is the last tokio-console data:
(screenshot: webrtc_rs_tokio_console)

whans commented Oct 7, 2021

@vitdevelop
You need to slow down the packet sending rate: add a sleep in the sending task.
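
A sketch of that kind of throttling in a send loop; send_packet here is a hypothetical stand-in for the example's actual data-channel write, and the pacing interval is arbitrary:

```rust
use std::time::Duration;

// Hypothetical placeholder for the data-channel write in the example,
// e.g. something like data_channel.send(...).await in the real code.
async fn send_packet(payload: &[u8]) {
    let _ = payload;
}

async fn send_loop() {
    let packet = [0u8; 1024];
    loop {
        send_packet(&packet).await;
        // Throttle the producer so the receive side, tokio-console, and the
        // SCTP timers get scheduled instead of being starved by this loop.
        tokio::time::sleep(Duration::from_micros(100)).await;
    }
}

#[tokio::main]
async fn main() {
    send_loop().await;
}
```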

vitdevelop (Contributor) commented Oct 8, 2021

@whans A sleep in the sending task helped for a while.

@rainliu After ~5 min I found that a lot of tasks from webrtc-sctp-0.3.8/src/timer/[ack,rtx]_timer.rs:[43,153] had been spawned, around 500-600.
Most of the CPU busy time is in those tasks.

I attached a screenshot (webrtc_rs_ack_rtc_timer).

rainliu (Member, Author) commented Oct 9, 2021

@vitdevelop, thanks for the finding. Could you submit a PR adding tokio-console/console_subscriber to the data-channels-flow-control example, so I can take a look?

vitdevelop (Contributor):

@rainliu Added PR
webrtc-rs/examples#1

whans commented Oct 11, 2021

Output of perf top -p:

(screenshot: Xnip2021-10-11_08-19-51)

ramyak-mehra:

I was doing some testing of my own on SCTP, and plain writes to streams are almost 22x slower than the same example on the Pion side. I saw an issue about moving to an I/O-free state-machine style; while that would speed up the protocol, it still doesn't explain the root cause. The Rust example shouldn't be 22x slower; it should be at least comparable to the Pion one, if not faster. Has anyone explored the root cause of this slowdown?
