
Wrong TP reporting for long delay lines like xDSL uplink #910

Closed
zokl opened this issue Sep 2, 2019 · 7 comments


zokl commented Sep 2, 2019

Hello Developers,

I would like to report an issue with measuring a real xDSL line. We are trying to measure the uplink direction (512 kbps), but the results contain no throughput, even though RTT, CWND and retransmissions are collected. The throughput field is always printed, but only the first interval has a non-zero value; all other values are zero.

The example output follows:
iperf_dsl_issue.txt

We used iperf3 versions 3.6 and 3.7; both have the same problem.

  • Version of iperf3: 3.7
  • Hardware: x86
  • Operating system (and distribution, if any): OpenWRT - master branch

The issue can be reproduced with tc using the following command:
tc qdisc add dev eth1 root tbf rate 512kbit burst 1540 limit 384k
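As a side note (my own arithmetic, not from the thread): the tbf "limit 384k" allows up to 384 KiB to queue, which at the 512 kbit/s rate implies several seconds of standing queue on top of the line's own latency. A quick sketch:

```python
# Back-of-the-envelope check of the tbf parameters in the tc command above.
# "limit 384k" = max queue of 384 KiB; at "rate 512kbit" that queue takes
# several seconds to drain, adding substantial delay to the emulated link.
RATE_BPS = 512_000          # tbf rate: 512 kbit/s
LIMIT_BYTES = 384 * 1024    # tbf limit: 384k

drain_seconds = LIMIT_BYTES * 8 / RATE_BPS
print(f"max queuing delay: {drain_seconds:.2f} s")  # ~6.14 s
```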


bmah888 commented Sep 10, 2019

I'm starting to look at this. I have partially reproduced the problem you're seeing. It would help a little bit if you could send me the command-line arguments you are using on the client and server side... I can sort of dig them out of the JSON output, but having the actual arguments would be useful.

My first guess is the problem is caused by a combination of: 1) the really slow 512 kbps link speed, 2) the fairly large default send size used by TCP tests (128 KB), and 3) trying to do 3 parallel streams. At that link speed, it takes about two seconds to finish a send (yes, it's chopped up into smaller TCP segments), and if you're doing 3 parallel streams there might very well be intervals where iperf3 doesn't actually record the sending or reception of a complete send.
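The "about two seconds" figure is easy to verify from the numbers in the comment above (128 KB default send size, 512 kbit/s link):

```python
# How long one default-sized iperf3 TCP send takes on a 512 kbit/s link.
SEND_BYTES = 128 * 1024     # iperf3 default TCP send size: 128 KB
RATE_BPS = 512_000          # link speed: 512 kbit/s

send_seconds = SEND_BYTES * 8 / RATE_BPS
print(f"one send: {send_seconds:.2f} s")  # ~2.05 s, i.e. longer than the
                                          # default 1-second report interval
```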

If you try something smaller, like --length 1k, it forces iperf3 to do finer-grained measurements by doing small sends, and you get more intuitive results.
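Again checking the arithmetic myself: with --length 1k, each send completes in a few milliseconds, so every 1-second reporting interval contains dozens of completed sends and the per-interval throughput becomes meaningful:

```python
# With --length 1k, sends complete quickly enough that every 1-second
# reporting interval sees many completed sends.
SEND_BYTES = 1024           # --length 1k
RATE_BPS = 512_000          # link speed: 512 kbit/s

send_seconds = SEND_BYTES * 8 / RATE_BPS
sends_per_interval = 1.0 / send_seconds
print(f"one send: {send_seconds * 1000:.0f} ms, "
      f"~{sends_per_interval:.0f} sends per 1 s interval")
```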

(I was going to insert some output here to illustrate my point, but I have to figure out how to get it out of the VM I was testing on, sigh.)

Note that iperf3 was originally designed for high-speed networks that are several orders of magnitude faster than the environment that you're testing, and so some adjustment of parameters might be necessary to get useful results.

bmah888 self-assigned this Sep 10, 2019

acooks commented Sep 10, 2019

The log file shows that the test did not complete successfully.

I regularly use iperf3 (with patches) on these kinds of links. I think this patch is relevant to this issue, but it was previously submitted and rejected as redundant.


zokl commented Sep 10, 2019

> The log file shows that the test did not complete successfully.
>
> I regularly use iperf3 (with patches) on these kinds of links. I think this patch is relevant to this issue, but it was previously submitted and rejected as redundant.

This is not our problem. We are missing data during the test.


acooks commented Sep 10, 2019 via email


zokl commented Sep 10, 2019

> I'm starting to look at this. I have partially reproduced the problem you're seeing. It would help a little bit if you could send me the command-line arguments you are using on the client and server side... I can sort of dig them out of the JSON output, but having the actual arguments would be useful.

For our tests comparing several network technologies, we use the same parameters on the server and client, for example:

/usr/bin/iperf3 -c 147.32.211.37 --connect-timeout 1000 -t 90 --logfile XYZ.iperf3 -p 5201 --get-server-output -J --parallel 3 --window 1500k --set-mss 1400 -C cubic

> My first guess is the problem is caused by a combination of: 1) the really slow 512 kbps link speed, 2) the fairly large default send size used by TCP tests (128 KB), and 3) trying to do 3 parallel streams. At that link speed, it takes about two seconds to finish a send (yes, it's chopped up into smaller TCP segments), and if you're doing 3 parallel streams there might very well be intervals where iperf3 doesn't actually record the sending or reception of a complete send.

Yes, what you describe is likely the cause of this behavior. Would it be possible to improve the bit-rate calculation so that it records throughput even with these parameters? RTT, CWND and retransmissions work; only the throughput has a problem.

> If you try something smaller, like --length 1k, it forces iperf3 to do finer-grained measurements by doing small sends, and you get more intuitive results.

A TCP window size below 128k works.

Communication graph with TCP window 64k (works): iperf3_TP-problem-w64k.txt

Communication graph with wrong throughput results: iperf3_TP-problem.txt

> (I was going to insert some output here to illustrate my point, but I have to figure out how to get it out of the VM I was testing on, sigh.)

> Note that iperf3 was originally designed for high-speed networks that are several orders of magnitude faster than the environment that you're testing, and so some adjustment of parameters might be necessary to get useful results.


bmah888 commented Oct 1, 2019

@zokl: I think really the key here is to change the --length parameter. iperf3 can only compute the throughput on the basis of complete send calls into the network, and the granularity of those is what the --length parameter is supposed to control. Put another way, if on average there is less than one send call that can complete during a measurement interval, you're going to have some measurement intervals with zero throughput, because iperf3 can only measure the time between complete send calls.
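A toy model of the effect described above (my own illustration, not iperf3's actual accounting): if each send takes ~2.05 s to complete but throughput is sampled per 1-second interval, most intervals contain no completed send and therefore report zero bytes.

```python
# Toy model (NOT iperf3's real accounting): sends complete every ~2.05 s,
# but throughput is tallied per 1 s interval, so intervals with no completed
# send report zero bytes -- matching the zero-throughput lines in the issue.
SEND_SECONDS = 2.05         # one 128 KB send at 512 kbit/s
SEND_BYTES = 128 * 1024
INTERVAL = 1.0              # default iperf3 reporting interval
DURATION = 10.0             # 10-second test

bytes_per_interval = [0] * int(DURATION / INTERVAL)
t = SEND_SECONDS
while t < DURATION:
    bytes_per_interval[int(t / INTERVAL)] += SEND_BYTES  # credit completion
    t += SEND_SECONDS

zero = sum(1 for b in bytes_per_interval if b == 0)
print(bytes_per_interval)
print(f"{zero} of {len(bytes_per_interval)} intervals report zero throughput")
```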

Hmmm. Now that I'm thinking about it, here are a couple other ideas. Another thing you could try is specifying a larger value for --interval (the default is 1 second).

Also you can try (weirdly) reducing the TCP window size, because a burst of packets can get absorbed by the sender buffers on the client side. (This is a funny variant of the buffer bloat problem.)
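For scale (my own arithmetic, using the --window 1500k from the client command earlier in the thread): a full 1500 KB sender buffer takes tens of seconds to drain at 512 kbit/s, so data can sit in the buffer far longer than any reporting interval, smearing the per-interval throughput.

```python
# How long a full 1500k socket buffer takes to drain at 512 kbit/s.
# A buffer this large can absorb a whole burst, so per-interval throughput
# on the wire no longer tracks when the application's sends complete.
WINDOW_BYTES = 1500 * 1024  # --window 1500k from the client command
RATE_BPS = 512_000          # link speed: 512 kbit/s

drain_seconds = WINDOW_BYTES * 8 / RATE_BPS
print(f"buffer drain time: {drain_seconds:.1f} s")  # 24.0 s
```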


zokl commented Oct 2, 2019

Hi Bruce,
Thank you very much for your answer. What you write is true: changing the interval, and especially the TCP window, helps. Changing the packet size does not make much difference on xDSL; with GPRS/EDGE or BPL communication, adjusting the packet size works better. Overall, the measurement is still confusing to me: the test runs and most parameters are calculated, yet only the throughput is 0.

zokl closed this as completed Oct 2, 2019
No branches or pull requests

3 participants