"Bad file descriptor" 5 seconds after specified connection duration #753
Comments
You're right that this I feel like Steve Jobs saying "You're holding it wrong." :-) In theory we could change the 5 seconds to some other value, or even make it a parameter, but to me that's kind of a hacky workaround. |
I am getting this issue in a back-to-back setup where two servers are connected to each other (25G link). I am trying to run the same test case several times in a row to check the reproducibility of the test. However, the second iteration always fails with this error.
I tried running the server both in normal CLI mode and started via the Python wrapper. Configuration used:
|
Looks like it might be a problem in the Python wrapper, not the iperf3 source code. I commented out some of the del functions and it now seems to execute fine.
|
Think I found a better workaround (not modifying the python-iperf3 wrapper):
|
It sounds like this isn't really an iperf3 issue, so closing for now. |
Hi, I am able to reproduce this issue with pure iperf3 binaries compiled from master. Just a few conditions have to be met:
I am running the server with these arguments:
I put three breakpoints on the server side:
After a long test (15 minutes), this happened:
As a result, the client gets stuck completely... |
I found the issue: the control socket timed out on the router. When the client wanted to send the "test ends" control message, it did not arrive at the server, and this specific router does not reply with any RST packet! The client is stuck at this point... I tried installing libkeepalive on the client (Ubuntu Linux) and using it with the iperf client: This solves the situation; the control socket remains established and the test ends properly... So, my suggestion is to use TCP keepalive for the control socket by default (or some kind of application heartbeat) and improve error handling on the server - |
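For readers who want to see what "TCP keepalive on the control socket" amounts to at the socket level, here is a minimal Python sketch. It is not iperf3 code and not what libkeepalive does internally; the helper name `enable_keepalive` and the timer values are assumptions chosen for illustration, and the `TCP_KEEP*` options are Linux-specific, so the sketch skips any that the platform does not expose.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive for sock (hypothetical helper, illustrative values).

    idle_s:     seconds of silence before the first probe is sent
    interval_s: seconds between unanswered probes
    probes:     unanswered probes before the connection is declared dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names exist on Linux; skip any that this platform lacks.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),
        ("TCP_KEEPINTVL", interval_s),
        ("TCP_KEEPCNT", probes),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # Non-zero means keepalive is enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With an idle time shorter than the router's connection-tracking timeout, the periodic probes should keep the otherwise silent control connection from being dropped during a long test.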
@bmah888 I have a perfectly legit use case. I'm performance testing IoT equipment meant for low-bandwidth, lossy, best-effort wireless connections. The devices use a buffer on a lossy link, where throughput could easily be pegged. The OP identified my exact setup, which describes real-world solutions for large utilities, etc. I've been stopped in my tracks by this bug. Going to try @Karry's workaround. |
I think this is related to issue 751 and PR #859. There is a race condition between the termination message from the client and server_timer_proc(). |
Context
Version of iperf3:
iperf 3.5 (cJSON 1.5.2)
Hardware:
VirtualBox VM
Operating system (and distribution, if any):
Ubuntu 18.04
Bug Report
The problem occurs when a connection lasts more than 5 seconds beyond the specified duration (that is, when data keeps arriving at the receiver even though the sender has stopped emitting). This can be the result of large queues in network equipment combined with low bandwidth.
The bug is related to issues #645, #648 and #653.
Expected Behavior
There could be two expected behaviors:
Actual Behavior
As soon as the connection length of the client reaches the duration specified by the sender plus 5 seconds, the message "iperf3: error - select failed: Bad file descriptor" is displayed.
In the example below, the link capacity is 5 Mb/s and the client is sending UDP data at 10 Mb/s.
Client:
Server:
The bug is a bit tricky to reproduce: you need a link that buffers packets in a queue (I believe many routers do that) and that has a low capacity (so that the packets stored in the queue cannot all be transferred within 5 seconds).
Then, just send UDP data at a bitrate higher than the link capacity. This way, the last packet sent by the sender will arrive at the receiver more than 5 seconds later, which triggers the bug.
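To make the reproduction condition concrete, the following rough Python sketch estimates how long the queue keeps draining after the sender stops. It uses the 10 Mb/s send rate and 5 Mb/s link from the example above; the 10-second test duration and the assumption that the queue is large enough to hold the whole backlog without dropping packets are hypothetical, and `drain_time_s` is an illustrative name, not part of iperf3.

```python
def drain_time_s(send_mbps, link_mbps, duration_s):
    """Seconds needed to empty the queue after the sender stops.

    Assumes an idealized queue that never drops packets, so the backlog
    is simply the excess rate multiplied by the test duration.
    """
    backlog_mbit = max(0.0, send_mbps - link_mbps) * duration_s
    return backlog_mbit / link_mbps


if __name__ == "__main__":
    GRACE_S = 5  # the 5-second window described in this report
    t = drain_time_s(send_mbps=10, link_mbps=5, duration_s=10)
    print(f"drain time: {t:.1f} s, exceeds grace period: {t > GRACE_S}")
```

Under these assumptions the backlog is 50 Mbit, which takes 10 seconds to drain at 5 Mb/s, so the last packet arrives well after the 5-second window. A real router has a bounded queue, so the actual delay is capped by the queue size divided by the link capacity.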