
Error: Too many files open #92

Closed
bretrouse opened this issue Aug 20, 2013 · 4 comments

Comments

@bretrouse

Hello,

When using Locust with 1 master and 4 slaves, running 50,000 users at 200 hatched per second, I'm receiving the following error:

'ConnectionError(MaxRetryError("HTTPConnectionPool(host='rewresnwww6ld', port=80): Max retries exceeded with url: /api/activities (Caused by <class 'socket.error'>: [Errno 24] Too many open files)",),)'

This seems to be coming from the requests library. My ulimit is unlimited, and I've applied the other settings below, taken from a tuning post:

echo "10152 65535" > /proc/sys/net/ipv4/ip_local_port_range
sysctl -w fs.file-max=128000
sysctl -w net.ipv4.tcp_keepalive_time=300
sysctl -w net.core.somaxconn=250000
sysctl -w net.ipv4.tcp_max_syn_backlog=2500
sysctl -w net.core.netdev_max_backlog=2500
ulimit -n 10240

Any ideas? I can't effectively load test at this point, as the error rate climbs after ~5000 users have been generated.

@cgbystrom
Member

It's likely that your sockets end up in the TIME_WAIT state, which effectively blocks them from re-use for a period of time.

See http://serverfault.com/questions/212093/how-to-reduce-number-of-sockets-in-time-wait and http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/ for more info.

One could argue that Locust should re-use sockets when running big tests. We've been considering that for testing Battlelog (multi-million-user tests) to reduce this behavior, even though re-using sockets too quickly isn't ideal (hence the default TIME_WAIT timeout). However, re-using sockets won't exercise the actual TCP accept handshake, which also puts stress on your system. In most cases, though, that isn't your actual bottleneck anyway.
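
A quick way to see whether that's what is happening, assuming a Linux host, is to count the sockets that /proc/net/tcp reports in state 06 (TIME_WAIT); a minimal sketch:

# Count IPv4 sockets in TIME_WAIT (state code "06") by reading /proc/net/tcp.
# Linux only; IPv6 sockets live in /proc/net/tcp6.
def count_time_wait(path="/proc/net/tcp"):
    with open(path) as f:
        next(f)  # skip the header line
        return sum(1 for line in f if line.split()[3] == "06")

print("sockets in TIME_WAIT:", count_time_wait())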

@Jahaja
Member

Jahaja commented Aug 20, 2013

This shouldn't be a case of TCP port exhaustion, as that wouldn't generate this error. (Rather, it would generate EAGAIN on connect().)

I think it's more likely that your Python processes don't actually have the intended resource limit. You could confirm this by printing it out in your locustfile:

import resource
# Print the (soft, hard) limit on open file descriptors for this process
print(resource.getrlimit(resource.RLIMIT_NOFILE))

> However, reusing sockets won't test the actual TCP accept handshake which also puts stress on your system. But in most cases, this isn't your actual bottleneck anyway.

I think it would, actually. The peer would most likely be gone, and the connection would have to be re-established from scratch. The only thing that would be reused is probably the kernel resources allocated for that socket. That said, I'd imagine reusing the sockets could create quite strange errors on a shaky network.

@bretrouse
Author

That appears to have been the issue. Once I added this call to my locustfile, I was able to bring my servers down. Thanks for the help. I'm still unsure why the Python process wasn't respecting my ulimit settings, but I'm able to work around it for now.

import resource
resource.setrlimit(resource.RLIMIT_NOFILE, (999999, 999999))

Thanks guys.
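
For reference, a minimal sketch of placing that workaround at the top of a locustfile, assuming you only want to raise the soft limit up to the current hard limit (raising the hard limit itself generally requires root):

import resource

# Raise the soft open-file limit to the current hard limit for this process.
# Raising the hard limit beyond its current value normally requires root.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print(resource.getrlimit(resource.RLIMIT_NOFILE))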

@thehackercat

First, check whether the socket is actually closed. In Python you should call socket.close() after socket.shutdown(2); only then is the connection torn down and its file descriptor released.

Then enlarge the maximum number of open files in /etc/security/limits.conf.
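
A minimal sketch of that shutdown-then-close pattern, assuming a plain socket and a placeholder host/port:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("example.com", 80))  # placeholder host and port
s.sendall(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")
s.recv(4096)
s.shutdown(socket.SHUT_RDWR)  # SHUT_RDWR == 2: no further sends or receives
s.close()                     # release the file descriptor back to the OS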
