Possibly performance regression in the latest versions of locust #2690
Comments
Hmm... There IS a known performance regression in OpenSSL 3.x (which was usually introduced with Python 3.12, but maybe your Python build is different somehow?), see #2555. The issue hits tests that close/reopen the connection especially hard (as it arises at SSL negotiation). Can you check which SSL version you are running? As a workaround, see if you can run another Python version or keep connections alive (I know, not as realistic, but better than nothing).
Hi, I used Ubuntu 20.04 on Amazon EC2. I managed to install Python 3.10 and the latest Locust. The CPU usage became low. However, the throughput did not follow the constant_throughput(1) spec: 1500 users gave me less than 800 rps. Here is my Python env: (locust_env) ubuntu@ip-172-31-10-204: blinker 1.8.1
Hi! Did you check your ssl version?
Yes, I did that. In fact I used Ubuntu 20.04, which uses OpenSSL 1.1.1f. I also updated Python to 3.10. With this setup the CPU usage was lower. However, even with wait_time = constant_throughput(1) set for the test user, 1500 users gave me less than 800 rps (as mentioned in my previous reply). I did not see this issue with Locust 2.17.0.
What are your response times like? Wait times can only limit throughput, not increase it, so if a task takes more than 1s to complete you won't get 1 request/user/s.
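The bound described here can be made concrete with a little arithmetic. This is a minimal sketch, assuming each task iteration issues exactly one request (the locustfile is not shown in this issue); `max_rps` and its parameters are hypothetical names for illustration and are not part of Locust.

```python
def max_rps(users: int, target_per_user: float, task_seconds: float) -> float:
    """Upper bound on total requests/s when a wait time caps each user's rate.

    Assumes one request per task iteration (an assumption, not taken from
    the locustfile in this issue). A user can never complete more than
    1 / task_seconds iterations per second, whatever the wait time asks for.
    """
    per_user = min(target_per_user, 1.0 / task_seconds)
    return users * per_user


# Tasks finishing in 0.7 s leave room for the 1 request/user/s target.
print(max_rps(1500, 1.0, 0.7))  # 1500.0
# Tasks taking 2 s cap each user at 0.5 requests/s.
print(max_rps(1500, 1.0, 2.0))  # 750.0
```

In other words, the wait time only sets a ceiling; once an iteration takes longer than the target interval, the task duration alone determines throughput.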
The average response time is less than 700ms. Also, when I used an older version of Locust (e.g. 2.17.0), I did not have this issue.
Hmm... The only thing I can think of is that Amazon is throttling somehow. What if you skip closing the session/connection? Can you see how many DNS lookups are made (using tcpdump or something else)? If you close the session, then maybe there is a new DNS lookup for each task iteration?
I can check whether there are new DNS lookups. However, with the same target server and the same tests, why did Locust 2.17.0 not have the issue? Was there any major change to the connection logic?
Not that I can think of :-/ But does 2.17.0 not exhibit this problem on Python 3.11/Amazon Linux 2023?
Just reporting back. I changed my system combination. Right now I am using Amazon Linux 2 with Python 3.10. The OpenSSL version is 1.1.1g. I also followed the instructions at https://repost.aws/knowledge-center/dns-resolution-failures-ec2-linux to enable the local DNS cache. With this setup, the latency is much lower and the CPU usage per worker is low as well. However, even with this setup, the RPS does not hold. I ran a test with 1200 users, each with a constant_throughput(1) request rate. The RPS was quite far from 1200: it stopped around 800 and started to drop on its own.
What are the response times? If a task takes more than the constant_pacing time, you’ll get falling throughput.
I tried running Locust 2.17 on the exact same OS (Amazon Linux 2 with Python 3.10). It showed the same issue. I think the issue is on the load test side, because the server being tested is the same. I suspect there could be something in the OS environment that slows down the connection. However, one thing I don't understand is that when the number of users reaches the desired number, the RPS cannot reach the expected number, then starts to drop and eventually falls to a very low number. It seems Locust loses control of creating new connections. I have enabled the local DNS cache. Is there anything else you would suggest I try? Thanks
The main thing I would like to investigate is the receiving end. Is there some throttling going on? How many Locust workers are you using? Are they spread out over multiple machines? Are they passing through a NAT?
Again I ask: What are your response times? If response times increase enough, you'll get falling RPS. Nothing to do with Locust, it is just math: If you have a certain number of concurrent users and response times go up, you'll get falling throughput.
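The same relationship can be turned around to estimate how long each task iteration must be taking, given the user count and the observed RPS. This is a rough sketch, again assuming one request per task iteration; `implied_task_seconds` is a hypothetical helper, not a Locust function.

```python
def implied_task_seconds(users: int, observed_rps: float) -> float:
    """Average time per task iteration implied by the observed throughput.

    Rearranges throughput = concurrent users / time per iteration.
    Assumes one request per iteration (not confirmed by the issue).
    The result covers the whole iteration, including any wait time.
    """
    return users / observed_rps


# Figures reported in this thread: 1500 users at ~800 rps, 1200 users at ~800 rps.
print(implied_task_seconds(1500, 800))  # 1.875
print(implied_task_seconds(1200, 800))  # 1.5
```

Under these assumptions, both reported runs imply iterations well above 1 s, which is longer than the sub-700 ms average response time quoted earlier in the thread, so the difference would need to be accounted for somewhere.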
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
I just installed the latest Locust, 2.31. Everything else was the same. The above issue was resolved. Was there any major improvement in 2.31?
There was a performance fix in requests 2.32.0, but it should really only be needed for OpenSSL 3.x, which you didn't have :) https://github.com/psf/requests/releases/tag/v2.32.0 But it's nice that it works for you now :) Ok to close?
Or maybe what you were experiencing was a version of this: #2812? That was fixed in Locust 2.31.
Prerequisites
Description
I used to use Amazon Linux 2 as the base OS for my load tests. Because the Python available on that OS is 3.7, the latest Locust I could get was 2.17.0. With 5 c5n.xlarge EC2 instances (each with 4 vCPUs) as workers, I could spawn 1200 users. The wait_time for the test was set to constant_throughput(1) so that a total load of 1200 rps could be achieved.
Recently, I updated the base OS to Amazon Linux 2023. The Python version became 3.11, so I could use the latest version of Locust, 2.26.0. However, the above setup (5 c5n.xlarge EC2 instances) could not provide the desired load. It could only spawn about 830 users in total. The total rps was only around 330, even though the wait_time was still constant_throughput(1). I noticed that the CPU usage of each worker process was already close to 100%.
The server being tested did not change, and the same locustfile was used for the tests. However, the performance difference between the two Locust setups was like night and day. This seems like a regression.
Here is the output of the python 3.11 environment:
Package Version
blinker 1.7.0
Brotli 1.1.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
ConfigArgParse 1.7
Flask 3.0.3
Flask-Cors 4.0.0
Flask-Login 0.6.3
gevent 24.2.1
geventhttpclient 2.2.1
greenlet 3.0.3
idna 3.7
itsdangerous 2.2.0
Jinja2 3.1.3
locust 2.26.0
MarkupSafe 2.1.5
msgpack 1.0.8
pip 22.3.1
psutil 5.9.8
pyzmq 26.0.2
requests 2.31.0
roundrobin 0.0.4
setuptools 65.5.1
urllib3 2.2.1
Werkzeug 3.0.2
zope.event 5.0
zope.interface 6.3
Command line
master side: locust -f /opt/locustfile.py --master
worker side: locust -f - --worker --master-host <master_ip> --processes -1
Locustfile contents
Python version
3.11
Locust version
2.26.0
Operating system
Amazon Linux 2023