Running 200 Chrome instances in two nodes at the same docker host fails because of X window limits. #370
Comments
thank you VERY much for your thoroughly detailed report - i'll be investigating this. |
I see that you are using fermium. fermium was released late last night (eastern time.) was this an issue in earlier versions as well? |
Yes, this happened in 3.0.1-dysprosium as well. I should have noted that, sorry. We started exploring the possibilities of migrating our test platform from Selenium / PhantomJS to Selenium / Chrome in earnest only yesterday, and ran into this issue right away. Diagnosing the cause took a bit more time though. I cloned the project to see if I could fix this in a local build today, which is when I noted that the version had changed. As an aside, the two java nodes in the comparison were started as two separate user processes. Perhaps I should try and see if that still works when started from the same script. |
@Menthalion very interesting issue. Out of curiosity, have you tried Chrome headless? It is still in the DEV channel, but I'm wondering how it handles your 200+ parallel tests. google-chrome-unstable --headless --disable-gpu --remote-debugging-port=9222 https://testmysite.thinkwithgoogle.com |
I'm getting an error trying to set chrome startup arguments through selenium capabilities on that version. It's officially not supported by the chromedriver, so that might be it. The same error came up on the Internet with mismatched driver / chrome versions. So I can't really test this up to scale with my current code. Seeing it doesn't seem to use screens, my guess would be it would run fine.
|
@elgalu @ddavison We managed to get Chrome headless to run on the bare metal Ubuntu 16.04 machine (24 cores, 140GB), but hit another limit scaling up to 600 browsers (6 nodes of 100 browsers on the same machine), failing our tests at the ~235 browser mark. The error message of the Selenium nodes' Java process was The amount of processes from We managed to solve this by changing the following Linux settings:
After these changes we could start up and use 600 headless Chromes, which took up around 30K processes, as well as ~5GB in ~1200 Chromium temp files. The same limits apply (even worse) to normal Chrome processes that are run through With the same settings, we never got above ~450 xvfb-wrapped normal Chrome instances over 6 nodes of 100 browsers each, with Chromes failing to start at the ~35K process mark. |
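The exact settings that were changed are not preserved in this copy of the thread. As a hedged sketch only, the following prints the process-related limits that a fleet of this size is likely to run into, so they can be compared against the ~30K processes reported above. The function name `report_limits` is made up for illustration; `DefaultTasksMax` and `nproc` are the two settings named in a later comment in this thread.

```shell
#!/usr/bin/env bash
# Sketch: show process-related limits on a Linux host. Read-only, changes nothing.
# Which of these the original poster actually raised is not stated here.

report_limits() {
  # Soft per-user process limit of the current shell (the "nproc" limit).
  echo "max user processes: $(ulimit -u 2>/dev/null || echo unknown)"

  # Kernel-wide ceilings on PIDs and threads (Linux only).
  for f in /proc/sys/kernel/pid_max /proc/sys/kernel/threads-max; do
    if [ -r "$f" ]; then
      echo "$f: $(cat "$f")"
    else
      echo "$f: not available"
    fi
  done

  # systemd's default per-unit task cap, if systemd is present and running.
  if command -v systemctl >/dev/null 2>&1; then
    echo "systemd: $(systemctl show --property DefaultTasksMax 2>/dev/null || echo 'query failed')"
  else
    echo "systemd: systemctl not found"
  fi
}

report_limits
```

If any of the printed values is close to the number of processes the browsers are expected to spawn, that limit is a candidate for the failure described above.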
Great detailed report @Menthalion thank you so much. Have you tried this scenario with Zalenium ? @diemol and I have created this auto vertical scaling docker-selenium solution, you can quickly try it out with the one-liner installer: curl -sSL https://raw.githubusercontent.com/dosel/t/i/p | bash -s 3 start My guess is it won't handle that many tests in a 16GB 8 cores machine but if you have a moment to try it I'm a bit curious also. |
@elgalu Well, the 16GB machine I did the earlier tests on was my local machine, the 600 test was done on one of the two nodes of our main end-to-end test infrastructure, which have 24 cores / 140GB each. I'll see if we can give Zalenium a try there, I can't promise anything since we already spent a large part of the allotted project time on getting to this level. I'd need to bribe a scrum master here, or come up with some really convincing arguments why we'd need to investigate yet another avenue. |
@elgalu does Zalenium already support Chrome headless ? |
No. And I think it won't as that would kill the video recording functionality, the live preview feature, etc.. But it may be included in docker-selenium for the non-debug images in the future I guess. |
@elgalu I don't think it will have much chance of running successfully then, since we're already running into resource problems on bare metal? |
As much as I want to promote Zalenium (since we developed it with @elgalu), I think it is not the right tool for your scenario since we run each test in a new container created on demand. Nevertheless, perhaps you can try it in a different scenario and give us some feedback if you want :) |
Thanks @diemol , this use case is indeed different: We're using Selenium for a load test of our product, which is why we need so many browsers. I'll keep an eye out for Zalenium for the tests we launch from our build stack, it seems promising there. @elgalu Chrome headless did scale for a simple test of starting the browsers and pointing them to a url, but it seems to be too immature yet for testing our SPA (a lot of the Selenium functions we called failed) which is a shame. |
@ddavison Although my initial findings on the scaling of xvfb-run nodes in docker vs xvfb-run on bare metal still stand, my conclusion about the cause seems to have been wrong. After doing some more tests I found out that an xvfb-run inside a docker doesn't share resources with the host. However, n chromes on an xvfb-run node inside the docker seem to consume twice as many xvfb resources as on a similar xvfb-run node on bare metal. This can be checked with So the docker limit is not that the total amount of chromes over all nodes is 128, but that the amount of chromes inside a node is limited to 64, since every chrome run in a docker seems to consume 4 local resources, vs 2 for every chrome on bare metal. I verified this by running 4 nodes of 50 browsers each, and starting 200 browsers, which worked like a charm. This might be increased by adding Another option to increase it would be to identify why more processes are spawned per chrome run on the docker. In any of these cases (not only for chrome, which consumes 4 resources in the docker but 2 on bare metal, but also for firefox, where 3 resources are already needed on bare metal), a warning about these limits for the nodes might be in order, because the browsers will just silently stop spawning: there is no warning or timeout when this happens. |
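The per-node limit in the comment above is plain integer division: the X client limit divided by the clients each browser consumes. A small sketch of that arithmetic, using the figures reported in this thread (256 clients per screen, 4 clients per Chrome in docker, 2 on bare metal). The function name `max_browsers_per_node` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: estimate how many browsers fit on one xvfb-run node.
# $1 = X client limit of the screen, $2 = clients consumed per browser.
max_browsers_per_node() {
  echo $(( $1 / $2 ))
}

max_browsers_per_node 256 4   # Chrome in docker, as reported above -> 64
max_browsers_per_node 256 2   # Chrome on bare metal, as reported above -> 128
max_browsers_per_node 512 4   # docker with a doubled client limit -> 128
```

This matches the observation that 4 nodes of 50 browsers each work, while a single node stalls at around 64.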
@Menthalion I am trying to do the exact same thing as you, load testing with real browsers. We have a setup that's using Nightwatch, TestArmada, BrowserMobProxy, HarStorage and then Selenium Grid with PhantomJS in Docker containers connected to the Grid, plus lots of custom JS code to drive the tests and generate the reports. I gave up on trying to get Selenium Docker Chrome to run at scale because it seems too resource heavy. I couldn't run 100 on 16 cores, but for PhantomJS I can. I am also considering some Ghetto Cloud to get to 200 by running about 30-40 PhantomJS nodes on a bunch of the Dev iMac 5ks. I'd be interested to chat / compare notes on what you have achieved, seeing as there are very few resources I can find out there on this specific topic, which is running lots of real browsers in parallel for load testing. |
@ddavison @elgalu After getting bare metal scalability for Selenium / virtual framebuffer Chrome out of the way, I've been trying to get past the ~64 chrome instances per selenium node docker by changing local builds, but to no avail. I made sure none of the kernel / systemd limiting factors are in place, replaced Xvfb with an xpra / xdummy combination, but still can't get a handle on why the Xwin limit / consumption within the docker setup is twice as high as on two different bare metal systems |
@madhavajay I've been able to get 100 chromes to run on a simple 8 core / 16 GB Dell desktop with these selenium dockers. I started 1 hub, and 2 nodes with 50 chromes each. Don't forget each node you start also has a java process running, so running one browser per docker node like you might be able to do with PhantomJS isn't an option. Be sure to have no limiting factors on the host OS, since it can be resource intensive in handles and processes
echo "DefaultTasksMax=infinity" >> /etc/systemd/system.conf
echo "* soft nproc unlimited" >> /etc/security/limits.conf
Then start the hub |
|
Hi, |
|
Hi, |
@ddavison agree it's very convenient running in docker, but... scalable and small? Adding extra code to execute does not, as a rule, improve performance or reduce file size. |
@DanielHeath |
Ahh, that's a more sensible interpretation, thank you. Yes, being able to use fleet management tooling is definitely a scalability advantage. |
Sorry that I only saw your question now @JarominP . We're currently running around 1000 headless chrome instances bare metal on a 5 year old 24 core 148 GB memory linux server (bear in mind this is for testing a very resource intensive SPA). The OP issue here should not be a limiting factor in docker anymore either since the framebuffer isn't needed when running chrome headless in a docker. Before chrome headless was a thing we had to give up running it in a docker because of these limitations. Afterwards we had not enough incentive to switch back. |
What worked for me is running hundreds of headless browsers with multiple tabs, with an extension that manages the session per tab (cookies, proxy, navigation). Spawning a new tab in an existing browser is less memory intensive. |
hi Menthalion,
And then:
|
I have around 10 Oracle Linux 7 servers with 8 cores and 16 GB RAM each. What is the optimal solution so that my scripts run at a good speed? |
Hi all, I just went through all the comments again, and to be honest, these docker images are not meant to run a large number of browsers inside them; we actually recommend running just one browser instance per container. I understand different people have different approaches, but we believe containers should not be used in this way. Having said that, I will close this issue since different approaches and comments have been shared, but it is clear that there is no single formula that will work for all (since each browser will use different resources based on the website that it is loading). Hopefully, all these comments serve as a knowledge base for others who try to achieve the same. |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Meta -
Image(s):
node-chrome
node-firefox
Docker Version:
3.0.1-fermium, 3.0.1-dysprosium
OS:
Ubuntu Linux 16.04
Expected Behavior
On the same Docker host (Ubuntu 16.04, 16GB , 8 cores)
Actual Behavior -
Suspected Cause
The xvfb-run command used to run the 'headless' browser instances can only support 256 max 'clients' for its 'screen'. After starting ~64 chrome instances (every chrome seems to consume 2 'clients'), this limit is reached and no new ones can be started. For Firefox, this is ~46 instances (it seems that Firefox consumes 3 'clients').
This was verified by adding the
-maxclients 512
parameter to the xvfb-run -s parameter, which doubles the amount of 'clients' per 'screen', and made us able to launch double the amount of browsers per node.
The more scalable solution is that processes started through separate xvfb-runs could use different 'screens', good for 256 clients each, by using the -a option. Our use case calls for ~600 browsers per host, so we wanted to use 6 nodes of 100 instances each.
However, it seems that separate chrome-node dockers on the same host share the same 'screen', meaning even with multiple separate chrome-node dockers, the maximum amount of started chrome instances is still limited to ~128 (256 screen 'clients')
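As a rough illustration of the -maxclients workaround described above, the sketch below only composes and prints a node start command with the raised client limit passed through the xvfb-run -s argument; it does not start anything. The rest of the command line follows the bare metal invocation quoted in this report. The function name `build_node_command` and the variable names are made up for the example, and the config file name is just a placeholder argument.

```shell
#!/usr/bin/env bash
# Sketch: compose (but do not run) a node command with a raised X client limit.
MAX_CLIENTS=512   # the value reported in this issue
SERVER_ARGS="-screen 0 1024x768x24 -ac +extension RANDR -maxclients ${MAX_CLIENTS}"

build_node_command() {
  # $1: node config file, passed through unchanged.
  # -a lets xvfb-run pick a free display number instead of a fixed one.
  printf 'xvfb-run -a -s "%s" java -Dwebdriver.chrome.driver=./chromedriver -jar selenium-server-standalone-3.0.1.jar -role node -nodeConfig %s\n' \
    "$SERVER_ARGS" "$1"
}

build_node_command node1.json
```

Printing the command first makes it easy to confirm that -maxclients ends up inside the quoted server arguments rather than being parsed as an option of xvfb-run itself.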
Compare
When two selenium nodes are started on the same host machine with
xvfb-run -a -s "-screen 0 1024x768x24 -ac +extension RANDR" java -Dwebdriver.chrome.driver=./chromedriver -jar selenium-server-standalone-3.0.1.jar -role node -nodeConfig node<#>.json
With the following node1.json
this will not exhibit the same limitations, and can start the 200 browsers without a hitch.
What we tried
We tried fixing this by cloning this project and changing the entrypoint.sh of the NodeBase source from
xvfb-run -n $SERVERNUM
to
xvfb-run -a
but this still exhibited the same problem.