[🐛 Bug]: Cannot connect through NoVNC #2045
Comments
@Earlopain, thank you for creating this issue. We will troubleshoot it as soon as we can. Info for maintainers: Triage this issue by using labels.
If information is missing, add a helpful comment and then
If the issue is a question, add the
If the issue is valid but there is no time to troubleshoot it, consider adding the
If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C),
add the applicable
After troubleshooting the issue, please add the Thank you! |
Hi @Earlopain, I used the same docker compose file that you shared but could not reproduce the issue. The image tag is used |
Hi there, there is no error in the console. The websocket is being opened but is not receiving any data. The first two packets seem to be some kind of ping/pong exchange, but again, that's just not happening. I just tested on another machine, Windows this time, and had no trouble getting it to work there. I tested with Firefox running on the host, and with Firefox running through WSL as well because why not. Both worked without problems. I'm going to set up a fresh Linux VM and check how it behaves there. |
I have encountered the same problem, but on K8s. My VNC interface is blank, although my requests run normally. The behavior is the same from version 20231110 through the current version 20231129. |
I don't understand why the container shows the VNC port as 7900 while the service maps port 6900:5900. Could you please explain it, @VietND96? |
Hi @VietND96, I have installed docker in a fresh linux vm with https://endeavouros.com/ installed. After setting up docker with the following commands and starting the selenium image I observe the same symptoms as in my initial report:
It may have something to do with Arch/EndeavourOS being a rolling release and as such always having the latest versions, or it may be Linux-specific. I'm not sure which host OS you were testing with. For the record, here are the docker/compose versions in use:
$ docker compose version
Docker Compose version 2.23.3
$ docker -v
Docker version 24.0.7, build afdd53b4e3 |
Hi @zhaoyaohui0, as my understanding
|
@Earlopain @zhaoyaohui0 where is this failing? Which environments? The report is very ambiguous. |
@diemol I have provided additional information in my follow-up comment, is that not enough? I unfortunately don't have more than "install this OS, set up docker there and try again". How would I go about gathering more useful information for you, or what are you looking for? |
You also mention Kubernetes at the beginning of the issue. Hence my question. Also, how popular is that OS? I mean, we try to provide something that works on most OSes, but if it fails on a few and the user base is small, we won't troubleshoot that because we are a small team, and we try to focus on the common use cases. Having said that, do you see the same with Ubuntu? macOS? Windows? |
Kubernetes was the other person, I'm just using it through docker. EndeavourOS is Arch with a GUI installer; it ships exactly the same software plus some small GUI applications on top. I used it because it is convenient and easy to set up, unlike setting up Arch on your own. I did test on Windows and had no trouble there. I don't own an Apple device so there is nothing for me to do there. I can try out Ubuntu in a bit when I'm at my home PC. I will install the latest docker versions, see how that turns out and let you know then. |
I gave it a try with Ubuntu 23.10 and it just worked as well. I ended up installing plain Arch instead of EndeavourOS just to make sure, and it doesn't work with that either. Here are some other findings: I enabled stdout logging for the other services and, as expected, NoVNC is trying to establish a connection. I accidentally left it open while testing and after a whopping 2.5 minutes it actually managed to connect.
selenium-1 | 172.23.0.1 - - [05/Dec/2023 16:32:44] 172.23.0.1: Plain non-SSL (ws://) WebSocket connection
selenium-1 | 172.23.0.1 - - [05/Dec/2023 16:32:44] 172.23.0.1: Path: '/websockify'
selenium-1 | 172.23.0.1 - - [05/Dec/2023 16:32:44] connecting to: localhost:5900
selenium-1 | 05/12/2023 16:35:18 Got connection from client 127.0.0.1
After establishing a connection once, future connections still take 2.5 minutes to establish. It doesn't seem to have anything to do with NoVNC: I exposed port 5900, wanting to connect with a local client, and that takes just as long. I did a few runs, and the duration seems consistent. For 5 runs, it always took 154 seconds. I don't know what one would do with this information though. This all seems very nonsensical to me, especially considering it works with other OSes and it's just docker in the end. |
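The timing described above could be reproduced with a small script instead of a stopwatch. The following is a minimal sketch, assuming the delay shows up before the VNC server sends its 12-byte RFB version greeting (for example `RFB 003.008\n`); the function name `time_rfb_greeting` is made up for this illustration and is not part of this repository.

```python
import socket
import time


def time_rfb_greeting(host: str, port: int, timeout: float = 300.0):
    """Return (seconds_elapsed, greeting_bytes) for one VNC connection attempt."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        greeting = b""
        # An RFB server opens the conversation with a 12-byte version string.
        while len(greeting) < 12:
            chunk = sock.recv(12 - len(greeting))
            if not chunk:
                break  # server closed the connection before finishing the greeting
            greeting += chunk
    return time.monotonic() - start, greeting
```

With port 5900 exposed, calling `time_rfb_greeting("localhost", 5900)` a few times in a row would show whether the wait is as consistent as the 154 seconds reported here.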
For K8s, the URL to access grid UI that you are using with schema |
I've started reducing the docker image, and with a majority of the selenium things removed I still run into this issue. At this point I'm almost certain it has nothing to do with anything in this repo, so feel free to close this issue, from my side at least. I'll continue to investigate myself and report this at the proper place, if I manage to actually find it. |
Thank you for your troubleshooting. I will close this based on your comments but feel free to add your findings in additional comments. |
I did some digging and have found the root cause. Inside the docker container This code in libvncserver enumerates them all, taking up huge amounts of CPU time. I didn't notice CPU spinning beforehand. The temporary solution is quite simple: set the ulimit for docker manually:
version: "3"
services:
  selenium:
    image: selenium/standalone-chrome:4.15.0-20231110
    environment:
      - SE_VNC_NO_PASSWORD=1
    shm_size: 2gb
    ports:
      - ${EXPOSED_VNC_PORT:-7900}:7900
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
I don't know why these limits would differ from the host, documentation states they are inherited. My host value is just a measly 524288, but it is what it is. As for why it worked with focal but not with jammy, perhaps this codepath wasn't hit before. The limit is still high inside docker, what do I know. Here's some prior art: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=920913 |
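To get a feel for why a very high open-file limit turns into minutes of CPU time, the limit can be read from inside the container and the cost of touching every possible descriptor extrapolated from a small sample. This is only a rough sketch under assumptions: it probes each descriptor with one `fcntl` call, which is not necessarily what libvncserver does, and the helper names are invented for this example.

```python
import fcntl
import resource
import time


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def estimate_sweep_seconds(limit: int, sample: int = 100_000) -> float:
    """Time one cheap syscall per descriptor on a sample, then scale up to `limit`."""
    start = time.perf_counter()
    for fd in range(sample):
        try:
            fcntl.fcntl(fd, fcntl.F_GETFD)  # harmless probe, fails on unused descriptors
        except OSError:
            pass
    per_fd = (time.perf_counter() - start) / sample
    return per_fd * limit


if __name__ == "__main__":
    print("RLIMIT_NOFILE (soft, hard):", nofile_limits())
    # 65536 is the workaround value above, 524288 the host value mentioned above;
    # the last figure is an assumed "very high" limit for comparison.
    for limit in (65536, 524288, 1_000_000_000):
        print(f"{limit:>13} descriptors: ~{estimate_sweep_seconds(limit):.1f}s")
```

The absolute numbers depend on the machine, but the estimate grows linearly with the limit, which fits a fixed, repeatable delay on every connection.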
Wow, great troubleshooting! Thanks for sharing. |
I verified the fix above. In my case both VNC and noVNC took a very long time to connect, next to forever. On rare occasions it reached the password prompt, but it still kept waiting afterward and then timed out. Can we put this in the README in the troubleshooting section? |
I'm not so sure about the value of that. This only happens when distros use the prepackaged systemd unit files with very recent docker and systemd versions, which in reality not very many actually do. Once upstream releases versions that contain a fix, this section would pretty much become obsolete. You seem to have found this through issues just fine, I think that is good enough. |
I saw that a few Dockerfiles follow the practice of displaying a warning if ulimit -n is too high when running Docker. I also tried adding one to notify the user: acda753
|
The idea is there, yes. However, if ulimit is already set to a lower value in the container, then trying to set it to something higher will return a non-zero exit code, at least for an unprivileged user. That needs to be accounted for. In addition, TIL that ulimit is a shell builtin and supervisord seems to only start actual binaries (so I think After doing both of that, it works fine for me. Nice that a workaround is being considered here (: |
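The constraint described above (only ever lower the limit, never try to raise it) can be sketched as a small guard. This is an illustration under assumptions, not the code that ended up in the repository: the environment variable `VNC_NOFILE_CAP` and both function names are hypothetical.

```python
import os
import resource


def capped_nofile(soft: int, hard: int, cap: int):
    """Return (soft, hard) limits that never exceed `cap` and never raise a value."""
    inf = resource.RLIM_INFINITY
    new_soft = cap if soft == inf or soft > cap else soft
    new_hard = cap if hard == inf or hard > cap else hard
    return new_soft, new_hard


def apply_nofile_cap(default_cap: int = 65536):
    """Lower RLIMIT_NOFILE for this process (and its future children) if needed."""
    # Hypothetical variable name, used here only to show an env-configurable cap.
    cap = int(os.environ.get("VNC_NOFILE_CAP", default_cap))
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_limits = capped_nofile(soft, hard, cap)
    if new_limits != (soft, hard):
        # Lowering is permitted for unprivileged users; raising the hard limit is not.
        resource.setrlimit(resource.RLIMIT_NOFILE, new_limits)
    return new_limits
```

Because resource limits are inherited by child processes, calling `apply_nofile_cap()` in a launcher before starting the VNC server would play the same role as wrapping the command in a shell that runs `ulimit -n` first.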
Thank you. This fixed my issue as well. |
* Guard against high `ulimit -n` when starting vnc
Recent versions of docker in combination with the upstream systemd unit files pass an incredibly high `ulimit -n` to the docker container, up to 1 billion. That causes minutes-long delays and CPU spinning when connecting to VNC while it enumerates all the file descriptors. See #2045
* Update failure message
* Allow the ulimit to be configurable by env
New releases will contain a workaround, a section in the readme for this shouldn't be needed anymore. See #2058 |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
What happened?
When upgrading standalone-chrome from 4.15.0-20231108 to 4.15.0-20231110, the NoVNC web interface is not able to connect. The issue exists on the latest version 4.15.0-20231129 as well; I just tested in which version it started.
It's perpetually stuck on the "Connecting..." screen; the websocket being opened is not receiving any data.
The only difference between these two versions is the upgrade from Focal to Jammy in PR #1923
Command used to start Selenium Grid with Docker (or Kubernetes)
Relevant log output
Operating System
Arch Linux
Docker Selenium version (tag or chart version)
4.15.0-20231110