browser.quit doesn't close session #418

Closed
fernandomm opened this issue Nov 17, 2023 · 8 comments · Fixed by #448
Comments

@fernandomm

I'm using Ferrum 0.14 + Ruby 3.2.2 with Browserless. Browserless not only makes it simple to run headless Chrome, it also offers some extras, like checking which sessions are active and what they are doing.

So, when I start it there are no sessions:

[1] pry(main)> JSON.parse(RestClient.get('http://chrome:3333/sessions'))
=> []

I start Ferrum and can see that the session is now showing up:

[2] pry(main)> browser = Ferrum::Browser.new(url: 'http://chrome:3333')
=> #<Ferrum::Browser:0x0000ffff7f309e70

[3] pry(main)> JSON.parse(RestClient.get('http://chrome:3333/sessions'))
=> [{"description"=>"",
  "devtoolsFrontendUrl"=>"/devtools/inspector.html?ws=0.0.0.0:3333/devtools/page/A87609DAEBF35E0FF8AF4E0C4FD64D4D",
  "id"=>"A87609DAEBF35E0FF8AF4E0C4FD64D4D",
  "title"=>"about:blank",
  "type"=>"page",
  "url"=>"about:blank",
  "webSocketDebuggerUrl"=>"ws://0.0.0.0:3333/devtools/page/A87609DAEBF35E0FF8AF4E0C4FD64D4D",
  "port"=>"33131",
  "browserId"=>"e2bb3742-d16a-4c94-931a-29336a8153cf",
  "trackingId"=>nil,
  "browserWSEndpoint"=>"ws://0.0.0.0:3333/devtools/browser/e2bb3742-d16a-4c94-931a-29336a8153cf"}]

But when I run browser.quit, it doesn't close the session, leaving it open indefinitely:

[4] pry(main)> browser.quit
=> nil

[5] pry(main)> JSON.parse(RestClient.get('http://chrome:3333/sessions'))
=> [{"description"=>"",
  "devtoolsFrontendUrl"=>"/devtools/inspector.html?ws=0.0.0.0:3333/devtools/page/A87609DAEBF35E0FF8AF4E0C4FD64D4D",
  "id"=>"A87609DAEBF35E0FF8AF4E0C4FD64D4D",
  "title"=>"about:blank",
  "type"=>"page",
  "url"=>"about:blank",
  "webSocketDebuggerUrl"=>"ws://0.0.0.0:3333/devtools/page/A87609DAEBF35E0FF8AF4E0C4FD64D4D",
  "port"=>"33131",
  "browserId"=>"e2bb3742-d16a-4c94-931a-29336a8153cf",
  "trackingId"=>nil,
  "browserWSEndpoint"=>"ws://0.0.0.0:3333/devtools/browser/e2bb3742-d16a-4c94-931a-29336a8153cf"}]

The session is only closed after I close the IRB console and terminate the process.

[6] pry(main)> exit 0
root@c31d1e2b8b58:/myapp# curl http://chrome:3333/sessions
[]

When using Ferrum in a long-running process like Sidekiq/Puma, this results in thousands of open sessions, since the process is never terminated.

I guess that it might also explain the issues with zombie processes previously reported ( #364 ).

Is browser.close the correct way of terminating the session or am I missing something?

@route
Member

route commented Nov 25, 2023

First of all, I cannot reproduce it with docker run -p 3000:3000 browserless/chrome.
Second, it cannot explain the zombie processes, because there is simply no process created for Chrome, since it's running in Docker. What can happen when you call browser.quit is that we simply close the websocket connection to Browserless. If you want to close the tab that was opened, just close it or dispose of the whole context.

@route route closed this as completed Nov 25, 2023
@fernandomm
Author

Thanks for the reply. I was able to dedicate more time to this and check why it works for you and wasn't working for me.

The issue seems to happen when using Docker, more specifically the internal network. Here is a repo to quickly reproduce the issue https://github.com/fernandomm/ferrum418#readme

Basically when I connect using the internal service name ( browserless ), it fails:

$ docker-compose exec app bash -l -c 'bundle exec rails runner /rails/bug.rb http://browserless:3000'
Number of sessions (initial): 0
Number of sessions (before browser.quit): 1
Number of sessions (after browser.quit): 1

But if I connect to the port that Docker exposes on the host, it works. In this case I'm using Docker for Mac, but I was also able to reproduce it on Linux and Docker Swarm.

$ docker-compose exec app bash -l -c 'bundle exec rails runner /rails/bug.rb http://host.docker.internal:3000'
Number of sessions (initial): 0
Number of sessions (before browser.quit): 1
Number of sessions (after browser.quit): 0
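
For anyone who wants to run the same before/after check, the counts above could be gathered with something like the following. This is only a sketch, not the bug.rb from the linked repo: the session_count name and the injectable fetch parameter are made up for illustration, and it assumes the /sessions endpoint returns a JSON array as shown earlier in this thread.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper (not the bug.rb from the linked repo): returns how many
# sessions Browserless reports at <base_url>/sessions.
# `fetch` is injectable so the parsing can be exercised without a network.
def session_count(base_url, fetch: ->(uri) { Net::HTTP.get(uri) })
  uri = URI.join(base_url, "/sessions")
  JSON.parse(fetch.call(uri)).length
end
```

Calling session_count("http://browserless:3000") before and after browser.quit should show whether the session was actually released.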

I understand that this issue is related to Docker/network and may not be related to the gem. But I'm trying to investigate it further although I have no experience with the CDP protocol.

Do you have any suggestions or helpful tips on what I should look into?

Thanks again.

@route
Member

route commented Dec 5, 2023

I'm not sure if there is any difference between docker run -p 3000:3000 ghcr.io/browserless/chrome and docker run -p 3000:3000 browserless/chrome, because I personally don't use Browserless, but looking at the output there is. I don't think it's a Docker issue at all; it's the implementation of Browserless: whatever sits in front of Chrome can close the session or not. I don't think you should simply disconnect. Dispose of the whole context or close the page with page.close before moving on; that would be the correct behavior instead of just disconnecting.

Don't expect browser.quit to do the job, because it only kills a Chrome process that it spawned itself. My suggestion: before disconnecting, close the page ;)
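
A rough sketch of the "dispose the whole context" option, for illustration only. The with_disposable_page name is made up; browser.contexts.create, context.create_page and context.dispose are the calls Ferrum's README shows for working with contexts. Whether Browserless treats a disposed context as a finished session is an assumption that this thread does not confirm.

```ruby
# Hypothetical helper: gives the block a page in its own browser context and
# disposes that context afterwards, even if the block raises.
# `browser` is expected to be a Ferrum::Browser, for example
#   Ferrum::Browser.new(url: "http://browserless:3000")
def with_disposable_page(browser)
  context = browser.contexts.create
  page = context.create_page
  yield page
ensure
  # context is nil if creating it failed, so guard the call
  context&.dispose
end
```

The block's return value is passed through, so with_disposable_page(browser) { |page| page.go_to(url); page.body } would return the page body.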

@fernandomm
Author

I tried to use page.close but it didn't make any difference. It still left "about:blank" sessions open, which were only terminated after Browserless's CONNECTION_TIMEOUT was reached.

In one of my tests, I added a simple nginx proxy in front of the browserless/chrome container. After that the issue went away.

Now browser.quit works as expected inside docker and closes the session immediately.

I don't know what nginx does differently but, since I have dedicated more time than expected to this issue, I will just accept it and use it :)

Thanks a lot for the help. I'm leaving the nginx conf below in case someone experiences a similar error.

upstream browserless {
  zone upstream_dynamic 64k;
  server browserless:3000;
}

server {
  proxy_next_upstream error timeout http_500 http_503 http_429 non_idempotent;
  listen 80;

  location / {
    proxy_pass http://browserless;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
    proxy_connect_timeout 10;
    proxy_send_timeout 900;
    proxy_read_timeout 900;
    send_timeout 900;
  }
}

@sloanesturz
Contributor

Hello! I have almost the exact same issue. I changed my code to call .command('Browser.close') when it is done with the browser. This seems to really shut down the connection on Browserless's side, instead of waiting for the long timeout.

def with_browser(&block)
  browser = Ferrum::Browser.new(url: MY_BROWSERLESS_DOCKER_URL)

  results = block.call(browser)

  browser.command('Browser.close') # this really closes the browser, more than just .quit
  browser.quit

  results
end
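
One possible variation, offered as a sketch rather than as what Ferrum itself does: if the block raises, the snippet above never reaches Browser.close, so the session would still linger until the timeout. Wrapping the cleanup in ensure covers that case. The with_closed_browser name is made up, and swallowing errors from Browser.close is an assumption about what is acceptable when the remote side has already gone away.

```ruby
# Hypothetical helper: yields the browser and always attempts to shut the
# remote browser down afterwards, whether or not the block raised.
# `browser` is expected to be a Ferrum::Browser connected to Browserless.
def with_closed_browser(browser)
  yield browser
ensure
  begin
    # Ask the remote Chrome to close itself over CDP.
    browser.command("Browser.close")
  rescue StandardError
    # Assumption: a failure here means the connection is already gone,
    # so it is ignored and local cleanup continues.
  end
  browser.quit
end
```

Usage would look like with_closed_browser(Ferrum::Browser.new(url: MY_BROWSERLESS_DOCKER_URL)) { |browser| ... }, with the block's value returned to the caller.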

@route
Member

route commented Feb 23, 2024

@sloanesturz would you mind opening a PR that adds the Browser.close command to browser?

@Nakilon
Contributor

Nakilon commented Jun 30, 2024

the issues with zombie processes previously reported ( #364 ).

The Docker Hub link there has rotted. Here is a new one: https://docs.docker.com/compose/compose-file/05-services/#init

@route
Member

route commented Nov 20, 2024

@Nakilon good point!
