kill notebook its self when server cull idle kernel #5441

Closed
wants to merge 1 commit into from

levinxo
Contributor

@levinxo levinxo commented May 12, 2020

As we know, the kernel manager periodically checks whether each kernel is inactive. When MappingKernelManager.cull_connected=True is set, it also culls idle kernels that still have open connections. Then, once NotebookApp.shutdown_no_activity_timeout has elapsed, Jupyter calls tornado's ioloop.stop() to stop the whole server and release resources. At that point, however, the Jupyter process hangs and does not exit automatically.
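For reference, a configuration along these lines should reproduce the setup described above. This is a minimal sketch: the timeout values are illustrative assumptions, and `cull_idle_timeout` and `cull_interval` are not mentioned in this PR but are needed for culling to run at all.

```python
# jupyter_notebook_config.py
# Illustrative values only; they are assumptions, not taken from this PR.
c = get_config()  # provided by Jupyter when it loads the config file

# Cull kernels that have been idle for this many seconds (assumed value).
c.MappingKernelManager.cull_idle_timeout = 60
# How often, in seconds, to check for idle kernels (assumed value).
c.MappingKernelManager.cull_interval = 30
# Cull idle kernels even if a client is still connected to them.
c.MappingKernelManager.cull_connected = True
# Shut the server down after this many seconds with no kernels or activity (assumed value).
c.NotebookApp.shutdown_no_activity_timeout = 120
```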

Details (debugged with tcpdump, Wireshark, strace, and telnet):

When the kernel manager kills a kernel, the kernel replies to the client with a message of msg_type=shutdown_reply, but the client does not handle this message. The Tornado server keeps waiting on the websocket connection opened by the client and does not exit for a very long time.
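To illustrate the idea of reacting to shutdown_reply, here is a minimal sketch in Python. It is not the code changed in this PR: `FakeChannel` and `handle_kernel_message` are hypothetical names, and the message layout (a `header` with `msg_type`, and a `content` with `restart`) follows the Jupyter messaging format.

```python
class FakeChannel:
    """Hypothetical stand-in for a client's websocket connection to a kernel."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def handle_kernel_message(msg, channel):
    """Close the channel when the kernel reports a final shutdown.

    Returns True if the message was a non-restart shutdown_reply and the
    channel was closed, False otherwise.
    """
    msg_type = msg.get("header", {}).get("msg_type")
    restarting = msg.get("content", {}).get("restart", False)
    if msg_type == "shutdown_reply" and not restarting:
        # The kernel is gone for good, so drop the connection instead of
        # leaving the server waiting on it.
        channel.close()
        return True
    return False
```

A client that dispatches on msg_type this way releases its connection as soon as the kernel is culled, so the server has nothing left to wait for.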

When we telnet to Tornado's port 8888, the port is still in the listening state but does not accept new connections.
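The telnet check can be approximated with a short script. This is a rough sketch, not a tool from the notebook project: `probe_port` and its return labels are hypothetical, and it only distinguishes a port that answers from one that takes the TCP connection but never responds.

```python
import socket


def probe_port(host, port, timeout=2.0):
    """Classify how a TCP port reacts to a plain HTTP request."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"          # nothing is listening
    except socket.timeout:
        return "connect-timeout"  # no answer to the TCP handshake
    with sock:
        sock.settimeout(timeout)
        try:
            sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
            data = sock.recv(1)
        except socket.timeout:
            # The OS queued the connection, but the application never served it.
            return "accepted-but-silent"
        except OSError:
            return "reset"
        return "responsive" if data else "closed"
```

For example, `probe_port("127.0.0.1", 8888)` against a hung server would be expected to report a silent port rather than a refused one, since the listening socket is still open.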

Debugging the Jupyter process with strace shows it blocked in a poll call with a timeout of -1, which means it waits indefinitely: poll([{fd=280, events=POLLIN}], 1, -1.
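The blocking behaviour seen in strace can be reproduced in isolation. The sketch below uses a hypothetical helper, `wait_readable`, on a local socket pair (Unix only, since it relies on `select.poll`); with a finite timeout the call returns, whereas a negative timeout like the -1 above would block until data arrives.

```python
import select
import socket


def wait_readable(sock, timeout_ms):
    """Return True if sock becomes readable within timeout_ms milliseconds.

    A negative timeout_ms blocks until an event occurs, which mirrors the
    poll(..., -1) call seen in the strace output.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    return bool(poller.poll(timeout_ms))


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(wait_readable(a, 100))  # no data yet, so this times out
    b.send(b"x")
    print(wait_readable(a, 100))  # data is pending now
    a.close()
    b.close()
```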

@levinxo levinxo closed this May 12, 2020
@levinxo levinxo reopened this May 12, 2020
@kevin-bates kevin-bates mentioned this pull request May 12, 2020
24 tasks
@blink1073 blink1073 added this to the 6.1 milestone May 15, 2020
Member

@Zsailer Zsailer left a comment

The changes here look good to me, but the tests are failing. @levinxo, can you verify that the tests pass locally?

@levinxo
Contributor Author

levinxo commented May 17, 2020

I tested it locally by running python3 -m notebook.jstest services. It fails with the same error that Travis CI produced. I will work on finding out why and try to fix it :)

@kevin-bates
Member

I can't really comment on the code change or test issue, but, from a functional standpoint, I've confirmed that these changes address the issue (after reproducing the issue prior to running with these changes).

Thanks @levinxo.

@levinxo levinxo closed this Jul 9, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 23, 2021
4 participants