Allow for Kernel Management (view and shutdown running kernels both local and remote) #1378

Open
nikitakit opened this issue Sep 24, 2020 · 46 comments
Labels: feature-request (Request for new features or functionality), notebook-kernel (Kernels issues (start/restart/switch/execution, install ipykernel))

@nikitakit

Feature: Notebook Editor, Interactive Window, Python Editor cells

I'd like to request manual kernel management, following the principles of: no kernel will ever be started without the user's explicit intent, and no kernel will ever be shut down without the user's intent.

Description

I just switched to using vscode about a month ago, and I find myself missing a large number of kernel management features that I'm used to from Atom+Hydrogen.

Here are some undesirable behaviors that I've observed:

  • Quitting VScode will kill any kernels associated with Python interactive windows. There is no way to disable this behavior.
  • Occasionally, vscode will demand a reload. For example when Remote-SSH loses connection to a remote host, it will eventually give up on reconnecting and force you to reload vscode (there seems to be no way to disable this behavior). When VScode comes back online, any kernels associated with python interactive windows are gone (also no way to disable this). Ergo, there is no way to persist python sessions on a laptop, where power management will inevitably trigger sleep state and SSH disconnect. This is 100% a Python extension issue, because even if the SSH extension didn't force a reload there will always be another buggy extension that does. In fact, I'm afraid to install any new extensions because that would trigger a reload and delete all my work in the Python interactive window.
  • I can't find a way to attach the Python interactive window to an existing kernel. ("Existing Kernel" means, for example, one I spawned via the Jupyter notebook web interface). I have vscode python configured to use the exact same notebook server, but how do I actually connect to a kernel?
  • Opening a notebook automatically creates a new kernel with the default python interpreter. The "Disable Jupyter Auto Start" setting doesn't actually fix this. With Jupyter auto start disabled, opening a notebook does nothing (as expected), but the moment I go to select the kernel/interpreter type that I actually want, vscode will in the background auto-spawn a kernel with the current default type before I get a chance to finish navigating the menus and select the type I actually want.

All of these behaviors are driving me crazy. I just want a simple mode of operation where no kernel will be started without my consent, and no kernel will ever be shut down without my consent. I understand how the current behavior may be useful as a default, but there's not even an option for an alternative. For example in Atom+Hydrogen, using automatic kernel detection will trigger automatic kernel management just like vscode, but the moment you manually specify a Jupyter server URL it will switch to manual management.

Why is all of this a big issue for my workflow? Well, I work on machine learning, which seems to be well in scope of "data science" tools. My ML workloads have two key characteristics: (a) training can take hours or days and is not robust to unanticipated restarts (b) a GPU must be allocated for all python processes. Automatic shutdowns can cause me to lose up to days of work if I didn't save the model recently. Automatic startup and shutdown is also an issue because it creates a messy interaction with GPU scheduling/allocation. With Atom+Hydrogen things are so much simpler because I can just start a kernel manually, give it a GPU allocation, and I never have to worry about the tools creating extraneous kernels or deleting all my work when I didn't intend for that to happen.

Microsoft Data Science for VS Code Engineering Team: @rchiodo, @IanMatthewHuff, @DavidKutu, @DonJayamanne, @greazer, @joyceerhl

@DonJayamanne
Contributor

Thanks for filing this issue. We'll discuss these feature requests in our triage meeting

@greazer greazer changed the title Manual Kernel Management Allow for Manual Kernel Management (turn off auto-start and auto-end) Sep 24, 2020
@achalddave

I'd like to voice agreement, as I have the same frustrations, as do others I have introduced vscode notebooks to. Perhaps the key issue for me is a subset of @nikitakit's concerns: It's extremely difficult to have a persistent kernel across ssh disconnects, vscode restarts, closing the jupyter tab, etc.

Ideally, I would like to use VSCode's jupyter notebooks the way I use Jupyter notebooks in the browser: When I open a notebook, a kernel is started which is associated with that notebook. Regardless of whether I close the tab, close my browser, or restart my laptop (assuming the Jupyter server is running remotely), when I re-open the notebook, the same kernel will be used. This is incredibly helpful, even in simple settings where code may take ~15 mins to run and I want to put my laptop to sleep and get coffee. (Please let me know if there is an existing way to do this, or if this is the wrong place to post!)

@rchiodo
Contributor

rchiodo commented Oct 7, 2020

Theoretically (haven't tested it) if you start the notebook server yourself and then pick it as a remote server, the notebook should reconnect to the same kernel on reopening (we save the live kernel id for remote sessions).

@rchiodo
Contributor

rchiodo commented Oct 7, 2020

The problem with implementing this the way Jupyter does is that there's nothing that would shutdown the server (if we start it). In your example of using the browser, you (or somebody else) started the Jupyter server. You're responsible for shutting it down. In the case where we start the jupyter server (or just the kernel as we do now), we need to close it down at some point.

Would it work if we didn't close kernels on notebook close, but rather only when shutting down VS code? I guess I'm asking if when you take a break from VS code do you leave it running? I believe if we didn't shutdown kernels on notebook close, the sleep of notebook wouldn't shutdown the kernel (could be wrong though, depends upon what VS code does to the extension host process on sleep).

@achalddave

achalddave commented Oct 7, 2020

Thanks for the quick reply! Maybe this is a bug or user error, because I tried the following before posting:

  1. Start Jupyter server manually
  2. Open vscode notebook "a.ipynb", select jupyter server from (1)
  3. "Reload" VScode
  4. Open "a.ipynb"

My understanding is that (2) spawned the kernel, (3) closed the kernel, and (4) spawned a new kernel. Concretely, (4) did not open with the same python state (e.g. variables) as (2) was in before I closed it.

Usually, I don't explicitly close VS code when I take a break. However, when my laptop goes to sleep, my ssh disconnects, and reloading VSCode is a quick solution to reconnecting (as suggested by this dialog)
Screen Shot 2020-10-07 at 10 01 53 AM

Ideally, maybe VSCode should only shut down a kernel if it is also going to shut down the server. Thus, a notebook->kernel mapping is created per server, and is never changed unless the user explicitly requests a new kernel or shuts down the server (manually, or automatically because the server was started by VSCode). Does this make sense? I'm not super familiar with Jupyter servers and kernels except as a user, so I am likely to be missing key issues, but hopefully this provides an understanding of the behavior I think would be useful.

Edit: Note, this was with VSCode-Insiders using the "preview notebook." I can retry this with the standard notebook in a bit.

@rchiodo
Contributor

rchiodo commented Oct 7, 2020

I believe preview-notebook doesn't work with remote so well. Can you try with the original notebook editor?

What should have happened is:

  1. Server is created
  2. We create a kernel for an empty notebook if we can't find a match. Make sure you 'save' the notebook.
  3. We don't touch the session (kernel should still be running)
  4. Open the ipynb and it should load the kernel using the live session id.

@rchiodo
Contributor

rchiodo commented Oct 7, 2020

Thus, a notebook->kernel mapping is created per server, and is never changed unless the user explicitly requests a new kernel or shuts down the server (manually, or automatically because the server was started by VSCode).

This should already be happening for remote servers. We only kill kernels for local (owned by us) sessions. For local, we currently close the kernels when closing the window associated with a notebook. We could delay this until the whole of VS code shuts down.
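The notebook-to-kernel mapping discussed here corresponds to what a Jupyter server exposes as sessions. The following is a minimal sketch of looking up the live kernel for a notebook, assuming the standard Jupyter Server REST endpoint `GET /api/sessions` and its usual response shape (each session carries a `path` and a nested `kernel` with an `id`). The helper name and the sample data are hypothetical, and this is not the extension's actual implementation.

```python
import json
from typing import Optional


def find_kernel_id(sessions_json: str, notebook_path: str) -> Optional[str]:
    """Return the id of the kernel bound to notebook_path, or None.

    sessions_json is assumed to be the body of GET <server>/api/sessions.
    """
    for session in json.loads(sessions_json):
        if session.get("path") == notebook_path:
            return session.get("kernel", {}).get("id")
    return None


# Hypothetical response body, trimmed to the fields used above.
sample = json.dumps([
    {"path": "a.ipynb", "kernel": {"id": "kernel-1", "name": "python3"}},
    {"path": "b.ipynb", "kernel": {"id": "kernel-2", "name": "python3"}},
])

print(find_kernel_id(sample, "a.ipynb"))
```

If no session matches the notebook path, the function returns `None`, which would be the point at which a client decides whether to start a new kernel.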

@achalddave

achalddave commented Oct 7, 2020

Huh, you're totally right, my bad. I just tried steps 1-4 again with the standard notebook, and even tried quitting vscode entirely, reloading vscode, and toggling my wifi. In all cases the variables I instantiated in step (2) were maintained when I re-opened the notebook in step (4). I'm not really sure what happened the last time I tried this, so I'll keep trying this and report back if it happens again. Thank you for the help!

Edit: I originally said it was the preview notebook, but I was actually using the standard one. This does not work with the preview notebook.

@achalddave

Quick update: I realized I was using the standard notebook, not the preview notebook when I posted above. The preview notebook doesn't seem to use the same kernel after a reload; is there an existing bug I can follow for this? I tried searching but couldn't find it.

@rchiodo
Contributor

rchiodo commented Oct 7, 2020

@rchiodo
Contributor

rchiodo commented Oct 7, 2020

Sorry, that's the overall 'kernel' fixup for native notebooks. This one is specifically about remembering remote kernel ids:
https://github.com/microsoft/vscode-python/issues/13249

@nikitakit
Author

In your example of using the browser, you (or somebody else) started the Jupyter server. You're responsible for shutting it down.

I would like to point out that I'm using vscode with a jupyter notebook server I started myself (I just provide the extension with the URL including a token), and I still lose all my work whenever the SSH extension forces an editor reload. I'm using the Python interactive window.

@rchiodo
Contributor

rchiodo commented Oct 8, 2020

I would like to point out that I'm using vscode with a jupyter notebook server I started myself (I just provide the extension with the URL including a token), and I still lose all my work whenever the SSH extension forces an editor reload. I'm using the Python interactive window

Yes the interactive window does not reuse kernels. It would be weird if it did. Perhaps we can detect the editor reload case but not sure.

@nikitakit
Author

Yes the interactive window does not reuse kernels. It would be weird if it did.

I'm sad to hear this. 😢

As someone who used Hydrogen before, I've seen firsthand the benefits of working with plain python files (that have cells delimited by # %%) over the ipynb format (which works poorly for version control, running as a standalone script, and collaborating with people who don't use Jupyter). At this point I could never go back to using notebooks except for simple solo projects that never outgrow a single file.

@rchiodo
Contributor

rchiodo commented Oct 9, 2020

As someone who used Hydrogen before, I've seen firsthand the benefits of working with plain python files (that have cells delimited by # %%) over the ipynb format (which works poorly for version control, running as a standalone script, and collaborating with people who don't use Jupyter). At this point I could never go back to using notebooks except for simple solo projects that never outgrow a single file.

Did Hydrogen leave the kernel around even after closing a python file? I think we'd have to do some other way of closing/creating kernels then. Some sort of kernel management story outside of what we do today.

@nikitakit
Author

As someone who used Hydrogen before, I've seen firsthand the benefits of working with plain python files (that have cells delimited by # %%) over the ipynb format (which works poorly for version control, running as a standalone script, and collaborating with people who don't use Jupyter). At this point I could never go back to using notebooks except for simple solo projects that never outgrow a single file.

Did Hydrogen leave the kernel around even after closing a python file? I think we'd have to do some other way of closing/creating kernels then. Some sort of kernel management story outside of what we do today.

When connected to an outside notebook server, kernels would remain open until you shut them down manually (or stopped the notebook server itself).

@rchiodo
Contributor

rchiodo commented Oct 9, 2020

How did it know which kernel to use for a file? Did it associate one per python file?

@nikitakit
Author

How did it know which kernel to use for a file? Did it associate one per python file?

I used hydrogen in a kernel-per-file mode. When you first open a file, it starts out with no associated kernel. Then you can run a command to open up a kernel switcher that lets you select what Jupyter server you want and whether you want to connect to an existing kernel or spawn a new one.

I believe hydrogen also had a global setting for having a single active Python kernel per editor window. With this setting enabled, the Python kernel would be reused automatically, but you wouldn't be able to get a second one without opening a new editor window. (The "one kernel" restriction in this mode is actually per programming language, so you could still have a python kernel for python files and then a Julia kernel in addition to that)

@achalddave

achalddave commented Oct 9, 2020

Huh, you're totally right, my bad. I just tried steps 1-4 again with the standard notebook, and even tried quitting vscode entirely, reloading vscode, and toggling my wifi. In all cases the variables I instantiated in step (2) were maintained when I re-opened the notebook in step (4). I'm not really sure what happened the last time I tried this, so I'll keep trying this and report back if it happens again. Thank you for the help!

For completeness (apologies for deviating from the original issue), I'd like to update and say this seems to only work if the kernel was started outside VSCode (e.g., by opening the notebook in the browser once and then attaching to that kernel manually). Even if the jupyter server is started outside of VSCode, but the kernel is started by VSCode when opening a file, VSCode kills the kernel when the tab is closed.

I used hydrogen in a kernel-per-file mode. When you first open a file, it starts out with no associated kernel. Then you can run a command to open up a kernel switcher that lets you select what Jupyter server you want and whether you want to connect to an existing kernel or spawn a new one.

This sounds like an incredibly useful feature. The main reason I'm trying to use notebooks instead of the interactive window is because I need persistence, and I figured notebooks would be easier to make persistent than interactive windows.

However, as far as I can tell, the hacky fix for both notebooks and interactive windows is to start the python kernel (not just the server) outside of VSCode and manually attach to that kernel. For notebooks, this setting will be remembered across sessions. For interactive windows, you'll need to re-attach to that kernel every time the tab or window is closed, but VSCode won't kill the kernel, at least on my version (1.50.0-insider, python extension 2020.9.114305). Not sure if there's an easier way.
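One way to start a kernel outside VS Code without opening the browser UI is the Jupyter server's REST API. The sketch below assumes the standard `POST /api/kernels` endpoint with token authentication via the `Authorization: token ...` header. The function names, base URL, and token are placeholders, and the request is built separately from sending it so the pieces can be inspected first.

```python
import json
import urllib.request


def start_kernel_request(base_url: str, token: str,
                         kernel_name: str = "python3") -> urllib.request.Request:
    """Build (but do not send) a POST /api/kernels request."""
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/kernels",
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


def start_kernel(base_url: str, token: str, kernel_name: str = "python3") -> str:
    """Send the request and return the id of the newly started kernel."""
    request = start_kernel_request(base_url, token, kernel_name)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Calling `start_kernel("http://localhost:8888", "<your token>")` against a running server should return a kernel id that can then be picked from the kernel list when attaching manually. Whether VS Code leaves that kernel alone afterwards depends on the extension version, as described above.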

@DonJayamanne DonJayamanne transferred this issue from microsoft/vscode-python Nov 13, 2020
@rchiodo
Contributor

rchiodo commented Jan 29, 2021

I'm dogfooding today and I find the lack of kernel management disturbing. I close a window and all the kernel data is lost? I hit close by accident and now I have to recreate my state?

I really want a way to recreate my state on next startup. If this requires leaving the kernel running until I close it, that seems okay. Alternatively if the kernel could just be rehydrated, that would be cool too. (Kernel rehydration sounds interesting because you could send it places).

@thakkarparth007

thakkarparth007 commented Feb 10, 2021

The problem with implementing this the way Jupyter does is that there's nothing that would shutdown the server (if we start it). In your example of using the browser, you (or somebody else) started the Jupyter server. You're responsible for shutting it down. In the case where we start the jupyter server (or just the kernel as we do now), we need to close it down at some point.

Does it make sense to offer two options to users:

  1. First one shuts down the jupyter server once the VSCode window exits (while explicitly asking user's permission before closing). However, it might be tricky to handle this in cases such as me remoting to a server via vscode, and my laptop shutting down due to power failure. [This isn't the option I'd prefer].
  2. Second, and the option I prefer is to let the server stay alive even when VSCode window exits. If I'm remoting to a server, this would work perfectly. I can run a notebook for a long running task, close my VSCode window and do something else. When I remote into the server again, I'd want VSCode to connect to that running jupyter server. If this means a VSCode server instance has to keep running on the remote, that sounds fine to me (although not sure if that's required). Even if I'm not remoting, and just wanting to have a long running notebook on my local system, this should still work. When I open VSCode again, I should see the list of running kernels.

Essentially, I'm asking for jupyter-like behaviour, except that in option 2, the user decides when to start and shut down the server, no matter where it's running (local/remote). The user will be responsible for resource management. We're anyway used to doing that with Jupyter, this wouldn't be any different.

And a panel that shows running kernels etc. would help greatly in making this whole thing transparent.
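As a rough illustration of what such a panel would display, a Jupyter server already reports its running kernels. This sketch assumes the standard `GET /api/kernels` endpoint and the usual fields of its kernel model (`id`, `name`, `execution_state`, `connections`). The formatting helper and the sample data are hypothetical.

```python
import json
from typing import List


def summarize_kernels(kernels_json: str) -> List[str]:
    """Turn the body of GET <server>/api/kernels into one line per kernel."""
    lines = []
    for kernel in json.loads(kernels_json):
        lines.append(
            f"{kernel['id']}  {kernel['name']}  "
            f"{kernel.get('execution_state', 'unknown')}  "
            f"connections={kernel.get('connections', 0)}"
        )
    return lines


# Hypothetical response body, trimmed to the fields used above.
sample = json.dumps([
    {"id": "kernel-1", "name": "python3", "execution_state": "busy", "connections": 1},
    {"id": "kernel-2", "name": "python3", "execution_state": "idle", "connections": 0},
])

for line in summarize_kernels(sample):
    print(line)
```

A kernel that is busy with zero connections is exactly the long-running, detached case this issue is about, so a real panel would probably want to highlight it rather than clean it up.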

@zpincus

zpincus commented May 12, 2021

In every other context, vscode seems to go to great lengths to maintain state across reloads / reconnects. So it's very surprising that vscode would nuke its jupyter server / kernels every time the window is reloaded or (if running as a ssh remote) the remote server is disconnected / reconnected. This strikes me as a pretty major misfeature.

For example, if a terminal is open (locally or remote) and you reload the window, the state is maintained. If you disconnect from a remote session and then reconnect, the same terminal window is waiting for you with zero loss of state. This state preservation is one of the best features of the remote tools, and was clearly a design priority.

So it's especially weird that the jupyter notebooks are the opposite, and (by design) lose their state so easily. Conceptually, a notebook session should be no different than a terminal session, and should have keep-alive behavior that matches. I would think it really should be considered a bug for a vscode-started jupyter server and its notebook sessions to end in circumstances that wouldn't end a terminal session.

The problem is of course that there's a way to explicitly terminate a terminal session when the user is done with it, but not so for notebooks. A dialog asking whether vscode should terminate the session when closing a notebook's tab (and maybe when closing the whole window?) would probably be sufficient. It could also help to have vscode explicitly pop up an alert when reconnecting to an existing session (with a "do you want to restart the session" as an option). Again, following the same keep-alive rules that the terminal does for its embedded sessions would be a really good start.

It's good that some of these issues can be worked around by running an external jupyter server. (Of course, there are some bugs with that #5862...) But really there's no reason that servers started by vscode (which are much easier to work with, and play better with the debugger etc.) should not play by the same rules. Trashing user state like this is really a dealbreaker for doing substantial work with notebooks, especially on remote servers.

@hawktang

hawktang commented Jun 22, 2021

I have tested with a manually set-up server, with a password or a token.

The kernel is restarted every time vscode opens or closes the ipynb file.

image

@DanielHabenicht

DanielHabenicht commented Jan 31, 2022

Also want to say I agree.
I am bringing yet another use case: Installing Python Packages (on Windows™).
If the kernel is still running, any interaction with its .dll will end up in:

ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'C:\\Users\\danie\\AppData\\Local\\Programs\\Python\\Python310\\Lib\\site-packages\\pywin32_system32\\pythoncom310.dll'
Consider using the `--user` option or check the permissions.

So reinstalling packages with a pip install -I -r requirements.txt will end up in an error until one closes VSCode or kills the process.

@DonJayamanne DonJayamanne self-assigned this Jan 31, 2022
@greazer greazer added feature-request Request for new features or functionality and removed enhancement labels May 4, 2022
@DonJayamanne
Contributor

DonJayamanne commented Aug 11, 2022

Hi everyone, today you can manage some of the existing kernels (shut them down) using the following extension
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.vscode-jupyter-powertoys

You'll need to open the Jupyter panel to view the running kernels and from there you can shut it down.
Screen Shot 2022-08-12 at 04 36 32
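For kernels on a Jupyter server you run yourself, shutting one down can also be done without any extension. This is a minimal sketch assuming the standard `DELETE /api/kernels/<kernel id>` endpoint and token authentication; the function names and arguments are placeholders, and it says nothing about how the Power Toys extension is implemented.

```python
import urllib.parse
import urllib.request


def shutdown_kernel_request(base_url: str, token: str,
                            kernel_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /api/kernels/<kernel_id> request."""
    quoted_id = urllib.parse.quote(kernel_id, safe="")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/kernels/{quoted_id}",
        method="DELETE",
        headers={"Authorization": f"token {token}"},
    )


def shutdown_kernel(base_url: str, token: str, kernel_id: str) -> int:
    """Send the request and return the HTTP status code."""
    request = shutdown_kernel_request(base_url, token, kernel_id)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A successful shutdown is normally answered with `204 No Content`; an unknown kernel id makes `urlopen` raise `urllib.error.HTTPError`.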

@DonJayamanne DonJayamanne removed their assignment Aug 22, 2022
@cossio

cossio commented Nov 20, 2022

+1 Leaving long computations running in a persistent kernel would be very nice. This is the main reason I use JupyterLab instead of vscode right now.

@DonJayamanne DonJayamanne self-assigned this Nov 20, 2022
@urlicht

This comment was marked as off-topic.

@DonJayamanne

This comment was marked as resolved.

@urlicht

urlicht commented Dec 20, 2022

It'd be awesome to be able to start computation in vscode, move onto a different machine and then be able to have the same kernel open/running on vscode and/or jupyterlab

You'd need a kernel running on a remote server for this to be possible, you might want to try using remote SSH, github codespaces or the like for that.

We already have Jupyter servers on the remote clusters and can run kernels via vscode. The problem is as @cossio mentioned, vscode seems to quit the kernels if I quit the application. With Jupyter Lab, you can quit the browser and then later reconnect back to the same kernel (all computation retained), even from a different machine. My understanding is that's not currently possible on vscode because of this issue?

@dwahdany

dwahdany commented Jan 17, 2023

Thanks a ton @zpincus for the explanation here! Would be absolutely lost without you :). Minor contribution, but wanted to add some even more explicit instruction for anyone reaching this issue without a lot of context on VSCode. Running on VS 1.64.0-insider build on Windows OS.

Steps

* Connect VSCode to remote server via RemoteSSH

* (optional) If you don't have a target jupyter server instance already, launch jupyter notebook from command line to run in background with `nohup jupyter notebook & `. This will ensure that your server persists across VS code sessions by setting up the port forwarding on the remote server as a background process. Copy the generated token seen in `tail nohup.out` (should look something like `http://localhost:8890/?token=be27f057c61ef5a258fc9f1cc989905b9085450e34305f44`).

* Select the icon in the bottom left specifying the jupyter connection (by default "Jupyter Server: Local"), select existing URI, paste in the link you copied from previous step. For better tracking of kernels, you can also look at opening the link in your typical explorer.
  ![image](https://user-images.githubusercontent.com/28548757/147990148-109e5fff-42f9-4460-b2d8-797baf17972a.png)

* Run some commands in your .ipynb file

* Go to select kernel, and specify the session-specific kernel for that notebook in question
  ![image](https://user-images.githubusercontent.com/28548757/147990067-e21098d1-3125-4ab8-aee0-052cbc67b863.png)
  ![image](https://user-images.githubusercontent.com/28548757/147990428-85787a4c-cad3-4a00-9738-56ce74cfcc74.png)

* Reopen notebook/restart VSCode, observe that environment state is persisted across sessions.

Is this tedious to do for every individual notebook one wishes to persist? Yes, absolutely. Is it less tedious than having to rerun remote notebooks anytime your computer snoozes/sleeps? ¯_(ツ)_/¯

Thanks for the detailed response. I'm wondering whether your answer is missing a step. How can one start a new jupyter session from vscode? From my understanding, it's required to start the session from jupyter notebook/lab and only then can you select it in vscode, correct?

Or am I just missing some options to create a new session here?

Edit: For anyone new to vscode and wondering the same thing: the very top option should create a new kernel. (I had another bug that prevented it from working, hence the confusion)

image
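The server URL copied from `nohup.out` in the quoted steps bundles the address and the token together. If the two are needed separately (for example, to call the server's REST API from a script), a small sketch using only the standard library could look like this. The function name is hypothetical and the token in the usage line is a made-up placeholder.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_server_url(url: str) -> Tuple[str, str]:
    """Split 'http://host:port/?token=...' into (base_url, token).

    The token is returned as an empty string when the URL has none.
    """
    parts = urlsplit(url)
    token = parse_qs(parts.query).get("token", [""])[0]
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base_url, token


base_url, token = split_server_url("http://localhost:8890/?token=abc123")
print(base_url, token)
```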

@DonJayamanne DonJayamanne modified the milestone: May 2023 Apr 28, 2023
@MovsisyanM

Mine doesn't even show the list of kernels when I try to connect through the "http://localhost:{PORT}/?token={TOKEN}" existing jupyter instance. I would very much like the feature of jupyter notebook and jupyter lab of exiting the file and still having the kernel do its job (training a model / data processing / et cetera). I am left to modify the code via vscode and run it via jupyter notebook, which is silly.

@DonJayamanne DonJayamanne changed the title Allow for Manual Kernel Management (turn off auto-start and auto-end) Allow for Kernel Management Oct 27, 2023
@DonJayamanne DonJayamanne changed the title Allow for Kernel Management Allow for Kernel Management (shtudown kernels, view running kernels, local and remote) Oct 27, 2023
@DonJayamanne DonJayamanne changed the title Allow for Kernel Management (shtudown kernels, view running kernels, local and remote) Allow for Kernel Management (view and shutdown running kernels both local and remote) Oct 27, 2023
@syo093c

syo093c commented Nov 2, 2023

Here is how I deal with this problem.

  1. Start a jupyter server manually inside tmux
  2. Select that jupyter server in vscode

This way even if you quit vscode, the jupyter server still exists and all the work is still there the next time you link vscode.

@DonJayamanne
Contributor

This way even if you quit vscode, the jupyter server still exists and all the work is still there the next time you link vscode.

@syo093c Please upvote this issue #3998

@do-me

do-me commented Nov 20, 2023

Thanks @syo093c! To make it more concrete:

Tutorial

  1. Open local VS Code and connect to remote server via SSH as usual 
  2. Open a terminal in VS Code and start a new tmux session with tmux new-session -s jupyter_session
  3. Run jupyter server. It will serve jupyter on localhost; the default should be localhost:8888
  4. In VS Code go to the kernel settings and select Select Another Kernel then Existing Jupyter Server and enter localhost:8888 (or any other port it returned in step 3).

image

image

image

That's it. As tmux always keeps the session alive, Jupyter doesn't care about your client shutting down or disconnecting. Note that theoretically more clients could connect to the same session.

If you want to shut down the server eventually just reattach to the tmux session with tmux attach-session -t jupyter_session and kill the server with CTRL+C.
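The tmux steps above can also be scripted. This is a sketch under a few assumptions: `tmux` and `jupyter` are on the PATH of the remote machine, the session name matches the tutorial, and, unlike the interactive steps, the session is started detached (`-d`) with the server command passed directly to `tmux new-session`. The function names are hypothetical.

```python
import subprocess
from typing import List


def tmux_start_command(session: str = "jupyter_session",
                       port: int = 8888) -> List[str]:
    """Build the argv that starts a detached tmux session running Jupyter."""
    server_command = f"jupyter server --port={port}"
    return ["tmux", "new-session", "-d", "-s", session, server_command]


def ensure_server_session(session: str = "jupyter_session",
                          port: int = 8888) -> bool:
    """Start the session unless it already exists.

    Returns True if a new session was started, False if one was found.
    """
    exists = subprocess.run(
        ["tmux", "has-session", "-t", session],
        capture_output=True,
    ).returncode == 0
    if exists:
        return False
    subprocess.run(tmux_start_command(session, port), check=True)
    return True
```

Because the session is detached, the token URL is not printed to your terminal; attaching with tmux attach-session -t jupyter_session shows the server output, and CTRL+C there still stops the server as described above.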

@DonJayamanne
Contributor

DonJayamanne commented Nov 20, 2023

Persistent Jupyter Sessions between VS Code Reloads/re-connects

For those looking at persistent jupyter connections, such as @do-me @syo093c @MovsisyanM
Please could you upvote this issue #3998

I'm sorry for pinging each of you, however the upvotes on issues help us prioritise issues.
Thank you.
