🐛[bug] Kernel status: pending #8976

Closed
rikirolly opened this issue Mar 8, 2024 · 11 comments
@rikirolly

rikirolly commented Mar 8, 2024

Describe the bug

I deployed Determined on an AWS EKS Kubernetes cluster using the Helm chart. I ran the MNIST experiment and it worked perfectly. When I launch a new Jupyter Notebook, the new pod with the determined-container starts correctly and a new window opens showing the Jupyter Notebook page. When I then try to start a new notebook, the kernel status stays in the connecting state.

Reproduction Steps

  1. Deploy Determined on an AWS EKS Kubernetes cluster using the Helm chart
  2. Launch Jupyter
  3. Start a new Notebook

Expected Behavior

The kernel connects and the notebook runs correctly.

Screenshot

image

Environment

  • AWS G5 Instance
  • Kubernetes
  • Chrome, Version 122.0.6261.111 (Official Build) (64-bit)

Additional Context

No response

rikirolly added the bug label Mar 8, 2024
@ioga
Contributor

ioga commented Mar 8, 2024

Hello, this is new and unexpected; we've never seen an issue like this.

Can you please share the task logs after you try to launch a kernel? (Go to Tasks in the det UI, find your notebook, and click the triple dots on the right -> View logs.)

Are you using the default environment image or a custom one?
If default, which version of Determined is this? If custom, does the default image work?

@rikirolly
Author

Hi @ioga,
I am using the original (default) image, with Determined 0.28.1.
This is the log:

<info>    [2024-03-08 19:05:58] || INFO: Scheduling Prova (id: 8dfd2288-b405-417d-810e-dcf7c72b8618.1)
<info>    [2024-03-08 19:05:59] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Waiting for resources. 0 GPUs are available, 1 GPUs required
<info>    [2024-03-08 19:06:00] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Pod should schedule on: machine/roj-fkp9r
<info>    [2024-03-08 19:07:28] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Pod resources allocated.
<info>    [2024-03-08 19:07:28] || INFO: Prova was assigned to an agent
<info>    [2024-03-08 19:07:29] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Pulling image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce"
<info>    [2024-03-08 19:10:24] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Successfully pulled image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce" in 2m55.083s (2m55.083s including waiting)
<info>    [2024-03-08 19:10:24] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Created container determined-init-container
<info>    [2024-03-08 19:10:25] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Started container determined-init-container
<info>    [2024-03-08 19:12:38] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Container image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce" already present on machine
<info>    [2024-03-08 19:12:39] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Created container determined-container
<info>    [2024-03-08 19:12:39] [1d1099dc] Pod cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-810e-dcf7c72b8618.1-superb-salmon: Started container determined-container
<info>    [2024-03-08 19:12:39] [1d1099dc] Resources for Prova have started
<warning> [2024-03-08 19:12:41] [1d1099dc] Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
<info>    [2024-03-08 19:12:41] [1d1099dc] [26] determined: detected 1 gpus
<info>    [2024-03-08 19:12:41] [1d1099dc] [26] determined: detected 1 gpus
<info>    [2024-03-08 19:12:41] [1d1099dc] [26] determined: Running task container on agent_id=ip-172-31-190-126.eu-central-1.compute.internal, hostname=cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-8 with visible GPUs ['GPU-ac319bae-9d4b-405b-9458-026eb40ad7ca']
<>        [2024-03-08 19:12:41] [1d1099dc] + test -f startup-hook.sh
<>        [2024-03-08 19:12:41] [1d1099dc] + set +x
<warning> [2024-03-08 19:12:42] [1d1099dc] [ServerApp] ServerApp.token config is deprecated in 2.0. Use IdentityProvider.token.
<warning> [2024-03-08 19:12:42] [1d1099dc] [ServerApp] A `_jupyter_server_extension_points` function was not found in nbclassic. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<warning> [2024-03-08 19:12:42] [1d1099dc] [ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] jupyter_archive | extension was successfully linked.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] jupyter_server_terminals | extension was successfully linked.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] jupyterlab | extension was successfully linked.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] nbclassic | extension was successfully linked.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] Writing Jupyter server cookie secret to /run/determined/jupyter/runtime/jupyter_cookie_secret
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] notebook_shim | extension was successfully linked.
<warning> [2024-03-08 19:12:42] [1d1099dc] [ServerApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] notebook_shim | extension was successfully loaded.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] jupyter_archive | extension was successfully loaded.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] jupyter_server_terminals | extension was successfully loaded.
<info>    [2024-03-08 19:12:42] [1d1099dc] [LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
<info>    [2024-03-08 19:12:42] [1d1099dc] [LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] jupyterlab | extension was successfully loaded.
<>        [2024-03-08 19:12:42] [1d1099dc]
<>        [2024-03-08 19:12:42] [1d1099dc]   _   _          _      _
<>        [2024-03-08 19:12:42] [1d1099dc]  | | | |_ __  __| |__ _| |_ ___
<>        [2024-03-08 19:12:42] [1d1099dc]  | |_| | '_ \/ _` / _` |  _/ -_)
<>        [2024-03-08 19:12:42] [1d1099dc]   \___/| .__/\__,_\__,_|\__\___|
<>        [2024-03-08 19:12:42] [1d1099dc]        |_|
<>        [2024-03-08 19:12:42] [1d1099dc]
<>        [2024-03-08 19:12:42] [1d1099dc] Read the migration plan to Notebook 7 to learn about the new features and the actions to take if you are using extensions.
<>        [2024-03-08 19:12:42] [1d1099dc]
<>        [2024-03-08 19:12:42] [1d1099dc] https://jupyter-notebook.readthedocs.io/en/latest/migrate_to_notebook7.html
<>        [2024-03-08 19:12:42] [1d1099dc]
<>        [2024-03-08 19:12:42] [1d1099dc] Please note that updating to Notebook 7 might break some of your extensions.
<>        [2024-03-08 19:12:42] [1d1099dc]
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] nbclassic | extension was successfully loaded.
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] Serving notebooks from local directory: /
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] Jupyter Server 2.12.5 is running at:
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] https://cmd-8dfd2288-b405-417d-810e-dcf7c72b8618-0-8dfd2288-b405-417d-8:2936/proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/lab
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp]     https://127.0.0.1:2936/proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/lab
<info>    [2024-03-08 19:12:42] [1d1099dc] [ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
<info>    [2024-03-08 19:12:42] || INFO: Service of Prova is available
<info>    [2024-03-08 19:12:43] [1d1099dc] [ServerApp] 302 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/ (@172.31.38.210) 0.71ms
<warning> [2024-03-08 19:12:46] [1d1099dc] [LabApp] Could not determine jupyterlab build status without nodejs
<info>    [2024-03-08 19:13:43] [1d1099dc] [ServerApp] Creating new notebook in /run/determined/workdir
<info>    [2024-03-08 19:13:43] [1d1099dc] [ServerApp] Writing notebook-signing key to /run/determined/jupyter/data/notebook_secret
<info>    [2024-03-08 19:13:43] [1d1099dc] [ServerApp] Kernel started: 515c94f5-68d2-4adf-aebf-13049c345904
<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73 ([email protected]) 115.84ms referer=None
<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0 ([email protected]) 1.40ms referer=None
<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=b7c8c3bd-14c6-4016-b559-e9ee8be551a9 ([email protected]) 1.11ms referer=None
<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:e51a50d8-2295-4de5-85e8-3be2ea300a73
<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73 ([email protected]) 1.05ms referer=None
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0 ([email protected]) 1.24ms referer=None
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:b7c8c3bd-14c6-4016-b559-e9ee8be551a9
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=b7c8c3bd-14c6-4016-b559-e9ee8be551a9 ([email protected]) 1.18ms referer=None
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:e51a50d8-2295-4de5-85e8-3be2ea300a73
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73 ([email protected]) 1.19ms referer=None
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0
<warning> [2024-03-08 19:13:45] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0 ([email protected]) 1.16ms referer=None
<warning> [2024-03-08 19:13:46] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:b7c8c3bd-14c6-4016-b559-e9ee8be551a9
<warning> [2024-03-08 19:13:46] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=b7c8c3bd-14c6-4016-b559-e9ee8be551a9 ([email protected]) 1.11ms referer=None
<warning> [2024-03-08 19:13:46] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:b7c8c3bd-14c6-4016-b559-e9ee8be551a9
<warning> [2024-03-08 19:13:46] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=b7c8c3bd-14c6-4016-b559-e9ee8be551a9 ([email protected]) 1.55ms referer=None
<warning> [2024-03-08 19:13:46] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:e51a50d8-2295-4de5-85e8-3be2ea300a73
<warning> [2024-03-08 19:13:46] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73 ([email protected]) 1.07ms referer=None
<warning> [2024-03-08 19:13:47] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0
<warning> [2024-03-08 19:13:47] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0 ([email protected]) 1.24ms referer=None
<warning> [2024-03-08 19:13:52] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:b7c8c3bd-14c6-4016-b559-e9ee8be551a9
<warning> [2024-03-08 19:13:52] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=b7c8c3bd-14c6-4016-b559-e9ee8be551a9 ([email protected]) 1.57ms referer=None
<warning> [2024-03-08 19:13:53] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0
<warning> [2024-03-08 19:13:53] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0 ([email protected]) 1.34ms referer=None
<warning> [2024-03-08 19:13:54] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:e51a50d8-2295-4de5-85e8-3be2ea300a73
<warning> [2024-03-08 19:13:54] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73 ([email protected]) 1.45ms referer=None
<warning> [2024-03-08 19:13:57] [1d1099dc] [ServerApp] Replacing stale connection: 515c94f5-68d2-4adf-aebf-13049c345904:b7c8c3bd-14c6-4016-b559-e9ee8be551a9
<warning> [2024-03-08 19:13:57] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=b7c8c3bd-14c6-4016-b559-e9ee8be551a9 ([email protected]) 1.91ms referer=None
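
To make the pattern in a log like this easier to read, the failing kernel-channel requests can be tallied per session. The following is a minimal sketch, not part of Determined or Jupyter: the function name `summarize_channel_requests` and the regular expression are assumptions written for the log format shown above, and the sample lines are excerpts from that log, trimmed after the session_id.

```python
import re
from collections import Counter

# Matches Jupyter Server access-log entries of the form seen in the task log:
#   [ServerApp] <status> GET .../api/kernels/<kernel>/channels?session_id=<session>
CHANNEL_REQUEST = re.compile(
    r"\[ServerApp\] (?P<status>\d{3}) GET \S*/api/kernels/"
    r"(?P<kernel>[0-9a-f-]+)/channels\?session_id=(?P<session>[0-9a-f-]+)"
)


def summarize_channel_requests(log_lines):
    """Count kernel-channel requests per (status, kernel id, session id)."""
    counts = Counter()
    for line in log_lines:
        match = CHANNEL_REQUEST.search(line)
        if match:
            counts[(match["status"], match["kernel"], match["session"])] += 1
    return counts


# Excerpts from the log above, trimmed after the session_id.
sample = [
    "<info>    [2024-03-08 19:13:43] [1d1099dc] [ServerApp] Kernel started: 515c94f5-68d2-4adf-aebf-13049c345904",
    "<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73",
    "<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=fafa4b8f-b3ed-4f7c-a235-03b30e5fa4b0",
    "<warning> [2024-03-08 19:13:44] [1d1099dc] [ServerApp] 400 GET /proxy/8dfd2288-b405-417d-810e-dcf7c72b8618/api/kernels/515c94f5-68d2-4adf-aebf-13049c345904/channels?session_id=e51a50d8-2295-4de5-85e8-3be2ea300a73",
]

for (status, kernel, session), n in sorted(summarize_channel_requests(sample).items()):
    print(f"{status} kernel={kernel[:8]} session={session[:8]} count={n}")
```

In the full log, every request to the channels endpoint is answered with 400 and the same three session ids keep retrying, which matches the kernel never leaving the connecting state in the UI.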

@ioga
Contributor

ioga commented Mar 8, 2024

I've searched the Jupyter GitHub for these "Replacing stale connection" errors, but could not find anything promising there.

A couple ideas:

  1. Could you try running with a newer image instead and see if it changes anything: determinedai/environments:cuda-11.8-pytorch-2.0-gpu-622d512 (e.g. specify it in the UI, or run det notebook start --config environment.image=determinedai/environments:cuda-11.8-pytorch-2.0-gpu-622d512 --config resources.slots=1)
  2. I am curious what response the client gets to these 400 errors. Can you open the web dev tools in the browser on the notebook page, go to the Network tab, find these (or any other) failing requests, and see if there's anything useful there?
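
When inspecting those failing requests, one thing worth checking is whether they carry the WebSocket upgrade headers, since /api/kernels/<id>/channels is the endpoint JupyterLab uses for its kernel WebSocket. Below is a minimal sketch for checking request headers copied from the Network tab against the RFC 6455 opening handshake. The helper name `missing_upgrade_headers` is hypothetical, and nothing in this thread confirms that missing headers are the cause here; it is only one thing to rule out.

```python
# Header values an RFC 6455 opening handshake is expected to contain.
REQUIRED_VALUES = {
    "upgrade": "websocket",
    "connection": "upgrade",
    "sec-websocket-version": "13",
}


def missing_upgrade_headers(headers):
    """Return the handshake headers that are absent or have an unexpected value.

    `headers` is a plain dict of request headers, e.g. copied by hand from the
    browser's Network tab. Names and values are compared case-insensitively.
    """
    seen = {name.lower(): value.lower() for name, value in headers.items()}
    problems = []
    for name, expected in REQUIRED_VALUES.items():
        # "Connection" may list several tokens, e.g. "keep-alive, Upgrade".
        if expected not in seen.get(name, ""):
            problems.append(name)
    if "sec-websocket-key" not in seen:
        problems.append("sec-websocket-key")
    return problems


# Example: a well-formed handshake (the key is the sample nonce from RFC 6455).
good = {
    "Upgrade": "websocket",
    "Connection": "keep-alive, Upgrade",
    "Sec-WebSocket-Version": "13",
    "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
}
print(missing_upgrade_headers(good))
print(missing_upgrade_headers({"Connection": "keep-alive"}))
```

If the browser sends all four headers but the server still answers 400, comparing them with what actually reaches the notebook pod would be the next step; how to capture that depends on the cluster setup and is not covered by this sketch.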

@rikirolly
Author

Same errors:

<info>    [2024-03-08 20:01:36] || INFO: Scheduling JupyterLab (informally-renewed-eft) (id: 9b2fa481-7653-4516-b702-fab98e56cc73.1)
<info>    [2024-03-08 20:01:37] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Waiting for resources. 0 GPUs are available, 1 GPUs required
<info>    [2024-03-08 20:01:38] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Pod should schedule on: machine/roj-tzpq2
<info>    [2024-03-08 20:02:59] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Pod resources allocated.
<info>    [2024-03-08 20:02:59] || INFO: JupyterLab (informally-renewed-eft) was assigned to an agent
<info>    [2024-03-08 20:03:00] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Pulling image "determinedai/environments:cuda-11.8-pytorch-2.0-gpu-622d512"
<info>    [2024-03-08 20:05:51] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Successfully pulled image "determinedai/environments:cuda-11.8-pytorch-2.0-gpu-622d512" in 2m50.68s (2m50.68s including waiting)
<info>    [2024-03-08 20:05:51] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Created container determined-init-container
<info>    [2024-03-08 20:05:52] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Started container determined-init-container
<info>    [2024-03-08 20:06:56] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Container image "determinedai/environments:cuda-11.8-pytorch-2.0-gpu-622d512" already present on machine
<info>    [2024-03-08 20:06:56] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Created container determined-container
<info>    [2024-03-08 20:06:56] [866acf46] Pod cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b702-fab98e56cc73.1-moved-gelding: Started container determined-container
<info>    [2024-03-08 20:06:57] [866acf46] Resources for JupyterLab (informally-renewed-eft) have started
<warning> [2024-03-08 20:06:58] [866acf46] Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
<info>    [2024-03-08 20:06:59] [866acf46] [26] determined: detected 1 gpus
<info>    [2024-03-08 20:06:59] [866acf46] [26] determined: detected 1 gpus
<info>    [2024-03-08 20:06:59] [866acf46] [26] determined: Running task container on agent_id=ip-172-31-141-114.eu-central-1.compute.internal, hostname=cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b with visible GPUs ['GPU-c21aabce-572b-7006-25fa-8fbe58217ec3']
<>        [2024-03-08 20:06:59] [866acf46] + test -f startup-hook.sh
<>        [2024-03-08 20:06:59] [866acf46] + set +x
<warning> [2024-03-08 20:06:59] [866acf46] [ServerApp] ServerApp.token config is deprecated in 2.0. Use IdentityProvider.token.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] Package jupyterlab took 0.0000s to import
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] Package jupyter_archive took 0.0007s to import
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] Package jupyter_server_terminals took 0.0029s to import
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] Package nbclassic took 0.0000s to import
<warning> [2024-03-08 20:06:59] [866acf46] [ServerApp] A `_jupyter_server_extension_points` function was not found in nbclassic. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] Package notebook_shim took 0.0000s to import
<warning> [2024-03-08 20:06:59] [866acf46] [ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] jupyter_archive | extension was successfully linked.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] jupyter_server_terminals | extension was successfully linked.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] jupyterlab | extension was successfully linked.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] nbclassic | extension was successfully linked.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] Writing Jupyter server cookie secret to /run/determined/jupyter/runtime/jupyter_cookie_secret
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] notebook_shim | extension was successfully linked.
<warning> [2024-03-08 20:06:59] [866acf46] [ServerApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] notebook_shim | extension was successfully loaded.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] jupyter_archive | extension was successfully loaded.
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] jupyter_server_terminals | extension was successfully loaded.
<info>    [2024-03-08 20:06:59] [866acf46] [LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.10/site-packages/jupyterlab
<info>    [2024-03-08 20:06:59] [866acf46] [LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
<info>    [2024-03-08 20:06:59] [866acf46] [ServerApp] jupyterlab | extension was successfully loaded.
<>        [2024-03-08 20:07:00] [866acf46]
<>        [2024-03-08 20:07:00] [866acf46]   _   _          _      _
<>        [2024-03-08 20:07:00] [866acf46]  | | | |_ __  __| |__ _| |_ ___
<>        [2024-03-08 20:07:00] [866acf46]  | |_| | '_ \/ _` / _` |  _/ -_)
<>        [2024-03-08 20:07:00] [866acf46]   \___/| .__/\__,_\__,_|\__\___|
<>        [2024-03-08 20:07:00] [866acf46]        |_|
<>        [2024-03-08 20:07:00] [866acf46]
<>        [2024-03-08 20:07:00] [866acf46] Read the migration plan to Notebook 7 to learn about the new features and the actions to take if you are using extensions.
<>        [2024-03-08 20:07:00] [866acf46]
<>        [2024-03-08 20:07:00] [866acf46] https://jupyter-notebook.readthedocs.io/en/latest/migrate_to_notebook7.html
<>        [2024-03-08 20:07:00] [866acf46]
<>        [2024-03-08 20:07:00] [866acf46] Please note that updating to Notebook 7 might break some of your extensions.
<>        [2024-03-08 20:07:00] [866acf46]
<info>    [2024-03-08 20:07:00] [866acf46] [ServerApp] nbclassic | extension was successfully loaded.
<info>    [2024-03-08 20:07:00] [866acf46] [ServerApp] Serving notebooks from local directory: /
<info>    [2024-03-08 20:07:00] [866acf46] [ServerApp] Jupyter Server 2.10.0 is running at:
<info>    [2024-03-08 20:07:00] [866acf46] [ServerApp] https://cmd-9b2fa481-7653-4516-b702-fab98e56cc73-0-9b2fa481-7653-4516-b:3184/proxy/9b2fa481-7653-4516-b702-fab98e56cc73/lab
<info>    [2024-03-08 20:07:00] [866acf46] [ServerApp]     https://127.0.0.1:3184/proxy/9b2fa481-7653-4516-b702-fab98e56cc73/lab
<info>    [2024-03-08 20:07:00] [866acf46] [ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
<info>    [2024-03-08 20:07:00] || INFO: Service of JupyterLab (informally-renewed-eft) is available
<info>    [2024-03-08 20:07:03] [866acf46] [ServerApp] 302 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/ (@172.31.6.102) 0.46ms
<warning> [2024-03-08 20:07:07] [866acf46] [LabApp] Could not determine jupyterlab build status without nodejs
<info>    [2024-03-08 20:07:12] [866acf46] [ServerApp] Creating new notebook in /run/determined/workdir
<info>    [2024-03-08 20:07:12] [866acf46] [ServerApp] Writing notebook-signing key to /run/determined/jupyter/data/notebook_secret
<info>    [2024-03-08 20:07:12] [866acf46] [ServerApp] Kernel started: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7
<warning> [2024-03-08 20:07:13] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=80d9b771-d334-4c4a-b977-049532c2e696 ([email protected]) 115.08ms referer=None
<warning> [2024-03-08 20:07:13] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=401788b7-ccca-404d-b8d9-0da44f049b50 ([email protected]) 1.21ms referer=None
<warning> [2024-03-08 20:07:13] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=08070098-058a-4495-86b6-09697085754d ([email protected]) 1.07ms referer=None
<warning> [2024-03-08 20:07:13] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:80d9b771-d334-4c4a-b977-049532c2e696
<warning> [2024-03-08 20:07:13] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=80d9b771-d334-4c4a-b977-049532c2e696 ([email protected]) 1.40ms referer=None
<warning> [2024-03-08 20:07:14] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:401788b7-ccca-404d-b8d9-0da44f049b50
<warning> [2024-03-08 20:07:14] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=401788b7-ccca-404d-b8d9-0da44f049b50 ([email protected]) 1.03ms referer=None
<warning> [2024-03-08 20:07:14] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:08070098-058a-4495-86b6-09697085754d
<warning> [2024-03-08 20:07:14] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=08070098-058a-4495-86b6-09697085754d ([email protected]) 1.23ms referer=None
<warning> [2024-03-08 20:07:14] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:08070098-058a-4495-86b6-09697085754d
<warning> [2024-03-08 20:07:14] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=08070098-058a-4495-86b6-09697085754d ([email protected]) 1.22ms referer=None
<warning> [2024-03-08 20:07:15] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:80d9b771-d334-4c4a-b977-049532c2e696
<warning> [2024-03-08 20:07:15] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=80d9b771-d334-4c4a-b977-049532c2e696 ([email protected]) 1.18ms referer=None
<warning> [2024-03-08 20:07:15] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:08070098-058a-4495-86b6-09697085754d
<warning> [2024-03-08 20:07:15] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=08070098-058a-4495-86b6-09697085754d ([email protected]) 1.46ms referer=None
<warning> [2024-03-08 20:07:15] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:401788b7-ccca-404d-b8d9-0da44f049b50
<warning> [2024-03-08 20:07:15] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=401788b7-ccca-404d-b8d9-0da44f049b50 ([email protected]) 1.12ms referer=None
<warning> [2024-03-08 20:07:16] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:80d9b771-d334-4c4a-b977-049532c2e696
<warning> [2024-03-08 20:07:16] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=80d9b771-d334-4c4a-b977-049532c2e696 ([email protected]) 1.37ms referer=None
<warning> [2024-03-08 20:07:18] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:401788b7-ccca-404d-b8d9-0da44f049b50
<warning> [2024-03-08 20:07:18] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=401788b7-ccca-404d-b8d9-0da44f049b50 ([email protected]) 1.32ms referer=None
<warning> [2024-03-08 20:07:20] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:401788b7-ccca-404d-b8d9-0da44f049b50
<warning> [2024-03-08 20:07:20] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=401788b7-ccca-404d-b8d9-0da44f049b50 ([email protected]) 1.49ms referer=None
<warning> [2024-03-08 20:07:21] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:08070098-058a-4495-86b6-09697085754d
<warning> [2024-03-08 20:07:21] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=08070098-058a-4495-86b6-09697085754d ([email protected]) 1.30ms referer=None
<warning> [2024-03-08 20:07:24] [866acf46] [ServerApp] Replacing stale connection: 4f653b0b-1c5b-49c3-bf11-987f60be7bd7:80d9b771-d334-4c4a-b977-049532c2e696
<warning> [2024-03-08 20:07:24] [866acf46] [ServerApp] 400 GET /proxy/9b2fa481-7653-4516-b702-fab98e56cc73/api/kernels/4f653b0b-1c5b-49c3-bf11-987f60be7bd7/channels?session_id=80d9b771-d334-4c4a-b977-049532c2e696 ([email protected]) 1.35ms referer=None

In the Chrome client, all requests return 200 except for these few:
image

@rikirolly
Author

I just tried version 0.29.0 and got the same problem. This is the log:

<info>    [2024-03-09 10:24:40] || INFO: Scheduling Prova (id: de2bdeeb-3498-4b64-baac-8749175add8f.1)
<info>    [2024-03-09 10:24:41] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Waiting for resources. 0 GPUs are available, 1 GPUs required
<info>    [2024-03-09 10:24:42] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Pod should schedule on: machine/roj-294pr
<info>    [2024-03-09 10:26:07] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Pod resources allocated.
<info>    [2024-03-09 10:26:07] || INFO: Prova was assigned to an agent
<info>    [2024-03-09 10:26:08] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Pulling image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce"
<info>    [2024-03-09 10:28:46] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Successfully pulled image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce" in 2m37.624s (2m37.624s including waiting)
<info>    [2024-03-09 10:28:46] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Created container determined-init-container
<info>    [2024-03-09 10:28:48] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Started container determined-init-container
<info>    [2024-03-09 10:30:14] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Container image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce" already present on machine
<info>    [2024-03-09 10:30:15] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Created container determined-container
<info>    [2024-03-09 10:30:15] [7e09c140] Pod cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-baac-8749175add8f.1-nearby-bonefish: Started container determined-container
<info>    [2024-03-09 10:30:15] [7e09c140] Resources for Prova have started
<warning> [2024-03-09 10:30:17] [7e09c140] Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
<info>    [2024-03-09 10:30:17] [7e09c140] [26] determined: detected 1 gpus
<info>    [2024-03-09 10:30:17] [7e09c140] [26] determined: detected 1 gpus
<info>    [2024-03-09 10:30:17] [7e09c140] [26] determined: Running task container on agent_id=ip-172-31-145-106.eu-central-1.compute.internal, hostname=cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-b with visible GPUs ['GPU-ac319bae-9d4b-405b-9458-026eb40ad7ca']
<>        [2024-03-09 10:30:17] [7e09c140] + test -f startup-hook.sh
<>        [2024-03-09 10:30:17] [7e09c140] + set +x
<>        [2024-03-09 10:30:18] [7e09c140] Traceback (most recent call last):
<>        [2024-03-09 10:30:18] [7e09c140]   File "/run/determined/jupyter/check_idle.py", line 103, in <module>
<>        [2024-03-09 10:30:18] [7e09c140]     main()
<>        [2024-03-09 10:30:18] [7e09c140]   File "/run/determined/jupyter/check_idle.py", line 78, in main
<>        [2024-03-09 10:30:18] [7e09c140]     utp = authentication.login_with_cache(info.master_url, cert=cert)
<>        [2024-03-09 10:30:18] [7e09c140] NameError: name 'info' is not defined
<warning> [2024-03-09 10:30:18] [7e09c140] [ServerApp] ServerApp.token config is deprecated in 2.0. Use IdentityProvider.token.
<warning> [2024-03-09 10:30:18] [7e09c140] [ServerApp] A `_jupyter_server_extension_points` function was not found in nbclassic. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<warning> [2024-03-09 10:30:18] [7e09c140] [ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] jupyter_archive | extension was successfully linked.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] jupyter_server_terminals | extension was successfully linked.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] jupyterlab | extension was successfully linked.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] nbclassic | extension was successfully linked.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] Writing Jupyter server cookie secret to /run/determined/jupyter/runtime/jupyter_cookie_secret
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] notebook_shim | extension was successfully linked.
<warning> [2024-03-09 10:30:18] [7e09c140] [ServerApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] notebook_shim | extension was successfully loaded.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] jupyter_archive | extension was successfully loaded.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] jupyter_server_terminals | extension was successfully loaded.
<info>    [2024-03-09 10:30:18] [7e09c140] [LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
<info>    [2024-03-09 10:30:18] [7e09c140] [LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] jupyterlab | extension was successfully loaded.
<>        [2024-03-09 10:30:18] [7e09c140]
<>        [2024-03-09 10:30:18] [7e09c140]   _   _          _      _
<>        [2024-03-09 10:30:18] [7e09c140]  | | | |_ __  __| |__ _| |_ ___
<>        [2024-03-09 10:30:18] [7e09c140]  | |_| | '_ \/ _` / _` |  _/ -_)
<>        [2024-03-09 10:30:18] [7e09c140]   \___/| .__/\__,_\__,_|\__\___|
<>        [2024-03-09 10:30:18] [7e09c140]        |_|
<>        [2024-03-09 10:30:18] [7e09c140]
<>        [2024-03-09 10:30:18] [7e09c140] Read the migration plan to Notebook 7 to learn about the new features and the actions to take if you are using extensions.
<>        [2024-03-09 10:30:18] [7e09c140]
<>        [2024-03-09 10:30:18] [7e09c140] https://jupyter-notebook.readthedocs.io/en/latest/migrate_to_notebook7.html
<>        [2024-03-09 10:30:18] [7e09c140]
<>        [2024-03-09 10:30:18] [7e09c140] Please note that updating to Notebook 7 might break some of your extensions.
<>        [2024-03-09 10:30:18] [7e09c140]
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] nbclassic | extension was successfully loaded.
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] Serving notebooks from local directory: /
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] Jupyter Server 2.12.5 is running at:
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] https://cmd-de2bdeeb-3498-4b64-baac-8749175add8f-0-de2bdeeb-3498-4b64-b:3051/proxy/de2bdeeb-3498-4b64-baac-8749175add8f/lab
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp]     https://127.0.0.1:3051/proxy/de2bdeeb-3498-4b64-baac-8749175add8f/lab
<info>    [2024-03-09 10:30:18] [7e09c140] [ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
<info>    [2024-03-09 10:30:18] || INFO: Service of Prova is available
<info>    [2024-03-09 10:30:19] [7e09c140] [ServerApp] 302 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/ (@172.31.34.200) 0.45ms
<warning> [2024-03-09 10:30:50] [7e09c140] [LabApp] Could not determine jupyterlab build status without nodejs
<info>    [2024-03-09 10:31:06] [7e09c140] [ServerApp] Creating new notebook in /run/determined/workdir
<info>    [2024-03-09 10:31:06] [7e09c140] [ServerApp] Writing notebook-signing key to /run/determined/jupyter/data/notebook_secret
<info>    [2024-03-09 10:31:08] [7e09c140] [ServerApp] Kernel started: cda0300d-1a93-481f-862e-b4201e4f92f8
<warning> [2024-03-09 10:31:09] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=31c9a297-d63a-4c12-9707-ce3714ae5213 ([email protected]) 5.82ms referer=None
<warning> [2024-03-09 10:31:10] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=7f601a2d-d4c7-4b46-a74b-4c04903d3534 ([email protected]) 1.17ms referer=None
<warning> [2024-03-09 10:31:11] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=bc7778aa-2576-4232-9997-cc363e81ce0a ([email protected]) 1.12ms referer=None
<warning> [2024-03-09 10:31:12] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:31c9a297-d63a-4c12-9707-ce3714ae5213
<warning> [2024-03-09 10:31:12] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=31c9a297-d63a-4c12-9707-ce3714ae5213 ([email protected]) 1.49ms referer=None
<warning> [2024-03-09 10:31:13] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:7f601a2d-d4c7-4b46-a74b-4c04903d3534
<warning> [2024-03-09 10:31:13] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=7f601a2d-d4c7-4b46-a74b-4c04903d3534 ([email protected]) 1.37ms referer=None
<warning> [2024-03-09 10:31:14] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:bc7778aa-2576-4232-9997-cc363e81ce0a
<warning> [2024-03-09 10:31:14] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=bc7778aa-2576-4232-9997-cc363e81ce0a ([email protected]) 1.54ms referer=None
<warning> [2024-03-09 10:31:15] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:31c9a297-d63a-4c12-9707-ce3714ae5213
<warning> [2024-03-09 10:31:15] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=31c9a297-d63a-4c12-9707-ce3714ae5213 ([email protected]) 1.65ms referer=None
<warning> [2024-03-09 10:31:16] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:7f601a2d-d4c7-4b46-a74b-4c04903d3534
<warning> [2024-03-09 10:31:16] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=7f601a2d-d4c7-4b46-a74b-4c04903d3534 ([email protected]) 1.35ms referer=None
<warning> [2024-03-09 10:31:16] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:bc7778aa-2576-4232-9997-cc363e81ce0a
<warning> [2024-03-09 10:31:16] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=bc7778aa-2576-4232-9997-cc363e81ce0a ([email protected]) 1.12ms referer=None
<warning> [2024-03-09 10:31:16] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:31c9a297-d63a-4c12-9707-ce3714ae5213
<warning> [2024-03-09 10:31:16] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=31c9a297-d63a-4c12-9707-ce3714ae5213 ([email protected]) 1.26ms referer=None
<warning> [2024-03-09 10:31:17] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:bc7778aa-2576-4232-9997-cc363e81ce0a
<warning> [2024-03-09 10:31:17] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=bc7778aa-2576-4232-9997-cc363e81ce0a ([email protected]) 1.24ms referer=None
<warning> [2024-03-09 10:31:18] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:7f601a2d-d4c7-4b46-a74b-4c04903d3534
<warning> [2024-03-09 10:31:18] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=7f601a2d-d4c7-4b46-a74b-4c04903d3534 ([email protected]) 1.32ms referer=None
<warning> [2024-03-09 10:31:19] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:31c9a297-d63a-4c12-9707-ce3714ae5213
<warning> [2024-03-09 10:31:19] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=31c9a297-d63a-4c12-9707-ce3714ae5213 ([email protected]) 1.41ms referer=None
<warning> [2024-03-09 10:31:21] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:bc7778aa-2576-4232-9997-cc363e81ce0a
<warning> [2024-03-09 10:31:21] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=bc7778aa-2576-4232-9997-cc363e81ce0a ([email protected]) 1.57ms referer=None
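
To make the pattern in the log above easier to see, here is a minimal sketch that tallies the Jupyter Server access-log lines by status code and by whether they target the kernel `channels` endpoint. The regular expression and the `summarize` helper are my own assumptions based on the line format shown here, not part of Determined or Jupyter, and the sample lines are abbreviated from the log.

```python
import re
from collections import Counter

# Assumed line shape, taken from the task log above:
#   "... [ServerApp] 400 GET /proxy/<task-id>/api/kernels/<kernel-id>/channels?session_id=... "
ACCESS_RE = re.compile(r"\[ServerApp\] (?P<status>\d{3}) (?P<method>[A-Z]+) (?P<path>\S+)")


def summarize(lines):
    """Count (status, is_kernel_channels_request) pairs across access-log lines."""
    counts = Counter()
    for line in lines:
        match = ACCESS_RE.search(line)
        if match is None:
            continue  # scheduler messages, tracebacks, "Replacing stale connection", ...
        path = match.group("path").split("?", 1)[0]  # drop the query string
        is_channels = "/api/kernels/" in path and path.endswith("/channels")
        counts[(int(match.group("status")), is_channels)] += 1
    return counts


# Abbreviated sample lines (client address field shortened to "(...)").
sample = [
    "<info>    [2024-03-09 10:30:19] [7e09c140] [ServerApp] 302 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/ (...) 0.45ms",
    "<warning> [2024-03-09 10:31:09] [7e09c140] [ServerApp] 400 GET /proxy/de2bdeeb-3498-4b64-baac-8749175add8f/api/kernels/cda0300d-1a93-481f-862e-b4201e4f92f8/channels?session_id=31c9a297-d63a-4c12-9707-ce3714ae5213 (...) 5.82ms referer=None",
    "<warning> [2024-03-09 10:31:12] [7e09c140] [ServerApp] Replacing stale connection: cda0300d-1a93-481f-862e-b4201e4f92f8:31c9a297-d63a-4c12-9707-ce3714ae5213",
]

for (status, is_channels), n in sorted(summarize(sample).items()):
    print(status, "channels" if is_channels else "other", n)
```

Run against the full task log, this should show whether the 400 responses are confined to the `channels` requests, which is what the excerpt above suggests.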

@rikirolly
Author

In my previous experiments I was using Kubernetes 1.29. I have now deployed a new Kubernetes 1.26 cluster, but I get the same problem. I don't know what else to try. Any suggestions?

@rikirolly
Author

How can I solve this error in the log?

<>        [2024-03-09 10:30:17] [7e09c140] + test -f startup-hook.sh
<>        [2024-03-09 10:30:17] [7e09c140] + set +x
<>        [2024-03-09 10:30:18] [7e09c140] Traceback (most recent call last):
<>        [2024-03-09 10:30:18] [7e09c140]   File "/run/determined/jupyter/check_idle.py", line 103, in <module>
<>        [2024-03-09 10:30:18] [7e09c140]     main()
<>        [2024-03-09 10:30:18] [7e09c140]   File "/run/determined/jupyter/check_idle.py", line 78, in main
<>        [2024-03-09 10:30:18] [7e09c140]     utp = authentication.login_with_cache(info.master_url, cert=cert)
<>        [2024-03-09 10:30:18] [7e09c140] NameError: name 'info' is not defined

@rikirolly
Author

I tried the Determined 0.28.0 package with similar results:

<info>    [2024-03-11 08:52:24] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Waiting for resources. 0 GPUs are available, 1 GPUs required
<info>    [2024-03-11 08:52:24] || INFO: Scheduling Prova (id: 5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1)
<info>    [2024-03-11 08:52:25] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Pod should schedule on: machine/roj-kz7bh
<info>    [2024-03-11 08:53:29] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Pod resources allocated.
<info>    [2024-03-11 08:53:29] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Pulling image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce"
<info>    [2024-03-11 08:53:29] || INFO: Prova was assigned to an agent
<info>    [2024-03-11 08:56:44] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Successfully pulled image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce" in 3m15.113373877s (3m15.113394107s including waiting)
<info>    [2024-03-11 08:56:44] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Created container determined-init-container
<info>    [2024-03-11 08:56:46] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Started container determined-init-container
<info>    [2024-03-11 08:57:17] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Container image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-f66cbce" already present on machine
<info>    [2024-03-11 08:57:17] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Created container determined-container
<info>    [2024-03-11 08:57:17] [86b614ab] Pod cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa.1-lasting-lamprey: Started container determined-container
<info>    [2024-03-11 08:57:18] [86b614ab] Resources for Prova have started
<warning> [2024-03-11 08:57:19] [86b614ab] Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
<info>    [2024-03-11 08:57:19] [86b614ab] [25] determined: detected 1 gpus
<info>    [2024-03-11 08:57:19] [86b614ab] [25] determined: detected 1 gpus
<info>    [2024-03-11 08:57:19] [86b614ab] [25] determined: Running task container on agent_id=ip-172-31-170-249.eu-central-1.compute.internal, hostname=cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-b with visible GPUs ['GPU-8ec58339-2eb1-0f1b-9ef6-9cf09479bac2']
<>        [2024-03-11 08:57:20] [86b614ab] + test -f startup-hook.sh
<>        [2024-03-11 08:57:20] [86b614ab] + set +x
<warning> [2024-03-11 08:57:20] [86b614ab] [ServerApp] ServerApp.token config is deprecated in 2.0. Use IdentityProvider.token.
<warning> [2024-03-11 08:57:20] [86b614ab] [ServerApp] A `_jupyter_server_extension_points` function was not found in nbclassic. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<warning> [2024-03-11 08:57:20] [86b614ab] [ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] jupyter_archive | extension was successfully linked.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] jupyter_server_terminals | extension was successfully linked.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] jupyterlab | extension was successfully linked.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] nbclassic | extension was successfully linked.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] Writing Jupyter server cookie secret to /run/determined/jupyter/runtime/jupyter_cookie_secret
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] notebook_shim | extension was successfully linked.
<warning> [2024-03-11 08:57:20] [86b614ab] [ServerApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] notebook_shim | extension was successfully loaded.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] jupyter_archive | extension was successfully loaded.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] jupyter_server_terminals | extension was successfully loaded.
<info>    [2024-03-11 08:57:20] [86b614ab] [LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
<info>    [2024-03-11 08:57:20] [86b614ab] [LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] jupyterlab | extension was successfully loaded.
<>        [2024-03-11 08:57:20] [86b614ab]
<>        [2024-03-11 08:57:20] [86b614ab]   _   _          _      _
<>        [2024-03-11 08:57:20] [86b614ab]  | | | |_ __  __| |__ _| |_ ___
<>        [2024-03-11 08:57:20] [86b614ab]  | |_| | '_ \/ _` / _` |  _/ -_)
<>        [2024-03-11 08:57:20] [86b614ab]   \___/| .__/\__,_\__,_|\__\___|
<>        [2024-03-11 08:57:20] [86b614ab]        |_|
<>        [2024-03-11 08:57:20] [86b614ab]
<>        [2024-03-11 08:57:20] [86b614ab] Read the migration plan to Notebook 7 to learn about the new features and the actions to take if you are using extensions.
<>        [2024-03-11 08:57:20] [86b614ab]
<>        [2024-03-11 08:57:20] [86b614ab] https://jupyter-notebook.readthedocs.io/en/latest/migrate_to_notebook7.html
<>        [2024-03-11 08:57:20] [86b614ab]
<>        [2024-03-11 08:57:20] [86b614ab] Please note that updating to Notebook 7 might break some of your extensions.
<>        [2024-03-11 08:57:20] [86b614ab]
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] nbclassic | extension was successfully loaded.
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] Serving notebooks from local directory: /
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] Jupyter Server 2.12.5 is running at:
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] https://cmd-5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa-0-5b2aef79-8a54-4294-b:2925/proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/lab
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp]     https://127.0.0.1:2925/proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/lab
<info>    [2024-03-11 08:57:20] [86b614ab] [ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
<info>    [2024-03-11 08:57:20] || INFO: Service of Prova is available
<info>    [2024-03-11 08:58:30] [86b614ab] [ServerApp] 302 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/ (@172.31.12.119) 0.39ms
<warning> [2024-03-11 08:58:32] [86b614ab] [LabApp] Could not determine jupyterlab build status without nodejs
<info>    [2024-03-11 08:58:42] [86b614ab] [ServerApp] Creating new notebook in /run/determined/workdir
<info>    [2024-03-11 08:58:42] [86b614ab] [ServerApp] Writing notebook-signing key to /run/determined/jupyter/data/notebook_secret
<info>    [2024-03-11 08:58:42] [86b614ab] [ServerApp] Kernel started: e68cf9c8-7054-46a7-8c8f-e0b047942bc5
<warning> [2024-03-11 08:58:42] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ce6e2615-b30d-4520-bf81-4fd30c70795f ([email protected]) 202.48ms referer=None
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=5875e5fe-9b1e-47ea-a4ba-254b96d591d3 ([email protected]) 1.16ms referer=None
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec ([email protected]) 1.10ms referer=None
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ce6e2615-b30d-4520-bf81-4fd30c70795f
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ce6e2615-b30d-4520-bf81-4fd30c70795f ([email protected]) 1.10ms referer=None
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:5875e5fe-9b1e-47ea-a4ba-254b96d591d3
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=5875e5fe-9b1e-47ea-a4ba-254b96d591d3 ([email protected]) 1.44ms referer=None
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec
<warning> [2024-03-11 08:58:43] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec ([email protected]) 1.08ms referer=None
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ce6e2615-b30d-4520-bf81-4fd30c70795f
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ce6e2615-b30d-4520-bf81-4fd30c70795f ([email protected]) 1.69ms referer=None
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:5875e5fe-9b1e-47ea-a4ba-254b96d591d3
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=5875e5fe-9b1e-47ea-a4ba-254b96d591d3 ([email protected]) 1.35ms referer=None
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec ([email protected]) 1.37ms referer=None
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ce6e2615-b30d-4520-bf81-4fd30c70795f
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ce6e2615-b30d-4520-bf81-4fd30c70795f ([email protected]) 1.17ms referer=None
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:5875e5fe-9b1e-47ea-a4ba-254b96d591d3
<warning> [2024-03-11 08:58:44] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=5875e5fe-9b1e-47ea-a4ba-254b96d591d3 ([email protected]) 1.14ms referer=None
<warning> [2024-03-11 08:58:47] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec
<warning> [2024-03-11 08:58:47] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec ([email protected]) 1.45ms referer=None
<warning> [2024-03-11 08:58:48] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:5875e5fe-9b1e-47ea-a4ba-254b96d591d3
<warning> [2024-03-11 08:58:48] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=5875e5fe-9b1e-47ea-a4ba-254b96d591d3 ([email protected]) 1.63ms referer=None
<warning> [2024-03-11 08:58:50] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ce6e2615-b30d-4520-bf81-4fd30c70795f
<warning> [2024-03-11 08:58:50] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ce6e2615-b30d-4520-bf81-4fd30c70795f ([email protected]) 1.45ms referer=None
<warning> [2024-03-11 08:58:54] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:5875e5fe-9b1e-47ea-a4ba-254b96d591d3
<warning> [2024-03-11 08:58:54] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=5875e5fe-9b1e-47ea-a4ba-254b96d591d3 ([email protected]) 1.43ms referer=None
<warning> [2024-03-11 08:58:56] [86b614ab] [ServerApp] Replacing stale connection: e68cf9c8-7054-46a7-8c8f-e0b047942bc5:ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec
<warning> [2024-03-11 08:58:56] [86b614ab] [ServerApp] 400 GET /proxy/5b2aef79-8a54-4294-bfa3-b21c7a7fa5fa/api/kernels/e68cf9c8-7054-46a7-8c8f-e0b047942bc5/channels?session_id=ba03c600-eb4e-4d3f-bdc6-95a2ee81bfec ([email protected]) 1.58ms referer=None
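
The repeated `400 GET .../channels` lines above are on the kernel channels endpoint, which Jupyter serves over a WebSocket. One possible explanation, which these logs alone do not confirm, is that the request reaches Jupyter Server without the WebSocket upgrade headers, for example because something between the browser and the pod drops them. As far as I know, a Tornado-based server answers 400 to a plain GET on a WebSocket route. The sketch below only illustrates the header check described in RFC 6455; `is_websocket_upgrade` and the two sample header dictionaries are hypothetical names and data for illustration, not Determined or Jupyter code.

```python
def is_websocket_upgrade(headers):
    """Return True if the headers carry the upgrade fields an RFC 6455 handshake needs."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    upgrade = lowered.get("upgrade", "").strip().lower()
    # Connection may list several tokens, e.g. "keep-alive, Upgrade".
    connection = [token.strip().lower() for token in lowered.get("connection", "").split(",")]
    return upgrade == "websocket" and "upgrade" in connection


# Hypothetical examples: what a browser sends, and what might arrive
# if an intermediary removed the upgrade fields.
browser_request = {"Upgrade": "websocket", "Connection": "keep-alive, Upgrade"}
stripped_request = {"Connection": "keep-alive"}

print(is_websocket_upgrade(browser_request))
print(is_websocket_upgrade(stripped_request))
```

Comparing the request headers Chrome sends (in the Network tab) with what arrives at the pod would show whether this is what happens here.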

@rikirolly
Author

And this is the log with Determined version 0.27.0:

<info>    [2024-03-11 09:09:01] || INFO: Scheduling Prova (id: 4964624c-6c58-42a1-b702-7171b303155d.1)
<info>    [2024-03-11 09:09:02] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Pod resources allocated.
<info>    [2024-03-11 09:09:02] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Pulling image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-622d512"
<info>    [2024-03-11 09:09:02] || INFO: Prova was assigned to an agent
<info>    [2024-03-11 09:12:13] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Successfully pulled image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-622d512" in 3m10.984993682s (3m10.985002432s including waiting)
<info>    [2024-03-11 09:12:13] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Created container determined-init-container
<info>    [2024-03-11 09:12:18] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Started container determined-init-container
<info>    [2024-03-11 09:12:35] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Container image "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-622d512" already present on machine
<info>    [2024-03-11 09:12:35] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Created container determined-container
<info>    [2024-03-11 09:12:35] [f48b4ab6] Pod cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b702-7171b303155d.1-related-locust: Started container determined-container
<info>    [2024-03-11 09:12:36] [f48b4ab6] Resources for Prova have started
<warning> [2024-03-11 09:12:37] [f48b4ab6] Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
<info>    [2024-03-11 09:12:38] [f48b4ab6] [25] determined: detected 1 gpus
<info>    [2024-03-11 09:12:38] [f48b4ab6] [25] determined: detected 1 gpus
<info>    [2024-03-11 09:12:38] [f48b4ab6] [25] determined: Running task container on agent_id=ip-172-31-170-249.eu-central-1.compute.internal, hostname=cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b with visible GPUs ['GPU-8ec58339-2eb1-0f1b-9ef6-9cf09479bac2']
<>        [2024-03-11 09:12:38] [f48b4ab6] + test -f startup-hook.sh
<>        [2024-03-11 09:12:38] [f48b4ab6] + set +x
<warning> [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] ServerApp.token config is deprecated in 2.0. Use IdentityProvider.token.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] Package jupyterlab took 0.0000s to import
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] Package jupyter_archive took 0.0007s to import
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] Package jupyter_server_terminals took 0.0032s to import
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] Package nbclassic took 0.0000s to import
<warning> [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] A `_jupyter_server_extension_points` function was not found in nbclassic. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] Package notebook_shim took 0.0000s to import
<warning> [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] A `_jupyter_server_extension_points` function was not found in notebook_shim. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] jupyter_archive | extension was successfully linked.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] jupyter_server_terminals | extension was successfully linked.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] jupyterlab | extension was successfully linked.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] nbclassic | extension was successfully linked.
<info>    [2024-03-11 09:12:38] [f48b4ab6] [ServerApp] Writing Jupyter server cookie secret to /run/determined/jupyter/runtime/jupyter_cookie_secret
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] notebook_shim | extension was successfully linked.
<warning> [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] notebook_shim | extension was successfully loaded.
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] jupyter_archive | extension was successfully loaded.
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] jupyter_server_terminals | extension was successfully loaded.
<info>    [2024-03-11 09:12:39] [f48b4ab6] [LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
<info>    [2024-03-11 09:12:39] [f48b4ab6] [LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] jupyterlab | extension was successfully loaded.
<>        [2024-03-11 09:12:39] [f48b4ab6]
<>        [2024-03-11 09:12:39] [f48b4ab6]   _   _          _      _
<>        [2024-03-11 09:12:39] [f48b4ab6]  | | | |_ __  __| |__ _| |_ ___
<>        [2024-03-11 09:12:39] [f48b4ab6]  | |_| | '_ \/ _` / _` |  _/ -_)
<>        [2024-03-11 09:12:39] [f48b4ab6]   \___/| .__/\__,_\__,_|\__\___|
<>        [2024-03-11 09:12:39] [f48b4ab6]        |_|
<>        [2024-03-11 09:12:39] [f48b4ab6]
<>        [2024-03-11 09:12:39] [f48b4ab6] Read the migration plan to Notebook 7 to learn about the new features and the actions to take if you are using extensions.
<>        [2024-03-11 09:12:39] [f48b4ab6]
<>        [2024-03-11 09:12:39] [f48b4ab6] https://jupyter-notebook.readthedocs.io/en/latest/migrate_to_notebook7.html
<>        [2024-03-11 09:12:39] [f48b4ab6]
<>        [2024-03-11 09:12:39] [f48b4ab6] Please note that updating to Notebook 7 might break some of your extensions.
<>        [2024-03-11 09:12:39] [f48b4ab6]
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] nbclassic | extension was successfully loaded.
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] Serving notebooks from local directory: /run/determined/workdir
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] Jupyter Server 2.10.0 is running at:
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] https://cmd-4964624c-6c58-42a1-b702-7171b303155d-0-4964624c-6c58-42a1-b:2993/proxy/4964624c-6c58-42a1-b702-7171b303155d/lab
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp]     https://127.0.0.1:2993/proxy/4964624c-6c58-42a1-b702-7171b303155d/lab
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
<info>    [2024-03-11 09:12:39] || INFO: Service of Prova is available
<info>    [2024-03-11 09:12:39] [f48b4ab6] [ServerApp] 302 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/ (@172.31.38.32) 0.36ms
<warning> [2024-03-11 09:12:42] [f48b4ab6] [LabApp] Could not determine jupyterlab build status without nodejs
<info>    [2024-03-11 09:12:55] [f48b4ab6] [ServerApp] Creating new notebook in
<info>    [2024-03-11 09:12:55] [f48b4ab6] [ServerApp] Writing notebook-signing key to /run/determined/jupyter/data/notebook_secret
<info>    [2024-03-11 09:12:55] [f48b4ab6] [ServerApp] Kernel started: 68808012-fad0-4d23-b284-074c886335b7
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 202.76ms referer=None
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.23ms referer=None
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.29ms referer=None
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.22ms referer=None
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.21ms referer=None
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.15ms referer=None
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:12:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.88ms referer=None
<warning> [2024-03-11 09:12:57] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:12:57] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.77ms referer=None
<warning> [2024-03-11 09:12:57] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:12:57] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.35ms referer=None
<warning> [2024-03-11 09:12:58] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:12:58] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.44ms referer=None
<warning> [2024-03-11 09:12:58] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:12:58] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.35ms referer=None
<warning> [2024-03-11 09:12:58] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:12:58] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.27ms referer=None
<warning> [2024-03-11 09:13:02] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:13:02] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.41ms referer=None
<warning> [2024-03-11 09:13:03] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:13:03] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.29ms referer=None
<warning> [2024-03-11 09:13:07] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:13:07] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.42ms referer=None
<warning> [2024-03-11 09:13:09] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:13:09] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.46ms referer=None
<warning> [2024-03-11 09:13:12] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:13:12] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.42ms referer=None
<warning> [2024-03-11 09:13:18] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:13:18] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.40ms referer=None
<warning> [2024-03-11 09:13:23] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:13:23] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.44ms referer=None
<warning> [2024-03-11 09:13:26] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:13:26] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 1.43ms referer=None
<warning> [2024-03-11 09:13:42] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:13:42] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.42ms referer=None
<warning> [2024-03-11 09:13:53] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:65879399-f004-48c8-ad22-98740f318a5b
<warning> [2024-03-11 09:13:53] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=65879399-f004-48c8-ad22-98740f318a5b ([email protected]) 1.43ms referer=None
<warning> [2024-03-11 09:13:56] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:ecbce716-cd13-4654-8e80-8344dca7fb0e
<warning> [2024-03-11 09:13:56] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=ecbce716-cd13-4654-8e80-8344dca7fb0e ([email protected]) 2.49ms referer=None
<warning> [2024-03-11 09:13:57] [f48b4ab6] [ServerApp] Replacing stale connection: 68808012-fad0-4d23-b284-074c886335b7:09b9ce3e-35de-48bc-9656-c77f73463087
<warning> [2024-03-11 09:13:57] [f48b4ab6] [ServerApp] 400 GET /proxy/4964624c-6c58-42a1-b702-7171b303155d/api/kernels/68808012-fad0-4d23-b284-074c886335b7/channels?session_id=09b9ce3e-35de-48bc-9656-c77f73463087 ([email protected]) 1.42ms referer=None

@rikirolly
Author

I first confirmed that notebooks work correctly when accessed directly through the Determined LoadBalancer, without HTTPS. That pointed to my NGINX Ingress as the source of the problem. I solved the issue by annotating the Ingress to enable WebSocket connections:
nginx.org/websocket-services: determined-master-service-determined
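
For anyone hitting the same repeated `400 GET .../api/kernels/<id>/channels` warnings, here is a rough sketch of where that annotation might sit in an Ingress manifest. Only the annotation key and the Service name come from this thread. The Ingress name, host, `ingressClassName`, path, and port are assumptions for illustration and will differ per cluster. The `nginx.org/` prefix applies to the NGINX Ingress Controller maintained by NGINX/F5; the community `ingress-nginx` controller uses a different annotation prefix, so check which one is installed.

```yaml
# Sketch only: name, host, ingressClassName, path, and port are assumptions,
# not values taken from this issue.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: determined-ingress              # hypothetical name
  annotations:
    # Annotation from the comment above: lists the Services whose
    # connections should be proxied as WebSockets.
    nginx.org/websocket-services: "determined-master-service-determined"
spec:
  ingressClassName: nginx               # assumption: depends on the controller install
  rules:
    - host: determined.example.com      # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: determined-master-service-determined
                port:
                  number: 8080          # assumption: use the master Service's actual port
```

After applying a change like this, the kernel's `/channels` requests should be upgraded to WebSocket connections instead of being rejected with 400.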

@ioga
Contributor

ioga commented Mar 25, 2024

great, thanks for the update.

@ioga ioga closed this as completed Mar 25, 2024