[Bug]: dstack apply detaches from runs with "Can't connect to the remote host" #2049

Open
r4victor opened this issue Dec 3, 2024 · 0 comments
Labels
bug Something isn't working

Collaborator

r4victor commented Dec 3, 2024

Steps to reproduce

dstack marks a run as running as soon as at least one of its jobs is running. If the first job is still pulling its image while another job is already running, dstack apply tries to connect to the first job anyway and, after a timeout, exits with "Can't connect to the remote host".
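
The mismatch can be illustrated with a small sketch. The status names and the helper `run_is_running` are hypothetical and do not reflect dstack's actual data model; the sketch only assumes that a run aggregates per-job statuses in the way described above.

```python
from enum import Enum


class JobStatus(Enum):
    # Hypothetical status names, chosen for illustration only.
    PULLING = "pulling"
    RUNNING = "running"


def run_is_running(job_statuses):
    """Aggregate rule from the description: the run counts as running
    as soon as at least one job is running."""
    return any(status is JobStatus.RUNNING for status in job_statuses)


# Job 0 is still pulling its image, job 1 has already started.
jobs = [JobStatus.PULLING, JobStatus.RUNNING]

run_running = run_is_running(jobs)                 # run-level view
first_job_running = jobs[0] is JobStatus.RUNNING   # what attaching actually needs
```

Here the run-level check passes while the first job is not yet ready, which is the window in which the connection attempt fails.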

  1. Create a two node fleet.
  2. Start a multi-node task. Use a heavy Docker image that has already been pulled on the second instance but not on the first.
  3. Get the error:
Submit a new run? [y/n]: y
little-turtle-1 provisioning completed (running)
Can't connect to the remote host

dstack apply should connect to the first job only when it's running.
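
One way to get that behaviour would be to poll the first job's own status before attaching, instead of relying on the run-level status. The following is a minimal sketch, not dstack's implementation: `wait_until_first_job_running`, the `get_first_job_status` callable, and the "running" status string are all assumptions made for illustration.

```python
import time


def wait_until_first_job_running(get_first_job_status, timeout=60.0,
                                 poll_interval=1.0):
    """Poll a caller-supplied (hypothetical) status getter until it
    reports "running". Returns True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if get_first_job_status() == "running":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Simulated status sequence for the first job: two polls while the
# image is still being pulled, then the job starts.
statuses = iter(["pulling", "pulling", "running"])
ready = wait_until_first_job_running(lambda: next(statuses),
                                     timeout=5.0, poll_interval=0.0)
```

Only when the helper returns True would the client go on to open the connection; a False result could be reported as a provisioning timeout rather than a connection failure.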

Actual behaviour

No response

Expected behaviour

No response

dstack version

master

Server logs

No response

Additional information

No response

r4victor added the bug label Dec 3, 2024