This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Eternal hang at "Waiting for SSH to become available" #68

Closed
derks opened this issue Jan 15, 2014 · 10 comments
@derks

derks commented Jan 15, 2014

I suppose this is more likely an issue with Fog than with vagrant-rackspace, but it should be noted here. Last night half a dozen or so of our builds hung at "Waiting for SSH to become available" and sat in that state all night. In Rackspace I could see that the box was active, but running "vagrant ssh" manually also hung, and so did a plain "ssh [email protected]".

As we are using Vagrant for automated testing, I would have expected Vagrant to time out at some point. Am I missing something, or shouldn't this already be implemented?

Thanks.

@elight

elight commented Jan 21, 2014

@derks I don't suppose you've tried running the Vagrantfile since? Doesn't sound like something likely to be transient but...

@derks
Author

derks commented Jan 21, 2014

@elight Yeah, we've definitely been using it since, and everything is working fine. I imagine that the cloud provider had issues at the time, given that I couldn't even ssh directly to the servers. The primary issue, however, is that I would have expected Vagrant or Fog to time out at some point, but neither ever did.

@elight

elight commented Jan 21, 2014

That cloud provider is us (Rackspace) so 🙀!

Sorry it took me so long to see this!

It may be an issue with excon as well. I'll take a quick look.

@elight

elight commented Jan 21, 2014

@derks For future reference, there's some decent advice here. Otherwise, please do call our support. You should get a human right away!

@derks
Author

derks commented Jan 21, 2014

@elight Hey thanks... I am familiar with Support (I was a racker for 9 years ;). However, I'm not really concerned about why the servers didn't come up as expected. My concern is how Vagrant and the Rackspace plugin behave in this situation. If a server doesn't come up, or Vagrant cannot SSH to it for any reason, the plugin should handle that more gracefully. Having Vagrant wait for a server to come up indefinitely, with no timeout, is not very useful (and can be pretty expensive) in automated environments like the one we use for testing.

Ideally, Vagrant would have timed out at some point waiting for SSH... my automated tests would have failed... and I would have resolved it in the morning just like anything else. But coming in to find that Vagrant was spinning its tires on the company's dime was a bit of a PITA.

Just hoping to improve the way the plugin handles this type of situation. Hopefully it won't be a regular occurrence, but I'd rather have Vagrant tell me that something failed and that it handled that failure, rather than our bank account revealing that something didn't fail and we ended up paying for those 4 dozen test instances that were only supposed to be online for 5-10 minutes each... but were instead all online for 72 hours over the weekend while nobody was looking. ;)

@maxlinc
Contributor

maxlinc commented Jan 22, 2014

I was able to reproduce, or at least create a similar scenario, by building a VM without public_net. Just added this to my Vagrantfile:

    rs.network '00000000-0000-0000-0000-000000000000', :attached => false

It looks like @derks is right. There is a timeout for server creation, but not waiting for SSH:

      # Wait for SSH to become available
      env[:ui].info(I18n.t("vagrant_rackspace.waiting_for_ssh"))
      while true
        # If we're interrupted then just back out
        break if env[:interrupted]
        break if env[:machine].communicate.ready?
        sleep 2
      end
      env[:ui].info(I18n.t("vagrant_rackspace.ready"))
    end
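
For illustration only, a bounded version of that loop could look something like the sketch below. It is plain Ruby and not code from vagrant-rackspace: the `wait_until_ready` helper, the `SSHWaitTimeout` error, and the `timeout`, `interval` and `interrupted` parameters are all hypothetical names chosen for this example.

```ruby
# Hypothetical error raised when the wait exceeds its deadline.
class SSHWaitTimeout < StandardError; end

# Hypothetical helper: polls the given block every `interval` seconds until it
# returns a truthy value, the `interrupted` callable returns true, or
# `timeout` seconds have passed.
def wait_until_ready(timeout:, interval: 2, interrupted: -> { false })
  # Use the monotonic clock so wall-clock adjustments cannot stretch the wait.
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  deadline = clock.call + timeout

  loop do
    # Same two exit conditions as the original loop.
    return :interrupted if interrupted.call
    return :ready if yield

    # New exit condition: give up once the deadline has passed.
    raise SSHWaitTimeout, "SSH not available after #{timeout}s" if clock.call >= deadline

    sleep interval
  end
end
```

Inside the plugin, the block would presumably be `env[:machine].communicate.ready?` and the interrupt check `env[:interrupted]`, with the timeout value left to the user to configure. Raising an error (rather than silently breaking out) lets an automated run fail with a non-zero exit status.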

@elight

elight commented Jan 22, 2014

Fog OpenStack doesn't handle private IPs with respect to SSH. I wonder if the same is true for fog-rackspace.

@krames
Collaborator

krames commented Jan 22, 2014

@elight This issue affects all providers. I just opened up an issue for it the other day. It should be easy enough to fix.

fog/fog#2584

@maxlinc
Contributor

maxlinc commented Jan 22, 2014

I think there are at least three issues, only one of them related to fog/fog#2584:

  1. This issue: ssh retries indefinitely (even w/ a public IP)
  2. No differentiation between retriable errors (like SSH not ready) and clear errors (access denied)
  3. Cannot use with private IP (related to fog/fog#2584, "[openstack] Fog::Compute::Server#sshable? is not working for VM with private IP only")

Vagrant has a retriable helper that makes the first two fairly easy to fix, but the question of what action to take after all retries are exhausted is interesting. Just exiting with a non-zero exit code is the simplest option, but it would require the user/script to handle the error and clean up unwanted servers. "Rolling back" gets more complicated, as you'd need to distinguish between vagrant up and vagrant ssh.
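
As a rough sketch of the first two points, the standalone Ruby below retries only on errors marked as retriable and lets everything else fail immediately, with an optional cleanup hook once the attempt has failed for good. It is a stand-in written for this example and does not use Vagrant's own helper: `NotReadyError`, `AccessDeniedError`, `with_retries` and `connect_or_cleanup` are all hypothetical names.

```ruby
# Hypothetical error classes standing in for "SSH not ready yet" (retriable)
# and "access denied" (a clear error that retrying will not fix).
class NotReadyError < StandardError; end
class AccessDeniedError < StandardError; end

# Runs the block up to `tries` times, retrying only on the error classes
# listed in `on`. Any other error propagates on the first attempt.
def with_retries(tries:, delay: 0, on: [NotReadyError])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *on
    # Out of attempts: re-raise the last retriable error to the caller.
    raise if attempts >= tries
    sleep delay
    retry
  end
end

# Wraps with_retries and calls `cleanup` before re-raising, whether the
# failure was exhausted retries or an immediate non-retriable error.
def connect_or_cleanup(tries:, cleanup:, &connect)
  with_retries(tries: tries, &connect)
rescue StandardError
  cleanup.call
  raise
end
```

Whether the cleanup hook should destroy the server is exactly the open question above: that might be reasonable for vagrant up, but probably not for vagrant ssh against an existing machine, so the sketch leaves the decision to the caller.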

@smashwilson
Collaborator

Alright; @maxlinc is splitting this in two. Let's track discussions there instead.

5 participants