
terraform.EvalValidateResource Errors: [connection is shut down] #456

Closed
Koleon opened this issue Apr 6, 2018 · 5 comments
Labels: bug (Type: Bug), crash (Impact: Crash)


Koleon commented Apr 6, 2018

Hello there,
I deploy various Linux distros and Windows machines from VMware templates, about 25 machines in total. Unfortunately the main.tf is pretty complex and I cannot reproduce the panic with a smaller main.tf (it simply does not fail at all).

Terraform Version

Terraform v0.11.5
+ provider.vsphere v1.3.3

Terraform Configuration Files

Crash Output

https://gist.github.com/Koleon/82bbbea2d787799397ea116bef7e8ea6

Expected Behavior

All VMs are deployed without any error.

Actual Behavior

A few of the VMs are deployed; some of them are not.

Steps to Reproduce

  1. terraform init
  2. terraform apply

Additional Context

References

catsby added the bug (Type: Bug) label Apr 6, 2018

Koleon commented Apr 9, 2018

Bump, is there any way I can help debug this issue? Does anyone have an idea what might be wrong, please?
EDIT: The issue seems to be operating-system related (Ubuntu 16.04). After my colleague cloned my repo, it worked like a charm on Fedora 27.

vancluever added the crash (Impact: Crash) label Apr 10, 2018
vancluever (Contributor) commented

@Koleon we can add a nil check for the crash, but we would issue an error in that case anyway, and without an easy way to reproduce the situation, I'm not 100% inclined to, as we usually like to add tests for things. I understand it's large, but do you mind sharing your config?

The line you quoted is a situation that does not seem to come up during normal testing operations (DVSManagerLookupDvPortGroup returns errors when it can't find a port group or encounters another error), and we test on Ubuntu 16.04 as well.

Thanks!


Koleon commented Apr 11, 2018

@vancluever thanks for your reply and effort to help out. Just to be sure, I installed a fresh Ubuntu 16.04 and ran Terraform v0.11.7 once again; unfortunately, it failed. So please see the attachment - terraform-cycz.tar.gz

We have 2 workspaces - default and bt1; lately there might be a bt2. The Terraform files are divided into 2 groups: files related to bt{1-2} start with wks_bt_{dmz, srv, ops, usr}.tf, and wks_default_{global, main}.tf relate to the default workspace. Most of the remaining files are hopefully self-explanatory. The crash happens only during the bt1 deploy.
Thank you for your cooperation!

vancluever added a commit that referenced this issue Apr 12, 2018
I'm not too sure how this can happen exactly, but we have gotten a crash
report on it (#456). Every invocation of DVSManagerLookupDvPortGroup
that I can see should return some sort of fault; however, the report here
indicates that under some scenarios, a nil result can be returned
instead with no fault.

This patch handles this, wrapping the case in a specific error. Based on
the results of this (we will probably have to rely on user feedback to
see exactly what kinds of scenarios this can happen under), we might
handle the wrapped error higher up the stack and possibly ignore it,
depending on whether or not this is coming from cloned templates
immediately post-refresh.
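
For illustration only, here is a minimal sketch of the kind of nil-result guard the commit describes, assuming govmomi's generated SOAP wrappers (methods.DVSManagerLookupDvPortGroup and its request/response types). The helper name, signature, and error wording below are hypothetical, not the provider's actual patch:

```go
// Hypothetical sketch of the nil-result guard described in the commit
// message above. The helper name and error text are illustrative only.
package dvportgroup

import (
	"context"
	"fmt"

	"github.com/vmware/govmomi/vim25"
	"github.com/vmware/govmomi/vim25/methods"
	"github.com/vmware/govmomi/vim25/types"
)

// lookupByKey resolves a portgroup managed object reference from a DVS
// UUID and a portgroup key via the DVSManagerLookupDvPortGroup API method.
func lookupByKey(ctx context.Context, c *vim25.Client, dvsUUID, pgKey string) (*types.ManagedObjectReference, error) {
	req := types.DVSManagerLookupDvPortGroup{
		This:         *c.ServiceContent.DvSwitchManager,
		SwitchUuid:   dvsUUID,
		PortgroupKey: pgKey,
	}

	resp, err := methods.DVSManagerLookupDvPortGroup(ctx, c, &req)
	if err != nil {
		// The expected failure mode: the endpoint returns a fault.
		return nil, err
	}

	// The crash reported in this issue happened when the call succeeded
	// but the result was nil; wrap that case in a specific error instead
	// of letting callers dereference a nil reference.
	if resp.Returns == nil {
		return nil, fmt.Errorf("portgroup lookup returned a nil result for DVS UUID %q and portgroup key %q", dvsUUID, pgKey)
	}

	return resp.Returns, nil
}
```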
vancluever (Contributor) commented

Hey @Koleon, there is a fix now for this in #471. If you have the ability to do so, would you please try your issue again with a custom-built provider binary against the branch in the PR to see if that resolves the issue for you? This can help us confirm that the issue will be fixed in the next release.

If you still run into the error (there's a good chance you will, but this time without a crash), the new error message should tell you the failing DVS UUID and portgroup. If you get valid values for these, can you try looking them up using the DVSManagerLookupDvPortGroup function in the MOB? The URL for the MO would be https://yourvcenterserver.local/mob/?moid=DVSManager (where yourvcenterserver.local is your vCenter server). PS: More info on the MOB here.

Thanks!
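
As a side note, the same lookup the MOB performs can also be scripted with govmomi; a rough, self-contained sketch follows, where the vCenter URL, credentials, and the UUID/key values are placeholders to be replaced with the ones from the new error message:

```go
// Hypothetical standalone reproduction of the MOB lookup described above.
// All connection details and lookup values below are placeholders.
package main

import (
	"context"
	"fmt"
	"log"
	"net/url"

	"github.com/vmware/govmomi"
	"github.com/vmware/govmomi/vim25/methods"
	"github.com/vmware/govmomi/vim25/types"
)

func main() {
	ctx := context.Background()

	// Placeholder vCenter endpoint and credentials.
	u, err := url.Parse("https://user:password@yourvcenterserver.local/sdk")
	if err != nil {
		log.Fatal(err)
	}

	// true = skip TLS certificate verification (lab use only).
	c, err := govmomi.NewClient(ctx, u, true)
	if err != nil {
		log.Fatal(err)
	}

	req := types.DVSManagerLookupDvPortGroup{
		This:         *c.ServiceContent.DvSwitchManager,
		SwitchUuid:   "00 11 22 33 44 55 66 77-88 99 aa bb cc dd ee ff", // placeholder: UUID from the error message
		PortgroupKey: "dvportgroup-123",                                 // placeholder: key from the error message
	}

	resp, err := methods.DVSManagerLookupDvPortGroup(ctx, c.Client, &req)
	if err != nil {
		log.Fatal("lookup fault: ", err)
	}
	fmt.Printf("lookup result: %+v\n", resp.Returns)
}
```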

vancluever (Contributor) commented

Hey @Koleon, this fix is now in master. I'm going to close this now, but if you have a chance, can you give it a go and open a new issue with the results? It should no longer crash, and if it does, it will probably be due to another issue.

Thanks!

ghost locked and limited conversation to collaborators Apr 19, 2020