Error: rpc error: code = Unavailable desc = transport is closing, caused by flattenRKEClusterNodes #249

Closed
bizmate opened this issue Sep 18, 2020 · 6 comments · Fixed by #250
Labels: bug (Something isn't working)

bizmate commented Sep 18, 2020

As reported at hashicorp/terraform-provider-aws#15216 (I initially thought it was an AWS provider problem).

Apparently the problem below is linked to #239.


Terraform CLI and Provider Versions

$ terraform -v
Terraform v0.13.2
+ provider registry.terraform.io/hashicorp/aws v3.7.0
+ provider registry.terraform.io/hetznercloud/hcloud v1.21.0
+ provider registry.terraform.io/rancher/rke v1.1.1

Affected resource:

  • resource "rke_cluster" "project_rke"

Terraform Configuration Files

# Create a new RKE cluster
resource "rke_cluster" "project_rke" {
  cluster_name = "rke-cluster"

  nodes {
    address = hcloud_server.project-infra-k8s-servers[1].ipv4_address
    #internal_address = module.nodes.internal_ips[0]
    user    = var.ssh_user
    ssh_key = hcloud_ssh_key.project-infra-key.public_key
    role    = ["controlplane", "etcd"]
  }
  nodes {
    address = hcloud_server.project-infra-k8s-servers[2].ipv4_address
    #internal_address = module.nodes.internal_ips[1]
    user    = var.ssh_user
    ssh_key = hcloud_ssh_key.project-infra-key.public_key
    role    = ["worker"]
  }
  nodes {
    address = hcloud_server.project-infra-k8s-servers[3].ipv4_address
    #internal_address = module.nodes.internal_ips[2]
    user    = var.ssh_user
    ssh_key = hcloud_ssh_key.project-infra-key.public_key
    role    = ["worker"]
  }

}

Debug Output

https://gist.github.com/bizmate/f1c700d4c27f3f87cd297702b2f38bfa

Shorter version below

$ terraform plan 
2020/09/18 03:14:23 [WARN] Log levels other than TRACE are currently unreliable, and are supported only for backward compatibility.
  Use TF_LOG=TRACE to see Terraform's internal logs.
  ----
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

hcloud_ssh_key.proj-infra-key: Refreshing state... [id=2076461]
hcloud_server.proj-infra-k8s-servers["1"]: Refreshing state... [id=7659097]
hcloud_server.proj-infra-k8s-servers["2"]: Refreshing state... [id=7659098]
hcloud_server.proj-infra-k8s-servers["3"]: Refreshing state... [id=7659099]
rke_cluster.proj_rke: Refreshing state... [id=5vbn849-7bc3-445e-94f4-f0126827c5e9]
2020/09/18 03:14:29 [ERROR] eval: *terraform.EvalRefresh, err: rpc error: code = Unavailable desc = transport is closing
2020/09/18 03:14:29 [ERROR] eval: *terraform.EvalSequence, err: rpc error: code = Unavailable desc = transport is closing
aws_route53_zone.rancher: Refreshing state... [id=Z00671vnvbDBWL9FQR]
aws_route53_record.rancher: Refreshing state... [id=Z006711vbnWL9FQR_rancher.user.space_A]

Error: rpc error: code = Unavailable desc = transport is closing

Expected Behavior

terraform plan ends successfully and displays the planned actions.

Actual Behavior

A panic error occurs, even if I comment out the AWS resources in the code.

Steps to Reproduce

terraform init
terraform plan

@rawmind0 (Contributor)

I'm guessing that your rke_cluster deployed fine but you're getting this issue on update. If so, what modification are you trying to make to rke_cluster? Are you removing nodes?

rawmind0 self-assigned this Sep 18, 2020
rawmind0 added the bug (Something isn't working) label Sep 18, 2020

bizmate (Author) commented Sep 18, 2020

Hi @rawmind0,
The cluster did not deploy correctly: as you can see in the code, I was referencing a public key instead of a private key for each node. Before this error happened, I changed the ssh_key lines to something like

ssh_key = file("${var.private_key_path}") 

and then I ran terraform plan. This triggered the error, and now I am blocked: even commenting out the RKE code does not let plan recover from the error. I am not sure why it was triggered, since I used this structure as per the documentation, but now everything is stuck. Any plan I attempt to run ends with this panic line.
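
For anyone following along, a minimal sketch of what the corrected node block might look like, assuming a private_key_path variable that points at the SSH private key file (per the exchange above, ssh_key expects the private key contents, not the public key):

# Hypothetical corrected node block; var.private_key_path is an assumed variable
nodes {
  address = hcloud_server.project-infra-k8s-servers[1].ipv4_address
  user    = var.ssh_user
  # contents of the private key, not the hcloud_ssh_key public_key
  ssh_key = file(var.private_key_path)
  role    = ["controlplane", "etcd"]
}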

rawmind0 (Contributor) commented Sep 18, 2020

Hi @bizmate, I saw clearly where the problem was, but I was trying to figure out how to get into this situation. The issue is happening because, for some reason, rke_state and rke_cluster_yaml are not consistent about the node count: there are more nodes defined in rke_cluster_yaml than in rke_state. Your last message explains how this state can come about.

Submitted PR #250, which should address the issue.
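
For anyone hitting the same crash, a rough way to check for this mismatch yourself (jq is assumed to be available; the attribute names are the ones mentioned above) is to pull the raw state and compare the node entries in the two attributes:

$ terraform state pull | jq '.resources[] | select(.type == "rke_cluster") | .instances[0].attributes | {rke_state, rke_cluster_yaml}'

terraform state pull only reads the stored state and does not invoke the provider, so it works even while plan is panicking.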

bizmate (Author) commented Sep 18, 2020

Hi @rawmind0,
Thank you for looking into this so fast. Could you publish a new tag so that terraform init -upgrade can pick up the new code?

I am still on v1.1.1 even after the PR has been merged.

rawmind0 (Contributor)

Already published, 1.1.2
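
For anyone else picking up the fix, a minimal sketch of pinning the patched release under Terraform 0.13, so that terraform init -upgrade fetches it (the version constraint assumes the fix shipped in 1.1.2, as stated above):

terraform {
  required_providers {
    rke = {
      source  = "rancher/rke"
      # the fix was published in 1.1.2 per the comment above
      version = ">= 1.1.2"
    }
  }
}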

bizmate (Author) commented Sep 18, 2020

@rawmind0 thank you :) I got it installed and can confirm it is indeed working now.
