Provider doesn't work without kubecontext in some clusters #645

Closed
StepanKuksenko opened this issue Oct 11, 2019 · 5 comments
Labels
acknowledged (Issue has undergone initial review and is in our work queue.), bug, needs investigation

Comments


StepanKuksenko commented Oct 11, 2019

Hi there,

We have several Kubernetes clusters in GKE, and the problem reproduces in only one of them, even though the Kubernetes provider is configured the same way for every cluster.

In short, we wanted to create resources in our Kubernetes clusters without relying on a kubecontext, and this works in every cluster except one.

Terraform Version

Terraform v0.12.9

Affected Resource(s)

unknown

Terraform Configuration Files

# collect data from GKE clusters
data "google_client_config" "default" {}

data "google_container_cluster" "cluster1" {
  name       = "${var.cluster1_kube_master["name"]}"
  location   = "${var.zone}"
}

data "google_container_cluster" "cluster2" {
  name       = "${var.cluster2_kube_master["name"]}"
  location   = "${var.zone}"
}

data "google_container_cluster" "cluster3" {
  name       = "${var.cluster3_kube_master["name"]}"
  location   = "${var.zone}"
}


# connect to kubernetes clusters

provider "kubernetes" {
  version = "~> 1.8.1"
  load_config_file = false

  alias = "cluster1"

  host  = "https://${data.google_container_cluster.cluster1.endpoint}"
  token = "${data.google_client_config.default.access_token}"

  cluster_ca_certificate = "${base64decode(data.google_container_cluster.cluster1.master_auth.0.cluster_ca_certificate)}"
}

provider "kubernetes" {
  version = "~> 1.8.1"
  load_config_file = false

  alias = "cluster2"

  host  = "https://${data.google_container_cluster.cluster2.endpoint}"
  token = "${data.google_client_config.default.access_token}"

  cluster_ca_certificate = "${base64decode(data.google_container_cluster.cluster2.master_auth.0.cluster_ca_certificate)}"
}

provider "kubernetes" {
  version = "~> 1.8.1"
  load_config_file = false

  alias = "cluster3"

  host  = "https://${data.google_container_cluster.cluster3.endpoint}"
  token = "${data.google_client_config.default.access_token}"

  cluster_ca_certificate = "${base64decode(data.google_container_cluster.cluster3.master_auth.0.cluster_ca_certificate)}"
}

# create secrets

resource "kubernetes_secret" "cluster1-key" {
  metadata {
    name = "git-ssh"
  }
  provider = kubernetes.cluster1

  data = {
    id_rsa = "${data.vault_generic_secret.key.data["value"]}"
  }

  type = "kubernetes.io/Opaque"
}

resource "kubernetes_secret" "cluster2-key" {
  metadata {
    name = "git-ssh"
  }
  provider = kubernetes.cluster2

  data = {
    id_rsa = "${data.vault_generic_secret.key.data["value"]}"
  }

  type = "kubernetes.io/Opaque"
}

resource "kubernetes_secret" "cluster3-key" {
  metadata {
    name = "git-ssh"
  }
  provider = kubernetes.cluster3

  data = {
    id_rsa = "${data.vault_generic_secret.key.data["value"]}"
  }

  type = "kubernetes.io/Opaque"
}

Debug Output

I can't provide the full debug output because it contains sensitive information.
If you tell me which parts to check, I'll try to provide them.

Panic Output

Error: Get http://localhost/api/v1/namespaces/default/secrets/key: dial tcp [::1]:80: connect: connection refused
Error: Get http://localhost/api/v1/namespaces/default/secrets/key: dial tcp [::1]:80: connect: connection refused
Error: Get http://localhost/api/v1/namespaces/default/secrets/key: dial tcp [::1]:80: connect: connection refused

Expected Behavior

Plan: 0 to add, 0 to change, 0 to destroy.

Actual Behavior

Error: Get http://localhost/api/v1/namespaces/default/secrets/key: dial tcp [::1]:80: connect: connection refused
Error: Get http://localhost/api/v1/namespaces/default/secrets/key: dial tcp [::1]:80: connect: connection refused
Error: Get http://localhost/api/v1/namespaces/default/secrets/key: dial tcp [::1]:80: connect: connection refused

Steps to Reproduce

  1. Create kubernetes_secret resources in multiple Kubernetes clusters using provider aliases.
    We had done this before with a configuration identical to the one above, except without the option load_config_file = false. Resources were created successfully only when the kubecontext was configured to connect to ANY cluster in the same Google project.
    When the kubecontext pointed at a cluster in a different project, we got Error: Unauthorized.
  2. We then added load_config_file = false to each kubernetes provider block to eliminate the use of kubecontext. It works in the other clusters, but in one cluster we get an error.
  3. Try to apply the Terraform plan.
  4. We get the error above, but only in one cluster; all the other clusters work fine without a kubecontext.

Important Factoids

References


mnothic commented Mar 2, 2020

If you downgrade the Kubernetes provider to 1.9.0 the problem goes away; it appears to be caused by provider versions >= 1.10.0.
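
If you want to try that, a minimal sketch of the pin against one of the aliased provider blocks above might look like this (the version constraint follows the suggestion in this comment and hasn't been verified here):

# Pin the provider to the 1.9.x series instead of letting it float to >= 1.10.0
provider "kubernetes" {
  version          = "~> 1.9.0"
  load_config_file = false

  alias = "cluster1"

  host  = "https://${data.google_container_cluster.cluster1.endpoint}"
  token = "${data.google_client_config.default.access_token}"

  cluster_ca_certificate = "${base64decode(data.google_container_cluster.cluster1.master_auth.0.cluster_ca_certificate)}"
}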


aareet commented May 6, 2020

aareet added the acknowledged label May 27, 2020
aareet added the bug label Jul 2, 2020

dak1n1 commented Nov 5, 2020

Related note: here's a test I did today with static configuration of a GKE cluster. There's a chance an environment variable like KUBE_HOST could be interfering with the provider config. #1037 (comment)

Offhand, in this specific issue, I would suspect a problem with the token = "${data.google_client_config.default.access_token}", which appears to be used in all 3 clusters. That's the part that needs further testing, to see whether the same token can be used across multiple clusters. It depends on whether the Google Cloud API is placing that token on each cluster as a service account token.

I do see this usage listed in the google provider's docs, but I'm not sure if that is the right token to actually use here.

The token returned by google_client_config seems to be a Google Cloud API token, which is different from the Kubernetes service account token that the Kubernetes provider is expecting. (Again, this needs verification, because I have not actually tested google_client_config against GKE clusters.)

The example I linked above might provide a work-around in the meantime.
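
For illustration, a fully static provider configuration in the spirit of that test might look like the sketch below. The variable names are hypothetical, and the values would be copied from gcloud or the GKE console rather than read from data sources, so they are known at plan time:

# Hypothetical static configuration: endpoint, CA and token are supplied directly
# (e.g. copied from gcloud output) instead of being read from google_* data sources,
# so they are known at plan time.
variable "cluster1_endpoint" {}
variable "cluster1_ca_certificate" {}
variable "cluster1_token" {}

provider "kubernetes" {
  load_config_file = false

  alias = "cluster1_static"

  host                   = "https://${var.cluster1_endpoint}"
  token                  = "${var.cluster1_token}"
  cluster_ca_certificate = "${base64decode(var.cluster1_ca_certificate)}"
}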


dak1n1 commented Mar 10, 2021

There have been some changes to authentication in version 2.0.2, and some incoming changes to fix #1179 will help in identifying the configuration that is causing this issue. Given that this issue is fairly old, and that we haven't seen any activity by the original poster, I'm thinking we should close it for now. But here is some information that might help:

Most likely, one of the data sources became unknown during the plan phase. When either data.google_client_config* or data.google_container_cluster* are unknown during plan, the provider will initialize using empty credentials, which causes the Error: Get http://localhost/... errors. It's unfortunately a common problem when you have a single apply that modifies an underlying GKE cluster while there are Kubernetes resources defined on it. For that reason, we recommend using two applies where possible, or at least separating the GKE resources from the Kubernetes resources using separate modules.

We do have an example for GKE which demonstrates separating the two into modules. It might provide some guidance about how best to approach these problems, depending on whether you're replacing the cluster or making other modifications to it.
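
A rough sketch of that layout is below; the ./gke and ./kubernetes-config paths and the output names are illustrative assumptions, not the linked example verbatim:

# Root configuration: GKE infrastructure and Kubernetes resources live in separate
# modules, ideally applied in two steps so the cluster attributes are already known
# when the kubernetes provider inside ./kubernetes-config is configured.
module "gke" {
  source = "./gke"                          # creates or looks up the GKE cluster
  name   = var.cluster1_kube_master["name"]
  zone   = var.zone
}

module "kubernetes_config" {
  source                 = "./kubernetes-config"   # configures the kubernetes provider and creates the secrets
  host                   = module.gke.endpoint
  token                  = module.gke.access_token
  cluster_ca_certificate = module.gke.cluster_ca_certificate
}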

I'm going to close this issue for now, since it is something we're tracking in many other issues and upstream. It should resolve with #1179. But feel free to reopen if I've misunderstood the problem or if it's still ongoing even with the GKE infrastructure separated out from the Kubernetes infrastructure.

dak1n1 closed this as completed Mar 10, 2021

ghost commented Apr 9, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

ghost locked as resolved and limited conversation to collaborators Apr 9, 2021