
Can't use helm provider v1 with plan & apply on different machines #416

Closed
jungopro opened this issue Feb 20, 2020 · 7 comments · Fixed by #466

Comments

@jungopro

Hello

There is an issue with running plan & apply on different build agents in a CI/CD pipeline. It is documented here

The home key in the provider block worked fine in helm provider versions < 1.0, for example:

provider "helm" {
  debug           = true
  version         = "~> 0.10"
  namespace       = "kube-system"
  service_account = kubernetes_service_account.tiller_sa.metadata.0.name
  home            = "${abspath(path.root)}/.helm"

  kubernetes {
    host                   = azurerm_kubernetes_cluster.aks.kube_config.0.host
    client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.cluster_ca_certificate)
    load_config_file       = false
  }
}

However, after upgrading the provider to version 1.0, the home key is no longer valid:

terraform plan
Acquiring state lock. This may take a few moments...

Error: Unsupported argument

  on init.tf line 40, in provider "helm":
  40:   home            = "${abspath(path.root)}/.helm"

An argument named "home" is not expected here.

This key can be removed and terraform will work fine as long as plan & apply are both done on the same machine (e.g. a developer's laptop).
But when running plan and apply on different machines (for example, during a CI/CD process) this breaks, since the /.helm folder doesn't exist on the apply agent, causing the deployment of the helm release to fail. For example:

# plan phase

2020-02-20T11:55:00.2426615Z   # helm_release.phippyandfriends["parrot"] will be created
2020-02-20T11:55:00.2427141Z   + resource "helm_release" "phippyandfriends" {
2020-02-20T11:55:00.2427646Z       + atomic                = false
2020-02-20T11:55:00.2428208Z       + chart                 = "parrot"
2020-02-20T11:55:00.2428911Z       + cleanup_on_fail       = false
2020-02-20T11:55:00.2429613Z       + dependency_update     = false
2020-02-20T11:55:00.2430106Z       + disable_crd_hooks     = false
2020-02-20T11:55:00.2430586Z       + disable_webhooks      = false
2020-02-20T11:55:00.2431122Z       + force_update          = false
2020-02-20T11:55:00.2432099Z       + id                    = (known after apply)
2020-02-20T11:55:00.2432645Z       + max_history           = 0
2020-02-20T11:55:00.2436642Z       + metadata              = (known after apply)
2020-02-20T11:55:00.2437794Z       + name                  = "parrot"
2020-02-20T11:55:00.2440461Z       + namespace             = "phippyandfriends"
2020-02-20T11:55:00.2449192Z       + recreate_pods         = false
2020-02-20T11:55:00.2450215Z       + render_subchart_notes = true
2020-02-20T11:55:00.2450894Z       + replace               = false
2020-02-20T11:55:00.2451419Z       + repository            = "***"
2020-02-20T11:55:00.2451948Z       + reset_values          = false
2020-02-20T11:55:00.2452482Z       + reuse_values          = false
2020-02-20T11:55:00.2453060Z       + skip_crds             = false
2020-02-20T11:55:00.2453776Z       + status                = "deployed"
2020-02-20T11:55:00.2454491Z       + timeout               = 300
2020-02-20T11:55:00.2454986Z       + verify                = false
2020-02-20T11:55:00.2455494Z       + version               = "v0.5.0"
2020-02-20T11:55:00.2456021Z       + wait                  = true
2020-02-20T11:55:00.2456689Z 
2020-02-20T11:55:00.2468851Z       + set {
2020-02-20T11:55:00.2469468Z           + name  = "image.repository"
2020-02-20T11:55:00.2470036Z           + value = "***.azurecr.io/parrot"
2020-02-20T11:55:00.2470298Z         }
2020-02-20T11:55:00.2470677Z       + set {
2020-02-20T11:55:00.2471118Z           + name  = "ingress.alias"
2020-02-20T11:55:00.2472657Z           + value = "phippyandfriends.dvps.***.guru"
2020-02-20T11:55:00.2473543Z         }
2020-02-20T11:55:00.2474072Z       + set {
2020-02-20T11:55:00.2474823Z           + name  = "ingress.basedomain"
2020-02-20T11:55:00.2475892Z           + value = (known after apply)
2020-02-20T11:55:00.2476369Z         }
2020-02-20T11:55:00.2476593Z     }
...
2020-02-20T11:55:00.2559391Z Plan: 13 to add, 0 to change, 0 to destroy.
2020-02-20T11:55:00.2559598Z 
2020-02-20T11:55:00.2559986Z ------------------------------------------------------------------------
2020-02-20T11:55:00.2560232Z 
2020-02-20T11:55:00.2560652Z This plan was saved to: 503-dvps.plan
2020-02-20T11:55:00.2560853Z 
2020-02-20T11:55:00.2561070Z To perform exactly these actions, run the following command to apply:
2020-02-20T11:55:00.2561531Z     terraform apply "503-dvps.plan"
2020-02-20T11:55:00.2561765Z 
2020-02-20T11:55:00.3165350Z ##[section]Finishing: Terraform Dry Run (Plan)

Failure in the apply phase:

# apply phase

2020-02-20T12:21:11.9461719Z Error: repo *** not found
2020-02-20T12:21:11.9461981Z 
2020-02-20T12:21:11.9462420Z   on main.tf line 151, in resource "helm_release" "phippyandfriends":
2020-02-20T12:21:11.9462846Z  151: resource "helm_release" "phippyandfriends" {
2020-02-20T12:21:11.9463198Z 
... (the same "repo *** not found" error is repeated for each helm_release instance)
2020-02-20T12:21:12.0553033Z ##[error]Bash exited with code '1'.
2020-02-20T12:21:12.0564378Z ##[section]Finishing: Deploy (Terraform Apply)

Please note that this is not a problem with the configuration itself:

  • it works on a dev machine, where both plan and apply run on the same machine
  • it works with helm provider < 1.0, meaning there is no problem in the actual terraform code
  • reverting back to the old provider version fixes the problem without any other changes

How can I accomplish the same scenario (plan & apply on separate machines) using the new provider version (and finally remove tiller 😄 )?

Omer

@nickrichardson-presto

I'm experiencing the same issue when running plan and then apply in what are essentially separate Docker containers in my CI/CD pipeline:

Error: failed to download "traefik/traefik" (hint: running helm repo update may help)

(Even though it works locally)

@jrhouston
Contributor

jrhouston commented Mar 13, 2020

I was able to reproduce this problem locally with the following steps using this example:

provider "helm" {}

data "helm_repository" "stable" {
   name = "stable"
   url = "https://kubernetes-charts.storage.googleapis.com"
}

resource "helm_release" "example" {
   name = "example"
   repository = "stable"
   chart = "postgresql"
}
  1. terraform apply
  2. Make an edit to the release
  3. terraform plan -out tf.plan
  4. helm repo remove stable
  5. terraform apply tf.plan 💣 🔥

I was able to work around this by removing the helm_repository data source and configuring the release to explicitly use the URL of the repository, and the above steps succeeded:

provider "helm" {}

resource "helm_release" "example" {
   name = "example"
   repository = "https://kubernetes-charts.storage.googleapis.com"
   chart = "postgresql"
}

The problem here seems to be that the helm_repository data source is writing state to the file system here, where it creates an entry in helm's repositories.yaml file. When it comes to doing an install, helm expects an entry for the repository name to be in this file, and throws the failed to download error seen above if no entry exists for that name.

So when we output a plan and try to run it on a fresh machine, the helm_repository data source never gets refreshed and therefore the repo entry doesn't get created, causing the apply to fail. This somewhat calls into question the legitimacy of helm_repository as a data source, because it is in fact creating a piece of state outside of terraform that another resource depends on, rather than just querying for information.

There are a few paths forward here:

  1. Deprecate the helm_repository data source entirely, and do the repository configuration at the helm_release level (see the sketch after this list). I think the intent behind helm_repository is that you only have to configure the repo and its auth credentials once and re-use it, so this will create a bunch of repetition.

  2. Make helm_repository a resource. In the case where terraform is run fresh in CI, this would mean the resource would always be created anew, which I'm not sure makes a lot of sense. This data source was previously a resource, and I don't have full context for why it was changed. This is also confusing because the resource would not actually manage a repository per se, but a RepoEntry in the repositories.yaml file on the machine where terraform is being executed.

  3. Find a way of storing this repository entry inside the terraform state and feeding it into helm at apply time. I haven't yet looked into how feasible this is. The provider defers locating the chart to helm here, which then uses the ChartDownloader configured with a path to the repositories.yaml here.
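
For illustration, a rough sketch of what option 1 might look like, assuming the repository URL (and any credentials) are configured directly on the release; the repository_username / repository_password arguments are hypothetical names, not an existing provider API:

provider "helm" {}

resource "helm_release" "example" {
  name  = "example"
  chart = "postgresql"

  # Repository configured directly on the release instead of via a
  # helm_repository data source, so nothing has to exist in repositories.yaml
  # on the machine running apply.
  repository = "https://kubernetes-charts.storage.googleapis.com"

  # Hypothetical release-level credential arguments (illustrative names only):
  # repository_username = var.repo_username
  # repository_password = var.repo_password
}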

Thoughts on the above would be much appreciated.

@jungopro
Author

@jrhouston thank you for your suggestion, this has worked very well for me.
Since I'm using AKS, my provider config is a bit different; see below if anyone needs it in the future. Specifically, the load_config_file = false was a must for me:

provider "helm" {
  debug   = true
  version = "~> 1.0.0"

  kubernetes {
    host                   = azurerm_kubernetes_cluster.aks.kube_config.0.host
    client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.cluster_ca_certificate)
    load_config_file       = false
  }
}

@mcuadros
Collaborator

Closing this issue since it references a version based on Helm 2. If this is still valid against the master branch, please reopen it. Thanks.

@nickrichardson-presto

No, this issue still exists, although it can be overcome by not using the helm_repository data source and instead including the repo directly in the helm_release, as sketched below.
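
A minimal sketch of that workaround for the traefik chart mentioned above, assuming provider version 1; the repository URL is a placeholder and should be replaced with the chart repo actually in use:

resource "helm_release" "traefik" {
  name      = "traefik"
  chart     = "traefik"
  namespace = "kube-system"

  # Repo URL given directly on the release, so no helm_repository data source
  # (and no repositories.yaml entry) is needed on the machine running apply.
  # Placeholder URL - substitute the real chart repository.
  repository = "https://charts.example.com/traefik"
}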

@eskp

eskp commented May 5, 2020

Still seeing the issue when specifying the incubator repo URL explicitly:

resource "helm_release" "alb_ingress" { name = "aws-alb-ingress-controller" repository = "https://kubernetes-charts-incubator.storage.googleapis.com" chart = "aws-alb-ingress-controller" namespace = "kube-system" version = "1.0.0" }

Error: failed to download "https://kubernetes-charts-incubator.storage.googleapis.com/aws-alb-ingress-controller-1.0.0.tgz" (hint: running helm repo update may help)

Provider version 1.0
Terraform v0.12.24

@ghost

ghost commented May 19, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators May 19, 2020