Helm lookup Function Support #5202

Open
rajivml opened this issue Jan 7, 2021 · 99 comments
Labels
component:helm, enhancement (New feature or request), workaround (There's a workaround, might not be great, but exists)

Comments

@rajivml

rajivml commented Jan 7, 2021

Hello,

Happy new year, guys!

So, I have a requirement to build the image path by reading the dockerRegistryIP value from a ConfigMap, so that I don't need to ask the user explicitly where the registry is located.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ template "helm-guestbook.fullname" . }}
spec:
  # selector and pod labels omitted for brevity
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: {{ printf "%s/%s:%s" $dockerRegistryIP .Values.image.repository .Values.image.tag }}

Helm 3 added support for this by introducing a lookup function, through which a ConfigMap can be read at render time like this:

{{ (lookup "v1" "ConfigMap" "default" "my-configmap").data.registryURL}}

But the lookup function returns an empty result when templates are rendered using "helm install --dry-run" or "helm template". As a result, accessing a field on the missing data fails with an error like this:

"nil pointer evaluating interface {}.registryURL Use --debug flag to render out invalid YAML"

The solution proposed on Stack Overflow is to use "helm template --validate" instead of "helm template".

Can you add support for this?

Right now I am populating the Docker registry IP as shown below, but with this kustomize-plugin approach I am losing the ability to render the values.yaml file as a config screen through which the user can override certain values. In other words, the fix for one issue has led to another issue.

kubectl -n argocd get cm argocd-cm -o yaml
apiVersion: v1
data:
  configManagementPlugins: |
    - name: kustomized-helm
      generate:
        command: [sh, -c]
        args: ["DOCKER_REG_IP=$(kubectl -n registry get svc registry -o jsonpath={.spec.clusterIP}) && sed -i \"s/DOCKER_REGISTRY_IP/$DOCKER_REG_IP/g\" kustomization.yaml && helm template $ARGOCD_APP_NAME --namespace $ARGOCD_APP_NAMESPACE . > all.yaml && kustomize build"]
    - name: kustomized
      generate:
        command: [sh, -c]
        args: ["DOCKER_REG_IP=$(kubectl -n registry get svc registry -o jsonpath={.spec.clusterIP}) && sed -i \"s/DOCKER_REGISTRY_IP/$DOCKER_REG_IP/g\" kustomization.yaml && kustomize build"]

@jessesuen
Member

Note that even if we allowed configuring Argo CD to append the --validate arg when running the helm template command, the repo-server would still need to be given API server credentials (i.e. mount a service account token) in order to perform the lookup. We would never give Kubernetes credentials to the repo-server by default (though you are welcome to modify the deployment in your own environment), so there would be no value in adding a --validate configuration option.

Since you would need a customized repo-server anyway, you can already accomplish this today using a wrapper script around the helm binary that appends the argument (coupled with a service account given to the repo-server).

@jessesuen added the workaround label Jan 7, 2021
@kvaps

kvaps commented Jan 7, 2021

@jessesuen, I guess this workaround is possible only with an in-cluster configuration, and won't work for external clusters.

@jessesuen
Member

Ah yes, you are right about that unfortunately.

@Gowiem

Gowiem commented Feb 16, 2021

@jessesuen I just came across this issue and am running into the same problem. There is a duplicate issue here as well: #3640

You mentioned two things needed as a workaround for Argo not supporting this:

  1. Add a service account for Argo + mount the token -- This is straightforward and would be easy to implement.
  2. "using a wrapper script around the helm binary which appends the argument in the script" -- This I don't really get.

Can you expand more on the wrapper script? How would one inject that into a standard Argo deployment?

@dvcanton

Hi @jessesuen, @Gowiem,

any updates on this?

Thanks in advance, Dave

@Gowiem

Gowiem commented Apr 14, 2021

@dvcanton I tried the Argo plugin / wrapper script approach that @jessesuen mentioned after asking about it directly in the Argo Slack. You can find more about that by looking at the plugins documentation.

Unfortunately, that solution seemed overly hacky and pretty esoteric to me and my team. Instead, we've moved towards not using lookup in our charts and copying and pasting certain configuration manually. It's not great and I wish Argo would support this, but it doesn't seem like there is enough momentum.

@randrusiak

A lot of charts use built-in objects such as Capabilities to provide backward compatibility for old APIs. Capabilities.APIVersions works properly only with the --validate flag, because without it only API versions are returned, without the available resources.
There is an example in the grafana chart: https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/ingress.yaml#L7

@kvaps

kvaps commented Apr 27, 2021

As for Capabilities, the helm template command supports setting capabilities manually; see #3594.

@randrusiak

randrusiak commented Apr 28, 2021

@kvaps take a look at the example I posted.
{{- $newAPI := .Capabilities.APIVersions.Has "networking.k8s.io/v1/Ingress" -}} returns false, because without the --validate flag only API versions are returned.

@kvaps

kvaps commented Apr 28, 2021

@randrusiak, it works for me:

# helm template . --set ingress.enabled=true --include-crds > /tmp/1.yaml
# helm template . --api-versions networking.k8s.io/v1/Ingress --set ingress.enabled=true --include-crds > /tmp/2.yaml
# diff -u /tmp/1.yaml /tmp/2.yaml
@@ -399,7 +399,7 @@
           emptyDir: {}
 ---
 # Source: grafana/templates/ingress.yaml
-apiVersion: extensions/v1beta1
+apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
   name: RELEASE-NAME-grafana
@@ -417,9 +417,12 @@
         paths:
 
           - path: /
+            pathType: Prefix
             backend:
-              serviceName: RELEASE-NAME-grafana
-              servicePort: 80
+              service:
+                name: RELEASE-NAME-grafana
+                port:
+                  number: 80

My idea was that ArgoCD could provide the repo-server with the list of API versions from the destination Kubernetes API, e.g.:

kubectl api-versions

returns all available API versions for the cluster.

Not sure if lookup function support can be implemented with the same simplicity, as it already requires direct access to the cluster.

@randrusiak

@kvaps I understand how it works at the Helm level, but I don't know how to pass the additional flag --api-versions networking.k8s.io/v1/Ingress via an Argo manifest. It's still unclear to me. Could you explain that? I'd appreciate your help.

@kvaps

kvaps commented Apr 28, 2021

Actually, my note was meant more for contributors than for users :)
They could implement passing api-versions from the argocd-application-controller to the argocd-repo-server via an API call.

At the current stage, I think there is nothing you can do. The only workaround is to add a serviceAccount to your repo-server and use the --validate flag for helm template (you would need to create a small shell wrapper script for the helm command, or use a custom plugin). Unfortunately, this is less secure and works only with a single cluster (the current one).

Another option is to hardcode those parameters somewhere, e.g. save the output of the following command:

kubectl api-versions | awk '{printf " --api-versions " $1 } END{printf "\n"}' 

And pass it to helm in whatever way suits you. For example, you can still use a wrapper script for helm, something like this:

cat /usr/local/bin/helm
#!/bin/sh
exec /usr/local/bin/helm.bin $HELM_EXTRA_ARGS "$@"

where:

  • /usr/local/bin/helm.bin - the original helm binary
  • $HELM_EXTRA_ARGS - an extra environment variable set on your repo-server

or use a custom plugin
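
For anyone wanting to script the hardcoding step above, here is a minimal sketch in POSIX shell. It assumes the input is the one-per-line output of kubectl api-versions. The function name api_version_flags is made up for illustration and is not part of Helm or Argo CD.

```shell
#!/bin/sh
# Reads API versions from stdin (one per line, as printed by
# "kubectl api-versions") and prints them as repeated --api-versions flags
# on a single line.
api_version_flags() {
  flags=""
  while IFS= read -r version; do
    [ -n "$version" ] || continue   # skip blank lines
    flags="${flags:+$flags }--api-versions $version"
  done
  printf '%s\n' "$flags"
}

# Possible usage (requires cluster access, so left commented out):
#   HELM_EXTRA_ARGS=$(kubectl api-versions | api_version_flags)
```

The result could be placed in HELM_EXTRA_ARGS for the wrapper shown earlier. Note that kubectl api-versions prints group/version pairs only, so a kind-qualified entry such as networking.k8s.io/v1/Ingress would still have to be added by hand.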

@jgoeres

jgoeres commented May 7, 2021

We also ran into the issue with the non-working lookup function. The background is that we want to make sure that, for a certain service, a secure random password is generated instead of having a hardcoded default. If desired, the user can explicitly set their own password, but most people don't.
Since we are using Helm's random function, the password is newly generated with each helm upgrade (or helm template), so the password would not remain stable. So we use the lookup function to check whether the secret holding the password already exists, and only generate the password if it doesn't. Effectively, it is only generated initially on "helm install".
With the non-working lookup function in Argo, the password is regenerated on each sync, wreaking quite some havoc, as you might guess.

Our goal is to keep the Helm chart usage as simple as possible and to require as few parameters as possible for simple installations. So I would like to keep "generate a secure (and stable) random password" as the default for "pure" Helm usage.
Is there a way to find out in the Helm chart that we are actually running inside Argo? That would allow me to react to this and add a validation that enforces explicitly setting a password in Argo-based deployments.
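
The keep-or-generate behaviour described in this comment can be illustrated outside Helm with a small POSIX shell sketch. This is not the chart's template code: the existing value, which a chart would obtain via lookup, is simply passed in as an argument here, and the function name stable_password is hypothetical.

```shell
#!/bin/sh
# Prints the existing password if one is supplied; otherwise generates a
# new one. An empty argument models "lookup found nothing", which is what
# happens on every render when lookup has no cluster access.
stable_password() {
  existing="${1:-}"
  if [ -n "$existing" ]; then
    printf '%s\n' "$existing"
  else
    # 16 random bytes rendered as 32 hexadecimal characters
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
    printf '\n'
  fi
}
```

Calling it twice with an empty argument yields two different values, which mirrors the password churn on each sync reported above.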

@krutsko

krutsko commented Oct 1, 2021

Any update on this issue?

@sarahhenkens

Ran into this same problem today :( more context in https://cloud-native.slack.com/archives/C01TSERG0KZ/p1635024460105000

@mosheavni
Contributor

Same thing happens with aws-load-balancer-controller's mutating webhook that defines tls key, cert and CA:
https://github.com/aws/eks-charts/blob/f4be91b5ae4a2959e821940a77d50dd0424841c1/stable/aws-load-balancer-controller/templates/_helpers.tpl
It can't reuse the previously defined keys if it can't access the cluster, thus producing an Argo app that's always out of sync

@Phil0ctetes

Ran into this today :(

Do you have any timeline when the lookup will be available?

@Phil0ctetes

It looks like my question was overlooked, so I wanted to ask again. Is there any plan to get this in, and if so, when?

@richie-tt

I ran into the same issue, but maybe the lack of this function is a good reason to move to Flux v2, which already supports it:

fluxcd/helm-operator#335

@13013SwagR

FYI, this project provides a decent workaround https://github.com/kuuji/helm-external-val

@lukasz-braula-sollers-eu

I like this solution but this only works if you use Kustomize with helm. If we just want to use the normal helm plugin, is there any way to swap out the executable for a wrapper like this?

Sure, it's possible. A small hint can be found in the repo-server:

argocd@argocd-repo-server-xxx:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
argocd@argocd-repo-server-xxx:~$ which helm
/usr/local/bin/helm

The helm binary is located in the /usr/local/bin/ folder, while /usr/local/sbin comes earlier in PATH and so takes precedence. You can therefore put the script from my message in /usr/local/sbin/helm and make it executable. Then either replace HELM_BIN=${HELM_BIN:-helm} with HELM_BIN=${HELM_BIN:-/usr/local/bin/helm} or just set the global env variable HELM_BIN=/usr/local/bin/helm (otherwise the wrapper would resolve "helm" to itself).

@farioas The workaround is functioning perfectly, thanks! Additionally, I've added the necessary values to the original Helm chart for ArgoCD. It's all up and running smoothly now.

The helm values

values:
 repoServer:
   volumes:
    - name: helm-replace
      configMap:
        name: config-map-helm-replace
        defaultMode: 0777
   volumeMounts:
    - name: helm-replace
      mountPath: /usr/local/sbin/helm
      subPath: helm
   env:
    - name: HELM_BIN
      value: /usr/local/bin/helm

The additional resources:

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-map-helm-replace
data:
  helm: |-
    #!/bin/bash
    
    HELM_BIN=${HELM_BIN:-helm}
    
    new_args=()
    template_found=false
    
    for arg in "$@"; do
      if [[ "$arg" == "template" ]]; then
        template_found=true
        new_args+=("$arg")
      elif $template_found && [[ "${#new_args[@]}" -eq 1 ]]; then
        new_args+=("--dry-run=server" "$arg")
        template_found=false
      else
        new_args+=("$arg")
      fi
    done
    
    $HELM_BIN "${new_args[@]}"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argocd-repo-server-access
rules:
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: argocd-repo-server-access-binding
subjects:
- kind: ServiceAccount
  name: argocd-repo-server
  namespace: argocd
roleRef:
  kind: ClusterRole
  name: argocd-repo-server-access
  apiGroup: rbac.authorization.k8s.io
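
The argument rewriting done by the wrapper script can be tried out without helm installed. Below is a simplified POSIX shell sketch, not the script from the ConfigMap: it adds --dry-run=server directly after the first "template" argument, wherever that appears, and prints the resulting arguments one per line. The function name rewrite_helm_args is hypothetical.

```shell
#!/bin/sh
# Prints the given helm arguments one per line, inserting --dry-run=server
# right after the first "template" argument. Other invocations pass through
# unchanged. Printing one argument per line is for illustration only and
# would not preserve arguments that themselves contain newlines.
rewrite_helm_args() {
  injected=false
  for arg in "$@"; do
    printf '%s\n' "$arg"
    if [ "$injected" = false ] && [ "$arg" = "template" ]; then
      printf '%s\n' "--dry-run=server"
      injected=true
    fi
  done
}
```

For example, the arguments template myapp . --namespace demo become template --dry-run=server myapp . --namespace demo, while version is left as is.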

I tried applying this, but whenever I use the lookup function in Helm I get this error:

[ERROR] unable to retrieve resource list for: v1 , error: Get "http://localhost:8080/api/v1?timeout=32s": dial tcp [::1]:8080: connect: connection refused lookup_func.go:104: [ERROR] unable to get apiresource from unstructured: /v1, Kind=Secret , error Get "http://localhost:8080/api/v1?timeout=32s": dial tcp [::1]:8080: connect: connection refused Error: template: Kubernetes application/charts/testc/templates/secrets.yaml:14:15: executing "Kubernetes application/charts/testc/templates/secrets.yaml" at <lookup "v1" "Secret" "test-sbox-t" "builder-token">: error calling lookup: unable to get apiresource from unstructured: /v1, Kind=Secret: Get "http://localhost:8080/api/v1?timeout=32s": dial tcp [::1]:8080: connect: connection refused Use --debug flag to render out invalid YAML

But when I remove the lookup function and put in a literal value, everything works as expected. My secret.yaml file:

kind: Secret
apiVersion: v1
metadata:
  name: builder-dockercfg
  namespace: 'test-sbox-t'
  annotations:
    kubernetes.io/service-account.name: builder
    openshift.io/token-secret.name: builder-token
  ownerReferences:
    - apiVersion: v1
      kind: Secret
      name: builder-token
      {{- with (lookup "v1" "Secret" "test-sbox-t" "builder-token") }}
      uid: {{ .metadata.uid }}
      {{- end }}
      controller: true
      blockOwnerDeletion: false
data:
  .dockercfg: e30=
type: kubernetes.io/dockercfg

Helm version on argo-repo-server:
version.BuildInfo{Version:"v3.15.2", GitCommit:"1a500d5625419a524fdae4b33de351cc4f58ec35", GitTreeState:"clean", GoVersion:"go1.22.4"}
Locally (using plain Helm) this chart works fine, but not on Argo. Does anyone have an idea what I'm doing wrong?

@lukasz-braula-sollers-eu

I resolved my issue by using a new service account (instead of "default").
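
The localhost:8080 address in the error above is typically what a Kubernetes client falls back to when it finds neither a kubeconfig nor in-cluster credentials. A rough way to check for the latter from inside the repo-server container is sketched below. It assumes the standard service account mount path /var/run/secrets/kubernetes.io/serviceaccount, and the function name in_cluster_creds_available is hypothetical.

```shell
#!/bin/sh
# Succeeds only if a non-empty service account token file exists in the
# given directory (default: the standard in-pod mount path) and the API
# server address is exported in the environment.
in_cluster_creds_available() {
  sa_dir="${1:-/var/run/secrets/kubernetes.io/serviceaccount}"
  [ -s "$sa_dir/token" ] && [ -n "${KUBERNETES_SERVICE_HOST:-}" ]
}

# Possible usage inside the pod:
#   in_cluster_creds_available || echo "no in-cluster credentials found" >&2
```

If the check fails, the pod's service account token is probably not mounted, which would be consistent with the fix reported here.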

@jakuboskera

This workaround works only within the cluster where ArgoCD is deployed, right? If you use multiple clusters connected to ArgoCD and ArgoCD is in another cluster, the lookup function won't work, as it has no permission to do the lookup in the other clusters, right?

@PKatBK

PKatBK commented Nov 15, 2024

Hey,

We are struggling with the same issue and have stumbled over it multiple times already. In our case it is mostly about handling auto-generated credentials/keys, which works pretty well with the Helm lookup feature.

Is there any plan to give the user the option to choose --dry-run=server as the diff/sync "model"?
What is the Argo team's recommendation for handling secrets that are only needed inside the cluster for communication?

@andrii-korotkov-verkada
Contributor

There's a challenge here from Gitops perspective. This is a dynamic dependency, so we may need to update and sync the application even if the helm charts and values didn't change. Would it be possible to somehow subscribe to a stream of events or have the corresponding config map as a part of the same application (so that update would trigger refresh + sync)?

@PKatBK

PKatBK commented Nov 18, 2024

I am not sure if this would help.
Right now, we auto-generate values in secrets/configmaps and check up front, via helm lookup, whether they already exist.
If the value already exists, we keep it. If it does not exist, we auto-generate it.
We also use this to create an initial configmap: we read secret/configmap values via the helm lookup function and paste them into the newly created configmap.

That's why we use helm lookup in multiple places right now, which works fine without ArgoCD :/

@logica0419

Hi, I’m interested in addressing the root cause of the issue.
Let me explain my thoughts in detail below.

Regarding Resource Update Tracking

cc. @andrii-korotkov-verkada

I don’t believe we need to track updates to looked-up resources, since the lookup function in Helm is not designed for that purpose.

As mentioned in this issue, the lookup function is:

some way to say to Helm: "At the time that this manifest (or resource) is loaded, do a lookup for me and inject the results here."

Helm itself doesn’t load resources after deployment, and the Helm Operator doesn’t either.
The lookup function is intended for use during initialization, not for continuous updates.

Helm charts relying on the lookup function are based on this initialization-only model, so they may behave incorrectly if we implement in-place re-syncs in Argo CD.

Apologies for the lengthy explanation; I just wanted to clarify my point.
Any further opinions about this?

API Design

For the implementation, I propose adding a serverDryRun field to the Helm section of the Application spec. Here's an example:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sealed-secrets
  namespace: argocd
spec:
  project: default
  source:
    chart: sealed-secrets
    repoURL: https://bitnami-labs.github.io/sealed-secrets
    targetRevision: 1.16.1
    helm:
      releaseName: sealed-secrets
+     serverDryRun: false # Set this to "true" to enable the lookup function
  destination:
    server: "https://kubernetes.default.svc"
    namespace: kubeseal

The name serverDryRun aligns with the --server-dry-run flag (the earlier form of --dry-run=server) used in the kubectl CLI.

Since the --dry-run flag in Helm only supports client, server and none as options, and we don’t want an option to disable dry-run (as it’s essential for template rendering), a boolean field (serverDryRun) provides a straightforward and effective solution.
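
To make the proposed mapping concrete, here is a purely illustrative POSIX shell sketch of how a boolean serverDryRun value could translate into an extra argument for helm template. Both the field and the function name server_dry_run_flag are hypothetical. Nothing here reflects an actual Argo CD implementation.

```shell
#!/bin/sh
# Hypothetical mapping from the proposed serverDryRun field to helm
# arguments. Prints the extra flag to append to "helm template", or nothing
# when the field is false or unset. Any other value is rejected.
server_dry_run_flag() {
  case "${1:-false}" in
    true)  printf '%s\n' "--dry-run=server" ;;
    false) ;;
    *)     printf 'invalid serverDryRun value: %s\n' "$1" >&2
           return 1 ;;
  esac
}
```

With only two accepted values there is no way to express --dry-run=none, which matches the intent that dry-run itself cannot be switched off.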

Considered alternative field names

  • dryRun
    • While making the dry-run flag fully configurable was an option, it would add unnecessary complexity
    • Allowing users to turn off dry-run isn't desirable, as we use Helm for template rendering
  • enableLookUp
    • Although this name aligns with a specific use case, the potential scope of serverDryRun extends beyond enabling the lookup function (see this issue comment)
    • Server-side dry-run could influence other functionality in the future, so naming it solely around "lookup" feels overly narrow

If these ideas look good, I want to work on the actual implementation as well.
Please let me know if you have feedback or additional considerations. Thanks!

@logica0419

I wonder who is responsible for approving this design.
Could someone let me know if I can start the implementation?

@andrii-korotkov-verkada
Contributor

@logica0419, please join the contributors meeting https://docs.google.com/document/d/1xkoFkVviB70YBzSEa4bDnu-rUZ1sIFtwKKG1Uw8XsY8/edit?tab=t.0. That's the best place to get people's attention and discuss designs.

@acelinkio

The root cause of this is that Helm is growing into an orchestration tool. Any Helm-based orchestration should be used as a last resort. See the ArgoCD challenges with Helm hooks, for example.

Adding lookup functionality to Helm was a mistake IMO. I would much rather solve the problem inside of Kubernetes or by extending Kubernetes. I am not sure if ArgoCD aims to have parity with all of Helm's functionality.

@shinebayar-g

Agreed. But to be fair, ArgoCD is the thing that's inside the cluster. So there's no reason to not support these things.

@crenshaw-dev
Member

The next challenge to be tackled is "lookup as who?" When you run helm template locally, the answer is easy: "lookup as you." It uses your Kube config, and Helm isn't given access to anything you can't already access.

But when Argo CD runs helm template, it's less clear "who" is doing the lookup. If the answer is "Argo CD," that's potentially problematic, because Argo CD generally runs as root on the destination cluster. So unless you want all your GitOps users to have access to every single resource on the destination cluster, "Argo CD" isn't a good answer to the question "lookup as who."

If you're using the new (alpha) service account impersonation feature, it's possible that the answer is "lookup as the configured service account." This gives the Argo CD admin the option of limiting what the lookup can access.

Once we answer "who" we have to answer "how." In order for the helm template lookup to work, the repo-server will need to somehow make Kubernetes API credentials available to that command. Since the repo-server doesn't have direct access to cluster credentials, we'll have to either teach it to directly access credentials (probably bad from a security perspective) or teach the application controller to pass the credentials to the repo-server just in time to run helm template with them. We'll need to make sure that however we pass those credentials is reasonably secure and that other processes (e.g. CMPs) can't be abused to access those credentials.

So I think this is a rare case of "the API is the easy part." The rest of the design is hard.

@crenshaw-dev
Member

I'll take the opportunity to express my opinion again that separation of concerns in secret management is good. In my opinion, Helm charts should reference secrets rather than inject secrets. Secrets should always be populated on the destination cluster by a secrets operator.

But I understand there are pragmatic reasons to prefer Helm lookups. :-)

@shinebayar-g

But when Argo CD runs helm template, it's less clear "who" is doing the lookup.

Good point. I think External Secrets Operator implemented this pattern already https://external-secrets.io/latest/api/clustersecretstore/.

They ask you to provide a Kubernetes Service Account (which is authorized to connect to various Secrets Provider backends, but those are irrelevant in this context), and somehow use the KSA to authenticate.

I guess a similar pattern can be implemented in ArgoCD. Users can be expected to provide a Kubernetes Service Account that has the proper permissions to do whatever they want (Helm lookup in this case).

Also, if this pattern works out well, ArgoCD could even define a specific Kubernetes Service Account per Application, to further limit the permissions of ArgoCD's own root account. Wdyt?

@dudicoco

dudicoco commented Dec 5, 2024

I'll take the opportunity to express my opinion again that separation of concerns in secret management is good. In my opinion, Helm charts should reference secrets rather than inject secrets. Secrets should always be populated on the destination cluster by a secrets operator.

But I understand there are pragmatic reasons to prefer Helm lookups. :-)

@crenshaw-dev do you have an example of a secrets operator that can manage Kubernetes webhooks?

Today Helm can manage the entire webhook creation process, with the lookup function being the missing piece of the puzzle:

  1. Generate a CA
  2. Generate a self-signed certificate
  3. Generate a secret and populate it with the CA and the certificate's private key
  4. Populate the webhook's caBundle field
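
A rough sketch of how those four steps can look in a single Helm template, using the `genCA` and `genSignedCert` template functions together with `lookup`. All resource names here (`my-webhook`, `my-webhook-tls`, the webhook name and path) are hypothetical, and this is only one possible layout, not a recommended implementation. The `lookup` branch is what keeps the certificate stable between renders; when `lookup` returns nothing, as it does under plain `helm template`, the `else` branch runs and a new CA and certificate are generated on every render.

```yaml
# Sketch only: "my-webhook" and "my-webhook-tls" are hypothetical names.
{{- $svc := "my-webhook" }}
{{- $secretName := "my-webhook-tls" }}
{{- $cn := printf "%s.%s.svc" $svc .Release.Namespace }}
{{- $caCert := "" }}
{{- $tlsCert := "" }}
{{- $tlsKey := "" }}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $secretName }}
{{- if $existing }}
{{- /* Reuse the values already stored in the cluster (they are base64-encoded). */ -}}
{{- $caCert = index $existing.data "ca.crt" }}
{{- $tlsCert = index $existing.data "tls.crt" }}
{{- $tlsKey = index $existing.data "tls.key" }}
{{- else }}
{{- /* Steps 1 and 2: generate a CA, then a certificate signed by it. */ -}}
{{- $ca := genCA "my-webhook-ca" 365 }}
{{- $cert := genSignedCert $cn (list) (list $cn) 365 $ca }}
{{- $caCert = $ca.Cert | b64enc }}
{{- $tlsCert = $cert.Cert | b64enc }}
{{- $tlsKey = $cert.Key | b64enc }}
{{- end }}
# Step 3: the Secret holding the CA, the certificate and its private key.
apiVersion: v1
kind: Secret
metadata:
  name: {{ $secretName }}
  namespace: {{ .Release.Namespace }}
type: kubernetes.io/tls
data:
  ca.crt: {{ $caCert }}
  tls.crt: {{ $tlsCert }}
  tls.key: {{ $tlsKey }}
---
# Step 4: the webhook configuration with caBundle populated from the same CA.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: {{ $svc }}
webhooks:
  - name: validate.my-webhook.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      caBundle: {{ $caCert }}
      service:
        name: {{ $svc }}
        namespace: {{ .Release.Namespace }}
        path: /validate
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
```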

@crenshaw-dev
Member

@shinebayar-g I think service account impersonation gets us close to what you're describing. But instead of configuring the service account at the app level, we do it at the project level. In my opinion, that's sufficient to answer the "lookup as who" question. Now the problem is "how do we get the appropriate creds to the repo-server?"

@dudicoco I think cert-manager can do what you're describing: https://cert-manager.io/docs/concepts/ca-injector/
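
For illustration, a minimal sketch of the CA injector approach: the webhook configuration carries an annotation that points at a cert-manager Certificate, and cert-manager fills in `caBundle`. The namespace, Certificate name, and webhook names below are hypothetical, and the Certificate itself (plus its Issuer) would need to be defined separately.

```yaml
# Sketch only: "my-namespace", "my-webhook-cert" and "my-webhook" are hypothetical names.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: my-webhook
  annotations:
    # Format: <namespace>/<Certificate name>
    cert-manager.io/inject-ca-from: my-namespace/my-webhook-cert
webhooks:
  - name: validate.my-webhook.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      # caBundle is intentionally omitted; the CA injector is expected to populate it.
      service:
        name: my-webhook
        namespace: my-namespace
        path: /validate
```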

@dudicoco

dudicoco commented Dec 5, 2024

@crenshaw-dev I'm aware that cert-manager is capable of that, but I don't think users should be forced to use yet another complex tool just to generate self-signed certs for webhooks (assuming they don't need cert-manager for other things).
In addition, cert-manager has an issue in that it can't rotate the CA secret: cert-manager/cert-manager#2478

@dudicoco

dudicoco commented Dec 5, 2024

I'll give another example: what if I just want to generate a random secret to be consumed by my app (for example, a Grafana admin password)?
With Helm I can do that easily; any alternative will most likely be much more complicated.
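
A minimal sketch of that pattern, assuming a hypothetical Secret named `grafana-admin` with an `admin-password` key. It relies on `lookup` to reuse a previously generated value; without cluster access (for example under plain `helm template`) `lookup` returns nothing, so the `else` branch runs and a new password is produced on every render.

```yaml
# Sketch only: the Secret name "grafana-admin" and key "admin-password" are hypothetical.
{{- $name := "grafana-admin" }}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $name }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
  namespace: {{ .Release.Namespace }}
type: Opaque
data:
  {{- if $existing }}
  # Keep the value already stored in the cluster (it is already base64-encoded).
  admin-password: {{ index $existing.data "admin-password" }}
  {{- else }}
  # First install, or no cluster access: generate a fresh random value.
  admin-password: {{ randAlphaNum 32 | b64enc }}
  {{- end }}
```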

@crenshaw-dev
Member

crenshaw-dev commented Dec 5, 2024

I don't think users should be forced to use yet another complex tool just for the sake of generating self signed certs for webhooks

what if I just want to generate a random secret

Both of these use cases require secret lifecycle management. You need to update certs when they expire, and you need to rotate secrets on a regular basis. Argo CD is not a lifecycle management tool. It doesn't know when you need to update secrets and when you need to avoid updating secrets.

By making secret generation a side-effect of a GitOps deployment, you're forcing imperative state management into a system that's designed for continuous reconciliation of declarative state. Kubernetes works best with a micro-service architecture with strong separation of concerns. Cert manager is one of those micro-services. ESO could be another, providing a generated Grafana password. Mixing secret lifecycle management concerns into a GitOps controller is asking for pain.

@dudicoco

I don't think users should be forced to use yet another complex tool just for the sake of generating self signed certs for webhooks

what if I just want to generate a random secret

Both of these use cases require secret lifecycle management. You need to update certs when they expire, and you need to rotate secrets on a regular basis. Argo CD is not a lifecycle management tool. It doesn't know when you need to update secrets and when you need to avoid updating secrets.

By making secret generation a side-effect of a GitOps deployment, you're forcing imperative state management into a system that's designed for continuous reconciliation of declarative state. Kubernetes works best with a micro-service architecture with strong separation of concerns. Cert manager is one of those micro-services. ESO could be another, providing a generated Grafana password. Mixing secret lifecycle management concerns into a GitOps controller is asking for pain.

It's nice to find out that external-secrets added support for password generation. However, that does not mean that everyone who wants to work with Argo CD should now have to use this solution.

There are many ways to generate secrets; for example, you could also generate them with Terraform.
The external-secrets password generation feature was introduced 2 years ago. Before that, everyone used a different solution for automatic secret generation, whether they were using Argo CD or not.

So it feels opinionated to me to say that it is "forcing imperative state management into a system that's designed for continuous reconciliation of declarative state". It's actually the other way around: users are being forced to use external-secrets and cert-manager as the only possible solutions if they want to generate secrets automatically.

@crenshaw-dev
Member

It is opinionated, and there's a cost/benefit to having that opinion. The cost is that people are forced to adapt and sometimes to hack around the opinion. The benefit is that some folks will adopt more modern practices and that Argo CD avoids introducing a costly feature.

I'm not completely opposed to this feature. I think service account impersonation has carried us a long way towards making this viable. The next big hurdle is to design a way for the repo-server to securely read external cluster state when hydrating manifests. Depending on the complexity/risk of that solution, I may or may not be able to support merging it.
