
Deploy K8s kustomizations Without Wrapping In A Helm Chart #1337

Closed
UncleGedd opened this issue Feb 7, 2023 · 12 comments · Fixed by #1350

@UncleGedd
Contributor

UncleGedd commented Feb 7, 2023

Is your feature request related to a problem? Please describe.
I'm trying to deploy a Big Bang variant using a kustomization inside of a Zarf package. The Helm chart that Zarf wraps my kustomization with fails to install due to a templating error. Note that the kustomization builds correctly with kustomize build <dir>.
Example of my Zarf pkg

kind: ZarfPackageConfig
metadata:
  name: batcave
  architecture: amd64
  yolo: true

components:
- name: flux
  manifests:
  - name: local-flux
    namespace: flux-system
    kustomizations:
    - ./base/flux

- name: batcave
  manifests:
  - name: batcave-dev
    namespace: batcave
    kustomizations: # fails to install!
    - ./envs/dev

We have determined that the Helm chart fails to install due to a templating error related to the following values for the Big Bang Monitoring chart:

    additionalPrometheusRulesMap:
      custom-rules:
        groups:
          - name: OOMKilled
            rules:
              - alert: OOMKilled
                expr: sum_over_time(kube_pod_container_status_terminated_reason{reason="OOMKilled"}[5m])>0
                for: 5m
                labels:
                  severity: critical
                annotations:
                  description: Pod {{$labels.pod}} in {{$labels.namespace}} got OOMKilled

Zarf fails with the error message:

parse error at (raw-batcave-batcave-batcave-dev/var/folders/sf/_jrk8zgs5tg33r_h1gzh_6t40000gn/T/zarf-1900317816/components/batcave/manifests/kustomization-batcave-dev-0.yaml:1263): undefined variable "$labels"

That $labels variable is inside of a ConfigMap that is passed into a Flux HelmRelease. I believe when we wrap this in a Helm chart, Helm believes it's a local Helm variable, resulting in the above error.

Describe the solution you'd like
I'd like the option to install the kustomization directly (like with kustomize build <dir> | kubectl apply -f -) instead of wrapping it in a Helm chart. Specifically, I want to be able to specify in the Zarf package that the kustomization should not be wrapped in a Helm chart.
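
For reference, the requested behavior is roughly the manual equivalent of the following, using the path and namespace from the failing component above:

# Manual equivalent of the requested bypass (illustrative only)
kustomize build ./envs/dev | kubectl apply -n batcave -f -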

@github-project-automation github-project-automation bot moved this to New Requests in Zarf Project Board Feb 7, 2023
@UncleGedd UncleGedd self-assigned this Feb 7, 2023
@Racer159
Contributor

Racer159 commented Feb 7, 2023

For this (as stated to @wirewc) we may want to make this apply to all manifests, since that may be easier technically and would provide more flexibility for standard manifests as well.

@Racer159
Contributor

Racer159 commented Feb 7, 2023

We will also have to think about how this affects the package lifecycle and user messaging on the key (since we use helm right now to uninstall for example and this may break that)

@wirewc wirewc self-assigned this Feb 7, 2023
@wirewc wirewc moved this from New Requests to Doing Now in Zarf Project Board Feb 7, 2023
@WarheadsSE

parse error at (.../manifests/kustomization-batcave-dev-0.yaml:1263): undefined variable "$labels"

This seems to indicate that Helm's parser (within the API, I assume) is getting {{$labels.pod}}. This results in an attempt to template the content of $labels.pod within the Helm scope; however, $labels does not exist there.

👋 John Snow here, but it looks like the problem is that the values result in Helm being handed content it cannot template. You may need to add escape sequences.
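
If it helps, the usual Go-template escape idioms look something like the lines below (a minimal sketch of the technique, not the actual Big Bang chart):

# Either idiom makes Helm emit the {{ }} markers literally instead of parsing them
description: Pod {{ "{{" }}$labels.pod{{ "}}" }} in {{ "{{" }}$labels.namespace{{ "}}" }} got OOMKilled
# or, using a backtick raw string so the whole expression passes through untouched
description: {{ `Pod {{$labels.pod}} in {{$labels.namespace}} got OOMKilled` }}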

@WarheadsSE

WarheadsSE commented Feb 7, 2023

Some context to the above comment:

Kustomize and Helm both make use of the default markers from Go's text/template ({{ }}). As such, if kustomize is run before helm but leaves some content behind, or Helm is handed YAML intended for kustomize, you end up with "what happened here".

I don't know enough about the order of operations involved here, but I wonder about precisely what ordering is occurring to result in this behavior.

The report indicates that additionalPrometheusRulesMap was supplied, likely with {{$labels.pod}} inside it. That chart is the Big Bang implementation of kube-prometheus. The values provided there are rendered into this ConfigMap.

My immediate question would be what happens when the chart is rendered with helm template test-1337 CHART -f user-values.yaml --show-only templates/prometheus/additionalPrometheusRules.yaml

@WarheadsSE

That $labels variable is inside of a ConfigMap that is passed into a Flux HelmRelease. I believe when we wrap this in a Helm chart, Helm believes it's a local Helm variable, resulting in the above error.

🤔 Taking that into account ... yeah. I can see why Helm barked at it. The raw content is expected in the final YAML placed into k8s, and then consumed by Flux's HelmRelease object.

@wirewc
Contributor

wirewc commented Feb 7, 2023

So it does look like kustomize does get run before the output is packaged by helm, at least. I'm still shocked nothing barked until it got to helm. Regardless, the next step is getting an isolated test case with @UncleGedd so we can iterate quickly over changes as we poke through.

Because of how Zarf works, every deployment has to go through helm.

@wirewc
Contributor

wirewc commented Feb 9, 2023

It looks like the Kustomize <-> Helm part is working well. @UncleGedd and I will be touching base to dive into the issue further and find the root cause. It looks like only the end of the deployment is failing, which might be related to something downstream of helm.

@UncleGedd UncleGedd linked a pull request Feb 9, 2023 that will close this issue
@UncleGedd
Contributor Author

Wrote a failing e2e test, now trying to figure out how to have Helm not care about that perceived $labels variable.

@UncleGedd
Contributor Author

Ok, so it turns out that Prometheus templating in Helm is a relatively common problem, check out:

I was able to replicate the issue with a barebones Helm chart with a ConfigMap containing a single Prometheus rule:

apiVersion: v1
kind: ConfigMap
metadata:
  name: monitoring-helmrelease-values-cm
data:
  values.yaml: |
    monitoring:
      values:
        additionalPrometheusRulesMap:
          custom-rules:
            groups:
              - name: OOMKilled
                rules:
                  - alert: OOMKilled
                    expr: sum_over_time(kube_pod_container_status_terminated_reason{reason="OOMKilled"}[5m])>0
                    for: 5m
                    labels:
                      severity: critical
                    annotations:
                      description: Pod {{$labels.pod}} in {{$labels.namespace}} got OOMKilled

Helm throws that same error:

Error: parse error at (chart/templates/configmap.yaml:20): undefined variable "$labels"

Reading those 2 links above, the recommended solution is to just change the templating on the chart's side. Although it's super curious that the Flux HelmRelease is totally fine with that {{ $labels.pod }} syntax. Moving forward I think we have 2 options:

  1. Add an option for Zarf to completely bypass Helm when applying a kustomization or plain manifests
  2. Close this issue and I'll fix the templating on my end, because this probably isn't a problem Zarf needs to solve

@wirewc
Contributor

wirewc commented Feb 9, 2023

While I do not see this as a Zarf issue, I'm curious about how often we'll see this issue in the future. If it seems likely to recur, then we should make an attempt to address it within Zarf or, at the very least, document how to resolve the problem.

@UncleGedd
Contributor Author

UncleGedd commented Feb 9, 2023

Adding some more context to this conversation. I'm deploying a Big Bang cluster using Zarf in a connected env (using YOLO mode). Many of the use cases that Zarf was built for, we aren't using. What I'm interested in is using Zarf as a way to do declarative deployments (including declarative upgrades).

The problem my team and I face is that many of our platform upgrade steps are done either via a bash script or by manually interacting with the cluster. I'd like to use Zarf to declaratively define what upgrades look like.

We will also have to think about how this affects the package lifecycle and user messaging on the key (since we use helm right now to uninstall for example and this may break that)

100% agree @Racer159. For my use case, I'm fine with breaking the zarf destroy ... command, but I also don't want to break or confuse other Zarf users

@wirewc
Contributor

wirewc commented Feb 9, 2023

After a brief sync meeting, it looks like Zarf Actions would be a great place to address this specific deployment issue.

Relevant Context:

  • Zarf Actions is a replacement for Zarf Scripts
  • Zarf Actions were officially released yesterday in release v0.24
  • The issue is Helm complaining about a variable name that might need to be escaped (e.g. {{$labels.pod}})
    • Zarf Actions might be able to address the core issue, with the onDeploy action providing some sort of string replacement with Zarf variables (sketched below).
    • This functionality was not possible with Zarf Scripts alone.

Additional examples of Zarf Actions outside of the examples folder can be found in the IaC repository.
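
A rough sketch of what such an action could look like, assuming the onDeploy/before/cmd fields from the Zarf actions schema and a hypothetical helper script:

components:
  - name: batcave
    actions:
      onDeploy:
        before:
          # hypothetical script that rewrites the problematic {{$labels.*}} strings
          # before the component's manifests are applied
          - cmd: ./hack/escape-prometheus-templates.sh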

@UncleGedd UncleGedd mentioned this issue Feb 10, 2023
Racer159 pushed a commit that referenced this issue Feb 13, 2023
## Description

Wrap manifest/kustomizations in string literals within the helm chart
generation process if needed to avoid nested go template messes.
Supersedes #1347.

## Related Issue

Fixes #1337

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Other (security config, docs update, etc)

## Checklist before merging

- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide
Steps](https://github.com/defenseunicorns/zarf/blob/main/CONTRIBUTING.md#developer-workflow)
followed
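
To illustrate the approach described above: "wrapping in string literals" presumably means emitting the raw manifest through a Go-template raw string so Helm's renderer passes the inner {{ }} markers through untouched. A minimal sketch, not Zarf's actual generated chart:

# Hypothetical generated template: the backtick raw string is printed verbatim,
# so Helm never tries to resolve {{$labels.pod}} inside the manifest body
{{ `apiVersion: v1
kind: ConfigMap
metadata:
  name: monitoring-helmrelease-values-cm
data:
  values.yaml: |
    description: Pod {{$labels.pod}} in {{$labels.namespace}} got OOMKilled` }}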
@github-project-automation github-project-automation bot moved this from Doing Now to Done in Zarf Project Board Feb 13, 2023
Noxsios pushed a commit that referenced this issue Mar 8, 2023