Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Sops support #2580

Merged
merged 4 commits into from
Dec 11, 2019
Merged

Sops support #2580

merged 4 commits into from
Dec 11, 2019

Conversation

PaulFarver
Copy link
Contributor

@PaulFarver PaulFarver commented Nov 5, 2019

This is a pull request that adds support for sops OOTB as described in #1676

The point of this feature is to let flux decrypt files encrypted with sops. Sops is ideal for encryption of yaml files, and has support for AWS KMS, GCP KMS, Azure Key Vault and PGP. The idea is to get flux to decrypt yaml manifests that have been encrypted with sops.

  • The go.mozilla.org/sops library has been added as a dependency
  • load.go has been updated. The Load func now takes a new boolean argument (sopsEnabled), used to specify whether to decrypt yaml files with sops before applying them to the cluster.
  • The manifests struct has been updated with sopsEnabled, and the NewManifests function also takes this as an argument.
  • A new option has been added to the cli:
    • --sops is a boolean option for specifying whether sops decryption should be enabled

@carlpett
Copy link
Contributor

carlpett commented Nov 5, 2019

This is nice! 🎉 Did you have a cause to exec to the binary instead of linking sops as a library? Is something missing in go.mozilla.org/sops/decrypt?

@PaulFarver
Copy link
Contributor Author

PaulFarver commented Nov 5, 2019

This is nice! 🎉 Did you have a cause to exec to the binary instead of linking sops as a library? Is something missing in go.mozilla.org/sops/decrypt?

@carlpett I completely missed that. I think I was just too focused on copying the git-secret method. Of course that makes more sense. Will try it.

@2opremio
Copy link
Contributor

2opremio commented Nov 5, 2019

@PaulFarver thanks a lot for the effort! 🌹

I need to get familiar with sops and discuss it with the other maintainers to see what's the way forward. Please stay put.

@PaulFarver
Copy link
Contributor Author

The docker build seems to be failing when it has to run the known_hosts.sh. Seems gitlab is having some trouble

@carlpett
Copy link
Contributor

Btw, it might make sense to wait for getsops/sops#566, to get proper versioned modules (and the new import path)

@2opremio
Copy link
Contributor

@PaulFarver it seems that the e2e error was transient (or due to a flaky test, I need to investigate).

Good news is it started working again after rebuilding.

@PaulFarver
Copy link
Contributor Author

@2opremio Thanks!

@carlpett I think it's a good idea to wait until they get that merged. Is there anything I could work on for this, in the mean time. I'm guessing there might be some documentation and tests to be updated.

@2opremio
Copy link
Contributor

@PaulFarver I really like the shape this is taking! Some comments:

  1. Let's definitely wait for Upgrade sops to go 1.13 getsops/sops#566 (which I guess will be merged today)
  2. Let's stop requiring a special filename for sops-encrypted files and use the trial-and-error approach suggested by @ajvb
  3. Let's add a flag for enabling/disabling sops decryption: --sops. Let's make it opt-in unless the content-check is really cheap.

@PaulFarver
Copy link
Contributor Author

@2opremio Sounds good 👍 Working on it now

@2opremio
Copy link
Contributor

  1. Can we add a warning like the following? https://github.com/fluxcd/flux/pull/2159/files#diff-350e484490d6241fa81c13942d160480R296 (tweaking the text, of course, since the user may be using something else, like KMS).

@2opremio
Copy link
Contributor

2opremio commented Nov 12, 2019

and ...
5. Can you add tests? A unit test and, ideally, an end-to-end test as well

@2opremio
Copy link
Contributor

and ..
6. Can you add the new flag to site/daemon.md ?

@elzapp
Copy link
Contributor

elzapp commented Nov 12, 2019

How will this play along with Kustomize and other manifest generators?

@2opremio 2opremio self-assigned this Nov 12, 2019
@2opremio
Copy link
Contributor

2opremio commented Nov 12, 2019

How will this play along with Kustomize and other manifest generators?

Good question! Since the detection is content-based, it will work transparently (i.e. you should be able to generate sops-encrypted content)

@PaulFarver
Copy link
Contributor Author

How will this play along with Kustomize and other manifest generators?

As @2opremio has already stated, it should work transparently. I think you might run into some trouble, though.

Sops verifies a file with a checksum in the metadata. The checksum depends on when the file was modified, the values, and key structure. Even the key position. Also that of unencrypted values.

As I understand it, manifest generation is run before the yaml files are loaded in flux, which is when the sops decryption takes place. That means you would have to generate the valid sops encrypted file with kustomize build. It might be possible, but I think it would be pretty tedious, and hard to maintain.

@2opremio
Copy link
Contributor

That means you would have to generate the valid sops encrypted file with kustomize build. It might be possible, but I think it would be pretty tedious, and hard to maintain.

As a side note, kustomize is just one of the possibilities to generate manifests. Flux's approach to manifest generation (.flux.yaml files) is generic and admits any command (as long as it's present in the Flux image, int he future we will support supplying images to run commands through Ephemeral containers).

@elzapp
Copy link
Contributor

elzapp commented Nov 13, 2019

Sops verifies a file with a checksum in the metadata. The checksum depends on when the file was modified, the values, and key structure. Even the key position. Also that of unencrypted values.

Exactly this was my concern. It might be feasable to have the secrets not included with kustomize, and just cating them into the generated manifest, if sops only is concerned with the individual documents inside the yaml.

@PaulFarver
Copy link
Contributor Author

@elzapp

Exactly this was my concern. It might be feasable to have the secrets not included with kustomize, and just cating them into the generated manifest, if sops only is concerned with the individual documents inside the yaml.

Sops does have support for yaml multidocs, but you have to encrypt all documents in the file, so I don't think it aligns too well with the kustomize use case.

The poc of this feature worked in much the same way as the git-secret feature, where files are decrypted right after a checkout. This would support encryption of patches individually, which seems like a better workflow for people using manifest generation.

@2opremio We could pursue decrypt-first functionality, but with the trial-and-error approach suggested by @ajvb. What do you think?

@elzapp
Copy link
Contributor

elzapp commented Nov 13, 2019

Decrypt first would be the best, but it would require to resolve all the bases and resources from the kustomize files first, or decrypt any yaml the whole repository, not just those inside the git-path. (The latter wouldn't work if the kustomize files refers to resources outside the repository, but maybe that is an ok limitation.)

@PaulFarver
Copy link
Contributor Author

Right... I have another branch, that implements decrypt-first with the trial-and-error approach and the go.mozilla.org/sops library. If that is the way we want to move forward, I will switch the pull request to that branch.

@2opremio
Copy link
Contributor

2opremio commented Nov 13, 2019

How will this play along with Kustomize and other manifest generators?

Good question! Since the detection is content-based, it will work transparently (i.e. you should be able to generate sops-encrypted content)

After thinking about this further it turns out I wasn't completely right (sorry I didn't think it through right away).

At the current state of this PR, sops decryption is attempted after reading the files from the filesystem and not when parsing them (which is used both when loading files from disk and by manifest generation).

There is an easy fix though. @PaulFarver can you move the softSopsDecrypt() call into ParseMultidoc() instead?

Depending on how sops encrypts yaml document streams, we may want to attempt softSopsDecrypt() on the full stream and/or on each individual document, after extracting it from the stream. The goal is making sure that manifest generation can output a hybrid YAML stream, with encrypted and unencrypted documents.

This will allow you to do manifest generation with sop-encrypted manifests, included in the generation output, like follows:

.flux.yaml

version: 1
patchUpdated:
  generators:
    - command: kustomize build .
    - command: cat *.enc.yaml
  patchFile: flux-patch.yaml

Where files with suffix enc.yaml are encrypted with sops. Alternatively you could use a --git-path not using manifest generation (i.e. without an associated .flux.yaml) , directly containing the sops-encrypted files.

Note that the enc.yaml files cannot be passed as an input resource to Kustomize. This has the disadvantage of not being able to process sops-encrypted files with Kustomize directly (e.g. add namepace prefix to a decrypted secret).

Now I realize this might be what @elzapp was referring to with play along with Kustomize and other manifest generators.

A decrypt-first (in-place) approach would indeed solve this problem but I think it's overkill since, as @elzapp mentioned:

Decrypt first would be the best, but it would require to resolve all the bases and resources from the kustomize files first, or decrypt any yaml the whole repository, not just those inside the git-path.

Flux doesn't understand kustomization.yaml files (as I mentioned, Flux's approach to manifest generation is technology agnostic) which leaves walking every yaml file in the repo for decryption in every sync.

This approach wouldn't support external bases including sops-encrypted manifests and it would result in git repositories incompatible with standard Kustomize (which doesn't support sops manifests as input).

I think that implementing decrypt-first to solve this problem is just a workaround to compensate for the lack of interoperability between sops and Kustomize, which isn't Flux's job to solve.

As an alternative, and in addition to the current PR, we could include the sops binary in the Flux container, allowing to explicitly decrypt the files before passing them to Kustomize, as follows

.flux.yaml

version: 1
patchUpdated:
  generators:
    - command: for I in *.enc.yaml; do sops -d "$I" -o "${I%*enc.yaml}yaml"; done && kustomize build .
  patchFile: flux-patch.yaml

kustomization.yaml

resources:
- foo.yaml
- bar.yaml
.....

File layout

├── .flux.yaml
├── kustomization.yaml
├── foo.enc.yaml
└── bar.enc.yaml

docker/Dockerfile.flux Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
cmd/fluxd/main.go Outdated Show resolved Hide resolved
docs/references/daemon.md Outdated Show resolved Hide resolved
@PaulFarver PaulFarver marked this pull request as ready for review December 10, 2019 01:13
@PaulFarver
Copy link
Contributor Author

I've cleaned up a bit. I've rebased and squashed, so it should be a little more neatly applied to master, than the other sixty-odd commits😄

I got to thinking... It might be a little neater to split NewManifests(namespacer, logger, sopsEnabled) into NewManifests(namespacer, logger) and NewSopsManifests(namespacer, logger, sopsEnabled) in order to avoid polluting all other NewManifests calls. I've implemented that version, but I still have the other implementation on another branch, so I can switch to that, if it's preferable.

@2opremio
Copy link
Contributor

Really nice! This is ready to go apart from 3 minor comments.

@2opremio
Copy link
Contributor

Nice, I think this is ready to go. Re-running the tests since they seem to be failing for unrelated reasons.

@PaulFarver
Copy link
Contributor Author

Cool! Thanks for the patience @2opremio 🥇 It's been a pleasure

Copy link
Contributor

@2opremio 2opremio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great job! 💯

I may wait a bit for merging until #2672 is resolved

I may ask you to rebase on top on master when that happens

@2opremio
Copy link
Contributor

2opremio commented Dec 11, 2019

@PaulFarver would you mind rebasing on top of master and fixing the conflict (it's minor)?

add the sops v3.5 binary in the flux docker image for use, with
pre-processing tools such kustomize

Signed-off-by: Paul Farver <[email protected]>
Include sops decrypt library, and use it to decrypt files as they are
loaded by flux

Signed-off-by: Paul Farver <[email protected]>
add tests, that verify that a set of pre-encrypted files can be imported
and automatically decrypted with sops

Signed-off-by: Paul Farver <[email protected]>
add the "--sops" flag to switch sops decryption on and off when starting
the flux daemon

Signed-off-by: Paul Farver <[email protected]>
@PaulFarver
Copy link
Contributor Author

@2opremio Yep, done!

@ryedin
Copy link

ryedin commented Feb 18, 2020

@PaulFarver sorry to activate an old thread, but I'm trying to figure out how this works when my file was encrypted with AWS KMS. Would I need to first mount (somehow) a volume into the flux container that contains the ~/.aws/credentials for the IAM user that has permissions for the key?

It looks like the code here is less about generically "enabling SOPS" and more specifically "enabling sops with GPG". Is that accurate?

@2opremio
Copy link
Contributor

KMS should work, you just need to configure the environment if the Flux container as if you were to invoke the sops binary (which is also included in the Flux image, so it's a good way to test that it works).

@ryedin
Copy link

ryedin commented Feb 18, 2020

@2opremio that's essentially what I was asking how to do :) (you just need to configure the environment of the Flux container...)

sops uses the go aws sdk, which expects the credentials file to live at ~/.aws/credentials. So in order to get KMS to work in flux we have a chicken and an egg problem of sorts. Or at least we have this one set of secrets that has to be pre-baked, as it were, into the environment...

I'm not sure the best way to solve that problem. Unless I'm really misunderstanding something here

@tomjohnburton
Copy link

Don’t bother with credentials file. You can just export the variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. Set them up as environment variables for your flux container

@ryedin
Copy link

ryedin commented Feb 18, 2020

@tomjohnburton ... face / palm heh, thank you for that.

that said, I use helm to install flux. I see extraEnvs is available here https://github.com/fluxcd/flux/blob/master/chart/flux/README.md#configuration

... is it considered secure to pass those secrets in like that? that's what I'm alluding to when I say chicken and egg. I'm just not sure if pushing those in directly as env vars is actually a smart/safe thing to be doing.

@stefanprodan
Copy link
Member

Instead of storing the AWS key in a Kubernetes secret or env vars you should consider using IAM roles for service accounts. With eksctl you could associate an IAM role to Flux service account so that Flux pod can access the KMS API see https://eksctl.io/usage/iamserviceaccounts/

@ryedin
Copy link

ryedin commented Feb 18, 2020

ok, perfect! That's very helpful for when I get to the EKS stage. Currently proving everything out locally first. I think helm-secrets has what I need for this layer for now. First am going to push straight into env vars using a temp/throwaway IAM user just to prove out the e2e flow, then will work on improving with helm-secrets/volume mounts, then "harden" once in EKS and switching to the service account/role approach.

Thank you for the help guys!

@patrick-armitage
Copy link

Would anyone have a running example of eksctl-enabled-flux using sops + KMS key to decrypt files?

@ryedin
Copy link

ryedin commented Apr 1, 2020

yes, this has been working very well. we use a simple script we named setup-flux.sh. Obviously change the env vars to be what you want to use, but we use a specific AWS IAM user for this. Ostensibly you could use roles instead, but we didn't get that far yet.

I've kept the stuff in there where we're allowing the script to control setting up multiple instances of flux that each watch a different branch of the repo and a specific path. You may or may not find that part useful, but it's how we are dealing with running multiple flux-enabled CI/CD projects in the same cluster. Obviously change paths and whatnot to match your needs.

The important bits are in the helm command set options near the bottom.
The sops related requirements are the SOPS_KMS_ARN env var and the access key/secret pair. This assumes you have setup a symmetric KMS key (or keys) and have the ARN(s) to plug in.

#!/bin/bash
set -e

if [[ ! -x "$(command -v kubectl)" ]]; then
    echo "kubectl not found"
    exit 1
fi

if [[ ! -x "$(command -v helm)" ]]; then
    echo "helm not found"
    exit 1
fi

if [[ ! -x "$(command -v fluxctl)" ]]; then
    echo "fluxctl not found"
    exit 1
fi

# requires the ENV calling this script to have flux aws user credentials
ACCESS_KEY_ID=$FLUX_AWS_ACCESS_KEY_ID
SECRET_ACCESS_KEY=$FLUX_AWS_SECRET_ACCESS_KEY

# default flux release name
FLUX_NAMESPACE=fluxcd

# default branch and path to sandbox
GITHUB_BRANCH=sandbox
GIT_PATH=kubernetes/flux/environments/sandbox/releases/sandbox

# allow overriding branch
if [[ $CLUSTER_ENV = "sandbox" ]]; then
  FLUX_NAMESPACE=flux-sandbox
  GITHUB_BRANCH=sandbox
  GIT_PATH=kubernetes/flux/environments/sandbox/releases/sandbox
  echo "using sandbox branch"
fi

if [[ $CLUSTER_ENV = "qa" ]]; then
  FLUX_NAMESPACE=flux-qa
  GITHUB_BRANCH=qa
  GIT_PATH=kubernetes/flux/environments/qa/releases/qa
  echo "using qa branch"
fi

if [[ $CLUSTER_ENV = "production" ]]; then
  FLUX_NAMESPACE=flux-production
  GITHUB_BRANCH=master
  GIT_PATH=kubernetes/flux/environments/production/releases/production
  echo "using master branch"
fi

# ensure namespace exists (each env uses it's own namespace)
kubectl create namespace $FLUX_NAMESPACE || echo "$FLUX_NAMESPACE namespace already exists"

# ensure helm repo added
helm repo add fluxcd https://charts.fluxcd.io

echo "Setting up flux controller"
helm upgrade -i $FLUX_NAMESPACE fluxcd/flux --wait \
  --namespace $FLUX_NAMESPACE \
  --set git.pollInterval=1m \
  --set git.ciSkip=true \
  --set [email protected]:<YOUR ORG>/<YOUR REPO> \
  --set git.path=$GIT_PATH \
  --set git.branch=$GITHUB_BRANCH \
  --set sops.enabled=true \
  --set extraEnvs[0].name=SOPS_KMS_ARN,extraEnvs[0].value=$SOPS_KMS_ARN \
  --set extraEnvs[1].name=AWS_ACCESS_KEY_ID,extraEnvs[1].value=$ACCESS_KEY_ID \
  --set extraEnvs[2].name=AWS_SECRET_ACCESS_KEY,extraEnvs[2].value=$SECRET_ACCESS_KEY

# output the ssh public key, which needs to be applied as a deploy key to the repo
echo "Flux has been installed. ssh public key:"
fluxctl identity --k8s-fwd-ns $FLUX_NAMESPACE

echo "Setting up helm-operator"
helm upgrade -i $FLUX_NAMESPACE-helm-operator fluxcd/helm-operator --wait \
  --namespace $FLUX_NAMESPACE \
  --set git.ssh.secretName=$FLUX_NAMESPACE-git-deploy \
  --set git.timeout=300s \
  --set helm.versions=v3 \
  && echo "helm-operator installed"

Once setup, this works great. Any SOPS encrypted manifests flux finds in the configured path in the repo, as long as they were encrypted with that same KMS ARN, will be automatically decrypted.

Make sure to remember for kubernetes Secrets manifests, the values need to be base64 encoded before you SOPS encrypt the doc. that bit me in the butt once or twice, but that's just basic k8s stuff.

@ryedin
Copy link

ryedin commented Apr 1, 2020

And to be clear, there is nothing special about whether your cluster was setup using eksctl or not. Just make sure your current context is using the cluster you want to set flux up in. (highly recommend using kubectx if you're not already)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants