Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow inline environments to be pruned #511

Merged
merged 1 commit into from
Feb 18, 2021

Conversation

craigfurman
Copy link
Contributor

@craigfurman craigfurman commented Feb 8, 2021

By adding the name field to the prune flags. This just works due to
environment loading logic being shared across subcommands.

Base NameLabel on both namespace and name. The namespace of inline
environments includes the path on disk, and multiple environments with
the same given name cannot be loaded from the same path on disk, so the
tuple of these 2 fields is guaranteed to be unique.

Use a truncated hash of a combination of these 2 fields to form a label
that is guaranteed to fit in Kubernetes' 63 character limit for label
values.

This fixes a bug in which 2 inline environments with identical
metadata.Name, applied separately from 2 paths on disk to the same
Kubernetes namespace would create resources with identical
tanka.dev/environment labels, and so pruning one release would delete
the resources of the other.

When upgrading to this tanka version, environments with
spec.injectLabels=true will show diffs on the tanka.dev/environment
label, and prune-diffs will run clean, even if prune-able resources were
previously visible. After applying the label change, the previously
prune-able resources should be prune-able.

Add test case for the "don't load multiple envs" assertion.

Fixes #510

@sh0rez
Copy link
Member

sh0rez commented Feb 8, 2021

Nice! This indeed enables the feature, but there is a bit more to it which is very subtle.

Label uniqueness

tk prune discovers what to prune based on a label called tanka.dev/environment, which value is generated by *Environment.Metadata.NameLabel():

func (m Metadata) NameLabel() string {
return strings.Replace(m.Name, "/", ".", -1)
}

This works fine for static environments, because they have their name generated based on their path on disk. Inline environments however permit any user-given name, and instead use the Metadata.Namespace (not to be confused with spec.namespace) field for their path on disk.

This means to generate an unique label above function would need to include information from both Metadata.Name and Metadata.Namespace. Otherwise running tk prune on a equally named environment would wrongly delete resources.

Update: The new label could look something like <namespace>:<name>, e.g. environments/default/main.jsonnet:myEnv. Hope that illustrates it

Environment uniqueness

Currently it is possible to define two inline environments of the same name in the same file. This is merely an oversight and not a guaranteed feature, but it breaks above uniqueness (again).
Given we never gave support guarantees, we can easily avoid that by having tanka.List() fail when it encounters two environments of the same name.


Given this increase in complexity, let us know whether you are still interested in doing this. Of course we would be there to assist if you wanna do it yourself :)

@sh0rez sh0rez added component/kubernetes Working with Kubernetes clusters kind/feature Something new should be added labels Feb 8, 2021
@craigfurman
Copy link
Contributor Author

Thanks for the quick reply @sh0rez!

Inline environments however permit any user-given name, and instead use the Metadata.Namespace (not to be confused with spec.namespace) field for their path on disk.

Am I right to assume you mean Metadata.Name, not Namespace? If so this matches my mental model of inline environments: their name is the value of metadata.name, not the path on disk.

I want to make sure I understand the problem you go on to describe, with prune not being currently able to guarantee that it will not delete the "wrong" resources. A scenario I can think of that could cause this:

  1. A tanka repo with 2 environment paths on disk, each using inline environments: environments/A/main.jsonnet, and environments/B/main.jsonnet.
  2. Each main.jsonnet declares a tanka.Environment with identical metadata.name: "foo", in identical namespaces.
  3. After tk applying each environment (tk apply environments/A --name foo; tk apply environments/B --name foo), you have various resources in the same namespace with identical prune labels.
  4. tk prune environments/A --name foo will delete the resources from B, in addition to the resources from A. Whoops!

The new label could look something like :, e.g. environments/default/main.jsonnet:myEnv

By making the path-on-disk part of the prune label, we'd avoid the scenario above. That makes sense. Whether we change the value of tanka.dev/environment or change inline environments only to use a new label, won't that create a breaking change for tanka users who've started using inline environments since 0.14.0 came out? I can't immediately think of a way out of that, and we do need to support prunes. Breaking changes in this way might be preferable to the possibility of deleting the wrong resources! FWIW, using the same label key seems best.

Currently it is possible to define two inline environments of the same name in the same file.

I think I understand you, are you saying that this is an invalid state that we should guard against? If so, that makes sense: a tk-eval tree with 2 tanka.Environments with the same metadata.name should not be valid. This is different from the scenario I describe above.


If I've indeed got the right end of the stick, I'm happy to add to this PR 👍

@craigfurman
Copy link
Contributor Author

Concatenating paths on disk and environment names might commonly come close to the k8s 63 character limit on label values. Wdyt of truncate(hashsum(sprintf("%s:%s", pathOnDisk, Environment.metadata.name))?

@sh0rez
Copy link
Member

sh0rez commented Feb 8, 2021

Am I right to assume you mean Metadata.Name, not Namespace? If so this matches my mental model of inline environments: their name is the value of metadata.name, not the path on disk.

I actually meant metadata.namespace. Take a look how we parse environments:

// static env at /environments/default
{
  kind: "Environment",
  metadata: {
    name: "environments/default",
    namespace: "environments/default/main.jsonnet"
  }
}

// inline env at /environments/default
{
  kind: "Environment",
  metadata: {
    name: "whateverYouChose",
    namespace: "environments/default/main.jsonnet"
  }
}

// inline env at /environments/default/foo.jsonnet
{
  kind: "Environment",
  metadata: {
    name: "whateverYouChose",
    namespace: "environments/default/foo.jsonnet"
  }
}

So namespace always holds the path to the file that was used as the entrypoint. For static envs, name also holds the path to the directory the entrypoint is enclosed in.


I want to make sure I understand the problem you go on to describe, with prune not being currently able to guarantee that it will not delete the "wrong" resources. A scenario I can think of that could cause this ...

That scenario is exactly what I had in mind.


Breaking changes in this way might be preferable to the possibility of deleting the wrong resources

Totally! It wouldn't be too bad though. If newer Tanka versions would start to use different label values, the worst thing that could happen is that they would miss resources created by previous versions and not yet re-applied.

Same for older versions, they would miss resources by newer versions. afaict no resources would falsely be deleted.


a tk-eval tree with 2 tanka.Environments with the same metadata.name should not be valid. This is different from the scenario I describe above.

Yup it's different, but it would lead to equal name+namespace, but we should disallow it anyways.


Wdyt of truncate(hashsum(sprintf("%s:%s", pathOnDisk, Environment.metadata.name))?

We thought of this option before, sounds reasonable. If we are going to break anyways, then yeah, let's fix both issues


I hope that clarifies it enough for you to get started. Let me know if anything isn't clear yet

@craigfurman
Copy link
Contributor Author

Ah ha, the part I was missing was that metadata.Namespace already contains the path on disk:

tanka/pkg/spec/spec.go

Lines 41 to 55 in a43a839

namespace, err := filepath.Rel(root, file)
if err != nil {
return nil, err
}
data, err := ioutil.ReadFile(filepath.Join(base, Specfile))
if err != nil {
if os.IsNotExist(err) {
c := v1alpha1.New()
c.Metadata.Name = name // legacy behavior
c.Metadata.Namespace = namespace
return c, ErrNoSpec{path}
}
return nil, err
}

I think I have enough info to get started, and thanks again 👍

@Duologic
Copy link
Member

Duologic commented Feb 9, 2021

We thought of this option before, sounds reasonable. If we are going to break anyways, then yeah, let's fix both issues

Break one thing, fix two. Sounds like a win to me. :)

I did some work towards this in #433, please have look, might want to steal some ideas.

@craigfurman craigfurman marked this pull request as draft February 10, 2021 17:21
By adding the name field to the prune flags. This just works due to
environment loading logic being shared across subcommands.

Base NameLabel on both namespace and name. The namespace of inline
environments includes the path on disk, and multiple environments with
the same given name cannot be loaded from the same path on disk, so the
tuple of these 2 fields is guaranteed to be unique.

Use a truncated hash of a combination of these 2 fields to form a label
that is guaranteed to fit in Kubernetes' 63 character limit for label
values.

This fixes a bug in which 2 inline environments with identical
metadata.Name, applied separately from 2 paths on disk to the same
Kubernetes namespace would create resources with identical
`tanka.dev/environment` labels, and so pruning one release would delete
the resources of the other.

When upgrading to this tanka version, environments with
spec.injectLabels=true will show diffs on the `tanka.dev/environment`
label, and prune-diffs will run clean, even if prune-able resources were
previously visible. After applying the label change, the previously
prune-able resources should be prune-able.

Add test case for the "don't load multiple envs" assertion.

Signed-off-by: Craig Furman <[email protected]>
@craigfurman craigfurman marked this pull request as ready for review February 11, 2021 08:44
@craigfurman
Copy link
Contributor Author

@sh0rez @Duologic I think this is ready for review now, would you mind taking a look? I've updated the commit message / description to reflect the changes requested earlier.

return strings.Replace(m.Name, "/", ".", -1)
partsHash := sha256.Sum256([]byte(fmt.Sprintf("%s:%s", m.Name, m.Namespace)))
chars := []rune(hex.EncodeToString(partsHash[:]))
return string(chars[:48])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no science at all to my choice here of 48 🤷‍♂️

@Duologic Duologic requested review from sh0rez and Duologic February 16, 2021 14:42
Copy link
Member

@sh0rez sh0rez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I see this all looks very nice! LGTM

Copy link
Member

@Duologic Duologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎆 Looks great. The diff will be rather big on our internal environments and we'd need to communicate this well but it is totally worth fixing this hurdle properly. Thank you!

@Duologic Duologic merged commit 4d269e5 into grafana:master Feb 18, 2021
@craigfurman craigfurman deleted the prune-inline-envs branch February 18, 2021 09:53
@craigfurman
Copy link
Contributor Author

Nice one, thanks for reviewing this. I plan to start using this at GitLab pretty soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/kubernetes Working with Kubernetes clusters kind/feature Something new should be added
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pruning inline environments
3 participants