Pruning resources #88

Closed
malcolmholmes opened this issue Oct 1, 2019 · 12 comments · Fixed by #251
Labels
component/kubernetes Working with Kubernetes clusters keepalive kind/feature Something new should be added

Comments

@malcolmholmes
Contributor

At present, Tanka will not remove Kubernetes resources that have been removed from the respective Jsonnet configuration, and such removals need to be done manually.

Terraform (for example) handles this by maintaining a record of 'intended state'. If an object is present in the state file, but not in the Jsonnet configuration, then we can deduce that it needs to be removed.

Given that Kubernetes already maintains its own state, and that this is already consumed for tk diff etc., we could likely achieve deletions if we simply stored a list of resources created by this Tanka environment within, for example, a single CRD. This CRD could be named within the spec.json file.

Thus, when either tk diff or tk apply is executed, Tanka can look for any resources that are present in this CRD but are not present in the Jsonnet output. Such resources are candidates for deletion. tk diff could include them in its output, and tk apply could use kubectl delete to remove them.

We should consider whether deletions should be a standard feature of Tanka, or whether they would need enabling via a command line switch.
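
A minimal sketch of the comparison step described above, assuming the recorded state is available as a set of resource identifiers; the identifier format and function names are purely illustrative, not Tanka's actual code:

package main

import "fmt"

// pruneCandidates returns every identifier that is recorded in the stored
// state but no longer present in the rendered Jsonnet output.
func pruneCandidates(recorded, rendered map[string]bool) []string {
	var candidates []string
	for id := range recorded {
		if !rendered[id] {
			candidates = append(candidates, id)
		}
	}
	return candidates
}

func main() {
	recorded := map[string]bool{"apps/v1:Deployment:grafana": true, "v1:ConfigMap:dashboards": true}
	rendered := map[string]bool{"apps/v1:Deployment:grafana": true}
	fmt.Println(pruneCandidates(recorded, rendered)) // [v1:ConfigMap:dashboards]
}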

@issue-label-bot

Issue-Label Bot is automatically applying the label kind/feature to this issue, with a confidence of 0.91. Please mark this comment with 👍 or 👎 to give our bot feedback!


@issue-label-bot issue-label-bot bot added the kind/feature Something new should be added label Oct 1, 2019
@sh0rez
Member

sh0rez commented Oct 8, 2019

Hi, I like this idea!

We might not even need a CRD for this purpose, because Kubernetes labels should provide enough functionality to do what we need:

  1. On creation, Tanka injects a label into every object it creates (e.g. tanka.dev/origin: default/loki@config-repo; see the sketch after this comment)
  2. On apply / diff, kubectl get -l allows us to get a list of all previously created objects, which we can use to easily compute the objects to remove

It might make sense to keep this optional behind a flag such as tk apply --tidy, which, if supplied, would also include these objects as deleted in the diff.
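
A minimal sketch of the label injection from step 1 above, operating on a manifest decoded into a map; the label key and value mirror the example given, but none of this is Tanka's actual implementation:

package main

import "fmt"

// injectLabel adds a label to a manifest represented as a decoded map,
// creating the metadata and labels maps if they are missing.
func injectLabel(manifest map[string]interface{}, key, value string) {
	metadata, ok := manifest["metadata"].(map[string]interface{})
	if !ok {
		metadata = map[string]interface{}{}
		manifest["metadata"] = metadata
	}
	labels, ok := metadata["labels"].(map[string]interface{})
	if !ok {
		labels = map[string]interface{}{}
		metadata["labels"] = labels
	}
	labels[key] = value
}

func main() {
	deployment := map[string]interface{}{
		"apiVersion": "apps/v1",
		"kind":       "Deployment",
		"metadata":   map[string]interface{}{"name": "loki"},
	}
	injectLabel(deployment, "tanka.dev/origin", "default/loki@config-repo")
	fmt.Println(deployment["metadata"]) // map[labels:map[tanka.dev/origin:default/loki@config-repo] name:loki]
}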

@sh0rez sh0rez added the component/kubernetes Working with Kubernetes clusters label Oct 8, 2019
@sh0rez sh0rez self-assigned this Oct 8, 2019
@malcolmholmes
Contributor Author

This is such a delightful simplification. I'll see if I can get it done in the next week or so.

@sh0rez
Member

sh0rez commented Oct 8, 2019

Can you draft a design doc first? This is one of the bigger features, and I would like some additional discussion around it so we can make sure we get it right on the first take before we write anything to the users' clusters.

@malcolmholmes
Contributor Author

I considered a design doc, but needed to get my hands dirty to prove it could work. Totally happy to rewrite the PR I just submitted based upon better ideas.

@stale

stale bot commented Nov 24, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale

stale bot commented Dec 28, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 28, 2019
@sh0rez sh0rez removed the stale label Dec 29, 2019
@Bessonov

+1 for the labels. Docker Compose does it the same way:

            "Labels": {
                "com.docker.compose.config-hash": "32e9d85cbe1711281c7a3b57905307c3a204af30a7b0e4d92f614ec9b307d3e6",
                "com.docker.compose.container-number": "1",
                "com.docker.compose.oneoff": "False",
                "com.docker.compose.project": "dev",
                "com.docker.compose.service": "dev",
                "com.docker.compose.version": "1.16.1"
            }

@sh0rez sh0rez changed the title Tanka support for deletion by maintaining state Pruning resources Feb 29, 2020
@woodsaj
Member

woodsaj commented Mar 9, 2020

Using labels on resources is difficult, as it requires checking for resources across every apiVersion/kind that kube-api supports. With CRDs, this becomes even harder.

I think a simple "state" file approach would work best here, e.g. a simple ConfigMap is used to list all of the resources created, with some simple metadata about each. The ConfigMap should be named based on the environment name defined in the spec.json metadata, e.g.:

apiVersion: v1
kind: ConfigMap
metadata:
  name: default
  namespace: monitoring
data:
  state.json: |
    {
      "apps/v1:Deployment:grafana": {
        "timestamp": "2020-03-09T19:46:00+00:00",
        "user": "woodsaj"
      },
      "v1:ConfigMap:grafana-dashboards": {
        "timestamp": "2020-03-09T19:46:02+00:00",
        "user": "woodsaj"
      }
    }

When new resources are deployed, they are added to the state file. Entries are only removed from the state file after the resources have been removed from k8s.

E.g. when tk apply is run, it fetches this ConfigMap and loads the state. The new state.json is then created internally after generating all of the resources from Jsonnet. Tanka can then inform the user of any resources that are defined in the current state but not in the new state, and which will therefore be pruned after the apply completes.

If tk apply -t is used, the existing state will just be updated, adding new resources or simply updating the metadata for the specific resources being applied.

As an alternative to storing the state in ConfigMaps, we could take the Terraform approach and have pluggable support for state storage, e.g. S3/GCS. Though I would prefer to just start with the ConfigMap approach; we can always add support for other storage options in the future.
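
A hedged sketch of the state.json bookkeeping described above; the entry struct and the apiVersion:Kind:name key follow the example ConfigMap, but the Go types and function names are assumptions for illustration only:

package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stateEntry is an assumed shape for the per-resource metadata shown in the
// example ConfigMap above.
type stateEntry struct {
	Timestamp time.Time `json:"timestamp"`
	User      string    `json:"user"`
}

// stateKey builds the "apiVersion:Kind:name" identifier used as the map key.
func stateKey(apiVersion, kind, name string) string {
	return fmt.Sprintf("%s:%s:%s", apiVersion, kind, name)
}

func main() {
	state := map[string]stateEntry{}

	// Record a newly applied resource; on apply, entries like this would be
	// written back into the ConfigMap's state.json.
	state[stateKey("apps/v1", "Deployment", "grafana")] = stateEntry{
		Timestamp: time.Now().UTC(),
		User:      "woodsaj",
	}

	out, _ := json.MarshalIndent(state, "", "  ")
	fmt.Println(string(out))
}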

@captncraig
Contributor

I worry the ConfigMap option has a lot of edge cases that need to be considered when actually implementing it. If it gets out of sync through asynchronous runs, inconsistent tk versions, or manual changes, it becomes difficult to feel confident that it is still accurate. The labels are much simpler, and I like to hope the discovery problem is solvable.

I have a PoC for scanning a cluster for all resources of all types, including CRDs.

Repository here. It uses the discovery and dynamic APIs to enumerate all types in your cluster and scan them all with a label filter. It usually takes under 10 seconds to scan everything and list out objects.
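
A rough sketch of such a scan using client-go's discovery and dynamic clients; this is not the linked PoC's code, error handling is minimal, and the label selector value is an assumed example:

package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	disc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}
	dyn, err := dynamic.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Enumerate the preferred version of every resource type the cluster serves.
	lists, _ := disc.ServerPreferredResources()
	for _, list := range lists {
		gv, err := schema.ParseGroupVersion(list.GroupVersion)
		if err != nil {
			continue
		}
		for _, res := range list.APIResources {
			// Skip subresources and anything that cannot be listed.
			if strings.Contains(res.Name, "/") || !hasVerb(res.Verbs, "list") {
				continue
			}
			// List objects of this type carrying the (assumed) environment label.
			objs, err := dyn.Resource(gv.WithResource(res.Name)).List(context.Background(),
				metav1.ListOptions{LabelSelector: "tanka.dev/environment=default"})
			if err != nil {
				continue
			}
			for _, o := range objs.Items {
				fmt.Printf("%s/%s %s/%s\n", gv.String(), res.Kind, o.GetNamespace(), o.GetName())
			}
		}
	}
}

func hasVerb(verbs []string, verb string) bool {
	for _, v := range verbs {
		if v == verb {
			return true
		}
	}
	return false
}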

So my proposed implementation would be something like:

  1. All resources managed by tanka get labeled with tanka.dev/environment: $ENV (I think we should do this regardless of pruning solution and will make another issue for that)
  2. Add a --prune flag to tk diff and tk apply that will scan the cluster for resources matching the above label and list them out and/or delete them.

It does require bringing in the client-go packages, which I hear we have been hesitant to do so far. I would also like to discuss that, but probably in another issue.

@sh0rez
Member

sh0rez commented Apr 3, 2020

@captncraig some thoughts on this:

All resources managed by tanka get labeled with tanka.dev/environment: $ENV (I think we should do this regardless of pruning solution and will make another issue for that)

Yes!

It does require bringing in the client-go packages, which I hear we have been hesitant to do so far. I would also like to discuss that, but probably in another issue.

I don't think we need that. We already query kubectl api-resources on every run anyway, so we know about all kinds active in the cluster. Also, it reports only the latest version of each API, which allows us to check only those, thus saving us from doing advanced cross-version logic.
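
For illustration, a kubectl-only sketch under that assumption: enumerate listable kinds via kubectl api-resources, then fetch everything carrying the environment label (the label value here is just an example, not an agreed-upon scheme):

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	// Ask kubectl which resource kinds support "list" (output like "deployments.apps").
	out, err := exec.Command("kubectl", "api-resources", "--verbs=list", "-o", "name").Output()
	if err != nil {
		panic(err)
	}
	kinds := strings.Fields(string(out))

	// Fetch every object of those kinds, across all namespaces, that carries the label.
	args := []string{"get", strings.Join(kinds, ","), "--all-namespaces",
		"-l", "tanka.dev/environment=default", "-o", "name"}
	labelled, err := exec.Command("kubectl", args...).CombinedOutput()
	fmt.Print(string(labelled))
	if err != nil {
		panic(err)
	}
}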

Add a --prune flag to tk diff and tk apply that will scan the cluster for resources matching the above label and list them out and/or delete them.

While my #131 did that, it became obvious that it has drawbacks, such as accidentally deleting resources when applying because you forgot to rebase against master beforehand.

Instead, I'd prefer to have this as a separate action, tk prune, which shows a prune-only diff beforehand, much like a tk delete would behave if we found the time to implement it, so it is a very explicit action.

@captncraig
Contributor

I like an explicit command. tk prune --dry-run and tk prune would make it very obvious what you want to do. Perhaps once we merge in the labels, we can do more experimentation with how well kubectl alone can pick up changes.
