House keeper for Pipeline related stuff #64

Closed

vincent-pli opened this issue Aug 7, 2019 · 17 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@vincent-pli
Member

Expected Behavior

When an event is received, the EventListener deployment creates at least one PipelineResource and one PipelineRun, and those resources then fall outside of Triggers' control.
That causes a problem: after a while, hundreds of pipeline-related resources are left behind.

So I think we need to set up a garbage-collection/housekeeping mechanism for this situation.
I saw #48, but it requires deleting the EventListener to clean everything up, and I think the EventListener is designed for long-term running.

Actual Behavior

Currently, there is nothing for this.

@vtereso

vtereso commented Aug 8, 2019

I agree that resource build-up could/would become cumbersome since we are generating resources based on events. Triggers is enabling users to create these resources simply, which only exacerbates this Tekton issue. Maybe an option could be added to the PipelineRuns/TaskRuns, where PipelineResources can be deleted after runs? In terms of handling arbitrary resource deletion, I don't know how much better we can do than EventListener metav1.OwnerReference-based removal (or something of the sort) or having some time-based deletion.
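To make the owner-reference idea concrete, here is a minimal sketch (not actual Triggers code) of stamping an EventListener OwnerReference onto a triggered resource so that deleting the listener cascades to everything it created; the API group/version and the function name are assumptions for illustration.

// Illustrative sketch only: mark a triggered resource as owned by its
// EventListener so the Kubernetes garbage collector removes it when the
// listener is deleted. The apiVersion below is an assumption; adjust it to
// the installed Triggers version.
package example

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/types"
)

// setListenerOwner sets a controller OwnerReference on obj pointing at the
// EventListener identified by listenerName/listenerUID.
func setListenerOwner(obj *unstructured.Unstructured, listenerName string, listenerUID types.UID) {
    controller := true
    obj.SetOwnerReferences([]metav1.OwnerReference{{
        APIVersion: "triggers.tekton.dev/v1alpha1", // assumption, not verified
        Kind:       "EventListener",
        Name:       listenerName,
        UID:        listenerUID,
        Controller: &controller,
    }})
}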

@vincent-pli
Member Author

Could we add a metav1.OwnerReference pointing to a PipelineRun on the other resources (everything except the PipelineRun itself)? I believe there is at least one PipelineRun, right?

So the problem is when to delete the PipelineRun:

  • Use an informer for PipelineRun, monitor the PipelineRun, and delete it when it completes (a rough sketch follows below).
  • Change the Pipeline referenced by the PipelineRun to add a last task that sends an event back to the EventListener; when that event is received, delete the related PipelineRun.
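A rough sketch of the first option, using a dynamic informer from client-go instead of Tekton's generated clients; the GroupVersionResource and the "Succeeded" condition check are assumptions about the PipelineRun API of that time, not verified against it.

// Sketch: watch PipelineRuns and delete each one as soon as its "Succeeded"
// condition is no longer Unknown (i.e. the run finished, pass or fail).
package example

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
    "k8s.io/client-go/dynamic/dynamicinformer"
    "k8s.io/client-go/tools/cache"
)

// Assumed GVR for PipelineRuns at the time this issue was filed.
var pipelineRunGVR = schema.GroupVersionResource{
    Group:    "tekton.dev",
    Version:  "v1alpha1",
    Resource: "pipelineruns",
}

// watchAndDeleteCompleted blocks until ctx is cancelled, deleting finished runs.
func watchAndDeleteCompleted(ctx context.Context, client dynamic.Interface, namespace string) {
    factory := dynamicinformer.NewFilteredDynamicSharedInformerFactory(client, 0, namespace, nil)
    informer := factory.ForResource(pipelineRunGVR).Informer()

    informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        UpdateFunc: func(_, newObj interface{}) {
            pr := newObj.(*unstructured.Unstructured)
            if !runFinished(pr) {
                return
            }
            err := client.Resource(pipelineRunGVR).Namespace(pr.GetNamespace()).
                Delete(ctx, pr.GetName(), metav1.DeleteOptions{})
            if err != nil {
                fmt.Printf("failed to delete %s: %v\n", pr.GetName(), err)
            }
        },
    })
    factory.Start(ctx.Done())
    factory.WaitForCacheSync(ctx.Done())
    <-ctx.Done()
}

// runFinished inspects status.conditions for a "Succeeded" condition whose
// status has settled to True or False.
func runFinished(pr *unstructured.Unstructured) bool {
    conditions, _, _ := unstructured.NestedSlice(pr.Object, "status", "conditions")
    for _, c := range conditions {
        cond, ok := c.(map[string]interface{})
        if !ok {
            continue
        }
        if cond["type"] == "Succeeded" && cond["status"] != "Unknown" {
            return true
        }
    }
    return false
}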

@vtereso

vtereso commented Aug 9, 2019

PipelineResources are created ahead of time compared to PipelineRuns (since they do not support being embedded currently?), and likely only some of them should have owner references. If a resource doesn't have a specific SHA/revision/tag, it's potentially reusable, so deleting it might be bad? That aside, I guess it is possible to add a metav1.OwnerReference to referenced PipelineResources?

@vtereso

vtereso commented Aug 9, 2019

This has also been discussed here: tektoncd/pipeline#544

@vincent-pli
Member Author

Proposal:

  1. Design a new kind: TriggerExecutor
apiVersion: tekton.dev/v1alpha1
kind: TriggerExecutor
metadata:
  name: some-executor
spec:
  pipelineruns: []   # []corev1.ObjectReference, one entry per created PipelineRun

A TriggerExecutor will be created by the EventListener controller, with a PipelineRun reference added for every PipelineRun created for the received event.

  2. Enhance EventTemplate
type TriggerResourceTemplate struct {
    DeleteAfterComplete bool `json:"deleteAfterComplete,omitempty"`
    json.RawMessage     `json:",inline"`
}

At creation time, every resource in the EventTemplate should:

  • Have its owner set to the EventListener.
  • If DeleteAfterComplete is true, also have its owner set to the TriggerExecutor.

The TriggerExecutor controller will watch both TriggerExecutor and PipelineRun resources (a rough sketch of this check follows below):

  • Check the status of each PipelineRun; when it is done, delete it.
  • Check the list of PipelineRuns in the TriggerExecutor spec; if every PipelineRun is complete (deleted), delete the TriggerExecutor itself.
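For illustration only, a hypothetical sketch of the TriggerExecutor reconcile check described above; TriggerExecutor does not exist in Triggers, so the type and helper below are invented for this proposal.

// Hypothetical types for the proposed TriggerExecutor CRD and the check its
// controller would run on each reconcile.
package example

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// TriggerExecutor records the PipelineRuns created for one received event.
type TriggerExecutor struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              TriggerExecutorSpec `json:"spec"`
}

type TriggerExecutorSpec struct {
    PipelineRuns []corev1.ObjectReference `json:"pipelineruns,omitempty"`
}

// allRunsGone reports whether every referenced PipelineRun has already been
// deleted; stillExists stands in for whatever lister/lookup the controller
// would use. When it returns true, the controller can delete the
// TriggerExecutor itself, which cascades to anything it still owns.
func allRunsGone(te *TriggerExecutor, stillExists func(corev1.ObjectReference) bool) bool {
    for _, ref := range te.Spec.PipelineRuns {
        if stillExists(ref) {
            return false
        }
    }
    return true
}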

@vtereso

vtereso commented Aug 26, 2019

I believe the next step being taken to address the resource proliferation is to embed PipelineResources within PipelineRuns, which will implicitly set the owner reference. I see here that job controllers have taken a TTL approach to deletion as well. @bobcatfish seems like it would be relatively simple?
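For reference, this is the Job TTL mechanism mentioned above (the ttlSecondsAfterFinished field handled by Kubernetes' TTL-after-finished controller, alpha at the time this comment was written); a similar field on PipelineRun is only an idea here, not an existing API.

// A plain Kubernetes Job that the TTL-after-finished controller deletes one
// hour after it completes; shown only to illustrate the mechanism referenced
// in the comment above.
package example

import (
    batchv1 "k8s.io/api/batch/v1"
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func jobWithTTL() *batchv1.Job {
    ttl := int32(3600)
    return &batchv1.Job{
        ObjectMeta: metav1.ObjectMeta{Name: "one-shot"},
        Spec: batchv1.JobSpec{
            TTLSecondsAfterFinished: &ttl, // cleaned up 1h after completion
            Template: corev1.PodTemplateSpec{
                Spec: corev1.PodSpec{
                    RestartPolicy: corev1.RestartPolicyNever,
                    Containers: []corev1.Container{{
                        Name:    "work",
                        Image:   "busybox",
                        Command: []string{"echo", "done"},
                    }},
                },
            },
        },
    }
}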

@vincent-pli
Member Author

Yes, but besides the embedded case, Pipelines also supports the original (referenced) way,
and we also need to handle arbitrary resource deletion.

This is a very rough draft of a solution, but I think we need to consider the issue now; suggestions are always welcome.

@vtereso

vtereso commented Aug 26, 2019

For generic garbage collection, the Kubernetes documentation only seems to mention owner references and cron job based deletion.
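A hedged sketch of the cron-style option: a small Go program, which could be packaged as a Kubernetes CronJob, that lists PipelineRuns older than a cutoff and deletes them. The GroupVersionResource and the age-only criterion are simplifying assumptions.

// Sketch: prune PipelineRuns created more than maxAge ago; completion status
// is deliberately ignored here to keep the example short.
package example

import (
    "context"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
)

var prGVR = schema.GroupVersionResource{Group: "tekton.dev", Version: "v1alpha1", Resource: "pipelineruns"}

func pruneOldRuns(ctx context.Context, client dynamic.Interface, namespace string, maxAge time.Duration) error {
    list, err := client.Resource(prGVR).Namespace(namespace).List(ctx, metav1.ListOptions{})
    if err != nil {
        return err
    }
    cutoff := time.Now().Add(-maxAge)
    for _, pr := range list.Items {
        if pr.GetCreationTimestamp().Time.Before(cutoff) {
            if err := client.Resource(prGVR).Namespace(namespace).Delete(ctx, pr.GetName(), metav1.DeleteOptions{}); err != nil {
                return err
            }
        }
    }
    return nil
}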

@vincent-pli
Member Author

Yes, the proposal above is based on owner references.
As for the cron job, hmm, to be honest, it is an option, but not an attractive one...

@ncskier
Member

ncskier commented Sep 9, 2019

I agree that we don't want to use the EventListener as the OwnerReference, because the EventListener is supposed to persist through multiple triggerings of resource creation.

So, I think that one cleanup mechanism could be to give every triggered resource a label using the $(uid) or an "event id." Then when a user wants to delete all of these resources, they can delete all resources with the $(uid) or "event id" label.

Another weirder option which might be a little heavy would be to create a new Event CustomResource. This Event resource would be created with all of the triggered resources, and would be the owner of all the triggered resources. So, if a user wanted to clean up all of the resources, they could delete the Event resource.

@vincent-pli do you think either of these cleanup options are adequate?

@vincent-pli
Member Author

@ncskier Thanks for your comments.

Please take a look at my previous comments: #64 (comment)
It's similar to your second option.
I want to summarize all the solutions here for further discussion:

  1. Add a label containing the $(uid) or an event id to the resources, then let the user delete resources by label. It's simple to implement and use, but neither $(uid) nor the event id is a good choice for a label:
  • $(uid) is totally random, so a regex cannot be used for batch deletion.
  • The event id depends on the event generator and may repeat.
  2. Set the owner reference to the EventListener.
    The use case would be: the user wants to recreate the EventListener and clear everything related to it.
    Its disadvantage is that the EventListener is designed for long-term running and supports multiple triggerings of resource creation, so it cannot be used for runtime garbage collection.
  3. Create a new CRD, maybe named TriggerExecutor, and make it the owner of all resources created for the received event; it is then deleted at some point, either:
  • manually, or
  • by monitoring the status of the PipelineRuns and deleting the TriggerExecutor when they are done.

I think the three options do not conflict; actually I think we can implement options 1 and 2 and then document them.
For option 3, to be honest, I like this one; as @ncskier said, it is a little heavy, but it could really solve the housekeeping problem.
BTW, I think we can make it an option, to let the user make the decision.

@vtereso

vtereso commented Sep 10, 2019

@vincent-pli Good point. One thing that may help, however, is that all labels cascade from the EventListener, in addition to the extra label set on the reconciled resources: labels["app"] = el.Name.
This way, any of these labels could be used for deletion.

@ncskier
Member

ncskier commented Sep 10, 2019

Good point @vtereso, you could just use the label app=eventlistenername to delete all of the resources associated with that EventListener 👍
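A sketch of that label-based cleanup using client-go's dynamic DeleteCollection; the resource list and the exact label key/value are assumptions based on the discussion above.

// Sketch: delete every resource of the listed kinds carrying the
// app=<eventlistener name> label in a single call per kind.
package example

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
)

func deleteByListenerLabel(ctx context.Context, client dynamic.Interface, namespace, listenerName string) error {
    // Kinds to clean up; illustrative, extend as needed.
    gvrs := []schema.GroupVersionResource{
        {Group: "tekton.dev", Version: "v1alpha1", Resource: "pipelineruns"},
        {Group: "tekton.dev", Version: "v1alpha1", Resource: "pipelineresources"},
    }
    selector := fmt.Sprintf("app=%s", listenerName)
    for _, gvr := range gvrs {
        err := client.Resource(gvr).Namespace(namespace).DeleteCollection(
            ctx, metav1.DeleteOptions{}, metav1.ListOptions{LabelSelector: selector})
        if err != nil {
            return err
        }
    }
    return nil
}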

@vincent-pli
Member Author

@vtereso I think it's a good idea; I will update #96.
For option 3, I think we need further discussion.

@imjasonh
Member

imjasonh commented Sep 17, 2019

I think we want to consider some solution that persists the state/history of runs off-cluster, at which point it's safe to delete it from the cluster (or mark it for eventual later deletion).

The system for persistent off-cluster storage/indexing isn't designed yet (tektoncd/pipeline#453), but I believe it would solve this and many other problems in a CI/CD system, like tracking the success of a continuous build over time, searching for provenance of an image or deployment, etc.

This doesn't seem like a triggering-specific problem; PipelineRuns shouldn't be able to continuously accrete on a cluster indefinitely, regardless of whether they were triggered or manually run. I'd like to move this to the Pipelines repo since this is an issue with PipelineRuns/TaskRuns.

(However, that being said, if triggers are going to be capable of creating arbitrary K8s resources on-demand -- e.g., Jobs, ConfigMaps -- then triggering will have to come up with some way of reaping those when it can determine they're unneeded. I don't think we can solve that the same way we can for PipelineRuns, since we know a lot more about the semantics of PipelineRuns.)

@tekton-robot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 12, 2020
@tekton-robot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 11, 2020
@dibyom dibyom closed this as completed Dec 1, 2020