Ever growing collection of stack and ssh directories in /tmp #104

Closed
rodcloutier opened this issue Nov 13, 2020 · 8 comments · Fixed by #195
Labels
kind/enhancement Improvements or new features resolution/fixed This issue was fixed

Comments

@rodcloutier

Problem description

Each run of the reconciliation loop for a stack creates two directories in /tmp.
When the stack is in exponential back-off, each failed retry produces another pair of directories, quickly filling the disk if the program is large.

The /tmp directory does not appear to be cleaned up afterwards.
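For illustration only, here is a minimal Go sketch of the behaviour being asked for: the reconciler creates its per-run working directory and removes it on every exit path, including the error returns that trigger a back-off retry. The names (reconcileOnce, runReconciliation) are hypothetical and not the operator's actual code.

```go
package reconcile

import (
	"fmt"
	"os"
)

// reconcileOnce creates a per-run work directory and guarantees its removal.
// runReconciliation stands in for the actual stack update logic.
func reconcileOnce(runReconciliation func(workDir string) error) error {
	workDir, err := os.MkdirTemp("", "pulumi_auto")
	if err != nil {
		return fmt.Errorf("creating work dir: %w", err)
	}
	// The deferred removal runs on success and on error, so repeated
	// back-off retries do not accumulate directories in /tmp.
	defer os.RemoveAll(workDir)

	return runReconciliation(workDir)
}
```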

viveklak pushed a commit that referenced this issue Dec 3, 2020
* WIP fixes for #102

* Update CRD and avoid clobbering last successful commit

* Record actual git commit instead of using spec value

* Delete working directory after reconciliation - should help with #104

* Address PR comments

* Improve tests

* Adding more tests to cover lifecycle of stack with success and failure
@lukehoban
Contributor

@viveklak I believe this was fixed with #111?

@viveklak
Contributor

viveklak commented Dec 3, 2020

Yes it was - closing.

@viveklak viveklak closed this as completed Dec 3, 2020
@infin8x infin8x added the kind/enhancement Improvements or new features label Jul 10, 2021
@jsravn

jsravn commented Sep 13, 2021

This is still an issue for me as of v0.0.19, primarily the go-build and pulumi_auto directories. The disk fills up and then the pod is evicted; it seems they never get cleaned up. Should this issue be reopened? @viveklak

$ du -h --max-depth=1 .
56M	./go-build11131166
51M	./pulumi_auto424417654
125M	./pulumi_auto149427206
51M	./pulumi_auto817377308
51M	./pulumi_auto897560026
51M	./pulumi_auto118987428
56M	./go-build2872142450
125M	./pulumi_auto320708718
51M	./pulumi_auto064414734
56M	./go-build494682469
56M	./go-build4232292867
56M	./go-build1436358647
51M	./pulumi_auto150684010
51M	./pulumi_auto591096598
56M	./go-build668430343
56M	./go-build2473233511
125M	./pulumi_auto115912020
51M	./pulumi_auto840246808
51M	./pulumi_auto759567042
56M	./go-build827327584
51M	./pulumi_auto807673218
125M	./pulumi_auto765021992
125M	./pulumi_auto143358980
51M	./pulumi_auto229110728
51M	./pulumi_auto503688548
56M	./go-build2446001715
56M	./go-build3402958734
125M	./pulumi_auto030638300
56M	./go-build708999075
125M	./pulumi_auto071293304
1.2G	./.cache
51M	./pulumi_auto591984558
51M	./pulumi_auto193436996
51M	./pulumi_auto673547078
56M	./go-build1043104184
51M	./pulumi_auto510264894
4.0K	./ssh-3GNLN8m9GrBB
56M	./go-build2440826664
125M	./pulumi_auto937262976
51M	./pulumi_auto894580848
125M	./pulumi_auto269451444
51M	./pulumi_auto696370720
125M	./pulumi_auto045380786
51M	./pulumi_auto137819124
51M	./pulumi_auto907926380
125M	./pulumi_auto049063888
125M	./pulumi_auto365861080
51M	./pulumi_auto255523000
56M	./go-build4083929398
125M	./pulumi_auto269504970
125M	./pulumi_auto216728222
56M	./go-build4054638386
51M	./pulumi_auto851282536
51M	./pulumi_auto044917746
51M	./pulumi_auto985741906
51M	./pulumi_auto899407740
56M	./go-build1635953218
51M	./pulumi_auto250098880
56M	./go-build401387737
51M	./pulumi_auto697617878
51M	./pulumi_auto571644286
51M	./pulumi_auto417612048
56M	./go-build231586882
125M	./pulumi_auto058073242
56M	./go-build2087838320
51M	./pulumi_auto270401826
51M	./pulumi_auto734179806
56M	./go-build3211809675
51M	./pulumi_auto250955018
51M	./pulumi_auto461150630
56M	./go-build4238020611
51M	./pulumi_auto381682892
51M	./pulumi_auto695353514
125M	./pulumi_auto477282163
125M	./pulumi_auto440772140
125M	./pulumi_auto191234956
736M	./go-build2293400727
56M	./go-build2744468495
51M	./pulumi_auto069259322
7.1G	.

@viveklak
Contributor

@jsravn Looks like this can indeed happen if the workspace fails to be initialized (e.g. you have the wrong token or config), and those are exactly the situations that are retried aggressively. I have reopened the issue and will address it.

@viveklak
Contributor

There is still a problem with the work directories under repos: the current cleanup only removes the workdir and leaks the rest of the repo. The ideal fix is #78, but in the meantime we should make every effort to clean up any additional files we generate.
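As a sketch of that direction (illustrative names only, not the operator's code), keeping the git checkout and the Automation API workspace under a single per-run root means one RemoveAll cleans up everything:

```go
package reconcile

import (
	"os"
	"path/filepath"
)

// makeRunDirs puts the repo checkout and the workspace under one root so a
// single cleanup call removes both. Directory names are illustrative.
func makeRunDirs() (repoDir, workDir string, cleanup func(), err error) {
	root, err := os.MkdirTemp("", "pulumi-run")
	if err != nil {
		return "", "", nil, err
	}
	repoDir = filepath.Join(root, "repo")      // git clone target
	workDir = filepath.Join(root, "workspace") // automation API work dir
	for _, d := range []string{repoDir, workDir} {
		if err = os.MkdirAll(d, 0o700); err != nil {
			os.RemoveAll(root)
			return "", "", nil, err
		}
	}
	// Removing the single root also removes the repo, so nothing is leaked.
	cleanup = func() { os.RemoveAll(root) }
	return repoDir, workDir, cleanup, nil
}
```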

@liamawhite
Contributor

We end up having to use volumes anyway (I think due to our use of a monorepo). The volume fills up over time as pods are killed mid-reconcile. We can obviously mitigate this by increasing the graceful shutdown period, but it would be better if the operator looked for old auto_workdirs on startup and cleaned them up.
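A rough sketch of that start-up housekeeping, assuming the leftover directories all live under one root (e.g. /tmp or the mounted volume) and follow the pulumi_auto/go-build naming seen in the listing above; the prefixes and age threshold here are assumptions, not operator defaults.

```go
package housekeeping

import (
	"os"
	"path/filepath"
	"strings"
	"time"
)

// cleanStaleWorkDirs removes leftover work directories under root that are
// older than maxAge. Prefixes and maxAge are illustrative assumptions.
func cleanStaleWorkDirs(root string, maxAge time.Duration) error {
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	cutoff := time.Now().Add(-maxAge)
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		name := e.Name()
		if !strings.HasPrefix(name, "pulumi_auto") && !strings.HasPrefix(name, "go-build") {
			continue
		}
		info, err := e.Info()
		if err != nil || info.ModTime().After(cutoff) {
			continue // skip unreadable or recently used directories
		}
		_ = os.RemoveAll(filepath.Join(root, name))
	}
	return nil
}
```

Running this once when the pod starts (before the first reconciliation) would reclaim space left behind by pods killed mid-reconcile, without touching directories that a concurrent run is still using, provided maxAge is chosen conservatively.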

@dirien
Contributor

dirien commented Dec 20, 2022

I see several options for performing this housekeeping:

  • Move the reconciliation loop to a model where each run executes as a k8s Job (#78). This would be a good but invasive approach, as it requires a rewrite of the operator.
  • Run a housekeeping pass when the pod starts and delete stale files (the selection of delete candidates has to be defined in detail).
  • Run a sidecar container that deletes files on the volume. Here, the selection of delete candidates has to be clear, so that a (long-)running deployment is not accidentally deleted.

@EronWright
Contributor

Good news, everyone: we just released a preview of Pulumi Kubernetes Operator v2. This new release has a whole new architecture that uses pods as the execution environment, and I do think the garbage problem is now under control.

Please read the announcement blog post for more information:
https://www.pulumi.com/blog/pulumi-kubernetes-operator-2-0/

Would love to hear your feedback! Feel free to engage with us on the #kubernetes channel of the Pulumi Slack workspace.
cc @rodcloutier
