"Pod is blocking scale down because it’s not backed by a controller" but the pod is a jenkins agent running a job #5434

Closed
FranAguiar opened this issue Jan 20, 2023 · 12 comments · Fixed by #6077
Labels
area/cluster-autoscaler kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@FranAguiar

Which component are you using?:

Cluster autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

This is my use case: Jenkins agents created to run jobs cause the autoscaler to raise a warning that a pod is blocking scale down because it is not backed by a controller. Those agents eventually finish their jobs and the autoscaler is then able to shut down the node. It's not a big deal, but it's annoying to see that warning from time to time and have to check whether it's real or not.

Describe the solution you'd like.:
An annotation like "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" to tell the autoscaler to wait for, or ignore, a pod without a controller and not shut down the node.
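
For illustration, a sketch of a pod using today's annotation (with "false", which currently blocks eviction) - the request is for an annotation like this to also acknowledge the missing controller instead of warning about it. The pod spec and image below are just placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-agent-example
  annotations:
    # Existing Cluster Autoscaler annotation; the feature request is for
    # something like this to also cover the "not backed by a controller"
    # case instead of warning about it on every scale-down attempt.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: jnlp
      image: jenkins/inbound-agent  # placeholder agent image
```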

@FranAguiar FranAguiar added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 20, 2023
@x13n
Member

x13n commented Jan 30, 2023

I'm not sure I follow. Do you think the issue is that the jenkins pod should not block node removal or is it just that the logs are annoying?

@FranAguiar
Author

Hi, correct, I mean that the log is annoying. I know the pod will finish its task eventually; I would like an option to tell the autoscaler, "I know about this pod without a controller; do not warn me about it and do not shut down the node".

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 30, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 30, 2023
@darrenbeck-te

@FranAguiar Did you find a solution to this issue?

@FranAguiar
Author

> @FranAguiar Did you find a solution to this issue?

No, we are just ignoring the warning :(

@gp187

gp187 commented Aug 1, 2023

Is there a way to find the pod that is preventing scale down?
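
One way to check, assuming a default Cluster Autoscaler install (resource and deployment names vary by setup):

```sh
# Cluster Autoscaler publishes a status ConfigMap with scale-down details:
kubectl -n kube-system describe configmap cluster-autoscaler-status

# It also emits events on nodes it cannot remove; check the Events
# section for the name of the blocking pod:
kubectl describe node <node-name>

# Or search the autoscaler's own logs for the warning from this issue:
kubectl -n kube-system logs deploy/cluster-autoscaler | grep "blocking scale down"
```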

@lauraseidler
Contributor

lauraseidler commented Aug 21, 2023

We have the same issue - I was expecting this warning to go away by annotating the pod with cluster-autoscaler.kubernetes.io/safe-to-evict: "false", but this did not help.

I think the reason is that the autoscaler checks for a controller first and only checks for the annotation at the end.

IMO, it would make sense to flip this and check for the annotation first - that way, pods can be marked as not safe to evict no matter what, and will always generate the same warning, which also makes it possible to essentially "ignore" all other warnings (in this case, we know those pods don't have a controller, and we don't want them considered for eviction at all).
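
To make the proposed reordering concrete, a simplified Go sketch - not the actual autoscaler code; the pod type and reason strings are stand-ins for the real drainability checks:

```go
package main

import "fmt"

const safeToEvictKey = "cluster-autoscaler.kubernetes.io/safe-to-evict"

// pod is a minimal stand-in; the real autoscaler inspects corev1.Pod
// and its OwnerReferences.
type pod struct {
	name          string
	annotations   map[string]string
	hasController bool
}

// blockReason mirrors the current ordering: the controller check runs
// first, so an explicitly annotated, controller-less pod is still
// reported as "not backed by a controller".
func blockReason(p pod) string {
	if !p.hasController {
		return "not backed by a controller"
	}
	if p.annotations[safeToEvictKey] == "false" {
		return "marked as not safe to evict"
	}
	return ""
}

// blockReasonFlipped is the proposed ordering: the explicit annotation
// wins, so deliberately annotated pods always produce the same,
// easily filtered warning.
func blockReasonFlipped(p pod) string {
	if p.annotations[safeToEvictKey] == "false" {
		return "marked as not safe to evict"
	}
	if !p.hasController {
		return "not backed by a controller"
	}
	return ""
}

func main() {
	agent := pod{
		name:        "jenkins-agent-abc",
		annotations: map[string]string{safeToEvictKey: "false"},
	}
	fmt.Println("current: ", blockReason(agent))        // not backed by a controller
	fmt.Println("proposed:", blockReasonFlipped(agent)) // marked as not safe to evict
}
```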

This would also be helpful for users of managed k8s services - e.g. we are using GKE, which flags "pod not backed by controller" as a warning, but not "pod marked as not safe to evict".

@x13n as you have commented previously - does that seem reasonable? Would a PR for that be considered?

@x13n
Member

x13n commented Aug 23, 2023

Changing the order of conditions would only replace one warning with another in that case: scale down is still blocked, it's just that now it's because the pod has the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation.

@lauraseidler
Contributor

Yes - which could then be more easily ignored. E.g. we ignore all warnings generated by this annotation, since we add it on purpose - we know the pod is not safe to evict, so we can safely ignore the warning - which would solve the "the logs are annoying" issue.

> I would like an option to tell the autoscaler, "I know about this pod without a controller; do not warn me about it and do not shut down the node"

Essentially this - yes, it would still generate a warning, but one that can be more easily ignored. It's hard to ignore all "pod doesn't have a controller" warnings, but easy to ignore all "this pod has been explicitly marked as not safe to evict" warnings.

@x13n
Member

x13n commented Aug 28, 2023

Ah, I see, thanks for clarifying. That makes sense to me - feel free to assign me a PR if you'd like to change the ordering of these conditions.

@lauraseidler
Contributor

@x13n thanks! See the linked PR :)
