Improve UX of Step Resource Requests #2986
Comments
+1, the CLI/UI could also make this clearer. Per-step limits are still valid and enforced, so maybe we should migrate to a world where requests are specified per-Task and each step can only set a limit?
Yes, I think the CLI could definitely add some kind of option for this.
I think it's reasonable to still support limits for steps. Another benefit of this approach would be potentially eliminating the need to search through all containers of a TaskRun to find the max, since the values are known in advance.
Something that will have to be worked out with this, though, is how to deal with step limits without repeating what occurred in #1937.
In rethinking this with regard to limits, why not have one limit for the entire Task? This way, the Task can be set up with a limit to be enforced via a TaskRun, and resource requests can then be specified at run time via a request property. Since a step has access to all resources of a pod, is there any reason for restricting individual steps?
Regarding step limits being configurable, would this also apply to the Tekton containers like the
I don't think so. Tekton is responsible for those steps, and should be responsible for configuring their limits. Perhaps that step should be an
Makes sense. Do you think it would be possible for Tekton to define reasonable
@imjasonh I +1 on Tekton being responsible for those steps (init containers as well as containers added by Tekton) and therefore providing the resources (imo not just limits but also requests). But, I suggest that Tekton provides reasonable values. What we (I am in the same team as @qu1queee) see (we are still on Tekton 0.11, about to get to 0.14; so if things are changed in the meantime, pls correct me) is the following: we have a LimitRange in place in the namespace where the TaskRun (with embedded TaskSpec) gets created. The LimitRange contains default, defaultRequest, min and max for CPU and Memory. What we see for initContainers: Tekton takes default and defaultRequest. This is imo not correct (because in our case it requests 1 Gi of memory which is way too much). If Tekton is responsible for those init containers, then it should specify reasonable values. Tekton knows what these containers are doing. As such, it should specify a request and limit that is suitable for that (while still respecting min and max of the LimitRange to create valid containers). The same applies to Tekton's containers (
@danielhelfand that would be new to me. A step is a container and has only access to the memory and CPU defined for this container.
@danielhelfand No, this refers to my previous comment. My understanding is that CPU and memory are assigned to a container, so they can't be assigned once per TaskRun. I would really like to see a mechanism where the user creating the TaskSpec has control over the step resource requests AND limits. I agree with the concern outlined elsewhere that, because all containers are started at the same time, this can easily eat up a node's resources when there are many steps (or even make the pod unschedulable). But I also think that the current algorithm can cause an issue. Assume there are two steps, and leave all Tekton-added containers out of the picture for simplification. The first one has request = limit = 1 Gi, the second one has request = limit = 2 Gi. When the LimitRange min is 128 Mi, Tekton will change the request for the first step to 128 Mi. Kubernetes can now schedule this pod to a node that currently has only 2176 Mi (2 Gi + 128 Mi) of memory available, because scheduling happens based on the request (and not the limit). The first step now starts to run. If that's an operation that really needs up to 1 Gi of memory, then this container will be OOMKilled, as the node can't give it the necessary memory even though the container limit is not reached. I am not asking for the current mechanism to be changed in general, but there should at least be an option somewhere (a global env property, a flag on the TaskSpec, something like that) that forces the step's request and limit to be set on the Pod's containers without modification.
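The arithmetic in the scenario above can be sketched as follows. This is only a simplified model of the behavior described in that comment (keep the largest request, lower the others to the LimitRange min), not Tekton's actual implementation. The function name `adjust_requests` and the convention 1 Gi = 1024 Mi are assumptions made for illustration.

```python
# Simplified model of the scenario in the comment above; NOT Tekton's code.
# All quantities are memory in Mi, assuming 1 Gi = 1024 Mi.
MI_PER_GI = 1024


def adjust_requests(step_requests_mi, limitrange_min_mi):
    """Keep the largest request; lower every other step's request to the LimitRange min.

    Hypothetical helper for illustration. Expects a non-empty list.
    """
    largest = max(step_requests_mi)
    keep_index = step_requests_mi.index(largest)
    return [
        req if i == keep_index else limitrange_min_mi
        for i, req in enumerate(step_requests_mi)
    ]


# Two steps: request = limit = 1 Gi and request = limit = 2 Gi.
original = [1 * MI_PER_GI, 2 * MI_PER_GI]
adjusted = adjust_requests(original, limitrange_min_mi=128)

# The scheduler only considers the sum of the (adjusted) requests.
schedulable_on = sum(adjusted)

# Memory the first step may need (its limit) that was never reserved for it.
unreserved_for_first = original[0] - adjusted[0]

print(adjusted)              # [128, 2048]
print(schedulable_on)        # 2176
print(unreserved_for_first)  # 896
```

Under these assumptions the pod fits on a node with 2176 Mi free, while the first step may still try to use up to 896 Mi more than was reserved for it.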
@SaschaSchwarze0 Not sure if you have read how Tekton handles step/container resource requests, but there are some differences from the traditional approach by Kubernetes. Please see the documentation below:
Yes, I know it.
I agree that the request is reasonable as far as being able to override step resource requests, if it is the case that each step container doesn't have access to the max resources defined for all containers. cc @imjasonh Any thoughts on this?
Folks, any update on this?
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
/remove-lifecycle stale
+1, we need this feature too.
+1, I think it doesn't need to be an But in a shared or complex Kubernetes environment, it is better to let admins customize the settings. Although some of the containers are created by Tekton, they also depend on end-user input and may run into problems if only the default settings are used.
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity. /lifecycle rotten Send feedback to tektoncd/plumbing.
/lifecycle frozen
Quick update: @vdemeester is making some improvements to how Tekton handles LimitRanges (#4176), though I think this particular issue is also requesting support for specifying limits at the Task level.
This commit expands the scope of TEP-0094 to cover the user experience of specifying resource requests and limits in Tasks. Focusing only on Step and Sidecar resource requirements may be too narrow of a scope for this TEP. This is largely motivated by tektoncd/pipeline#2986, because the solution to this problem may involve removing the ability to specify Step resource requests. It doesn't make sense to override Step resource requests in TaskRuns if users shouldn't be able to specify Step resource requests in the first place. The scope is also expanded to include parameterizing resource requests based on discussion in tektoncd#560, around treating resource requirement parameterization and runtime overrides as "both/and", rather than "either/or". Fixing Task's resource requirement UX may allow us to get parameterization for free.
As of v0.28.0, Step resource requests and limits are left unchanged, or adjusted based on the LimitRanges present (see docs for more info). Thanks @vdemeester! I'm going to close this issue because the Step-level behavior described is no longer true. I've opened #4470 as a FR for Task-level resource requirements. /close
@lbernick: You can't close an active issue/PR unless you authored it or you are a collaborator. In response to this:
/close
@vdemeester: Closing this issue. In response to this:
Feature request
Currently, users can define resource requests on each step of a Task:
This is misleading in my opinion because it makes it seem like the user has control of setting the requests for each step. However, only the max requests from all steps are ever used as documented here.
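A minimal sketch of what "only the max requests from all steps are ever used" means, assuming each step declares requests as a plain mapping. The step names, the integer units (Mi for memory, millicores for CPU), and the helper `max_step_requests` are hypothetical and for illustration only; this is not Tekton's code.

```python
# Simplified illustration of the behavior described above; NOT Tekton's code.
# Units are assumed: memory in Mi, cpu in millicores.

def max_step_requests(steps):
    """Return, per resource name, the largest request found across all steps."""
    result = {}
    for step in steps:
        for name, amount in step.get("requests", {}).items():
            result[name] = max(result.get(name, 0), amount)
    return result


# Hypothetical steps: each one declares its own requests.
steps = [
    {"name": "build", "requests": {"memory": 1024, "cpu": 500}},
    {"name": "test", "requests": {"memory": 2048, "cpu": 250}},
]

effective = max_step_requests(steps)
print(effective)  # {'memory': 2048, 'cpu': 500}
```

Whatever each step declares, only one value per resource survives, which is why declaring requests per step can be misleading.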
So a user should only have to, at most, define resource requests (the max values needed for a container) once for steps. Maybe it makes sense to allow users to define CPU, memory, and ephemeral storage once as part of a TaskRun and allow this to be configurable for PipelineRuns via taskRunSpecs?
Use case
Discussed in #2984: #2984 (comment)
Discussed in #2931: #2931 (comment)
The idea has come up a couple times in issues where the experience of requests is confusing for users, and it might help to provide this option and clearly document with better examples.