VPA: configurable resource #2387
With the current VPA, only CPU and memory are scaled automatically; the proposal is to add per-VPA configuration options for choosing which resources (and which values) are scaled.
Agree!
Making the limit scaling configurable would also solve our issues in #2359. So 👍 on that.
WRT scaling just one of CPU or memory, there is currently a way to do so: if you want to turn off scaling for a given resource, you can specify MinAllowed = MaxAllowed = desired request in the scaling policy. I know it's not ideal in terms of API, but it is possible to configure. WRT keeping the limit unchanged, I'm still not sure I understand the use case here. What is the benefit? If we want to treat the limit as an upper bound for the request, the VPA config provides MaxAllowed for capping the desired resource request.
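For reference, a minimal sketch of that workaround, assuming a v1beta2 VPA object (names such as my-app are placeholders): pinning minAllowed and maxAllowed for CPU to the desired request effectively disables CPU scaling while memory stays under VPA control.

```yaml
# Sketch of the MinAllowed = MaxAllowed workaround; "my-app" is a placeholder.
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: my-app
        minAllowed:
          cpu: 500m   # pin CPU: min == max == current desired request
        maxAllowed:
          cpu: 500m
        # memory is left unconstrained, so VPA keeps scaling it
```

The drawback, as noted below, is that this pinned value has to be kept in sync by hand with the request in the workload spec.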
"specify MinAllowed=MaxAllowed=desired request in the scaling policy" may work, but it also causes a lot of trouble. For example, every time I update the "CPU Request" in a deployment, I need to update the VPA synchronously and set MinAllowed=MaxAllowed=desired. In addition, not all services are configured with VPA, so you need to maintain this mapping. In fact, if the API design is reasonable, you can avoid these problems and make it easier to use. Regarding Limit, it is relatively dangerous to automatically change it. For example, the memory limit setting is incorrect, it is easy to cause OOM, which causes the "service" SLA to deteriorate. Some "services" want to be able to set Limit appropriately to provide redundancy for bursty traffic. The current VPA prediction algorithm takes a percentage value, which may be reasonable for optimizing the Request, but it is too simple to calculate Limit, which can easily affect the quality of the service. So it is better to provide this configuration, allowing users to specify Limit themselves. In fact, the "Request" value is more important when optimizing the utilization of the cluster. If it is set properly, it can help the scheduler to work better and make full use of the resources of the cluster. For the "Limit" value, the service usage limit is set on the one hand, and the ability to use resources excessively is provided on the other hand. Before there is no better prediction algorithm for Limit, if you provide the ability to configure, you can use VPA for "services" that cannot use VPA because of the "Limit" problem. |
/cc @kgolab for visibility
@Avanpourm I am sorry if I am missing the point from your comment. I'm trying to wrap my head around it. |
@bskiba We also need to consider scenarios where a service sees a sudden traffic increase while its request is tuned for utilization that is as high as possible. If the limit is calculated proportionally, for example Limit = 200% of Request, then the limit will be small whenever the request is very small, which is not very reasonable. For example, growing memory usage from 1G to 2G is only a 2x increase, but growing from 100M to 2G is a 20x increase. The relationship between limit and request should not be just a fixed ratio, as these two cases show.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
It should be possible to configure, per VPA-associated resource, whether to automatically scale memory, CPU, or both, and whether to scale the Limit value proportionally.

In real business scenarios, not all services are suitable for automatically scaling memory and CPU at the same time, so it would help to provide the corresponding configuration options. In addition, proportional scaling of the Limit value is not suitable for every service; many services want to keep their "Limit" unchanged and only resize the "Request" value automatically.
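As an illustration only (not the author's original proposal), a per-container policy along these lines could express the request. The controlledResources and controlledValues fields shown here mirror what later VPA releases added to ContainerResourcePolicy; exact field names and availability should be checked against the VPA version in use.

```yaml
# Illustrative sketch of per-resource, requests-only scaling; field names are
# borrowed from later VPA releases and may not exist in the version discussed here.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa            # placeholder
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # placeholder
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources: ["memory"]   # scale memory only, leave CPU alone
        controlledValues: RequestsOnly    # update requests, keep user-set limits
```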