Set limits and requests for exclusive pool containers #25

Closed
Levovar opened this issue Jul 26, 2019 · 6 comments · Fixed by #55

Levovar (Collaborator) commented Jul 26, 2019

Similarly to shared pool Pods, we need to take care of setting requests and limits for exclusive pool containers.
Request: explicitly setting 0 is required, similarly to how it was done for the shared pool in #14.

Limit: we should either set an explicit 0 to avoid artificially limiting the CPU time of a core that is physically isolated anyway, or we need to set it to number_of_exclusive_cores * 1000 (millicores).
Reasoning for 0: even if the limit is higher than what the container can actually get, it still adds a CFS quota to the container, and there are anecdotal reports indicating that the mere presence of a CFS quota can negatively affect performance.
Reasoning for explicitly setting the limit: some admission controllers in K8s mandate that a Pod defines requests and limits. These mandated values are of course pointless for an exclusive user, but failing to comply results in failed Pod admission.
So, if the presence of a CFS quota does not affect performance, setting the limit explicitly is actually the safer option!

@TimoLindqvist : WDYT Timo?
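
For illustration, a minimal sketch of the two options expressed with the Kubernetes Go API types; the nokia.k8s.io/exclusive_caas resource name and the two-core figures are placeholders, not the actual CPU-Pooler naming:

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// exclusiveResources sketches the container resources for a Pod asking for two
// exclusive cores. The "nokia.k8s.io/exclusive_caas" name is only a placeholder
// for the exclusive pool's device-plugin resource.
func exclusiveResources(setExplicitLimit bool) corev1.ResourceRequirements {
	res := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			// Explicit 0 CPU request, as already done for the shared pool in #14.
			corev1.ResourceCPU: *resource.NewMilliQuantity(0, resource.DecimalSI),
			corev1.ResourceName("nokia.k8s.io/exclusive_caas"): resource.MustParse("2"),
		},
		Limits: corev1.ResourceList{
			// Extended resources must have requests == limits.
			corev1.ResourceName("nokia.k8s.io/exclusive_caas"): resource.MustParse("2"),
		},
	}
	if setExplicitLimit {
		// Option 2: limit = number_of_exclusive_cores * 1000 millicores (adds a CFS quota).
		res.Limits[corev1.ResourceCPU] = *resource.NewMilliQuantity(2000, resource.DecimalSI)
	} else {
		// Option 1: explicit 0 CPU limit, intended to avoid any artificial cap
		// on cores that are physically isolated anyway.
		res.Limits[corev1.ResourceCPU] = *resource.NewMilliQuantity(0, resource.DecimalSI)
	}
	return res
}
```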

TimoLindqvist (Collaborator) commented
I'm under the impression that the CFS quota can affect performance/latency, so we should configure things so that the CFS quota is disabled. Is it disabled if we set the CPU limit to zero?

Levovar (Collaborator, Author) commented Aug 27, 2019

The K8s community actually proposed kernel patches to correct this issue :)
So I think we shouldn't disable CFS quotas; they should work much better from 4.14, and as intended in the latest versions.

Levovar (Collaborator, Author) commented Aug 27, 2019

Sorry, the first improvement is available from 4.18:
torvalds/linux@512ac99#diff-1c5364196d98130348bddabaad0a701f

And this one should totally fix them:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=de53fd7aedb100f03e5d2231cfce0e4993282425

BTW, based on the description, the CFS quota was only misbehaving for workloads which frequently idle and only consume some slices.
Our exclusive workloads are quite the opposite: they always use 100% of the core.

TimoLindqvist (Collaborator) commented
I'm still a bit against setting the limit to something other than 0 and thus enabling the CFS quota. If a container allocates exclusive core(s), it must have full access to those core(s). On the other hand, it cannot use other cores, so why is the quota needed?

Levovar (Collaborator, Author) commented Sep 10, 2019

On the other hand, it cannot use other cores, so why is the quota needed?

Purely for the Kubernetes compatibility reasons I described in the opening post. If the user has the LimitRanger admission plugin enabled in their cluster, instantiating Pods in their CPU-Pooler enhanced cluster will fail:
https://kubernetes.io/docs/concepts/policy/limit-range/#overview-of-limit-range

I think the same might be an issue when the operator sets resource constraints on the Namespace object.

TimoLindqvist (Collaborator) commented
We can add the limits and requests to exclusive containers to avoid failures in Pod instantiation, but would it be OK to set the limit to zero?

Levovar added a commit that referenced this issue Jan 6, 2021
This commit solves Issue #25.
When a container is using shared pool resources, the CFS quota is set to its limit value
With exclusive users it is set to the total amount of all exclusive cores * 1000
When both are requested the overall quota is set to exclusive*1000 + 1.2*shared
In this hybrid scenario we leave a 20% safety margin on top of the originally requested shared resources,
  to avoid accidentally throttling the higher priority exclusive thread when the lower priority shared threads are overloaded.
Levovar added a commit that referenced this issue Jan 7, 2021
This commit solves Issue #25.
When a container is using shared pool resources, the CFS quota is set to its limit value
With exclusive users it is set to the total amount of all exclusive cores * 1000
When both are requested the overall quota is set to exclusive*1000 + 1.2*shared
In this hybrid scenario we leave a 20% safety margin on top of the originally requested shared resources,
  to avoid accidentally throttling the higher priority exclusive thread when the lower priority shared threads are overloaded.
Levovar added a commit that referenced this issue Jan 12, 2021
This commit solves Issue #25.
When a container is using shared pool resources, the CFS quota is set to its limit value
With exclusive users it is set to the total amount of all exclusive cores * 1000 + 100
  (constant 100 is added to avoid activating throttling mechanisms near 100% utilization)
When both are requested the overall quota is set to exclusive*1000 + 1.2*shared
In this hybrid scenario we leave a 20% safety margin on top of the originally requested shared resources,
  to avoid accidentally throttling the higher priority exclusive thread when the lower priority shared threads are overloaded.
balintTobik pushed a commit to balintTobik/CPU-Pooler that referenced this issue Jul 5, 2021
This commit solves Issue nokia#25.
When a container is using shared pool resources, the CFS quota is set to its limit value
With exclusive users it is set to the total amount of all exclusive cores * 1000 + 100
  (constant 100 is added to avoid activating throttling mechanisms near 100% utilization)
When both are requested the overall quota is set to exclusive*1000 + 1.2*shared
In this hybrid scenario we leave a 20% safety margin on top of the originally requested shared resources,
  to avoid accidentally throttling the higher priority exclusive thread when the lower priority shared threads are overloaded.
nxsre pushed a commit to nxsre/CPU-Pooler that referenced this issue Apr 14, 2024
This commit solves Issue nokia#25.
When a container is using shared pool resources, the CFS quota is set to its limit value
With exclusive users it is set to the total amount of all exclusive cores * 1000 + 100
  (constant 100 is added to avoid activating throttling mechanisms near 100% utilization)
When both are requested the overall quota is set to exclusive*1000 + 1.2*shared
In this hybrid scenario we leave a 20% safety margin on top of the originally requested shared resources,
  to avoid accidentally throttling the higher priority exclusive thread when the lower priority shared threads are overloaded.
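
For reference, a minimal self-contained sketch of the quota arithmetic described in the commit message above, in millicores; the helper name is made up for illustration and is not the actual CPU-Pooler code:

```go
package main

import "fmt"

// cpuQuotaMillis mirrors the rule from the commit above: shared-only containers
// keep their shared limit as the quota, exclusive-only containers get
// cores*1000 + 100 (the +100 keeps the quota just above 100% utilization so
// throttling does not trigger at full load), and hybrid containers get the
// exclusive part plus a 20% safety margin on the shared part.
func cpuQuotaMillis(exclusiveCores, sharedMillis int64) int64 {
	switch {
	case exclusiveCores > 0 && sharedMillis > 0:
		return exclusiveCores*1000 + sharedMillis*12/10
	case exclusiveCores > 0:
		return exclusiveCores*1000 + 100
	default:
		return sharedMillis
	}
}

func main() {
	fmt.Println(cpuQuotaMillis(0, 500)) // shared only:    500
	fmt.Println(cpuQuotaMillis(2, 0))   // exclusive only: 2100
	fmt.Println(cpuQuotaMillis(2, 500)) // hybrid:         2600
}
```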