Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resource limits for pods #14876

Merged
merged 1 commit into from
Jul 21, 2022
Merged

Conversation

cdoern
Copy link
Contributor

@cdoern cdoern commented Jul 8, 2022

added the following flags and handling for podman pod create

--memory-swap
--cpuset-mems
--device-read-bps
--device-write-bps
--blkio-weight
--blkio-weight-device
--cpu-shares

given the new backend for systemd in c/common, all of these can now be exposed to pod create.
most of the heavy lifting (nearly all) is done within c/common. However, some rewiring needed to be done here
as well!

Signed-off-by: Charlie Doern [email protected]

Does this PR introduce a user-facing change?

expose resource limts flags: 

--memory-swap
--cpuset-mems
--device-read-bps
--device-write-bps
--blkio-weight
--blkio-weight-device 
--cpu-shares

to podman pod create

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note labels Jul 8, 2022
@cdoern
Copy link
Contributor Author

cdoern commented Jul 8, 2022

@giuseppe I am getting failures for blkio-weight and blkio-weight-device due to crun not being able to write io.weight. Looking around, it seems that crun might only be able to write to io.bfq.weight.

This might be something we want to look into... but for next week!

@giuseppe
Copy link
Member

giuseppe commented Jul 8, 2022

@giuseppe I am getting failures for blkio-weight and blkio-weight-device due to crun not being able to write io.weight. Looking around, it seems that crun might only be able to write to io.bfq.weight.

we might need to add a fallback to io.weight when io.bfq.weight doesn't work. I see in on Fedora though, what system are you using?

@giuseppe
Copy link
Member

giuseppe commented Jul 9, 2022

we might need to add a fallback to io.weight when io.bfq.weight doesn't work. I see in on Fedora though, what system are you using?

opened a PR: containers/crun#964

@cdoern
Copy link
Contributor Author

cdoern commented Jul 11, 2022

@giuseppe how does getting a new crun version work since it isn't a go vendor? do we have to wait until a new release is cut or is it just built from the main branch.

@rhatdan
Copy link
Member

rhatdan commented Jul 11, 2022

@cdoern @giuseppe we need to get a release and then update the images we are testing with to use the new crun.

test/system/200-pod.bats Outdated Show resolved Hide resolved
test/system/200-pod.bats Outdated Show resolved Hide resolved
test/system/200-pod.bats Outdated Show resolved Hide resolved
@cdoern cdoern force-pushed the cgroup branch 2 times, most recently from 0a3e9cd to 50b8d5f Compare July 14, 2022 14:47
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2022
@cdoern cdoern marked this pull request as ready for review July 14, 2022 17:23
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 14, 2022
@cdoern cdoern force-pushed the cgroup branch 3 times, most recently from 912ff40 to d81b5d1 Compare July 14, 2022 20:04
@giuseppe
Copy link
Member

one issue I've noticed is that you've hardcoded /dev/block for the device name. But I think you should also consider /dev/char when the device is a character device.

I am not sure there is any advantage in resolving them, can't we just pass whatever name we go as input?

@TomSweeneyRedHat
Copy link
Member

@cdoern all kinds or red tests here

@cdoern
Copy link
Contributor Author

cdoern commented Jul 14, 2022

@giuseppe given the feedback in systemd/systemd#23938 I think we need to hardcode either block or char. Is there a way to tell which is which?

@rhatdan
Copy link
Member

rhatdan commented Jul 15, 2022

Drive by, can't you just stat the source and get whether it is a block or char device?

@cdoern
Copy link
Contributor Author

cdoern commented Jul 15, 2022

@rhatdan the issue is, runc's blkio device resources (which we use within c/common) only have a field for the major, minor, and limit. Unless there is a way to stat just the major:minor, we lose the original path once the resource limits are created.

cdoern added a commit to cdoern/common that referenced this pull request Jul 15, 2022
I only included the v1 name for a cgroup wide IO weight in my previous PR, this fixes the --blkio-weight
flag in containers/podman#14876

Signed-off-by: Charlie Doern <[email protected]>
cdoern added a commit to cdoern/common that referenced this pull request Jul 15, 2022
I only included the v1 name for a cgroup wide IO weight in my previous PR, this fixes the --blkio-weight
flag in containers/podman#14876

Signed-off-by: Charlie Doern <[email protected]>
@rhatdan
Copy link
Member

rhatdan commented Jul 15, 2022

Major and minor are just ints, there is no way to differenctiate block or char.

@cdoern
Copy link
Contributor Author

cdoern commented Jul 20, 2022

@edsantiago let me know if I addressed all of your changes!

@cdoern
Copy link
Contributor Author

cdoern commented Jul 20, 2022

@mheon @rhatdan @giuseppe no rush since this won't get into 4.2, but feel free to merge once we get @edsantiago 's blessing on the tests

@edsantiago
Copy link
Member

No, tests are kind of broken still. Try this.
foo.diff.txt

@cdoern
Copy link
Contributor Author

cdoern commented Jul 20, 2022

No, tests are kind of broken still. Try this. foo.diff.txt

that works.

@edsantiago
Copy link
Member

K. Is it clear what I fixed? That is: is anything unclear? I'm sorry about providing a patch without explanation, it's just, argh, today!!

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 20, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cdoern, edsantiago, giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [edsantiago,giuseppe]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cdoern
Copy link
Contributor Author

cdoern commented Jul 20, 2022

K. Is it clear what I fixed? That is: is anything unclear? I'm sorry about providing a patch without explanation, it's just, argh, today!!

No worries! I get everything you changed, I was trying to use sed for the spacing but yours works better.

@edsantiago
Copy link
Member

LGTM but I have a feeling this is going to blow up when combined with #14972 (runc). If yours merges first, I will need your help getting tests to pass (I will probably just add yet more Skips). If mine merges first, you will need to rebase this and then figure out the required Skips or fixes.

@cdoern
Copy link
Contributor Author

cdoern commented Jul 21, 2022

@mheon just an FYI for when you review, I just changed pod_api.go (pod inspect) to include the new limits since I forgot to do that. Also, @edsantiago skipped the resources limits test in v1 since the limits are not allowed in most cases.

@mheon
Copy link
Member

mheon commented Jul 21, 2022

Please add something to the podman pod create manpage about how resource limits for pods work. I think an explanation of the specifics (containers in the pod can have separate limits, the limit applies to all containers in the pod) and any exceptions (doesn't work if the container specifies its own cgroup) is worth it.

Code LGTM

@cdoern
Copy link
Contributor Author

cdoern commented Jul 21, 2022

@mheon does this work?

Note: resource limit related flags work by setting the limits explicitly in the pod's cgroup which by default, is the cgroup parent for all containers joining the pod. Containers are still delegated the ability to set their own resource limits when joining a pod via the **cgroup.subtree_control** file in the pod cgroup which impacts the **cgroup.controller** file in the child container cgroup. containers do NOT get the pod level cgroup resources if they specify their own cgroup when joining a pod such as **--cgroupns=host**

@mheon
Copy link
Member

mheon commented Jul 21, 2022

ontainers are still delegated the ability to set their own resource limits when joining a pod via the **cgroup.subtree_control** file in the pod cgroup which impacts the **cgroup.controller** file in the child container cgroup

This is very unclear. As a Podman user, what I care about is: Can I use --cpu on a pod to set an overall CPU limit, then use --cpu on a container in the pod to set a smaller limit for one container in the pod?

@cdoern
Copy link
Contributor Author

cdoern commented Jul 21, 2022

ok, @mheon

Note: resource limit related flags work by setting the limits explicitly in the pod's cgroup which by default, is the cgroup parent for all containers joining the pod. Containers are still delegated the ability to set their own resource limits when joining a pod meaning that if you execute **podman pod create --cpus=5** you can also execute **podman container create --pod=<POD_ID> --cpus=4** and the container will only see the smaller limit. containers do NOT get the pod level cgroup resources if they specify their own cgroup when joining a pod such as **--cgroupns=host**

@mheon
Copy link
Member

mheon commented Jul 21, 2022

That WFM

added the following flags and handling for podman pod create

--memory-swap
--cpuset-mems
--device-read-bps
--device-write-bps
--blkio-weight
--blkio-weight-device
--cpu-shares

given the new backend for systemd in c/common, all of these can now be exposed to pod create.
most of the heavy lifting (nearly all) is done within c/common. However, some rewiring needed to be done here
as well!

Signed-off-by: Charlie Doern <[email protected]>
@cdoern
Copy link
Contributor Author

cdoern commented Jul 21, 2022

@containers/podman-maintainers PTAL and merge

@mheon
Copy link
Member

mheon commented Jul 21, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 21, 2022
@openshift-merge-robot openshift-merge-robot merged commit ee937c5 into containers:main Jul 21, 2022
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants