Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pids-limit should only be set if the user set it #6842

Merged
merged 1 commit into from
Jul 13, 2020

Conversation

rhatdan
Copy link
Member

@rhatdan rhatdan commented Jul 2, 2020

Currently we are sending over pids-limits from the user even if they
never modified the defaults. The pids limit should be set at the server
side unless modified by the user.

This issue has led to failures on systems that were running with cgroups V1.

Signed-off-by: Daniel J Walsh [email protected]

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 2, 2020
@rhatdan
Copy link
Member Author

rhatdan commented Jul 2, 2020

@giuseppe @mheon PTAL

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mheon
Copy link
Member

mheon commented Jul 2, 2020

I don't like this because it disables the default PID limit entirely, even for root containers. We should preserve the default limit where possible (root and cgroups v2 rootless)

@rhatdan
Copy link
Member Author

rhatdan commented Jul 2, 2020

@mheon The default limit should be handled on the server side, if the user does not specify a root limit then pkg/specgen should use the default.

@mheon
Copy link
Member

mheon commented Jul 2, 2020

I don't see code for that anywhere. pkg/specgen takes the resource limits as they were given and uses them unmodified.

@rhatdan
Copy link
Member Author

rhatdan commented Jul 2, 2020

@mheon Added the defaults.

@@ -169,6 +170,19 @@ func CompleteSpec(ctx context.Context, r *libpod.Runtime, s *specgen.SpecGenerat
}
}

// If caller did not specify Pids Limits load default
if s.ResourceLimits.Pids == nil {
if s.CgroupsMode != "disabled" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to make sure we are not rootless + cgroups v1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rtc.PidsLimit() takes care of this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rtc.PidsLimit() is returning 4096 on my cgroups v1 machine, rootless, even though the default does not show in run --help.

@@ -169,6 +170,19 @@ func CompleteSpec(ctx context.Context, r *libpod.Runtime, s *specgen.SpecGenerat
}
}

// If caller did not specify Pids Limits load default
if s.ResourceLimits.Pids == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could null-pointer, I think - s.ResourceLimits may be null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -169,6 +170,19 @@ func CompleteSpec(ctx context.Context, r *libpod.Runtime, s *specgen.SpecGenerat
}
}

// If caller did not specify Pids Limits load default
if s.ResourceLimits != nil && s.ResourceLimits.Pids == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to make ResourceLimits if it was null and Pids is not empty?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think this will be the case for rootless v2, for example)

@rhatdan rhatdan force-pushed the pids-limit branch 2 times, most recently from dba4cad to 9b4e26a Compare July 2, 2020 16:56
@goochjj
Copy link
Contributor

goochjj commented Jul 2, 2020

Does not work.

(focal)mrwizard@FocalCG1Dev:~/src/podman
$ bin/podman --runtime /usr/local/bin/crun-debug run --rm -it alpine sh
Error: cannot set limits without cgroups: OCI runtime error
jq .linux.resources /run/user/1000/crun/config.json
{
  "pids": {
    "limit": 4096
  }
}

@mheon
Copy link
Member

mheon commented Jul 2, 2020

Confirmed here as well.

@mheon
Copy link
Member

mheon commented Jul 2, 2020

Ah, I think I know what's going on here. The code in containers/common for determining default limit is broken.

func (c *Config) PidsLimit() int64 {
        if unshare.IsRootless() {
                if c.Engine.CgroupManager == SystemdCgroupsManager {
                        cgroup2, _ := cgroupv2.Enabled()
                        if cgroup2 {
                                return c.Containers.PidsLimit
                        }
                        return 0
                }
        }
        return sysinfo.GetDefaultPidsLimit()
}

Notice that if cgroup manager is set to cgroupfs, we will drop through the conditionals and return sysinfo.GetDefaultPidsLimit - and v1 rootless defaults to cgroupfs.

@goochjj
Copy link
Contributor

goochjj commented Jul 2, 2020

Ah, I think I know what's going on here. The code in containers/common for determining default limit is broken.

func (c *Config) PidsLimit() int64 {
        if unshare.IsRootless() {
                if c.Engine.CgroupManager == SystemdCgroupsManager {
                        cgroup2, _ := cgroupv2.Enabled()
                        if cgroup2 {
                                return c.Containers.PidsLimit
                        }
                        return 0
                }
        }
        return sysinfo.GetDefaultPidsLimit()
}

Notice that if cgroup manager is set to cgroupfs, we will drop through the conditionals and return sysinfo.GetDefaultPidsLimit - and v1 rootless defaults to cgroupfs.

if it's cgroupfs, we return 0 - that's what I see

@goochjj
Copy link
Contributor

goochjj commented Jul 2, 2020

Ah, I think I know what's going on here. The code in containers/common for determining default limit is broken.

func (c *Config) PidsLimit() int64 {
        if unshare.IsRootless() {
                if c.Engine.CgroupManager == SystemdCgroupsManager {
                        cgroup2, _ := cgroupv2.Enabled()
                        if cgroup2 {
                                return c.Containers.PidsLimit
                        }
                        return 0
                }
        }
        return sysinfo.GetDefaultPidsLimit()
}

Notice that if cgroup manager is set to cgroupfs, we will drop through the conditionals and return sysinfo.GetDefaultPidsLimit - and v1 rootless defaults to cgroupfs.

if it's cgroupfs, we return 0 - that's what I see

NVM you're right.

@goochjj
Copy link
Contributor

goochjj commented Jul 2, 2020

Ah, I think I know what's going on here. The code in containers/common for determining default limit is broken.

func (c *Config) PidsLimit() int64 {
        if unshare.IsRootless() {
                if c.Engine.CgroupManager == SystemdCgroupsManager {
                        cgroup2, _ := cgroupv2.Enabled()
                        if cgroup2 {
                                return c.Containers.PidsLimit
                        }
                        return 0
                }
        }
        return sysinfo.GetDefaultPidsLimit()
}

Notice that if cgroup manager is set to cgroupfs, we will drop through the conditionals and return sysinfo.GetDefaultPidsLimit - and v1 rootless defaults to cgroupfs.

if it's cgroupfs, we return 0 - that's what I see

NVM you're right.

The return 0 needs to move out one level probably.

@goochjj
Copy link
Contributor

goochjj commented Jul 2, 2020

Confirmed this fixes it:

--- a/vendor/github.com/containers/common/pkg/config/default.go
+++ b/vendor/github.com/containers/common/pkg/config/default.go
@@ -489,8 +489,8 @@ func (c *Config) PidsLimit() int64 {
                        if cgroup2 {
                                return c.Containers.PidsLimit
                        }
-                       return 0
                }
+               return 0
        }
        return sysinfo.GetDefaultPidsLimit()
 }

@goochjj
Copy link
Contributor

goochjj commented Jul 2, 2020

Though I'm not sure why cgroups v2 can't set a pids limit in cgroupfs mode - if the pids controller is in our cgroup, and we have write access to it, why can't we?

@rhatdan
Copy link
Member Author

rhatdan commented Jul 2, 2020

@goochjj Ask @giuseppe I don't remember why.
Anyways lets get this in.

@rhatdan rhatdan force-pushed the pids-limit branch 2 times, most recently from 7e42ddc to 4828943 Compare July 3, 2020 09:02
@goochjj
Copy link
Contributor

goochjj commented Jul 5, 2020

This merges a bunch of vendor dependencies which breaks the build.

Could you remove the runc bump from 1.0.0-rc90 to rc91? It breaks things in buildah due to changes in libcontainer...

IMHO that bump should be a separate PR, allowing for fixes to that issue to be tracked under their own issue/PR.

I've verified removing the runc bump fixes the compile.

@goochjj
Copy link
Contributor

goochjj commented Jul 5, 2020

rhatdan#6

@rhatdan rhatdan changed the title Pids-limit should only be set if the user set it [WIP] Pids-limit should only be set if the user set it Jul 6, 2020
@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 6, 2020
@goochjj
Copy link
Contributor

goochjj commented Jul 7, 2020

@rhatdan https://github.com/goochjj/libpod/tree/rhatdan-pids-limit

Rebased, with runc rc90 and common 0.15.2

@goochjj goochjj mentioned this pull request Jul 7, 2020
@goochjj
Copy link
Contributor

goochjj commented Jul 8, 2020

My branch updated and ready if #6897 is merged.

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 10, 2020
@rhatdan rhatdan removed Backport to v2.0 do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jul 10, 2020
@rhatdan rhatdan changed the title [WIP] Pids-limit should only be set if the user set it Pids-limit should only be set if the user set it Jul 10, 2020
Currently we are sending over pids-limits from the user even if they
never modified the defaults.  The pids limit should be set at the server
side unless modified by the user.

This issue has led to failures on systems that were running with cgroups V1.

Signed-off-by: Daniel J Walsh <[email protected]>
@mheon
Copy link
Member

mheon commented Jul 12, 2020

I'm going to give this a final test on Debian to verify it fixes v1, but LGTM based on what I see now.

@mheon
Copy link
Member

mheon commented Jul 13, 2020

Confirmed working on Debian with v1
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 13, 2020
@openshift-merge-robot openshift-merge-robot merged commit d86acf2 into containers:master Jul 13, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants