Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert runc bump to rc91 #6882

Conversation

goochjj
Copy link
Contributor

@goochjj goochjj commented Jul 7, 2020

Lets separate out the dependbot update - common here, runc elsewhere.

Signed-off-by: Joseph Gooch [email protected]

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: goochjj
To complete the pull request process, please assign rhatdan
You can assign the PR to them by writing /assign @rhatdan in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 7, 2020
@openshift-ci-robot
Copy link
Collaborator

Hi @goochjj. Thanks for your PR.

I'm waiting for a containers member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

#6850

@vrothberg
Copy link
Member

I don't see other vendor changes. Did you run make vendor to update the dependencies?

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

I specifically undid the runc bump to rc91. It's breaking the build.

It's my assumption (since runc has experimental cgroups v2 now, and breaks the build, specifically in buildah), that the best path forward would be to merge the common 0.15.2 update, but NOT merge the runc bump to rc91. Then runc compliance can be done in a separate PR, and only involves setting this back to rc91.

IMHO #6842 needs to be merged pronto because rootless is broken for cgroups v1.
#6842 needs common 0.15.2
#6842 did a make vendor, and got runc-rc91 which broke things.

Hence I'm trying to get #6842 out the door. Now, I've done a branch of @rhatdan's fix with this backreved, so right now if #6842 were merged (plus my fix) this would be the end result anyway. I'm just unclear, direction-wise, if the plan is to

  1. introduce this commit as a dependency update, either including this commit (to exclude runc) or squashing it
  2. rebase 6842 so it's clear that his pids-limit commit is a strictly pids-limit commit, NOT a dependency update
  3. rebase later dependency commits or have dependabot create another commit or introduce an issue to update to runc-rc91, realizing it's going to fail tests and require fixing

Regardless, the title of this PR is "Bump github.com/containers/common from 0.15.1 to 0.15.2 #6850" but it ACTUALLY ALSO updates runc to rc91... Which looks like an unintended consequence.

Please advise on procedure!

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

It's possible I'm doing this completely wrong.

It looks like containers/common was bumped to require runc 1.0.0rc91, and so no matter what I do it's going to keep importing that, unless I manually override it and check out rc90 in the vendor directory, which is probably less than ideal.

The real problem IMHO is that runc libcontainer had breaking changes in a minor release.

Advice on how this can be expedited welcome!

Use "replace" in go.mod to explicitly make sure runc stays at rc90 - because it breaks buildah.

Signed-off-by: Joseph Gooch <[email protected]>
@goochjj goochjj force-pushed the common-upd-only branch from 279a41c to 6fa8094 Compare July 7, 2020 16:24
@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

This will do it.

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

@vrothberg what do you think, should I just supercede #6850 with my own PR to bump containers/common to 0.15.2, squash this with the one dependabot did? Then the tests can run here.

I can then open an issue for the build breakage w/ runc rc91 - which is in buildah, so should I do that issue in the buildah repo? Or here?

@mheon
Copy link
Member

mheon commented Jul 7, 2020

@goochjj Definitely file an issue against the Buildah repo.

As for the runc thing - it looks like it's just being dragged in by c/common, so I think we need to undo the vendor there, then revendor c/common?

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

@mheon True, but adding the replace line supercedes what common says - so it's a nice stop-gap to prevent needing common 0.15.3

This is the thing
common requires runc rc91 (just because it was bumped)
libpod requires common (so it gets rc91)
buildah requires common 0.14 - so it doesn't get the version of runc that has a problem
but libpod gets buildah, and has a NEWER runc.

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

Root cause is runc rc91 changed the way devicerules work because of the differences between cgroups v1 and v2.

Which causes:

vendor/github.com/containers/buildah/run_linux.go:142:72: cannot use d.DeviceRule.Permissions (type configs.DevicePermissions) as type string in argument to g.AddLinuxResourcesDevice

(It might break other things for all I know)

So

Possible solutions

  1. Bump c/common dependency back to rc90, reversion, revendor, and let everything trickle down, so libpod will inherit rc90 from common.
  2. Temp replace rc91 with rc90 in libpod, since it's the intersection of c/common and c/buildah that causes it, and either project is fine alone.
  3. Update buildah to require c/common 0.15.2 and then fix the breakage

I think...?

Basically that means buildah is going to end up with code that EITHER works on rc90 or rc91 - (unless a conditional can be used) so I'm not sure what the best plan is to version these things properly.

@giuseppe
Copy link
Member

giuseppe commented Jul 7, 2020

Let's skip runc rc91 as there is a regression that breaks cgroup v1 on systemd

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

That's fine, but skip where - c/common and reversion or use this PR - which blocks it in libpod?

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

Root cause:
https://github.com/opencontainers/runc/blob/819fcc687efb6e99d980a257f1dc8d92d02ba5bd/libcontainer/configs/device.go#L37

Device permissions are now a custom type

This commit
goochjj@2b64224

shows what needs to change in various libpod-things to support that.

@mheon
Copy link
Member

mheon commented Jul 7, 2020

Requiring a fresh version of c/common is probably OK. We're not likely to cut a fresh release off master in the next few weeks (we're cutting a 2.0.2 on the v2.0 branch, which has an old-enough c/common that it should avoid this) so we have time to fix this in master (or potentially even fix this in Buildah and then vendor that) before we're in danger of releasing a broken build.

I am sort of interested as to how this snuck in without being detected by CI, though.

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

I saw it in the CI for Dan's pids-limit patch.

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

I imagine that:

  1. Buildah's CI isn't including the bad version of runc since it depends on older c/common
  2. Runc CI wouldn't expose it because all of runc's bits were updated to use the new custom type
  3. c/commons CI (does it have CI?) wouldn't expose it because it largely has common code and probably does poke around in libcontainer

So libpod as the intersection of all those points is the one that exposed it...

@goochjj
Copy link
Contributor Author

goochjj commented Jul 7, 2020

conversation moved to issue
Will be solved by c/common new version

@goochjj goochjj closed this Jul 7, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants