-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resilience in adding of exec tasks to cgroups #1950
Resilience in adding of exec tasks to cgroups #1950
Conversation
bbbcd28
to
37fa8c3
Compare
The kernel will sometimes return EINVAL when writing a pid to a cgroup.procs file. It does so when the task being added still has the state TASK_NEW. See: https://elixir.bootlin.com/linux/v4.8/source/kernel/sched/core.c#L8286 Co-authored-by: Danail Branekov <[email protected]> Signed-off-by: Tom Godkin <[email protected]> Signed-off-by: Danail Branekov <[email protected]>
37fa8c3
to
bdf3524
Compare
The patch contains the commit from PR opencontainers/runc#1950 [#162607734] Co-authored-by: Danail Branekov <[email protected]>
Friendly bump |
@cyphar @crosbymichael Hey friends! Currently Garden is shipping with this in place as a patch since it was causing persistent application push failures in some cloud foundry deployments (which is fixed with this change). Would be great to get this merge in so we can stop relying on patching runc! Please let me know what you think, thanks! |
Reviewing this today. Sorry, it got lost on me. Thanks for the ping! |
1 similar comment
Does it solve #1326 ? |
We've had this running in prod for a while now (even as a gift patch before this got merged) in cloud foundry and have not seen it reoccur. |
This is related to #1884
This fix allows more leniency for setting up cgroups on exec when cgroup namespaces are not used.
The issue linked above was partly resolved by #1916 - we can avoid the issue entirely by using cgroup namespaces. Cloud Foundry's use of runc does not yet use cgroup namespaces and we encounter the issue above frequently.
We believe that since not using cgroup namespaces is a valid configuration of a config.json, runc should be more resilient when we choose not to. This PR will attempt to write to cgroup.procs a few times when execing, only retrying on seeing EINVAL (which, unless you're using realtime processes, means the task's state in the kernel is still TASK_NEW: https://elixir.bootlin.com/linux/v4.8/source/kernel/sched/core.c#L8286)
We replaced the ioutil.WriteFile with a Open/Write dance because we don't want to open the file each time we attempt to write to cgroup.procs, and we reduce the surface for error types changing underneath us within golang.