Failed to set the cpuset cgroup for container on WSL2 #10709
Comments
Hi @TGNThump. Can you give a little more context as to where the Nomad client is actually running here? You're saying you've got Docker Desktop on one hand but then WSL2 on the other. Also, are you running the Nomad client as root? That's a hard requirement. Just a heads up that in general, the WSL2 environment is not well-supported: #2633. You're going to find that the WSL2 kernel is "weird". Note that it looks like you can't detect bridge networking in this environment either. From your logs:
|
Hey @tgross, Nomad is running in --dev mode as root inside the ubuntu WSL2 instance. |
Thanks @TGNThump. I traced this error message down to Unfortunately it looks like we'll need to open a bug with upstream here. We're on version rc93 of runc, but judging from that file's recent commits, there's been nothing helpful to us here since then either. We'll start with Before we go down that path, does this happen with any other container image? |
My colleague @notnoop pointed out to me that we'd see the exact same error (which originates from |
I also tried to run ubuntu:latest with nomad. When I run it with Not sure if this is down to the same root cause, or something different. |
I am having the same issue on Fedora CoreOS, but I am running Nomad in a Podman container with access to the Docker socket, the required volumes mounted, and access to cgroups. The error is the same: the PID cannot be bound to a cpuset, but the Docker process exists and keeps running. Nomad just loses control of the process after the error and reports it as failed. My assumption is that CoreOS runs the Docker containers in a different cgroup namespace, so this PID cannot be assigned. Side note: it would be great to be able to disable cpusets with a config flag. |
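One way to test the namespace assumption above is to compare the cgroup namespace of the Nomad client with that of the container PID from the error. The following is a minimal sketch, not anything Nomad ships: the function names and the `proc_root` parameter are made up for illustration, and it assumes a Linux host where `/proc/<pid>/ns/cgroup` is available and the caller (for example root) is allowed to read it.

```python
import os


def cgroup_namespace(pid, proc_root="/proc"):
    """Return the cgroup namespace link target for pid, or None if pid is not visible.

    The link target looks like 'cgroup:[4026531835]'. A PermissionError from
    reading another user's process is deliberately left to propagate.
    """
    try:
        return os.readlink(os.path.join(proc_root, str(pid), "ns", "cgroup"))
    except FileNotFoundError:
        # No /proc entry: the PID does not exist in this PID namespace.
        return None


def same_cgroup_namespace(pid_a, pid_b, proc_root="/proc"):
    """True only if both PIDs are visible and share one cgroup namespace."""
    ns_a = cgroup_namespace(pid_a, proc_root)
    ns_b = cgroup_namespace(pid_b, proc_root)
    return ns_a is not None and ns_a == ns_b
```

Calling `same_cgroup_namespace(os.getpid(), container_pid)` from the environment where the Nomad client runs would show whether the two share a namespace. A `None` result from `cgroup_namespace(container_pid)` would instead suggest the PID is not visible at all from where the client runs.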
did a quick test run the error is but the pid exists here |
Hi @zyclonite, we strongly discourage anyone from running Nomad clients in containers, and it's not a documented, tested, or supported configuration. |
Not much in the Nomad logs there, unfortunately. Does the Docker event log have anything in this situation? It might have more error information we could use to debug. |
I think I understand why that is, and that's a clear bug: when we hit an error condition on this path we should be stopping the container, just as we do when the logging setup fails at |
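The cleanup behaviour described above can be sketched as a generic pattern: if any step after the container starts fails, stop the container before reporting the error, so nothing is left running outside the scheduler's control. This is only an illustration; `start_task` and its three callable parameters are hypothetical and do not correspond to Nomad's actual driver code.

```python
def start_task(start_container, post_start_steps, stop_container):
    """Start a container, run post-start steps, and stop it if any step fails.

    start_container() returns a container ID; each step and stop_container
    take that ID. All three are placeholders supplied by the caller.
    """
    container_id = start_container()
    try:
        for step in post_start_steps:
            step(container_id)  # e.g. cgroup assignment, logging setup
    except Exception:
        # Do not leave an orphaned container behind on the error path.
        stop_container(container_id)
        raise
    return container_id
```

The original error is re-raised after cleanup, so the task is still reported as failed, but the container no longer keeps running unsupervised.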
@tgross but the option to disable cpuset usage would still be nice ;) |
Sorry, I've been on holiday. I'll give this a try tomorrow. |
@tgross v1.1.1 does indeed fix the cpuset issue. |
Happy to send over the docker desktop diagnostics zip if that's helpful. I don't really know what I'm looking for. Also happy to create a new issue, given the cpusets issue has been resolved. |
Let's create a new issue for that. Thanks @TGNThump! |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version
Nomad v1.1.0 (2678c3604bc9530014208bc167415e167fd440fc)
Operating system and Environment details
Nomad running on
Ubuntu 20.04.2 LTS
on WSL2 4.19.84-microsoft-standard
on Windows 10 20H2 (OS Build 19042.985)
Docker Desktop v3.3.3, Docker Engine v20.10.6
Issue
failed to set the cpuset cgroup for container: failed to write 3839 to cgroup.procs: write /sys/fs/cgroup/cpuset/nomad/shared/cgroup.procs: no such process
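For readers unfamiliar with the message: "no such process" is the standard text for the ESRCH error code, which here is reported for a write of the PID to `cgroup.procs`. A rough sketch of that operation follows, assuming a cgroup v1 layout like the path in the error; `add_pid_to_cgroup` is a name invented for this example and is not how Nomad or runc implement it.

```python
import errno
import os


def add_pid_to_cgroup(cgroup_dir, pid):
    """Write pid to <cgroup_dir>/cgroup.procs.

    Returns "ok" on success, or "no-such-process" when the write is rejected
    with ESRCH. Any other OSError (missing directory, permissions) propagates.
    """
    procs_path = os.path.join(cgroup_dir, "cgroup.procs")
    try:
        # The write is flushed when the file closes, still inside the try.
        with open(procs_path, "w") as f:
            f.write(str(pid))
    except OSError as err:
        if err.errno == errno.ESRCH:
            return "no-such-process"
        raise
    return "ok"
```

Run as root, `add_pid_to_cgroup("/sys/fs/cgroup/cpuset/nomad/shared", 3839)` would attempt the same write as in the error above. An ESRCH result would be consistent with the PID not being resolvable from the environment doing the write, though that is an interpretation rather than something the logs confirm.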
Reproduction steps
Run Docker Desktop with WSL2 integration enabled in Linux containers mode.
Running
nomad job run -check-index 0 kratos.jobspec.hcl
with nomad running from wsl2.
Expected Result
The job should work.
Actual Result
See errors:
failed to set the cpuset cgroup for container: failed to write 3839 to cgroup.procs: write /sys/fs/cgroup/cpuset/nomad/shared/cgroup.procs: no such process
nomad.log
kratos.jobspec.hcl