qemu: pass task resources into driver for cgroup setup #23466

Merged Jul 1, 2024 (1 commit)

tgross (Member) commented Jun 28, 2024

As part of the work for 1.7.0 we moved portions of the task cgroup setup down into the executor. This requires that the executor constructor receive the TaskConfig.Resources struct, and this was missing from the qemu driver. We fixed a panic caused by this change in #19089 before we shipped, but that fix was effectively undone when we added plumbing for custom cgroups for raw_exec in 1.8.0. As a result, qemu tasks always fail to run on Linux.
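
To illustrate the shape of the problem, here is a minimal sketch in Go. It is not Nomad's actual code: the types `Resources`, `TaskConfig`, and `ExecCommand` and the functions `buildCommand`, `configureCgroups`, and `launch` are all hypothetical stand-ins, and the error text is illustrative (the real failure surfaced as "no such file or directory"). It only shows how an executor that sets up cgroups itself fails when the driver does not hand it the task resources.

```go
package main

import (
	"errors"
	"fmt"
)

// Resources is a hypothetical stand-in for the task resources a driver
// receives; it is not Nomad's real type.
type Resources struct {
	CPUShares int64
	MemoryMB  int64
}

// TaskConfig is a hypothetical, trimmed-down task configuration.
type TaskConfig struct {
	Name      string
	Resources *Resources
}

// ExecCommand is a hypothetical launch request handed to the executor.
type ExecCommand struct {
	Cmd       string
	Resources *Resources
}

var errNoResources = errors.New("unable to configure cgroups: no task resources provided")

// configureCgroups models executor-side cgroup setup: it cannot proceed
// unless the driver passed the task resources along.
func configureCgroups(cmd *ExecCommand) error {
	if cmd.Resources == nil {
		return errNoResources
	}
	return nil
}

// launch models the executor starting a command after cgroup setup.
func launch(cmd *ExecCommand) error {
	if err := configureCgroups(cmd); err != nil {
		return fmt.Errorf("failed to launch command with executor: %w", err)
	}
	return nil
}

// buildCommand models the driver side. Setting passResources to false
// reproduces the kind of omission described above.
func buildCommand(task *TaskConfig, passResources bool) *ExecCommand {
	cmd := &ExecCommand{Cmd: "qemu-system-x86_64"}
	if passResources {
		cmd.Resources = task.Resources
	}
	return cmd
}

func main() {
	task := &TaskConfig{Name: "vm", Resources: &Resources{CPUShares: 500, MemoryMB: 512}}
	fmt.Println("without resources:", launch(buildCommand(task, false)))
	fmt.Println("with resources:", launch(buildCommand(task, true)))
}
```

In this sketch the fix amounts to always copying `task.Resources` into the launch request, which is the rough idea behind passing the task resources into the driver.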

This was undetected in testing because our CI environment doesn't have QEMU installed. I've got all the unit tests running locally again and have added QEMU installation when we're running the drivers tests, so that'll catch future regressions.
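
For context on how a gap like this can hide, here is a small Go sketch of a common pattern: tests that depend on an external binary check for it first and are skipped when it is absent. The helper `qemuAvailable` and the `lookPath` variable are hypothetical names for illustration, not Nomad's test helpers; only `exec.LookPath` is a real standard-library call.

```go
package main

import (
	"fmt"
	"os/exec"
)

// lookPath is a package variable so the check can be exercised on machines
// without QEMU installed. By default it is the standard exec.LookPath.
var lookPath = exec.LookPath

// qemuAvailable reports whether the named binary can be found on PATH.
func qemuAvailable(binary string) bool {
	_, err := lookPath(binary)
	return err == nil
}

func main() {
	// The binary name is an assumption; QEMU ships one binary per target.
	const binary = "qemu-system-x86_64"
	if qemuAvailable(binary) {
		fmt.Println("QEMU found: driver tests would run")
	} else {
		fmt.Println("QEMU not found: driver tests would be skipped")
	}
}
```

When the binary is missing, a check like this makes the tests skip rather than fail, so a regression in the driver goes unnoticed until QEMU is installed in the test environment.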

Fixes: #23250

tgross (Member, Author) commented Jun 28, 2024

(Holding this in draft pending GitHub Actions getting unstuck so we can run the tests.) Done!

pkazmierczak (Contributor) left a comment

LGTM!

tgross merged commit eedbd36 into main on Jul 1, 2024 (20 checks passed)
tgross deleted the qemu-cgroup branch on July 1, 2024 15:41
djthorpe commented

FYI, this is also happening to me with the raw_exec driver on Nomad 1.8.1.

Here is the output in Nomad 1.8.1:

    Jul 15, '24 13:00:39 +0200 | Driver Failure | failed to launch command with executor: rpc error: code = Unknown desc = unable to configure cgroups: no such file or directory

Here is the snippet from my job:

    task "init" {
      driver = "raw_exec"

      lifecycle {
        sidecar = false
        hook    = "prestart"
      }

      config {
        // Set permissions on the directory
        command = var.data == "" ? "/usr/bin/echo" : "/usr/bin/install"
        args = compact([
          "-d", var.data,
          "-o", "472"
        ])
      }
    } // task "init"

tgross (Member, Author) commented Jul 15, 2024

@djthorpe that's because Nomad 1.8.1 doesn't include this PR, as you can see in the changelog. Once the next version of Nomad is released with this patch, please try again. Wait, for raw_exec? Can you open a new issue instead of commenting on a closed PR?

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Dec 28, 2024
Labels: backport/ent/1.7.x+ent, backport/1.8.x, theme/driver/qemu, type/bug
Development

Successfully merging this pull request may close these issues.

qemu job failed at "rpc error: code = Unknown desc = unable to configure cgroups: no such file or directory"
3 participants