podman build: error opening file .../cgroup.freeze: No such device #422

Closed
edsantiago opened this issue Mar 12, 2020 · 18 comments · Fixed by #423

Comments

@edsantiago
Member

In CI. Rootless. I don't know which Fedora:

# $ ./bin/podman build -t build_test --format=docker /tmp/podman_bats.JIkSvM/build-test
...
# STEP 3: RUN echo mYNEUygNEKRSRisDZZ2w1dztQbAfn5p96CBHIKfkL9GHAcSfum > /8stfSssoTINVW8YZTQ58
# error opening file `/sys/fs/cgroup//user.slice/user-14802.slice/user@14802.service/buildah-buildah631812000-99184.scope/cgroup.freeze`: No such device
# kill container: No such process
# error running container: error reading container state: exit status 1
# Error: error building at STEP "RUN echo mYNEUygNEKRSRisDZZ2w1dztQbAfn5p96CBHIKfkL9GHAcSfum > /8stfSssoTINVW8YZTQ58": error while running runtime: exit status 1
# [ rc=125 (** EXPECTED 0 **) ]

CI links: cirrus -- highlighted

I've restarted test, on the assumption that it's a flake, but want to start tracking regardless.

@mheon
Member

mheon commented Mar 12, 2020

Rootless is on F30, I thiiiiiink?

@cevich can you confirm?

@edsantiago
Member Author

Yep, it was a flake.

@edsantiago
Member Author

Oh dear. Seeing it in ginkgo now (as opposed to BATS the first time). Again, rootless.

@giuseppe
Member

It looks like we are using cgroup v2 on a kernel that doesn't support the freezer control.

We should either use cgroup v1 in this case, or an updated kernel.
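
For anyone wanting to check this hypothesis on a CI host, here is a minimal Go sketch (not code from crun, buildah, or podman). It assumes the usual mount point `/sys/fs/cgroup`, relies on `cgroup.controllers` existing at the root of a cgroup v2 unified hierarchy, and on `cgroup.freeze` being present in non-root cgroups on kernels that implement the v2 freezer (added in Linux 5.2). The function names and the scope path in `main` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// fileExists reports whether path can be stat'ed.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// isCgroupV2 reports whether root looks like a cgroup v2 unified
// hierarchy, which exposes cgroup.controllers at its top level.
func isCgroupV2(root string) bool {
	return fileExists(filepath.Join(root, "cgroup.controllers"))
}

// hasFreezer reports whether the given non-root cgroup directory
// exposes cgroup.freeze. The root cgroup never has this file.
func hasFreezer(cgroupDir string) bool {
	return fileExists(filepath.Join(cgroupDir, "cgroup.freeze"))
}

func main() {
	root := "/sys/fs/cgroup" // assumed mount point
	// Hypothetical child cgroup; substitute a real scope or slice.
	child := filepath.Join(root, "user.slice")
	fmt.Println("cgroup v2:", isCgroupV2(root))
	fmt.Println("freezer available:", hasFreezer(child))
}
```

Note that a check like this only says whether the file exists at the moment of the call, so it cannot distinguish a missing kernel feature from a cgroup that was removed a moment earlier.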

@edsantiago
Member Author

But it's a flake: restarting the failed test makes everything OK.

@rhatdan
Member

rhatdan commented Mar 16, 2020

Are we always running the same kernel?

@edsantiago
Member Author

In one of the cases above, the package_versions step shows 5.5.5-200.fc31.x86_64 for both the failed and successful runs.

@cevich
Member

cevich commented Mar 16, 2020

> Are we always running the same kernel?

Yes. In general, automation avoids updating or changing anything from one run to the next as much as possible. IIRC, the crun package is the only exception currently, and only for F31 VMs.

Otherwise (for F30 and Ubuntu), updating packages and the kernel requires building new VM cache images (as documented).

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Apr 16, 2020

Since we are updating to run against newer kernels, I'm going to close this issue.

rhatdan closed this as completed Apr 16, 2020
@edsantiago
Member Author

It's back

not ok 35 podman build - basic test
...
$ podman build -t build_test --format=docker /tmp/podman_bats.Nfkrh8/build-test
...
STEP 3: RUN echo P7P7OoAPTPb03b2fgkfyVDRp65bhJP6FpGECwhdYOZpKPzuAig > /wKvZH6BbgiOoBaO0S6ht
 error opening file `/sys/fs/cgroup//system.slice/buildah-buildah158633376-456211.scope/cgroup.freeze`: No such file or directory
 kill container: No such process
 error running container: error reading container state: exit status 1
 Error: error building at STEP "RUN echo P7P7OoAPTPb03b2fgkfyVDRp65bhJP6FpGECwhdYOZpKPzuAig > /wKvZH6BbgiOoBaO0S6ht": error while running runtime: exit status 1
 [ rc=125 (** EXPECTED 0 **) ]

Fedora 31, kernel 5.6.15-200.fc31.x86_64

edsantiago reopened this Jun 22, 2020
@vrothberg
Member

vrothberg commented Jun 23, 2020

@giuseppe @nalind could this be a race?

@giuseppe
Member

Possibly. I think it could be caused by the cgroup disappearing (systemd cleaning it up) while we are in the middle of accessing it.
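
To illustrate the kind of race being described, here is a rough Go sketch of tolerating a cgroup that vanishes between lookup and access. It is not crun's code (crun is written in C) and not the actual fix; `freeze` and `cgroupGone` are hypothetical names. The assumption is that both errors seen in this thread, `No such file or directory` (ENOENT) and `No such device` (ENODEV), can mean that the cgroup was already removed.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// cgroupGone reports whether err suggests that the cgroup was removed
// underneath us (assumption: ENOENT and ENODEV both indicate this).
func cgroupGone(err error) bool {
	return errors.Is(err, os.ErrNotExist) || errors.Is(err, syscall.ENODEV)
}

// freeze writes "1" to cgroup.freeze inside dir. It returns
// (false, nil) when the cgroup has already disappeared, so callers
// can treat "nothing left to freeze" as a non-error.
func freeze(dir string) (bool, error) {
	// No O_CREATE: we must never create the file ourselves.
	f, err := os.OpenFile(filepath.Join(dir, "cgroup.freeze"), os.O_WRONLY, 0)
	if err != nil {
		if cgroupGone(err) {
			return false, nil
		}
		return false, err
	}
	defer f.Close()

	if _, err := f.WriteString("1"); err != nil {
		if cgroupGone(err) {
			return false, nil
		}
		return false, err
	}
	return true, nil
}

func main() {
	// Hypothetical scope path, used only for illustration.
	frozen, err := freeze("/sys/fs/cgroup/system.slice/hypothetical.scope")
	fmt.Println("frozen:", frozen, "err:", err)
}
```

The design choice here is to classify the error at the point of access rather than checking for the directory first, since any existence check would be subject to the same race.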

@edsantiago
Member Author

Two more: one on f31 and one on f32

@edsantiago
Member Author

Another one, f32, this time on Dan's PR containers/podman#6570. Shall I just remove the podman build tests from CI?

@edsantiago
Member Author

Another one, f31, again in PR containers/podman#6570, this time in the healthcheck test because it too runs podman build.

@edsantiago
Member Author

Two more recent ones: PR containers/podman#6791 (log) and PR containers/podman#6823 (log)

Oh, and two more today: PR containers/podman#6773 (log) and PR containers/podman#6821 (log)

All failures are only on f31.

Could we please give this one some attention?

giuseppe transferred this issue from containers/podman Jul 2, 2020
giuseppe added a commit to giuseppe/crun that referenced this issue Jul 2, 2020
fix some race conditions where crun would fail if the process already
exited.

Closes: containers#422

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe
Member

giuseppe commented Jul 2, 2020

PR here: #423

giuseppe added a commit to giuseppe/buildah that referenced this issue Jul 2, 2020
fix a race condition where the container process could exit before the
runtime sends the signal, causing the command to fail.

Part of: containers/crun#422

Signed-off-by: Giuseppe Scrivano <[email protected]>
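
The pattern described in this commit message can be sketched roughly as follows. This is an illustrative, Unix-only Go example and not the buildah patch itself: buildah invokes the runtime's `kill` command, whereas this sketch signals a PID directly with `syscall.Kill`. The name `signalIfAlive` is hypothetical. The idea is that "no such process" (ESRCH) means the target already exited, which is treated as success instead of failing the whole command.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// signalIfAlive sends sig to pid. If the process has already exited
// (ESRCH), it returns (false, nil) instead of an error, because there
// is nothing left to signal.
func signalIfAlive(pid int, sig syscall.Signal) (bool, error) {
	err := syscall.Kill(pid, sig)
	if errors.Is(err, syscall.ESRCH) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	// Signal 0 performs the existence and permission checks without
	// delivering a signal; here it targets the current process.
	delivered, err := signalIfAlive(syscall.Getpid(), 0)
	fmt.Println("delivered:", delivered, "err:", err)
}
```

Other errors, such as EPERM, are still returned to the caller, so only the "already exited" case is skipped.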
giuseppe added a commit to giuseppe/buildah that referenced this issue Jul 9, 2020
bors bot added a commit to containers/buildah that referenced this issue Jul 13, 2020
2434: linux: skip errors from the runtime kill r=vrothberg a=giuseppe

fix a race condition where the container process could exit before the
runtime sends the signal, causing the command to fail.

Part of: containers/crun#422

Signed-off-by: Giuseppe Scrivano <[email protected]>
Co-authored-by: Giuseppe Scrivano <[email protected]>
bors bot added a commit to containers/buildah that referenced this issue Jul 14, 2020