Retry writing to cgroup files on EINTR error #2258
Dockerfile
@@ -1,4 +1,4 @@
-FROM golang:1.12-stretch
+FROM golang:1.14-stretch
The default should still be kept at <= 1.13, but you can override the version to 1.14 with --build-arg in CI.
Golang 1.14 introduces asynchronous preemption, which results in applications getting frequent EINTR (syscall interrupted) errors when invoking slow syscalls, e.g. when writing to cgroup files. As writing to cgroups is idempotent, it is safe to retry writing to the file whenever the write syscall is interrupted.
Signed-off-by: Mario Nitchev <[email protected]>
for i := 0; i < 100000; i++ {
	limit := 1024*1024 + i
	if err := WriteFile(cgroupPath, "memory.limit_in_bytes", strconv.Itoa(limit)); err != nil {
t.Skip() on cgroup v2 env
Added the skip at the beginning of the test.
Signed-off-by: Yulia Nedyalkova <[email protected]>
@danail-branekov have you actually seen this (EINTR from ioutil.WriteFile) happening? I'm asking because Golang 1.14 release notes say
and from this I would assume that if you're not using e.g. the write(2) syscall directly, Golang will either handle EINTR for you or avoid it somehow (e.g. by locking the goroutine doing the write). The question is, is this a wrong assumption?
It seems that
But I don't know for sure.
Hello @kolyshkin Yes, we've been seeing this fail in our CI the past two weeks after an upgrade to Go 1.14. The tests we're running are failing on
Worth noting is that we've only seen this fail when writing to the memory cgroup.
@mnitchev thank you for the explanation! Reproduced locally, filed an issue with Golang: golang/go#38033
ping @opencontainers/runc-maintainers. We need this to unblock migration to Go 1.14.
Looks like it's a regression in Golang, but this workaround seems harmless.
Do you plan on releasing with this change soon?
I assume we need changes like this in more places before shipping the next release?
It's probably worth taking a look, but from our perspective, we've only seen this type of error in this specific place in the codebase (for context, our CI runs tests every 20 minutes using pretty much all the runc operations). It also looks like the fscommon package is used when writing to all the cgroups.
@cyphar What do you think about the next release?
After deploying a version of gvisor built with Go 1.14, we're seeing errors setting up cgroups (we manually run `runsc` via `runsc run`, which creates the cgroup). This turns out to be a known issue with Go: golang/go#38033. Given that the [fix won't be backported](golang/go#39026 (comment)), we should retry writes that may fail with EINTR. This is also what runc does: opencontainers/runc#2258 FUTURE_COPYBARA_INTEGRATE_REVIEW=#3102 from stripe:andrew/cgroup-eintr 079123b PiperOrigin-RevId: 320183771
After deploying a version of gvisor built with Go 1.14, we're seeing errors setting up cgroups (we manually run `runsc` via `runsc run`, which creates the cgroup). This turns out to be a known issue with Go: golang/go#38033. Given that the [fix won't be backported](golang/go#39026 (comment)), we should retry writes that may fail with EINTR. This is also what runc does: opencontainers/runc#2258 FUTURE_COPYBARA_INTEGRATE_REVIEW=#3102 from stripe:andrew/cgroup-eintr 079123b PiperOrigin-RevId: 323575152
Golang 1.14 introduces asynchronous preemption, which results in
applications getting frequent EINTR (syscall interrupted) errors when
invoking slow syscalls, e.g. when writing to cgroup files.
As writing to cgroups is idempotent, it is safe to retry writing to the
file whenever the write syscall is interrupted.
Note that this PR also bumps the golang version in the test image Dockerfile so that the fix can be verified.