Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iv_fd_poll: epoll_pwait2() returns EPERM on CentOS7 with unprivileged… #33

Closed
wants to merge 1 commit into from

Conversation

bazsi
Copy link
Contributor

@bazsi bazsi commented May 8, 2024

Under RHEL7 running an unprivileged container with a 3.10 based kernel, syslog-ng gets an abort, because the automatic fallback to epoll_pwait() does not work.

Because this is an unprivileged container and because on 3.10 epoll_pwait2() is an unknown system call, we get EPERM instead of ENOSYS.

See for example: axoflow/axosyslog#85 (comment)

… containers

Work this around by accepting EPERM as well.

Signed-off-by: Balazs Scheidler <[email protected]>
@buytenh
Copy link
Owner

buytenh commented May 9, 2024

Is this EPERM behavior a (mis)feature of podman? I would argue that it is a bug in the container technology to be returning EPERM for random system calls instead of ENOSYS -- what do you think?

@MrAnno
Copy link

MrAnno commented May 9, 2024

Probably this one:
containers/podman#10337
containers/common#573

@bazsi
Copy link
Contributor Author

bazsi commented May 9, 2024

I'd agree, this is on centos7 with an ancient podman. Wasn't reported in centos8, so probably fixed in later versions.

The issue is, this is never going to get fixed in that podman release.

And the workaround is to disable epoll completely and we will start with an abort without thatn which is kind of ugly.

I understand you don't want to merge this, but I'm not sure how to fix this. Don't want to use a local patch.

@bazsi
Copy link
Contributor Author

bazsi commented May 15, 2024

This came up again. I've tried to run this under Raspberry Pi in 32 bit, running Debian buster (old, I know) and the same thing happens. So it's not just RHEL7 that has this problem, but there's probably a range of kernel/docker combinations where epoll_pwait2() returns EPERM and no ENOSYS.

I am inclined to add a workaround somehow, as this generates pretty ugly SIGABRT-s at startup. Maybe the probing for epoll_pwait2() needs to be a bit more complex and done at auto detection path? would you be open to take such a patch?

buytenh added a commit that referenced this pull request May 16, 2024
Commit 491daf4 ("iv_fd_epoll: Add support for epoll_pwait2().")
added support for epoll_pwait2(), with a fallback to epoll_wait() in
case epoll_pwait2() is not supported by the kernel we are running on,
which would be indicated by epoll_pwait2() returning -ENOSYS.

Some reports (e.g. axoflow/axosyslog#85 ,
#33 (comment) )
suggest that some container technologies can cause -EPERM to be
returned for epoll_pwait2(), independently of whether or not
epoll_pwait2() is actually supported by the kernel we are running on,
and this trips us up because we don't currently handle -EPERM
gracefully, as we did not expect that we would have to do so.

Making system calls return -EPERM to indicate that they were filtered
out by a security policy framework seems somewhat dubious, especially
when considering the amount of application and user confusion generated
by system calls that are not documented as being able to fail with
-EPERM now suddenly being able to fail with -EPERM, but there is not
much we can do about this.

I would be against adding EPERM-as-ENOSYS fallbacks for every current
or future case where we handle ENOSYS, but:

1. it seems that this is the only case where this triggers;

2. upstream seems to agree that this EPERM behavior is a bug (see
   e.g. these links dug up by László Várady:
   containers/common#573 ,
   containers/podman#10337 ,
   opencontainers/runtime-spec#1087 ), so
   there will hopefully be no new cases of this in the future;

3. there's at least one container technology release (podman on
   CentOS 7) where this bug triggers and where the platform is
   sufficiently old to no longer be receiving updates, as pointed
   out by Balazs Scheidler, so this issue can't be fixed by users
   updating their container software.

Under these circumstances, adding a workaround on our end seems
reasonable, and this commit does so.

This issue was originally reported by @mstopa-splunk on GitHub.
Workaround originally by Balazs Scheidler.

Signed-off-by: Lennert Buytenhek <[email protected]>
@buytenh
Copy link
Owner

buytenh commented May 16, 2024

I wrote a commit message rationalising adding this workaround, and added a comment to the relevant code:

a98ca24

Does this look OK? If so, I can release this as v0.43.1.

@ikheifets-splunk
Copy link

@buytenh are you have a plans to release this fix? We also depending on you lib

@buytenh
Copy link
Owner

buytenh commented May 27, 2024

@buytenh are you have a plans to release this fix? We also depending on you lib

I was hoping that @bazsi could test and ACK this, given that he originally opened this PR.

@bazsi
Copy link
Contributor Author

bazsi commented May 28, 2024

This indeed looks good to me. I'll test this later today. Thanks a lot Lennert.

@bazsi bazsi closed this May 28, 2024
@bazsi bazsi reopened this May 28, 2024
Repository owner deleted a comment from bazsi May 28, 2024
@bazsi
Copy link
Contributor Author

bazsi commented Jun 1, 2024

I have got around to testing this, and I can confirm it fixes the issue for syslog-ng on my aging Raspberry 32 bit. While it crashed at startup before, it now starts up properly.

@buytenh
Copy link
Owner

buytenh commented Jun 1, 2024

Fix from #33 (comment) merged into master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants