seccomp: switch default to ENOSYS #573

Merged
merged 2 commits into from
Jun 15, 2021

Conversation

giuseppe
Member

add the currently blocked syscalls to a deny-list and switch the default to ENOSYS.

Signed-off-by: Giuseppe Scrivano [email protected]

@rhatdan
Member

rhatdan commented May 24, 2021

Tests are not happy.

@@ -1,5 +1,6 @@
{
"defaultAction": "SCMP_ACT_ERRNO",
"defaultErrnoRet": 38,
Member

Should this be a magic number?

"vserver"
],
"action": "SCMP_ACT_ERRNO",
"errnoRet": 1
Member

Should this be a magic number?

Member Author

I think it is already too late to change errnoRet, since we already support it. Perhaps we could add an errnoRetString that can be used in place of errnoRet?

We could make defaultErrnoRet a string, but I don't see a way to convert an errno name to its number in the standard library, so we would need to do the conversion ourselves.

@giuseppe giuseppe force-pushed the default-errno-ENOSYS branch from 40d6e52 to dd4bab0 Compare May 24, 2021 13:16
@giuseppe
Member Author

tests are happy now

@rhatdan
Member

rhatdan commented May 24, 2021

LGTM
@containers/podman-maintainers PTAL

Member

@vrothberg vrothberg left a comment

Can you open a test PR against Podman?

We've regressed a number of times recently in c/common and caught the regressions while vendoring the release into Podman.

@giuseppe
Member Author

Can you open a test PR against Podman?

good idea. Opened a PR here: containers/podman#10451

@giuseppe giuseppe force-pushed the default-errno-ENOSYS branch 12 times, most recently from 5ab4b05 to fcaaf65 Compare June 4, 2021 13:56
@giuseppe
Member Author

giuseppe commented Jun 7, 2021

The containers/podman#10451 failures don't seem related to the seccomp changes; they are related to the new runc release. Some tests need to be fixed.

@rhatdan
Member

rhatdan commented Jun 14, 2021

@giuseppe Still working on this?

{
Names: []string{
"bdflush",
"clone3",

I think this has to be treated the same way as clone. Ideally, we would allow this system call because it is required for supporting certain forms of security hardening. But we definitely will need ENOSYS for clone3.

"io_pgetevents",
"io_uring_enter",
"io_uring_register",
"io_uring_setup",

io_uring_* should be ENOSYS, too.

"pciconfig_write",
"pkey_alloc",
"pkey_free",
"pkey_mprotect",

The pkey_* functions should really be permitted because they enable security hardening. Otherwise, ENOSYS, not EPERM, is needed for them.

"pkey_alloc",
"pkey_free",
"pkey_mprotect",
"rseq",

I do not see any harm in permitting rseq. In the future, it will be an important performance enhancement for some workloads.

"io_uring_setup",
"kexec_file_load",
"kexec_load",
"membarrier",

membarrier should default to ENOSYS if it can't be permitted.

"get_mempolicy",
"mbind",
"set_mempolicy",
},

Not sure why those are separate from move_pages and migrate_pages above?

Member Author

These fail with EPERM only when the container is not granted CAP_SYS_NICE. Otherwise seccomp won't block them.

@fweimer-rh fweimer-rh Jun 15, 2021

Hmm. move_pages and migrate_pages seem to have similar constraints. I also believe that these system calls do not require CAP_SYS_NICE in all cases. See the discussion of the MPOL_MF_MOVE_ALL flag in the manual pages.

Member Author

do you suggest enabling them in any case?

As far as I know, they are similar to madvise and not really privileged, so I think they can be enabled.

Member Author

thanks. Added a patch to always allow them
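
For readers following the thread, here is a sketch of the capability-dependent behaviour that was in place before that patch, where the memory-policy syscalls are allowed with CAP_SYS_NICE and fail with EPERM without it. The `rule` type, its field names, and the string-valued actions are hypothetical simplifications, and the include/exclude semantics shown are an assumption about how such filters are evaluated.

```go
package main

import "fmt"

// rule is a hypothetical, simplified stand-in for one entry of a seccomp
// profile. The field names are illustrative, not the containers/common types.
type rule struct {
	names       []string
	action      string   // e.g. "allow" or "errno:EPERM"
	includeCaps []string // rule applies only if all of these are granted
	excludeCaps []string // rule is skipped if any of these is granted
}

// applies reports whether the rule is active for the given capability set.
func (r rule) applies(caps map[string]bool) bool {
	for _, c := range r.includeCaps {
		if !caps[c] {
			return false
		}
	}
	for _, c := range r.excludeCaps {
		if caps[c] {
			return false
		}
	}
	return true
}

// actionFor returns the action of the first active rule naming the syscall,
// or def when no active rule matches.
func actionFor(name string, rules []rule, caps map[string]bool, def string) string {
	for _, r := range rules {
		if !r.applies(caps) {
			continue
		}
		for _, n := range r.names {
			if n == name {
				return r.action
			}
		}
	}
	return def
}

func main() {
	memPolicy := []string{"get_mempolicy", "mbind", "set_mempolicy"}
	rules := []rule{
		{names: memPolicy, action: "allow", includeCaps: []string{"CAP_SYS_NICE"}},
		{names: memPolicy, action: "errno:EPERM", excludeCaps: []string{"CAP_SYS_NICE"}},
	}
	none := map[string]bool{}
	nice := map[string]bool{"CAP_SYS_NICE": true}
	fmt.Println(actionFor("mbind", rules, none, "errno:ENOSYS"))
	fmt.Println(actionFor("mbind", rules, nice, "errno:ENOSYS"))
}
```

Always allowing these syscalls, as the patch does, would correspond to replacing both rules with a single unconditional "allow" rule.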

giuseppe added 2 commits June 14, 2021 19:08
Add support to specify the default errno return value.

The OCI runtime specs already have support for it, and both crun (>=
0.19) and runc (>= 1.0-rc95) have support for it.

Signed-off-by: Giuseppe Scrivano <[email protected]>
add the currently blocked syscalls to a deny-list and switch the
default to ENOSYS.

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe giuseppe force-pushed the default-errno-ENOSYS branch from fcaaf65 to 81cd342 Compare June 14, 2021 17:09
@openshift-ci
Contributor

openshift-ci bot commented Jun 14, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

@giuseppe
Member Author

@fweimer-rh thanks for the suggestions. I agree with them and I'll implement them as a follow-up PR.

I am still struggling to get this in as some Podman CI tests are failing with the new ubuntu/fedora images needed for having runc-1.0-rc95. Progress here: containers/podman#10451

As soon as the tests pass, I will polish the Podman PR to update the images and we can merge this one

@giuseppe
Member Author

Podman tests are finally green. If we merge the PR, I will clean up the Podman PR to vendor a release of containers/common and we can then target the new changes to the seccomp policy.

@rhatdan
Member

rhatdan commented Jun 15, 2021

/lgtm

@giuseppe
Member Author

@fweimer-rh addressed your comments in #627, except for mbind, set_mempolicy, get_mempolicy, but I can add another patch to address it

openshift-merge-robot added a commit that referenced this pull request Jun 18, 2021
seccomp: tweak default profile (followup for #573)
buytenh added a commit to buytenh/ivykis that referenced this pull request May 16, 2024
Commit 491daf4 ("iv_fd_epoll: Add support for epoll_pwait2().")
added support for epoll_pwait2(), with a fallback to epoll_wait() in
case epoll_pwait2() is not supported by the kernel we are running on,
which would be indicated by epoll_pwait2() returning -ENOSYS.

Some reports (e.g. axoflow/axosyslog#85 ,
#33 (comment) )
suggest that some container technologies can cause -EPERM to be
returned for epoll_pwait2(), independently of whether or not
epoll_pwait2() is actually supported by the kernel we are running on,
and this trips us up because we don't currently handle -EPERM
gracefully, as we did not expect that we would have to do so.

Making system calls return -EPERM to indicate that they were filtered
out by a security policy framework seems somewhat dubious, especially
when considering the amount of application and user confusion generated
by system calls that are not documented as being able to fail with
-EPERM now suddenly being able to fail with -EPERM, but there is not
much we can do about this.

I would be against adding EPERM-as-ENOSYS fallbacks for every current
or future case where we handle ENOSYS, but:

1. it seems that this is the only case where this triggers;

2. upstream seems to agree that this EPERM behavior is a bug (see
   e.g. these links dug up by László Várady:
   containers/common#573 ,
   containers/podman#10337 ,
   opencontainers/runtime-spec#1087 ), so
   there will hopefully be no new cases of this in the future;

3. there's at least one container technology release (podman on
   CentOS 7) where this bug triggers and where the platform is
   sufficiently old to no longer be receiving updates, as pointed
   out by Balazs Scheidler, so this issue can't be fixed by users
   updating their container software.

Under these circumstances, adding a workaround on our end seems
reasonable, and this commit does so.

This issue was originally reported by @mstopa-splunk on GitHub.
Workaround originally by Balazs Scheidler.

Signed-off-by: Lennert Buytenhek <[email protected]>