Use seccomp instead of setsid() to workaround CVE-2017-5226 #150
The setsid() workaround of containers#143 is problematic because it breaks, for example, shell job control for bubblewrap instances. So instead we use a seccomp approach based on util-linux/util-linux@8e49250. However, since we don't want to pull any more dependencies into the setuid binary, we pre-compile the seccomp code during the build. If libseccomp is not available on your architecture, we still support the old fix with --disable-seccomp-tty-fix. This fixes containers#147.
Hmm. But this still leaves bwrap users on other arches with the setsid() behavior, and we'd then still have to document how to work around it in examples, etc. Note that util-linux reverted their seccomp change; the commit message doesn't explain in detail why, though.
I'm not opposed to this, though, if you think it's worth it (which I guess you do, having written the patch 😄)
It's definitely worth it. I've updated flatpak to bubblewrap 0.1.6 in master, and it is completely useless on the command line. Whatever you do, you end up with multiple processes reading from the terminal, and you have to start a new terminal to find which one to kill. It's a complete show-stopper, IMHO.
Are there any arches that don't have seccomp yet, though? If so, it's probably just a matter of time before they get it. Also, I don't see how you can work around any issues with this...
You can work around it by creating a new pty and having the shell use that as the controlling terminal. I don't have a handy one-liner for this, but e.g. running
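To make the pty idea above concrete, here is a minimal sketch, assuming Linux and a mounted devpts. The function name `child_adopts_new_pty` is hypothetical and is not part of bubblewrap. It only shows the step where a child starts a new session and takes a freshly created pty as its controlling terminal; it does not exec a shell.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a forked child managed to make a new
 * pty its controlling terminal, a small positive code if a step failed in
 * the child, and -1 if the setup in the parent failed. */
int child_adopts_new_pty(void)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0)
        return -1;
    if (grantpt(master) < 0 || unlockpt(master) < 0) {
        close(master);
        return -1;
    }
    const char *slave_path = ptsname(master);
    if (slave_path == NULL) {
        close(master);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(master);
        return -1;
    }
    if (pid == 0) {
        close(master);
        /* New session: the child drops the caller's controlling terminal. */
        if (setsid() < 0)
            _exit(1);
        int slave = open(slave_path, O_RDWR);
        if (slave < 0)
            _exit(2);
        /* Explicitly make the pty slave the controlling terminal. */
        if (ioctl(slave, TIOCSCTTY, 0) < 0)
            _exit(3);
        /* The child's process group should now be the foreground group. */
        _exit(tcgetpgrp(slave) == getpgrp() ? 0 : 4);
    }

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    close(master);

    return (r == pid && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

A usable workaround would additionally have to copy bytes between the original terminal and the pty master and keep terminal modes in sync, which this sketch leaves out.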
You can work around the
For instance, you can't do
We could address the Ctrl-C UX by having a process outside of the container that acts as a lifecycle bind to init inside the container. Basically, both watch a pipe, and if one exits, the other does too.
That's just a single example, though. The general issue is that it doesn't behave like a UNIX command.
I.e., it doesn't get SIGSTOP when it writes to the tty while backgrounded, etc.
if test "x$enable_seccomp_tty_fix" = "xno"; then
AC_DEFINE([DISABLE_SECCOMP], [1],
[Define if using seccomp])
if not using
generate_seccomp_LDADD = $(LIBSECCOMP_LIBS) $(SELINUX_LIBS)

seccomp-filter.h: generate-seccomp$(EXEEXT)
	./generate-seccomp$(EXEEXT) > seccomp-filter.h
Does this work in a cross-compilation scenario? Offhand it seems we'd generate rules for the wrong arch?
Well, it will fail to cross-compile because the built generate-seccomp will fail to run.
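For readers unfamiliar with the build-time approach under discussion, the following is a rough sketch of what a generator in the spirit of `generate-seccomp` could look like. It is not the PR's code: the PR builds its rules with libseccomp, whereas this sketch hand-writes a small classic BPF program so it needs no extra library. The names `tiocsti_filter`, `emit_filter_header`, and the emitted array name `seccomp_filter` are hypothetical, and the filter assumes a little-endian layout and omits the architecture check a real filter should have.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>

/* Hand-written stand-in for the rules the PR builds with libseccomp:
 * ioctl(fd, TIOCSTI, ...) fails with EPERM, everything else is allowed. */
static const struct sock_filter tiocsti_filter[] = {
    /* Load the syscall number. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* Not ioctl: skip three instructions, landing on ALLOW. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_ioctl, 0, 3),
    /* Load the low 32 bits of argument 1 (little-endian assumption). */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, args[1])),
    /* Not TIOCSTI: skip one instruction, landing on ALLOW. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, TIOCSTI, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Write the filter as a C array initializer; returns the instruction count. */
int emit_filter_header(FILE *out)
{
    size_t n = sizeof tiocsti_filter / sizeof tiocsti_filter[0];

    fprintf(out, "/* Generated file, do not edit. */\n");
    fprintf(out, "static const struct sock_filter seccomp_filter[] = {\n");
    for (size_t i = 0; i < n; i++) {
        const struct sock_filter *f = &tiocsti_filter[i];
        fprintf(out, "  { 0x%04x, %u, %u, 0x%08x },\n",
                (unsigned)f->code, (unsigned)f->jt, (unsigned)f->jf,
                (unsigned)f->k);
    }
    fprintf(out, "};\n");
    return (int)n;
}
```

A `main` that calls `emit_filter_header(stdout)` would fit the make rule quoted above. Note that `SYS_ioctl` and `TIOCSTI` come from whichever headers the generator is compiled against, and the generator has to run on the build machine, which is where the cross-compilation concern comes from.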
What UNIX/command-line issues wouldn't be addressed by having an "outer init"? (I realized there's no need for a pipe: the "outer init" could just have the "container init" as a child process, and the "outer init" watches it via SIGCHLD, and container init should use
Eh, SIGSTOP. Yeah, true. We'd have to go to proxying read/write and signals across. I guess really we'd end up with a userspace pty emulator, just without
We could have an outer init that proxies stdin/stdout/stderr and signals (SIGSTOP/SIGCONT, etc.), but it would never quite get things right, because e.g. /dev/tty will not work properly inside it. It also seems very complex and easy to get wrong.
I think for
The scope for "outer init" would be a lot smaller if we specified that we didn't support ttys (e.g.
We still have to proxy things like canonical/raw mode, etc., no?
Although, "outer init + setsid()" still seems viable for the basic "ctrl-c'able" case. Not sure how many people would initially notice the SIGSTOP/backgroundable bit.
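The basic "outer init + setsid()" case could look roughly like the sketch below. It is only an illustration under simple assumptions, not bubblewrap code: `run_with_outer_init`, `forward_signal`, and `child_pid` are hypothetical names, only SIGINT and SIGTERM are forwarded, and nothing is done about SIGSTOP/SIGCONT or terminal I/O, which is exactly the part the thread says is hard.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the inner process; set by the parent, read by the signal handler. */
static volatile sig_atomic_t child_pid = 0;

static void forward_signal(int signo)
{
    if (child_pid > 0)
        kill((pid_t)child_pid, signo);
}

/* Hypothetical helper: run argv in its own session and act as the outer
 * process that relays Ctrl-C. Returns the exit code, 128 + signal number
 * if the child was killed by a signal, or -1 on error. */
int run_with_outer_init(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Inner side: the setsid() workaround, detaching from the
         * caller's controlling terminal. */
        setsid();
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Outer side: stays in the terminal's foreground process group, so it
     * receives Ctrl-C and passes it on to the detached child. */
    child_pid = pid;
    struct sigaction sa, old_int, old_term;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = forward_signal;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, &old_int);
    sigaction(SIGTERM, &sa, &old_term);

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);

    /* Restore the caller's signal dispositions. */
    sigaction(SIGINT, &old_int, NULL);
    sigaction(SIGTERM, &old_term, NULL);
    child_pid = 0;

    if (r < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Because the child is no longer in the terminal's foreground process group, the kernel's job-control signals never reach it on their own, so anything beyond relaying a few signals ends up needing the proxying discussed above.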
OK, so... why explicitly generate the rules at build time? (And do you agree it breaks cross-compilation?) Is there other precedent for this? systemd, for example, doesn't seem to do this.
Because I don't want to load what is essentially a compiler into a setuid binary.
Here is that code: (or an xterm clone with lighter-weight dependencies)
It is a compiler of sorts, but it's operating on known, static trusted input. I'm not sure it's worth breaking cross builds for this, though I admit to not actively maintaining a cross-built system right now myself.
The last fixup drops the generation at build time.
One other concern that comes to mind: we're still vulnerable without the ptrace-after-seccomp patches, which are in 4.8 but probably not backported to kernels like the CentOS7/RHEL one.
For flatpak we disable ptrace by default unless you run with -d or grant "developer" permissions.
This is also disabled with ptrace, but I believe that by itself should be OK, because the ptrace-after-seccomp issue arises only once ptrace has been enabled, no?
So...what one could argue here is we should really have a Conceptually, the CVE then isn't in bwrap - it's in any program which is using bwrap with a pty connected to separate security domains. There are other ways to fix the issue externally - for example, just don't provide a pty to the child process - if it's a background daemon, connect it to the e.g. systemd journal. Or, per above, install a seccomp filter in your software. One argument that the Another really important example is we support providing So there's all of this "best practice" stuff that needs to live outside. Hence then, why don't we add a command line option
That sounds good to me, although
The reason I phrased it as That said...I think I'm convincing myself we should do this:
By 1), do you mean invert its meaning? Or rely on apps to use setsid(1)?
I'd prefer That's consistent with how bwrap does filesystems (if you don't do any
Have you tried to use bwrap with setsid, though? It's extremely easy to get into very confusing situations where the terminal is essentially unusable.
In discussion in containers#150 it was noted that most of the bwrap command line tends towards "closed by default, request open". But the `--unshare` options are inverse. Now, I suspect in practice there's only one namespace that most users will care about, which is the network namespace. There are very useful programs to build on both cases. I think everything else (pid, ipc, uts) people will want as a group. Any cases that are unusual enough to want to turn one of them off can still fall back to the previous bwrap behavior of explicitly unsharing. They're likely to be security sensitive enough that if a new namespace were added, it would make sense to evaluate the tool. But again I think most users will want all namespaces, with the network one as a primary "enable it" option.
So, the reason I dislike --disable-setsid is that, ignoring the CVE for the moment, we're introducing a new default that changes the semantics of the sandbox. Suddenly we take something that worked (such as flatpak) and essentially break it if you update bubblewrap (because, with bubblewrap 0.1.6, doing development with flatpak is basically broken). I think most users of bubblewrap want as-secure-as-possible, but don't break my app. However, this is really, really hard to guarantee for random apps, so the only guaranteed way is to add sandboxing features as default open, ask for limit. For example, I can add a --disable-setsid to bubblewrap, and then use that from flatpak. However, this means the next flatpak has to require the newer bubblewrap (it will fail with the old one because of the unknown switch). My plan was to do a stable bugfix-only release of flatpak and hope stable distros (like Debian 9) could just pick it up. However, that may be foiled by having to rely on a new bubblewrap that adds new features (the switch).
@cgwalters #154 has an approach like you suggested above instead. Then I'll add the seccomp rule to flatpak instead.
☔ The latest upstream changes (presumably a6e1516) made this pull request unmergeable. Please resolve the merge conflicts.
In discussion in #150 it was noted that most of the bwrap command line tends towards "closed by default, request open". But the `--unshare` options are inverse. Now, I suspect in practice there's only one namespace that most users will care about, which is the network namespace. There are very useful programs to build on both cases. I think everything else (pid, ipc, uts) people will want as a group. Any cases that are unusual enough to want to turn one of them off can still fall back to the previous bwrap behavior of explicitly unsharing. They're likely to be security sensitive enough that if a new namespace were added, it would make sense to evaluate the tool. But again I think most users will want all namespaces, with the network one as a primary "enable it" option. Closes: #153 Approved by: alexlarsson
It's still a bit of an open question to me whether we want to add any seccomp to bwrap itself.
If this gets picked up later, please note that it's not just
}

if (seccomp_rule_add (ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(ioctl), 1,
SCMP_A1(SCMP_CMP_EQ, (int)TIOCSTI)) < 0)
This would need SCMP_CMP_MASKED_EQ rather than SCMP_CMP_EQ to not re-introduce CVE-2019-10063. Sending TIOCSTI + 0x100000000 (eight zeros) can be used for a test.
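The reason the comparison operator matters can be shown with a small model. This sketch does not call libseccomp; `cmp_eq`, `cmp_masked_eq`, and `kernel_ioctl_cmd` are hypothetical helpers that mimic, respectively, a full 64-bit equality check, a masked equality check, and the fact that the kernel's ioctl entry point takes the command as a 32-bit `unsigned int`. A 64-bit platform is assumed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Model of an SCMP_CMP_EQ style rule: the whole 64-bit syscall argument
 * has to equal the datum. */
bool cmp_eq(uint64_t arg, uint64_t datum)
{
    return arg == datum;
}

/* Model of an SCMP_CMP_MASKED_EQ style rule: only the bits selected by
 * the mask are compared. */
bool cmp_masked_eq(uint64_t arg, uint64_t mask, uint64_t datum)
{
    return (arg & mask) == (datum & mask);
}

/* What the ioctl implementation ends up acting on: the command is an
 * unsigned int, so the upper 32 bits of the register are discarded. */
uint32_t kernel_ioctl_cmd(uint64_t raw_arg)
{
    return (uint32_t)raw_arg;
}
```

With `raw = (uint64_t)TIOCSTI + 0x100000000`, `kernel_ioctl_cmd(raw)` is still `TIOCSTI`, `cmp_eq(raw, TIOCSTI)` is false (so an EQ rule would let the call through), and `cmp_masked_eq(raw, 0xFFFFFFFF, TIOCSTI)` is true. In libseccomp terms the masked form of the rule would presumably be written as `SCMP_A1(SCMP_CMP_MASKED_EQ, 0xFFFFFFFFu, (int)TIOCSTI)`, with the mask given before the value.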