-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal] Use runc-dmz to defeat CVE-2019-5736 #3983
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -57,7 +57,12 @@ endif | |
|
||
.DEFAULT: runc | ||
|
||
runc: | ||
runc: runc-dmz runc-bin | ||
|
||
runc-dmz: | ||
$(CC) -o runc-dmz -static contrib/cmd/runc-dmz/runc-dmz.c | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we are going to make this a hard dependency, it should be placed out of “contrib” |
||
|
||
runc-bin: | ||
$(GO_BUILD) -o runc . | ||
|
||
all: runc recvtty sd-helper seccompagent fs-idmap | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,11 @@ | ||||||
#include <unistd.h> | ||||||
|
||||||
extern char **environ; | ||||||
|
||||||
int main(int argv, char **args) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
{ | ||||||
if (argv > 0) { | ||||||
return execve(args[0], args, environ); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am wondering about ensuring the first argument, i.e. the executed binary is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The first argument is not There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh OK! Thank you for the precision! I still need to polish my There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If you try to run the C binary directly, I think the first argument of args would be |
||||||
} | ||||||
return 0; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably this should be non-zero |
||||||
} | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess we can optimize the binary footprint by eliminating glibc and executing “syscall” instruction directly, but that may happen in follow-up PRs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we build this binary with musl, you get a 13KB binary. With glibc it's 1.1MB. We will need to adjust our build system for this -- but there is a separate issue that distributions might find building a single binary with musl quite difficult. Speaking from the SUSE side, I don't think we even package musl for SLES so we would need to build with glibc (meaning the binary would be 1.1MB in size -- almost 100 times larger). As you say, this can be done in a future PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rewriting this in asm will be much easier than complicating libc dependency There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure if there is a silver bullet here:
Maybe using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sad :(. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. https://gist.github.com/Zheaoli/f37bc1fb04917fdfac36d644ee69f7e9 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here is my contribution for From 95defc03b7ec526c9c966735ebeb5bac5a9a34a0 Mon Sep 17 00:00:00 2001
From: Francis Laniel <[email protected]>
Date: Wed, 30 Aug 2023 13:06:27 +0200
Subject: [PATCH] runc-dmz: Add arm64 assembly code.
Signed-off-by: Francis Laniel <[email protected]>
---
contrib/cmd/runc-dmz/runc-dmz-arm64.S | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
create mode 100644 contrib/cmd/runc-dmz/runc-dmz-arm64.S
diff --git a/contrib/cmd/runc-dmz/runc-dmz-arm64.S b/contrib/cmd/runc-dmz/runc-dmz-arm64.S
new file mode 100644
index 00000000..25387e61
--- /dev/null
+++ b/contrib/cmd/runc-dmz/runc-dmz-arm64.S
@@ -0,0 +1,25 @@
+.equ EXECVE_SYSCALL_NUMBER, 221
+.equ EXIT_SYSCALL_NUMBER, 93
+
+.text
+.global _start
+_start:
+ ldr x3, [sp] // *sp contains argc, so x3 = argc
+ cmp x3, #0
+ bgt execve
+ mov x0, #0
+
+exit:
+ mov x8, EXIT_SYSCALL_NUMBER
+ svc #0
+
+execve:
+ add x1, sp, #16 // x1 = &argv[1]
+ ldr x0, [x1] // x0 = argv[1]
+ # argv[argc] is NULL and environment starts at argv[argc + 1].
+ mov x4, #8 // x4 = 8
+ mul x3, x3, x4 // x3 = (argc + 1) * 8
+ add x2, x1, x3 // x3 = &argv[argc + 1], i.e. environment.
+ mov x8, EXECVE_SYSCALL_NUMBER
+ svc #0
+ b exit
--
2.34.1 I tested it both through emulation and virtualization: francis@pwmachine:~/Codes/kinvolk/runc/contrib/cmd/runc-dmz$ aarch64-linux-gnu-as -g runc-dmz-arm64.S -o runc-dmz-arm64.o (95defc03...) %
francis@pwmachine:~/Codes/kinvolk/runc/contrib/cmd/runc-dmz$ aarch64-linux-gnu-ld -static runc-dmz-arm64.o (95defc03...) %
francis@pwmachine:~/Codes/kinvolk/runc/contrib/cmd/runc-dmz$ du -sh a.out (95defc03...) %
4,0K a.out
francis@pwmachine:~/Codes/kinvolk/runc/contrib/cmd/runc-dmz$ file a.out (95defc03...) %
a.out: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped
francis@pwmachine:~/Codes/kinvolk/runc/contrib/cmd/runc-dmz$ qemu-aarch64-static ./a.out /bin/ls -al (95defc03...) %
total 32
drwxrwxr-x 2 francis francis 4096 août 30 13:09 .
drwxrwxr-x 7 francis francis 4096 août 29 17:23 ..
-rw-rw-r-- 1 francis francis 1213 août 30 13:06 0001-runc-dmz-Add-arm64-assembly-code.patch
-rwxrwxr-x 1 francis francis 1832 août 30 13:09 a.out
-rw-rw-r-- 1 francis francis 2168 août 30 13:09 runc-dmz-arm64.o
-rw-rw-r-- 1 francis francis 524 août 30 13:06 runc-dmz-arm64.S
-rw-rw-r-- 1 francis francis 168 août 29 17:23 runc-dmz.c
-rw-rw-r-- 1 francis francis 700 août 29 18:59 runc-dmz.s
# Inside arm64 VM:
root@vm-arm64:~/share/kinvolk/runc/contrib/cmd/runc-dmz# lscpu | head -1
Architecture: aarch64
root@vm-arm64:~/share/kinvolk/runc/contrib/cmd/runc-dmz# ./a.out /bin/ls -al
total 32
drwxrwxr-x 2 1000 1000 4096 Aug 30 11:09 .
drwxrwxr-x 7 1000 1000 4096 Aug 29 15:23 ..
-rw-rw-r-- 1 1000 1000 1213 Aug 30 11:06 0001-runc-dmz-Add-arm64-assembly-code.patch
-rwxrwxr-x 1 1000 1000 1832 Aug 30 11:09 a.out
-rw-rw-r-- 1 1000 1000 524 Aug 30 11:06 runc-dmz-arm64.S
-rw-rw-r-- 1 1000 1000 2168 Aug 30 11:09 runc-dmz-arm64.o
-rw-rw-r-- 1 1000 1000 168 Aug 29 15:23 runc-dmz.c
-rw-rw-r-- 1 1000 1000 700 Aug 29 16:59 runc-dmz.s Feel free to apply the patch! In any case, it was funny to write arm64 assembly from scratch as it was a long time I did not do that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think maybe there still have some design problems for |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,16 @@ func init() { | |
// Figure out path to get-images.sh. Note it won't work | ||
// in case the compiled test binary is moved elsewhere. | ||
_, ex, _, _ := runtime.Caller(0) | ||
// make and copy runc-dmg | ||
rootDir, err := filepath.Abs(filepath.Join(filepath.Dir(ex), "..", "..")) | ||
if err != nil { | ||
panic(err) | ||
} | ||
nowPath := getExeDir() | ||
dmzMake, err := exec.Command("gcc", "-o", filepath.Join(nowPath, "integration.test-dmz"), "-static", filepath.Join(rootDir, "contrib/cmd/runc-dmz/runc-dmz.c")).CombinedOutput() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This shouldn’t be here, and the dmz binary should be just in the PATH? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we should just embed the dmz binary into the main binary |
||
if err != nil { | ||
panic(fmt.Errorf("make runc-dmz error %w (output: %s)", err, dmzMake)) | ||
} | ||
getImages, err := filepath.Abs(filepath.Join(filepath.Dir(ex), "..", "..", "tests", "integration", "get-images.sh")) | ||
if err != nil { | ||
panic(err) | ||
|
@@ -48,6 +58,15 @@ func init() { | |
} | ||
} | ||
|
||
func getExeDir() string { | ||
exePath, err := os.Executable() | ||
if err != nil { | ||
panic(err) | ||
} | ||
res, _ := filepath.EvalSymlinks(filepath.Dir(exePath)) | ||
return res | ||
} | ||
|
||
func ptrInt(v int) *int { | ||
return &v | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -132,7 +132,7 @@ int memfd_create(const char *name, unsigned int flags) | |
#endif | ||
|
||
#define CLONED_BINARY_ENV "_LIBCONTAINER_CLONED_BINARY" | ||
#define RUNC_MEMFD_COMMENT "runc_cloned:/proc/self/exe" | ||
#define RUNC_MEMFD_COMMENT "runc_cloned:runc-dmz" | ||
/* | ||
* There are newer memfd seals (such as F_SEAL_FUTURE_WRITE and F_SEAL_EXEC), | ||
* which we use opportunistically. However, this set is the original set of | ||
|
@@ -142,162 +142,6 @@ int memfd_create(const char *name, unsigned int flags) | |
#define RUNC_MEMFD_MIN_SEALS \ | ||
(F_SEAL_SEAL | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) | ||
|
||
static void *must_realloc(void *ptr, size_t size) | ||
{ | ||
void *old = ptr; | ||
do { | ||
ptr = realloc(old, size); | ||
} while (!ptr); | ||
return ptr; | ||
} | ||
|
||
/* | ||
* Verify whether we are currently in a self-cloned program (namely, is | ||
* /proc/self/exe a memfd). F_GET_SEALS will only succeed for memfds (or rather | ||
* for shmem files), and we want to be sure it's actually sealed. | ||
*/ | ||
static int is_self_cloned(void) | ||
{ | ||
int fd, seals = 0, is_cloned = false; | ||
struct stat statbuf = { }; | ||
struct statfs fsbuf = { }; | ||
|
||
fd = open("/proc/self/exe", O_RDONLY | O_CLOEXEC); | ||
if (fd < 0) { | ||
write_log(ERROR, "cannot open runc binary for reading: open /proc/self/exe: %m"); | ||
return -ENOTRECOVERABLE; | ||
} | ||
|
||
/* | ||
* Is the binary a fully-sealed memfd? We don't need CLONED_BINARY_ENV for | ||
* this, because you cannot write to a sealed memfd no matter what. | ||
*/ | ||
seals = fcntl(fd, F_GET_SEALS); | ||
if (seals >= 0) { | ||
write_log(DEBUG, "checking /proc/self/exe memfd seals: 0x%x", seals); | ||
is_cloned = (seals & RUNC_MEMFD_MIN_SEALS) == RUNC_MEMFD_MIN_SEALS; | ||
if (is_cloned) | ||
goto out; | ||
} | ||
|
||
/* | ||
* All other forms require CLONED_BINARY_ENV, since they are potentially | ||
* writeable (or we can't tell if they're fully safe) and thus we must | ||
* check the environment as an extra layer of defence. | ||
*/ | ||
if (!getenv(CLONED_BINARY_ENV)) { | ||
is_cloned = false; | ||
goto out; | ||
} | ||
|
||
/* | ||
* Is the binary on a read-only filesystem? We can't detect bind-mounts in | ||
* particular (in-kernel they are identical to regular mounts) but we can | ||
* at least be sure that it's read-only. In addition, to make sure that | ||
* it's *our* bind-mount we check CLONED_BINARY_ENV. | ||
*/ | ||
if (fstatfs(fd, &fsbuf) >= 0) | ||
is_cloned |= (fsbuf.f_flags & MS_RDONLY); | ||
|
||
/* | ||
* Okay, we're a tmpfile -- or we're currently running on RHEL <=7.6 | ||
* which appears to have a borked backport of F_GET_SEALS. Either way, | ||
* having a file which has no hardlinks indicates that we aren't using | ||
* a host-side "runc" binary and this is something that a container | ||
* cannot fake (because unlinking requires being able to resolve the | ||
* path that you want to unlink). | ||
*/ | ||
if (fstat(fd, &statbuf) >= 0) | ||
is_cloned |= (statbuf.st_nlink == 0); | ||
|
||
out: | ||
close(fd); | ||
return is_cloned; | ||
} | ||
|
||
/* Read a given file into a new buffer, and providing the length. */ | ||
static char *read_file(char *path, size_t *length) | ||
{ | ||
int fd; | ||
char buf[4096], *copy = NULL; | ||
|
||
if (!length) | ||
return NULL; | ||
|
||
fd = open(path, O_RDONLY | O_CLOEXEC); | ||
if (fd < 0) | ||
return NULL; | ||
|
||
*length = 0; | ||
for (;;) { | ||
ssize_t n; | ||
|
||
n = read(fd, buf, sizeof(buf)); | ||
if (n < 0) | ||
goto error; | ||
if (!n) | ||
break; | ||
|
||
copy = must_realloc(copy, (*length + n) * sizeof(*copy)); | ||
memcpy(copy + *length, buf, n); | ||
*length += n; | ||
} | ||
close(fd); | ||
return copy; | ||
|
||
error: | ||
close(fd); | ||
free(copy); | ||
return NULL; | ||
} | ||
|
||
/* | ||
* A poor-man's version of "xargs -0". Basically parses a given block of | ||
* NUL-delimited data, within the given length and adds a pointer to each entry | ||
* to the array of pointers. | ||
*/ | ||
static int parse_xargs(char *data, int data_length, char ***output) | ||
{ | ||
int num = 0; | ||
char *cur = data; | ||
|
||
if (!data || *output != NULL) | ||
return -1; | ||
|
||
while (cur < data + data_length) { | ||
num++; | ||
*output = must_realloc(*output, (num + 1) * sizeof(**output)); | ||
(*output)[num - 1] = cur; | ||
cur += strlen(cur) + 1; | ||
} | ||
(*output)[num] = NULL; | ||
return num; | ||
} | ||
|
||
/* | ||
* "Parse" out argv from /proc/self/cmdline. | ||
* This is necessary because we are running in a context where we don't have a | ||
* main() that we can just get the arguments from. | ||
*/ | ||
static int fetchve(char ***argv) | ||
{ | ||
char *cmdline = NULL; | ||
size_t cmdline_size; | ||
|
||
cmdline = read_file("/proc/self/cmdline", &cmdline_size); | ||
if (!cmdline) | ||
goto error; | ||
|
||
if (parse_xargs(cmdline, cmdline_size, argv) <= 0) | ||
goto error; | ||
|
||
return 0; | ||
|
||
error: | ||
free(cmdline); | ||
return -EINVAL; | ||
} | ||
|
||
enum { | ||
EFD_NONE = 0, | ||
EFD_MEMFD, | ||
|
@@ -499,12 +343,20 @@ static int clone_binary(void) | |
struct stat statbuf = { }; | ||
size_t sent = 0; | ||
int fdtype = EFD_NONE; | ||
char runcpath[PATH_MAX] = { 0 }; | ||
char dmzpath[PATH_MAX] = { 0 }; | ||
|
||
execfd = make_execfd(&fdtype); | ||
if (execfd < 0 || fdtype == EFD_NONE) | ||
return -ENOTRECOVERABLE; | ||
|
||
binfd = open("/proc/self/exe", O_RDONLY | O_CLOEXEC); | ||
if (readlink("/proc/self/exe", runcpath, PATH_MAX) < 1) | ||
goto error; | ||
|
||
if (snprintf(dmzpath, PATH_MAX, "%s%s", runcpath, "-dmz") < 0) | ||
goto error; | ||
|
||
binfd = open(dmzpath, O_RDONLY | O_CLOEXEC); | ||
if (binfd < 0) | ||
goto error; | ||
|
||
|
@@ -543,24 +395,19 @@ extern char **environ; | |
int ensure_cloned_binary(void) | ||
{ | ||
int execfd; | ||
char **argv = NULL; | ||
|
||
/* Check that we're not self-cloned, and if we are then bail. */ | ||
int cloned = is_self_cloned(); | ||
if (cloned > 0 || cloned == -ENOTRECOVERABLE) | ||
return cloned; | ||
|
||
if (fetchve(&argv) < 0) | ||
return -EINVAL; | ||
|
||
execfd = clone_binary(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The binary text should be embedded into the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, I think you can take this if you think this proposal worth to try. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, I'll open a separate PR with this code and combine it with the cloned-binary patch. Thanks for working on this, I like the idea a lot! |
||
if (execfd < 0) | ||
return -EIO; | ||
|
||
if (putenv(CLONED_BINARY_ENV "=1")) | ||
char envString[PATH_MAX] = { 0 }; | ||
if (sprintf(envString, "%d", execfd) < 0) | ||
goto error; | ||
|
||
if (setenv("_LIBCONTAINER_DMZFD", envString, 1)) | ||
goto error; | ||
|
||
fexecve(execfd, argv, environ); | ||
return 0; | ||
error: | ||
close(execfd); | ||
return -ENOEXEC; | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,8 @@ import ( | |
"io" | ||
"os" | ||
"os/exec" | ||
"path/filepath" | ||
"runtime" | ||
"strings" | ||
"testing" | ||
|
||
|
@@ -16,6 +18,29 @@ import ( | |
"golang.org/x/sys/unix" | ||
) | ||
|
||
func init() { | ||
_, ex, _, _ := runtime.Caller(0) | ||
// make and copy runc-dmg | ||
rootDir, err := filepath.Abs(filepath.Join(filepath.Dir(ex), "..", "..")) | ||
if err != nil { | ||
panic(err) | ||
} | ||
nowPath := getExeDir() | ||
dmzMake, err := exec.Command("gcc", "-o", filepath.Join(nowPath, "nsenter.test-dmz"), "-static", filepath.Join(rootDir, "contrib/cmd/runc-dmz/runc-dmz.c")).CombinedOutput() | ||
if err != nil { | ||
panic(fmt.Errorf("make runc-dmz error %w (output: %s)", err, dmzMake)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The dmz binary should be just preinstalled in the PATH There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we should just embed the dmz binary into the main binary |
||
} | ||
} | ||
|
||
func getExeDir() string { | ||
exePath, err := os.Executable() | ||
if err != nil { | ||
panic(err) | ||
} | ||
res, _ := filepath.EvalSymlinks(filepath.Dir(exePath)) | ||
return res | ||
} | ||
|
||
func TestNsenterValidPaths(t *testing.T) { | ||
args := []string{"nsenter-exec"} | ||
parent, child := newPipe(t) | ||
|
@@ -24,6 +49,7 @@ func TestNsenterValidPaths(t *testing.T) { | |
// join pid ns of the current process | ||
fmt.Sprintf("pid:/proc/%d/ns/pid", os.Getpid()), | ||
} | ||
fmt.Printf("=========os.Args[0] %s\n", os.Args[0]) | ||
cmd := &exec.Cmd{ | ||
Path: os.Args[0], | ||
Args: args, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And CFLAGS