Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASSERT os.c: found: /proc/pid/maps contains a zero-length VMA #599

Closed
derekbruening opened this issue Nov 28, 2014 · 6 comments
Closed

ASSERT os.c: found: /proc/pid/maps contains a zero-length VMA #599

derekbruening opened this issue Nov 28, 2014 · 6 comments

Comments

@derekbruening
Copy link
Contributor

From [email protected] on November 04, 2011 09:53:11

This is the results summary from our linux_x64_lucid DR builder:

RESULTS

debug-external-32: build successful; no tests for this build
debug-external-64: build successful; no tests for this build
debug-internal-32: 92 tests passed, **** 5 tests failed: ****
code_api|security-linux.trampoline => Application security-linux.trampoline (17848). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-linux.stacktest => Application security-linux.stacktest (17843). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-common.codemod => Application security-common.codemod (17868). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-common.decode-bad-stack => Application security-common.decode-bad-stack (17870). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-common.TestMemProtChg => Application security-common.TestMemProtChg (17906). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
debug-internal-64: 84 tests passed, **** 9 tests failed: ****
code_api|linux.execve-null
code_api|linux.vfork
code_api|pthreads.pthreads
code_api|pthreads.ptsig
code_api|security-linux.trampoline => Application security-linux.trampoline (23781). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-common.codemod => Application security-common.codemod (23793). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-common.decode-bad-stack => Application security-common.decode-bad-stack (23791). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|security-common.TestMemProtChg => Application security-common.TestMemProtChg (23808). Internal Error Internal DynamoRIO Error: /mnt/data/b/build/slave/linux_lucid_x64_dr_checkout/drmemory/dynamorio/core/linux/os.c:7668 found
code_api|client.drmgr-test
release-external-32: build successful; no tests for this build
release-external-64: build successful; no tests for this build
vmsafe-debug-internal-32: build successful; no tests for this build
vps-debug-internal-32: build successful; no tests for this build

Many of the failures are caused by this assert where we can't find an address in the /proc/pid/maps file.

This doesn't reproduce on my workstation, which runs 2.6.38, but I reproduced it on the bot, which runs 2.6.32-25 under VMWare. I added code to DR to dump /proc/pid/maps at startup and before the assertion failure, and we get the following:

$ ./bin64/drrun -debug suite/tests/bin/security-linux.trampoline


<Starting application security-linux.trampoline (32402)>
00400000-00401000 r-xp 00000000 08:05 167114 /home/chrome-bot/dr/dynamorio/build/suite/tests/bin/security-linux.trampoline
00600000-00601000 r-xp 00000000 08:05 167114 /home/chrome-bot/dr/dynamorio/build/suite/tests/bin/security-linux.trampoline
00601000-00602000 rwxp 00001000 08:05 167114 /home/chrome-bot/dr/dynamorio/build/suite/tests/bin/security-linux.trampoline
01a67000-01a88000 rwxp 00000000 00:00 0 [heap]
41a02000-41a05000 ---p 00000000 00:00 0
41a05000-41a08000 rwxp 00000000 00:00 0
41a08000-41a0d000 ---p 00000000 00:00 0
41a0d000-41a0e000 rwxp 00000000 00:00 0
41a0e000-41a15000 ---p 00000000 00:00 0
41a15000-41a16000 rwxp 00000000 00:00 0
41a16000-49a06000 ---p 00000000 00:00 0
71000000-7133f000 r-xp 00000000 08:05 36156 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdynamorio.so.3.0
7133f000-7153e000 ---p 0033f000 08:05 36156 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdynamorio.so.3.0
7153e000-7155f000 r-xp 0033e000 08:05 36156 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdynamorio.so.3.0
7155f000-7157d000 rwxp 0035f000 08:05 36156 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdynamorio.so.3.0
7157d000-71599000 rwxp 00000000 00:00 0
7f258b842000-7f258b9bc000 r-xp 00000000 08:05 73215 /lib/libc-2.11.1.so
7f258b9bc000-7f258bbbb000 ---p 0017a000 08:05 73215 /lib/libc-2.11.1.so
7f258bbbb000-7f258bbbf000 r-xp 00179000 08:05 73215 /lib/libc-2.11.1.so
7f258bbbf000-7f258bbc0000 rwxp 0017d000 08:05 73215 /lib/libc-2.11.1.so
7f258bbc0000-7f258bbc5000 rwxp 00000000 00:00 0
7f258bbc5000-7f258bbc7000 r-xp 00000000 08:05 73218 /lib/libdl-2.11.1.so
7f258bbc7000-7f258bdc7000 ---p 00002000 08:05 73218 /lib/libdl-2.11.1.so
7f258bdc7000-7f258bdc8000 r-xp 00002000 08:05 73218 /lib/libdl-2.11.1.so
7f258bdc8000-7f258bdc9000 rwxp 00003000 08:05 73218 /lib/libdl-2.11.1.so
7f258bdc9000-7f258be4b000 r-xp 00000000 08:05 73219 /lib/libm-2.11.1.so
7f258be4b000-7f258c04a000 ---p 00082000 08:05 73219 /lib/libm-2.11.1.so
7f258c04a000-7f258c04b000 r-xp 00081000 08:05 73219 /lib/libm-2.11.1.so
7f258c04b000-7f258c04c000 rwxp 00082000 08:05 73219 /lib/libm-2.11.1.so
7f258c04c000-7f258c04f000 r-xp 00000000 08:05 36160 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdrpreload.so
7f258c04f000-7f258c24e000 ---p 00003000 08:05 36160 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdrpreload.so
7f258c24e000-7f258c24f000 r-xp 00002000 08:05 36160 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdrpreload.so
7f258c24f000-7f258c250000 rwxp 00003000 08:05 36160 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdrpreload.so
7f258c250000-7f258c253000 rwxp 00000000 00:00 0
7f258c253000-7f258c273000 r-xp 00000000 08:05 72946 /lib/ld-2.11.1.so
7f258c454000-7f258c457000 rwxp 00000000 00:00 0
7f258c470000-7f258c473000 rwxp 00000000 00:00 0
7f258c473000-7f258c474000 r-xp 00020000 08:05 72946 /lib/ld-2.11.1.so
7f258c474000-7f258c475000 rwxp 00021000 08:05 72946 /lib/ld-2.11.1.so
7f258c475000-7f258c476000 rwxp 00000000 00:00 0
7fff092e0000-7fff092f5000 rwxp 00000000 00:00 0 [stack]
7fff09368000-7fff09369000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
<Initial options = >
inside outer
inside ind
<executing region with pc 0x00007fff092f3ab4 on the stack.>
<get_memory_info mismatch! (can happen if os combines entries in /proc/pid/maps)
os says: 0x00007fff092e1000-0x00007fff092f5000 prot=0x0000000000000005
cache says: 0x00007fff092e0000-0x00007fff092f5000 prot=0x0000000700000005

<get_memory_info mismatch! (can happen if os combines entries in /proc/pid/maps)
os says: 0x00007fff092e1000-0x00007fff092f5000 prot=0x0000000000000005
cache says: 0x00007fff092e0000-0x00007fff092f5000 prot=0x0000000000000005

00400000-00401000 r-xp 00000000 08:05 167114 /home/chrome-bot/dr/dynamorio/build/suite/tests/bin/security-linux.trampoline 00600000-00601000 r-xp 00000000 08:05 167114 /home/chrome-bot/dr/dynamorio/build/suite/tests/bin/security-linux.trampoline 00601000-00602000 rwxp 00001000 08:05 167114 /home/chrome-bot/dr/dynamorio/build/suite/tests/bin/security-linux.trampoline 01a67000-01a88000 rwxp 00000000 00:00 0 [heap] 41a02000-41a05000 ---p 00000000 00:00 0 41a05000-41a0b000 rwxp 00000000 00:00 0 41a0b000-41a0d000 ---p 00000000 00:00 0

[many code cache entries]

41ad5000-41ad6000 rwxp 00000000 00:00 0
41ad6000-49a06000 ---p 00000000 00:00 0
71000000-7133f000 r-xp 00000000 08:05 36156 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdynamorio.so.3.0
7133f000-7153e000 ---p 0033f000 08:05 36156 /home/chrome-bot/dr/dynamorio/build/lib64/debug/libdynamorio.so.3.0
7153e000-7155f000 r-xp 0033e000 0...

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=599

@derekbruening
Copy link
Contributor Author

From [email protected] on November 04, 2011 09:41:24

I instrumented all of our mmap, mprotect, and munmap syscalls to dump the /proc/pid/maps file, and the issue we run into is that it gets up to about 4096 bytes and then it starts to lose lines at the bottom. My belief is that there is a 4096 byte buffer for the file contents in the kernel, and it's being truncated. I never see any partial lines being truncated, so it's possible that their printf construct does no writes if the write would overflow.

I'm going to spend a bit more time verifying my hypothesis. Assuming it is correct, this is basically a Linux kernel bug we have to work around, or a bug in our maps file iterator. I downgraded the assert to a SYSLOG, and the tests now pass without any further consequences, since this assert is really just a debug build only check. Does this sound reasonable, assuming this really is a kernel bug?

@derekbruening
Copy link
Contributor Author

From [email protected] on November 04, 2011 10:09:57

the maps file is used for real data in a number of places (allmem is used in the others) so this is not just a debug build sanity check issue: a full audit of who uses what would have to be done to see all possible consequences, but this is a problem

@derekbruening
Copy link
Contributor Author

From [email protected] on November 04, 2011 10:21:27

I misdiagnosed this because I was dumping /proc/pid/maps incorrectly. I forgot to retry the read syscall until it returned zero. Here is the relevant part of /proc/pid/maps that's foiling our search:

7f8c8df77000-7f8c8df78000 rwxp 00021000 08:05 72946 /lib/ld-2.11.1.so
7f8c8df78000-7f8c8df79000 rwxp 00000000 00:00 0
7fffbd955000-7fffbd955000 rwxp 00000000 00:00 0
7fffbd956000-7fffbd96a000 r-xp 00000000 00:00 0 [stack]
7fffbd9ff000-7fffbda00000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
<pc not found in proc/pid/maps!
cache says: 0x00007fffbd955000-0x00007fffbd96a000 prot=0x0000000000000005

pc: 0x00007fffbd955000
<Application security-linux.trampoline (16110). Internal Error Internal DynamoRIO Error: /home/chrome-bot/dr/dynamorio/core/linux/os.c:7703 found

The page we are searching for seems to be part of a zero-length VMA.

IMO the correct solution is not to downgrade this assert. We should improve our maps file parser to understand this.

Summary: ASSERT os.c: found: /proc/pid/maps contains a zero-length VMA

@derekbruening
Copy link
Contributor Author

From [email protected] on November 04, 2011 10:27:13

issue #366 supposedly added support for zero-length entries: why doesn't it solve this one?

@derekbruening
Copy link
Contributor Author

From [email protected] on November 04, 2011 10:41:12

Because the PC being looked up is the empty region. See the "pc: 0x..." line. r610 skips the region, so we can't find it. Maybe the solution is to do something that isn't skipping it, like merge it into the next.

In this case, it looks like this is the stack guard page. The kernel has code in /fs/proc/vm_task.c that tries to hide the guard page, so our mprotect of the entire stack region may be interacting with that in some weird way.

I did search the tracker before filing, but I'm frustrated that I didn't find this issue. :(

@derekbruening
Copy link
Contributor Author

From [email protected] on November 04, 2011 13:42:15

This issue was closed by revision r1131 .

Status: Fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant