Private loader cannot find libc TLS for apps with lots of __thread data #834

Closed
derekbruening opened this issue Nov 28, 2014 · 4 comments

@derekbruening

From [email protected] on July 09, 2012 10:27:51

Linux x86_64

$ cat 1.cc
__thread long long x[100];

int main(void) {
    x[0] = 42;
    return 0;
}

$ g++ 1.cc -o 1
$ ./dr/build/bin64/drrun -- ./1
^C^C^C^ZKilled

It hangs, eating 100% CPU, apparently somewhere in SEGV handling:

#0 0x00000000710f13f2 in syscall_ready () from dr/build/lib64/release/libdynamorio.so
#1 0x000000007136d1a0 in ?? () from dr/build/lib64/release/libdynamorio.so
#2 0x0000000071077868 in write_lock (rw=0x0) at dr/core/utils.c:1181
#3 0x00000000710434b4 in synchronize_dynamic_options () at dr/core/options.c:1972
#4 0x0000000071075d8d in notify (priority=SYSLOG_CRITICAL, internal=, synch=true, substitution_num=,
prefix=, fmt=) at dr/core/utils.c:1842
#5 0x000000007110ae22 in abort_on_DR_fault (dcontext=0x75d86880, pc=0x7ffff7127d62 "H\213Rp\353\f\017\037\204",
signame=0x7112691b "SEGV", where=0x71124607 "unknown") at dr/core/linux/signal.c:3085
#6 0x000000007110ce9d in record_pending_signal (dcontext=0x75d86880, sig=11, ucxt=0x75e16c80, frame=0x75e16c78,
forged=, access_address=) at dr/core/linux/signal.c:3373
#7 0x000000007110d6fe in master_signal_handler_C (sig=11, siginfo=, ucxt=0x75e16c80, xsp=0x75e16c78 "\367\023\017q") at dr/core/linux/signal.c:4141
#8 0x00000000710f13f7 in client_int_syscall () from dr/build/lib64/release/libdynamorio.so
#9 0x0000000000000000 in ?? ()

Running it in GDB from the start:

(gdb) run
Starting program: 1

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7127d62 in *__GI___strcasecmp (s1=0x7fffffffdb10 "mscorsvr.dll", s2=0x72ae3798 "1") at strcasecmp.c:62
62 strcasecmp.c: No such file or directory.
(gdb) br __GI___strcasecmp
Breakpoint 1 at 0x7ffff7127d50: file strcasecmp.c, line 56.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: 1

Breakpoint 1, ___GI___strcasecmp (s1=0x7fffffffdb10 "mscorsvr.dll", s2=0x40003798 "1") at strcasecmp.c:56
56 strcasecmp.c: No such file or directory.
(gdb) stepi
0x00007ffff7127d57 56 in strcasecmp.c
(gdb)
62 in strcasecmp.c
(gdb) disassemble
Dump of assembler code for function *__GI___strcasecmp:
0x00007ffff7127d50 <+0>: mov 0x2f61e9(%rip),%rax # 0x7ffff741df40
0x00007ffff7127d57 <+7>: mov %fs:(%rax),%rdx
=> 0x00007ffff7127d5b <+11>: xor %eax,%eax
0x00007ffff7127d5d <+13>: cmp %rsi,%rdi
0x00007ffff7127d60 <+16>: je 0x7ffff7127d90 <___GI___strcasecmp+64>
0x00007ffff7127d62 <+18>: mov 0x70(%rdx),%rdx
0x00007ffff7127d66 <+22>: jmp 0x7ffff7127d74 <___GI___strcasecmp+36>
0x00007ffff7127d68 <+24>: nopl 0x0(%rax,%rax,1)
0x00007ffff7127d70 <+32>: add $0x1,%rdi
0x00007ffff7127d74 <+36>: movzbl (%rdi),%r8d
0x00007ffff7127d78 <+40>: movzbl (%rsi),%ecx
0x00007ffff7127d7b <+43>: add $0x1,%rsi
0x00007ffff7127d7f <+47>: movzbl %r8b,%eax
0x00007ffff7127d83 <+51>: mov (%rdx,%rax,4),%eax
0x00007ffff7127d86 <+54>: sub (%rdx,%rcx,4),%eax
0x00007ffff7127d89 <+57>: jne 0x7ffff7127d90 <___GI___strcasecmp+64>
0x00007ffff7127d8b <+59>: test %r8b,%r8b
0x00007ffff7127d8e <+62>: jne 0x7ffff7127d70 <___GI___strcasecmp+32>
0x00007ffff7127d90 <+64>: repz retq
End of assembler dump.
(gdb) info reg
rax 0xfffffffffffffc58 -936
rbx 0x71362cf4 1899375860
rcx 0x7fffffffdb17 140737488345879
rdx 0x0 0
rsi 0x40003798 1073756056
rdi 0x7fffffffdb10 140737488345872
rbp 0x71362d00 0x71362d00
rsp 0x7fffffffdaf8 0x7fffffffdaf8
r8 0x6c 108
r9 0x0 0
r10 0xfffffffffffffff0 -16
r11 0x7ffff71f09e0 140737339394528
r12 0xc 12
r13 0x7fffffffdb10 140737488345872
r14 0x0 0
r15 0x40003798 1073756056
rip 0x7ffff7127d5b 0x7ffff7127d5b <___GI___strcasecmp+11>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x63 99
gs 0x6b 107
(gdb) bt
#0 *__GI___strcasecmp (s1=0x7fffffffdb10 "mscorsvr.dll", s2=0x40003798 "1") at strcasecmp.c:62
#1 0x0000000071074feb in check_filter_common (filter=,
short_name=0xfffffffffffffc58 <Address 0xfffffffffffffc58 out of bounds>, wildcards=)
at dr/core/utils.c:2375
#2 0x0000000071097187 in on_native_exec_list (modname=)
at dr/core/module_list.c:220
#3 check_and_mark_native_exec (ma=0x400a31e8, add=true)
at dr/core/module_list.c:241
#4 0x0000000071097706 in module_list_add (base=0x400000 "\177ELF\002\001\001", view_size=2105344, at_map=,
filepath=, inode=10751273)
at dr/core/module_list.c:325
#5 0x00000000711060e5 in find_executable_vm_areas ()
at dr/core/linux/os.c:7689
#6 0x0000000071050755 in dynamorio_app_init () at dr/core/dynamo.c:592
#7 0x00007ffff7bd2717 in _init () at dr/core/linux/preload.c:186
#8 0x00007ffff7debd65 in call_init (env=, argv=, argc=, l=)
at dl-init.c:70
#9 _dl_init (main_map=0x7ffff7ffe128, argc=1, argv=0x7fffffffdec8, env=0x7fffffffded8) at dl-init.c:134
#10 0x00007ffff7dddb2a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#11 0x0000000000000001 in ?? ()
#12 0x00007fffffffe1d5 in ?? ()
#13 0x0000000000000000 in ?? ()
(gdb)

Note that on another occasion (with a slightly different test program that I can no longer reproduce) %rdx contained 0xabababababababab instead of 0.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=834

@derekbruening

From [email protected] on July 09, 2012 07:49:27

I can reproduce. I recently changed this code.

Passing -no_native_exec to drrun avoids this particular piece of code as a workaround, but I worry about the fundamental issue of using str* routines from libc. They always want to access TLS to get the locale.

Owner: [email protected]

@derekbruening

From [email protected] on July 09, 2012 08:03:56

The hang happens because we're trying to fail an assert from the signal handler. This should work fine, but then we call synchronize_dynamic_options(), and that is where the hang starts. gdb won't let me interrupt the program after that, and it doesn't receive any SIGSEGV interrupts.

@derekbruening

From [email protected] on July 09, 2012 09:40:31

Some background: DR calls libc, and we need to ensure that libc doesn't interfere with the app's state in TLS. Therefore, we change the segment base to point to our own TLS space, and mangle all segment accesses in the code cache to point to the original base.

When we create our own TLS space, we copy the data in the app's current space. The offsets of the data are not part of glibc's public interface, so we have to guess where glibc put its data. Currently we copy 0x200 bytes before the segment base to try to get libc's thread local data, and 0x900 bytes after the base to get the "thread control block".


libc __thread vars, locale, _res, malloc arena, etc
------- <--- fs/gs point here
thread control block, used by pthreads, ld.so, and others
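
To make the copied window concrete, here is a minimal sketch of the range check implied by the numbers above. The constant and function names are hypothetical (they are not DR identifiers); only the 0x200 and 0x900 sizes come from this comment, and the -936 offset is the %rax value from the register dump in the original report.

```cpp
// Sizes quoted in the comment above; the names are hypothetical.
constexpr long kCopiedBeforeBase = 0x200;  // bytes copied below the segment base
constexpr long kCopiedAfterBase = 0x900;   // bytes copied from the base upward

// True if an access of `size` bytes at `offset` from the segment base lies
// entirely inside the copied region [-0x200, +0x900).
constexpr bool in_copied_region(long offset, long size) {
    return offset >= -kCopiedBeforeBase && offset + size <= kCopiedAfterBase;
}

// %rax in the report's register dump is 0xfffffffffffffc58, i.e. -936, and
// `mov %fs:(%rax),%rdx` reads 8 bytes there: outside the copied window.
static_assert(!in_copied_region(-936, 8), "the faulting TLS slot was not copied");
// The lowest slot that would still be covered by the copy.
static_assert(in_copied_region(-0x200, 8), "lowest copied slot");
```

An access outside this window reads memory in DR's private TLS that was never filled in from the app's TLS, which would be consistent with %rdx coming back as 0 in the dump.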

The problem is that if the app has its own __thread vars, the loader installs them first, and libc's data is pushed further back from the thread pointer. In this case, the locale data that strcasecmp is trying to access is at -0x388.


libc __thread vars, locale, _res, malloc arena, etc

app's __thread vars, in this case 0x190 bytes of it
------- <--- fs/gs point here
thread control block, used by pthreads, ld.so, and others
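
The shift can be worked through with the figures from the original report. This is only a sketch under two assumptions that the thread does not confirm: that the app's __thread block sits directly below the thread pointer with no alignment padding, and that the -936 in %rax is the offset of libc's locale slot. The helper name is hypothetical. The comment above quotes different figures (0x190 and -0x388), presumably from a different build or test program; the arithmetic is the same.

```cpp
// Size of the test program's TLS block: __thread long long x[100].
static_assert(sizeof(long long) == 8, "assumes x86_64 Linux");
constexpr long kAppTlsBytes = static_cast<long>(sizeof(long long) * 100);  // 800

// Offset observed in %rax in the report's register dump (0xfffffffffffffc58).
constexpr long kObservedOffset = -936;

// Hypothetical helper: where the same libc slot would sit if the app had no
// __thread data of its own (assumes no padding between the two blocks).
constexpr long offset_without_app_tls(long observed, long app_tls_bytes) {
    return observed + app_tls_bytes;
}

// Under these assumptions the slot would have been at -136, well inside the
// 0x200 bytes copied below the segment base.
static_assert(offset_without_app_tls(kObservedOffset, kAppTlsBytes) == -136,
              "implied offset without app TLS");
static_assert(offset_without_app_tls(kObservedOffset, kAppTlsBytes) >= -0x200,
              "would have been inside the copied window");
```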

One solution is to get more friendly with ld.so and to introspect its data structures to find the offset of libc's TLS from the segment base. This is obviously fragile.

Another is just to bump up APP_LIBC_TLS_SIZE. So long as the app doesn't use too much __thread data, this will work.
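
A rough way to see the limit of this approach, assuming APP_LIBC_TLS_SIZE is the 0x200-byte figure mentioned earlier in this comment: whatever libc itself needs below the thread pointer is subtracted from the copied size, and the remainder is the most __thread data an app can declare before libc's data falls out of range. The helper name, the 0x88 libc footprint, and the 0x1000 replacement size are all illustrative assumptions, not values taken from DR or glibc.

```cpp
// Hypothetical helper: bytes of app __thread data tolerated for a given copy
// size, if libc's own TLS needs `libc_tls_bytes` below the thread pointer.
constexpr long max_app_tls_bytes(long copied_before_base, long libc_tls_bytes) {
    return copied_before_base > libc_tls_bytes
               ? copied_before_base - libc_tls_bytes
               : 0;
}

// Illustrative numbers only: 0x88 is an assumed libc footprint and 0x1000 an
// assumed bumped size; neither appears in this thread.
static_assert(max_app_tls_bytes(0x200, 0x88) == 376,
              "too small for the 800-byte array in the test program");
static_assert(max_app_tls_bytes(0x1000, 0x88) == 3960,
              "a larger copy covers the test program");
```

Any app whose __thread data exceeds that remainder would hit the same failure again, which is why this is a workaround rather than a fix.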

Summary: Private loader cannot find libc TLS for apps with lots of __thread data
Labels: OpSys-Linux

@derekbruening

From [email protected] on July 11, 2012 10:36:19

OK, there's a workaround in r1452, and more libc isolation (issue #46) coming up in https://codereview.appspot.com/6344097/.

Status: Fixed
