CRASH non-det running C++11 threaded app on Android #1931
Comments
On Sat, Apr 30, 2016 at 12:25 PM, Caroline Trippel wrote:
You're not actually disassembling the instructions there. Please examine the instructions, machine context, and get the corresponding lines from /proc/self/maps. Use something like "x/15i $pc" and "info reg" in gdb when at the fault. Do not expect source-level debugger commands to work for the code cache or other locations.
There are no messages about a fault in the logs, which is odd. Both thread logs are truncated, and there is a clone call for a 3rd thread from the initial thread but no log or any messages in the global log for the new thread 9289. The debugger will have to be used to examine the fault.
Please disassemble (x/15i $pc), get the register values (info reg), get the maps lines near $pc.
Ditto.
0x2a232d06 was the address from the last run, so I'm not sure why it's being examined here. Please run "x/15i $pc" and "info reg" and acquire the /proc//maps lines near $pc (which is 0x2a21118e).
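For anyone following along, the "maps lines near $pc" step can be sketched offline in Python once the maps text has been copied off the device. This is only an illustration: the helper names `parse_maps_line` and `find_mapping` are hypothetical, and the sample ranges and the library path are made up rather than taken from the attached logs.

```python
def parse_maps_line(line):
    """Split one maps line into (start, end, perms, path)."""
    fields = line.split()
    start_s, end_s = fields[0].split("-")
    # Anonymous mappings have no path field, so this yields "".
    path = " ".join(fields[5:])
    return int(start_s, 16), int(end_s, 16), fields[1], path


def find_mapping(maps_text, addr):
    """Return the mapping containing addr, or None if addr is unmapped."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, end, perms, path = parse_maps_line(line)
        if start <= addr < end:
            return start, end, perms, path
    return None


# Illustrative sample only: ranges and path are assumptions.
SAMPLE_MAPS = """\
2a200000-2a240000 r-xp 00000000 00:00 0
b6a05000-b6b02000 rw-p 00000000 00:00 0
b6c00000-b6d80000 r-xp 00000000 b3:1c 1234 /data/local/tmp/libdynamorio.so
"""

print(find_mapping(SAMPLE_MAPS, 0x2A21118E))
```

A `None` result means the address is not covered by any mapping, which is itself useful information when a fault address looks random.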
This is rather strange: the fault is happening without DR's signal handler catching it, so it must be occurring during thread init, with the stack messed up and the alt signal stack not set up yet. Just to check other possibilities, are you out of disk space and that's why the logs were truncated in the faulting run?
Here is some new gdb output where I have tried to include the information you requested. On the host side (Android 5.0), I am running ./gdbserver ./drrun -debug -loglevel 4 -- ./apps/cppthread. Right now, the SIGTRAP signal is happening consistently at 0x2020f5f8, and the SIGSEGV at 0x2a232d06. However, although I have plenty of disk space, the output logs are not being created when running under gdb. When running without gdb, I get the output provided below as well as the attached log files.
So the crash is in the DR library itself:
Please use add-symbol-file to tell gdb about libdynamorio.so and then get the function and line number of the fault address. This is your own build so there's no way for us to translate the address. Please get the callstack as well (after adding the DR symbols).
I tried adding the symbol table and was still unable to get the necessary information on the fault address. It seems that there was no ./debug/linker binary found. In an attempt to fix this I created a debug linker file with the commands:
This did not work either. Do you see something that I am missing?
The symbols for libdynamorio.so need to be added after the exec, not while in the initial drrun process. Typically you would wait for the SIGSEGV and add the symbol file there. The linker symbols should not matter.
OK, I see. Here's what happens when I load later after knowing the fault address:
The address given for libdynamorio.so should be the address of the .text section, not the address of the SIGSEGV. This will be printed out to stderr at startup in debug builds, including the whole add-symbol-file command so you can just paste it (modulo remote vs local paths for Android): see https://github.com/DynamoRIO/dynamorio/wiki/Debugging#loading-client-symbols. Or it can be obtained with:
In case it wasn't clear, I'm saying that the string_option_read_lock callstack is incorrect: the .text address needs to be used to load the symbols. gdb does not try to detect that you gave it the wrong address, so you get no error message.
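To make the address arithmetic concrete, here is a rough Python sketch. It assumes the library's first loadable segment is linked at virtual address 0, so that the `.text` section address reported by a tool such as `readelf -S` is an offset from the runtime load base. The function name and both numeric values are hypothetical, not taken from this build.

```python
def add_symbol_file_command(lib_path, load_base, text_section_addr):
    """Build the gdb command string from the runtime base and the .text address.

    Assumption: the library is linked at base 0, so the section address
    in the ELF headers is an offset from load_base.
    """
    text_addr = load_base + text_section_addr
    return "add-symbol-file %s 0x%x" % (lib_path, text_addr)


# Hypothetical values: a load base read from the maps file and a
# .text section address read from the ELF section headers.
print(add_symbol_file_command("libdynamorio.so", 0xB6C00000, 0x1A000))
```

The point is that the second argument is the start of `.text` in the running process, never the faulting address itself.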
Crash call stack:
b6a05000-b6b02000 rw-p 00000000 00:00 0
The crash is in this code:
Which uses this offset:
Which is:
And we see our address here:
For ARM Linux we rely on the field after dtv to be NULL, and on Android we rely on unused space at the end of the mmap to be NULL, but with #1920 expanding the offset to the 2nd page, on Android 5 we now dereference onto a 2nd page. So this seems to be a regression coming from #1920, and we'd expect the 6.1.0 release to work (though it sounds like that's not the case?). Xref #1936 if we want to get ambitious and try to solve both at once.
Elaborating further: the first new thread got lucky, in that there happens to be a mapping on the subsequent page, so it does not crash (and probably reads a 0 too). That is why this only happened on the 2nd thread here and why it's non-deterministic in general.
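The non-determinism can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: the page size, the offset value, and the two layouts are invented, and the real offset and mmap layout are in the DR sources rather than in this thread.

```python
PAGE_SIZE = 4096  # assumed page size


def read_is_mapped(mappings, base, offset):
    """Return True if base + offset falls inside any (start, end) mapping."""
    addr = base + offset
    return any(start <= addr < end for start, end in mappings)


# Hypothetical offset that lands just past the first page of the region.
OFFSET = PAGE_SIZE + 8

# Thread 1: a one-page region that happens to have a neighbor right after it.
thread1_base = 0x10000
thread1_maps = [(0x10000, 0x11000), (0x11000, 0x12000)]

# Thread 2: a one-page region with nothing mapped on the following page.
thread2_base = 0x20000
thread2_maps = [(0x20000, 0x21000)]

print(read_is_mapped(thread1_maps, thread1_base, OFFSET))  # neighbor absorbs the read
print(read_is_mapped(thread2_maps, thread2_base, OFFSET))  # read would fault
```

Whether the read survives depends entirely on what the kernel happened to place on the next page, which varies from thread to thread and run to run.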
We now have confirmation that 6.1.0 does work, so the analysis above seems confirmed.
Xref #1986 where we have TLS problems as well: perhaps there's some broad solution that would address Android, WSL, and corner-case Linux apps.
To fix the offset I am adding a level of indirection: the TLS offset is read through a variable rather than hardcoded. This makes the assembly more painful, and requires an initial value for the variable such that the read returns 0 before the variable is properly initialized, but it seems workable. Unfortunately I do not have a good way to test it without an Android 5 device.
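As a loose model of that idea (not the actual DR change, which is in C and assembly), the Python sketch below keeps the offset in a variable that initially points at a slot known to hold 0 and is switched to the real slot once setup is done. The class, the slot numbers, and the stored value are all hypothetical.

```python
class TlsOffsetModel:
    """Toy model of reading a TLS slot through an offset variable."""

    ZERO_SLOT = 0  # assumption: a slot guaranteed to contain 0

    def __init__(self, num_slots):
        self.slots = [0] * num_slots
        # Before initialization the offset points at the known-zero slot.
        self.tls_offs = self.ZERO_SLOT

    def initialize(self, real_offs, value):
        """Store the real value, then publish the real offset."""
        self.slots[real_offs] = value
        self.tls_offs = real_offs

    def read(self):
        """Read through the variable instead of a hardcoded offset."""
        return self.slots[self.tls_offs]


tls = TlsOffsetModel(num_slots=8)
print(tls.read())        # 0 before initialization
tls.initialize(5, 0xABCD)
print(hex(tls.read()))   # real value after initialization
```

The cost of this design is one extra load per access, which is what makes the assembly paths more awkward.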
Please see prior email discussion at https://groups.google.com/forum/#!topic/dynamorio-users/eL1__o7m4AQ
Pasting from the most recent email there:
I'm going to look deeper into the log files now. I've put together a set of files containing the binaries I am currently working with as well as the log files produced from a series of runs. I'm detailing these runs below, and I reference files from the attached directory. Additionally, I am wondering if there are any details pertaining to the 6.1-3 release that would be specific to a version of Android other than API 21. gdb doesn't seem to recognize the addresses of my segmentation faults as having meaningful instructions at them. I'm wondering if something is causing a branch to a random part of memory.
—————————————————————————————————————————————————————————————————————————
How I built NDK:
—————————————————————————————————————————————————————————————————————————
_See simple.cpp file_
—————————————————————————————————————————————————————————————————————————
How I compiled simple.cpp to cppthread and pushed to phone
Phone specs: Nexus 6, Quad-core, running Android 5.0
—————————————————————————————————————————————————————————————————————————
Running uninstrumented app on phone
—————————————————————————————————————————————————————————————————————————
Phone file hierarchy:
—————————————————————————————————————————————————————————————————————————
—————————————————————————————————————————————————————————————————————————
Running drrun with no client on cppthread
—————————————————————————————————————————————————————————————————————————
—————————————————————————————————————————————————————————————————————————
Now running with -debug -loglevel 4
Runs fine the first time…
_See cppthread.9286.00000000/ directory_
—————————————————————————————————————————————————————————————————————————
Now running with -debug -loglevel 4 and gdb/gdbserver (This time does not create logs)
—————————————————————————————————————————————————————————————————————————
Running again with -debug -loglevel 4 and gdb… (This time does not create logs and fails at the same place as the previous run)
—————————————————————————————————————————————————————————————————————————
Running again with -debug -loglevel 4 and gdb… (This time fails elsewhere and creates logs)
_See cppthread.9361.00000000/ directory_
—————————————————————————————————————————————————————————————————————————