ASSERT missing memory region w/ start/stop API for changes after dr_app_setup #2037

derekbruening · 2016-10-19T20:42:52Z

# clients/bin64/tool.drcacheoff.burst_threads
pre-DR init
pre-DR start
pre-DR detach
<all_memory_areas is missing region 0x00007fb7c24a7000-0x00007fb7c2ca7000!>
<Application /work/dr/git/build_x64_dbg_tests/clients/bin64/tool.drcacheoff.burst_threads (23195).  Internal Error: DynamoRIO debug check failure: /work/dr/git/src/core/unix/memcache.c:446 false

The query is for the signal frame:

<all_memory_areas is missing region 0x00007ff1a2c62000-0x00007ff1a3462000!>

#1  0x00000000004fcd7c in internal_error (file=0x79e320 "/work/dr/git/src/core/unix/memcache.c", line=446, expr=0x79e79b "false")
    at /work/dr/git/src/core/utils.c:174
#2  0x00000000006eaca0 in memcache_query_memory (pc=0x7ff1a3460000 "", out_info=0x46d668d0) at /work/dr/git/src/core/unix/memcache.c:446
#3  0x00000000006ccf0b in query_memory_ex (pc=0x7ff1a3460000 "", out_info=0x46d668d0) at /work/dr/git/src/core/unix/os.c:8753
#4  0x00000000006ccf9a in get_memory_info (pc=0x7ff1a3460000 "", base_pc=0x0, size=0x0, prot=0x46d66924)
    at /work/dr/git/src/core/unix/os.c:8773
#5  0x00000000006d4f52 in copy_frame_to_stack (dcontext=0x46cd1c40, sig=12, frame=0x46d66ab8, sp=0x7ff1a3460878 "\375\005j", 
    from_pending=false) at /work/dr/git/src/core/unix/signal.c:2682
#6  0x00000000006e18b8 in sig_detach (dcontext=0x46cd1c40, frame=0x46d66ab8, detached=0x46d410f8)
    at /work/dr/git/src/core/unix/signal.c:6217
#7  0x00000000006e1ef9 in handle_suspend_signal (dcontext=0x46cd1c40, ucxt=0x46d66ac0, frame=0x46d66ab8)
    at /work/dr/git/src/core/unix/signal.c:6353

46d59000-46d5a000 r--p 00000000 00:00 0 
46d5a000-46d67000 rw-p 00000000 00:00 0                                  [stack:23277]
<...>
71412000-71436000 rw-p 00000000 00:00 0 
7ff1a1c5f000-7ff1a1c60000 ---p 00000000 00:00 0 
7ff1a1c60000-7ff1a2460000 rw-p 00000000 00:00 0                          [stack:23279]
7ff1a2460000-7ff1a2461000 ---p 00000000 00:00 0 
7ff1a2461000-7ff1a2c61000 rw-p 00000000 00:00 0 
7ff1a2c61000-7ff1a2c62000 ---p 00000000 00:00 0 
7ff1a2c62000-7ff1a3462000 rw-p 00000000 00:00 0 
7ff1a3462000-7ff1a3619000 r-xp 00000000 09:00 1313251                    /usr/lib64/libc-2.21.so

This is thread 23277. Why is it using 7ff1a2c62000-7ff1a3462000 as its
stack if the maps shows a different region as the stack? Sigaltstack? My
test doesn't set up an alt stack though.

(gdb) p *(thread_sig_info_t *)dcontext->signal_field
$4 = {
app_sigstack = {
ss_sp = 0x0,
ss_flags = 2,
ss_size = 0
},
sigstack = {
ss_sp = 0x46d59000,
ss_flags = 0,
ss_size = 57344
},
}

So the one labeled is in fact DR's alt stack and the missing region must be
the main app stack. How did our scan miss it? dr_app_setup() is prior to
thread creation is the reason I suppose and the start/stop API code never added any handling of later changes.

Should we do a full maps file scan on every dr_app_start?
Or we can try to lazily query maps file whenever our cache misses.
But the cache might have false positives, not just negatives, which will
mess us up.

Not sure the code supports repeated scans, might take a little work.
Need to throw out cache first, too.

The text was updated successfully, but these errors were encountered:

derekbruening · 2017-08-09T01:08:22Z

The solution of only supporting dr_app_setup_and_start() and dr_app_stop_and_cleanup() rather than having them split can still run into issues like this during attach if not-yet-attached threads are concurrently changing the address space.

derekbruening · 2017-08-09T01:13:18Z

Another assert like this is seen when running a test for #2601:

$ DYNAMORIO_OPTIONS="-msgbox_mask 12" LD_LIBRARY_PATH=lib64/debug suite/tests/bin/api.detach_spawn
<Starting application suite/tests/bin/api.detach_spawn (16269)>
<Initial options = -msgbox_mask 12 >
..............................................<thread exited while attaching>
<Application suite/tests/bin/api.detach_spawn (16269).  Internal Error: DynamoRIO debug check failure: core/unix/memcache.c:332 !exists || vmvector_overlap(all_memory_areas, start, end) || are_dynamo_vm_areas_stale()
(Error occurred @84 frags)

derekbruening · 2018-03-02T16:37:11Z

Looks like we don't update allmem in memcache_query_memory() after a miss
and lookup in the maps file! So we have a stale allmem from setup being split from start,
and we repeatedly have to perform maps lookups: that explains the heavy contention on
the allmem lock from holding it across maps lookups.

Adds a complete maps file walk to update the memcache on re-taking-over the process for dr_api_start. The memcache is cleared beforehand to avoid both false positives and negatives in later queries. This helps to solve issues with a gap between dr_app_setup() and dr_app_start(). Does not update the executable areas or module list: they are more difficult to re-walk, and existing lazy updates to those will suffice for now, with a low risk of false positives. Adds updating of the memcache on a query miss. Previously we would just continue to miss and walk the maps file every time. Tested manually by disabling the i#2114 change so that a signal does a query, adding signals to the burst_threads test, and calling dr_app_setup() before creating the test's threads, causing queries to miss when delivering signals. It is difficult to create a regression test for this as the consequences are performance degradations rather than correctness, and these degradations only really show up at scale with hundreds of threads whose missing stacks are queried at once with no caching. Fixes #2037

Refactors find_executable_vm_areas() to share its map entry skipping with the re-takeover re-walk from b06a702, but not its module list or executable area updates. This entry skipping for vmheap turns out to make a big performance difference when attaching. Removes individual updates to memcache for entries inside vmheap which were already bulk-added for the find_executable_vm_areas() walk. Issue: #2037

Fixes Mac build breakage from 8471d5c. Issue: #2037

Fixes a regression from #2037 8471d5c where the module re-walk hits a deadlock lock order violation if logging is enabled. Fixes #2934

derekbruening added OpSys-Linux Bug-Assert labels Oct 19, 2016

derekbruening self-assigned this Oct 19, 2016

This was referenced Mar 6, 2018

i#2037: update memcache on re-takeover and query miss #2870

Merged

app itimer disabled after detach #2871

Closed

derekbruening closed this as completed in #2870 Mar 7, 2018

derekbruening mentioned this issue Mar 19, 2018

i#2037 re-walk-maps: refactor to share main walk with re-walk #2887

Merged

derekbruening added a commit that referenced this issue Mar 24, 2018

i#2037 re-walk maps: fix Mac build

f83b6c4

Fixes Mac build breakage from 8471d5c. Issue: #2037

derekbruening mentioned this issue Mar 24, 2018

i#2037 re-walk maps: fix Mac build #2897

Merged

derekbruening added a commit that referenced this issue Mar 24, 2018

i#2037 re-walk maps: fix Mac build (#2897)

998da66

Fixes Mac build breakage from 8471d5c. Issue: #2037

derekbruening mentioned this issue Apr 16, 2018

deadlock violation from maps walk refactoring #2934

Closed

derekbruening added a commit that referenced this issue Apr 16, 2018

i#2934: fix module re-walk lock order when logging

22f4b3e

Fixes a regression from #2037 8471d5c where the module re-walk hits a deadlock lock order violation if logging is enabled. Fixes #2934

derekbruening mentioned this issue Apr 16, 2018

i#2934: fix module re-walk lock order when logging #2935

Merged

derekbruening added a commit that referenced this issue Apr 16, 2018

i#2934: fix module re-walk lock order when logging (#2935)

e62bdc0

Fixes a regression from #2037 8471d5c where the module re-walk hits a deadlock lock order violation if logging is enabled. Fixes #2934

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ASSERT missing memory region w/ start/stop API for changes after dr_app_setup #2037

ASSERT missing memory region w/ start/stop API for changes after dr_app_setup #2037

derekbruening commented Oct 19, 2016

derekbruening commented Aug 9, 2017

derekbruening commented Aug 9, 2017 •

edited

Loading

derekbruening commented Mar 2, 2018

ASSERT missing memory region w/ start/stop API for changes after dr_app_setup #2037

ASSERT missing memory region w/ start/stop API for changes after dr_app_setup #2037

Comments

derekbruening commented Oct 19, 2016

derekbruening commented Aug 9, 2017

derekbruening commented Aug 9, 2017 • edited Loading

derekbruening commented Mar 2, 2018

derekbruening commented Aug 9, 2017 •

edited

Loading