ASSERT missing memory region w/ start/stop API for changes after dr_app_setup #2037
Comments
The solution of only supporting dr_app_setup_and_start() and dr_app_stop_and_cleanup() rather than having them split can still run into issues like this during attach if not-yet-attached threads are concurrently changing the address space.
Another assert like this is seen when running a test for #2601:
Looks like we don't update allmem in memcache_query_memory() after a miss.
Adds a complete maps file walk to update the memcache when re-taking over the process in dr_app_start(). The memcache is cleared beforehand to avoid both false positives and false negatives in later queries. This helps with issues caused by a gap between dr_app_setup() and dr_app_start().

Does not update the executable areas or module list: they are more difficult to re-walk, and the existing lazy updates to those will suffice for now, with a low risk of false positives.

Adds updating of the memcache on a query miss. Previously we would just continue to miss and walk the maps file every time.

Tested manually by disabling the i#2114 change so that a signal does a query, adding signals to the burst_threads test, and calling dr_app_setup() before creating the test's threads, causing queries to miss when delivering signals. It is difficult to create a regression test for this because the consequences are performance degradations rather than correctness problems, and these degradations only really show up at scale, with hundreds of threads whose missing stacks are queried at once with no caching.

Fixes #2037
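To illustrate the "update on a query miss" part of this change, here is a minimal sketch of a region cache that falls back to an expensive walk on a miss and then remembers the result. All names here (region_t, os_maps, walk_maps, cache_query) are hypothetical stand-ins, not DynamoRIO's actual memcache code, and the maps file is simulated by a static table so the behavior is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names are DynamoRIO's. */
typedef struct {
    uintptr_t start; /* inclusive */
    uintptr_t end;   /* exclusive */
} region_t;

/* Simulated OS view of the address space, i.e. what a maps file walk
 * would report. */
static const region_t os_maps[] = {
    { 0x1000, 0x3000 },
    { 0x8000, 0x9000 },
};
static int maps_walks; /* counts how often we take the expensive path */

#define CACHE_MAX 16
static region_t cache[CACHE_MAX];
static size_t cache_len;

/* Linear search for the region containing addr. */
static bool lookup(const region_t *tbl, size_t n, uintptr_t addr, region_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= tbl[i].start && addr < tbl[i].end) {
            *out = tbl[i];
            return true;
        }
    }
    return false;
}

/* Expensive path: stands in for walking the maps file. */
static bool walk_maps(uintptr_t addr, region_t *out)
{
    maps_walks++;
    return lookup(os_maps, sizeof(os_maps) / sizeof(os_maps[0]), addr, out);
}

/* Query the cache; on a miss, walk the maps and record what we found so
 * that the next query for the same region is a hit. */
static bool cache_query(uintptr_t addr, region_t *out)
{
    assert(out != NULL);
    if (lookup(cache, cache_len, addr, out))
        return true;
    if (!walk_maps(addr, out))
        return false;
    if (cache_len < CACHE_MAX)
        cache[cache_len++] = *out;
    return true;
}
```

With this shape, two queries into the same uncached region cost one walk rather than two. Without the insertion step in cache_query(), every query into that region would walk the maps again, which is the repeated-miss behavior described above.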
Refactors find_executable_vm_areas() to share its map entry skipping with the re-takeover re-walk from b06a702, but not its module list or executable area updates. This entry skipping for vmheap turns out to make a big performance difference when attaching.

Removes individual updates to memcache for entries inside vmheap which were already bulk-added for the find_executable_vm_areas() walk.

Issue: #2037
The query is for the signal frame:
This is thread 23277. Why is it using 7ff1a2c62000-7ff1a3462000 as its
stack if the maps shows a different region as the stack? Sigaltstack? My
test doesn't set up an alt stack though.
(gdb) p *(thread_sig_info_t *)dcontext->signal_field
$4 = {
  app_sigstack = {
    ss_sp = 0x0,
    ss_flags = 2,
    ss_size = 0
  },
  sigstack = {
    ss_sp = 0x46d59000,
    ss_flags = 0,
    ss_size = 57344
  },
}
So the region that the maps file labels as the stack is in fact DR's alt stack, and the
missing region must be the main app stack. How did our scan miss it? I suppose the reason
is that dr_app_setup() ran prior to thread creation, and the start/stop API code never added any handling of later changes.
Should we do a full maps file scan on every dr_app_start()?
Or we can try to lazily query the maps file whenever our cache misses.
But the cache might have false positives, not just negatives, which will
mess us up.
I'm not sure the code supports repeated scans; it might take a little work.
We'd need to throw out the cache first, too.
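To make the false-positive concern concrete, here is a small sketch of why a lazy query on a miss is not enough by itself, and why the cache has to be thrown out before a re-scan. It models the cache and the current address space as plain arrays. All names (region_set_t, set_add, set_contains, cache_rewalk) are hypothetical and do not reflect DR's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t start; /* inclusive */
    uintptr_t end;   /* exclusive */
} region_t;

#define MAX_REGIONS 16
typedef struct {
    region_t regions[MAX_REGIONS];
    size_t count;
} region_set_t;

/* True if addr lies inside any region of the set. */
static bool set_contains(const region_set_t *s, uintptr_t addr)
{
    for (size_t i = 0; i < s->count; i++) {
        if (addr >= s->regions[i].start && addr < s->regions[i].end)
            return true;
    }
    return false;
}

static void set_add(region_set_t *s, uintptr_t start, uintptr_t end)
{
    assert(s->count < MAX_REGIONS);
    s->regions[s->count].start = start;
    s->regions[s->count].end = end;
    s->count++;
}

/* Throw out the cache, then repopulate it from a full walk of the
 * current mappings (os_now stands in for the maps file contents). */
static void cache_rewalk(region_set_t *cache, const region_set_t *os_now)
{
    cache->count = 0; /* drop stale entries first */
    for (size_t i = 0; i < os_now->count; i++)
        set_add(cache, os_now->regions[i].start, os_now->regions[i].end);
}
```

Suppose the cache is filled at setup time with a region at 0x1000-0x2000, and before start the app unmaps it and maps a new thread stack at 0x7000-0x9000. The stale cache still reports 0x1800 as mapped (a false positive that a miss-driven lookup never corrects) and does not know about 0x7800 (a false negative). Calling cache_rewalk() at start time fixes both.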