Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASSERT dispatch.c:766 ks->depth > 2 && "stop_not_matching not allowed to clear kstack" #1140

Closed
derekbruening opened this issue Nov 28, 2014 · 12 comments

Comments

@derekbruening
Copy link
Contributor

From [email protected] on April 09, 2013 13:32:19

running startstop, in the main thread on entering dispatch on
dr_app_start().

05 00000000bfe7eb70 0000000015150141 dynamorio!internal_error+0x190 [d:\derek\dr\git\src\core\utils.c @ 190]
06 00000000bfe7ecf0 000000001514bf59 dynamorio!dispatch_enter_dynamorio+0xa61 [d:\derek\dr\git\src\core\dispatch.c @ 766]
07 00000000bfe7ee70 0000000015360820 dynamorio!dispatch+0x19 [d:\derek\dr\git\src\core\dispatch.c @ 147]
08 00000000bfe7efe0 00000000bfe61080 dynamorio!call_switch_stack+0x46 [D:\derek\dr\git\build_x64_dbg_tests\core\CMakeFiles\dynamorio.dir\x86\x86.asm.obj.s @ 1041]

when we execute dr_app_stop() and go native we run this code:
dispatch_at_stopping_point():
KSTOP_NOT_MATCHING(dispatch_num_exits);

dr_app_start() should hit this in dispatch_enter_dynamorio():
KSTART_DC(dcontext, dispatch_num_exits);

so it thinks it's not the first entrance to DR, and thus dispatch_num_exits
should be on the kstats stack. but it is the first entrance.

  wherewasi = WHERE_TRAMPOLINE (0n6)

0:000> ln @@(dcontext->last_exit)
(000000001546f81c) dynamorio!linkstub_native_exec_syscall | (000000001546f820) dynamorio!linkstub_shared_syscall_trace_fragment
Exact matches:
dynamorio!linkstub_native_exec_syscall = struct _linkstub_t
0:000> ?? dcontext->next_tag
unsigned char * 0xffffffff`ffffffff
0:000> ?? dcontext->thread_record->under_dynamo_control
char 0n0 ''

=> we came from a syscall hook that intends to stay native.

normally, native_exec gateway xfer to native happens from code cache, so we
have dispatch_num_exits on kstack. but dr_app_stop() pops the kstack.

easiest solution: have entering_native() pop kstack

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=1140

@derekbruening
Copy link
Contributor Author

From [email protected] on April 10, 2013 10:57:39

This issue was closed by revision r2044 .

Status: Fixed

@derekbruening
Copy link
Contributor Author

From [email protected] on April 12, 2013 08:26:02

reopen it as it still fail on api.startstop on my machine.

Status: Started

@derekbruening
Copy link
Contributor Author

From [email protected] on April 12, 2013 09:13:11

api.startstop calls to print function, which trigger the memory allocation system call ntdll!NtAllocateVirtualMemory

The control is then transferred to dispatch with dcontext->last_exit pointing to the linkstub_native_exec_syscall
0:000> ?? dcontext->last_exit
struct _linkstub_t * 0x5eefec34
0:000> ?? &linkstub_native_exec_syscall
struct _linkstub_t * 0x5eefec34
0:000> ?? dcontext->next_tag
unsigned char * 0xffffffff
""
0:000> ?? dcontext->thread_record->under_dynamo_control
char 0 ''

=> we came from a syscall hook that intends to stay native.

The execution continue to handle_system_call, then
enter_fcache(dcontext, fcache_enter, do_syscall);
where the target is
241624c0 64ff15c0000000 call dword ptr fs:[000000c0]
starts stepping
then
74602320 ea1e2760743300 jmp 0033:7460271e
then
ntdll!ZwQueryInformationProcess+0x12:
7700faca 83c404 add esp,0x4
then
241624c7 a35cbd0c24 mov [240cbd5c],eax ds:002b:240cbd5c=00000015
I am using the stepping, but not sure how the control transferred above.
Looks like the jmp should go to syscall instr, and then return to the caller.

then after the context switch
241620f8 6a00 push 0x0
241620fa 9d popfd
241620fb a3a0be0c24 mov [240cbea0],eax
24162100 6840bd0c24 push 0x240cbd40
24162105 e8f62bb33a call dynamorio!dispatch (5ec94d00)

The dcontext->last_exit is updated by
241620fb a3a0be0c24 mov [240cbea0],eax
and the control goes back to dispatch.

in dispatch, dispatch_enter_dynamorio is called, and the assert is fired:
if (dcontext->last_exit != get_native_exec_linkstub() &&
dcontext->last_exit != get_native_exec_syscall_linkstub())
KSTOP_NOT_MATCHING(dispatch_num_exits);

@derekbruening
Copy link
Contributor Author

From [email protected] on April 12, 2013 09:19:00

you're stepping through the wow64 "syscall": you must be using a 64-bit debugger

the syscall enter-exit sequence should be ok b/c it should go through regular code that does KSTART on dispatch_num_exits so stack should not be empty when it comes back out. on a go-native sequence comes out empty (b/c it may not come out: so it clears kstack).

@derekbruening
Copy link
Contributor Author

From [email protected] on April 12, 2013 11:39:10

On calling ntdll!NtAllocateVirtualMemory, the control is transferred to dispatch.
There the KSTAT depth is 2:
0:000> ?? dcontext->thread_kstats->stack_kstats
struct kstat_stack_t
+0x000 depth : 2
+0x008 last_start_time : 0x1e859a134be03 +0x010 last_end_time : 0x1e859a134be03
+0x018 node : [16] kstat_node_t

the value is decreased at dispatch_exit_fcache_stats:
else if (dcontext->last_exit == get_native_exec_syscall_linkstub()) {
LOG(THREAD, LOG_DISPATCH, 2, "Exit from native_exec syscall trampoline\n");
STATS_INC(num_exits_native_exec_syscall);
/* may be a quite large kstat count */
KSWITCH_STOP_NOT_PROPAGATED(native_exec_fcache);
return;
}

0:000> ?? dcontext->thread_kstats->stack_kstats
struct kstat_stack_t
+0x000 depth : 1
+0x008 last_start_time : 0x1e8a14e262c38 +0x010 last_end_time : 0x1e8a14e262c38
+0x018 node : [16] kstat_node_t
0:000> kn

ChildEBP RetAddr

00 2546ee2c 5ec97f22 dynamorio!dispatch_exit_fcache_stats+0x2624 [core\dispatch.c @ 1207]
01 2546ef18 5ec94d14 dynamorio!dispatch_enter_dynamorio+0x7a2 [core\dispatch.c @ 765]

then the value is changed to 2 at dispatch_enter_dynamorio
/* KSWITCHed next time around for a better explanation */
KSTART_DC(dcontext, dispatch_num_exits);

Later the the depth changes in pairs, i.e. START/STOP and finally end up in the assert, where depth is 2.

@derekbruening
Copy link
Contributor Author

From [email protected] on April 12, 2013 12:29:08

normal fcache entry does:
/* KSWITCHed next time around for a better explanation /
KSTART_DC(dcontext, dispatch_num_exits);
and then:
KSTART(fcache_default); /
fcache_bb_bb or fcache_bb_trace */
and then on coming back to dispatch, dispatch_exit_fcache_stats() does:
KSWITCH_STOP_NOT_PROPAGATED(fcache_default);
and then something like this also in dispatch_exit_fcache_stats():
KSWITCH(num_exits_dir_miss);
and then in dispatch_enter_dynamorio():
KSTOP_NOT_MATCHING(dispatch_num_exits);

both pops are avoided if we came from WHERE_APP
i.e., 1st time in
that's why dr_app_stop() pops dispatch_num_exits
never pushed 2nd one, so kstack clear now

which is fine if we next come to dispatch for dr_app_start
but if we come for n_e_syscall, that's where issue #1140 hits

dispatch_enter_native() for BACK_TO_NATIVE_AFTER_SYSCALL does:
/* restore state as though never came out for syscall */
KSTART_DC(dcontext, fcache_default);

my fix seems incomplete: must have thought native_exec_syscalls went
through entering_native(), but it does not.

one way to fix for real:
check depth and if empty push both on in syscall_while_native?
then revert the conditional on the kstop

Owner: [email protected]

@derekbruening
Copy link
Contributor Author

From [email protected] on April 12, 2013 15:07:07

better fix:
good to have kstats on time spent native, so let's always have them, and
who cares if left there on stop that never comes back.
=> revert my diff, remove the stop from dispatch_at_stopping_point(), and
have the dispatch_enter_native() do KSTART_DC(dcontext, fcache_default) for
both cases and not just BACK_TO_NATIVE_AFTER_SYSCALL

@derekbruening
Copy link
Contributor Author

From [email protected] on April 13, 2013 20:42:26

still have problems.
When enter native, we set dcontext->whereami to be WHERE_APP.
In dispatch_enter_dynamorio, WHERE_APP is used as check if it is the first entrance.
So in api.startstop test, whenever we enter via start api, we did not stop dispatch_num_exits, but start a new dispatch_num_exits.

@derekbruening
Copy link
Contributor Author

From [email protected] on April 13, 2013 20:57:37

Why do we have code in dispatch_enter_dynamorio like this:

if (wherewasi == WHERE_APP) { /* first entrance */
   ...
} else {
   ...
   KSTOP_NOT_MATCHING(dispatch_num_exits);
}
/* KSWITCHed next time around for a better explanation */
KSTART_DC(dcontext, dispatch_num_exits);

It means dispatch_num_exits is start almost all the time.

@derekbruening
Copy link
Contributor Author

From [email protected] on April 07, 2014 11:45:16

Labels: Component-Tests

@derekbruening
Copy link
Contributor Author

From [email protected] on April 08, 2014 10:47:13

taking over

after fixing issue #813 I fixed this one too:

what I ended up doing was adding a check on KSWITCH_STOP_NOT_PROPAGATED for
get_native_exec_syscall_linkstub() for being native due to start/stop,
which is different from native_exec native as start/stop has the kstack
cleared.

Owner: [email protected]

@derekbruening
Copy link
Contributor Author

From [email protected] on April 08, 2014 12:12:59

This issue was closed by revision r2640 .

Status: Fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant