From be68fccf5bc5574155a0a972cb8136c59d2bced3 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Mon, 29 Jun 2020 10:10:13 -0400 Subject: [PATCH 1/7] i#803: Cross-arch Windows injection Adds a long-missing feature: following into a Windows child process of a different bitwidth. Switches injection from DR and from drinjectlib (including drrun and drinject) to use -early_inject_map. This was most easily done by turning on -early_inject by default as well. However, the -early_inject_location default is INJECT_LOCATION_ImageEntry, which is the same late takeover point as with thread injection. Switching all injection over to map-from-the-parent simplifies cross-arch following, as well as making it easier to shift the takeover point to an earlier spot in the future. This is a step toward #607 by switching drinjectlib to use map injection; the takeover point, as mentioned, is still the image entry. Adds an -inject_x64 option to inject a 64-bit DR lib into a 32-bit child from a 64-bit parent, but this option is only sketched out and is not fully supported yet: #49 covers adding tests and official support. Adds library swapping code to find the other-bitwidth library, which assumes a parallel directory structure. Add a new fatal error if the library for a child is not found. To support generating code for all 3 child-parent cases (same-same, 32-64, and 64-32), and in particular for 32-64, switches the small gencode sequence for -early_inject_map from using IR to using raw bytes. A multi-arch encoder (#1684) would help but we would need cross-bitwidth support there, which is not on the horizon. Fixes what look like bugs in the original gencode generation along the way (s/pc/cur_local_pos/ and s/local_code_buf/remote_code_buf/): it's not clear how it worked before. Adds support for several system calls from a 32-bit parent to a 64-bit child where the desired NtWow64* system call does not exist. 
We use switch_modes_and_call() for NtProtectVirtualMemory and NtQueryVirtualMemory. Changes all types in the injection code to handle 64-bit addresses in 32-bit code. Adds UNICODE_STRING_32 and RTL_USER_PROCESS_PARAMETERS_32 for handling 32-bit structures from 64-bit parents. Similarly, adds RTL_USER_PROCESS_PARAMETERS_64 and PROCESS_BASIC_INFORMATION64. Adds get_process_imgname_cmdline() capability for 64-bit remote from 32-bit. Adds get_remote_proc_address() and uses it to look up dynamorio_earliest_init_takeover() in a child DR. Finds the remote ntdll base via a remote query memory walk plus remote image header parsing. This requires adding a switch_modes_and_call() version of NtQueryVirtualMemory (also mentioned above), which needs 64-bit args: so we refactor switch_modes_and_call() to take in a struct of all 64-bit fields for the args. Fixes a few bugs in other routines to properly get the image name and image entry for 32-bit children of 64-bit parents. Updates environment variable propagation code to handle a 32-bit parent and a 64-bit child. Updates a 64-bit parent and 32-bit child to insert the variables into the 32-bit PEB (64-bit does no good), which requires finding the 32-bit PEB. This is done via the 32-bit TEB, using a hack due to what seems like a kernel bug where it has the TebBaseAddress 0x2000 too low. Makes environment variable propagation failures fatal and visible, unlike previously where errors would just result in silently letting the child run natively. Turns some other prior soft errors into fatal errors on child takeover. Moves environment variable propagation to post-CreateUserProcess instead of waiting for ResumeThread, which avoids having to get the thread context (for which we have no other-bitwidth support) to figure out whether it's the first thread in the process or not. We bail on propagation for pre-Vista where we'd have to wait for ResumeThread.
Generalizes the other-bitwidth Visual Studio toolchain environment variable setting for use in a new build-and-test other-bitwidth test which builds dynamorio and the large_options client (to ensure options are propagated to children; and it has convenient init and exit time prints) for the other bitwidth, arranges parallel lib dirs, and runs the other client Issue: #803, #147, #607, #49 Fixes #803 --- core/CMakeLists.txt | 8 +- core/arch/arch.c | 2 +- core/drlibc/drlibc.h | 19 +- core/drlibc/drlibc_x86.asm | 51 ++-- core/module_shared.h | 4 +- core/optionsx.h | 41 +-- core/os_shared.h | 5 +- core/vmareas.c | 9 +- core/win32/events.mc | 18 +- core/win32/inject.c | 423 ++++++++++++++++++------------ core/win32/inject_shared.c | 82 +++--- core/win32/injector.c | 16 +- core/win32/module.c | 4 +- core/win32/module_shared.c | 301 ++++++++++++++++++--- core/win32/ntdll.c | 77 +++++- core/win32/ntdll.h | 98 +++++-- core/win32/ntdll_shared.c | 140 ++++++++-- core/win32/ntdll_shared.h | 31 ++- core/win32/ntdll_types.h | 10 + core/win32/os.c | 99 +++++-- core/win32/os_private.h | 16 +- core/win32/pre_inject.c | 11 +- core/win32/syscall.c | 413 +++++++++++++++++------------ suite/runsuite_common_pre.cmake | 210 +++++++++------ suite/tests/CMakeLists.txt | 75 +++++- suite/tests/win32/xarch.templatex | 20 ++ 26 files changed, 1551 insertions(+), 632 deletions(-) create mode 100644 suite/tests/win32/xarch.templatex diff --git a/core/CMakeLists.txt b/core/CMakeLists.txt index 543bd426f19..3bcf74ff6a9 100644 --- a/core/CMakeLists.txt +++ b/core/CMakeLists.txt @@ -1,5 +1,5 @@ # ********************************************************** -# Copyright (c) 2010-2020 Google, Inc. All rights reserved. +# Copyright (c) 2010-2021 Google, Inc. All rights reserved. # Copyright (c) 2009-2010 VMware, Inc. All rights reserved. 
# ********************************************************** @@ -399,8 +399,8 @@ else (UNIX) win32/syscall.c win32/callback.c win32/drmarker.c - win32/ntdll.c win32/ntdll_shared.c + win32/ntdll.c win32/inject.c win32/inject_shared.c win32/module.c @@ -424,8 +424,8 @@ else (UNIX) ) set(PRELOAD_SRCS win32/pre_inject.c - win32/ntdll.c win32/ntdll_shared.c + win32/ntdll.c win32/inject_shared.c win32/drmarker.c ${preinject_asm_src} @@ -439,10 +439,10 @@ else (UNIX) set(INJECTOR_SRCS win32/injector.c win32/inject.c - win32/ntdll.c win32/inject_shared.c win32/module_shared.c win32/ntdll_shared.c + win32/ntdll.c win32/resources.rc config.c win32/os.c diff --git a/core/arch/arch.c b/core/arch/arch.c index bd7762fc99d..19ae45006e4 100644 --- a/core/arch/arch.c +++ b/core/arch/arch.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2000-2010 VMware, Inc. All rights reserved. * **********************************************************/ diff --git a/core/drlibc/drlibc.h b/core/drlibc/drlibc.h index d223692ca17..04e3631e991 100644 --- a/core/drlibc/drlibc.h +++ b/core/drlibc/drlibc.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2000-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -131,6 +131,23 @@ find_script_interpreter(OUT script_interpreter_t *result, IN const char *fname, */ void d_r_set_ss_selector(); + +typedef struct { + uint64 func; + uint64 arg1; + uint64 arg2; + uint64 arg3; + uint64 arg4; + uint64 arg5; + uint64 arg6; +} invoke_func64_t; + +/* Switches from 32-bit mode to 64-bit mode and invokes func, passing + * arg1, arg2, arg3, arg4, and arg5. 
Works fine when func takes fewer + * than 5 args as well. + */ +int +switch_modes_and_call(invoke_func64_t *info); #endif #endif /* _DR_LIBC_H_ */ diff --git a/core/drlibc/drlibc_x86.asm b/core/drlibc/drlibc_x86.asm index 9ca1a5f38f8..20972d39864 100644 --- a/core/drlibc/drlibc_x86.asm +++ b/core/drlibc/drlibc_x86.asm @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2001-2010 VMware, Inc. All rights reserved. * ********************************************************** */ @@ -530,20 +530,19 @@ GLOBAL_LABEL(FUNCNAME:) END_FUNC(FUNCNAME) /* - * int switch_modes_and_call(uint64 func, void *arg1, void *arg2, void *arg3) + * int switch_modes_and_call(invoke_func64_t *info) */ # undef FUNCNAME # define FUNCNAME switch_modes_and_call DECLARE_FUNC(FUNCNAME) GLOBAL_LABEL(FUNCNAME:) - mov eax, esp /* get.. */ - add eax, ARG_SZ /* ...address of func */ - mov ecx, ARG3 /* arg1 */ - mov edx, ARG4 /* arg2 */ - /* save callee-saved registers */ + mov eax, ARG1 + /* Save callee-saved registers. */ push ebx - mov ebx, ARG6 /* really ARG5==arg3, but we have 1 push */ - /* far jmp to next instr w/ 64-bit switch: jmp 0033: */ + push esi + push edi + push ebp + /* Far jmp to next instr w/ 64-bit switch: jmp 0033:. */ RAW(ea) DD offset smc_transfer_to_64 DB CS64_SELECTOR @@ -552,26 +551,34 @@ smc_transfer_to_64: /* Below here is executed in 64-bit mode, but with guarantees that * no address is above 4GB, as this is a WOW64 process. */ - /* save WOW64 state */ + /* Save WOW64 callee-saved registers. */ RAW(41) push esp /* push r12 */ RAW(41) push ebp /* push r13 */ RAW(41) push esi /* push r14 */ RAW(41) push edi /* push r15 */ - RAW(44) mov eax, ebx /* mov arg3 from ebx to r8d (3rd arg slot) */ - /* align the stack pointer */ + /* Align the stack pointer. 
*/ mov ebx, esp /* save esp in callee-preserved reg */ - sub esp, 32 /* call conv */ and esp, HEX(fffffff0) /* align to 16-byte boundary */ - /* arg1 is already in rcx, arg2 in rdx, and arg3 now in r8 */ - RAW(48) mov eax, DWORD [eax] /* mov rax, qword ptr [rax] */ + /* Set up args on the stack. */ + RAW(48) mov ecx, DWORD [eax + 8*6] /* load args.arg6 */ + push ecx /* push args.arg6 */ + RAW(48) mov ecx, DWORD [eax + 8*5] /* load args.arg5 */ + push ecx /* push args.arg5 */ + sub esp, 32 /* Leave slots for args 1-4. */ + /* arg1 is already in rcx, arg2 in rdx, arg3 in r8, arg4 in r9 */ + RAW(4c) mov ecx, DWORD [eax + 8*4] /* load args.arg4 into r9 */ + RAW(4c) mov eax, DWORD [eax + 8*3] /* load args.arg3 into r8 */ + RAW(48) mov edx, DWORD [eax + 8*2] /* load args.arg2 into rdx */ + RAW(48) mov ecx, DWORD [eax + 8*1] /* load args.arg1 into rcx */ + RAW(48) mov eax, DWORD [eax] /* load args.func into rax */ call eax /* call rax */ - mov esp, ebx /* restore esp */ - /* restore WOW64 state */ + mov esp, ebx /* restore rsp */ + /* Restore WOW64 callee-saved regs. */ RAW(41) pop edi /* pop r15 */ RAW(41) pop esi /* pop r14 */ RAW(41) pop ebp /* pop r13 */ RAW(41) pop esp /* pop r12 */ - /* far jmp to next instr w/ 32-bit switch: jmp 0023: */ + /* Far jmp to next instr w/ 32-bit switch: jmp 0023:. */ push offset smc_return_to_32 /* 8-byte push */ mov dword ptr [esp + 4], CS32_SELECTOR /* top 4 bytes of prev push */ jmp fword ptr [esp] @@ -586,8 +593,12 @@ smc_return_to_32: */ mov ebx, DWORD SYMREF(d_r_ss_value) mov ss, ebx - pop ebx /* restore callee-saved reg */ - ret /* return value already in eax */ + /* Restore callee-saved regs. 
*/ + pop ebp + pop edi + pop esi + pop ebx + ret /* return value already in eax */ END_FUNC(FUNCNAME) #endif /* WINDOWS && !X64 */ diff --git a/core/module_shared.h b/core/module_shared.h index 1bb98acb115..80446e8a602 100644 --- a/core/module_shared.h +++ b/core/module_shared.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2008-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -376,7 +376,7 @@ get_proc_address_resolve_forward(module_base_t lib, const char *name); #endif /* WINDOWS */ #ifdef WINDOWS -void * +uint64 get_remote_process_entry(HANDLE process_handle, OUT bool *x86_code); #endif diff --git a/core/optionsx.h b/core/optionsx.h index e3bc1e3a2ed..70de038385c 100644 --- a/core/optionsx.h +++ b/core/optionsx.h @@ -1,5 +1,5 @@ /* ******************************************************************************* - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2011 Massachusetts Institute of Technology All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. * *******************************************************************************/ @@ -566,6 +566,11 @@ OPTION_DEFAULT_INTERNAL(bool, mangle_app_seg, "mangle application's segment usage.") #endif /* X86 */ #ifdef X64 +# ifdef WINDOWS +/* TODO i#49: This option is still experimental and is not fully tested/supported yet. */ +OPTION_DEFAULT(bool, inject_x64, false, + "Inject 64-bit DynamoRIO into 32-bit child processes.") +# endif OPTION_COMMAND(bool, x86_to_x64, false, "x86_to_x64", { /* i#1494: to avoid decode_fragment messing up the 32-bit/64-bit mode, @@ -1694,10 +1699,12 @@ DYNAMIC_OPTION_DEFAULT( * even _init is run needs to have a non-early default. 
* Thus we turn this on in privload_early_inject. */ -OPTION_COMMAND(bool, early_inject, - IF_WINDOWS_ELSE - /* i#980: too early for kernel32 so we disable */ - (IF_CLIENT_INTERFACE_ELSE(false, true), false), +/* On Windows this does *not* imply early injection anymore: it just enables control + * over where to inject via a hook and alternate injection methods, rather than using + * the old thread injection. + * XXX: Clean up by removing this option and thread injection completely? + */ +OPTION_COMMAND(bool, early_inject, IF_UNIX_ELSE(false /*see above*/, true), "early_inject", { if (options->early_inject) { @@ -1706,20 +1713,16 @@ OPTION_COMMAND(bool, early_inject, } }, "inject early", STATIC, OP_PCACHE_GLOBAL) -#if 0 /* FIXME i#234 NYI: not ready to enable just yet */ - OPTION_DEFAULT(bool, early_inject_map, true, "inject earliest via map") - /* see enum definition is os_shared.h for notes on what works with which - * os version */ - OPTION_DEFAULT(uint, early_inject_location, 5 /* INJECT_LOCATION_KiUserApc */, - "where to hook for early_injection. default is earliest injection: anything else will be later.") -#else -OPTION_DEFAULT(bool, early_inject_map, false, "inject earliest via map") -/* see enum definition is os_shared.h for notes on what works with which - * os version */ -OPTION_DEFAULT(uint, early_inject_location, 4 /* INJECT_LOCATION_LdrDefault */, - "where to hook for early_injection. default is earliest injection: " - "anything else will be later.") -#endif +/* To support cross-arch follow-children injection we need to use the map option. */ +OPTION_DEFAULT(bool, early_inject_map, true, "inject earliest via map") +/* See enum definition in os_shared.h for notes on what works with which + * os version. Our default is late injection to make it easier on clients + * (as noted in i#980, we don't want to be too early for a private kernel32). 
+ */ +OPTION_DEFAULT(uint, early_inject_location, 7 /* INJECT_LOCATION_ImageEntry */, + "where to hook for early_injection. Use 5 == " + "INJECT_LOCATION_KiUserApc for earliest injection; use " + "4 == INJECT_LOCATION_LdrDefault for easier-but-still-early.") OPTION_DEFAULT(uint_addr, early_inject_address, 0, "specify the address to hook at for INJECT_LOCATION_LdrCustom") #ifdef WINDOWS /* probably the surrounding options should also be under this ifdef */ diff --git a/core/os_shared.h b/core/os_shared.h index d4c4ad7df18..65bb2bc5b66 100644 --- a/core/os_shared.h +++ b/core/os_shared.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -636,6 +636,7 @@ typedef struct _dr_mem_info_t { * It uses a function call so be careful where performance is critical. */ #define PAGE_START(x) (((ptr_uint_t)(x)) & ~(os_page_size() - 1)) +#define PAGE_START64(x) (((uint64)(x)) & ~((uint64)os_page_size() - 1)) size_t os_page_size(void); @@ -1256,11 +1257,9 @@ enum { JMP_REL32_OPCODE = 0xe9, JMP_REL32_SIZE = 5, /* size in bytes of 32-bit rel jmp */ CALL_REL32_OPCODE = 0xe8, -# ifdef X64 JMP_ABS_IND64_OPCODE = 0xff, JMP_ABS_IND64_SIZE = 6, /* size in bytes of a 64-bit abs ind jmp */ JMP_ABS_MEM_IND64_MODRM = 0x25, -# endif }; #elif defined(AARCHXX) enum { diff --git a/core/vmareas.c b/core/vmareas.c index 13e1dc6180a..4550ad1eed5 100644 --- a/core/vmareas.c +++ b/core/vmareas.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2002-2010 VMware, Inc. All rights reserved. 
* **********************************************************/ @@ -6523,9 +6523,10 @@ app_memory_protection_change_internal(dcontext_t *dcontext, bool update_areas, os_terminate(dcontext, TERMINATE_PROCESS); ASSERT_NOT_REACHED(); } else { - SYSLOG_INTERNAL_WARNING_ONCE("Application changing protections of " - "%s memory at least once (" PFX "-" PFX ")", - target_area_name, base, base + size); + /* On Win10 this happens in every run so we do not syslog. */ + LOG(THREAD, LOG_VMAREAS, 1, + "Application changing protections of %s memory (" PFX "-" PFX ")", + target_area_name, base, base + size); if (how_handle == DR_MODIFY_NOP) { /* we use a separate list, rather than a flag on DR areas, as the * affected region could include non-DR memory diff --git a/core/win32/events.mc b/core/win32/events.mc index ebeb86d6a65..27889f012b2 100644 --- a/core/win32/events.mc +++ b/core/win32/events.mc @@ -1,5 +1,5 @@ ;// ********************************************************** -;// Copyright (c) 2012-2020 Google, Inc. All rights reserved. +;// Copyright (c) 2012-2021 Google, Inc. All rights reserved. ;// Copyright (c) 2003-2010 VMware, Inc. All rights reserved. ;// ********************************************************** @@ -702,4 +702,20 @@ Language=English Application %1!s! (%2!s!). Private library static TLS limit crossed: %3!s! . +MessageId = +Severity = Error +Facility = DRCore +SymbolicName = MSG_INJECTION_LIBRARY_MISSING +Language=English +Application %1!s! (%2!s!). The library %3!s! for child process injection is missing. +. + +MessageId = +Severity = Error +Facility = DRCore +SymbolicName = MSG_FOLLOW_CHILD_FAILED +Language=English +Application %1!s! (%2!s!). Failed to follow into child process: %3!s!. +. 
+ ;// ADD NEW MESSAGES HERE diff --git a/core/win32/inject.c b/core/win32/inject.c index fb558aacf11..9f9986d05dc 100644 --- a/core/win32/inject.c +++ b/core/win32/inject.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2019 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2000-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -381,8 +381,10 @@ enum { CALL_REL32 = 0xe8, CALL_RM32 = 0xff, CALL_EAX_RM = 0xd0, + JMP_FAR_DIRECT = 0xea, MOV_RM32_2_REG32 = 0x8b, + MOV_REG32_2_RM32 = 0x89, MOV_ESP_2_EAX_RM = 0xc4, MOV_EAX_2_ECX_RM = 0xc8, MOV_EAX_2_EDX_RM = 0xd0, @@ -395,22 +397,68 @@ enum { MOV_IMM_XAX = 0xb8, ADD_EAX_IMM32 = 0x05, + AND_RM32_IMM32 = 0x81, CMP_EAX_IMM32 = 0x3d, JZ_REL8 = 0x74, JNZ_REL8 = 0x75, -#ifdef X64 REX_W = 0x48, REX_B = 0x41, REX_R = 0x44, -#endif }; #define DEBUG_LOOP 0 #define ASSERT_ROOM(cur, buf, maxlen) ASSERT(cur + maxlen < buf + sizeof(buf)) +#define RAW_INSERT_INT16(pos, value) \ + do { \ + ASSERT(CHECK_TRUNCATE_TYPE_short((ptr_int_t)(value))); \ + *(short *)(pos) = (short)(value); \ + (pos) += sizeof(short); \ + } while (0) + +#define RAW_INSERT_INT32(pos, value) \ + do { \ + ASSERT(CHECK_TRUNCATE_TYPE_int((ptr_int_t)(value))); \ + *(int *)(pos) = (int)(ptr_int_t)(value); \ + (pos) += sizeof(int); \ + } while (0) + +#define RAW_INSERT_INT64(pos, value) \ + do { \ + *(int64 *)(pos) = (int64)(value); \ + (pos) += sizeof(int64); \ + } while (0) + +#define RAW_INSERT_INT8(pos, value) \ + do { \ + ASSERT(CHECK_TRUNCATE_TYPE_sbyte((int)value)); \ + *(char *)(pos) = (char)(value); \ + (pos) += sizeof(char); \ + } while (0) + +#define RAW_PUSH_INT64(pos, value) \ + do { \ + *(pos)++ = PUSH_IMM32; \ + RAW_INSERT_INT32(pos, (int)value); \ + /* Push is sign-extended, so we can skip top half if top 33 bits are 0. 
*/ \ + if ((uint64)(value) >= 0x80000000UL) { \ + *(pos)++ = MOV_IMM32_2_RM32; \ + *(pos)++ = 0x44; \ + *(pos)++ = 0x24; \ + *(pos)++ = 0x04; /* xsp+4 */ \ + RAW_INSERT_INT32(pos, (value) >> 32); \ + } \ + } while (0) + +#define RAW_PUSH_INT32(pos, value) \ + do { \ + *(pos)++ = PUSH_IMM32; \ + RAW_INSERT_INT32(pos, value); \ + } while (0) + /* i#142, i#923: 64-bit support now works regardless of where the hook * location and the allocated remote_code_buffer are. * @@ -423,8 +471,9 @@ enum { * bitwidths, which actually might be easier w/ the macros for 32-to-64. */ -/* If reachable is non-NULL, ensures the resulting allocation is +/* If reachable is non-0, ensures the resulting allocation is * 32-bit-disp-reachable from [reachable, reachable+PAGE_SIZE). + * For injecting into 64-bit from 32-bit, uses only low addresses. */ static byte * allocate_remote_code_buffer(HANDLE phandle, size_t size, byte *reachable) @@ -448,6 +497,12 @@ allocate_remote_code_buffer(HANDLE phandle, size_t size, byte *reachable) MEMORY_BASIC_INFORMATION mbi; size_t got; do { + /* We do now have remote_query_virtual_memory_maybe64() available, but we + * do not yet have allocation (win8+ only) or free (would have to make + * one via switch_modes_and_call()) routines, and using low addresses should + * always work. We thus stick with 32-bit pointers here even for 64-bit + * child processes. + */ res = nt_remote_query_virtual_memory(phandle, pc, &mbi, sizeof(mbi), &got); if (got != sizeof(mbi)) { /* bail and hope a low address works, which it will pre-win8 */ @@ -467,6 +522,10 @@ allocate_remote_code_buffer(HANDLE phandle, size_t size, byte *reachable) * STATUS_CONFLICTING_ADDRESSES. Yet a local commit works, and a remote * reserve+commit works. Go figure. */ + /* See above: we use only low addresses. To support high we'd need to add + * allocate and free routines via switch_modes_and_call() (we can use + * NtWow64AllocateVirtualMemory64 on win8+). 
+ */ res = nt_remote_allocate_virtual_memory(phandle, &buf, size, PAGE_EXECUTE_READWRITE, MEM_RESERVE); if (NT_SUCCESS(res)) { @@ -475,7 +534,8 @@ allocate_remote_code_buffer(HANDLE phandle, size_t size, byte *reachable) } /* We know buf at low end reaches, but might have gone too high. */ - if (!NT_SUCCESS(res) || !REL32_REACHABLE(buf + size, (byte *)reachable)) { + if (!NT_SUCCESS(res) || + (reachable != 0 && !REL32_REACHABLE(buf + size, (byte *)reachable))) { #ifndef NOT_DYNAMORIO_CORE_PROPER SYSLOG_INTERNAL_ERROR("failed to allocate child memory for injection"); #endif @@ -487,10 +547,15 @@ allocate_remote_code_buffer(HANDLE phandle, size_t size, byte *reachable) static bool free_remote_code_buffer(HANDLE phandle, byte *base) { + /* There seems to be no such thing as NtWow64FreeVirtualMemory64! + * allocate_remote_code_buffer() is using low address though, so we're good + * to use 32-bit pointers even for 64-bit children. + */ NTSTATUS res = nt_remote_free_virtual_memory(phandle, base); return NT_SUCCESS(res); } +/* Does not support a 64-bit child of a 32-bit DR. 
*/ static void * inject_gencode_at_ldr(HANDLE phandle, char *dynamo_path, uint inject_location, void *inject_address, void *hook_location, @@ -589,10 +654,7 @@ inject_gencode_at_ldr(HANDLE phandle, char *dynamo_path, uint inject_location, PAGE_READONLY, &old_prot); ASSERT(res); -#define INSERT_INT(value) \ - ASSERT(CHECK_TRUNCATE_TYPE_int((ptr_int_t)(value))); \ - *(int *)cur_local_pos = (int)(value); \ - cur_local_pos += sizeof(int) +#define INSERT_INT(value) RAW_INSERT_INT32(cur_local_pos, value) #define INSERT_ADDR(value) \ *(ptr_int_t *)cur_local_pos = (ptr_int_t)(value); \ @@ -658,27 +720,14 @@ inject_gencode_at_ldr(HANDLE phandle, char *dynamo_path, uint inject_location, # define INSERT_POP_ALL_REG() *cur_local_pos++ = POPA #endif -#define PUSH_IMMEDIATE(value) \ - *cur_local_pos++ = PUSH_IMM32; \ - INSERT_INT(value) +#define PUSH_IMMEDIATE(value) RAW_PUSH_INT32(cur_local_pos, value) #define PUSH_SHORT_IMMEDIATE(value) \ *cur_local_pos++ = PUSH_IMM8; \ *cur_local_pos++ = value #ifdef X64 -# define PUSH_PTRSZ_IMMEDIATE(value) \ - do { \ - *cur_local_pos++ = PUSH_IMM32; \ - INSERT_INT((int)(value)); \ - if ((ptr_uint_t)(value) >= 0x80000000) { \ - *cur_local_pos++ = MOV_IMM32_2_RM32; \ - *cur_local_pos++ = 0x44; \ - *cur_local_pos++ = 0x24; \ - *cur_local_pos++ = 0x04; /*rsp+4*/ \ - INSERT_INT((int)((value) >> 32)); \ - } \ - } while (0) +# define PUSH_PTRSZ_IMMEDIATE(value) RAW_PUSH_INT64(cur_local_pos, value) #else # define PUSH_PTRSZ_IMMEDIATE(value) PUSH_IMMEDIATE(value) #endif @@ -775,7 +824,7 @@ inject_gencode_at_ldr(HANDLE phandle, char *dynamo_path, uint inject_location, /* ecx will hold OldProtection afterwards */ /* for x64 we need the 4 stack slots anyway so we do the pushes */ /* on x64, up to caller to have rsp aligned to 16 prior to calling this macro */ -#define PROT_IN_ECX 0xbad15bad /* doesn't match a PAGE_* define */ +#define PROT_IN_ECX 0xbad5bad /* doesn't match a PAGE_* define */ #define CHANGE_PROTECTION(start, size, new_protection) \ 
*cur_local_pos++ = PUSH_EAX; /* OldProtect slot */ \ MOV_ESP_TO_EAX(); /* get &OldProtect */ \ @@ -1075,8 +1124,8 @@ generate_switch_mode_jmp_to_hook(HANDLE phandle, byte *local_code_buf, sz = (size_t)(pc - local_code_buf); /* copy local buffer to child process */ - if (!nt_write_virtual_memory(phandle, mode_switch_buf, local_code_buf, - pc - local_code_buf, &num_bytes_out) || + if (!write_remote_memory_maybe64(phandle, (uint64)mode_switch_buf, local_code_buf, + pc - local_code_buf, &num_bytes_out) || num_bytes_out != sz) { return false; } @@ -1084,18 +1133,54 @@ generate_switch_mode_jmp_to_hook(HANDLE phandle, byte *local_code_buf, } #endif -static byte * -inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, void *hook_location, +static uint64 +find_remote_ntdll_base(HANDLE phandle, bool find64bit) +{ + MEMORY_BASIC_INFORMATION64 mbi; + uint64 got; + NTSTATUS res; + uint64 addr = 0; + char name[MAXIMUM_PATH]; + do { + res = remote_query_virtual_memory_maybe64(phandle, addr, &mbi, sizeof(mbi), &got); + if (got != sizeof(mbi) || !NT_SUCCESS(res)) + break; +#if VERBOSE + print_file(STDERR, "0x%I64x-0x%I64x type=0x%x state=0x%x\n", mbi.BaseAddress, + mbi.BaseAddress + mbi.RegionSize, mbi.Type, mbi.State); +#endif + if (mbi.Type == MEM_IMAGE && mbi.BaseAddress == mbi.AllocationBase) { + bool is_64; + if (get_remote_dll_short_name(phandle, mbi.BaseAddress, name, + BUFFER_SIZE_ELEMENTS(name), &is_64)) { +#if VERBOSE + print_file(STDERR, "found |%s| @ 0x%I64x 64=%d\n", name, mbi.BaseAddress, + is_64); +#endif + if (strcmp(name, "ntdll.dll") == 0 && BOOLS_MATCH(find64bit, is_64)) + return mbi.BaseAddress; + } + } + if (addr + mbi.RegionSize < addr) + break; + addr += mbi.RegionSize; + } while (true); + return 0; +} + +static uint64 +inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, uint64 hook_location, byte hook_buf[EARLY_INJECT_HOOK_SIZE], byte *map, void *must_reach, bool x86_code, bool late_injection) { - instrlist_t ilist; - byte 
*remote_code_buf = NULL, *local_code_buf = NULL, *pc, *remote_data; - byte *hook_code_buf = NULL; + uint64 remote_code_buf = 0, remote_data; + byte *local_code_buf = NULL; + uint64 pc; + uint64 hook_code_buf = 0; const size_t remote_alloc_sz = 2 * PAGE_SIZE; /* one code, one data */ const size_t code_alloc_sz = PAGE_SIZE; size_t hook_code_sz = PAGE_SIZE; - void *switch_code_location = hook_location; + uint64 switch_code_location = hook_location; #ifdef X64 byte *mode_switch_buf = NULL; byte *mode_switch_data = NULL; @@ -1106,10 +1191,14 @@ inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, void *hook_locat uint old_prot; earliest_args_t args; int i; - - /* generate code and data */ - remote_code_buf = allocate_remote_code_buffer(phandle, remote_alloc_sz, must_reach); - if (remote_code_buf == NULL) + bool target_64 = !x86_code IF_X64(|| DYNAMO_OPTION(inject_x64)); + + /* Generate code and data. */ + /* We only support low-address remote allocations. */ + IF_NOT_X64(ASSERT(!target_64 || must_reach == NULL)); + remote_code_buf = + (uint64)allocate_remote_code_buffer(phandle, remote_alloc_sz, must_reach); + if (remote_code_buf == 0) goto error; /* we can't use heap_mmap() in drinjectlib */ @@ -1120,14 +1209,14 @@ inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, void *hook_locat ASSERT(sizeof(args) < PAGE_SIZE); #ifdef X64 - if (x86_code) { - mode_switch_buf = remote_code_buf; - switch_code_location = mode_switch_buf; - mode_switch_data = remote_data; + if (x86_code && DYNAMO_OPTION(inject_x64)) { + mode_switch_buf = (byte *)remote_code_buf; + switch_code_location = (uint64)mode_switch_buf; + mode_switch_data = (byte *)remote_data; remote_data += switch_data_sz; switch_code_sz = generate_switch_mode_jmp_to_hook( - phandle, local_code_buf, mode_switch_buf, hook_location, switch_code_sz, - mode_switch_data); + phandle, local_code_buf, mode_switch_buf, (byte *)hook_location, + switch_code_sz, mode_switch_data); if (!switch_code_sz || 
switch_code_sz == PAGE_SIZE) goto error; hook_code_sz -= switch_code_sz; @@ -1136,62 +1225,73 @@ inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, void *hook_locat #endif /* see below on why it's easier to point at args in memory */ - args.dr_base = map; -#ifdef NOT_DYNAMORIO_CORE_PROPER - /* FIXME i#234 NYI: pass in ntdll_base */ -#endif - /* FIXME i#234 NYI: for wow64 pick proper ntdll */ - args.ntdll_base = get_ntdll_base(); + args.dr_base = (uint64)map; + args.ntdll_base = find_remote_ntdll_base(phandle, target_64); + if (args.ntdll_base == 0) + goto error; args.tofree_base = remote_code_buf; args.hook_location = hook_location; args.late_injection = late_injection; strncpy(args.dynamorio_lib_path, dynamo_path, BUFFER_SIZE_ELEMENTS(args.dynamorio_lib_path)); NULL_TERMINATE_BUFFER(args.dynamorio_lib_path); - if (!nt_write_virtual_memory(phandle, remote_data, &args, sizeof(args), - &num_bytes_out) || + if (!write_remote_memory_maybe64(phandle, remote_data, &args, sizeof(args), + &num_bytes_out) || num_bytes_out != sizeof(args)) { goto error; } - instrlist_init(&ilist); + /* We would prefer to use IR to generate our instructions, but we need to support + * creating 64-bit code from 32-bit DR. XXX i#1684: Once we have multi-arch + * cross-bitwidth IR support from a single build, switch this back to using IR. + */ + byte *cur_local_pos = local_code_buf; #ifdef X64 - if (x86_code) { + if (x86_code && DYNAMO_OPTION(inject_x64)) { /* Mode Switch from 32 bit to 64 bit. * Forward align stack. 
*/ - instr_t *label64 = INSTR_CREATE_label(GDC); - instr_t *ljmp = - INSTR_CREATE_jmp_far(GDC, opnd_create_far_instr(CS64_SELECTOR, label64)); - instr_t *save_esp = INSTR_CREATE_mov_st( - GDC, OPND_CREATE_MEM32(REG_NULL, (int)(size_t)mode_switch_data), - opnd_create_reg(REG_ESP)); - instr_t *and_esp = - INSTR_CREATE_and(GDC, opnd_create_reg(REG_ESP), OPND_CREATE_INT32(-8)); - instr_set_x86_mode(ljmp, true); - APP(&ilist, save_esp); - APP(&ilist, ljmp); - APP(&ilist, label64); - APP(&ilist, and_esp); + *cur_local_pos++ = MOV_REG32_2_RM32; + *cur_local_pos++ = 0x24; + *cur_local_pos++ = 0x25; + RAW_INSERT_INT32(cur_local_pos, mode_switch_data); + /* Far jmp to next instr. */ + const int far_jmp_len = 7; + byte *pre_jmp = cur_local_pos; + *cur_local_pos++ = JMP_FAR_DIRECT; + RAW_INSERT_INT32(cur_local_pos, pre_jmp + far_jmp_len); + RAW_INSERT_INT16(cur_local_pos, CS64_SELECTOR); + ASSERT(cur_local_pos == pre_jmp + far_jmp_len); + /* Align stack. */ + *cur_local_pos++ = AND_RM32_IMM32; + *cur_local_pos++ = 0xe4; + RAW_INSERT_INT32(cur_local_pos, -8); } #endif - /* restore hook rather than trying to pass contents to C code - * (we leave hooked page writable for this and C code restores) + /* Restore hook rather than trying to pass contents to C code + * (we leave hooked page writable for this and C code restores). */ - APP(&ilist, - INSTR_CREATE_mov_imm(GDC, opnd_create_reg(REG_XAX), - OPND_CREATE_INTPTR((ptr_uint_t)hook_location))); + if (target_64) + *cur_local_pos++ = REX_W; + *cur_local_pos++ = MOV_IMM_XAX; + if (target_64) + RAW_INSERT_INT64(cur_local_pos, hook_location); + else + RAW_INSERT_INT32(cur_local_pos, hook_location); + for (i = 0; i < EARLY_INJECT_HOOK_SIZE / 4; i++) { - /* restore bytes 4*i..4*i+3 of hook */ - APP(&ilist, - INSTR_CREATE_mov_st(GDC, OPND_CREATE_MEM32(REG_XAX, i * 4), - OPND_CREATE_INT32(*((int *)hook_buf + i)))); + /* Restore bytes 4*i..4*i+3 of the hook. 
*/ + *cur_local_pos++ = MOV_IMM32_2_RM32; + *cur_local_pos++ = MOV_deref_disp8_EAX_2_EAX_RM; + RAW_INSERT_INT8(cur_local_pos, i * 4); + RAW_INSERT_INT32(cur_local_pos, *((int *)hook_buf + i)); } for (i = i * 4; i < EARLY_INJECT_HOOK_SIZE; i++) { - /* restore byte i of hook */ - APP(&ilist, - INSTR_CREATE_mov_st(GDC, OPND_CREATE_MEM8(REG_XAX, i), - OPND_CREATE_INT8((char)hook_buf[i]))); + /* Restore byte i of the hook. */ + *cur_local_pos++ = MOV_IMM8_2_RM8; + *cur_local_pos++ = MOV_deref_disp8_EAX_2_EAX_RM; + RAW_INSERT_INT8(cur_local_pos, i); + RAW_INSERT_INT8(cur_local_pos, (char)hook_buf[i]); } /* Call DR earliest-takeover routine w/ retaddr pointing at hooked @@ -1206,85 +1306,76 @@ inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, void *hook_locat * isn't counting on of course). * We pass our args in memory pointed at by xax stored in the 2nd page. */ - APP(&ilist, - INSTR_CREATE_mov_imm(GDC, opnd_create_reg(REG_XAX), - OPND_CREATE_INTPTR((ptr_uint_t)remote_data))); - /* we can't use dr_insert_call() b/c it's not avail in drdecode for drinject, + if (target_64) + *cur_local_pos++ = REX_W; + *cur_local_pos++ = MOV_IMM_XAX; + if (target_64) + RAW_INSERT_INT64(cur_local_pos, remote_data); + else + RAW_INSERT_INT32(cur_local_pos, remote_data); + /* We can't use dr_insert_call() b/c it's not avail in drdecode for drinject, * and its main value is passing params and we can't use regular param regs. * we don't even want the 4 stack slots for x64 here b/c we don't want to * clean them up. 
*/ - APP(&ilist, - INSTR_CREATE_push_imm(GDC, - OPND_CREATE_INT32((int)(ptr_int_t)switch_code_location))); -#ifdef X64 - /* push is sign-extended, so we can skip top half if nothing in top 33 bits */ - if ((ptr_uint_t)switch_code_location >= 0x80000000) { - APP(&ilist, - INSTR_CREATE_mov_st( - GDC, OPND_CREATE_MEM32(REG_XSP, 4), - OPND_CREATE_INT32((int)((ptr_int_t)switch_code_location >> 32)))); - } -#endif -#ifdef NOT_DYNAMORIO_CORE_PROPER - /* FIXME i#234 NYI: need to pass in offset of dynamorio_earliest_init_takeover - * or could look it up here: either link in module.c, or export - * privload_bootstrap_get_export() - */ - pc = 0 + map; -#else - pc = (byte *)dynamorio_earliest_init_takeover - get_dynamorio_dll_start() + map; -#endif - if (REL32_REACHABLE(pc, hook_code_buf) && + if (target_64) + RAW_PUSH_INT64(cur_local_pos, switch_code_location); + else + RAW_PUSH_INT32(cur_local_pos, switch_code_location); + pc = + get_remote_proc_address(phandle, (uint64)map, "dynamorio_earliest_init_takeover"); + if (pc == 0) + goto error; + if (REL32_REACHABLE((int64)pc, (int64)hook_code_buf) && /* over-estimate to be sure: we assert below we're < PAGE_SIZE */ - REL32_REACHABLE(pc, remote_code_buf + PAGE_SIZE)) { - APP(&ilist, INSTR_CREATE_jmp(GDC, opnd_create_pc(pc))); + REL32_REACHABLE((int64)pc, (int64)remote_code_buf + PAGE_SIZE)) { + *cur_local_pos++ = JMP_REL32; + uint64 cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); + RAW_INSERT_INT32(cur_local_pos, + (int64)pc - (int64)(cur_remote_pos + sizeof(int))); } else { - /* indirect through an inlined target */ - instr_t *tgt = instr_build_bits(GDC, OP_UNDECODED, sizeof(pc)); - APP(&ilist, INSTR_CREATE_jmp_ind(GDC, opnd_create_mem_instr(tgt, 0, OPSZ_PTR))); - instr_set_raw_bytes(tgt, (byte *)&pc, sizeof(pc)); - APP(&ilist, tgt); + /* Indirect through an inlined target. 
*/ + *cur_local_pos++ = JMP_ABS_IND64_OPCODE; + *cur_local_pos++ = JMP_ABS_MEM_IND64_MODRM; + uint64 cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); + RAW_INSERT_INT32(cur_local_pos, target_64 ? 0 : cur_remote_pos + sizeof(int)); + if (target_64) + RAW_INSERT_INT64(cur_local_pos, pc); + else + RAW_INSERT_INT32(cur_local_pos, pc); } - - /* can't use copy_and_re_relativize_raw_instr b/c don't have direct access: - * need to finalize and then do direct copy to child process - */ - pc = instrlist_encode_to_copy(GDC, &ilist, local_code_buf, hook_code_buf, - local_code_buf + hook_code_sz, - true /*has instr targets*/); - ASSERT(pc != NULL && pc < local_code_buf + hook_code_sz); - instrlist_clear(GDC, &ilist); + ASSERT(cur_local_pos - local_code_buf <= (ssize_t)hook_code_sz); /* copy local buffer to child process */ - if (!nt_write_virtual_memory(phandle, hook_code_buf, local_code_buf, - pc - local_code_buf, &num_bytes_out) || - num_bytes_out != (size_t)(pc - local_code_buf)) { + if (!write_remote_memory_maybe64(phandle, hook_code_buf, local_code_buf, + cur_local_pos - local_code_buf, &num_bytes_out) || + num_bytes_out != (size_t)(cur_local_pos - local_code_buf)) { goto error; } - if (!nt_remote_protect_virtual_memory(phandle, remote_code_buf, remote_alloc_sz, - PAGE_EXECUTE_READWRITE, &old_prot)) { + if (!remote_protect_virtual_memory_maybe64(phandle, remote_code_buf, remote_alloc_sz, + PAGE_EXECUTE_READWRITE, &old_prot)) { ASSERT_NOT_REACHED(); goto error; } free_remote_code_buffer(NT_CURRENT_PROCESS, local_code_buf); - return (void *)hook_code_buf; + return hook_code_buf; error: if (local_code_buf != NULL) free_remote_code_buffer(NT_CURRENT_PROCESS, local_code_buf); - if (remote_code_buf != NULL) - free_remote_code_buffer(phandle, remote_code_buf); - return NULL; + if (remote_code_buf != 0) + free_remote_code_buffer(phandle, (byte *)(ptr_int_t)remote_code_buf); + return 0; } /* i#234: earliest injection so we see every single user-mode instruction 
- * XXX i#625: not supporting rebasing: assuming no conflict w/ executable + * Supports a 64-bit child of a 32-bit DR. + * XXX i#625: not supporting rebasing: assuming no conflict w/ executable. */ -static void * -inject_gencode_mapped(HANDLE phandle, char *dynamo_path, void *hook_location, +static uint64 +inject_gencode_mapped(HANDLE phandle, char *dynamo_path, uint64 hook_location, byte hook_buf[EARLY_INJECT_HOOK_SIZE], void *must_reach, bool x86_code, bool late_injection) { @@ -1295,7 +1386,7 @@ inject_gencode_mapped(HANDLE phandle, char *dynamo_path, void *hook_location, byte *map = NULL; size_t view_size = 0; wchar_t dllpath[MAX_PATH]; - byte *ret = NULL; + uint64 ret = 0; /* map DR dll into child * @@ -1320,6 +1411,9 @@ inject_gencode_mapped(HANDLE phandle, char *dynamo_path, void *hook_location, if (!NT_SUCCESS(res)) goto done; + /* For 32-into-64, there's no NtWow64 version so we rely on this simply mapping + * into the low 2G. + */ res = nt_raw_MapViewOfSection(section, phandle, &map, 0, 0 /* not page-file-backed */, NULL, (PSIZE_T)&view_size, ViewUnmap, 0 /* no special top-down or anything */, @@ -1330,21 +1424,24 @@ inject_gencode_mapped(HANDLE phandle, char *dynamo_path, void *hook_location, ret = inject_gencode_mapped_helper(phandle, dynamo_path, hook_location, hook_buf, map, must_reach, x86_code, late_injection); done: - if (ret == NULL) { + if (ret == 0) { close_handle(file); close_handle(section); } - return (void *)ret; + return ret; } /* Early injection. */ -/* FIXME - like inject_into_thread we assume esp, but we could allocate our - * own stack in the child and swap to that for transparency. */ +/* XXX: Like inject_into_thread we assume esp, but we could allocate our + * own stack in the child and swap to that for transparency. 
+ */ bool inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject_location, void *inject_address) { - void *hook_target = NULL, *hook_location = NULL; + /* To handle a 64-bit child of a 32-bit DR we use "uint64" for remote addresses. */ + uint64 hook_target = 0; + uint64 hook_location = 0; uint old_prot; size_t num_bytes_out; byte hook_buf[EARLY_INJECT_HOOK_SIZE]; @@ -1365,8 +1462,8 @@ inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject case INJECT_LOCATION_LdrDefault: /* caller provides the ldr address to use */ ASSERT(inject_address != NULL); - hook_location = inject_address; - if (hook_location == NULL) { + hook_location = (uint64)inject_address; + if (hook_location == 0) { goto error; } break; @@ -1383,15 +1480,15 @@ inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject */ HANDLE ntdll_base = get_module_handle(L"ntdll.dll"); ASSERT(ntdll_base != NULL); - hook_location = (void *)GET_PROC_ADDR(ntdll_base, "LdrInitializeThunk"); - ASSERT(hook_location != NULL); + hook_location = (uint64)GET_PROC_ADDR(ntdll_base, "LdrInitializeThunk"); + ASSERT(hook_location != 0); } else - hook_location = (void *)KiUserApcDispatcher; + hook_location = (uint64)KiUserApcDispatcher; ASSERT(map); break; } case INJECT_LOCATION_KiUserException: - hook_location = (void *)KiUserExceptionDispatcher; + hook_location = (uint64)KiUserExceptionDispatcher; break; case INJECT_LOCATION_ImageEntry: hook_location = get_remote_process_entry(phandle, &x86_code); @@ -1401,8 +1498,8 @@ inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject } /* read in code at hook */ - if (!nt_read_virtual_memory(phandle, hook_location, hook_buf, sizeof(hook_buf), - &num_bytes_out) || + if (!read_remote_memory_maybe64(phandle, hook_location, hook_buf, sizeof(hook_buf), + &num_bytes_out) || num_bytes_out != sizeof(hook_buf)) { goto error; } @@ -1420,33 +1517,31 @@ inject_into_new_process(HANDLE phandle, char 
*dynamo_path, bool map, uint inject hook_target = inject_gencode_mapped(phandle, dynamo_path, hook_location, hook_buf, NULL, x86_code, late_injection); } else { - hook_target = - inject_gencode_at_ldr(phandle, dynamo_path, inject_location, inject_address, - hook_location, hook_buf, NULL); + /* No support for 32-to-64. */ + hook_target = (uint64)inject_gencode_at_ldr( + phandle, dynamo_path, inject_location, inject_address, + (void *)(ptr_int_t)hook_location, hook_buf, NULL); } - if (hook_target == NULL) + if (hook_target == 0) goto error; /* Place hook */ - if (IF_X64_ELSE(x86_code, true)) { + if (REL32_REACHABLE((int64)hook_location + 5, (int64)hook_target)) { hook_buf[0] = JMP_REL32; - *(int *)(&hook_buf[1]) = (int)((byte *)hook_target - ((byte *)hook_location + 5)); - } -#ifdef X64 - else { + *(int *)(&hook_buf[1]) = (int)((int64)hook_target - ((int64)hook_location + 5)); + } else { hook_buf[0] = JMP_ABS_IND64_OPCODE; hook_buf[1] = JMP_ABS_MEM_IND64_MODRM; *(int *)(&hook_buf[2]) = 0; /* rip-rel to following address */ - *(byte **)(&hook_buf[6]) = hook_target; + *(uint64 *)(&hook_buf[6]) = hook_target; } -#endif - if (!nt_remote_protect_virtual_memory(phandle, hook_location, sizeof(hook_buf), - PAGE_EXECUTE_READWRITE, &old_prot)) { + if (!remote_protect_virtual_memory_maybe64(phandle, hook_location, sizeof(hook_buf), + PAGE_EXECUTE_READWRITE, &old_prot)) { goto error; } - if (!nt_write_virtual_memory(phandle, hook_location, hook_buf, sizeof(hook_buf), - &num_bytes_out) || + if (!write_remote_memory_maybe64(phandle, hook_location, hook_buf, sizeof(hook_buf), + &num_bytes_out) || num_bytes_out != sizeof(hook_buf)) { goto error; } @@ -1456,8 +1551,8 @@ inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject * so we can't mark +w from gencode easily: so we just leave it +w * and restore to +rx in dynamorio_earliest_init_takeover_C(). 
*/ - if (!nt_remote_protect_virtual_memory(phandle, hook_location, sizeof(hook_buf), - old_prot, &old_prot)) { + if (!remote_protect_virtual_memory_maybe64( + phandle, hook_location, sizeof(hook_buf), old_prot, &old_prot)) { goto error; } } diff --git a/core/win32/inject_shared.c b/core/win32/inject_shared.c index 143a61eafd0..d91be85edc3 100644 --- a/core/win32/inject_shared.c +++ b/core/win32/inject_shared.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2012-2019 Google, Inc. All rights reserved. + * Copyright (c) 2012-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -394,6 +394,7 @@ is_windows_version_vista_plus(void); /* forward decl */ * image name and cmdline combined into one call to reduce * read process memory calls (whether this is actually true depends on * usage) + * Handles both 32-bit and 64-bit remote processes. */ void get_process_imgname_cmdline(HANDLE process_handle, wchar_t *image_name, @@ -403,30 +404,48 @@ get_process_imgname_cmdline(HANDLE process_handle, wchar_t *image_name, size_t nbytes; int res; int len; - PEB peb; - LPVOID peb_base = get_peb(process_handle); - RTL_USER_PROCESS_PARAMETERS process_parameters; - void *param_location; + /* For a 64-bit parent querying a 32-bit remote we assume we'll get back the + * 64-bit WOW64 PEB. 
+ */ + uint64 peb_base = get_peb_maybe64(process_handle); + union { + uint64 params_ptr_64; + uint params_ptr_32; + } params_ptr; + bool peb_is_32 = IF_X64_ELSE(false, is_32bit_process(process_handle)); + uint64 param_location; /* It is supposed to be at process_parameters.ImagePathName.Buffer */ + RTL_USER_PROCESS_PARAMETERS params = { 0 }; +# ifndef X64 + RTL_USER_PROCESS_PARAMETERS_64 params64 = { 0 }; +# endif + + if (image_name != NULL) + image_name[0] = L'\0'; + if (command_line != NULL) + command_line[0] = L'\0'; /* Read process PEB */ - res = nt_read_virtual_memory(process_handle, (LPVOID)peb_base, &peb, sizeof(peb), - &nbytes); - /* FIXME: is this always possible? - although we assume we can even do WriteProcessMemory for an explicit inject */ - if (!res) { + res = read_remote_memory_maybe64( + process_handle, + peb_base + + (peb_is_32 ? X86_PROCESS_PARAM_PEB_OFFSET : X64_PROCESS_PARAM_PEB_OFFSET), + &params_ptr, sizeof(params_ptr), &nbytes); + if (!res || nbytes != sizeof(params_ptr)) { display_error("Warning: could not read process memory!"); - if (image_name) - image_name[0] = L'\0'; - if (command_line) - command_line[0] = L'\0'; return; } /* Follow on to process parameters */ - res = - nt_read_virtual_memory(process_handle, (LPVOID)peb.ProcessParameters, - &process_parameters, sizeof(process_parameters), &nbytes); + uint64 params_base = peb_is_32 ? params_ptr.params_ptr_32 : params_ptr.params_ptr_64; + res = read_remote_memory_maybe64( + process_handle, params_base, + IF_NOT_X64(!peb_is_32 ? (void *)&params64 :)(void *) & params, + IF_NOT_X64(!peb_is_32 ? sizeof(params64) :) sizeof(params), &nbytes); + if (!res || nbytes != (IF_NOT_X64(!peb_is_32 ?
sizeof(params64) :) sizeof(params))) { + display_error("Warning: could not read process memory!"); + return; + } /* apparently {ImagePathName,CommandLine}.Buffer contains the offset * from the beginning of the ProcessParameters structure during @@ -434,20 +453,21 @@ get_process_imgname_cmdline(HANDLE process_handle, wchar_t *image_name, if (image_name) { if (is_windows_version_vista_plus()) { - param_location = process_parameters.ImagePathName.Buffer; + param_location = IF_NOT_X64(!peb_is_32 ? params64.ImagePathName.u.Buffer64 + :)(uint64) params.ImagePathName.Buffer; } else { - param_location = - (void *)((ptr_uint_t)process_parameters.ImagePathName.Buffer + - (ptr_uint_t)peb.ProcessParameters); + param_location = IF_NOT_X64(!peb_is_32 ? params64.ImagePathName.u.Buffer64 :)( + (uint64)params.ImagePathName.Buffer + params_base); } - len = process_parameters.ImagePathName.Length; + len = IF_NOT_X64(!peb_is_32 ? params64.ImagePathName.Length :) + params.ImagePathName.Length; if (len > 2 * (max_image_wchars - 1)) len = 2 * (max_image_wchars - 1); /* Read the image file name in our memory too */ - res = nt_read_virtual_memory(process_handle, (LPVOID)param_location, image_name, - len, &nbytes); + res = read_remote_memory_maybe64(process_handle, param_location, image_name, len, + &nbytes); if (!res) { len = 0; display_warning("Warning: could not read image name from PEB"); @@ -457,19 +477,21 @@ get_process_imgname_cmdline(HANDLE process_handle, wchar_t *image_name, if (command_line) { if (is_windows_version_vista_plus()) { - param_location = process_parameters.CommandLine.Buffer; + param_location = IF_NOT_X64(!peb_is_32 ? params64.CommandLine.u.Buffer64 + :)(uint64) params.CommandLine.Buffer; } else { - param_location = (void *)((ptr_uint_t)process_parameters.CommandLine.Buffer + - (ptr_uint_t)peb.ProcessParameters); + param_location = IF_NOT_X64(!peb_is_32 ? 
params64.CommandLine.u.Buffer64 :)( + uint64)(params.CommandLine.Buffer + params_base); } - len = process_parameters.CommandLine.Length; + len = IF_NOT_X64(!peb_is_32 ? params64.CommandLine.Length :) + params.CommandLine.Length; if (len > 2 * (max_cmdl_wchars - 1)) len = 2 * (max_cmdl_wchars - 1); /* Read the image file name in our memory too */ - res = nt_read_virtual_memory(process_handle, (LPVOID)param_location, command_line, - len, &nbytes); + res = read_remote_memory_maybe64(process_handle, param_location, command_line, + len, &nbytes); if (!res) { len = 0; display_warning("Warning: could not read cmdline from PEB"); diff --git a/core/win32/injector.c b/core/win32/injector.c index 2c3c1db9cd8..c4fc63bde26 100644 --- a/core/win32/injector.c +++ b/core/win32/injector.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2019 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2002-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -847,7 +847,6 @@ bool dr_inject_process_inject(void *data, bool force_injection, const char *library_path) { dr_inject_info_t *info = (dr_inject_info_t *)data; - CONTEXT cxt; bool inject = true; char library_path_buf[MAXIMUM_PATH]; @@ -914,12 +913,15 @@ dr_inject_process_inject(void *data, bool force_injection, const char *library_p #endif inject_init(); - /* FIXME PR 211367: use early_inject instead of this late injection! - * but non-trivial to gather the relevant addresses: so wait for - * earliest injection => i#234/PR 204587 prereq? + /* Like the core, we use map injection, which supports cross-arch injection, is + * in some ways cleaner than thread injection, and supports early injection at + * various points. For now we use the (late) image entry as the takeover point. + * TODO PR 211367: use earlier injection instead of this late injection! 
+ * But it's non-trivial to gather the relevant addresses. + * i#234/PR 204587 is a prereq? */ - if (!inject_into_thread(info->pi.hProcess, &cxt, info->pi.hThread, - (char *)library_path)) { + if (!inject_into_new_process(info->pi.hProcess, (char *)library_path, true /*map*/, + INJECT_LOCATION_ImageEntry, NULL)) { close_handle(info->pi.hProcess); TerminateProcess(info->pi.hProcess, 0); return false; diff --git a/core/win32/module.c b/core/win32/module.c index dae8fe66ca1..6346af4ec61 100644 --- a/core/win32/module.c +++ b/core/win32/module.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -1891,7 +1891,7 @@ get_module_preferred_base(app_pc pc) /* we return NULL on error above, make sure no one actually sets their * preferred base address to NULL */ ASSERT_CURIOSITY(OPT_HDR(nt, ImageBase) != 0); - return (app_pc)OPT_HDR(nt, ImageBase); + return (app_pc)(ptr_int_t)OPT_HDR(nt, ImageBase); } /* we simply test if allocation bases of a region are the same */ diff --git a/core/win32/module_shared.c b/core/win32/module_shared.c index dd8b32d934f..2f77045fe17 100644 --- a/core/win32/module_shared.c +++ b/core/win32/module_shared.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2019 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2008 VMware, Inc. All rights reserved. * **********************************************************/ @@ -93,6 +93,8 @@ is_readable_without_exception(const byte *pc, size_t size); # define LOG(...) 
/* nothing */ #endif +#define MAX_FUNCNAME_SIZE 128 + #if defined(CLIENT_INTERFACE) && !defined(NOT_DYNAMORIO_CORE) && \ !defined(NOT_DYNAMORIO_CORE_PROPER) # include "instrument.h" @@ -157,35 +159,44 @@ is_readable_without_exception(const byte *pc, size_t size) #if defined(WINDOWS) && !defined(NOT_DYNAMORIO_CORE) /* Image entry point is stored at, - * PEB->DOS_HEADER->NT_HEADER->OptionalHeader.AddressOfEntryPoint + * PEB->DOS_HEADER->NT_HEADER->OptionalHeader.AddressOfEntryPoint. + * Handles both 32-bit and 64-bit remote processes. */ -void * +uint64 get_remote_process_entry(HANDLE process_handle, OUT bool *x86_code) { - PEB peb; - LPVOID peb_base; - IMAGE_DOS_HEADER *dos_ptr, dos; - IMAGE_NT_HEADERS *nt_ptr, nt; + uint64 peb_base; + /* Handle the two possible widths of peb.ImageBaseAddress: */ + union { + uint64 dos64; + uint dos32; + } dos_ptr; + IMAGE_DOS_HEADER dos; + IMAGE_NT_HEADERS nt; bool res; size_t nbytes; - - peb_base = get_peb(process_handle); - res = nt_read_virtual_memory(process_handle, (LPVOID)peb_base, &peb, sizeof(peb), - &nbytes); - if (!res || nbytes != sizeof(peb)) - return NULL; - dos_ptr = (IMAGE_DOS_HEADER *)peb.ImageBaseAddress; - res = nt_read_virtual_memory(process_handle, (void *)dos_ptr, &dos, sizeof(dos), - &nbytes); - if (!res || nbytes != sizeof(dos)) - return NULL; - nt_ptr = (IMAGE_NT_HEADERS *)(((ptr_uint_t)dos_ptr) + dos.e_lfanew); + bool peb_is_32 = IF_X64_ELSE(false, is_32bit_process(process_handle)); + /* Read peb.ImageBaseAddress. */ + peb_base = get_peb_maybe64(process_handle); + res = read_remote_memory_maybe64( + process_handle, + peb_base + (peb_is_32 ? X86_IMAGE_BASE_PEB_OFFSET : X64_IMAGE_BASE_PEB_OFFSET), + &dos_ptr, sizeof(dos_ptr), &nbytes); + if (!res || nbytes != sizeof(dos_ptr)) + return 0; + uint64 dos_base = peb_is_32 ? 
dos_ptr.dos32 : dos_ptr.dos64; res = - nt_read_virtual_memory(process_handle, (void *)nt_ptr, &nt, sizeof(nt), &nbytes); + read_remote_memory_maybe64(process_handle, dos_base, &dos, sizeof(dos), &nbytes); + if (!res || nbytes != sizeof(dos)) + return 0; + res = read_remote_memory_maybe64(process_handle, dos_base + dos.e_lfanew, &nt, + sizeof(nt), &nbytes); if (!res || nbytes != sizeof(nt)) - return NULL; + return 0; + /* IMAGE_NT_HEADERS.FileHeader == IMAGE_NT_HEADERS64.FileHeader */ *x86_code = nt.FileHeader.Machine == IMAGE_FILE_MACHINE_I386; - return (void *)((byte *)dos_ptr + (size_t)nt.OptionalHeader.AddressOfEntryPoint); + ASSERT(BOOLS_MATCH(is_32bit_process(process_handle), *x86_code)); + return dos_base + (size_t)OPT_HDR(&nt, AddressOfEntryPoint); } #endif @@ -840,20 +851,11 @@ typedef struct ALIGN_VAR(8) _LDR_MODULE_64 { typedef void (*void_func_t)(); # define MAX_MODNAME_SIZE 128 -# define MAX_FUNCNAME_SIZE 128 -/* in arch/x86.asm */ +/* in drlibc_x86.asm */ extern int switch_modes_and_load(void *ntdll64_LdrLoadDll, UNICODE_STRING_64 *lib, HANDLE *result); -/* in arch/x86.asm */ -/* Switches from 32-bit mode to 64-bit mode and invokes func, passing - * arg1, arg2, and arg3. Works fine when func takes fewer than 3 args - * as well. 
- */ -extern int -switch_modes_and_call(uint64 func, void *arg1, void *arg2, void *arg3); - /* Here and not in ntdll.c b/c libutil targets link to this file but not * ntdll.c */ @@ -1193,7 +1195,8 @@ free_library_64(HANDLE lib) if (ntdll64 > UINT_MAX || ntdll64 == 0) return false; ntdll64_LdrUnloadDll = get_proc_address_64(ntdll64, "LdrUnloadDll"); - res = switch_modes_and_call(ntdll64_LdrUnloadDll, (void *)lib, NULL, NULL); + invoke_func64_t args = { ntdll64_LdrUnloadDll, (uint64)lib }; + res = switch_modes_and_call(&args); return (res >= 0); } @@ -1212,7 +1215,8 @@ thread_get_context_64(HANDLE thread, CONTEXT_64 *cxt64) if (ntdll64 == 0) return false; ntdll64_GetContextThread = get_proc_address_64(ntdll64, "NtGetContextThread"); - res = switch_modes_and_call(ntdll64_GetContextThread, thread, cxt64, NULL); + invoke_func64_t args = { ntdll64_GetContextThread, (uint64)thread, (uint64)cxt64 }; + res = switch_modes_and_call(&args); return NT_SUCCESS(res); } @@ -1225,12 +1229,237 @@ thread_set_context_64(HANDLE thread, CONTEXT_64 *cxt64) if (ntdll64 == 0) return false; ntdll64_SetContextThread = get_proc_address_64(ntdll64, "NtSetContextThread"); - res = switch_modes_and_call(ntdll64_SetContextThread, thread, cxt64, NULL); + invoke_func64_t args = { ntdll64_SetContextThread, (uint64)thread, (uint64)cxt64 }; + res = switch_modes_and_call(&args); return NT_SUCCESS(res); } # endif /* !NOT_DYNAMORIO_CORE_PROPER */ +bool +remote_protect_virtual_memory_64(HANDLE process, uint64 base, size_t size, uint prot, + uint *old_prot) +{ + uint64 ntdll64_ProtectVirtualMemory; + NTSTATUS res; + uint64 ntdll64 = get_module_handle_64(L"ntdll.dll"); + if (ntdll64 == 0) + return false; + uint64 size64 = size; + uint64 *size_ptr = &size64; + uint64 mybase = base; + uint64 *base_ptr = &mybase; + ntdll64_ProtectVirtualMemory = get_proc_address_64(ntdll64, "NtProtectVirtualMemory"); + invoke_func64_t args = { ntdll64_ProtectVirtualMemory, + (uint64)process, + (uint64)base_ptr, + 
(uint64)size_ptr, + prot, + (uint64)old_prot }; + res = switch_modes_and_call(&args); + return NT_SUCCESS(res); +} + +NTSTATUS +remote_query_virtual_memory_64(HANDLE process, uint64 addr, + MEMORY_BASIC_INFORMATION64 *mbi, size_t mbilen, + uint64 *got) +{ + uint64 ntdll64_QueryVirtualMemory; + NTSTATUS res; + uint64 ntdll64 = get_module_handle_64(L"ntdll.dll"); + if (ntdll64 == 0) + return false; + ntdll64_QueryVirtualMemory = get_proc_address_64(ntdll64, "NtQueryVirtualMemory"); + invoke_func64_t args = { ntdll64_QueryVirtualMemory, + (uint64)process, + addr, + MemoryBasicInformation, + (uint64)mbi, + mbilen, + (uint64)got }; + res = switch_modes_and_call(&args); + return NT_SUCCESS(res); +} # endif /* !NOT_DYNAMORIO_CORE */ +#endif /* !X64 */ + +#ifndef NOT_DYNAMORIO_CORE +bool +remote_protect_virtual_memory_maybe64(HANDLE process, uint64 base, size_t size, uint prot, + uint *old_prot) +{ +# ifdef X64 + return nt_remote_protect_virtual_memory(process, (void *)base, size, prot, old_prot); +# else + return remote_protect_virtual_memory_64(process, base, size, prot, old_prot); +# endif +} + +NTSTATUS +remote_query_virtual_memory_maybe64(HANDLE process, uint64 addr, + MEMORY_BASIC_INFORMATION64 *mbi, size_t mbilen, + uint64 *got) +{ +# ifdef X64 + return nt_remote_query_virtual_memory(process, (void *)addr, + (MEMORY_BASIC_INFORMATION *)mbi, mbilen, got); +# else + return remote_query_virtual_memory_64(process, addr, mbi, mbilen, got); +# endif +} +#endif + +/* Excluding from libutil b/c it doesn't need it and is_32bit_process() and + * read_remote_memory_maybe64() aren't exported to libutil. + */ +#ifndef NOT_DYNAMORIO_CORE +static bool +read_remote_maybe64(HANDLE process, uint64 addr, size_t bufsz, void *buf) +{ + size_t num_read; + return read_remote_memory_maybe64(process, addr, buf, bufsz, &num_read) && + num_read == bufsz; +} + +/* Handles 32-bit or 64-bit remote processes. + * Ignores forwarders and ordinals. 
+ */ +uint64 +get_remote_proc_address(HANDLE process, uint64 remote_base, const char *name) +{ + uint64 lib = remote_base; + size_t exports_size; + IMAGE_DOS_HEADER dos; + IMAGE_NT_HEADERS64 nt64; + IMAGE_NT_HEADERS32 nt32; + IMAGE_DATA_DIRECTORY *expdir; + IMAGE_EXPORT_DIRECTORY exports; + uint i; + PULONG functions; /* array of RVAs */ + PUSHORT ordinals; + PULONG fnames; /* array of RVAs */ + uint ord = UINT_MAX; /* the ordinal to use */ + uint64 func = 0; + char local_buf[MAX_FUNCNAME_SIZE]; + + if (!read_remote_maybe64(process, lib, sizeof(dos), &dos)) + return 0; + ASSERT(dos.e_magic == IMAGE_DOS_SIGNATURE); + if (!read_remote_maybe64(process, lib + dos.e_lfanew, sizeof(nt64), &nt64)) + return 0; + ASSERT(nt64.Signature == IMAGE_NT_SIGNATURE); + if (nt64.OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR32_MAGIC) { + if (!read_remote_maybe64(process, lib + dos.e_lfanew, sizeof(nt32), &nt32)) + return 0; + ASSERT(nt32.Signature == IMAGE_NT_SIGNATURE); + expdir = &nt32.OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT]; + } else { + expdir = &nt64.OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT]; + } + exports_size = expdir->Size; + if (exports_size <= 0 || + !read_remote_maybe64(process, lib + expdir->VirtualAddress, + MIN(exports_size, sizeof(exports)), &exports)) + return 0; + if (exports.NumberOfNames == 0 || exports.AddressOfNames == 0) + return 0; + +# if defined(NOT_DYNAMORIO_CORE) || defined(NOT_DYNAMORIO_CORE_PROPER) + functions = + (PULONG)HeapAlloc(GetProcessHeap(), 0, exports.NumberOfFunctions * sizeof(ULONG)); + ordinals = + (PUSHORT)HeapAlloc(GetProcessHeap(), 0, exports.NumberOfNames * sizeof(USHORT)); + fnames = + (PULONG)HeapAlloc(GetProcessHeap(), 0, exports.NumberOfNames * sizeof(ULONG)); +# else + functions = (PULONG)global_heap_alloc(exports.NumberOfFunctions * + sizeof(ULONG) HEAPACCT(ACCT_OTHER)); + ordinals = (PUSHORT)global_heap_alloc(exports.NumberOfNames * + sizeof(USHORT) HEAPACCT(ACCT_OTHER)); + fnames = 
(PULONG)global_heap_alloc(exports.NumberOfNames * + sizeof(ULONG) HEAPACCT(ACCT_OTHER)); +# endif + if (read_remote_maybe64(process, lib + exports.AddressOfFunctions, + exports.NumberOfFunctions * sizeof(ULONG), functions) && + read_remote_maybe64(process, lib + exports.AddressOfNameOrdinals, + exports.NumberOfNames * sizeof(USHORT), ordinals) && + read_remote_maybe64(process, lib + exports.AddressOfNames, + exports.NumberOfNames * sizeof(ULONG), fnames)) { + bool match = false; + for (i = 0; i < exports.NumberOfNames; i++) { + if (!read_remote_maybe64(process, lib + fnames[i], + BUFFER_SIZE_BYTES(local_buf), local_buf)) + break; + NULL_TERMINATE_BUFFER(local_buf); + if (strcasecmp(name, local_buf) == 0) { + match = true; + ord = ordinals[i]; + break; + } + } + if (match && ord < exports.NumberOfFunctions && functions[ord] != 0 && + /* We don't support forwarded functions */ + (functions[ord] < expdir->VirtualAddress || + functions[ord] >= expdir->VirtualAddress + exports_size)) + func = lib + functions[ord]; + } +# if defined(NOT_DYNAMORIO_CORE) || defined(NOT_DYNAMORIO_CORE_PROPER) + HeapFree(GetProcessHeap(), 0, functions); + HeapFree(GetProcessHeap(), 0, ordinals); + HeapFree(GetProcessHeap(), 0, fnames); +# else + global_heap_free(functions, + exports.NumberOfFunctions * sizeof(ULONG) HEAPACCT(ACCT_OTHER)); + global_heap_free(ordinals, + exports.NumberOfNames * sizeof(USHORT) HEAPACCT(ACCT_OTHER)); + global_heap_free(fnames, exports.NumberOfNames * sizeof(ULONG) HEAPACCT(ACCT_OTHER)); +# endif + return func; +} + +/* Handles 32-bit or 64-bit remote processes. 
*/ +bool +get_remote_dll_short_name(HANDLE process, uint64 remote_base, OUT char *name, + size_t name_len, OUT bool *is_64) +{ + uint64 lib = remote_base; + size_t exports_size; + IMAGE_DOS_HEADER dos; + IMAGE_NT_HEADERS64 nt64; + IMAGE_NT_HEADERS32 nt32; + IMAGE_DATA_DIRECTORY *expdir; + IMAGE_EXPORT_DIRECTORY exports; + if (!read_remote_maybe64(process, lib, sizeof(dos), &dos)) + return false; + if (dos.e_magic != IMAGE_DOS_SIGNATURE) + return false; + if (!read_remote_maybe64(process, lib + dos.e_lfanew, sizeof(nt64), &nt64)) + return false; + if (nt64.Signature != IMAGE_NT_SIGNATURE) + return false; + if (nt64.OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR32_MAGIC) { + if (!read_remote_maybe64(process, lib + dos.e_lfanew, sizeof(nt32), &nt32)) + return 0; + ASSERT(nt32.Signature == IMAGE_NT_SIGNATURE); + expdir = &nt32.OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT]; + if (is_64 != NULL) + *is_64 = false; + } else { + expdir = &nt64.OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT]; + if (is_64 != NULL) + *is_64 = true; + } + exports_size = expdir->Size; + if (exports_size <= 0 || + !read_remote_maybe64(process, lib + expdir->VirtualAddress, + MIN(exports_size, sizeof(exports)), &exports)) + return false; + if (exports.Name == 0 || + !read_remote_maybe64(process, lib + exports.Name, name_len, name)) + return false; + name[name_len - 1] = '\0'; + return true; +} +#endif -#endif /* !X64 */ /****************************************************************************/ diff --git a/core/win32/ntdll.c b/core/win32/ntdll.c index 617003a91ed..a62b8f17760 100644 --- a/core/win32/ntdll.c +++ b/core/win32/ntdll.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. 
* **********************************************************/ @@ -1000,6 +1000,66 @@ get_own_peb() return own_peb; } +/* Returns a 32-bit PEB for a 32-bit child and !X64 parent. + * Else returns a 64-bit PEB. + */ +uint64 +get_peb_maybe64(HANDLE h) +{ +#ifdef X64 + return (uint64)get_peb(h); +#else + /* The WOW64 query below should work regardless of whether the kernel is 32-bit + * or the child is 32-bit or 64-bit. But, it returns the 64-bit PEB, while we + * would prefer the 32-bit, so we first try get_peb(). + */ + PEB *peb32 = get_peb(h); + if (peb32 != NULL) + return (uint64)peb32; + PROCESS_BASIC_INFORMATION64 info; + NTSTATUS res = nt_wow64_query_info_process64(h, &info); + if (!NT_SUCCESS(res)) + return 0; + else + return info.PebBaseAddress; +#endif +} + +#ifdef X64 +/* Returns the 32-bit PEB for a WOW64 process, given process and thread handles. */ +uint64 +get_peb32(HANDLE process, HANDLE thread) +{ + THREAD_BASIC_INFORMATION info; + NTSTATUS res = query_thread_info(thread, &info); + if (!NT_SUCCESS(res)) + return 0; + /* Bizarrely, info.TebBaseAddress points 2 pages too low! We do sanity + * checks to confirm we have a TEB by looking at its self pointer. + */ +# define TEB32_QUERY_OFFS 0x2000 + byte *teb32 = (byte *)info.TebBaseAddress; + uint ptr32; + size_t sz_read; + if (!nt_read_virtual_memory(process, teb32 + X86_SELF_TIB_OFFSET, &ptr32, + sizeof(ptr32), &sz_read) || + sz_read != sizeof(ptr32) || ptr32 != (uint64)teb32) { + teb32 += TEB32_QUERY_OFFS; + if (!nt_read_virtual_memory(process, teb32 + X86_SELF_TIB_OFFSET, &ptr32, + sizeof(ptr32), &sz_read) || + sz_read != sizeof(ptr32) || ptr32 != (uint64)teb32) { + /* XXX: Also try peb64+0x1000? That was true for older Windows versions.
*/ + return 0; + } + } + if (!nt_read_virtual_memory(process, teb32 + X86_PEB_TIB_OFFSET, &ptr32, + sizeof(ptr32), &sz_read) || + sz_read != sizeof(ptr32)) + return 0; + return ptr32; +} +#endif + /****************************************************************************/ #ifndef NOT_DYNAMORIO_CORE /* avoid needing CXT_ macros and SELF_TIB_OFFSET from os_exports.h */ @@ -2025,6 +2085,21 @@ is_wow64_process(HANDLE h) return self_is_wow64; } +bool +is_32bit_process(HANDLE h) +{ +#ifdef X64 + /* Kernel is definitely 64-bit. */ + return is_wow64_process(h); +#else + /* If kernel is 64-bit, ask about wow64; else, kernel is 32-bit, so true. */ + if (is_wow64_process(NT_CURRENT_PROCESS)) + return is_wow64_process(h); + else + return true; +#endif +} + NTSTATUS nt_get_drive_map(HANDLE process, PROCESS_DEVICEMAP_INFORMATION *map OUT) { diff --git a/core/win32/ntdll.h b/core/win32/ntdll.h index ca12a1bb837..106462b17da 100644 --- a/core/win32/ntdll.h +++ b/core/win32/ntdll.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2018 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. 
* **********************************************************/ @@ -95,7 +95,21 @@ # define ATTACH_PARENT_PROCESS ((DWORD)-1) #endif -#ifndef X64 +#ifdef X64 +typedef struct _UNICODE_STRING_32 { + /* Length field is size in bytes not counting final 0 */ + USHORT Length; + USHORT MaximumLength; + uint Buffer; +} UNICODE_STRING_32; + +typedef struct _RTL_USER_PROCESS_PARAMETERS_32 { + uint Reserved[14]; + UNICODE_STRING_32 ImagePathName; + UNICODE_STRING_32 CommandLine; + uint Environment; +} RTL_USER_PROCESS_PARAMETERS_32, *PRTL_USER_PROCESS_PARAMETERS_32; +#else typedef struct ALIGN_VAR(8) _UNICODE_STRING_64 { /* Length field is size in bytes not counting final 0 */ USHORT Length; @@ -109,6 +123,14 @@ typedef struct ALIGN_VAR(8) _UNICODE_STRING_64 { uint64 Buffer64; } u; } UNICODE_STRING_64; + +typedef struct _RTL_USER_PROCESS_PARAMETERS_64 { + BYTE Reserved1[16]; + uint64 Reserved2[10]; + UNICODE_STRING_64 ImagePathName; + UNICODE_STRING_64 CommandLine; + uint64 Environment; +} RTL_USER_PROCESS_PARAMETERS_64, *PRTL_USER_PROCESS_PARAMETERS_64; #endif /* from DDK2003SP1/3790.1830/inc/ddk/wnet/ntddk.h */ @@ -207,28 +229,24 @@ typedef struct _LDR_MODULE { /* offset: 32bit / 64bit */ LDR_DLL_LOAD_REASON LoadReason; /* 0x094 / 0x10c */ } LDR_MODULE, *PLDR_MODULE; -/* This macro is defined so that 32-bit dlls can be handled in 64-bit DR. +/* This macro is defined so that 32-bit dlls can be handled in 64-bit DR, + * and vice versa (for injection from 32-bit into a 64-bit child). * Not all IMAGE_OPTIONAL_HEADER fields are affected, only ImageBase, * LoaderFlags, NumberOfRvaAndSizes, SizeOf{Stack,Heap}{Commit,Reserve}, * and DataDirectory, of which we use only ImageBase and DataDirectory. * All other fields happen to have the same offsets and sizes in both * IMAGE_OPTIONAL_HEADER32 and IMAGE_OPTIONAL_HEADER64. */ -#ifdef X64 /* Don't need to use module_is_32bit() here as that is heavyweight. 
Also, as * it is used directly in process_image() just when the module processing * begins, we don't have to do all the checks here. */ -# define OPT_HDR(nt_hdr_p, field) OPT_HDR_BASE(nt_hdr_p, field, ) -# define OPT_HDR_P(nt_hdr_p, field) OPT_HDR_BASE(nt_hdr_p, field, (app_pc) &) -# define OPT_HDR_BASE(nt_hdr_p, field, amp) \ - ((nt_hdr_p)->OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR32_MAGIC \ - ? amp(((IMAGE_OPTIONAL_HEADER32 *)&((nt_hdr_p)->OptionalHeader))->field) \ - : amp(((IMAGE_OPTIONAL_HEADER64 *)&((nt_hdr_p)->OptionalHeader))->field)) -#else -# define OPT_HDR(nt_hdr_p, field) ((nt_hdr_p)->OptionalHeader.field) -# define OPT_HDR_P(nt_hdr_p, field) (&((nt_hdr_p)->OptionalHeader.field)) -#endif +#define OPT_HDR(nt_hdr_p, field) OPT_HDR_BASE(nt_hdr_p, field, ) +#define OPT_HDR_P(nt_hdr_p, field) OPT_HDR_BASE(nt_hdr_p, field, (app_pc) &) +#define OPT_HDR_BASE(nt_hdr_p, field, amp) \ + ((nt_hdr_p)->OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR32_MAGIC \ + ? amp(((IMAGE_OPTIONAL_HEADER32 *)&((nt_hdr_p)->OptionalHeader))->field) \ + : amp(((IMAGE_OPTIONAL_HEADER64 *)&((nt_hdr_p)->OptionalHeader))->field)) /* For use by routines that walk the module lists. 
*/ enum { MAX_MODULE_LIST_INFINITE_LOOP_THRESHOLD = 2048 }; @@ -385,7 +403,6 @@ typedef _W64 long LONG_PTR, *PLONG_PTR; typedef _W64 unsigned long ULONG_PTR, *PULONG_PTR; typedef ULONG KAFFINITY; #endif -typedef LONG KPRIORITY; typedef struct _KERNEL_USER_TIMES { LARGE_INTEGER CreateTime; @@ -1275,14 +1292,6 @@ typedef struct _KUSER_SHARED_DATA { /* We only rely on this up through Windows XP */ #define KUSER_SHARED_DATA_ADDRESS ((ULONG_PTR)0x7ffe0000) -/*************************************************************************** - * convenience enums - */ -typedef enum { - MEMORY_RESERVE_ONLY = MEM_RESERVE, - MEMORY_COMMIT = MEM_RESERVE | MEM_COMMIT -} memory_commit_status_t; - /*************************************************************************** * function declarations */ @@ -1337,6 +1346,15 @@ get_peb(HANDLE h); PEB * get_own_peb(void); +uint64 +get_peb_maybe64(HANDLE h); + +#ifdef X64 +/* Returns the 32-bit PEB for a WOW64 process, given process and thread handles. */ +uint64 +get_peb32(HANDLE process, HANDLE thread); +#endif + TEB * get_teb(HANDLE h); @@ -1506,6 +1524,9 @@ get_process_load(HANDLE h); bool is_wow64_process(HANDLE h); +bool +is_32bit_process(HANDLE h); + NTSTATUS nt_get_drive_map(HANDLE process, PROCESS_DEVICEMAP_INFORMATION *map OUT); @@ -1630,10 +1651,10 @@ query_full_attributes_file(PCWSTR filename, PFILE_NETWORK_OPEN_INFORMATION info) #define FILE_ANY_ACCESS 0 #define FILE_SPECIAL_ACCESS (FILE_ANY_ACCESS) #ifndef FILE_READ_ACCESS -# define FILE_READ_ACCESS (0x0001) // file & pipe +# define FILE_READ_ACCESS (0x0001) // file & pipe #endif #ifndef FILE_WRITE_ACCESS -# define FILE_WRITE_ACCESS (0x0002) // file & pipe +# define FILE_WRITE_ACCESS (0x0002) // file & pipe #endif /* share flags, from ntddk.h, rest are in winnt.h */ @@ -2154,7 +2175,14 @@ get_own_context(CONTEXT *cxt); enum { /* can't put w/ os_exports.h enum b/c needed for non-core */ /* for accessing x64 data from WOW64 */ X64_PEB_TIB_OFFSET = 0x060, + X86_PEB_TIB_OFFSET = 
0x030, + X64_SELF_TIB_OFFSET = 0x030, + X86_SELF_TIB_OFFSET = 0x018, X64_LDR_PEB_OFFSET = 0x018, + X64_IMAGE_BASE_PEB_OFFSET = 0x010, + X86_IMAGE_BASE_PEB_OFFSET = 0x008, + X64_PROCESS_PARAM_PEB_OFFSET = 0x020, + X86_PROCESS_PARAM_PEB_OFFSET = 0x010, }; LDR_MODULE * @@ -2180,8 +2208,28 @@ get_module_handle_64(const wchar_t *name); uint64 get_proc_address_64(uint64 lib, const char *name); + +bool +remote_protect_virtual_memory_64(HANDLE process, uint64 base, size_t size, uint prot, + uint *old_prot); #endif /* !X64 */ +uint64 +get_remote_proc_address(HANDLE process, uint64 remote_base, const char *name); + +bool +get_remote_dll_short_name(HANDLE process, uint64 remote_base, OUT char *name, + size_t name_len, OUT bool *is_64); + +bool +remote_protect_virtual_memory_maybe64(HANDLE process, uint64 base, size_t size, uint prot, + uint *old_prot); + +NTSTATUS +remote_query_virtual_memory_maybe64(HANDLE process, uint64 addr, + MEMORY_BASIC_INFORMATION64 *mbi, size_t mbilen, + uint64 *got); + IMAGE_EXPORT_DIRECTORY * get_module_exports_directory(app_pc base_addr, size_t *exports_size /* OPTIONAL OUT */); diff --git a/core/win32/ntdll_shared.c b/core/win32/ntdll_shared.c index 650da5b8f82..9ad2db4267a 100644 --- a/core/win32/ntdll_shared.c +++ b/core/win32/ntdll_shared.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2019 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -72,7 +72,52 @@ #include "ntdll_shared.h" -#ifndef X64 +/* In ntdll.c which is linked everywhere ntdll_shared.c is these days. 
*/ +bool +nt_read_virtual_memory(HANDLE process, const void *base, void *buffer, + size_t buffer_length, size_t *bytes_read); + +bool +nt_write_virtual_memory(HANDLE process, void *base, const void *buffer, + size_t buffer_length, size_t *bytes_written); + +#ifndef X64 /* Around most of the rest of the file. */ + +# if !defined(NOT_DYNAMORIO_CORE) && !defined(NOT_DYNAMORIO_CORE_PROPER) +# define UNPROT_IF_INIT() \ + do { \ + /* The first call may not be during init so we have to unprot */ \ + if (dynamo_initialized) { \ + SELF_UNPROTECT_DATASEC(DATASEC_RARELY_PROT); \ + } \ + } while (0) +# define PROT_IF_INIT() \ + do { \ + /* Restores the protection removed by UNPROT_IF_INIT(). */ \ + if (dynamo_initialized) { \ + SELF_PROTECT_DATASEC(DATASEC_RARELY_PROT); \ + } \ + } while (0) +# else +# define PROT_IF_INIT() /* Nothing. */ +# define UNPROT_IF_INIT() /* Nothing. */ +# endif + +# ifdef NOT_DYNAMORIO_CORE +# define GET_PROC_ADDR(name) GetProcAddress(GetModuleHandle("ntdll.dll"), name) +# else +# define GET_PROC_ADDR(name) d_r_get_proc_address(get_ntdll_base(), name) +# endif + +# define INIT_NTWOW64_FUNCPTR(var, name) \ + do { \ + if ((var) == NULL) { \ + UNPROT_IF_INIT(); \ + var = (name##_t)GET_PROC_ADDR(#name); \ + PROT_IF_INIT(); \ + } \ + } while (0) + NTSTATUS nt_wow64_read_virtual_memory64(HANDLE process, uint64 base, void *buffer, size_t buffer_length, size_t *bytes_read) @@ -83,23 +128,7 @@ nt_wow64_read_virtual_memory64(HANDLE process, uint64 base, void *buffer, IN ULONGLONG BufferSize, OUT PULONGLONG NumberOfBytesRead); static NtWow64ReadVirtualMemory64_t ntcall; NTSTATUS res; - if (ntcall == NULL) { -# if !defined(NOT_DYNAMORIO_CORE) && !defined(NOT_DYNAMORIO_CORE_PROPER) - /* The first call may not be during init so we have to unprot */ - if (dynamo_initialized) - SELF_UNPROTECT_DATASEC(DATASEC_RARELY_PROT); -# endif - ntcall = (NtWow64ReadVirtualMemory64_t) -# ifdef NOT_DYNAMORIO_CORE - GetProcAddress(GetModuleHandle("ntdll.dll"), 
"NtWow64ReadVirtualMemory64"); -# else - d_r_get_proc_address(get_ntdll_base(), "NtWow64ReadVirtualMemory64"); -# endif -# if !defined(NOT_DYNAMORIO_CORE) && !defined(NOT_DYNAMORIO_CORE_PROPER) - if (dynamo_initialized) - SELF_PROTECT_DATASEC(DATASEC_RARELY_PROT); -# endif - } + INIT_NTWOW64_FUNCPTR(ntcall, NtWow64ReadVirtualMemory64); if (ntcall == NULL) { /* We do not need to fall back to NtReadVirtualMemory, b/c * NtWow64ReadVirtualMemory64 was added in xp64==2003 and so should @@ -116,4 +145,77 @@ nt_wow64_read_virtual_memory64(HANDLE process, uint64 base, void *buffer, } return res; } + +NTSTATUS +nt_wow64_write_virtual_memory64(HANDLE process, uint64 base, void *buffer, + size_t buffer_length, size_t *bytes_written) +{ + /* Just like nt_wow64_read_virtual_memory64, we dynamically acquire. */ + typedef NTSTATUS(NTAPI * NtWow64WriteVirtualMemory64_t)( + HANDLE ProcessHandle, IN PVOID64 BaseAddress, IN PVOID Buffer, + IN ULONGLONG BufferSize, OUT PULONGLONG NumberOfBytesWritten); + static NtWow64WriteVirtualMemory64_t ntcall; + NTSTATUS res; + INIT_NTWOW64_FUNCPTR(ntcall, NtWow64WriteVirtualMemory64); + if (ntcall == NULL) { + ASSERT_NOT_REACHED(); + res = STATUS_NOT_IMPLEMENTED; + } else { + uint64 len; + res = ntcall(process, (PVOID64)base, buffer, (ULONGLONG)buffer_length, &len); + if (bytes_written != NULL) + *bytes_written = (size_t)len; + } + return res; +} + +NTSTATUS +nt_wow64_query_info_process64(HANDLE process, PROCESS_BASIC_INFORMATION64 *info) +{ + /* Just like nt_wow64_read_virtual_memory64, we dynamically acquire. 
*/ + typedef NTSTATUS(NTAPI * NtWow64QueryInformationProcess64_t)( + HANDLE ProcessHandle, IN PROCESSINFOCLASS InfoClass, OUT PVOID Buffer, + IN ULONG BufferSize, OUT PULONG NumberOfBytesRead); + static NtWow64QueryInformationProcess64_t ntcall; + NTSTATUS res; + INIT_NTWOW64_FUNCPTR(ntcall, NtWow64QueryInformationProcess64); + if (ntcall == NULL) { + ASSERT_NOT_REACHED(); + res = STATUS_NOT_IMPLEMENTED; + } else { + ULONG got; + res = ntcall(process, ProcessBasicInformation, info, sizeof(*info), &got); + ASSERT(!NT_SUCCESS(res) || got == sizeof(PROCESS_BASIC_INFORMATION64)); + } + return res; +} + +#endif /* !X64 */ + +#ifndef NOT_DYNAMORIO_CORE +bool +read_remote_memory_maybe64(HANDLE process, uint64 base, void *buffer, + size_t buffer_length, size_t *bytes_read) +{ +# ifdef X64 + return nt_read_virtual_memory(process, (LPVOID)base, buffer, buffer_length, + bytes_read); +# else + return NT_SUCCESS( + nt_wow64_read_virtual_memory64(process, base, buffer, buffer_length, bytes_read)); +# endif +} + +bool +write_remote_memory_maybe64(HANDLE process, uint64 base, void *buffer, + size_t buffer_length, size_t *bytes_written) +{ +# ifdef X64 + return nt_write_virtual_memory(process, (LPVOID)base, buffer, buffer_length, + bytes_written); +# else + return NT_SUCCESS(nt_wow64_write_virtual_memory64(process, base, buffer, + buffer_length, bytes_written)); +# endif +} #endif diff --git a/core/win32/ntdll_shared.h b/core/win32/ntdll_shared.h index 1860c9b225b..0bf3f99bd6f 100644 --- a/core/win32/ntdll_shared.h +++ b/core/win32/ntdll_shared.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2015 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2003-2010 VMware, Inc. All rights reserved.
* **********************************************************/ @@ -46,11 +46,38 @@ #include #include "ntdll_types.h" +bool +read_remote_memory_maybe64(HANDLE process, uint64 base, void *buffer, + size_t buffer_length, size_t *bytes_read); + +bool +write_remote_memory_maybe64(HANDLE process, uint64 base, void *buffer, + size_t buffer_length, size_t *bytes_written); + #ifndef X64 -/* returns raw NTSTATUS */ +typedef struct _PROCESS_BASIC_INFORMATION64 { + NTSTATUS ExitStatus; + uint64 PebBaseAddress; + uint64 AffinityMask; + KPRIORITY BasePriority; + uint64 UniqueProcessId; + uint64 InheritedFromUniqueProcessId; +} PROCESS_BASIC_INFORMATION64; + +/* Returns raw NTSTATUS. */ NTSTATUS nt_wow64_read_virtual_memory64(HANDLE process, uint64 base, void *buffer, size_t buffer_length, size_t *bytes_read); + +/* Returns raw NTSTATUS. */ +NTSTATUS +nt_wow64_write_virtual_memory64(HANDLE process, uint64 base, void *buffer, + size_t buffer_length, size_t *bytes_written); + +/* Returns raw NTSTATUS. */ +NTSTATUS +nt_wow64_query_info_process64(HANDLE process, PROCESS_BASIC_INFORMATION64 *info); + #endif #endif /* _NTDLL_SHARED_H_ */ diff --git a/core/win32/ntdll_types.h b/core/win32/ntdll_types.h index f7e9716c25b..2b6ff5af819 100644 --- a/core/win32/ntdll_types.h +++ b/core/win32/ntdll_types.h @@ -33,6 +33,8 @@ typedef LONG NTSTATUS; #define STATUS_SUCCESS ((NTSTATUS)0x00000000L) #define STATUS_UNSUCCESSFUL ((NTSTATUS)0xC0000001L) +typedef LONG KPRIORITY; + typedef struct _UNICODE_STRING { /* Length field is size in bytes not counting final 0 */ USHORT Length; @@ -486,4 +488,12 @@ typedef struct _RTL_USER_PROCESS_PARAMETERS { #define FILE_DEVICE_CONSOLE 0x00000050 +/*************************************************************************** + * convenience enums + */ +typedef enum { + MEMORY_RESERVE_ONLY = MEM_RESERVE, + MEMORY_COMMIT = MEM_RESERVE | MEM_COMMIT +} memory_commit_status_t; + #endif /* _NTDLL_TYPES_H_ */ diff --git a/core/win32/os.c b/core/win32/os.c index 
edbb5c2b8e4..50414110228 100644 --- a/core/win32/os.c +++ b/core/win32/os.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2000-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -3339,6 +3339,7 @@ inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt, to have them use the same library. */ char library_path_buf[MAXIMUM_PATH]; + char alt_arch_path[MAXIMUM_PATH]; char *library = library_path_buf; bool res; @@ -3352,8 +3353,9 @@ inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt, * unless the child is in fact explicit in which case we just use the global library. */ + bool custom_library = false; switch (err) { - case GET_PARAMETER_SUCCESS: break; + case GET_PARAMETER_SUCCESS: custom_library = true; break; case GET_PARAMETER_NOAPPSPECIFIC: /* We got the global key's library, use parent's library instead if the only * reason we're injecting is -follow_children (i.e. reading RUNUNDER gave us @@ -3368,8 +3370,38 @@ inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt, default: ASSERT_NOT_REACHED(); } + if (!custom_library IF_X64(&&!DYNAMO_OPTION(inject_x64))) { + if (IF_NOT_X64(!) is_32bit_process(process_handle)) { + /* The build system passes us the LIBDIR_X{86,64} defines. */ +# define DR_LIBDIR_X86 STRINGIFY(LIBDIR_X86) +# define DR_LIBDIR_X64 STRINGIFY(LIBDIR_X64) + strncpy(alt_arch_path, library, BUFFER_SIZE_ELEMENTS(alt_arch_path)); + /* Assumption: libdir name is not repeated elsewhere in path */ + char *libdir = + strstr(alt_arch_path, IF_X64_ELSE(DR_LIBDIR_X64, DR_LIBDIR_X86)); + if (libdir != NULL) { + const char *newdir = IF_X64_ELSE(DR_LIBDIR_X86, DR_LIBDIR_X64); + /* Do NOT place the NULL. 
*/ + strncpy(libdir, newdir, strlen(newdir)); + NULL_TERMINATE_BUFFER(alt_arch_path); + library = alt_arch_path; + LOG(THREAD, LOG_SYSCALLS | LOG_THREADS, 1, + "alternate-bitwidth library path: %s\n", library); + } else { + REPORT_FATAL_ERROR_AND_EXIT( + INJECTION_LIBRARY_MISSING, 3, get_application_name(), + get_application_pid(), + ""); + } + } + } + LOG(THREAD, LOG_SYSCALLS | LOG_THREADS, 1, "\tinjecting %s into child process\n", library); + if (!os_file_exists(library, false)) { + REPORT_FATAL_ERROR_AND_EXIT(INJECTION_LIBRARY_MISSING, 3, get_application_name(), + get_application_pid(), library); + } if (DYNAMO_OPTION(aslr_dr) && /* case 8749 - can't aslr dr for thin_clients */ @@ -3412,6 +3444,7 @@ inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt, return true; } +/* Does not support 32-bit asking about a 64-bit process. */ bool is_first_thread_in_new_process(HANDLE process_handle, CONTEXT *cxt) { @@ -3428,37 +3461,42 @@ is_first_thread_in_new_process(HANDLE process_handle, CONTEXT *cxt) * but no easy way to do either here. FIXME */ process_id_t pid = process_id_from_handle(process_handle); - if (pid == 0) + if (pid == 0) { + LOG(THREAD_GET, LOG_SYSCALLS | LOG_THREADS, 2, "%s: failed to get pid\n", __FUNCTION__); return true; + } if (!is_pid_me(pid)) { - ptr_uint_t peb = (ptr_uint_t)get_peb(process_handle); - if (cxt->THREAD_START_ARG == peb) + uint64 peb = get_peb_maybe64(process_handle); + uint64 start_arg = + IF_X64_ELSE(cxt->THREAD_START_ARG64, + is_32bit_process(process_handle) ? cxt->THREAD_START_ARG32 + : cxt->THREAD_START_ARG64); + LOG(THREAD_GET, LOG_SYSCALLS | LOG_THREADS, 2, + "%s: pid=" PIFX " vs me=" PIFX ", arg=0x%I64x vs peb=0x%I64x\n", __FUNCTION__, + pid, get_process_id(), start_arg, peb); + if (start_arg == peb) return true; else if (is_wow64_process(process_handle) && get_os_version() >= WINDOWS_VERSION_VISTA) { /* i#816: for wow64 process PEB query will be x64 while thread addr * will be the x86 PEB.
On Vista and Win7 the x86 PEB seems to * always be one page below but we don't want to rely on that, and - * it doesn't hold on Win8. Instead we ensure the start addr is - * a one-page alloc whose first 3 fields match the x64 PEB: - * boolean flags, Mutant, and ImageBaseAddress. + * it doesn't hold on Win8. Instead we ensure the start addr's + * first 3 fields match the x64 PEB: boolean flags, Mutant, and + * ImageBaseAddress. + * + * XXX: We now have get_peb32() with a thread handle. But this is no + * longer used for the default injection. */ int64 peb64[3]; int peb32[3]; byte *base = NULL; - size_t sz = get_allocation_size_ex(process_handle, - (byte *)cxt->THREAD_START_ARG, &base); - LOG(THREAD_GET, LOG_SYSCALLS | LOG_THREADS, 2, - "%s: pid=" PIFX " vs me=" PIFX ", arg=" PFX " vs peb=" PFX "\n", - __FUNCTION__, pid, get_process_id(), cxt->THREAD_START_ARG, peb); - if (sz != PAGE_SIZE || base != (byte *)cxt->THREAD_START_ARG) - return false; - if (!nt_read_virtual_memory(process_handle, (const void *)peb, peb64, - sizeof(peb64), &sz) || + size_t sz; + if (!read_remote_memory_maybe64(process_handle, peb, peb64, sizeof(peb64), + &sz) || sz != sizeof(peb64) || - !nt_read_virtual_memory(process_handle, - (const void *)cxt->THREAD_START_ARG, peb32, - sizeof(peb32), &sz) || + !read_remote_memory_maybe64(process_handle, start_arg, peb32, + sizeof(peb32), &sz) || sz != sizeof(peb32)) return false; LOG(THREAD_GET, LOG_SYSCALLS | LOG_THREADS, 2, @@ -3475,7 +3513,9 @@ is_first_thread_in_new_process(HANDLE process_handle, CONTEXT *cxt) /* Depending on registry and options maybe inject into child process with * handle process_handle. Called by SYS_CreateThread in pre_system_call (in * which case cxt is non-NULL) and by CreateProcess[Ex] in post_system_call (in - * which case cxt is NULL). */ + * which case cxt is NULL). + * Does not support cross-arch injection for cxt!=NULL. 
+ */ bool maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt) { @@ -3518,8 +3558,9 @@ maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT * } else { injected = true; /* attempted, at least */ ASSERT(cxt != NULL || DYNAMO_OPTION(early_inject)); - /* FIXME : if not -early_inject, we are going to read and write - * to cxt, which may be unsafe */ + /* XXX: if not -early_inject, we are going to read and write + * to cxt, which may be unsafe. + */ if (inject_into_process(dcontext, process_handle, cxt, should_inject)) { check_for_run_once(process_handle, rununder_mask); } @@ -9019,21 +9060,23 @@ earliest_inject_init(byte *arg_ptr) earliest_args_t *args = (earliest_args_t *)arg_ptr; /* Set up imports w/o making any library calls */ - if (!privload_bootstrap_dynamorio_imports(args->dr_base, args->ntdll_base)) { + if (!privload_bootstrap_dynamorio_imports((byte *)(ptr_int_t)args->dr_base, + (byte *)(ptr_int_t)args->ntdll_base)) { /* XXX: how handle failure? too early to ASSERT. how bail? * should we just silently go native? */ } else { /* Restore +rx to hook location before DR init scans it */ uint old_prot; - if (!bootstrap_protect_virtual_memory(args->hook_location, EARLY_INJECT_HOOK_SIZE, - PAGE_EXECUTE_READ, &old_prot)) { + if (!bootstrap_protect_virtual_memory((byte *)(ptr_int_t)args->hook_location, + EARLY_INJECT_HOOK_SIZE, PAGE_EXECUTE_READ, + &old_prot)) { /* XXX: again, how handle failure? */ } } /* We can't walk Ldr list to get this so set it from parent args */ - set_ntdll_base(args->ntdll_base); + set_ntdll_base((byte *)(ptr_int_t)args->ntdll_base); /* We can't get DR path from Ldr list b/c DR won't be in there even once * it's initialized so we pass it in from parent. 
@@ -9058,7 +9101,7 @@ void earliest_inject_cleanup(byte *arg_ptr) { earliest_args_t *args = (earliest_args_t *)arg_ptr; - byte *tofree = args->tofree_base; + byte *tofree = (byte *)(ptr_int_t)args->tofree_base; NTSTATUS res; /* Free tofree (which contains args). diff --git a/core/win32/os_private.h b/core/win32/os_private.h index a24fb040a90..982cf4610d0 100644 --- a/core/win32/os_private.h +++ b/core/win32/os_private.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2005-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -68,12 +68,12 @@ extern app_pc dynamo_dll_end; extern dcontext_t *early_inject_load_helper_dcontext; -/* passed to early injection init by parent */ +/* Passed to early injection init by parent. Sized to work for any bitwidth. */ typedef struct { - byte *dr_base; - byte *ntdll_base; - byte *tofree_base; - byte *hook_location; + uint64 dr_base; + uint64 ntdll_base; + uint64 tofree_base; + uint64 hook_location; bool late_injection; char dynamorio_lib_path[MAX_PATH]; } earliest_args_t; @@ -346,7 +346,9 @@ os_rename_file_in_directory(IN HANDLE rootdir, const wchar_t *orig_name, /* see notes in intercept_new_thread() about these values */ #define THREAD_START_ADDR IF_X64_ELSE(CXT_XCX, CXT_XAX) -#define THREAD_START_ARG IF_X64_ELSE(CXT_XDX, CXT_XBX) +#define THREAD_START_ARG64 CXT_XDX +#define THREAD_START_ARG32 CXT_XBX +#define THREAD_START_ARG IF_X64_ELSE(THREAD_START_ARG64, THREAD_START_ARG32) void callback_init(void); diff --git a/core/win32/pre_inject.c b/core/win32/pre_inject.c index 2dd895a3d4d..34cef507187 100644 --- a/core/win32/pre_inject.c +++ b/core/win32/pre_inject.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2012-2019 Google, Inc. All rights reserved. 
+ * Copyright (c) 2012-2021 Google, Inc. All rights reserved. * Copyright (c) 2001-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -172,9 +172,10 @@ display_error(char *msg) typedef int (*int_func_t)(); typedef void (*void_func_t)(); -/* in arch/x86.asm */ +/* in drlibc_x86.asm */ extern int -switch_modes_and_call(void_func_t func, void *arg1, void *arg2, void *arg3); +switch_modes_and_call(void_func_t func, void *arg1, void *arg2, void *arg3, void *arg4, + void *arg5, void *arg6); static bool load_dynamorio_lib(IF_NOT_X64(bool x64_in_wow64)) { @@ -333,14 +334,14 @@ static bool load_dynamorio_lib(IF_NOT_X64(bool x64_in_wow64)) VERBOSE_MESSAGE("about to inject dynamorio"); #ifndef X64 if (x64_in_wow64) - res = switch_modes_and_call(init_func, NULL, NULL, NULL); + res = switch_modes_and_call(init_func, NULL, NULL, NULL, NULL, NULL, NULL); else #endif res = (*init_func)(); VERBOSE_MESSAGE("dynamorio_app_init() returned %d\n", res); #ifndef X64 if (x64_in_wow64) - switch_modes_and_call(take_over_func, NULL, NULL, NULL); + switch_modes_and_call(take_over_func, NULL, NULL, NULL, NULL, NULL, NULL); else #endif (*take_over_func)(); diff --git a/core/win32/syscall.c b/core/win32/syscall.c index bb9bc5f66ce..c32e9a3bfff 100644 --- a/core/win32/syscall.c +++ b/core/win32/syscall.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2019 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2006-2010 VMware, Inc. All rights reserved. 
* **********************************************************/ @@ -1491,12 +1491,13 @@ static const wchar_t *const wenv_to_propagate[] = { }; #define NUM_ENV_TO_PROPAGATE (sizeof(env_to_propagate) / sizeof(env_to_propagate[0])) -/* read env var from remote process: +/* Read env var from remote process: * - return true on read successfully or until end of reading * - skip DR env vars + * Handles both 32-bit and 64-bit remote processes. */ -static wchar_t * -get_process_env_var(HANDLE phandle, wchar_t *env_ptr, wchar_t *buf, size_t toread) +static uint64 +get_process_env_var(HANDLE phandle, uint64 env_ptr, wchar_t *buf, size_t toread) { int i; size_t got; @@ -1507,17 +1508,16 @@ get_process_env_var(HANDLE phandle, wchar_t *env_ptr, wchar_t *buf, size_t torea /* if an env var is too long we're ok: DR vars will fit, and if longer we'll * handle rest next call. */ - if (!nt_read_virtual_memory(phandle, env_ptr, buf, toread, &got)) { + if (!read_remote_memory_maybe64(phandle, env_ptr, buf, toread, &got)) { /* may have crossed page boundary and the next page is inaccessible */ - byte *start = (byte *)env_ptr; - if (PAGE_START(start) != PAGE_START(start + toread)) { - ASSERT((size_t)((byte *)ALIGN_FORWARD(start, PAGE_SIZE) - start) <= - toread); - toread = (byte *)ALIGN_FORWARD(start, PAGE_SIZE) - start; - if (!nt_read_virtual_memory(phandle, env_ptr, buf, toread, &got)) - return NULL; + uint64 start = env_ptr; + if (PAGE_START64(start) != PAGE_START64(start + toread)) { + ASSERT((size_t)(ALIGN_FORWARD(start, PAGE_SIZE) - start) <= toread); + toread = (size_t)(ALIGN_FORWARD(start, PAGE_SIZE) - start); + if (!read_remote_memory_maybe64(phandle, env_ptr, buf, toread, &got)) + return 0; } else - return NULL; + return 0; continue; } buf[got / sizeof(buf[0]) - 1] = '\0'; @@ -1531,16 +1531,20 @@ get_process_env_var(HANDLE phandle, wchar_t *env_ptr, wchar_t *buf, size_t torea } if (keep_env) return env_ptr; - env_ptr += wcslen(buf) + 1; + env_ptr += (wcslen(buf) + 1) * 
sizeof(wchar_t); } - return false; + return 0; } /* called at presys-ResumeThread to append DR env vars in the target process PEB */ static bool -add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) +add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, uint64 env_ptr, bool peb_is_32) { - wchar_t *env, *cur; + union { + uint64 base64; + uint base32; + } env_base; + uint64 env, cur; size_t tot_sz = 0, app_sz, sz; size_t got; wchar_t *new_env = NULL; @@ -1564,29 +1568,30 @@ add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) return true; /* nothing to do */ } - ASSERT(env_ptr != NULL); - if (!nt_read_virtual_memory(phandle, env_ptr, &env, sizeof(env), NULL)) + ASSERT(env_ptr != 0); + if (!read_remote_memory_maybe64(phandle, env_ptr, &env_base, sizeof(env_base), NULL)) goto add_dr_env_failure; - if (env != NULL) { + env = peb_is_32 ? env_base.base32 : env_base.base64; + if (env != 0) { /* compute size of current env block, and check for existing DR vars */ cur = env; while (true) { /* for simplicity we do a syscall for each var */ cur = get_process_env_var(phandle, cur, buf, sizeof(buf)); - if (cur == NULL) + if (cur == 0) return false; if (buf[0] == '\0') break; tot_sz += wcslen(buf) + 1; - cur += wcslen(buf) + 1; + cur += (wcslen(buf) + 1) * sizeof(wchar_t); } tot_sz++; /* final 0 marking end */ /* from here on out, all *sz vars are total bytes, not wchar_t elements */ - tot_sz *= sizeof(*env); + tot_sz *= sizeof(wchar_t); } app_sz = tot_sz; - LOG(THREAD, LOG_SYSCALLS, 2, "%s: orig app env vars at " PFX "-" PFX "\n", - __FUNCTION__, env, env + app_sz / sizeof(*env)); + LOG(THREAD, LOG_SYSCALLS, 2, "%s: orig app env vars at 0x%I64x-0x%I64x\n", + __FUNCTION__, env, env + app_sz); /* calculate size needed for adding DR env vars. * for each var, we truncate if too big for buf.
@@ -1599,11 +1604,16 @@ add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) SYSLOG_INTERNAL(SYSLOG_WARNING, "truncating DR env var for child"); sz_var[i] = BUFFER_SIZE_ELEMENTS(buf); } - sz_var[i] *= sizeof(*env); + sz_var[i] *= sizeof(wchar_t); tot_sz += sz_var[i]; } } - /* allocate a new env block and copy over the old */ + /* Allocate a new env block and copy over the old. + * We're fine being limited to low addresses for parent32 child64 + * (NtWow64AllocateVirtualMemory64 is win8+ only). + * That means we can also use the regular write, protect, and free calls below + * for the new block (but not the original PEB addresses). + */ res = nt_remote_allocate_virtual_memory(phandle, &new_env, tot_sz, PAGE_READWRITE, MEM_COMMIT); if (!NT_SUCCESS(res)) { @@ -1612,30 +1622,30 @@ add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) goto add_dr_env_failure; } LOG(THREAD, LOG_SYSCALLS, 2, "%s: new app env vars allocated at " PFX "-" PFX "\n", - __FUNCTION__, new_env, new_env + tot_sz / sizeof(*env)); + __FUNCTION__, new_env, new_env + tot_sz / sizeof(wchar_t)); cur = env; sz = 0; while (true) { /* for simplicity we do a syscall for each var */ size_t towrite = 0; cur = get_process_env_var(phandle, cur, buf, sizeof(buf)); - if (cur == NULL) + if (cur == 0) goto add_dr_env_failure; if (buf[0] == '\0') break; towrite = (wcslen(buf) + 1); - res = nt_raw_write_virtual_memory(phandle, new_env + sz / sizeof(*env), buf, - towrite * sizeof(*env), &got); + res = nt_raw_write_virtual_memory(phandle, new_env + sz / sizeof(wchar_t), buf, + towrite * sizeof(wchar_t), &got); if (!NT_SUCCESS(res)) { LOG(THREAD, LOG_SYSCALLS, 2, "%s copy: got status " PFX ", wrote " PIFX " vs requested " PIFX "\n", __FUNCTION__, res, got, towrite); goto add_dr_env_failure; } - sz += towrite * sizeof(*env); - cur += towrite; + sz += towrite * sizeof(wchar_t); + cur += towrite * sizeof(wchar_t); } - ASSERT(sz == app_sz - sizeof(*env) /* before final 0 */); + 
ASSERT(sz == app_sz - sizeof(wchar_t) /* before final 0 */); /* add DR env vars at the end. * XXX: is alphabetical sorting relied upon? adding to end is working. @@ -1645,31 +1655,40 @@ add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) _snwprintf(buf, BUFFER_SIZE_ELEMENTS(buf), L"%s=%S", wenv_to_propagate[i], get_config_val(env_to_propagate[i])); NULL_TERMINATE_BUFFER(buf); - if (!nt_write_virtual_memory(phandle, new_env + sz / sizeof(*env), buf, + if (!nt_write_virtual_memory(phandle, new_env + sz / sizeof(wchar_t), buf, sz_var[i], NULL)) goto add_dr_env_failure; + LOG(THREAD, LOG_SYSCALLS, 2, "%s: wrote DR env var |%S| to 0x%I64x\n", + __FUNCTION__, buf, new_env + sz / sizeof(wchar_t)); sz += sz_var[i]; } } - ASSERT(sz == tot_sz - sizeof(*env) /* before final 0 */); + ASSERT(sz == tot_sz - sizeof(wchar_t) /* before final 0 */); /* write final 0 */ buf[0] = 0; - if (!nt_write_virtual_memory(phandle, new_env + sz / sizeof(*env), buf, sizeof(*env), - NULL)) + if (!nt_write_virtual_memory(phandle, new_env + sz / sizeof(wchar_t), buf, + sizeof(wchar_t), NULL)) goto add_dr_env_failure; /* install new env */ - if (!nt_remote_protect_virtual_memory(phandle, (byte *)PAGE_START(env_ptr), PAGE_SIZE, - PAGE_READWRITE, &old_prot)) { - LOG(THREAD, LOG_SYSCALLS, 1, "%s: failed to mark " PFX " writable\n", - __FUNCTION__, env_ptr); + if (!remote_protect_virtual_memory_maybe64(phandle, PAGE_START64(env_ptr), PAGE_SIZE, + PAGE_READWRITE, &old_prot)) { + LOG(THREAD, LOG_SYSCALLS, 1, "%s: failed to mark 0x%I64x writable\n", + __FUNCTION__, PAGE_START64(env_ptr)); goto add_dr_env_failure; } - if (!nt_write_virtual_memory(phandle, env_ptr, &new_env, sizeof(new_env), NULL)) + union { + uint64 ptr64; + uint ptr32; + } new_env_remote; + new_env_remote.ptr64 = (uint64)new_env; + new_env_remote.ptr32 = (uint)(ptr_uint_t)new_env; + if (!write_remote_memory_maybe64(phandle, env_ptr, &new_env_remote, peb_is_32 ? 
4 : 8, + NULL)) goto add_dr_env_failure; - if (!nt_remote_protect_virtual_memory(phandle, (byte *)PAGE_START(env_ptr), PAGE_SIZE, - old_prot, &old_prot)) { - LOG(THREAD, LOG_SYSCALLS, 1, "%s: failed to restore " PFX " to " PIFX "\n", + if (!remote_protect_virtual_memory_maybe64(phandle, PAGE_START64(env_ptr), PAGE_SIZE, + old_prot, &old_prot)) { + LOG(THREAD, LOG_SYSCALLS, 1, "%s: failed to restore 0x%I64x to " PIFX "\n", __FUNCTION__, env_ptr, old_prot); /* not a fatal error */ } @@ -1677,7 +1696,7 @@ add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) * is on the app heap so we can't. we could query and see if it's * a separate alloc. for now we just leave it be. */ - LOG(THREAD, LOG_SYSCALLS, 2, "%s: installed new env " PFX " at " PFX "\n", + LOG(THREAD, LOG_SYSCALLS, 2, "%s: installed new env " PFX " at 0x%I64x\n", __FUNCTION__, new_env, env_ptr); return true; @@ -1688,10 +1707,10 @@ add_dr_env_vars(dcontext_t *dcontext, HANDLE phandle, wchar_t **env_ptr) __FUNCTION__, new_env); } if (old_prot != PAGE_NOACCESS) { - if (!nt_remote_protect_virtual_memory(phandle, (byte *)PAGE_START(env_ptr), - PAGE_SIZE, old_prot, &old_prot)) { + if (!remote_protect_virtual_memory_maybe64(phandle, PAGE_START64(env_ptr), + PAGE_SIZE, old_prot, &old_prot)) { LOG(THREAD, LOG_SYSCALLS, 1, - "%s: failed to restore " PFX " to " PIFX "\n", __FUNCTION__, env_ptr, + "%s: failed to restore 0x%I64x to " PIFX "\n", __FUNCTION__, env_ptr, old_prot); } } @@ -1706,12 +1725,93 @@ static bool not_first_thread_in_new_process(HANDLE process_handle, HANDLE thread_handle) { char buf[MAX_CONTEXT_SIZE]; +#ifndef X64 + bool peb_is_32 = is_32bit_process(process_handle); + if (!peb_is_32) { + /* We'd need a CONTEXT64 define for parent32,child64.. + * We only need this for pre-Vista, so just xp64, so we bail. 
+ */ + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "32-bit parent's 64-bit child not supported on XP"); + } +#endif CONTEXT *cxt = nt_initialize_context(buf, CONTEXT_DR_STATE); if (NT_SUCCESS(nt_get_context(thread_handle, cxt))) return !is_first_thread_in_new_process(process_handle, cxt); return false; } +/* The caller should already have checked should_inject_into_process(). + * The child thread should be suspended. + * This routine directly invokes REPORT_FATAL_ERROR_AND_EXIT on errors. + */ +static void +propagate_options_via_env_vars(dcontext_t *dcontext, HANDLE process_handle, + HANDLE thread_handle) +{ + /* For -follow_children we propagate env vars (current + * DYNAMORIO_RUNUNDER, DYNAMORIO_OPTIONS, DYNAMORIO_AUTOINJECT, and + * DYNAMORIO_LOGDIR) to the child to support a simple run-all-children + * model without requiring setting up config files for children. + */ + uint64 peb; + bool peb_is_32 = is_32bit_process(process_handle); + size_t sz_read; + union { + uint64 ptr_64; + uint ptr_32; + } params_ptr; + if (process_handle == INVALID_HANDLE_VALUE) { + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Option propagation failed to acquire child handle"); + return; /* Not reached. */ + } + /* We have to write to the 32-bit env block for a 32-bit target process. */ +#ifdef X64 + if (peb_is_32) + peb = get_peb32(process_handle, thread_handle); + else +#endif + peb = get_peb_maybe64(process_handle); + if (peb == 0) { + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Option propagation failed to find PEB"); + close_handle(process_handle); /* Not reached. */ + return; /* Not reached. */ + } + if (!read_remote_memory_maybe64( + process_handle, + peb + + (peb_is_32 ? 
X86_PROCESS_PARAM_PEB_OFFSET : X64_PROCESS_PARAM_PEB_OFFSET), + &params_ptr, sizeof(params_ptr), &sz_read) || + sz_read != sizeof(params_ptr) || + (peb_is_32 ? (params_ptr.ptr_32 == 0) : (params_ptr.ptr_64 == 0))) { + REPORT_FATAL_ERROR_AND_EXIT( + FOLLOW_CHILD_FAILED, 3, get_application_name(), get_application_pid(), + "Option propagation failed to find ProcessParameters"); + } + uint64 params_base = peb_is_32 ? params_ptr.ptr_32 : params_ptr.ptr_64; + uint64 env_ptr; + if (IF_X64(!) peb_is_32) + env_ptr = params_base + offsetof(RTL_USER_PROCESS_PARAMETERS, Environment); + else { + env_ptr = params_base + + offsetof(IF_X64_ELSE(RTL_USER_PROCESS_PARAMETERS_32, + RTL_USER_PROCESS_PARAMETERS_64), + Environment); + } + LOG(THREAD, LOG_SYSCALLS, 2, + "inserting DR env vars to child &pp->Environment=0x%I64x\n", env_ptr); + if (!add_dr_env_vars(dcontext, process_handle, env_ptr, peb_is_32)) { + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Option propagation failed to add DR env vars"); + } +} + /* NtResumeThread */ static void presys_ResumeThread(dcontext_t *dcontext, reg_t *param_base) @@ -1721,11 +1821,11 @@ presys_ResumeThread(dcontext_t *dcontext, reg_t *param_base) process_id_t pid = thread_handle_to_pid(thread_handle, tid); LOG(THREAD, LOG_SYSCALLS | LOG_THREADS, IF_DGCDIAG_ELSE(1, 2), "syscall: NtResumeThread pid=%d tid=%d\n", pid, tid); - if (DYNAMO_OPTION(follow_children) && pid != POINTER_MAX && !is_pid_me(pid)) { - /* For -follow_children we propagate env vars (current - * DYNAMORIO_RUNUNDER, DYNAMORIO_OPTIONS, DYNAMORIO_AUTOINJECT, and - * DYNAMORIO_LOGDIR) to the child to support a simple run-all-children - * model without requiring setting up config files for children. + if (get_os_version() < WINDOWS_VERSION_VISTA && DYNAMO_OPTION(follow_children) && + pid != POINTER_MAX && !is_pid_me(pid)) { + /* For Vista+ we propagate in postsys_CreateUserProcess.
Waiting until here + * requires not_first_thread_in_new_process() which currently does not + * support cross-arch, so we only propagate here for pre-Vista. * * It's possible the app is explicitly resuming a thread in another * process and this has nothing to do with a new process: but our env @@ -1733,19 +1833,14 @@ presys_ResumeThread(dcontext_t *dcontext, reg_t *param_base) * * For pre-Vista, the initial thread is always suspended, and is either * resumed inside kernel32!CreateProcessW or by the app, so we should - * always see a resume. For Vista+ NtCreateUserProcess has suspend as a - * param and ideally we should replace the env pre-NtCreateUserProcess, - * but we have yet to get that to work, so for now we rely on - * Vista+ process creation going through the kernel32 routines, - * which do hardcode the thread as being suspended. + * always see a resume. */ - PEB *peb; HANDLE process_handle = process_handle_from_id(pid); - RTL_USER_PROCESS_PARAMETERS *pp = NULL; if (process_handle == INVALID_HANDLE_VALUE) { - LOG(THREAD, LOG_SYSCALLS, 1, - "WARNING: error acquiring process handle for pid=" PIFX "\n", pid); - return; + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Option propagation failed to acquire handle"); + return; /* Not reached. 
*/ } if (!should_inject_into_process(dcontext, process_handle, NULL, NULL)) { LOG(THREAD, LOG_SYSCALLS, 1, @@ -1757,30 +1852,7 @@ presys_ResumeThread(dcontext_t *dcontext, reg_t *param_base) "Not first thread so not setting DR env vars in pid=" PIFX "\n", pid); return; } - peb = get_peb(process_handle); - if (peb == NULL) { - LOG(THREAD, LOG_SYSCALLS, 1, - "WARNING: error acquiring PEB for pid=" PIFX "\n", pid); - close_handle(process_handle); - return; - } - if (!nt_read_virtual_memory(process_handle, &peb->ProcessParameters, &pp, - sizeof(pp), NULL) || - pp == NULL) { - LOG(THREAD, LOG_SYSCALLS, 1, - "WARNING: error acquiring ProcessParameters for pid=" PIFX "\n", pid); - close_handle(process_handle); - return; - } - LOG(THREAD, LOG_SYSCALLS, 2, - "inserting DR env vars to pid=" PIFX " &pp->Environment=" PFX "\n", pid, - &pp->Environment); - if (!add_dr_env_vars(dcontext, process_handle, (wchar_t **)&pp->Environment)) { - LOG(THREAD, LOG_SYSCALLS, 1, - "WARNING: unable to add DR env vars for child pid=" PIFX "\n", pid); - close_handle(process_handle); - return; - } + propagate_options_via_env_vars(dcontext, process_handle, thread_handle); close_handle(process_handle); } } @@ -3089,83 +3161,98 @@ postsys_CreateUserProcess(dcontext_t *dcontext, reg_t *param_base, bool success) }); /* Even though syscall succeeded we use safe_read to be sure */ - if (success && d_r_safe_read(proc_handle_ptr, sizeof(proc_handle), &proc_handle) && - d_r_safe_read(thread_handle_ptr, sizeof(thread_handle), &thread_handle)) { - ACCESS_MASK rights = nt_get_handle_access_rights(proc_handle); - - if (TESTALL(PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_VM_WRITE | - PROCESS_QUERY_INFORMATION, - rights)) { - if (create_suspended) { - char buf[MAX_CONTEXT_SIZE]; - CONTEXT *context; - CONTEXT *cxt = NULL; - int res; - /* Since this syscall is vista+ only, whether a wow64 process - * has no bearing (xref i#381) - */ - ASSERT(get_os_version() >= WINDOWS_VERSION_VISTA); - if 
(!DYNAMO_OPTION(early_inject)) { - /* If no early injection we have to do thread injection, and - * on Vista+ we don't see the - * NtCreateThread so we do it here. PR 215423. - */ - context = nt_initialize_context(buf, CONTEXT_DR_STATE); - res = nt_get_context(thread_handle, context); - if (NT_SUCCESS(res)) - cxt = context; - else { - /* FIXME i#49: cross-arch injection can end up here w/ - * STATUS_INVALID_PARAMETER. Need to use proper platform's - * CONTEXT for target. - */ - DODEBUG({ - if (is_wow64_process(NT_CURRENT_PROCESS) && - !is_wow64_process(proc_handle)) { - SYSLOG_INTERNAL_WARNING_ONCE( - "Injecting from 32-bit into 64-bit process is not " - "yet supported."); - } - }); - LOG(THREAD, LOG_SYSCALLS, 1, - "syscall: NtCreateUserProcess: WARNING: failed to get cxt of " - "thread (" PIFX ") so can't follow children on WOW64.\n", - res); - } - } - if ((cxt != NULL || DYNAMO_OPTION(early_inject)) && - maybe_inject_into_process(dcontext, proc_handle, cxt) && - cxt != NULL) { - /* injection routine is assuming doesn't have to install cxt */ - res = nt_set_context(thread_handle, cxt); - if (!NT_SUCCESS(res)) { - LOG(THREAD, LOG_SYSCALLS, 1, - "syscall: NtCreateUserProcess: WARNING: failed to set cxt of " - "thread (" PIFX ") so can't follow children on WOW64.\n", - res); - } + if (!success || !d_r_safe_read(proc_handle_ptr, sizeof(proc_handle), &proc_handle) || + !d_r_safe_read(thread_handle_ptr, sizeof(thread_handle), &thread_handle)) + return; + + /* Case 9173: guard against pid reuse */ + dcontext->aslr_context.last_child_padded = 0; + + ACCESS_MASK rights = nt_get_handle_access_rights(proc_handle); + if (!TESTALL(PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_VM_WRITE | + PROCESS_QUERY_INFORMATION, + rights)) { + LOG(THREAD, LOG_SYSCALLS, 1, + "syscall: NtCreateUserProcess unable to get sufficient rights" + " to follow children\n"); + /* This happens for Vista protected processes (drm). 
xref 8485 */ + /* FIXME - could check against executable file name from + * thread_stuff to see if this was a process we're configured to + * protect. */ + /* XXX: Should we make this a fatal release build error? */ + SYSLOG_INTERNAL_WARNING("Insufficient permissions to examine " + "child process\n"); + } + if (!create_suspended) { + /* For Vista+ NtCreateUserProcess has suspend as a + * param and ideally we should replace the env pre-NtCreateUserProcess, + * but we have yet to get that to work, so for now we rely on + * Vista+ process creation going through the kernel32 routines, + * which do hardcode the thread as being suspended. + * TODO: We should change the parameter to ensure the thread is suspended. + */ + LOG(THREAD, LOG_SYSCALLS, 1, + "syscall: NtCreateUserProcess first thread not suspended " + "can't safely follow children.\n"); + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Child thread not created suspended"); + } + char buf[MAX_CONTEXT_SIZE]; + CONTEXT *context; + CONTEXT *cxt = NULL; + int res; + /* Since this syscall is vista+ only, whether a wow64 process + * has no bearing (xref i#381) + */ + ASSERT(get_os_version() >= WINDOWS_VERSION_VISTA); + if (!DYNAMO_OPTION(early_inject)) { + /* If no early injection we have to do thread injection, and + * on Vista+ we don't see the NtCreateThread so we do it here. PR 215423. + */ + context = nt_initialize_context(buf, CONTEXT_DR_STATE); + res = nt_get_context(thread_handle, context); + if (NT_SUCCESS(res)) + cxt = context; + else { + /* FIXME i#49: cross-arch injection can end up here w/ + * STATUS_INVALID_PARAMETER. Need to use proper platform's + * CONTEXT for target. 
+ */ + DODEBUG({ + if (is_wow64_process(NT_CURRENT_PROCESS) && + !is_wow64_process(proc_handle)) { + SYSLOG_INTERNAL_WARNING_ONCE( + "Injecting from 32-bit into 64-bit " + "is not supported for -no_early_inject."); } - } else { - LOG(THREAD, LOG_SYSCALLS, 1, - "syscall: NtCreateUserProcess first thread not suspended " - "can't safely follow children.\n"); - ASSERT_NOT_IMPLEMENTED(create_suspended); - /* FIXME - NYI - should change in pre and resume the thread - * after we inject. */ - } - } else { + }); + LOG(THREAD, LOG_SYSCALLS, 1, + "syscall: NtCreateUserProcess: WARNING: failed to get cxt of " + "thread (" PIFX ") so can't follow children on WOW64.\n", + res); + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Failed to get context of child thread"); + } + } + ASSERT(cxt != NULL || DYNAMO_OPTION(early_inject)); /* Else, exited above. */ + /* Do the actual injection. */ + if (!maybe_inject_into_process(dcontext, proc_handle, cxt)) + return; + propagate_options_via_env_vars(dcontext, proc_handle, thread_handle); + if (cxt != NULL) { + /* injection routine is assuming doesn't have to install cxt */ + res = nt_set_context(thread_handle, cxt); + if (!NT_SUCCESS(res)) { LOG(THREAD, LOG_SYSCALLS, 1, - "syscall: NtCreateUserProcess unable to get sufficient rights" - " to follow children\n"); - /* This happens for Vista protected processes (drm). xref 8485 */ - /* FIXME - could check against executable file name from - * thread_stuff to see if this was a process we're configured to - * protect. 
*/ - SYSLOG_INTERNAL_WARNING("Insufficient permissions to examine " - "child process\n"); + "syscall: NtCreateUserProcess: WARNING: failed to set cxt of " + "thread (" PIFX ") so can't follow children on WOW64.\n", + res); + REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), + get_application_pid(), + "Failed to set context of child thread"); } - /* Case 9173: guard against pid reuse */ - dcontext->aslr_context.last_child_padded = 0; } } diff --git a/suite/runsuite_common_pre.cmake b/suite/runsuite_common_pre.cmake index d7a68086a0b..bac0a44a4ae 100755 --- a/suite/runsuite_common_pre.cmake +++ b/suite/runsuite_common_pre.cmake @@ -1,5 +1,5 @@ # ********************************************************** -# Copyright (c) 2011-2020 Google, Inc. All rights reserved. +# Copyright (c) 2011-2021 Google, Inc. All rights reserved. # Copyright (c) 2009-2010 VMware, Inc. All rights reserved. # ********************************************************** @@ -147,6 +147,123 @@ foreach (arg ${CTEST_SCRIPT_ARG}) endif () endforeach (arg) +# Returns a list of environment variable names to set in ${env_names}. +# For each "name" in the list, sets a variable "name_env_value" in the caller's +# scope containing the value to be set. +# +# We'd like to export this via utils_exposed.cmake but it is difficult to +# import it here from somewhere else. +# Our manual INCLUDEFILE expansion would require all users to point to a +# generated file somewhere, which requires updating all users including +# Dr. Memory which today points directly at the embedded submodule sources. +# Some of these uses have no build dir where a generated file could live. +# Using include() is difficult for a file that is itself included, because +# cmake won't search our same directory, and CTEST_SCRIPT_DIRECTORY is the +# includer's directory, so we can't easily have side-by-side files. 
+# For now we keep it here, and have the current only other use in +# tests/CMakeLists.txt include() this file. +function (_DR_set_VS_bitwidth_env_vars is64 env_names) + # Convert env vars to run proper compiler. + # Note that this is fragile and won't work with non-standard + # directory layouts: we assume standard VS or SDK. + # XXX: would be nice to have case-insensitive regex flag! + # For now hardcoding VC, Bin, amd64 + set(must_change_path OFF) + if (is64) + list(APPEND names_list "ASM") + set(ASM_env_value "ml64" PARENT_SCOPE) + if (NOT "$ENV{LIB}" MATCHES "[Aa][Mm][Dd]64" AND + NOT "$ENV{LIB}" MATCHES "[Xx]64") + set(must_change_path ON) + # Note that we can't set ENV{PATH} as the output var of the replace: + # it has to be its own set(). + # + # VS2017 has bin/HostX{86,64}/x{86,64}/ + string(REGEX REPLACE "((^|;)[^;]*)HostX86([/\\\\])x86" "\\1HostX64\\3x64" + newpath "$ENV{PATH}") + # i#1421: VS2013 needs base VC/bin on the path (for cross-compiler + # used by cmake) so we duplicate and put amd64 first. Older VS needs + # Common7/IDE instead which should already be elsewhere on path. + string(REGEX REPLACE "((^|;)[^;]*)VC([/\\\\])([Bb][Ii][Nn])" + "\\1VC\\3\\4\\3amd64;\\1VC\\3\\4" + newpath "${newpath}") + # VS2008's SDKs/Windows/v{6.0A,7.0} uses "x64" instead of "amd64" + string(REGEX REPLACE "([/\\\\]v[^/\\\\]*)([/\\\\])([Bb][Ii][Nn])" + "\\1\\2\\3\\2x64" + newpath "${newpath}") + if (arg_verbose) + message("Env setup: setting PATH to ${newpath}") + endif () + # VS2017 does not append so we replace first. + string(REGEX REPLACE "([/\\\\])x86" "\\1x64" newlib "$ENV{lib}") + # Now try to support pre-VS2017. 
+ string(REGEX REPLACE "([/\\\\])([Ll][Ii][Bb])([^/\\\\])" "\\1\\2\\1amd64\\3" + newlib "${newlib}") + # VS2008's SDKs/Windows/v{6.0A,7.0} uses "x64" instead of "amd64": grrr + string(REGEX REPLACE + "([/\\\\]v[^/\\\\]*[/\\\\][Ll][Ii][Bb][/\\\\])[Aa][Mm][Dd]64" + "\\1x64" + newlib "${newlib}") + # Win8 SDK uses um/x86 and um/x64 after "Lib/win{8,v6.3}/" + string(REGEX REPLACE + "([Ll][Ii][Bb])[/\\\\]amd64([/\\\\][Ww][Ii][Nn][v0-9.]+[/\\\\]um[/\\\\])x86" + "\\1\\2x64" newlib "${newlib}") + if (arg_verbose) + message("Env setup: setting LIB to ${newlib}") + endif () + string(REGEX REPLACE "([/\\\\])([Ll][Ii][Bb][/\\\\])[Xx]86" "\\1\\2" + newlibpath "$ENV{LIBPATH}") + string(REGEX REPLACE "([/\\\\])([Ll][Ii][Bb])" "\\1\\2\\1amd64" + newlibpath "${newlibpath}") + if (arg_verbose) + message("Env setup: setting LIBPATH to ${newlibpath}") + endif () + endif () + else (is64) + list(APPEND ${env_names} "ASM") + set(ASM_env_value "ml" PARENT_SCOPE) + if ("$ENV{LIB}" MATCHES "[Aa][Mm][Dd]64" OR + "$ENV{LIB}" MATCHES "[Xx]64") + set(must_change_path ON) + # VS2017 has bin/HostX{86,64}/x{86,64}/ + string(REGEX REPLACE "((^|;)[^;]*)HostX64([/\\\\])x64" "\\1HostX86\\3x86" + newpath "$ENV{PATH}") + # Remove the duplicate we added (see i#1421 comment above). 
+ string(REGEX REPLACE "((^|;)[^;]*)VC[/\\\\][Bb][Ii][Nn][/\\\\][Aa][Mm][Dd]64" + "" newpath "${newpath}") + if (arg_verbose) + message("Env setup: setting PATH to ${newpath}") + endif () + string(REGEX REPLACE "([Ll][Ii][Bb])[/\\\\][Aa][Mm][Dd]64" "\\1" + newlib "$ENV{LIB}") + string(REGEX REPLACE "([Ll][Ii][Bb][/\\\\])[Xx]64" "\\1x86" + newlib "${newlib}") + # Win8 SDK uses um/x86 and um/x64 + string(REGEX REPLACE "([/\\\\]um[/\\\\])x64" "\\1x86" newlib "${newlib}") + string(REGEX REPLACE "([/\\\\]ucrt[/\\\\])x64" "\\1x86" newlib "${newlib}") + if (arg_verbose) + message("Env setup: setting LIB to ${newlib}") + endif () + string(REGEX REPLACE "([Ll][Ii][Bb])[/\\\\][Aa][Mm][Dd]64" "\\1" + newlibpath "$ENV{LIBPATH}") + string(REGEX REPLACE "([Ll][Ii][Bb][/\\\\])[Xx]64" "\\1x86" + newlibpath "${newlibpath}") + if (arg_verbose) + message("Env setup: setting LIBPATH to ${newlibpath}") + endif () + endif () + endif (is64) + if (must_change_path) + list(APPEND names_list "PATH") + set(PATH_env_value "${newpath}" PARENT_SCOPE) + list(APPEND names_list "LIB") + set(LIB_env_value "${newlib}" PARENT_SCOPE) + list(APPEND names_list "LIBPATH") + set(LIBPATH_env_value "${newlibpath}" PARENT_SCOPE) + set(${env_names} ${names_list} PARENT_SCOPE) + endif () +endfunction () + # allow setting the base cache variables via an include file set(base_cache "") if (arg_include) @@ -367,7 +484,7 @@ else () message(FATAL_ERROR "Cannot determine Visual Studio version") endif () endif () - message("Using ${vs_generator} generators") + message(STATUS "Using ${vs_generator} generators") if (arg_use_msbuild) find_program(MSBUILD_PROGRAM msbuild) if (MSBUILD_PROGRAM) @@ -483,91 +600,10 @@ function(testbuild_ex name is64 initial_cache test_only_in_long set(ENV{CC} "cl") set(ENV{CXX} "cl") # Convert env vars to run proper compiler. - # Note that this is fragile and won't work with non-standard - # directory layouts: we assume standard VS or SDK. 
- # XXX: would be nice to have case-insensitive regex flag! - # For now hardcoding VC, Bin, amd64 - if (is64) - set(ENV{ASM} "ml64") - if (NOT "$ENV{LIB}" MATCHES "[Aa][Mm][Dd]64" AND - NOT "$ENV{LIB}" MATCHES "[Xx]64") - # Note that we can't set ENV{PATH} as the output var of the replace: - # it has to be its own set(). - # - # VS2017 has bin/HostX{86,64}/x{86,64}/ - string(REGEX REPLACE "((^|;)[^;]*)HostX86([/\\\\])x86" "\\1HostX64\\3x64" - newpath "$ENV{PATH}") - # i#1421: VS2013 needs base VC/bin on the path (for cross-compiler - # used by cmake) so we duplicate and put amd64 first. Older VS needs - # Common7/IDE instead which should already be elsewhere on path. - string(REGEX REPLACE "((^|;)[^;]*)VC([/\\\\])([Bb][Ii][Nn])" - "\\1VC\\3\\4\\3amd64;\\1VC\\3\\4" - newpath "${newpath}") - # VS2008's SDKs/Windows/v{6.0A,7.0} uses "x64" instead of "amd64" - string(REGEX REPLACE "([/\\\\]v[^/\\\\]*)([/\\\\])([Bb][Ii][Nn])" - "\\1\\2\\3\\2x64" - newpath "${newpath}") - if (arg_verbose) - message("Env setup: setting PATH to ${newpath}") - endif () - set(ENV{PATH} "${newpath}") - # VS2017 does not append so we replace first. - string(REGEX REPLACE "([/\\\\])x86" "\\1x64" newlib "$ENV{lib}") - # Now try to support pre-VS2017. 
- string(REGEX REPLACE "([/\\\\])([Ll][Ii][Bb])([^/\\\\])" "\\1\\2\\1amd64\\3" - newlib "${newlib}") - # VS2008's SDKs/Windows/v{6.0A,7.0} uses "x64" instead of "amd64": grrr - string(REGEX REPLACE - "([/\\\\]v[^/\\\\]*[/\\\\][Ll][Ii][Bb][/\\\\])[Aa][Mm][Dd]64" - "\\1x64" - newlib "${newlib}") - # Win8 SDK uses um/x86 and um/x64 after "Lib/win{8,v6.3}/" - string(REGEX REPLACE - "([Ll][Ii][Bb])[/\\\\]amd64([/\\\\][Ww][Ii][Nn][v0-9.]+[/\\\\]um[/\\\\])x86" - "\\1\\2x64" newlib "${newlib}") - if (arg_verbose) - message("Env setup: setting LIB to ${newlib}") - endif () - set(ENV{LIB} "${newlib}") - string(REGEX REPLACE "([/\\\\])([Ll][Ii][Bb])" "\\1\\2\\1amd64" - newlibpath "$ENV{LIBPATH}") - if (arg_verbose) - message("Env setup: setting LIBPATH to ${newlibpath}") - endif () - set(ENV{LIBPATH} "${newlibpath}") - endif () - else (is64) - set(ENV{ASM} "ml") - if ("$ENV{LIB}" MATCHES "[Aa][Mm][Dd]64" OR - "$ENV{LIB}" MATCHES "[Xx]64") - # VS2017 has bin/HostX{86,64}/x{86,64}/ - string(REGEX REPLACE "((^|;)[^;]*)HostX64([/\\\\])x64" "\\1HostX86\\3x86" - newpath "$ENV{PATH}") - # Remove the duplicate we added (see i#1421 comment above). 
- string(REGEX REPLACE "((^|;)[^;]*)VC[/\\\\][Bb][Ii][Nn][/\\\\][Aa][Mm][Dd]64" - "" newpath "{newpath}") - if (arg_verbose) - message("Env setup: setting PATH to ${newpath}") - endif () - set(ENV{PATH} "${newpath}") - string(REGEX REPLACE "([Ll][Ii][Bb])[/\\\\][Aa][Mm][Dd]64" "\\1" - newlib "$ENV{LIB}") - string(REGEX REPLACE "([Ll][Ii][Bb])[/\\\\][Xx]64" "\\1" - newlib "${newlib}") - # Win8 SDK uses um/x86 and um/x64 - string(REGEX REPLACE "([/\\\\]um[/\\\\])x64" "\\1x86" newlib "${newlib}") - if (arg_verbose) - message("Env setup: setting LIB to ${newlib}") - endif () - set(ENV{LIB} "${newlib}") - string(REGEX REPLACE "([Ll][Ii][Bb])[/\\\\][Aa][Mm][Dd]64" "\\1" - newlibpath "$ENV{LIBPATH}") - if (arg_verbose) - message("Env setup: setting LIBPATH to ${newlibpath}") - endif () - set(ENV{LIBPATH} "${newlibpath}") - endif () - endif (is64) + _DR_set_VS_bitwidth_env_vars(${is64} env_names) + foreach (env ${env_names}) + set(ENV{${env}} "${${env}_env_value}") + endforeach () else (WIN32) if (ARCH_IS_X86) if (is64) diff --git a/suite/tests/CMakeLists.txt b/suite/tests/CMakeLists.txt index 76f43480649..2bdd7e3eccb 100644 --- a/suite/tests/CMakeLists.txt +++ b/suite/tests/CMakeLists.txt @@ -1,5 +1,5 @@ # ********************************************************** -# Copyright (c) 2010-2020 Google, Inc. All rights reserved. +# Copyright (c) 2010-2021 Google, Inc. All rights reserved. # Copyright (c) 2009-2010 VMware, Inc. All rights reserved. # Copyright (c) 2016 ARM Limited. All rights reserved. # ********************************************************** @@ -50,6 +50,7 @@ cmake_minimum_required(VERSION 3.7) include(../../make/policies.cmake NO_POLICY_SCOPE) +include(../runsuite_common_pre.cmake) # For _DR_set_VS_bitwidth_env_vars. option(TEST_LONG "run long set of tests") option(SKIP_FLAKY_TESTS "do not run tests named *_FLAKY") @@ -4374,6 +4375,78 @@ else (UNIX) tobuild_csharp(win32.dotnet win32/dotnet.cs) + # Cross-arch injection tests. 
+ if (X86) + set(xarch_dir ${CMAKE_CURRENT_BINARY_DIR}/xarch) + set(test_bindir suite/tests/bin) + file(MAKE_DIRECTORY ${xarch_dir}) + # Unfortunately, while CMake 3.13+ supports create_symlink on Windows, that + # only works if the user has the right privileges, so we can't set up + # lib dir symlinks like we do for unix cross-arch build-and-test. + # We have to make a copy in order to have parallel dirs with the two bitwidths. + file(MAKE_DIRECTORY "${xarch_dir}/${INSTALL_LIB}") + set(xarch_copy_stamp "${CMAKE_CURRENT_BINARY_DIR}/xarch_copy.stamp") + add_custom_command(OUTPUT "${xarch_copy_stamp}" + DEPENDS dynamorio + COMMAND ${CMAKE_COMMAND} ARGS + -E touch "${xarch_copy_stamp}" + COMMAND ${CMAKE_COMMAND} ARGS + -E copy "${MAIN_LIBRARY_OUTPUT_DIRECTORY}/dynamorio.dll" + "${xarch_dir}/${INSTALL_LIB}/dynamorio.dll" + VERBATIM) + add_custom_target(xarch_copy DEPENDS "${xarch_copy_stamp}") + add_dependencies(drcopy xarch_copy) + # We use a client that takes some options and prints something at + # init and at exit to test cross-arch injection with clients. + get_target_path_for_execution(cur_client_path client.large_options.dll + "${location_suffix}") + set(xarch_client_path ${xarch_dir}/${test_bindir}/client.large_options.dll.dll) + get_target_path_for_execution(drrun_path drrun "${location_suffix}") + if (X64) + set(client32_path ${xarch_client_path}) + set(client64_path ${cur_client_path}) + set(other_is64 OFF) + else () + set(client32_path ${cur_client_path}) + set(client64_path ${xarch_client_path}) + set(other_is64 ON) + endif () + if (DEBUG) + set(config_type -DDEBUG=ON) + else () + set(config_type "") + endif () + # I abandoned --build-generator-platform (3.13+ only; VS gen only) or explicitly + # setting the full generator (fails w/ Ninja) and went with the env var scheme we + # use in our test suite. + _DR_set_VS_bitwidth_env_vars(${other_is64} env_names) + foreach (env ${env_names}) + # Escape internal ; since this goes into a CMake "var=value;..." list. 
+ string(REPLACE ";" "\\;" value "${${env}_env_value}") + list(APPEND test_env_pairs "${env}=${value}") + endforeach () + add_test(win32.xarch ${CMAKE_CTEST_COMMAND} -V + --build-and-test ${PROJECT_SOURCE_DIR} ${xarch_dir} + --build-generator ${CMAKE_GENERATOR} + --build-target dynamorio + --build-target common.eflags + --build-target client.large_options.dll + --build-noclean # Else it tries to clean between each target! + --build-makeprogram ${CMAKE_MAKE_PROGRAM} + --build-options ${config_type} -DBUILD_TESTS=ON -DBUILD_DOCS=OFF + -DBUILD_SAMPLES=OFF -DBUILD_EXT=OFF -DBUILD_CLIENTS=OFF + --test-command ${drrun_path} -dr_home ${xarch_dir} ${dr_test_ops} + -c32 ${client32_path} -paramA foo -paramB bar -- + -c64 ${client64_path} -paramA foo -paramB bar -- + ${MAIN_RUNTIME_OUTPUT_DIRECTORY}/create_process.exe + ${xarch_dir}/${test_bindir}/common.eflags.exe) + file(READ win32/xarch.templatex expect) + set_tests_properties(win32.xarch PROPERTIES ENVIRONMENT + "${test_env_pairs}" + # We start with .* for all the config and build stuff. + PASS_REGULAR_EXPRESSION ".*${expect}[ \n]*$") + endif (X86) + # Cross-arch mixedmode/x86_to_x64 test: can only be done via a suite of tests that # build both 32-bit and 64-bit. We have 32-bit just build, and # 64-bit then builds and runs both directions. 
diff --git a/suite/tests/win32/xarch.templatex b/suite/tests/win32/xarch.templatex new file mode 100644 index 00000000000..d52bc6b986e --- /dev/null +++ b/suite/tests/win32/xarch.templatex @@ -0,0 +1,20 @@ +large_options passed: -paramA foo -paramB bar +creating subprocess ".*xarch/suite/tests/bin/common.eflags.exe" +large_options passed: -paramA foo -paramB bar +OK 1 CF +OK 0 CF +OK 1 PF +OK 0 PF +OK 1 AF +OK 0 AF +OK 1 ZF +OK 0 ZF +OK 1 SF +OK 0 SF +OK 1 DF +OK 0 DF +OK 1 OF +OK 0 OF +large_options exiting +parent done +large_options exiting From 464b894a6b997c8a3cdb3aa78b190ec7b203e8d9 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Mon, 4 Jan 2021 14:09:54 -0500 Subject: [PATCH 2/7] Fix bug where ASM env var was not set for toolchain when not switching bitwidths --- suite/runsuite_common_pre.cmake | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/suite/runsuite_common_pre.cmake b/suite/runsuite_common_pre.cmake index bac0a44a4ae..092734744a8 100755 --- a/suite/runsuite_common_pre.cmake +++ b/suite/runsuite_common_pre.cmake @@ -260,8 +260,8 @@ function (_DR_set_VS_bitwidth_env_vars is64 env_names) set(LIB_env_value "${newlib}" PARENT_SCOPE) list(APPEND names_list "LIBPATH") set(LIBPATH_env_value "${newlibpath}" PARENT_SCOPE) - set(${env_names} ${names_list} PARENT_SCOPE) endif () + set(${env_names} ${names_list} PARENT_SCOPE) endfunction () # allow setting the base cache variables via an include file From 813d70e521bb9a10514aadad8190ded40ea779e6 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Mon, 4 Jan 2021 15:44:28 -0500 Subject: [PATCH 3/7] Add missing periods. 
--- core/win32/ntdll.h | 4 ++-- core/win32/ntdll_shared.c | 24 ++++++++++++------------ 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/core/win32/ntdll.h b/core/win32/ntdll.h index 106462b17da..321ce0ca120 100644 --- a/core/win32/ntdll.h +++ b/core/win32/ntdll.h @@ -97,7 +97,7 @@ #ifdef X64 typedef struct _UNICODE_STRING_32 { - /* Length field is size in bytes not counting final 0 */ + /* Length field is size in bytes not counting final 0. */ USHORT Length; USHORT MaximumLength; uint Buffer; @@ -111,7 +111,7 @@ typedef struct _RTL_USER_PROCESS_PARAMETERS_32 { } RTL_USER_PROCESS_PARAMETERS_32, *PRTL_USER_PROCESS_PARAMETERS_32; #else typedef struct ALIGN_VAR(8) _UNICODE_STRING_64 { - /* Length field is size in bytes not counting final 0 */ + /* Length field is size in bytes not counting final 0. */ USHORT Length; USHORT MaximumLength; int padding; diff --git a/core/win32/ntdll_shared.c b/core/win32/ntdll_shared.c index 9ad2db4267a..5175b0b9878 100644 --- a/core/win32/ntdll_shared.c +++ b/core/win32/ntdll_shared.c @@ -84,19 +84,19 @@ nt_write_virtual_memory(HANDLE process, void *base, const void *buffer, #ifndef X64 /* Around most of the rest of the file. */ # if !defined(NOT_DYNAMORIO_CORE) && !defined(NOT_DYNAMORIO_CORE_PROPER) -# define UNPROT_IF_INIT() \ - do { \ - /* The first call may not be during init so we have to unprot */ \ - if (dynamo_initialized) { \ - SELF_UNPROTECT_DATASEC(DATASEC_RARELY_PROT); \ - } \ +# define UNPROT_IF_INIT() \ + do { \ + /* The first call may not be during init so we have to unprot. */ \ + if (dynamo_initialized) { \ + SELF_UNPROTECT_DATASEC(DATASEC_RARELY_PROT); \ + } \ } while (0) -# define PROT_IF_INIT() \ - do { \ - /* The first call may not be during init so we have to unprot */ \ - if (dynamo_initialized) { \ - SELF_PROTECT_DATASEC(DATASEC_RARELY_PROT); \ - } \ +# define PROT_IF_INIT() \ + do { \ + /* The first call may not be during init so we have to unprot. 
*/ \ + if (dynamo_initialized) { \ + SELF_PROTECT_DATASEC(DATASEC_RARELY_PROT); \ + } \ } while (0) # else # define PROT_IF_INIT() /* Nothing. */ From 05714a703266bf62bba0a5d2da79678d1af20f73 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Mon, 4 Jan 2021 16:12:16 -0500 Subject: [PATCH 4/7] Add method for requesting that drrun use the old thread injection via -late and dr_inject_use_late_injection() --- api/docs/release.dox | 6 +++++- core/lib/dr_inject.h | 16 +++++++++++++++- core/win32/injector.c | 25 ++++++++++++++++++++++--- tools/drdeploy.c | 31 ++++++++++++++++++++++--------- 4 files changed, 64 insertions(+), 14 deletions(-) diff --git a/api/docs/release.dox b/api/docs/release.dox index 9a787ebba08..bcd65f9fa05 100644 --- a/api/docs/release.dox +++ b/api/docs/release.dox @@ -1,5 +1,5 @@ /* ****************************************************************************** - * Copyright (c) 2010-2020 Google, Inc. All rights reserved. + * Copyright (c) 2010-2021 Google, Inc. All rights reserved. * Copyright (c) 2011 Massachusetts Institute of Technology All rights reserved. * Copyright (c) 2008-2010 VMware, Inc. All rights reserved. * ******************************************************************************/ @@ -159,6 +159,10 @@ compatibility changes: Further non-compatibility-affecting changes include: + - On x86 Windows, different-bitwidth child processes are now followed into. + The default injection method has also changed to a new method relying on + an image entry hook in some cases. The old behavior can be requested by + passing "-late" to drrun or calling dr_inject_use_late_injection(). - Added drmgr_register_opcode_instrumentation_event() and drmgr_unregister_opcode_instrumentation_event() so that drmgr supports opcode event instrumentation. 
diff --git a/core/lib/dr_inject.h b/core/lib/dr_inject.h index cd1a784cbb0..597d8730757 100644 --- a/core/lib/dr_inject.h +++ b/core/lib/dr_inject.h @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2013-2015 Google, Inc. All rights reserved. + * Copyright (c) 2013-2021 Google, Inc. All rights reserved. * Copyright (c) 2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -180,6 +180,20 @@ dr_inject_prepare_new_process_group(void *data); #endif /* UNIX */ +#ifdef WINDOWS +DR_EXPORT +/** + * Specifies that late injection should be used for the process created by + * dr_inject_process_create(). + * + * \param[in] data The pointer returned by dr_inject_process_create() + * + * \return Whether successful. + */ +bool +dr_inject_use_late_injection(void *data); +#endif + DR_EXPORT /** * Injects DynamoRIO into a process created by dr_inject_process_create(), or diff --git a/core/win32/injector.c b/core/win32/injector.c index c4fc63bde26..2b391043173 100644 --- a/core/win32/injector.c +++ b/core/win32/injector.c @@ -230,6 +230,7 @@ tchar_to_char(const TCHAR *wstr, OUT char *buf, size_t buflen /*# elements*/) typedef struct _dr_inject_info_t { PROCESS_INFORMATION pi; bool using_debugger_injection; + bool using_thread_injection; TCHAR wimage_name[MAXIMUM_PATH]; /* We need something to point at for dr_inject_get_image_name so we just * keep a utf8 buffer as well. @@ -817,10 +818,10 @@ dr_inject_process_create(const char *app_name, const char **argv, void **data OU * if we have our own version of CreateProcess that doesn't check the * debugger key */ info->using_debugger_injection = using_debugger_key_injection(info->wimage_name); - if (info->using_debugger_injection) { unset_debugger_key_injection(); } + info->using_thread_injection = false; /* Must specify TRUE for bInheritHandles so child inherits stdin! 
*/ res = CreateProcess(wapp_name, wapp_cmdline, NULL, NULL, TRUE, @@ -842,11 +843,21 @@ dr_inject_process_create(const char *app_name, const char **argv, void **data OU return errcode; } +DYNAMORIO_EXPORT +bool +dr_inject_use_late_injection(void *data) +{ + dr_inject_info_t *info = (dr_inject_info_t *)data; + info->using_thread_injection = true; + return true; +} + DYNAMORIO_EXPORT bool dr_inject_process_inject(void *data, bool force_injection, const char *library_path) { dr_inject_info_t *info = (dr_inject_info_t *)data; + CONTEXT cxt; bool inject = true; char library_path_buf[MAXIMUM_PATH]; @@ -920,8 +931,16 @@ dr_inject_process_inject(void *data, bool force_injection, const char *library_p * But it's non-trivial to gather the relevant addresses. * i#234/PR 204587 is a prereq? */ - if (!inject_into_new_process(info->pi.hProcess, (char *)library_path, true /*map*/, - INJECT_LOCATION_ImageEntry, NULL)) { + bool res = false; + /* We provide a way to fall back on thread injection. */ + if (info->using_thread_injection) { + res = inject_into_thread(info->pi.hProcess, &cxt, info->pi.hThread, + (char *)library_path); + } else { + res = inject_into_new_process(info->pi.hProcess, (char *)library_path, + true /*map*/, INJECT_LOCATION_ImageEntry, NULL); + } + if (!res) { close_handle(info->pi.hProcess); TerminateProcess(info->pi.hProcess, 0); return false; diff --git a/tools/drdeploy.c b/tools/drdeploy.c index c97725898ed..310189ad34e 100644 --- a/tools/drdeploy.c +++ b/tools/drdeploy.c @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2008-2010 VMware, Inc. All rights reserved. * **********************************************************/ @@ -291,10 +291,12 @@ const char *options_list_str = " -static Do not inject under the assumption that the application\n" " is statically linked with DynamoRIO. 
Instead, trigger\n" " automated takeover.\n" +# ifndef MACOS /* XXX i#1285: private loader NYI on MacOS */ + " -late Requests late injection.\n" +# endif # ifdef UNIX /* FIXME i#725: Windows attach NYI */ # ifndef MACOS /* XXX i#1285: private loader NYI on MacOS */ " -early Requests early injection (the default).\n" - " -late Requests late injection.\n" # endif " -attach Attach to the process with the given pid. Pass 0\n" " for pid to launch and inject into a new process.\n" @@ -1083,6 +1085,7 @@ _tmain(int argc, TCHAR *targv[]) #ifdef WINDOWS /* FIXME i#840: Implement nudges on Linux. */ bool nudge_all = false; + bool use_late_injection = false; process_id_t nudge_pid = 0; client_id_t nudge_id = 0; uint64 nudge_arg = 0; @@ -1243,6 +1246,19 @@ _tmain(int argc, TCHAR *targv[]) limit = -1; continue; } +# ifndef MACOS /* XXX i#1285: private loader NYI on MacOS */ + else if (strcmp(argv[i], "-late") == 0) { + /* Appending -no_early_inject to extra_ops communicates our intentions + * to drinjectlib on UNIX, as well as the core for all platforms. + */ + add_extra_option(extra_ops, BUFFER_SIZE_ELEMENTS(extra_ops), &extra_ops_sofar, + "-no_early_inject"); +# ifdef WINDOWS + use_late_injection = true; +# endif + continue; + } +# endif # ifdef UNIX else if (strcmp(argv[i], "-use_ptrace") == 0) { /* Undocumented option for using ptrace on a fresh process. */ @@ -1263,13 +1279,6 @@ _tmain(int argc, TCHAR *targv[]) else if (strcmp(argv[i], "-early") == 0) { /* Now the default: left here just for back-compat */ continue; - } else if (strcmp(argv[i], "-late") == 0) { - /* Appending -no_early_inject to extra_ops communicates our intentions - * to drinjectlib. 
- */ - add_extra_option(extra_ops, BUFFER_SIZE_ELEMENTS(extra_ops), &extra_ops_sofar, - "-no_early_inject"); - continue; } # endif # endif /* UNIX */ @@ -1743,6 +1752,10 @@ _tmain(int argc, TCHAR *targv[]) info("created child with pid " PIDFMT " for %s", dr_inject_get_process_id(inject_data), app_name); } +# ifdef WINDOWS + if (use_late_injection) + dr_inject_use_late_injection(inject_data); +# endif # ifdef UNIX if (limit != 0 && kill_group) { /* Move the child to its own process group. */ From 800326ca93520e657243d222e5733cfe4b2ff9f5 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Mon, 4 Jan 2021 23:19:53 -0500 Subject: [PATCH 5/7] Switch from image entry to thread start, to handle .NET and other apps that never reach the image entry. This actually better matches the prior default thread injection in any case. We obtain the thread start from the context for same-bitwidth children, from ntdll!RtlUserThreadStart in the remote ntdll if not, and if both fail, we fall back to image entry. The thread start has xax as live, so we add a save into an earliest_args_t slot in the gencode, which the init code uses to restore the app value. --- core/arch/x86/x86.asm | 12 +++++--- core/arch/x86/x86_asm_defines.asm | 3 +- core/optionsx.h | 2 +- core/os_shared.h | 7 ++++- core/win32/inject.c | 50 ++++++++++++++++++++++++++++--- core/win32/injector.c | 9 +++--- core/win32/os.c | 12 ++++---- core/win32/os_private.h | 10 ++++--- core/win32/syscall.c | 10 +++---- 9 files changed, 86 insertions(+), 29 deletions(-) diff --git a/core/arch/x86/x86.asm b/core/arch/x86/x86.asm index ddece6b8e5a..615a9467ae2 100644 --- a/core/arch/x86/x86.asm +++ b/core/arch/x86/x86.asm @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2020 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2001-2010 VMware, Inc. All rights reserved. 
* ********************************************************** */ @@ -2695,11 +2695,15 @@ dynamorio_earliest_init_repeatme: cmp ebx, 0 jg dynamorio_earliest_init_repeat_outer # endif - /* args are pointed at by xax */ + /* Load earliest_args_t.app_xax, written by our gencode. */ + mov REG_XCX, PTRSZ [REG_XAX] + /* Store into xax slot on stack. */ + mov PTRSZ [PUSHGPR_XAX_OFFS + REG_XSP], REG_XCX + /* Args are pointed at by xax. */ CALLC1(GLOBAL_REF(dynamorio_earliest_init_takeover_C), REG_XAX) - /* we will either be under DR control or running natively at this point */ + /* We will either be under DR control or running natively at this point. */ - /* restore */ + /* Restore. */ POPGPR ret END_FUNC(dynamorio_earliest_init_takeover) diff --git a/core/arch/x86/x86_asm_defines.asm b/core/arch/x86/x86_asm_defines.asm index 72341d8e615..730cefdd00b 100644 --- a/core/arch/x86/x86_asm_defines.asm +++ b/core/arch/x86/x86_asm_defines.asm @@ -1,5 +1,5 @@ /* ********************************************************** - * Copyright (c) 2011-2019 Google, Inc. All rights reserved. + * Copyright (c) 2011-2021 Google, Inc. All rights reserved. * Copyright (c) 2001-2010 VMware, Inc. All rights reserved. * ********************************************************** */ @@ -124,6 +124,7 @@ #endif /* offsetof(dcontext_t, is_exiting) */ #define is_exiting_OFFSET (dstack_OFFSET+1*ARG_SZ) +#define PUSHGPR_XAX_OFFS (7*ARG_SZ) #define PUSHGPR_XSP_OFFS (3*ARG_SZ) #define MCONTEXT_XSP_OFFS (PUSHGPR_XSP_OFFS) #define MCONTEXT_XCX_OFFS (MCONTEXT_XSP_OFFS + 3*ARG_SZ) diff --git a/core/optionsx.h b/core/optionsx.h index 70de038385c..ea225a73339 100644 --- a/core/optionsx.h +++ b/core/optionsx.h @@ -1719,7 +1719,7 @@ OPTION_DEFAULT(bool, early_inject_map, true, "inject earliest via map") * os version. Our default is late injection to make it easier on clients * (as noted in i#980, we don't want to be too early for a private kernel32). 
*/ -OPTION_DEFAULT(uint, early_inject_location, 7 /* INJECT_LOCATION_ImageEntry */, +OPTION_DEFAULT(uint, early_inject_location, 8 /* INJECT_LOCATION_ThreadStart */, "where to hook for early_injection. Use 5 ==" "INJECT_LOCATION_KiUserApcdefault for earliest injection; use " "4 == INJECT_LOCATION_LdrDefault for easier-but-still-early.") diff --git a/core/os_shared.h b/core/os_shared.h index 65bb2bc5b66..e83e9e133d0 100644 --- a/core/os_shared.h +++ b/core/os_shared.h @@ -1314,7 +1314,12 @@ enum { * on some app libraries being initialized */ INJECT_LOCATION_ImageEntry = 7, - INJECT_LOCATION_MAX = INJECT_LOCATION_ImageEntry, + /* Similar in lateness to ImageEntry, but is more robust in that it does not + * rely on reaching the image entry, which not all apps do (e.g., .NET). + * This is equivalent to RtlUserThreadStart. + */ + INJECT_LOCATION_ThreadStart = 8, + INJECT_LOCATION_MAX = INJECT_LOCATION_ThreadStart, }; #endif diff --git a/core/win32/inject.c b/core/win32/inject.c index 9f9986d05dc..00e56f3c1a2 100644 --- a/core/win32/inject.c +++ b/core/win32/inject.c @@ -1268,6 +1268,18 @@ inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, uint64 hook_loca RAW_INSERT_INT32(cur_local_pos, -8); } #endif + /* Save xax, which we clobber below. It is live for INJECT_LOCATION_ThreadStart. + * We write it into earliest_args_t.app_xax, and in dynamorio_earliest_init_takeover + * we use the saved value to update the PUSHGPR pushed xax. + */ + if (target_64) + *cur_local_pos++ = REX_W; + *cur_local_pos++ = MOV_REG32_2_RM32; + *cur_local_pos++ = MOV_IMM_RM_ABS; + uint64 cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); + RAW_INSERT_INT32(cur_local_pos, + target_64 ? (remote_data - (cur_remote_pos + sizeof(int))) + : remote_data); /* Restore hook rather than trying to pass contents to C code * (we leave hooked page writable for this and C code restores).
*/ @@ -1330,14 +1342,14 @@ inject_gencode_mapped_helper(HANDLE phandle, char *dynamo_path, uint64 hook_loca /* over-estimate to be sure: we assert below we're < PAGE_SIZE */ REL32_REACHABLE((int64)pc, (int64)remote_code_buf + PAGE_SIZE)) { *cur_local_pos++ = JMP_REL32; - uint64 cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); + cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); RAW_INSERT_INT32(cur_local_pos, (int64)pc - (int64)(cur_remote_pos + sizeof(int))); } else { /* Indirect through an inlined target. */ *cur_local_pos++ = JMP_ABS_IND64_OPCODE; *cur_local_pos++ = JMP_ABS_MEM_IND64_MODRM; - uint64 cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); + cur_remote_pos = remote_code_buf + (cur_local_pos - local_code_buf); RAW_INSERT_INT32(cur_local_pos, target_64 ? 0 : cur_remote_pos + sizeof(int)); if (target_64) RAW_INSERT_INT64(cur_local_pos, pc); @@ -1436,8 +1448,8 @@ inject_gencode_mapped(HANDLE phandle, char *dynamo_path, uint64 hook_location, * own stack in the child and swap to that for transparency. */ bool -inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject_location, - void *inject_address) +inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool map, + uint inject_location, void *inject_address) { /* To handle a 64-bit child of a 32-bit DR we use "uint64" for remote addresses. */ uint64 hook_target = 0; @@ -1494,6 +1506,36 @@ inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject hook_location = get_remote_process_entry(phandle, &x86_code); late_injection = true; break; + case INJECT_LOCATION_ThreadStart: + late_injection = true; + /* Try to get the actual thread context if possible. We do not yet have + * support for CONTEXT32 and CONTEXT64, which we'd need for non-same-bitwidth. + * We next try looking in the remote ntdll for RtlUserThreadStart. 
+ * If we can't find the thread start, we fall back to the image entry, which + * is not many instructions later. We also need to call this to set + * x86_code: + */ + uint64 image_entry = get_remote_process_entry(phandle, &x86_code); + if (thandle != NULL && IF_X64(!) is_32bit_process(phandle)) { + CONTEXT cxt; + cxt.ContextFlags = CONTEXT_CONTROL; + if (NT_SUCCESS(nt_get_context(thandle, &cxt))) { + hook_location = cxt.CXT_XIP; + } + } else { + bool target_64 = !x86_code IF_X64(|| DYNAMO_OPTION(inject_x64)); + uint64 ntdll_base = find_remote_ntdll_base(phandle, target_64); + uint64 thread_start = + get_remote_proc_address(phandle, ntdll_base, "RtlUserThreadStart"); + if (thread_start != 0) { + hook_location = thread_start; + } + } + if (hook_location == 0) { + /* Fall back to the image entry which is just a few instructions later. */ + hook_location = image_entry; + } + break; default: ASSERT_NOT_REACHED(); goto error; } diff --git a/core/win32/injector.c b/core/win32/injector.c index 2b391043173..25b5c6a88f9 100644 --- a/core/win32/injector.c +++ b/core/win32/injector.c @@ -926,7 +926,7 @@ dr_inject_process_inject(void *data, bool force_injection, const char *library_p inject_init(); /* Like the core, we use map injection, which supports cross-arch injection, is * in some ways cleaner than thread injection, and supports early injection at - * various points. For now we use the (late) image entry as the takeover point. + * various points. For now we use the (late) thread entry as the takeover point. * TODO PR 211367: use earlier injection instead of this late injection! * But it's non-trivial to gather the relevant addresses. * i#234/PR 204587 is a prereq? @@ -935,10 +935,11 @@ dr_inject_process_inject(void *data, bool force_injection, const char *library_p /* We provide a way to fall back on thread injection. 
*/ if (info->using_thread_injection) { res = inject_into_thread(info->pi.hProcess, &cxt, info->pi.hThread, - (char *)library_path); + (char *)library_path); } else { - res = inject_into_new_process(info->pi.hProcess, (char *)library_path, - true /*map*/, INJECT_LOCATION_ImageEntry, NULL); + res = inject_into_new_process(info->pi.hProcess, info->pi.hThread, + (char *)library_path, true /*map*/, + INJECT_LOCATION_ThreadStart, NULL); } if (!res) { close_handle(info->pi.hProcess); diff --git a/core/win32/os.c b/core/win32/os.c index 50414110228..65e094603b7 100644 --- a/core/win32/os.c +++ b/core/win32/os.c @@ -3326,8 +3326,8 @@ should_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, /* cxt may be NULL if -inject_at_create_process */ static int -inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt, - inject_setting_mask_t should_inject) +inject_into_process(dcontext_t *dcontext, HANDLE process_handle, HANDLE thread_handle, + CONTEXT *cxt, inject_setting_mask_t should_inject) { /* Here in fact we don't want to have the default argument override mechanism take place. If an app specific AUTOINJECT value is @@ -3424,7 +3424,7 @@ inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt, * but if it does could fall back to late injection (though we can't * be sure that would work, i.e. early thread process for ex.) or * do a SYSLOG error. */ - res = inject_into_new_process(process_handle, library, + res = inject_into_new_process(process_handle, thread_handle, library, DYNAMO_OPTION(early_inject_map), early_inject_location, early_inject_address); } else { @@ -3517,7 +3517,8 @@ is_first_thread_in_new_process(HANDLE process_handle, CONTEXT *cxt) * Does not support cross-arch injection for cxt!=NULL. 
*/ bool -maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt) +maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, + HANDLE thread_handle, CONTEXT *cxt) { /* if inject_at_create_process becomes dynamic, need to move this check below * the synchronize dynamic options */ @@ -3561,7 +3562,8 @@ maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT * /* XXX: if not -early_inject, we are going to read and write * to cxt, which may be unsafe. */ - if (inject_into_process(dcontext, process_handle, cxt, should_inject)) { + if (inject_into_process(dcontext, process_handle, thread_handle, cxt, + should_inject)) { check_for_run_once(process_handle, rununder_mask); } } diff --git a/core/win32/os_private.h b/core/win32/os_private.h index 982cf4610d0..dc1c8dc2e5d 100644 --- a/core/win32/os_private.h +++ b/core/win32/os_private.h @@ -70,6 +70,7 @@ extern dcontext_t *early_inject_load_helper_dcontext; /* Passed to early injection init by parent. Sized to work for any bitwidth. */ typedef struct { + uint64 app_xax; uint64 dr_base; uint64 ntdll_base; uint64 tofree_base; @@ -89,7 +90,8 @@ bool is_first_thread_in_new_process(HANDLE process_handle, CONTEXT *cxt); bool -maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, CONTEXT *cxt); +maybe_inject_into_process(dcontext_t *dcontext, HANDLE process_handle, + HANDLE thread_handle, CONTEXT *cxt); bool translate_context(thread_record_t *trec, CONTEXT *cxt, bool restore_memory); @@ -670,10 +672,10 @@ inject_init(void); /* must be called prior to inject_into_thread(void) */ bool inject_into_thread(HANDLE phandle, CONTEXT *cxt, HANDLE thandle, char *dynamo_path); -/* inject_location values come form the INJECT_LOCATION_* enum is os_shared.h */ +/* inject_location values come from the INJECT_LOCATION_* enum in os_shared.h. 
*/ bool -inject_into_new_process(HANDLE phandle, char *dynamo_path, bool map, uint inject_location, - void *inject_address); +inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool map, + uint inject_location, void *inject_address); /* in (x86.asm for us) ************************************/ diff --git a/core/win32/syscall.c b/core/win32/syscall.c index c32e9a3bfff..f2d41009327 100644 --- a/core/win32/syscall.c +++ b/core/win32/syscall.c @@ -1401,7 +1401,7 @@ presys_CreateThread(dcontext_t *dcontext, reg_t *param_base) * children) FIXME * if not injecting at all we won't change cxt. */ - maybe_inject_into_process(dcontext, process_handle, cxt); + maybe_inject_into_process(dcontext, process_handle, thread_handle, cxt); if (is_phandle_me(process_handle)) pre_second_thread(); @@ -1728,7 +1728,7 @@ not_first_thread_in_new_process(HANDLE process_handle, HANDLE thread_handle) #ifndef X64 bool peb_is_32 = is_32bit_process(process_handle); if (!peb_is_32) { - /* We'd need a CONTEXT64 define for parent32,child64.. + /* We'd need a CONTEXT64 define for parent32,child64. * We only need this for pre-Vista, so just xp64, so we bail. */ REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(), @@ -3238,7 +3238,7 @@ postsys_CreateUserProcess(dcontext_t *dcontext, reg_t *param_base, bool success) } ASSERT(cxt != NULL || DYNAMO_OPTION(early_inject)); /* Else, exited above. */ /* Do the actual injection. 
*/ - if (!maybe_inject_into_process(dcontext, proc_handle, cxt)) + if (!maybe_inject_into_process(dcontext, proc_handle, thread_handle, cxt)) return; propagate_options_via_env_vars(dcontext, proc_handle, thread_handle); if (cxt != NULL) { @@ -4391,7 +4391,7 @@ post_system_call(dcontext_t *dcontext) "syscall post: NtCreateProcess section @" PFX "\n", base); }); if (success && d_r_safe_read(process_handle, sizeof(proc_handle), &proc_handle)) - maybe_inject_into_process(dcontext, proc_handle, NULL); + maybe_inject_into_process(dcontext, proc_handle, NULL, NULL); } else if (sysnum == syscalls[SYS_CreateProcessEx]) { HANDLE *process_handle = (HANDLE *)postsys_param(dcontext, param_base, 0); uint access_mask = (uint)postsys_param(dcontext, param_base, 1); @@ -4414,7 +4414,7 @@ post_system_call(dcontext_t *dcontext) } }); if (success && d_r_safe_read(process_handle, sizeof(proc_handle), &proc_handle)) - maybe_inject_into_process(dcontext, proc_handle, NULL); + maybe_inject_into_process(dcontext, proc_handle, NULL, NULL); } else if (sysnum == syscalls[SYS_CreateUserProcess]) { postsys_CreateUserProcess(dcontext, param_base, success); } else if (sysnum == syscalls[SYS_UnmapViewOfSection] || From 1ea352717bad09111606a861c67cd20aea9f57a4 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Tue, 5 Jan 2021 01:37:57 -0500 Subject: [PATCH 6/7] For thread-entry takeover, sets the thread context if possible instead of setting a hook, since a hook there seems to cause weird instability --- core/win32/inject.c | 57 ++++++++++++++++++++++++++++++-------- core/win32/module_shared.c | 4 +-- 2 files changed, 47 insertions(+), 14 deletions(-) diff --git a/core/win32/inject.c b/core/win32/inject.c index 00e56f3c1a2..31df4c7f23b 100644 --- a/core/win32/inject.c +++ b/core/win32/inject.c @@ -1459,6 +1459,7 @@ inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool byte hook_buf[EARLY_INJECT_HOOK_SIZE]; bool x86_code = false; bool late_injection = false; + uint64 
image_entry = 0; /* Possible child hook points */ GET_NTDLL(KiUserApcDispatcher, @@ -1515,7 +1516,7 @@ inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool * is not many instructions later. We also need to call this to set * x86_code: */ - uint64 image_entry = get_remote_process_entry(phandle, &x86_code); + image_entry = get_remote_process_entry(phandle, &x86_code); if (thandle != NULL && IF_X64(!) is_32bit_process(phandle)) { CONTEXT cxt; cxt.ContextFlags = CONTEXT_CONTROL; @@ -1567,17 +1568,51 @@ inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool if (hook_target == 0) goto error; - /* Place hook */ - if (REL32_REACHABLE((int64)hook_location + 5, (int64)hook_target)) { - hook_buf[0] = JMP_REL32; - *(int *)(&hook_buf[1]) = (int)((int64)hook_target - ((int64)hook_location + 5)); - } else { - hook_buf[0] = JMP_ABS_IND64_OPCODE; - hook_buf[1] = JMP_ABS_MEM_IND64_MODRM; - *(int *)(&hook_buf[2]) = 0; /* rip-rel to following address */ - *(uint64 *)(&hook_buf[6]) = hook_target; + bool skip_hook = false; + if (inject_location == INJECT_LOCATION_ThreadStart && hook_location != image_entry && + thandle != NULL) { + /* XXX i#803: Having a hook at the thread start seems to cause strange + * instability. We instead set the thread context, like thread injection + * does. We should better understand the problems. + * If we successfully set the context, we skip the hook. The gencode + * will still write the original instructions on top (a nop). 
+ */ + if (IF_X64_ELSE(true, is_32bit_process(phandle))) { + CONTEXT cxt; + cxt.ContextFlags = CONTEXT_CONTROL; + if (NT_SUCCESS(nt_get_context(thandle, &cxt))) { + cxt.CXT_XIP = (ptr_uint_t)hook_target; + if (NT_SUCCESS(nt_set_context(thandle, &cxt))) + skip_hook = true; + } + } +#ifndef X64 + else { + CONTEXT_64 cxt64; + cxt64.ContextFlags = CONTEXT_CONTROL; + if (thread_get_context_64(thandle, &cxt64)) { + cxt64.Rip = hook_target; + if (thread_set_context_64(thandle, &cxt64)) { + skip_hook = true; + } + } + } +#endif } - + if (!skip_hook) { + /* Place hook */ + if (REL32_REACHABLE((int64)hook_location + 5, (int64)hook_target)) { + hook_buf[0] = JMP_REL32; + *(int *)(&hook_buf[1]) = + (int)((int64)hook_target - ((int64)hook_location + 5)); + } else { + hook_buf[0] = JMP_ABS_IND64_OPCODE; + hook_buf[1] = JMP_ABS_MEM_IND64_MODRM; + *(int *)(&hook_buf[2]) = 0; /* rip-rel to following address */ + *(uint64 *)(&hook_buf[6]) = hook_target; + } + } + /* Even if skipping we have to mark writable since gencode writes to it. */ if (!remote_protect_virtual_memory_maybe64(phandle, hook_location, sizeof(hook_buf), PAGE_EXECUTE_READWRITE, &old_prot)) { goto error; diff --git a/core/win32/module_shared.c b/core/win32/module_shared.c index 2f77045fe17..085f6dcf6bf 100644 --- a/core/win32/module_shared.c +++ b/core/win32/module_shared.c @@ -60,8 +60,8 @@ * preinject just defines its own d_r_internal_error! 
*/ # include "../globals.h" +# include "os_private.h" # if !defined(NOT_DYNAMORIO_CORE_PROPER) -# include "os_private.h" /* for is_readable_pe_base() */ # include "../module_shared.h" /* for is_in_code_section() */ # endif # ifdef CLIENT_INTERFACE @@ -1200,7 +1200,6 @@ free_library_64(HANDLE lib) return (res >= 0); } -# ifndef NOT_DYNAMORIO_CORE_PROPER bool thread_get_context_64(HANDLE thread, CONTEXT_64 *cxt64) { @@ -1233,7 +1232,6 @@ thread_set_context_64(HANDLE thread, CONTEXT_64 *cxt64) res = switch_modes_and_call(&args); return NT_SUCCESS(res); } -# endif /* !NOT_DYNAMORIO_CORE_PROPER */ bool remote_protect_virtual_memory_64(HANDLE process, uint64 base, size_t size, uint prot, From 170c6c3ec4db343887e5de21d44cebaebfb28e26 Mon Sep 17 00:00:00 2001 From: Derek Bruening Date: Tue, 5 Jan 2021 10:25:33 -0500 Subject: [PATCH 7/7] Share the 3 CONTEXT structs to save stack space; add parent32,child64 context query for the hook and not just the later set --- core/win32/inject.c | 62 ++++++++++++++++++++++++++++---------------- core/win32/syscall.c | 6 +++-- 2 files changed, 44 insertions(+), 24 deletions(-) diff --git a/core/win32/inject.c b/core/win32/inject.c index 31df4c7f23b..e7d4744afae 100644 --- a/core/win32/inject.c +++ b/core/win32/inject.c @@ -1460,6 +1460,13 @@ inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool bool x86_code = false; bool late_injection = false; uint64 image_entry = 0; + union { + /* Ensure we're not using too much stack via a union. */ + CONTEXT cxt; +#ifndef X64 + CONTEXT_64 cxt64; +#endif + } cxt; /* Possible child hook points */ GET_NTDLL(KiUserApcDispatcher, @@ -1509,28 +1516,41 @@ inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool break; case INJECT_LOCATION_ThreadStart: late_injection = true; - /* Try to get the actual thread context if possible. We do not yet have - * support for CONTEXT32 and CONTEXT64, which we'd need for non-same-bitwidth. 
+        /* Try to get the actual thread context if possible.
          * We next try looking in the remote ntdll for RtlUserThreadStart.
          * If we can't find the thread start, we fall back to the image entry, which
-         * is not many instructions later. We also need to call this to set
-         * x86_code:
+         * is not many instructions later. We also need to call this first to set
+         * "x86_code":
          */
         image_entry = get_remote_process_entry(phandle, &x86_code);
-        if (thandle != NULL && IF_X64(!) is_32bit_process(phandle)) {
-            CONTEXT cxt;
-            cxt.ContextFlags = CONTEXT_CONTROL;
-            if (NT_SUCCESS(nt_get_context(thandle, &cxt))) {
-                hook_location = cxt.CXT_XIP;
+        if (thandle != NULL) {
+            /* We can get the context for same-bitwidth, or (below) for parent32,
+             * child64. For parent64, child32, a regular query gives us
+             * ntdll64!RtlUserThreadStart, which our gencode can't reach and which
+             * is not actually executed: we'd need a reverse switch_modes_and_call?
+             * For now we rely on the get_remote_proc_address() and assume that's
+             * the thread start for parent64, child32.
+             */
+            if (IF_X64(!) is_32bit_process(phandle)) {
+                cxt.cxt.ContextFlags = CONTEXT_CONTROL;
+                if (NT_SUCCESS(nt_get_context(thandle, &cxt.cxt)))
+                    hook_location = cxt.cxt.CXT_XIP;
             }
-        } else {
+#ifndef X64
+            else {
+                cxt.cxt64.ContextFlags = CONTEXT_CONTROL;
+                if (thread_get_context_64(thandle, &cxt.cxt64))
+                    hook_location = cxt.cxt64.Rip;
+            }
+#endif
+        }
+        if (hook_location == 0) {
             bool target_64 = !x86_code IF_X64(|| DYNAMO_OPTION(inject_x64));
             uint64 ntdll_base = find_remote_ntdll_base(phandle, target_64);
             uint64 thread_start =
                 get_remote_proc_address(phandle, ntdll_base, "RtlUserThreadStart");
-            if (thread_start != 0) {
+            if (thread_start != 0)
                 hook_location = thread_start;
-            }
         }
         if (hook_location == 0) {
             /* Fall back to the image entry which is just a few instructions later. */
@@ -1578,21 +1598,19 @@ inject_into_new_process(HANDLE phandle, HANDLE thandle, char *dynamo_path, bool
          * will still write the original instructions on top (a nop).
          */
         if (IF_X64_ELSE(true, is_32bit_process(phandle))) {
-            CONTEXT cxt;
-            cxt.ContextFlags = CONTEXT_CONTROL;
-            if (NT_SUCCESS(nt_get_context(thandle, &cxt))) {
-                cxt.CXT_XIP = (ptr_uint_t)hook_target;
-                if (NT_SUCCESS(nt_set_context(thandle, &cxt)))
+            cxt.cxt.ContextFlags = CONTEXT_CONTROL;
+            if (NT_SUCCESS(nt_get_context(thandle, &cxt.cxt))) {
+                cxt.cxt.CXT_XIP = (ptr_uint_t)hook_target;
+                if (NT_SUCCESS(nt_set_context(thandle, &cxt.cxt)))
                     skip_hook = true;
             }
         }
 #ifndef X64
         else {
-            CONTEXT_64 cxt64;
-            cxt64.ContextFlags = CONTEXT_CONTROL;
-            if (thread_get_context_64(thandle, &cxt64)) {
-                cxt64.Rip = hook_target;
-                if (thread_set_context_64(thandle, &cxt64)) {
+            cxt.cxt64.ContextFlags = CONTEXT_CONTROL;
+            if (thread_get_context_64(thandle, &cxt.cxt64)) {
+                cxt.cxt64.Rip = hook_target;
+                if (thread_set_context_64(thandle, &cxt.cxt64)) {
                     skip_hook = true;
                 }
             }
diff --git a/core/win32/syscall.c b/core/win32/syscall.c
index f2d41009327..929799daa5b 100644
--- a/core/win32/syscall.c
+++ b/core/win32/syscall.c
@@ -1728,8 +1728,10 @@ not_first_thread_in_new_process(HANDLE process_handle, HANDLE thread_handle)
 #ifndef X64
     bool peb_is_32 = is_32bit_process(process_handle);
     if (!peb_is_32) {
-        /* We'd need a CONTEXT64 define for parent32,child64.
-         * We only need this for pre-Vista, so just xp64, so we bail.
+        /* XXX: We need to use CONTEXT_64 and thread_get_context_64 for parent32,child64.
+         * We only need this for pre-Vista, so just xp64, where we are not willing
+         * to put much effort: for now we bail (we never supported cross-arch
+         * injection in the past in any case).
          */
         REPORT_FATAL_ERROR_AND_EXIT(FOLLOW_CHILD_FAILED, 3, get_application_name(),
                                     get_application_pid(),