zenoh-c DLL panics in `libc::atexit` handler on Windows
#973
Comments
After more digging, I realized that the error is non-deterministic. Sometimes the
I also realized I wasn't enabling debug symbols in my build, so my backtraces were not helpful at all. The following are backtraces for each scenario. Please note that the code from which I got the backtraces is slightly modified, but is functionally the same.
Backtrace when `Runtime::drop` panics
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\thread\mod.rs:1439:40
stack backtrace:
0: 0x7ffda9e7c753 - std::backtrace_rs::backtrace::dbghelp::trace
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\..\..\backtrace\src\backtrace\dbghelp.rs:98
1: 0x7ffda9e7c753 - std::backtrace_rs::backtrace::trace_unsynchronized
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\..\..\backtrace\src\backtrace\mod.rs:66
2: 0x7ffda9e7c753 - std::sys_common::backtrace::_print_fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:65
3: 0x7ffda9e7c753 - std::sys_common::backtrace::_print::impl$0::fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:44
4: 0x7ffda9beed0b - core::fmt::rt::Argument::fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\fmt\rt.rs:138
5: 0x7ffda9beed0b - core::fmt::write
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\fmt\mod.rs:1094
6: 0x7ffda9e6ad50 - std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\io\mod.rs:1714
7: 0x7ffda9e7ea3b - std::sys_common::backtrace::_print
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:47
8: 0x7ffda9e7ea3b - std::sys_common::backtrace::print
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:34
9: 0x7ffda9e7e63e - std::panicking::default_hook::closure$1
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:269
10: 0x7ffda9e7f594 - std::panicking::default_hook
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:288
11: 0x7ffda9e7f594 - std::panicking::rust_panic_with_hook
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:705
12: 0x7ffda9e7eff3 - std::panicking::begin_panic_handler::closure$0
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:595
13: 0x7ffda9e7ef79 - std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::closure_env$0,never$>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:151
14: 0x7ffda9e7ef64 - std::panicking::begin_panic_handler
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:593
15: 0x7ffdaa2dea85 - core::panicking::panic_fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\panicking.rs:67
16: 0x7ffdaa2dec52 - core::panicking::panic
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\panicking.rs:117
17: 0x7ffda9e83239 - std::thread::JoinInner<tuple$<> >::join<tuple$<> >
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\thread\mod.rs:1439
18: 0x7ffda9e8f8fe - tokio::runtime::blocking::pool::BlockingPool::shutdown
at C:\Users\zenoh\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tokio-1.36.0\src\runtime\blocking\pool.rs:270
19: 0x7ffdaa1a181a - tokio::runtime::blocking::pool::impl$4::drop
at C:\Users\zenoh\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tokio-1.36.0\src\runtime\blocking\pool.rs:278
20: 0x7ffdaa1a181a - core::ptr::drop_in_place
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
21: 0x7ffdaa1a181a - core::ptr::drop_in_place<tokio::runtime::runtime::Runtime>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
22: 0x7ffdaa1a2b89 - core::ptr::drop_in_place
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
23: 0x7ffdaa1a2b89 - core::ptr::drop_in_place
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
24: 0x7ffdaa1a2b89 - core::mem::maybe_uninit::MaybeUninit<zenoh_runtime::impl$5::drop::closure$1::closure$0::closure_env$0>::assume_init_drop
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\mem\maybe_uninit.rs:728
25: 0x7ffdaa1a2b89 - std::thread::impl$0::spawn_unchecked_::impl$1::drop
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\thread\mod.rs:510
26: 0x7ffdaa1a2b89 - core::ptr::drop_in_place
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
27: 0x7ffdaa1a2b89 - core::ptr::drop_in_place<std::thread::impl$0::spawn_unchecked_::closure_env$1<zenoh_runtime::impl$5::drop::closure$1::closure$0::closure_env$0,tuple$<> > >
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
28: 0x7ffda9e7b922 - core::ptr::drop_in_place
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\ptr\mod.rs:497
29: 0x7ffda9e7b922 - core::ptr::drop_in_place
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\ptr\mod.rs:497
30: 0x7ffda9e7b922 - core::mem::drop
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\mem\mod.rs:987
31: 0x7ffda9e7b922 - std::sys::windows::thread::Thread::new
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys\windows\thread.rs:47
32: 0x7ffdaa1a28b0 - std::panicking::try::do_call
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\panicking.rs:500
33: 0x7ffdaa1a28b0 - std::panicking::try
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\panicking.rs:464
34: 0x7ffdaa1a28b0 - std::panic::catch_unwind
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\panic.rs:142
35: 0x7ffdaa1a28b0 - zenoh_runtime::impl$5::drop::closure$1
at C:\Users\zenoh\tmp\zenoh\commons\zenoh-runtime\src\lib.rs:202
36: 0x7ffdaa1a28b0 - core::ops::function::impls::impl$4::call_once
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ops\function.rs:305
37: 0x7ffdaa1a28b0 - enum2$<core::option::Option<tuple$<zenoh_runtime::ZRuntime,enum2$<core::option::Option<tokio::runtime::runtime::Runtime> > > > >::map
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\option.rs:1075
38: 0x7ffdaa1a28b0 - core::iter::adapters::map::impl$2::next<enum2$<core::result::Result<std::thread::JoinHandle<tuple$<> >,alloc::boxed::Box<dyn$<core::any::Any,core::marker::Send>,alloc::alloc::Global> > >,core::iter::adapters::take::Take<core::iter::adapters::filter_map::F
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\iter\adapters\map.rs:103
39: 0x7ffdaa1a19f6 - core::ptr::drop_in_place<zenoh_runtime::ZRuntimePool>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
40: 0x7ffdaa1a168a - zenoh_runtime::cleanup
at C:\Users\zenoh\tmp\zenoh\commons\zenoh-runtime\src\lib.rs:152
41: 0x7ffdd52742d6 - execute_onexit_table
42: 0x7ffdd52741fb - execute_onexit_table
43: 0x7ffdd52741b4 - execute_onexit_table
44: 0x7ffdaa2d8aad - dllmain_crt_process_detach
at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp:180
45: 0x7ffdaa2d8bd2 - dllmain_dispatch
at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp:293
46: 0x7ffdd77a9a1d - RtlActivateActivationContextUnsafeFast
47: 0x7ffdd77edcda - LdrShutdownProcess
48: 0x7ffdd77eda8d - RtlExitUserProcess
49: 0x7ffdd611e3bb - FatalExit
50: 0x7ffdd52805bc - exit
51: 0x7ffdd528045f - exit
52: 0x7ff656f212c7 - <unknown>
53: 0x7ffdd6117344 - BaseThreadInitThunk
54: 0x7ffdd77e26b1 - RtlUserThreadStart
Backtrace when `Runtime::drop` doesn't panic
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 5, kind: PermissionDenied, message: "Access is denied." }', C:\Users\zenoh\tmp\zenoh\commons\zenoh-runtime\src\lib.rs:208:24
stack backtrace:
0: 0x7ffda9e7c753 - std::backtrace_rs::backtrace::dbghelp::trace
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\..\..\backtrace\src\backtrace\dbghelp.rs:98
1: 0x7ffda9e7c753 - std::backtrace_rs::backtrace::trace_unsynchronized
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\..\..\backtrace\src\backtrace\mod.rs:66
2: 0x7ffda9e7c753 - std::sys_common::backtrace::_print_fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:65
3: 0x7ffda9e7c753 - std::sys_common::backtrace::_print::impl$0::fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:44
4: 0x7ffda9beed0b - core::fmt::rt::Argument::fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\fmt\rt.rs:138
5: 0x7ffda9beed0b - core::fmt::write
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\fmt\mod.rs:1094
6: 0x7ffda9e6ad50 - std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\io\mod.rs:1714
7: 0x7ffda9e7ea3b - std::sys_common::backtrace::_print
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:47
8: 0x7ffda9e7ea3b - std::sys_common::backtrace::print
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:34
9: 0x7ffda9e7e63e - std::panicking::default_hook::closure$1
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:269
10: 0x7ffda9e7f594 - std::panicking::default_hook
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:288
11: 0x7ffda9e7f594 - std::panicking::rust_panic_with_hook
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:705
12: 0x7ffda9e7f025 - std::panicking::begin_panic_handler::closure$0
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:597
13: 0x7ffda9e7ef79 - std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::closure_env$0,never$>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\sys_common\backtrace.rs:151
14: 0x7ffda9e7ef64 - std::panicking::begin_panic_handler
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\std\src\panicking.rs:593
15: 0x7ffdaa2dea85 - core::panicking::panic_fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\panicking.rs:67
16: 0x7ffdaa2defa3 - core::result::unwrap_failed
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library\core\src\result.rs:1651
17: 0x7ffdaa1a29a5 - std::panicking::try::do_call
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\panicking.rs:500
18: 0x7ffdaa1a29a5 - std::panicking::try
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\panicking.rs:464
19: 0x7ffdaa1a29a5 - std::panic::catch_unwind
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\std\src\panic.rs:142
20: 0x7ffdaa1a29a5 - zenoh_runtime::impl$5::drop::closure$1
at C:\Users\zenoh\tmp\zenoh\commons\zenoh-runtime\src\lib.rs:202
21: 0x7ffdaa1a29a5 - core::ops::function::impls::impl$4::call_once
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ops\function.rs:305
22: 0x7ffdaa1a29a5 - enum2$<core::option::Option<tuple$<zenoh_runtime::ZRuntime,enum2$<core::option::Option<tokio::runtime::runtime::Runtime> > > > >::map
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\option.rs:1075
23: 0x7ffdaa1a29a5 - core::iter::adapters::map::impl$2::next<enum2$<core::result::Result<std::thread::JoinHandle<tuple$<> >,alloc::boxed::Box<dyn$<core::any::Any,core::marker::Send>,alloc::alloc::Global> > >,core::iter::adapters::take::Take<core::iter::adapters::filter_map::F
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\iter\adapters\map.rs:104
24: 0x7ffdaa1a19f6 - core::ptr::drop_in_place<zenoh_runtime::ZRuntimePool>
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
25: 0x7ffdaa1a168a - zenoh_runtime::cleanup
at C:\Users\zenoh\tmp\zenoh\commons\zenoh-runtime\src\lib.rs:152
26: 0x7ffdd52742d6 - execute_onexit_table
27: 0x7ffdd52741fb - execute_onexit_table
28: 0x7ffdd52741b4 - execute_onexit_table
29: 0x7ffdaa2d8aad - dllmain_crt_process_detach
at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp:180
30: 0x7ffdaa2d8bd2 - dllmain_dispatch
at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp:293
31: 0x7ffdd77a9a1d - RtlActivateActivationContextUnsafeFast
32: 0x7ffdd77edcda - LdrShutdownProcess
33: 0x7ffdd77eda8d - RtlExitUserProcess
34: 0x7ffdd611e3bb - FatalExit
35: 0x7ffdd52805bc - exit
36: 0x7ffdd528045f - exit
37: 0x7ff656f212c7 - <unknown>
38: 0x7ffdd6117344 - BaseThreadInitThunk
39: 0x7ffdd77e26b1 - RtlUserThreadStart
To understand what's going on here, let's start with the Windows thread spawning function of the Rust stdlib:
# https://github.com/rust-lang/rust/blob/1.72.0/library/std/src/sys/windows/thread.rs#L33
let ret = c::CreateThread(
    ptr::null_mut(),
    stack,
    Some(thread_start),
    p as *mut _,
    c::STACK_SIZE_PARAM_IS_A_RESERVATION,
    ptr::null_mut(),
);
let ret = HandleOrNull::from_raw_handle(ret);
return if let Ok(handle) = ret.try_into() {
    Ok(Thread { handle: Handle::from_inner(handle) })
} else {
    // The thread failed to start and as a result p was not consumed. Therefore, it is
    // safe to reconstruct the box so that it gets deallocated.
    drop(Box::from_raw(p));
    Err(io::Error::last_os_error())
};
Thus, if a thread creation syscall fails, Rust will try to drop the thread closure before returning the error. In our case, the
# https://github.com/tokio-rs/tokio/blob/tokio-1.35.x/tokio/src/runtime/blocking/pool.rs#L269
for (_id, handle) in workers {
    let _ = handle.join();
}
So why do we sometimes reach this point in
# https://github.com/rust-lang/rust/blob/1.72.0/library/std/src/thread/mod.rs#L528
let try_result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
    crate::sys_common::backtrace::__rust_begin_short_backtrace(f)
}));
// SAFETY: `their_packet` as been built just above and moved by the
// closure (it is an Arc<...>) and `my_packet` will be stored in the
// same `JoinInner` as this closure meaning the mutation will be
// safe (not modify it and affect a value far away).
unsafe { *their_packet.result.get() = Some(try_result) };
// Here `their_packet` gets dropped, and if this is the last `Arc` for that packet that
// will call `decrement_num_running_threads` and therefore signal that this thread is
// done.
drop(their_packet);
In the above snippet,
# https://github.com/rust-lang/rust/blob/master/library/std/src/thread/mod.rs#L1577
impl<'scope, T> JoinInner<'scope, T> {
    fn join(mut self) -> Result<T> {
        // Calls `WaitForSingleObject` on Windows
        self.native.join();
        Arc::get_mut(&mut self.packet).unwrap().result.get_mut().take().unwrap()
    }
}
Except that when the zenoh-c application (not the DLL) exits, Windows would've already signaled all the Tokio runtime threads by the time we reach the
If a runtime thread ends up dropping its packet, then the
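To make the failure mode in `JoinInner::join` concrete, here is a minimal sketch of the last line of that function in isolation. It is not the stdlib's code: `Packet`, `JoinError` and `take_result` are hypothetical names, and a `Mutex` stands in for the `UnsafeCell` the stdlib uses. It only illustrates that `Arc::get_mut` yields `None` as long as the other side still holds its `Arc`, which is one way the `Option::unwrap()` on that line can fail.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the stdlib's private `Packet<T>`.
struct Packet {
    result: Mutex<Option<i32>>,
}

#[derive(Debug, PartialEq)]
enum JoinError {
    /// Another `Arc` to the packet still exists (the thread never dropped its handle).
    PacketStillShared,
    /// The packet is unique, but no result was ever stored (or it was already taken).
    NoResult,
}

/// Same shape as the last line of `JoinInner::join`, with each `unwrap()`
/// replaced by an explicit error.
fn take_result(packet: &mut Arc<Packet>) -> Result<i32, JoinError> {
    // `Arc::get_mut` only succeeds when this is the sole reference to the packet.
    let unique = Arc::get_mut(packet).ok_or(JoinError::PacketStillShared)?;
    // A poisoned mutex still gives access to the inner value via `into_inner`.
    let slot = unique.result.get_mut().unwrap_or_else(|e| e.into_inner());
    slot.take().ok_or(JoinError::NoResult)
}

fn main() {
    let mut mine = Arc::new(Packet { result: Mutex::new(None) });
    let theirs = Arc::clone(&mine);

    // The "thread" side still owns its handle: no unique access is possible.
    assert_eq!(take_result(&mut mine), Err(JoinError::PacketStillShared));

    // Normal termination: store the result, then drop the handle.
    *theirs.result.lock().unwrap() = Some(42);
    drop(theirs);
    assert_eq!(take_result(&mut mine), Ok(42));

    // The result has been taken, so a second attempt finds nothing.
    assert_eq!(take_result(&mut mine), Err(JoinError::NoResult));
}
```

In the real `join`, the wait on the native thread handle is what is supposed to guarantee that the other `Arc` is gone by the time `Arc::get_mut` runs; the sketch simply skips that wait to show what happens when the guarantee does not hold.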
I opened rust-lang/rust#124466 and rust-lang/rust#124468 to discuss/improve the stdlib's handling of this.
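The second backtrace panics on a `Result::unwrap()` of an OS error inside the `ZRuntimePool` drop path, which appears to be a failed thread spawn. As a rough sketch only (this is not zenoh's code; `FakeRuntime` and `shutdown_all` are hypothetical names, and the sketch uses only the standard library), a per-runtime shutdown thread can be spawned with `std::thread::Builder::spawn`, which returns an `io::Result`, so a spawn failure can be recorded instead of unwrapped:

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a runtime with a blocking, bounded shutdown.
struct FakeRuntime {
    id: usize,
}

impl FakeRuntime {
    fn shutdown_timeout(self, _timeout: Duration) {
        // A real runtime would stop its workers here; the sketch does nothing.
        let _ = self.id;
    }
}

/// Shuts every runtime down on its own thread.
/// Returns how many shutdown threads were joined, plus any spawn errors.
fn shutdown_all(runtimes: Vec<FakeRuntime>, timeout: Duration) -> (usize, Vec<io::Error>) {
    let mut handles = Vec::new();
    let mut spawn_errors = Vec::new();

    for rt in runtimes {
        let name = format!("shutdown-{}", rt.id);
        // `Builder::spawn` reports OS-level failures as `io::Error` instead of panicking.
        match thread::Builder::new()
            .name(name)
            .spawn(move || rt.shutdown_timeout(timeout))
        {
            Ok(handle) => handles.push(handle),
            // The closure was moved into `spawn`, so the runtime it owned is gone either way.
            Err(err) => spawn_errors.push(err),
        }
    }

    // Count successful joins and ignore join errors rather than unwrapping them.
    let joined = handles.into_iter().filter_map(|h| h.join().ok()).count();
    (joined, spawn_errors)
}

fn main() {
    let runtimes: Vec<FakeRuntime> = (0..3).map(|id| FakeRuntime { id }).collect();
    let (joined, errors) = shutdown_all(runtimes, Duration::from_millis(100));
    println!("joined {joined} shutdown threads, {} spawn errors", errors.len());
}
```

This only avoids the `Result::unwrap()` panic from the second backtrace. It does not address the `Option::unwrap()` panic inside the stdlib's `JoinInner::join` from the first backtrace, nor the question of whether threads should be spawned at all from an `atexit` handler that runs during `DLL_PROCESS_DETACH`.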
Describe the bug
See this workflow run failure for context.
The `z_api_double_drop_test` fails when syncing with zenoh, starting from commit 0283aaa.
I've observed this crash only when zenoh-c is linked dynamically to an application and not when linked statically. Weirdly enough, this crash still happens when one of (or both of) the `z_drop` calls are removed.
The `Drop` implementation of `ZRuntimePool` calls `.shutdown_timeout()` on each runtime in parallel by spawning a thread for each shutdown operation. This `Drop` implementation is in turn called in a `libc::atexit` handler.
I'm still not sure why this causes the crash. You can see from the backtrace that a new thread is created (I think this is the thread spawned in the `Drop` implementation?) after zenoh receives a `DLL_PROCESS_DETACH` notification, because the `atexit` handler of the DLL is called after the application process exits (actually, the DLL has its own `atexit` handler stack, separate from the application's). All application threads are signaled at process exit and thus calls to `WaitForSingleObject` return immediately (this is what Rust uses to implement `.join()`). The following is the origin of the panic:
So my theory is that threads spawned after process exit as part of a DLL's `atexit` handler are somehow signaled at creation and therefore terminate immediately after calling `WaitForSingleObject` without terminating correctly and dropping their "packet" handles, but I don't have any proof. I think the underlying issue here is much more subtle and needs more digging (but we have a release to push out!). But back to the `Drop` implementation:
To reproduce
System info