Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py-spy record --subprocess consistently failing #244

Closed
mtreinish opened this issue May 28, 2020 · 5 comments
Closed

py-spy record --subprocess consistently failing #244

mtreinish opened this issue May 28, 2020 · 5 comments

Comments

@mtreinish
Copy link

mtreinish commented May 28, 2020

Trying to profile an application with multiple processes being launched with a multiprocessing pool is consistently failing when I specify the --subprocess flag on py-spy record. With a couple of different errors. I've seen:

thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src/python_spy.rs:250:68

   0:     0x56013eec4234 - backtrace::backtrace::libunwind::trace::h0b636de20dd0298a
                               at /cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
   1:     0x56013eec4234 - backtrace::backtrace::trace_unsynchronized::ha3553b391ef1d30f
                               at /cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
   2:     0x56013eec4234 - std::sys_common::backtrace::_print_fmt::h3cca6bc199df83f5
                               at src/libstd/sys_common/backtrace.rs:78
   3:     0x56013eec4234 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd2e9245a861eba47
                               at src/libstd/sys_common/backtrace.rs:59
   4:     0x56013eeefbfc - core::fmt::write::h5a694455d355f552
                               at src/libcore/fmt/mod.rs:1069
   5:     0x56013eec13d3 - std::io::Write::write_fmt::h6fd0176ea0db366d
                               at src/libstd/io/mod.rs:1504
   6:     0x56013eec6a85 - std::sys_common::backtrace::_print::h075dd72a7bbaafbc
                               at src/libstd/sys_common/backtrace.rs:62
   7:     0x56013eec6a85 - std::sys_common::backtrace::print::hdc1f8c2e10016b59
                               at src/libstd/sys_common/backtrace.rs:49
   8:     0x56013eec6a85 - std::panicking::default_hook::{{closure}}::h4669a6613297f792
                               at src/libstd/panicking.rs:198
   9:     0x56013eec67c2 - std::panicking::default_hook::h6ec85866bdaab722
                               at src/libstd/panicking.rs:218
  10:     0x56013eec70e2 - std::panicking::rust_panic_with_hook::ha9eda39b4c817d70
                               at src/libstd/panicking.rs:511
  11:     0x56013eec6ccb - rust_begin_unwind
                               at src/libstd/panicking.rs:419
  12:     0x56013eeed9c1 - core::panicking::panic_fmt::h62773cc8c6fb930c
                               at src/libcore/panicking.rs:111
  13:     0x56013eeed90d - core::panicking::panic::h67eeef5aa3af5141
                               at src/libcore/panicking.rs:54
  14:     0x56013ec638ef - py_spy::python_spy::PythonSpy::_get_stack_traces::h2581adc182ccb458
  15:     0x56013ec5a4e4 - py_spy::python_spy::PythonSpy::get_stack_traces::h60fa998a92153c48
  16:     0x56013ec29212 - std::sys_common::backtrace::__rust_begin_short_backtrace::hecf912a8089a0ba0
  17:     0x56013ec91325 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h4dc0116cb5175e89
  18:     0x56013eeb976f - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::ha178bcfcf20af363
                               at /rustc/20fc02f836f3035b86b56a7cedb97c5cd4ed9612/src/liballoc/boxed.rs:1008
  19:     0x56013eece0e3 - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h8fbe5db4385f75a3
                               at /rustc/20fc02f836f3035b86b56a7cedb97c5cd4ed9612/src/liballoc/boxed.rs:1008
  20:     0x56013eece0e3 - std::sys::unix::thread::Thread::new::thread_start::h254ab4a9d4c60802
                               at src/libstd/sys/unix/thread.rs:87
  21:     0x7f0628ffb422 - start_thread
  22:     0x7f0629124bf3 - __GI___clone
  23:                0x0 - <unknown>

and

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/main.rs:265:35

stack backtrace:
   0:     0x55a4089d3234 - backtrace::backtrace::libunwind::trace::h0b636de20dd0298a
                               at /cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
   1:     0x55a4089d3234 - backtrace::backtrace::trace_unsynchronized::ha3553b391ef1d30f
                               at /cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
   2:     0x55a4089d3234 - std::sys_common::backtrace::_print_fmt::h3cca6bc199df83f5
                               at src/libstd/sys_common/backtrace.rs:78
   3:     0x55a4089d3234 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd2e9245a861eba47
                               at src/libstd/sys_common/backtrace.rs:59
   4:     0x55a4089febfc - core::fmt::write::h5a694455d355f552
                               at src/libcore/fmt/mod.rs:1069
   5:     0x55a4089d03d3 - std::io::Write::write_fmt::h6fd0176ea0db366d
                               at src/libstd/io/mod.rs:1504
   6:     0x55a4089d5a85 - std::sys_common::backtrace::_print::h075dd72a7bbaafbc
                               at src/libstd/sys_common/backtrace.rs:62
   7:     0x55a4089d5a85 - std::sys_common::backtrace::print::hdc1f8c2e10016b59
                               at src/libstd/sys_common/backtrace.rs:49
   8:     0x55a4089d5a85 - std::panicking::default_hook::{{closure}}::h4669a6613297f792
                               at src/libstd/panicking.rs:198
   9:     0x55a4089d57c2 - std::panicking::default_hook::h6ec85866bdaab722
                               at src/libstd/panicking.rs:218
  10:     0x55a4089d60e2 - std::panicking::rust_panic_with_hook::ha9eda39b4c817d70
                               at src/libstd/panicking.rs:511
  11:     0x55a4089d5ccb - rust_begin_unwind
                               at src/libstd/panicking.rs:419
  12:     0x55a4089fc9c1 - core::panicking::panic_fmt::h62773cc8c6fb930c
                               at src/libcore/panicking.rs:111
  13:     0x55a4089fc90d - core::panicking::panic::h67eeef5aa3af5141
                               at src/libcore/panicking.rs:54
  14:     0x55a4087ce4cc - py_spy::run_spy_command::hff58a00b73ec0b61
  15:     0x55a4087cf433 - py_spy::main::h2a8e8a4218ab0ded
  16:     0x55a4087386b3 - std::rt::lang_start::{{closure}}::hbecf4b966f44bc45
  17:     0x55a4089d6528 - std::rt::lang_start_internal::{{closure}}::ha68e6c09133a8b56
                               at src/libstd/rt.rs:52
  18:     0x55a4089d6528 - std::panicking::try::do_call::h4a6a9e06f0f24b30
                               at src/libstd/panicking.rs:331
  19:     0x55a4089d6528 - std::panicking::try::h25396bc9de2fe2a9
                               at src/libstd/panicking.rs:274
  20:     0x55a4089d6528 - std::panic::catch_unwind::hfd213a703bb176df
                               at src/libstd/panic.rs:394
  21:     0x55a4089d6528 - std::rt::lang_start_internal::h14b708e32681f6d5
                               at src/libstd/rt.rs:51
  22:     0x55a4087d0142 - main
  23:     0x7fdeba221002 - __libc_start_main
  24:     0x55a4086ee1fe - _start
  25:                0x0 - <unknown>

this second stack trace also logs:

[2020-05-28T16:59:44.109327501Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-05-28T16:59:44.109345665Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside

I have also encountered the error reported in #243 I suspect all three error modes are related but wasn't sure so I opened a separate issue about it. Feel free to close this as a duplicate if necessary.

To reproduce I've been running this script:

from qiskit import Aer
from qiskit.circuit.random import random_circuit
from qiskit import execute

circuits = []
for i in range(2):
    circuits.append(random_circuit(20, 20, measure=True))

backend = Aer.get_backend('qasm_simulator')
result = execute(circuits, backend).result()
print(result.get_counts())

which requires that qiskit-terra and qiskit-aer be installed from pypi. Then running:

py-spy record --subprocesses --native -o flame.svg -- python test_circuits.py

(it fails without the native flag too, but there are compiled extensions written in cython, c++, and rust as executed as part of the script which I want to profile so I left it in)

This happens consistently from both release and master with python 3.8 and python 3.7 on linux.

@JMP252983
Copy link

ช่วยสอนหน่อยจ้า

@benfred
Copy link
Owner

benfred commented Jun 23, 2020

Thanks for the bug report!

I think there was an issue when subprocesses exited that caused this - I put a fix in #265 that I believe should fix.

Can you try it out? It's included in the latest development release pip install py-spy==0.4.0.dev1

@mtreinish
Copy link
Author

Thanks, yeah I gave it a try and I'm not able to reproduce any of the hard failure cases. I'm still seeing Error: No child process (os error 10) (like #243) after each run though. When I run with RUST_BACKTRACE=full it shows:

py-spy> Stopped sampling because process exitted
py-spy> Wrote flamegraph data to 'flame.svg'. Samples: 244 Errors: 1
Error: No child process (os error 10)
   0: failure::backtrace::internal::InternalBacktrace::new
   1: py_spy::main
   2: std::rt::lang_start_internal::{{closure}}::{{closure}}
             at /rustc/c7087fe00d2ba919df1d813c040a5d47e43b0fe7/src/libstd/rt.rs:52
      std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/c7087fe00d2ba919df1d813c040a5d47e43b0fe7/src/libstd/sys_common/backtrace.rs:130
   3: main

@gsnedders
Copy link

Just to add: I likewise can't reproduce it failing any more running 0.4.0.dev1

@benfred
Copy link
Owner

benfred commented Sep 29, 2021

I think this is resolved now ( and I suspect that the error @mtreinish reported was fixed by #391 ). Closing this, but feel free to reopen if this error is still occurring in the latest build

@benfred benfred closed this as completed Sep 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants