Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include python thread names in recording #237

Merged
merged 3 commits into from
May 10, 2020

Conversation

jarus
Copy link
Contributor

@jarus jarus commented May 9, 2020

In comparison to single py-spy dump, a recording doesn't include the custom defined python thread name. While analysing the performance of an application with various threads, I spent the most time on identifying the thread so that I could correlate the recording with other observability data e.g. logs and distributed traces.

This change extends py-spy's recording capabilities by including the python thread name in each stacktrace struct. The speedscope output formatter uses the collected data to create sample profiles with the thread name as profile name so that the thread name is visible in the speedscope UI.

Copy link
Owner

@benfred benfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

One potential problem with this is that the thread names can change on the python thread, and we'd still use the old value since we're caching the thread names. You are reloading when new threads are getting created or destroyed, so this is probably not a huge deal - and I don't want to incur the cost of the thread name lookup per sample so I'm thinking we should just merge this as is.

Btw, I've fixed the freebsd CI failure here: #238. Can you fix the travis CI issue though?

expected_threads.insert("MainThread".to_string());
let detected_threads: HashSet<String> = traces
.iter()
.map(|trace| trace.thread_name.as_ref().unwrap().clone())
Copy link
Owner

@benfred benfred May 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unwrap on this line is causing issues on travis with the x86_64 linux variants: https://travis-ci.org/github/benfred/py-spy/builds/684995442 . I believe this is probably because we are using a python2 interpreter there - and we can only get thread names from version 3.6 on.

Can you change to only run this on python 3.6+ ? There is code in test_local_vars that does this

@jarus jarus force-pushed the python-thread-names branch from 40fa11b to 1def353 Compare May 9, 2020 18:02
@jarus jarus force-pushed the python-thread-names branch from 1def353 to a1db435 Compare May 9, 2020 20:09
@jarus
Copy link
Contributor Author

jarus commented May 9, 2020

One potential problem with this is that the thread names can change on the python thread, and we'd still use the old value since we're caching the thread names.

Yes, I also thought about this limitation but I actually never saw a Python application that changes the thread name after the initial start.

What do you think, is it acceptable to include the thread name determination overhead by default? The sampling of applications with frequent changing threads could be negatively impacted.

@benfred benfred merged commit 076b975 into benfred:master May 10, 2020
@benfred
Copy link
Owner

benfred commented May 10, 2020

Yeah - I think I'd rather not incur the performance penalty. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants