start/stop profiling programmatically while target program is running #125

Closed
guoshimin opened this issue Feb 12, 2021 · 11 comments
Labels
enhancement New feature or request

Comments

@guoshimin

Is your feature request related to a problem? Please describe.
I'd like to use scalene to profile production workloads. I can't easily reproduce the production load pattern offline, so I can't just run scalene myprog offline. I can't start my job as scalene myprog in production either, because 1) I don't want to incur the overhead, and 2) I don't want it to profile from the start of the program - I only want to profile specific periods, and I want to see the report without stopping my job.

Describe the solution you'd like
A way to start and stop profiling, and produce reports programmatically from my program. I don't mind modifying my program.

Describe alternatives you've considered
I tried py-spy which allows attaching to a running process, but the results don't look plausible. I would like to get cross validation from scalene.

Additional context

@guoshimin guoshimin changed the title start/stop profiling programmatically when target program is running start/stop profiling programmatically while target program is running Feb 12, 2021
@emeryberger emeryberger added the enhancement New feature or request label Feb 14, 2021
@emeryberger
Member

In principle, this is doable. That said, can you be more specific about the kind of profiling results you are looking to enable/disable? If it's pure CPU profiling (e.g., breakdown of CPU time by Python/native/system, on a per-line basis), Scalene imposes quite low overhead (this mode is enabled via --cpu-only).

@guoshimin
Author

I'm interested in CPU profiling primarily. Even if the overhead is low enough that I can enable it in production, I still don't want it to profile the entire duration of my job. The load varies with time and I'd like to run profiling for the periods of my choosing.

@mangleddata

+1 to this request. I'd love this feature too. Even for non-prod workloads, it looks like I have to run the whole program and have it exit gracefully (I'm only interested in CPU profiling).

@emeryberger
Member

For your use case, @mangleddata, Scalene already has a feature that outputs profiles every so many seconds:

--profile-interval PROFILE_INTERVAL
                    output profiles every so many seconds (default: inf)

So, for example, you could say

scalene --cpu-only --profile-interval 60 --html --outfile profile.html yourprogram.py

and a new (CPU-only) profile would be written into the file profile.html every 60 seconds.

@emeryberger
Member

@guoshimin We are considering enabling this via signals, so that you could invoke kill SOMESIGNAL pid on the Scalene process to turn on and turn off profiling. It's a crude interface, but would this work for your use case?
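
For illustration only, a signal-driven on/off switch can be sketched in plain Python as follows. This is not Scalene's implementation: the choice of SIGUSR1 and the names `ProfilingSwitch`, `switch`, and `send_toggle` are assumptions made for this example, and it only runs on Unix-like systems.

```python
import os
import signal


class ProfilingSwitch:
    """Hypothetical flag that a profiler could consult before recording samples."""

    def __init__(self):
        self.enabled = False

    def toggle(self, signum, frame):
        # Signal handler: flip the flag each time the signal arrives.
        self.enabled = not self.enabled


switch = ProfilingSwitch()

# SIGUSR1 is an arbitrary choice for this sketch (Unix only).
# Handlers must be installed from the main thread.
signal.signal(signal.SIGUSR1, switch.toggle)


def send_toggle(pid):
    """Python equivalent of running `kill -USR1 <pid>` from a shell."""
    os.kill(pid, signal.SIGUSR1)
```

From another terminal, `kill -USR1 <pid>` would have the same effect as calling `send_toggle(pid)`.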

@emeryberger
Member

Much nicer approach now in a branch: https://github.com/plasma-umass/scalene/tree/signal-profiling

From --help:

When running Scalene in the background, you can suspend/resume profiling
for the process ID that Scalene reports. For example:

   % python3 -m scalene [options] yourprogram.py &
 Scalene now profiling process 12345
   to disable profiling: python3 -m scalene.profile --off --pid 12345
   to resume profiling:  python3 -m scalene.profile --on  --pid 12345
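
Since the original request was to control profiling from code, the two commands above could be wrapped in a small Python helper. This is only a sketch: `build_toggle_command` and `set_profiling` are hypothetical names, not part of Scalene, and using `sys.executable` in place of `python3` assumes Scalene is installed for the same interpreter. Whether the toggle behaves as expected when invoked from inside the profiled program itself is untested here.

```python
import subprocess
import sys


def build_toggle_command(pid, enable):
    """Build the suspend/resume command quoted in the --help text."""
    flag = "--on" if enable else "--off"
    # Assumption: Scalene is installed for the interpreter running this code.
    return [sys.executable, "-m", "scalene.profile", flag, "--pid", str(pid)]


def set_profiling(pid, enable):
    """Run the command; raises CalledProcessError if it exits non-zero."""
    return subprocess.run(build_toggle_command(pid, enable), check=True)
```

For example, `set_profiling(12345, False)` would suspend profiling for the process ID that Scalene reported, and `set_profiling(12345, True)` would resume it.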

@emeryberger
Member

This has now been merged. Note that this feature does not yet support multiprocessing, but that's coming!

@guoshimin
Author

Sorry, just saw the update. I believe this will work nicely for our use case. Thanks!

@emeryberger
Member

Great! Please report your findings (if successful!) here: #58

@KevinRSX

Is it possible that this could work with a program that has been started without scalene myprogram.py?

For example, I'd like to run a Python script with

python myprogram.py & # pid of myprogram is 12345
# After a while
python3 -m scalene.profile --on  --pid 12345 --outfile profile.html
# After another while
python3 -m scalene.profile --off  --pid 12345 --outfile profile.html

Then I'd like to be able to access the report from profile.html

@emeryberger
Member

No, it is not possible. Scalene - as it is currently designed - needs to launch the Python program being profiled.
