Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend flamegraphs to show alloc diff #238

Merged
merged 1 commit into from
Feb 4, 2023
Merged

Extend flamegraphs to show alloc diff #238

merged 1 commit into from
Feb 4, 2023

Conversation

LenkaNovak
Copy link
Collaborator

@LenkaNovak LenkaNovak commented Jan 27, 2023

Experimental PR that test the ease and usefulness of color coding flame graphs based on allocation changes compared to the previous commit, using a local testing bed.

To-do

  • Import ProfileCanvas and ProfileView
  • familiarize with code
  • implement a simple toy example with required changes and dummy history file
  • implement Buildkite to store report data
  • run a baseline build
  • implement final file read on slurm
  • clean files and indicate diff
  • docs

Content

Note: For the functions that were directly copied from ProfileCanvas.jl and ProfileView.js, we haven't added any new doc strings/CI tests yet, because Simon and I are still figuring out if we're going to integrate this eventually to the original packages. This PR is mainly a proof of concept, and if we decide to keep it long term, we'll clean it up to be more accessible for users. In the meantime though, it would be good to have this merged and available ASAP for our own perf monitoring purposes.


  • I have read and checked the items on the review checklist.

@LenkaNovak LenkaNovak mentioned this pull request Jan 28, 2023
10 tasks
@LenkaNovak LenkaNovak marked this pull request as ready for review February 3, 2023 17:22
Copy link
Member

@juliasloan25 juliasloan25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! My main comment is I think it would be more clear to use the ProfileCanvas.jl package and extend anything we need to change/add, rather than copying over the whole files. This way we'd have less code and the modifications would be more obvious :)

@@ -11,6 +11,7 @@
![canvas](images/canvas_coupler.png)

- each row along the y-axis represents a level of backtraces. In this case the lowermost level is at the top, and the top level represents what is directly being run on the CPU. The stacks in each level are sorted alphabetically (not chronologically, like flame _charts_). The column width is proportional to the presence in samples (related to allocations). The colors are grouped into runtime-dispatch, gc, compilation and default. The intensity is random.
- we also have a local beta version of flame graphs (in `perf/ProfileCanvasDiff.jl` and `perf/ProfileViewerDiff.js`), which plots the same flame graphs as above but with the color corresponding to whether the stack allocation has been reduced (blue) or increased (red) compared to the last staged runs. The color intensity is proportional to the fractional change, and black signifies untracked traces.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be helpful to explain how to run the version of flame graphs with colors showing stack allocation change vs showing allocation type

@@ -0,0 +1,390 @@
# temporarily copied and modified from https://github.com/pfitzseb/ProfileCanvas.jl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of copying the code over, could you use the ProfileCanvas package and extend any functions you need to behave differently?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, totally. We need to decide if we're going to keep this format when we generalize this for the other repos. If so, then we'll definitely condense this like you suggest. I've made a separate issue for this final format #245 , but since this may take a while to figure out, it would be useful to merge this prototype ASAP, so we can start tracking staged runs.

transfer + modify from ClimaCore/ClimaAtmos

add alloc benchmarks

add buildkite driver

rev allocs

docs

activate docs

clean

fix

doc fix

doc fix

check in perf Manifest

bkite flag order fix

scope fix

FT redef error

fix

fix

stash

fix

toy example runs

prep for buildkite staging test

buildkite driver

format

bug

plot in bkite

fix

fix

module path

bkite build path

switch from toy to coupler

switch from toy to coupler

rebase test

Manifest up

prep dummy benchmark

fix

original ProfileCanvas, ProfileViewer files

diff

format

diff

race and docs

race and docs

permissions

restart

restart

restart

revs
@LenkaNovak
Copy link
Collaborator Author

Discussed and approved offline. Merging this for immediate tracking and will follow up on the interface refactor in #245.

@LenkaNovak
Copy link
Collaborator Author

bors r+

@bors
Copy link
Contributor

bors bot commented Feb 4, 2023

@bors bors bot merged commit 17da02a into main Feb 4, 2023
@bors bors bot deleted the ln/flamegraph-diff branch February 4, 2023 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants