Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some Profile compatibility routines #42482

Merged
merged 7 commits into from
Oct 10, 2021
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented Oct 3, 2021

This makes it easier for external packages like FlameGraphs to continue
to support Profile by hiding some of the details of the internal
metadata format.

It also fixes a couple of typos in docstrings and comments.

@timholy timholy requested a review from IanButterworth October 3, 2021 08:51
@timholy
Copy link
Member Author

timholy commented Oct 3, 2021

Interesting! One of the buildkite errors shows that occasionally the "stripped" data has the same magic sequence as the end-terminator in unstripped data. I can replicate this locally. Presumably it would be better to prevent this?

Copy link
Member

@IanButterworth IanButterworth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd.. is your reproducer just running the test? It would be good to look at the raw data

stdlib/Profile/src/Profile.jl Outdated Show resolved Hide resolved
timholy added a commit to timholy/FlameGraphs.jl that referenced this pull request Oct 8, 2021
@vchuravy
Copy link
Member

vchuravy commented Oct 9, 2021

Buildkite failure is related, https://buildkite.com/julialang/julia-master/builds/4507#099ef975-964b-490d-a462-b3129151c01b

@IanButterworth
Copy link
Member

IanButterworth commented Oct 10, 2021

As seen in the buildkite failures, The is_block_end heuristic wasn't narrow-enough to work fully reliably on stripped data (where no proper block ends should exist)
Below is an offending excerpt of data.

To fix it I narrowed the heuristic a bit further

      From worker 8:	3900280160

      From worker 8:	3900280606

      From worker 8:	4154356168

      From worker 8:	4154491433

      From worker 8:	0

      From worker 8:	4158664865

      From worker 8:	0

      From worker 8:	0

      From worker 8:	4160067078

      From worker 8:	4160068269

      From worker 8:	4158664504

      From worker 8:	4158664814

timholy and others added 7 commits October 10, 2021 01:18
This makes it easier for external packages like FlameGraphs to continue
to support Profile by hiding some of the details of the internal
metadata format.
This reverts commit cc2ebd4.
Copy link
Member Author

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this across the finish line! Approved. (GitHub doesn't let me approve a PR I started.)

@IanButterworth IanButterworth merged commit c041759 into master Oct 10, 2021
@IanButterworth IanButterworth deleted the teh/profile_compat branch October 10, 2021 13:59
timholy added a commit to timholy/FlameGraphs.jl that referenced this pull request Oct 11, 2021
@vtjnash
Copy link
Member

vtjnash commented Oct 12, 2021

To fix it I narrowed the heuristic a bit further

I'm a bit confused. This was not a heuristic before. Seems like this PR was just generating corrupt blocks of data?

@IanButterworth
Copy link
Member

By heuristic, I mean the is_block_end logic. Previously it just checked for two nulls and a following non-null, which evidently wasn't enough to not have false positives on data that has been stripped of metadata

@vtjnash
Copy link
Member

vtjnash commented Oct 12, 2021

Yes, it isn't valid to call that on data that isn't formatted into blocks.

@IanButterworth
Copy link
Member

Ok, so instead of making it work for both stripped and unstripped, would it be better to keep is_block_end as it was, and make another called contains_metadata that just returns Bool?

I'm also questioning whether it was even right to strip out the metadata in the first place. It was done to avoid breakage downstream, but consumers are using internals like Profile.tree! which assume metadata is present, so breaks happened anyway.

@vtjnash
Copy link
Member

vtjnash commented Oct 12, 2021

yeah, we might want to add a versioning bit at the front, so we can continue to update this without causing breakage

@timholy
Copy link
Member Author

timholy commented Oct 12, 2021

Yes, it's possible that the fundamental problem was stripping it out sounded nicer than it was.

However, tests like these in FlameGraphs explicitly constructed raw Profile data and so end up being dependent on internals. Perhaps we need a utility in Profile to support construction of fake data so it can be better-insulated from the rest of the world?

Two other observations:

  • compared to FlameGraphs, our Profile tests look pretty anemic. I am substantially to blame for this, my standards have risen over the years
  • despite the more exhaustive tests, I've anecdotally seen cases where I am sure that FlameGraph's dependence on LeftChildRightSiblingTrees introduces bugs that are not present in Profile. LCRS is waaay more complicated and should have been yanked out long ago for a simpler representation; it was an early case of premature optimization in the very early days of ProfileView.

IanButterworth added a commit that referenced this pull request Nov 2, 2021
As was discussed in #42482 after merge, there's some things that could be done better:

- Reverts the is_block_end logic, given it should only be run on data that includes metadata, so can afford to be more efficient
- Adds has_meta for identifying when profile data has metadata
- Adds metadata checks to functions that expect metadata to be present
- Changes Profile.fetch to by default include metadata. The reason for this is that it was previously made to strip metadata to avoid breaking existing downstream Profile data consumers, but it turns out that they use Profile internals like tree! that require the metadata to be present, so this ended up causing an unintended breakage anyway
- Adds consts for the metadata field offsets, for consumers to use
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
As was discussed in JuliaLang#42482 after merge, there's some things that could be done better:

- Reverts the is_block_end logic, given it should only be run on data that includes metadata, so can afford to be more efficient
- Adds has_meta for identifying when profile data has metadata
- Adds metadata checks to functions that expect metadata to be present
- Changes Profile.fetch to by default include metadata. The reason for this is that it was previously made to strip metadata to avoid breaking existing downstream Profile data consumers, but it turns out that they use Profile internals like tree! that require the metadata to be present, so this ended up causing an unintended breakage anyway
- Adds consts for the metadata field offsets, for consumers to use
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
As was discussed in JuliaLang#42482 after merge, there's some things that could be done better:

- Reverts the is_block_end logic, given it should only be run on data that includes metadata, so can afford to be more efficient
- Adds has_meta for identifying when profile data has metadata
- Adds metadata checks to functions that expect metadata to be present
- Changes Profile.fetch to by default include metadata. The reason for this is that it was previously made to strip metadata to avoid breaking existing downstream Profile data consumers, but it turns out that they use Profile internals like tree! that require the metadata to be present, so this ended up causing an unintended breakage anyway
- Adds consts for the metadata field offsets, for consumers to use
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants