Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Enhance metrics reported by superpmi diffs #74584

Merged
merged 25 commits into from
Sep 19, 2022

Conversation

jakobbotsch
Copy link
Member

  • Report the total number of contexts, minopts contexts and fullopts
    contexted processed
  • Report number of successful and missing contexts
  • Report asmdiffs and tpdiffs for minopts/fullopts separately

Fixes #70350
Contributes to #73506

* Report the total number of contexts, minopts contexts and fullopts
  contexted processed
* Report number of successful and missing contexts
* Report asmdiffs and tpdiffs for minopts/fullopts separately

Fixes dotnet#70350
Contributes to dotnet#73506
@ghost ghost assigned jakobbotsch Aug 25, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 25, 2022
@ghost
Copy link

ghost commented Aug 25, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details
  • Report the total number of contexts, minopts contexts and fullopts
    contexted processed
  • Report number of successful and missing contexts
  • Report asmdiffs and tpdiffs for minopts/fullopts separately

Fixes #70350
Contributes to #73506

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

PAL does not have the template variant of vsprintf_s.
@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Aug 25, 2022
@jakobbotsch
Copy link
Member Author

@dotnet/jit-contrib I'm decently happy with what I have now. It looks like the following for a change with diffs: https://dev.azure.com/dnceng-public/public/_build/results?buildId=14828&view=ms.vss-build-web.run-extensions-tab

Currently running it to make sure it does not look horrible when there are no diffs, but otherwise this is ready for review. @kunalspathak can you take a look?

@jakobbotsch jakobbotsch marked this pull request as ready for review September 13, 2022 19:53
Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More details are always good to have, but I honestly think that the extension page is getting very crowded, and it is hard to follow the most common information that we want to see.

I think the confusing part is when you extend "Details" and at one point I lose track of whose details I was looking for.

@@ -89,39 +89,8 @@ def append_diff_file(f, arch, file_name, full_file_path, asmdiffs):
else:
diffs_found = True
f.write("## {} {}\n".format(diff_os, diff_arch))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me why this section is removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sections are printed by superpmi.py now.

color = "red" if diff > base else "green"
return "<span style=\"color:{}\">{}{:,d}</span>".format(color, plus_if_positive, diff - base)

diffed_contexts = sum(int(diff_metrics["Overall"]["Successful compiles"]) for (_, _, diff_metrics, _, _) in asm_diffs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you convert sum(....) into a function where you pass 1st and 2nd level names? E.g. get_sum("Overall", "Successful compiles")?


write_fh.write("<summary>{} ({} bytes)</summary>\n\n".format(col, format_delta(sum_base, sum_diff)))
write_fh.write("|Collection|Base size (bytes)|Diff size (bytes)|\n")
write_fh.write("|---|--:|--:|\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to draw table and cell boundaries to see things clearly? I have to stare at them to understand which number belongs to which column.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is possible with markdown tables in Azure Pipelines. I agree the default style there is not great, though with only three columns I haven't really had this particular problem.
Maybe switching to HTML tables would work.

*pParam->isMinOpts =
jitFlags.IsSet(CORJIT_FLAGS::CORJIT_FLAG_DEBUG_CODE) ||
jitFlags.IsSet(CORJIT_FLAGS::CORJIT_FLAG_MIN_OPT) ||
jitFlags.IsSet(CORJIT_FLAGS::CORJIT_FLAG_TIER0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Tier0 code is same as MinOpt.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today they are practically the same, I think. They both lead to OptimizarionDisabled() returning true.

Copy link
Member Author

@jakobbotsch jakobbotsch Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also note that Compiler::Options::MinOpts() returns true in tier-0 compilations, so it's not wrong to say that tier-0 compilations are MinOpts)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I was checking yesterday and comparing it with "debuggable code" which is different. I was thinking that "debuggable code" is min-opts.

@jakobbotsch
Copy link
Member Author

More details are always good to have, but I honestly think that the extension page is getting very crowded, and it is hard to follow the most common information that we want to see.

What is the most common information you want to see? Currently it is showing the overall diffs for each collection by default, which is what I want to see, but required multiple clicks before to individually look at.

I do agree it is still a bit overwhelming to look at, but I'm not sure how to present it better. Part of this is that some of this information is new but is information we really ought to always have looked at as part of evaluating changes (in particular the MISSED contexts).

@jakobbotsch
Copy link
Member Author

the extension page is getting very crowded

Do you think it would be better if I leave the "Overall" sections unexpanded by default? Then you at least aren't presented with so many tables immediately when you go to look.

@kunalspathak
Copy link
Member

Do you think it would be better if I leave the "Overall" sections unexpanded by default? Then you at least aren't presented with so many tables immediately when you go to look.

Yes, that's what I mean by important information shown by default and everything else is hidden.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Sep 14, 2022

Yes, that's what I mean by important information shown by default and everything else is hidden.

Well, that's already how it was. It showed you the diffs by default and hid everything else.

The new one that hides the diffs by default looks like this: https://dev.azure.com/dnceng-public/public/_build/results?buildId=16291&view=ms.vss-build-web.run-extensions-tab

Since we show the sum of diffs now it seems fine to me to hide the per-collection diffs by default. What do you think?

I've also changed the Details section a bit by adding a horizontal line between the table and jit-analyze output (which I have added a header for as well).

The "sum of diffs" is not really a meaningful number to base anything on, but it does help to guide where one should drill in.

@kunalspathak
Copy link
Member

  1. I mean, here I am not sure showing > MinOpts (+0 bytes), > FullOpts (+194,496 bytes) by default is of any value. It is good information to have, but not be default IMO. We can just stick with Overall numbers.
    image

If someone sees those numbers, they anyway don't have clue when they expand "Details diff" to see where they are coming from? I would suggest putting it few clicks under should be fine. For instance, we might not have cases where MinOpts code size diffs are changing frequently, but perhaps they might show up once in a while on TP metrics.

  1. The "sum of diffs" is not really a meaningful number to base anything on, but it does help to guide where one should drill in.

image

How about removing the "Base size" column? What are we going to do knowing that? Instead `Diff size (bytes)` and `Diff %` might be better?
  1. I always felt spacing between Details and next configuration say OSX arm64 show be more to distinguish that we are starting the next section. Without that it looks more congested when expanded.

image

Probably a fancy line separator might help.

If we are ever able to post back these results to PR as comments, I would like to include some asmdiff examples of each category too and then, we will have to find some more space to have them.

Again, these are just some suggestions, and it might take multiple rounds to make it to the point where it proves readable for everyone. Feel free to ignore some of the suggestions.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Sep 14, 2022

  1. I mean, here I am not sure showing > MinOpts (+0 bytes), > FullOpts (+194,496 bytes) by default is of any value. It is good information to have, but not be default IMO. We can just stick with Overall numbers.

I strongly disagree. Having the diffs separated out into these categories is quite important to me. Code size regressions in MinOpts are not as important as code size regressions in FullOpts. TP regressions in MinOpts are more important than TP regressions in FullOpts. If I'm making a transformation that is enabled only when optimizing then "Overall" is a misleading category -- I am interesting in its impact only on optimized contexts. I am also interested in verifying that there is no impact on MinOpts.

FWIW, I recall seeing quite a few instances of "small diffs except in the asp.net collection", which I always viewed as an indication that there were primarily MinOpts diffs in a change.

I'd be fine with hiding it under "Details", but I think the separated categories are quite important. What does the rest of @dotnet/jit-contrib think? Should they be shown by default or hidden away?

If someone sees those numbers, they anyway don't have clue when they expand "Details diff" to see where they are coming from?

We can fix jit-analyze separately. At least this makes it clear that there is impact on these categories of contexts and that further investigation is warranted.

How about removing the "Base size" column? What are we going to do knowing that? Instead Diff size (bytes) and Diff % might be better?

This was how I first had it, but I quickly saw that "Diff %" is almost always too small to give any meaningful information. For most changes I looked at it is 0.00%. This very large forward sub change is in the 0.01% - 0.02% range. So this was an alternative way to give some notion of relative impact. Not sure if there's a better way (permille?) -- I personally have internalized what a "large diff" is based on just the absolute diff after working on the JIT for a while.

I always felt spacing between Details and next configuration say OSX arm64 show be more to distinguish that we are starting the next section. Without that it looks more congested when expanded.

I agree, let me try to add some more separation.

@SingleAccretion
Copy link
Contributor

Should they be shown by default or hidden away?

I agree with Jakob that the minopts/fullopts separation should be shown prominently.

I also agree it's a bit unfortunate getting to the jit-analyze output is not so quick, since it contain the outliers (large relative regressions), but I don't think it's a big deal.

Suggestion: grey out (don't color) 0.00% for TP diffs - green -0.00% look a bit strange.

@kunalspathak
Copy link
Member

Code size regressions in MinOpts are not as important as code size regressions in FullOpts. TP regressions in MinOpts are more important than TP regressions in FullOpts.

That's exactly what I stated: "For instance, we might not have cases where MinOpts code size diffs are changing frequently, but perhaps they might show up once in a while on TP metrics." and if we agree that why not hide code size diffs of MinOpts and show TP diff instead by default?

I think I am trying to convey that we should just surface minimum information on top that matters 80% of time and rest all can be hidden behind few clicks.

@jakobbotsch
Copy link
Member Author

if we agree that why not hide code size diffs of MinOpts and show TP diff instead by default?

So is your suggestion to move the MinOpts section under Details for asmdiffs, and the FullOpts section under Details for the TP diffs? But retain Overall/FullOpts and OverAll/MinOpts respectively? It's odd to me to have these sections appear in different places.

FWIW, I agree the diffs themselves are less important if they are expected. But easily being able to see whether there are any unexpected diffs is very important too. I dislike having to go expand various subsections for something that I expect I always want to check.

@kunalspathak
Copy link
Member

Sounds ok to me. May be adding the extra spacing around (as I suggested above) might make it better. And of course, we can re-iterate with better ideas.

@AndyAyersMS
Copy link
Member

AndyAyersMS commented Sep 16, 2022

we can re-iterate with better ideas.

Right, this is a clear improvement over what we have now, and once we have some mileage on it we can revise.

My only suggestion would be to try and also record the net +/- size contributions, so it is easier to tell if something is a pure improvement or regression or mixed bag, eg

Overall (+190,456 bytes) (+250,000/-59,544)

But this can wait. Maybe even in the fullness of time, some sort of empirical PDF/histogram for the gains and losses. I am usually interested in knowing when there are outliers and such... many of my recent changes have a lot of small wins and a few big regressions, for example.

@jakobbotsch
Copy link
Member Author

This is how it looks with no diffs: https://dev.azure.com/dnceng-public/public/_build/results?buildId=21852&view=ms.vss-build-web.run-extensions-tab
Unfortunately we have spurious diffs in the coreclr_tests.run collection for arm platforms right now (see #75672 (comment))). But you can see how it should look with no diffs for the Linux/Windows x64 sections.

I think I'll remove the "missed contexts" line when there are no misses, though it's probably a quite rare situation.

@jakobbotsch jakobbotsch merged commit c3d09aa into dotnet:main Sep 19, 2022
@jakobbotsch jakobbotsch deleted the spmi-metrics branch September 19, 2022 17:19
@ghost ghost locked as resolved and limited conversation to collaborators Oct 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SPMI CI runs need to surface context misses
5 participants