superpmi: create asm diffs AzDO pipeline #59445

Closed
BruceForstall opened this issue Sep 22, 2021 · 9 comments · Fixed by #61194

@BruceForstall
Member

BruceForstall commented Sep 22, 2021

We now have a SuperPMI replay pipeline: #56871.

Based on this work, create a SuperPMI pipeline for generating asm diffs.

The purpose of this pipeline is not to create all the asm diffs and let users download the resulting text files so they can avoid doing that (simple) work themselves. Rather, it is to use the parallelism available in the AzDO system to quickly determine whether there are any asm diffs in any configuration.

This will be used by JIT developers who assert that their change produces no asm diffs, and who can then quickly show and validate that across our SuperPMI collections.

Note that in contrast to the superpmi-replay pipeline, there is no need here to run with different stress modes.

This pipeline should run automatically for every JIT PR.

The pipeline should create a "summary.md" that summarizes all the asmdiffs run, and upload that to the pipeline artifacts. It should also post it as an "Extensions" page like Antigen/Fuzzlyn do (e.g., https://dev.azure.com/dnceng/public/_build/results?buildId=1452609&view=ms.vss-build-web.run-extensions-tab). A stretch goal would be to post the summary.md results to the GitHub comments stream.

A follow-up goal is to generate base/diff textual .dasm files for a subset of the diffs (perhaps just the "top 20" or so, as reported by the summary.md file) and package these up for easy downloading and viewing. Note that it should be a restricted subset to avoid accidentally generating and uploading .dasm files for every single method context in the case of pervasive diffs.
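
A minimal sketch of restricting the .dasm output to a "top N" subset. The input shape (method name, base size, diff size) and the ranking by absolute size change are assumptions made for illustration; `top_diffs` is a hypothetical helper, not something from superpmi.py or jit-analyze.

```python
def top_diffs(method_sizes, limit=20):
    """Return at most `limit` changed methods, largest absolute size change first.

    method_sizes: iterable of (method_name, base_bytes, diff_bytes) tuples.
    This tuple layout is an assumption for the sketch.
    """
    # Keep only methods whose code size actually changed.
    changed = [m for m in method_sizes if m[1] != m[2]]
    # Rank by the magnitude of the change, biggest first.
    changed.sort(key=lambda m: abs(m[2] - m[1]), reverse=True)
    # Cap the result so pervasive diffs cannot produce an unbounded upload.
    return changed[:limit]
```

Capping the list before any .dasm files are generated keeps the upload bounded even when nearly every method context has a diff.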

@BruceForstall BruceForstall added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 22, 2021
@BruceForstall BruceForstall added this to the 7.0.0 milestone Sep 22, 2021
@ghost

ghost commented Sep 22, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Sep 22, 2021
@BruceForstall BruceForstall removed the untriaged New issue has not been triaged by the area owner label Sep 22, 2021
@kunalspathak
Member

A few thoughts:

  • We need a way to determine the base commit (should be doable with some git command). Then we need to do a JIT-only build (using the JIT rolling build) for the baseline JIT.
  • Once the asmdiffs have run, we should generate the summary.md for (at least) the CodeSize and PerfScore metrics and upload it as an artifact that a developer can point us to.

@BruceForstall
Member Author

If we can figure out the baseline commit (in the upstream main branch), we could just download the baseline build that the JIT rolling build already created. There should be no need to build it ourselves.

@BruceForstall
Member Author

One way to figure out the baseline:

The pipelines work in the context of a git clone of the PR branch. With the remote set up as by git remote add origin https://github.com/dotnet/runtime (this is there by default), run:

# The normal AzDO checkout doesn't fetch the 'main' ref, so we need to get that.
git fetch origin main --depth=20
# Where do the merge PR branch (e.g., refs/pull/61179/merge) and the 'main' branch meet?
git merge-base HEAD origin/main
# e.g., outputs f7be57f11b7e0a4782f3836669ec1bebcd53ccd8
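
The two git commands above could be wrapped in Python roughly as follows. This is a sketch only: `run_git` and `find_merge_base` are hypothetical names, and the `run` parameter exists just so the logic can be exercised without a real repository. It assumes, as the commands above do, that fetching `main` from `origin` updates the `origin/main` ref.

```python
import subprocess


def run_git(args, cwd="."):
    """Run a git command in `cwd` and return its stripped stdout; raises on failure."""
    result = subprocess.run(
        ["git"] + list(args),
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def find_merge_base(remote="origin", branch="main", depth=None, run=run_git):
    """Fetch `branch` from `remote`, then return the merge base of HEAD and that branch."""
    fetch_args = ["fetch", remote, branch]
    if depth is not None:
        # A shallow fetch is cheaper but may not reach far enough back in history.
        fetch_args.append(f"--depth={depth}")
    run(fetch_args)
    return run(["merge-base", "HEAD", f"{remote}/{branch}"])
```

Calling `find_merge_base(depth=20)` in a PR checkout mirrors the commands above; passing `depth=None` fetches the full history of the branch.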

The "fetch" commands seem to pass some authorization options, e.g.:

-c http.extraheader="AUTHORIZATION: basic ***"

Do we need to do this?

@BruceForstall
Member Author

Then we need to walk back the commits to the last JIT one so we can find a commit hash to use for the JIT rolling build.

This is essentially the logic in superpmi.py::process_base_jit_path_arg().

So we probably can't use "--depth=20". Maybe start with pulling down everything for 'main', and prune later if necessary.
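
A rough sketch of the walk-back step, not the actual code in superpmi.py: list recent commits that touched the JIT directory, starting from the merge base, and take the first one for which a rolling build exists. The default path `src/coreclr/jit`, the limit of 20, and the `build_exists` callback are assumptions for illustration; all function names here are hypothetical.

```python
import subprocess


def git_lines(args, cwd="."):
    """Run a git command and return its non-empty output lines."""
    out = subprocess.run(
        ["git"] + list(args), cwd=cwd, check=True, capture_output=True, text=True
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def candidate_jit_commits(base_hash, jit_path="src/coreclr/jit", limit=20, run=git_lines):
    """Return up to `limit` commits reachable from `base_hash` that touched `jit_path`, newest first."""
    return run(["log", f"--max-count={limit}", "--pretty=format:%H", base_hash, "--", jit_path])


def pick_baseline(candidates, build_exists):
    """Return the first candidate hash that has a rolling build, or None if none does."""
    for commit_hash in candidates:
        if build_exists(commit_hash):
            return commit_hash
    return None
```

Because `git log` has to see the JIT-touching commits, the fetch of `main` needs enough history for this walk to succeed, which is why a small `--depth` is likely insufficient.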

@kunalspathak
Member

> So we probably can't use "--depth=20". Maybe start with pulling down everything for 'main', and prune later if necessary.

Right. When I experimented with this while working on #59598, pulling everything was very expensive in terms of the amount of data downloaded. If we do this only in the asmdiffs pipeline and nowhere else, it should be OK though.

@BruceForstall BruceForstall self-assigned this Nov 4, 2021
@BruceForstall
Member Author

Draft PR: #61194

@BruceForstall
Member Author

Processing generated .dasm files requires running jit-analyze. This means cloning and building jitutils, using the dotnet CLI. The build machine can cross-build a target-specific, self-contained jit-analyze as follows:

C:\gh\jitutils>dotnet publish -c Release --runtime osx-x64 --self-contained --output c:\gh\jitutils\bin\osx-x64 c:\gh\jitutils\src\jit-analyze\jit-analyze.csproj

Replace osx-x64 with win-x64, linux-x64, or one of the other appropriate RIDs as needed.

On Windows, you get jit-analyze.exe; on Linux/Mac, you get jit-analyze (which perhaps needs to be made executable on the target machine?).
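
As an illustration, the publish command above can be generated per RID from a script. This sketch only builds the argument list using the same `dotnet publish` options shown above; `publish_command` and `tool_name` are hypothetical helpers, and the directory layout under the jitutils root is taken from the example command.

```python
import os


def publish_command(jitutils_root, rid, configuration="Release"):
    """Build the `dotnet publish` argument list for a self-contained jit-analyze targeting `rid`."""
    project = os.path.join(jitutils_root, "src", "jit-analyze", "jit-analyze.csproj")
    output_dir = os.path.join(jitutils_root, "bin", rid)
    return [
        "dotnet", "publish",
        "-c", configuration,
        "--runtime", rid,
        "--self-contained",
        "--output", output_dir,
        project,
    ]


def tool_name(rid):
    """Name of the published executable: .exe suffix on Windows RIDs, none elsewhere."""
    return "jit-analyze.exe" if rid.startswith("win-") else "jit-analyze"
```

The resulting list can be passed to `subprocess.run(..., check=True)` once per target RID.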

@kunalspathak
Member

> jit-analyze (which perhaps needs to be made executable on the target machine?)

Possibly; we do that in other pipelines too.

def make_executable(file_name):
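
The body of the helper referenced above is not shown in this thread. As a general illustration only, marking a file executable on Linux/Mac can look like the following in Python; `mark_executable` is a hypothetical name and this is not the repository's implementation.

```python
import os
import stat


def mark_executable(path):
    """Add the execute bit for user, group, and others, keeping all other mode bits."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```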

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Nov 6, 2021
BruceForstall added a commit that referenced this issue Nov 10, 2021
Create a new `runtime-coreclr superpmi-asmdiffs` pipeline that runs SuperPMI asmdiffs for every change in the JIT directory.

The diffs are run on two platforms: Windows x64 and Windows x86. Asm diffs for Linux, Arm64, and Arm32 are generated using cross-compilers, as follows:

| Platform | Asm diffs |
| -- | -- |
| Windows x64 | win-x64, win-arm64, linux-x64, linux-arm64 |
| Windows x86 | win-x86, linux-arm |

The resulting summary .md files are uploaded into the pipeline artifacts, one .md file per platform (so, one for the Windows x64 runs and one for the Windows x86 runs). The results are also displayed in the "Extensions" page of the AzDO pipeline.

The runs take about 50 minutes to complete (assuming not much waiting for machines).

The asm diffs pipeline is similar to the "superpmi-replay" pipeline, except:
1. It determines what an appropriate baseline JIT would be based on the PR commit and how it merges with the `main` branch. Given this, it downloads the matching baseline JITs from the JIT rolling build artifacts in Azure Storage.
2. It clones the `jitutils` repo and builds the `jit-analyze` tool, needed to generate the summary .md file.
3. It downloads and adds to the Helix machine payload a "portable" `git` installation, as `git diff` is used by `jit-analyze` for analyzing the generated .dasm files of the diff.
4. It collects all the various summary.md files into one per platform on which the runs are done, and publishes that to the artifacts and the `Extensions` page.
5. It only does one replay (asmdiffs) run, not one for each of a set of multiple stress modes.
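
Step 4 above can be pictured with a short sketch that gathers every per-run summary file under a directory and concatenates them into one file. The file-name suffix, the directory layout, and the helper name `combine_summaries` are assumptions for illustration; this is not the pipeline's actual script.

```python
import os


def combine_summaries(root_dir, output_path, suffix="summary.md"):
    """Concatenate all files under `root_dir` whose names end with `suffix` into `output_path`.

    Returns the list of input files, in the order they were written.
    """
    inputs = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip the combined file itself if it lives under root_dir.
            if name.endswith(suffix) and os.path.abspath(path) != os.path.abspath(output_path):
                inputs.append(path)
    with open(output_path, "w", encoding="utf-8") as out:
        for path in inputs:
            with open(path, encoding="utf-8") as src:
                # Separate the individual summaries with a blank line.
                out.write(src.read().rstrip() + "\n\n")
    return inputs
```

Sorting the directories and file names keeps the combined file stable from run to run, which makes two pipeline results easier to compare.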

As part of this implementation,
a. The `azdo_pipelines_util.py` was renamed to `jitutil.py`, and a lot of utility functions from superpmi.py were moved over to it. This was mostly to share the code for downloading and uncompressing .zip files. (There is a slight change to the output from the `superpmi.py download` command as a result.) However, I also moved a bunch of simple, more general helpers, for possible future sharing.
b. `jitrollingbuild.py download` can now take no arguments and download a baseline JIT (from the JIT rolling build Azure Storage location), for the current enlistment, to the default location. Previously, it required a specific git_hash and target directory. There is similar logic in superpmi.py, but not quite the same.
c. The `superpmi.py --no_progress` option was made global, and applied in a few more places. This was necessary because `superpmi.py asmdiffs` will download a coredistools binary from the JIT Azure Storage if one isn't found in the Core_Root folder.

Fixes #59445
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Nov 10, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2021