Bump LLVM to get bazel fixes #2517

sjain-stanford · 2023-10-18T01:30:17Z

The last llvm bump in #2511 pointed to llvm/llvm-project@b44b349, but the upstream bazel build was not clean at that commit:

ERROR: /root/.cache/bazel/_bazel_root/b89349c08f7224396763d14fe35cba11/external/llvm-project/mlir/BUILD.bazel:5837:18: TdGenerate
external/llvm-project/mlir/include/mlir/Dialect/LLVMIR/NVVMOpsInterface.h.inc failed: (Exit 1): mlir-tblgen failed: error executing command ...

external/llvm-project/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td:20:9: error: Could not find include file 'mlir/Dialect/LLVMIR/BasicPtxBuilderInterface.td'
include "mlir/Dialect/LLVMIR/BasicPtxBuilderInterface.td"
        ^
external/llvm-project/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td:20:9: error: Unexpected token at top level
include "mlir/Dialect/LLVMIR/BasicPtxBuilderInterface.td"
        ^

The bazel fixes followed in a subsequent commit at llvm/llvm-project@28b27c1. This PR bumps LLVM by a few more commits to include those fixes, which restores Torch-MLIR's bazel build to 🟢.
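For anyone repeating this kind of bump, one quick sanity check is to confirm that the fix commit is actually contained in the commit being pinned. Below is a minimal sketch, assuming a local llvm-project git checkout. The helper name `pin_contains_fix` and the `externals/llvm-project` path mentioned afterwards are illustrative assumptions, not something this PR adds.

```shell
# Hypothetical helper (not part of this PR).
# Succeeds (exit 0) when the fix commit is an ancestor of the pinned commit.
#   $1: path to a git checkout of llvm-project (assumed to exist locally)
#   $2: commit carrying the fix, e.g. 28b27c1
#   $3: commit the downstream project pins, e.g. HEAD of the checkout
pin_contains_fix() {
  repo="$1"
  fix="$2"
  pin="$3"
  # `git merge-base --is-ancestor A B` exits 0 if A is reachable from B.
  git -C "$repo" merge-base --is-ancestor "$fix" "$pin"
}
```

With LLVM checked out at `externals/llvm-project` (an assumed path), `pin_contains_fix externals/llvm-project 28b27c1 HEAD` would exit 0 only when the checked-out commit includes the bazel fixes.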

GHA workflow to test bazel build: https://github.com/sjain-stanford/torch-mlir/actions/runs/6555101471/job/17803082508

stellaraccident

This week was a mess of conflicting upstream patches. Sorry...

sjain-stanford · 2023-10-18T01:50:04Z

> This week was a mess of conflicting upstream patches. Sorry...

No worries at all! It is unavoidable, though, with bazel builds not being merge-gating upstream 😅. Thanks for the quick ✅.

…ync (#11)

## Why

When bumping LLVM up, it is crucial to be able to test all downstream repos depending on it to ensure they work **in tandem** (and not just in isolation). In the past, LLVM upgrades were simpler because torch-mlir took a hard dependency on mhlo/stablehlo and, in doing so, ensured that the llvm "green commit" (sha1) that torch-mlir and stablehlo were built+tested against was pre-identified. During this time mlir-tcp was developed on a branch of torch-mlir. This meant when upgrades were needed downstream, we’d simply point to torch-mlir@HEAD (sha4) and pick the llvm-project (sha1) and mhlo/stablehlo (sha3) hashes it’d refer to, since these are already tested to work together. This became our set of green commits (llvm@sha1, stablehlo@sha3, torch-mlir@sha4) for downstream integrations (e.g. cruise monorepo).

<img width="500" alt="image" src="https://github.com/cruise-automation/mlir-tcp/assets/19234106/42078522-466c-449f-8d7e-496facc1447c">

At present the situation is complicated because torch-mlir no longer takes a hard dependency on stablehlo (stablehlo e2e tests [disabled](llvm/torch-mlir#2460)). Here are details from a recent upgrade scenario that motivated this RFC. We picked torch-mlir@HEAD which was right after the llvm bump in llvm/torch-mlir#2511 pointing to llvm/llvm-project@b44b349, but soon realized (when we started building torch-mlir) that the llvm bazel build upstream was broken:

```
ERROR: /root/.cache/bazel/_bazel_root/b89349c08f7224396763d14fe35cba11/external/llvm-project/mlir/BUILD.bazel:5837:18: TdGenerate external/llvm-project/mlir/include/mlir/Dialect/LLVMIR/NVVMOpsInterface.h.inc failed: (Exit 1): mlir-tblgen failed: error executing command ...
external/llvm-project/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td:20:9: error: Could not find include file 'mlir/Dialect/LLVMIR/BasicPtxBuilderInterface.td'
include "mlir/Dialect/LLVMIR/BasicPtxBuilderInterface.td"
        ^
```

The bazel fixes followed in a subsequent commit at llvm/llvm-project@28b27c1. Hence llvm had to be re-bumped in torch-mlir (llvm/torch-mlir#2517). However, after a bit more work we hit these failing stablehlo tests, which surfaced the fact that stablehlo pointed to by torch-mlir could no longer be used, and we had to separately identify the sha3 of stablehlo that would build cleanly against sha1 of llvm.

```
@stablehlo//stablehlo/conversions/tosa/tests:binary.mlir.test FAILED in 0.7s
@stablehlo//stablehlo/tests:print_stablehlo.mlir.test FAILED in 4.7s
```

This meant the burden of identifying the llvm green commit (that works across the board) is shifted further downstream from torch-mlir. Incidentally we are in a great position to leverage mlir-tcp to identify the set of green commits, given it already directly depends on each of these repos.

<img width="500" alt="image" src="https://github.com/cruise-automation/mlir-tcp/assets/19234106/cadd38c4-71ec-45b0-8888-85ac0bfd4e99">

## What

This PR is an attempt to leverage the mlir-tcp repo as our "proxy" for such downstream integrations, and _I think_ contains everything needed to be able to do that.

## How

Specifically, we should now be able to run these from the comfort of `mlir-tcp`:

```shell
bazel test --config=clang_linux @llvm-project//mlir/...
bazel test --config=clang_linux @stablehlo//...
bazel test --config=clang_linux @torch-mlir//...
```

We provide `local_repos.bzl` that allows easier local testing of patches that later need to be upstreamed, and while they're being upstreamed we could land them as patches to our `http_archive` targets.

Note: I include a `stablehlo.patch` that allows testing stablehlo from `mlir-tcp`. This is temporary and can be removed once openxla/stablehlo#1810 lands.

This PR also enables each of the 3p test suites as GHA workflows (non-merge gating for now, we can change this). These workflows are automatically skipped unless a change is made to `deps.bzl` (which usually means bumping 3p deps), as it would be unnecessary to run them for every PR and `main` commit post-merge. Here's a snapshot from this PR's workflows, having bumped stablehlo commit.

<img width="747" alt="image" src="https://github.com/cruise-automation/mlir-tcp/assets/19234106/e535ed39-33f7-4941-958c-3a5d0c0adef6">
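The "skip unless `deps.bzl` changed" rule described above can be approximated locally. This is only a sketch: the commit message does not show how the workflows implement the check, and the function names `touches_deps` and `third_party_test_decision` are hypothetical. It also assumes that any changed path whose file name is exactly `deps.bzl` should count.

```shell
# Hypothetical helpers (the actual GHA workflow logic is not shown in the source).

# Reads changed file paths on stdin, one per line.
# Succeeds if any path is deps.bzl, at the repo root or in a subdirectory.
touches_deps() {
  grep -qE '(^|/)deps\.bzl$'
}

# Prints "run" when the third-party suites should run, "skip" otherwise.
third_party_test_decision() {
  if touches_deps; then
    echo "run"
  else
    echo "skip"
  fi
}
```

One way to feed it is from a branch diff, for example `git diff --name-only origin/main...HEAD | third_party_test_decision`, and then run the three `bazel test` commands only when it prints `run`.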

bump llvm to 28b27c1 to pick bazel fixes

63d9976

stellaraccident marked this pull request as ready for review October 18, 2023 01:35

stellaraccident approved these changes Oct 18, 2023


sjain-stanford merged commit 52abae1 into llvm:main Oct 18, 2023
5 checks passed

sjain-stanford deleted the sambhav/llvm_bump branch October 18, 2023 05:00

sjain-stanford mentioned this pull request Oct 19, 2023

[RFC + PR] Use TCP for {LLVM / Torch-MLIR / StableHLO} Green Commit Sync cruise-automation/mlir-tcp#11

Merged

hamptonm1 mentioned this pull request Nov 9, 2023

Initial changes for llvm uplift onnx/onnx-mlir#2568

Merged
