-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Workflow] Add new code format helper. #66684
Conversation
This helper will format python files with black/darker and C/C++ files with clang-format. The format helper is written so that we can expand it with new formatters in the future like clang-tidy. Demo of the output is here: #8
@llvm/pr-subscribers-github-workflow ChangesThis helper will format python files with black/darker and The format helper is written so that we can expand it with new Demo of the output is here: Full diff: https://github.com/llvm/llvm-project/pull/66684.diff 5 Files Affected:
diff --git a/.github/workflows/pr-code-format.yml b/.github/workflows/pr-code-format.yml
new file mode 100644
index 000000000000000..102e1a263b15a88
--- /dev/null
+++ b/.github/workflows/pr-code-format.yml
@@ -0,0 +1,54 @@
+name: "Check code formatting"
+on: pull_request
+permissions:
+ pull-requests: write
+
+jobs:
+ code_formatter:
+ runs-on: ubuntu-latest
+ steps:
+ - name: Fetch LLVM sources
+ uses: actions/checkout@v4
+ with:
+ persist-credentials: false
+ fetch-depth: 2
+
+ - name: Get changed files
+ id: changed-files
+ uses: tj-actions/changed-files@v39
+ with:
+ separator: ","
+
+ - name: "Listed files"
+ run: |
+ echo "Formatting files:"
+ echo "${{ steps.changed-files.outputs.all_changed_files }}"
+
+ - name: Install clang-format
+ uses: aminya/setup-cpp@v1
+ with:
+ clangformat: 16.0.6
+
+ - name: Setup Python env
+ uses: actions/setup-python@v4
+ with:
+ python-version: '3.11'
+ cache: 'pip'
+ cache-dependency-path: 'llvm/utils/git/requirements_formatting.txt'
+
+ - name: Install python dependencies
+ run: pip install -r llvm/utils/git/requirements_formatting.txt
+
+ - name: Run code formatter
+ env:
+ GITHUB_PR_NUMBER: ${{ github.event.pull_request.number }}
+ START_REV: ${{ github.event.pull_request.base.sha }}
+ END_REV: ${{ github.event.pull_request.head.sha }}
+ CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
+ run: |
+ python llvm/utils/git/code-format-helper.py \
+ --token ${{ secrets.GITHUB_TOKEN }} \
+ --issue-number $GITHUB_PR_NUMBER \
+ --start-rev $START_REV \
+ --end-rev $END_REV \
+ --changed-files "$CHANGED_FILES"
diff --git a/.github/workflows/pr-python-format.yml b/.github/workflows/pr-python-format.yml
deleted file mode 100644
index c6122958826545c..000000000000000
--- a/.github/workflows/pr-python-format.yml
+++ /dev/null
@@ -1,39 +0,0 @@
-name: "Check Python Formatting"
-on:
- pull_request:
- # run on .py
- paths:
- - '**.py'
-
-jobs:
- python_formatting:
- runs-on: ubuntu-latest
- steps:
- - name: Fetch LLVM sources
- uses: actions/checkout@v4
- with:
- persist-credentials: false
- fetch-depth: 2
-
- - name: Get changed files
- id: changed-files
- uses: tj-actions/changed-files@v39
- with:
- files: '**/*.py'
-
- - name: "Listed files"
- run: |
- echo "Formatting files:"
- echo "${{ steps.changed-files.outputs.all_changed_files }}"
-
- - name: Setup Python env
- uses: actions/setup-python@v4
- with:
- python-version: '3.11'
-
- - name: Python Formatting
- uses: akaihola/[email protected]
- with:
- options: "--check --diff --color"
- version: "~=1.7.2"
- src: "${{ steps.changed-files.outputs.all_changed_files }}"
diff --git a/llvm/utils/git/code-format-helper.py b/llvm/utils/git/code-format-helper.py
new file mode 100644
index 000000000000000..07f4d78496e1b64
--- /dev/null
+++ b/llvm/utils/git/code-format-helper.py
@@ -0,0 +1,236 @@
+#!/usr/bin/env python3
+#
+# ====- code-format-helper, runs code formatters from the ci --*- python -*--==#
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ==-------------------------------------------------------------------------==#
+
+import argparse
+import os
+import subprocess
+import sys
+from functools import cached_property
+
+import github
+from github import IssueComment, PullRequest
+
+
+class FormatHelper:
+ COMMENT_TAG = "<!--LLVM CODE FORMAT COMMENT: {fmt}-->"
+ name = "unknown"
+
+ @property
+ def comment_tag(self) -> str:
+ return self.COMMENT_TAG.replace("fmt", self.name)
+
+ def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
+ pass
+
+ def pr_comment_text(self, diff: str) -> str:
+ return f"""
+{self.comment_tag}
+
+:warning: {self.friendly_name}, {self.name} found issues in your code. :warning:
+
+<details>
+<summary>
+You can test this locally with the following command:
+</summary>
+
+``````````bash
+{self.instructions}
+``````````
+
+</details>
+
+<details>
+<summary>
+View the diff from {self.name} here.
+</summary>
+
+``````````diff
+{diff}
+``````````
+
+</details>
+"""
+
+ def find_comment(
+ self, pr: PullRequest.PullRequest
+ ) -> IssueComment.IssueComment | None:
+ for comment in pr.as_issue().get_comments():
+ if self.comment_tag in comment.body:
+ return comment
+ return None
+
+ def update_pr(self, diff: str, args: argparse.Namespace):
+ repo = github.Github(args.token).get_repo(args.repo)
+ pr = repo.get_issue(args.issue_number).as_pull_request()
+
+ existing_comment = self.find_comment(pr)
+ pr_text = self.pr_comment_text(diff)
+
+ if existing_comment:
+ existing_comment.edit(pr_text)
+ else:
+ pr.as_issue().create_comment(pr_text)
+
+ def update_pr_success(self, args: argparse.Namespace):
+ repo = github.Github(args.token).get_repo(args.repo)
+ pr = repo.get_issue(args.issue_number).as_pull_request()
+
+ existing_comment = self.find_comment(pr)
+ if existing_comment:
+ existing_comment.edit(
+ f"""
+{self.comment_tag}
+:white_check_mark: With the latest revision this PR passed the {self.friendly_name}.
+"""
+ )
+
+ def run(self, changed_files: [str], args: argparse.Namespace):
+ diff = self.format_run(changed_files, args)
+ if diff:
+ self.update_pr(diff, args)
+ return False
+ else:
+ self.update_pr_success(args)
+ return True
+
+
+class ClangFormatHelper(FormatHelper):
+ name = "clang-format"
+ friendly_name = "C/C++ code formatter"
+
+ @property
+ def instructions(self):
+ return " ".join(self.cf_cmd)
+
+ @cached_property
+ def libcxx_excluded_files(self):
+ with open("libcxx/utils/data/ignore_format.txt", "r") as ifd:
+ return [excl.strip() for excl in ifd.readlines()]
+
+ def should_be_excluded(self, path: str) -> bool:
+ for excl in self.libcxx_excluded_files:
+ if path == excl:
+ print(f"Excluding file {path}")
+ return True
+ return False
+
+ def filter_changed_files(self, changed_files: [str]) -> [str]:
+ filtered_files = []
+ for path in changed_files:
+ _, ext = os.path.splitext(path)
+ if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx"):
+ if not self.should_be_excluded(path):
+ filtered_files.append(path)
+ return filtered_files
+
+ def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
+ self.args = args
+ cpp_files = self.filter_changed_files(changed_files)
+ if not cpp_files:
+ return
+ cf_cmd = [
+ "git-clang-format",
+ "--diff",
+ args.start_rev,
+ args.end_rev,
+ "--",
+ ] + cpp_files
+ print(f"Running: {' '.join(cf_cmd)}")
+ self.cf_cmd = cf_cmd
+ proc = subprocess.run(cf_cmd, capture_output=True)
+
+ # formatting needed
+ if proc.returncode == 1:
+ return proc.stdout.decode("utf-8")
+
+ return None
+
+
+class DarkerFormatHelper(FormatHelper):
+ name = "darker"
+ friendly_name = "Python code formatter"
+
+ @property
+ def instructions(self):
+ return " ".join(self.darker_cmd)
+
+ def filter_changed_files(self, changed_files: [str]) -> [str]:
+ filtered_files = []
+ for path in changed_files:
+ name, ext = os.path.splitext(path)
+ if ext == ".py":
+ filtered_files.append(path)
+
+ return filtered_files
+
+ def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
+ self.args = args
+ py_files = self.filter_changed_files(changed_files)
+ if not py_files:
+ return
+ darker_cmd = [
+ "darker",
+ "--check",
+ "--diff",
+ "-r",
+ f"{args.start_rev}..{args.end_rev}",
+ ] + py_files
+ print(f"Running: {' '.join(darker_cmd)}")
+ self.darker_cmd = darker_cmd
+ proc = subprocess.run(darker_cmd, capture_output=True)
+
+ # formatting needed
+ if proc.returncode == 1:
+ return proc.stdout.decode("utf-8")
+
+ return None
+
+
+ALL_FORMATTERS = (DarkerFormatHelper(), ClangFormatHelper())
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--token", type=str, required=True, help="GitHub authentiation token"
+ )
+ parser.add_argument(
+ "--repo",
+ type=str,
+ default=os.getenv("GITHUB_REPOSITORY", "llvm/llvm-project"),
+ help="The GitHub repository that we are working with in the form of <owner>/<repo> (e.g. llvm/llvm-project)",
+ )
+ parser.add_argument("--issue-number", type=int, required=True)
+ parser.add_argument(
+ "--start-rev",
+ type=str,
+ required=True,
+ help="Compute changes from this revision.",
+ )
+ parser.add_argument(
+ "--end-rev", type=str, required=True, help="Compute changes to this revision"
+ )
+ parser.add_argument(
+ "--changed-files",
+ type=str,
+ help="Comma separated list of files that has been changed",
+ )
+
+ args = parser.parse_args()
+
+ changed_files = []
+ if args.changed_files:
+ changed_files = args.changed_files.split(",")
+
+ exit_code = 0
+ for fmt in ALL_FORMATTERS:
+ if not fmt.run(changed_files, args):
+ exit_code = 1
+
+ sys.exit(exit_code)
diff --git a/llvm/utils/git/requirements_formatting.txt b/llvm/utils/git/requirements_formatting.txt
new file mode 100644
index 000000000000000..ff744f0d4225f59
--- /dev/null
+++ b/llvm/utils/git/requirements_formatting.txt
@@ -0,0 +1,52 @@
+#
+# This file is autogenerated by pip-compile with Python 3.11
+# by the following command:
+#
+# pip-compile --output-file=llvm/utils/git/requirements_formatting.txt llvm/utils/git/requirements_formatting.txt.in
+#
+black==23.9.1
+ # via
+ # -r llvm/utils/git/requirements_formatting.txt.in
+ # darker
+certifi==2023.7.22
+ # via requests
+cffi==1.15.1
+ # via
+ # cryptography
+ # pynacl
+charset-normalizer==3.2.0
+ # via requests
+click==8.1.7
+ # via black
+cryptography==41.0.3
+ # via pyjwt
+darker==1.7.2
+ # via -r llvm/utils/git/requirements_formatting.txt.in
+deprecated==1.2.14
+ # via pygithub
+idna==3.4
+ # via requests
+mypy-extensions==1.0.0
+ # via black
+packaging==23.1
+ # via black
+pathspec==0.11.2
+ # via black
+platformdirs==3.10.0
+ # via black
+pycparser==2.21
+ # via cffi
+pygithub==1.59.1
+ # via -r llvm/utils/git/requirements_formatting.txt.in
+pyjwt[crypto]==2.8.0
+ # via pygithub
+pynacl==1.5.0
+ # via pygithub
+requests==2.31.0
+ # via pygithub
+toml==0.10.2
+ # via darker
+urllib3==2.0.4
+ # via requests
+wrapt==1.15.0
+ # via deprecated
diff --git a/llvm/utils/git/requirements_formatting.txt.in b/llvm/utils/git/requirements_formatting.txt.in
new file mode 100644
index 000000000000000..6782f248bccb80d
--- /dev/null
+++ b/llvm/utils/git/requirements_formatting.txt.in
@@ -0,0 +1,3 @@
+black~=23.0
+darker==1.7.2
+PyGithub==1.59.1
\ No newline at end of file
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
My personal preference would be to have these changes added as code suggestions to the review, so the author could automatically apply the fixes just by clicking on them, but that can be done as a follow on patch. Did you look into doing this at all? If so, do you have some documentation you could share in case someone else wants to try to implement it.
|
||
@cached_property | ||
def libcxx_excluded_files(self): | ||
with open("libcxx/utils/data/ignore_format.txt", "r") as ifd: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have in tree config files that tell clang-format what to ignore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked something similar in discord the other day: https://discord.com/channels/636084430946959380/636732894974312448/1152588493902589962
it's not well documented. But apparently you can add a comment with a specific syntax: https://github.com/orgs/community/discussions/24848#discussioncomment-3245615 You would have to parse the diff somehow, match it in the diff and make the comments. Seems like a lot of work. This tool is supposed to be able to do that, but I haven't tested it and I didn't want to add any dependencies when it was not clear that we actually wanted suggestions at all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this, thanks for setting that up! I am a tiny bit concerned about the comments being extra verbose, but I guess we can start that way and adjust as needed.
I'd like @mordante and @philnik777 to take a look in case I'd have missed something but this LGTM for libc++.
A notable behavior change is that we'll now start enforcing that any modified file in libcxx/test
is clang-formatted
, but I think that's OK. We could handle that by making our ignore_format.txt
file contain more entries if needed.
filtered_files = [] | ||
for path in changed_files: | ||
_, ext = os.path.splitext(path) | ||
if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Libc++ has a bunch of headers without any extension (e.g. libcxx/include/algorithm
), will those be handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but this method can be expanded to include all files in that subdir. Happy to do so if you can outline the rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need rules at all? ie, clang-format should already handle that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rule is just "all files inside libcxx/include
".
I know it's a real pain, but we don't have a choice here because the name of these headers is mandated by the Standard. Sometimes I wish the Standard had used an extension like everybody else, but they don't and it creates these small (but annoying) complexities.
If that's a problem I suggest doing what clang is doing for their tests: https://github.com/llvm/llvm-project/blob/main/clang/test/.clang-format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too.
@@ -0,0 +1,3 @@ | |||
black~=23.0 | |||
darker==1.7.2 | |||
PyGithub==1.59.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing new lines
- name: Install clang-format | ||
uses: aminya/setup-cpp@v1 | ||
with: | ||
clangformat: 16.0.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to give ourselves more control by installing from https://apt.llvm.org/ ?
Although we probably don't want to use an experimental build so as long as the action keeps up, this is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually uses apt.llvm.org already which is why I thought it was fine :)
filtered_files = [] | ||
for path in changed_files: | ||
_, ext = os.path.splitext(path) | ||
if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need rules at all? ie, clang-format should already handle that.
llvm/utils/git/code-format-helper.py
Outdated
return filtered_files | ||
|
||
def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None: | ||
self.args = args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that ever used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope it's not. Will fix!
|
||
@property | ||
def instructions(self): | ||
return " ".join(self.cf_cmd) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that's good as a good iteration but i think as a user I want to know how to fix it rather than how to
run the bot locally.
Ie I guess for clang-format I want the command without the --diff
" ".join(self.cf_cmd.remove("--diff"))
is probably workable for clang-format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that make sense. Let me just remove the diff then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually - I tried this - and it becomes a bit complicated, if you want to pass two SHA's as we do you also have to pass --diff
So, if we want to reformat the local files we need more instructions to ensure that you are in the right branch on the right commit.
Are you ok with landing this as it is right now and we can iterate on the instructions later or make sure we link to external documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, of course!
Actually, i could not find docs to use git-clang-format in LLVM so if we don't have that maybe we can improve that too later
Co-authored-by: philnik777 <[email protected]>
Not sure why I can't reply directly to the comment about the filter: We need the filter for libc++ at least, so the code could either just filter out libc++ file and accept everything else, but then we can't control what extensions we should format. I kind of like that it's explicit now. |
7660b8e
to
2e14811
Compare
This helper will format python files with black/darker and C/C++ files with clang-format. The format helper is written so that we can expand it with new formatters in the future like clang-tidy.
This helper will format python files with black/darker and C/C++ files with clang-format. The format helper is written so that we can expand it with new formatters in the future like clang-tidy.
My PR fails in this new CI target because |
No that's me testing some other fixes. It should be fixed now. But you might have to rebase your change on main to get the latest fix. |
OK, cool, thanks! Seems to be working now :) |
This helper will format python files with black/darker and
C/C++ files with clang-format.
The format helper is written so that we can expand it with new
formatters in the future like clang-tidy.
Demo of the output is here:
tru#8