Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Workflow] Add new code format helper. #66684

Merged
merged 3 commits into from
Sep 20, 2023
Merged

[Workflow] Add new code format helper. #66684

merged 3 commits into from
Sep 20, 2023

Conversation

tru
Copy link
Collaborator

@tru tru commented Sep 18, 2023

This helper will format python files with black/darker and
C/C++ files with clang-format.

The format helper is written so that we can expand it with new
formatters in the future like clang-tidy.

Demo of the output is here:

tru#8

This helper will format python files with black/darker and
C/C++ files with clang-format.

The format helper is written so that we can expand it with new
formatters in the future like clang-tidy.

Demo of the output is here:

#8
@llvmbot
Copy link
Member

llvmbot commented Sep 18, 2023

@llvm/pr-subscribers-github-workflow

Changes

This helper will format python files with black/darker and
C/C++ files with clang-format.

The format helper is written so that we can expand it with new
formatters in the future like clang-tidy.

Demo of the output is here:

tru#8


Full diff: https://github.com/llvm/llvm-project/pull/66684.diff

5 Files Affected:

  • (added) .github/workflows/pr-code-format.yml (+54)
  • (removed) .github/workflows/pr-python-format.yml (-39)
  • (added) llvm/utils/git/code-format-helper.py (+236)
  • (added) llvm/utils/git/requirements_formatting.txt (+52)
  • (added) llvm/utils/git/requirements_formatting.txt.in (+3)
diff --git a/.github/workflows/pr-code-format.yml b/.github/workflows/pr-code-format.yml
new file mode 100644
index 000000000000000..102e1a263b15a88
--- /dev/null
+++ b/.github/workflows/pr-code-format.yml
@@ -0,0 +1,54 @@
+name: "Check code formatting"
+on: pull_request
+permissions:
+  pull-requests: write
+
+jobs:
+  code_formatter:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Fetch LLVM sources
+        uses: actions/checkout@v4
+        with:
+          persist-credentials: false
+          fetch-depth: 2
+
+      - name: Get changed files
+        id: changed-files
+        uses: tj-actions/changed-files@v39
+        with:
+          separator: ","
+
+      - name: "Listed files"
+        run: |
+          echo "Formatting files:"
+          echo "${{ steps.changed-files.outputs.all_changed_files }}"
+
+      - name: Install clang-format
+        uses: aminya/setup-cpp@v1
+        with:
+          clangformat: 16.0.6
+
+      - name: Setup Python env
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.11'
+          cache: 'pip'
+          cache-dependency-path: 'llvm/utils/git/requirements_formatting.txt'
+
+      - name: Install python dependencies
+        run: pip install -r llvm/utils/git/requirements_formatting.txt
+
+      - name: Run code formatter
+        env:
+          GITHUB_PR_NUMBER: ${{ github.event.pull_request.number }}
+          START_REV: ${{ github.event.pull_request.base.sha }}
+          END_REV: ${{ github.event.pull_request.head.sha }}
+          CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
+        run: |
+          python llvm/utils/git/code-format-helper.py \
+            --token ${{ secrets.GITHUB_TOKEN }} \
+            --issue-number $GITHUB_PR_NUMBER \
+            --start-rev $START_REV \
+            --end-rev $END_REV \
+            --changed-files "$CHANGED_FILES"
diff --git a/.github/workflows/pr-python-format.yml b/.github/workflows/pr-python-format.yml
deleted file mode 100644
index c6122958826545c..000000000000000
--- a/.github/workflows/pr-python-format.yml
+++ /dev/null
@@ -1,39 +0,0 @@
-name: "Check Python Formatting"
-on:
-  pull_request:
-    # run on .py
-    paths:
-      - '**.py'
-
-jobs:
-  python_formatting:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Fetch LLVM sources
-        uses: actions/checkout@v4
-        with:
-          persist-credentials: false
-          fetch-depth: 2
-
-      - name: Get changed files
-        id: changed-files
-        uses: tj-actions/changed-files@v39
-        with:
-          files: '**/*.py'
-
-      - name: "Listed files"
-        run: |
-          echo "Formatting files:"
-          echo "${{ steps.changed-files.outputs.all_changed_files }}"
-
-      - name: Setup Python env
-        uses: actions/setup-python@v4
-        with:
-          python-version: '3.11'
-
-      - name: Python Formatting
-        uses: akaihola/[email protected]
-        with:
-          options: "--check --diff --color"
-          version: "~=1.7.2"
-          src: "${{ steps.changed-files.outputs.all_changed_files }}"
diff --git a/llvm/utils/git/code-format-helper.py b/llvm/utils/git/code-format-helper.py
new file mode 100644
index 000000000000000..07f4d78496e1b64
--- /dev/null
+++ b/llvm/utils/git/code-format-helper.py
@@ -0,0 +1,236 @@
+#!/usr/bin/env python3
+#
+# ====- code-format-helper, runs code formatters from the ci --*- python -*--==#
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ==-------------------------------------------------------------------------==#
+
+import argparse
+import os
+import subprocess
+import sys
+from functools import cached_property
+
+import github
+from github import IssueComment, PullRequest
+
+
+class FormatHelper:
+    COMMENT_TAG = "<!--LLVM CODE FORMAT COMMENT: {fmt}-->"
+    name = "unknown"
+
+    @property
+    def comment_tag(self) -> str:
+        return self.COMMENT_TAG.replace("fmt", self.name)
+
+    def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
+        pass
+
+    def pr_comment_text(self, diff: str) -> str:
+        return f"""
+{self.comment_tag}
+
+:warning: {self.friendly_name}, {self.name} found issues in your code. :warning:
+
+<details>
+<summary>
+You can test this locally with the following command:
+</summary>
+
+``````````bash
+{self.instructions}
+``````````
+
+</details>
+
+<details>
+<summary>
+View the diff from {self.name} here.
+</summary>
+
+``````````diff
+{diff}
+``````````
+
+</details>
+"""
+
+    def find_comment(
+        self, pr: PullRequest.PullRequest
+    ) -> IssueComment.IssueComment | None:
+        for comment in pr.as_issue().get_comments():
+            if self.comment_tag in comment.body:
+                return comment
+        return None
+
+    def update_pr(self, diff: str, args: argparse.Namespace):
+        repo = github.Github(args.token).get_repo(args.repo)
+        pr = repo.get_issue(args.issue_number).as_pull_request()
+
+        existing_comment = self.find_comment(pr)
+        pr_text = self.pr_comment_text(diff)
+
+        if existing_comment:
+            existing_comment.edit(pr_text)
+        else:
+            pr.as_issue().create_comment(pr_text)
+
+    def update_pr_success(self, args: argparse.Namespace):
+        repo = github.Github(args.token).get_repo(args.repo)
+        pr = repo.get_issue(args.issue_number).as_pull_request()
+
+        existing_comment = self.find_comment(pr)
+        if existing_comment:
+            existing_comment.edit(
+                f"""
+{self.comment_tag}
+:white_check_mark: With the latest revision this PR passed the {self.friendly_name}.
+"""
+            )
+
+    def run(self, changed_files: [str], args: argparse.Namespace):
+        diff = self.format_run(changed_files, args)
+        if diff:
+            self.update_pr(diff, args)
+            return False
+        else:
+            self.update_pr_success(args)
+            return True
+
+
+class ClangFormatHelper(FormatHelper):
+    name = "clang-format"
+    friendly_name = "C/C++ code formatter"
+
+    @property
+    def instructions(self):
+        return " ".join(self.cf_cmd)
+
+    @cached_property
+    def libcxx_excluded_files(self):
+        with open("libcxx/utils/data/ignore_format.txt", "r") as ifd:
+            return [excl.strip() for excl in ifd.readlines()]
+
+    def should_be_excluded(self, path: str) -> bool:
+        for excl in self.libcxx_excluded_files:
+            if path == excl:
+                print(f"Excluding file {path}")
+                return True
+        return False
+
+    def filter_changed_files(self, changed_files: [str]) -> [str]:
+        filtered_files = []
+        for path in changed_files:
+            _, ext = os.path.splitext(path)
+            if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx"):
+                if not self.should_be_excluded(path):
+                    filtered_files.append(path)
+        return filtered_files
+
+    def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
+        self.args = args
+        cpp_files = self.filter_changed_files(changed_files)
+        if not cpp_files:
+            return
+        cf_cmd = [
+            "git-clang-format",
+            "--diff",
+            args.start_rev,
+            args.end_rev,
+            "--",
+        ] + cpp_files
+        print(f"Running: {' '.join(cf_cmd)}")
+        self.cf_cmd = cf_cmd
+        proc = subprocess.run(cf_cmd, capture_output=True)
+
+        # formatting needed
+        if proc.returncode == 1:
+            return proc.stdout.decode("utf-8")
+
+        return None
+
+
+class DarkerFormatHelper(FormatHelper):
+    name = "darker"
+    friendly_name = "Python code formatter"
+
+    @property
+    def instructions(self):
+        return " ".join(self.darker_cmd)
+
+    def filter_changed_files(self, changed_files: [str]) -> [str]:
+        filtered_files = []
+        for path in changed_files:
+            name, ext = os.path.splitext(path)
+            if ext == ".py":
+                filtered_files.append(path)
+
+        return filtered_files
+
+    def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
+        self.args = args
+        py_files = self.filter_changed_files(changed_files)
+        if not py_files:
+            return
+        darker_cmd = [
+            "darker",
+            "--check",
+            "--diff",
+            "-r",
+            f"{args.start_rev}..{args.end_rev}",
+        ] + py_files
+        print(f"Running: {' '.join(darker_cmd)}")
+        self.darker_cmd = darker_cmd
+        proc = subprocess.run(darker_cmd, capture_output=True)
+
+        # formatting needed
+        if proc.returncode == 1:
+            return proc.stdout.decode("utf-8")
+
+        return None
+
+
+ALL_FORMATTERS = (DarkerFormatHelper(), ClangFormatHelper())
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        "--token", type=str, required=True, help="GitHub authentiation token"
+    )
+    parser.add_argument(
+        "--repo",
+        type=str,
+        default=os.getenv("GITHUB_REPOSITORY", "llvm/llvm-project"),
+        help="The GitHub repository that we are working with in the form of <owner>/<repo> (e.g. llvm/llvm-project)",
+    )
+    parser.add_argument("--issue-number", type=int, required=True)
+    parser.add_argument(
+        "--start-rev",
+        type=str,
+        required=True,
+        help="Compute changes from this revision.",
+    )
+    parser.add_argument(
+        "--end-rev", type=str, required=True, help="Compute changes to this revision"
+    )
+    parser.add_argument(
+        "--changed-files",
+        type=str,
+        help="Comma separated list of files that has been changed",
+    )
+
+    args = parser.parse_args()
+
+    changed_files = []
+    if args.changed_files:
+        changed_files = args.changed_files.split(",")
+
+    exit_code = 0
+    for fmt in ALL_FORMATTERS:
+        if not fmt.run(changed_files, args):
+            exit_code = 1
+
+    sys.exit(exit_code)
diff --git a/llvm/utils/git/requirements_formatting.txt b/llvm/utils/git/requirements_formatting.txt
new file mode 100644
index 000000000000000..ff744f0d4225f59
--- /dev/null
+++ b/llvm/utils/git/requirements_formatting.txt
@@ -0,0 +1,52 @@
+#
+# This file is autogenerated by pip-compile with Python 3.11
+# by the following command:
+#
+#    pip-compile --output-file=llvm/utils/git/requirements_formatting.txt llvm/utils/git/requirements_formatting.txt.in
+#
+black==23.9.1
+    # via
+    #   -r llvm/utils/git/requirements_formatting.txt.in
+    #   darker
+certifi==2023.7.22
+    # via requests
+cffi==1.15.1
+    # via
+    #   cryptography
+    #   pynacl
+charset-normalizer==3.2.0
+    # via requests
+click==8.1.7
+    # via black
+cryptography==41.0.3
+    # via pyjwt
+darker==1.7.2
+    # via -r llvm/utils/git/requirements_formatting.txt.in
+deprecated==1.2.14
+    # via pygithub
+idna==3.4
+    # via requests
+mypy-extensions==1.0.0
+    # via black
+packaging==23.1
+    # via black
+pathspec==0.11.2
+    # via black
+platformdirs==3.10.0
+    # via black
+pycparser==2.21
+    # via cffi
+pygithub==1.59.1
+    # via -r llvm/utils/git/requirements_formatting.txt.in
+pyjwt[crypto]==2.8.0
+    # via pygithub
+pynacl==1.5.0
+    # via pygithub
+requests==2.31.0
+    # via pygithub
+toml==0.10.2
+    # via darker
+urllib3==2.0.4
+    # via requests
+wrapt==1.15.0
+    # via deprecated
diff --git a/llvm/utils/git/requirements_formatting.txt.in b/llvm/utils/git/requirements_formatting.txt.in
new file mode 100644
index 000000000000000..6782f248bccb80d
--- /dev/null
+++ b/llvm/utils/git/requirements_formatting.txt.in
@@ -0,0 +1,3 @@
+black~=23.0
+darker==1.7.2
+PyGithub==1.59.1
\ No newline at end of file

Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

My personal preference would be to have these changes added as code suggestions to the review, so the author could automatically apply the fixes just by clicking on them, but that can be done as a follow on patch. Did you look into doing this at all? If so, do you have some documentation you could share in case someone else wants to try to implement it.


@cached_property
def libcxx_excluded_files(self):
with open("libcxx/utils/data/ignore_format.txt", "r") as ifd:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have in tree config files that tell clang-format what to ignore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tru
Copy link
Collaborator Author

tru commented Sep 18, 2023

Did you look into doing this at all? If so, do you have some documentation you could share in case someone else wants to try to implement it

it's not well documented. But apparently you can add a comment with a specific syntax:

https://github.com/orgs/community/discussions/24848#discussioncomment-3245615

You would have to parse the diff somehow, match it in the diff and make the comments. Seems like a lot of work.

This tool is supposed to be able to do that, but I haven't tested it and I didn't want to add any dependencies when it was not clear that we actually wanted suggestions at all.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this, thanks for setting that up! I am a tiny bit concerned about the comments being extra verbose, but I guess we can start that way and adjust as needed.

I'd like @mordante and @philnik777 to take a look in case I'd have missed something but this LGTM for libc++.

A notable behavior change is that we'll now start enforcing that any modified file in libcxx/test is clang-formatted, but I think that's OK. We could handle that by making our ignore_format.txt file contain more entries if needed.

filtered_files = []
for path in changed_files:
_, ext = os.path.splitext(path)
if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libc++ has a bunch of headers without any extension (e.g. libcxx/include/algorithm), will those be handled?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but this method can be expanded to include all files in that subdir. Happy to do so if you can outline the rules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need rules at all? ie, clang-format should already handle that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule is just "all files inside libcxx/include".

I know it's a real pain, but we don't have a choice here because the name of these headers is mandated by the Standard. Sometimes I wish the Standard had used an extension like everybody else, but they don't and it creates these small (but annoying) complexities.

@tru
Copy link
Collaborator Author

tru commented Sep 18, 2023

I really like this, thanks for setting that up! I am a tiny bit concerned about the comments being extra verbose, but I guess we can start that way and adjust as needed.

I'd like @mordante and @philnik777 to take a look in case I'd have missed something but this LGTM for libc++.

A notable behavior change is that we'll now start enforcing that any modified file in libcxx/test is clang-formatted, but I think that's OK. We could handle that by making our ignore_format.txt file contain more entries if needed.

If that's a problem I suggest doing what clang is doing for their tests: https://github.com/llvm/llvm-project/blob/main/clang/test/.clang-format

Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too.

llvm/utils/git/code-format-helper.py Outdated Show resolved Hide resolved
@@ -0,0 +1,3 @@
black~=23.0
darker==1.7.2
PyGithub==1.59.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing new lines

Comment on lines +27 to +30
- name: Install clang-format
uses: aminya/setup-cpp@v1
with:
clangformat: 16.0.6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to give ourselves more control by installing from https://apt.llvm.org/ ?
Although we probably don't want to use an experimental build so as long as the action keeps up, this is fine

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually uses apt.llvm.org already which is why I thought it was fine :)

filtered_files = []
for path in changed_files:
_, ext = os.path.splitext(path)
if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need rules at all? ie, clang-format should already handle that.

return filtered_files

def format_run(self, changed_files: [str], args: argparse.Namespace) -> str | None:
self.args = args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that ever used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope it's not. Will fix!

Comment on lines +107 to +111

@property
def instructions(self):
return " ".join(self.cf_cmd)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that's good as a good iteration but i think as a user I want to know how to fix it rather than how to
run the bot locally.
Ie I guess for clang-format I want the command without the --diff
" ".join(self.cf_cmd.remove("--diff")) is probably workable for clang-format

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that make sense. Let me just remove the diff then.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually - I tried this - and it becomes a bit complicated, if you want to pass two SHA's as we do you also have to pass --diff

So, if we want to reformat the local files we need more instructions to ensure that you are in the right branch on the right commit.

Are you ok with landing this as it is right now and we can iterate on the instructions later or make sure we link to external documentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, of course!
Actually, i could not find docs to use git-clang-format in LLVM so if we don't have that maybe we can improve that too later

@tru
Copy link
Collaborator Author

tru commented Sep 19, 2023

Not sure why I can't reply directly to the comment about the filter:

We need the filter for libc++ at least, so the code could either just filter out libc++ file and accept everything else, but then we can't control what extensions we should format. I kind of like that it's explicit now.

@tru tru force-pushed the tru/code_formatting branch from 7660b8e to 2e14811 Compare September 19, 2023 06:45
@tru tru merged commit da94bf0 into llvm:main Sep 20, 2023
@tru tru deleted the tru/code_formatting branch September 20, 2023 06:52
tru added a commit that referenced this pull request Sep 20, 2023
tru added a commit to tru/llvm-project that referenced this pull request Sep 22, 2023
This helper will format python files with black/darker and
C/C++ files with clang-format.

The format helper is written so that we can expand it with new
formatters in the future like clang-tidy.
tru added a commit to tru/llvm-project that referenced this pull request Sep 22, 2023
This helper will format python files with black/darker and
C/C++ files with clang-format.

The format helper is written so that we can expand it with new
formatters in the future like clang-tidy.
@ingomueller-net
Copy link
Contributor

My PR fails in this new CI target because llvm/utils/git/requirements_formatting.txt cannot be found. See output here. I have rebased just now. Am I doing something wrong?

@tru
Copy link
Collaborator Author

tru commented Sep 23, 2023

My PR fails in this new CI target because llvm/utils/git/requirements_formatting.txt cannot be found. See output here. I have rebased just now. Am I doing something wrong?

No that's me testing some other fixes. It should be fixed now. But you might have to rebase your change on main to get the latest fix.

@ingomueller-net
Copy link
Contributor

OK, cool, thanks! Seems to be working now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

7 participants