Consider producing SARIF output? #5995
FollowTheProcess started this conversation in Ideas
Replies: 2 comments
-
We have no plans to add such capabilities. However, pyright can output its diagnostic results as JSON, and it would be easy to write a separate tool that transforms this output into whatever other format you need. You're welcome to develop and publish such a tool.
-
Here's an initial attempt at such a tool if anyone else is interested:

"""Convert pyright JSON output to SARIF format"""

import json
import re
import sys
from pathlib import Path

# Matches the capital letters that start a new word in a camelCase rule name
parse_camel_case = re.compile("((?<=[a-z0-9])[A-Z]|(?!^)[A-Z](?=[a-z]))")

# pyright severity -> SARIF result level
LEVEL_TO_SEVERITY_MAPPING = {
    "error": "error",
    "warning": "warning",
    "information": "note",
}

if __name__ == "__main__":
    # Read input from stdin (pyright JSON output should be piped in like
    # `pyright --outputjson | python convert.py`)
    input_data = sys.stdin.read()
    pyright_data = json.loads(input_data)

    # Build lists of rules and artifacts from the pyright 'generalDiagnostics' section.
    # The intermediate dicts deduplicate by rule name and by file path.
    rules = [
        *{
            diagnostic.get("rule", "none"): {
                "id": diagnostic.get("rule", "none"),
                "shortDescription": {
                    "text": parse_camel_case.sub(r" \1", diagnostic.get("rule", "")).title()
                    if diagnostic.get("rule", "none") != "none"
                    else "No Rule Associated With This Error"
                },
            }
            for diagnostic in pyright_data["generalDiagnostics"]
        }.values()
    ]
    rules_index = {rule["id"]: index for index, rule in enumerate(rules)}

    artifacts = [
        *{
            diagnostic["file"]: {"location": {"uri": Path(diagnostic["file"]).as_uri()}}
            for diagnostic in pyright_data["generalDiagnostics"]
        }.values()
    ]
    artifacts_index = {
        artifact["location"]["uri"]: index for index, artifact in enumerate(artifacts)
    }

    # Build the SARIF data structure
    sarif_data = {
        "version": "2.1.0",
        "$schema": "http://json.schemastore.org/sarif-2.1.0-rtm.4",
        "runs": [
            {
                "tool": {
                    "driver": {
                        "name": "pyright",
                        "informationUri": "https://microsoft.github.io/pyright/#/",
                        "rules": rules,
                    }
                },
                "artifacts": artifacts,
                "results": [
                    {
                        "level": LEVEL_TO_SEVERITY_MAPPING[diagnostic["severity"]],
                        "message": {"text": diagnostic["message"]},
                        "locations": [
                            {
                                "physicalLocation": {
                                    "artifactLocation": {
                                        "uri": Path(diagnostic["file"]).as_uri(),
                                        "index": artifacts_index[Path(diagnostic["file"]).as_uri()],
                                    },
                                    # pyright line/character values are zero-based,
                                    # SARIF regions are one-based, hence the + 1
                                    "region": {
                                        "startLine": diagnostic["range"]["start"]["line"] + 1,
                                        "startColumn": diagnostic["range"]["start"]["character"] + 1,
                                        "endLine": diagnostic["range"]["end"]["line"] + 1,
                                        "endColumn": diagnostic["range"]["end"]["character"] + 1,
                                    },
                                }
                            }
                        ],
                        "ruleId": diagnostic.get("rule", "none"),
                        "ruleIndex": rules_index[diagnostic.get("rule", "none")],
                    }
                    for diagnostic in pyright_data["generalDiagnostics"]
                ],
            }
        ],
    }

    print(json.dumps(sarif_data, indent=2))

It expects pyright's JSON output to be piped in on stdin, as described in the comment at the top of the script.
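One detail worth checking when adapting a converter like this: pyright's JSON reports zero-based line and character offsets, while SARIF regions use one-based lines and columns. Below is a minimal sketch of that conversion pulled out into a small function. The helper name `to_sarif_region` and the sample range are made up for illustration and are not part of pyright or of the script above.

```python
def to_sarif_region(pyright_range: dict) -> dict:
    """Convert a pyright range (zero-based) into a SARIF region (one-based).

    Hypothetical helper for illustration only.
    """
    start, end = pyright_range["start"], pyright_range["end"]
    return {
        "startLine": start["line"] + 1,
        "startColumn": start["character"] + 1,
        "endLine": end["line"] + 1,
        # Both formats treat the end column as exclusive, so + 1 is enough here
        "endColumn": end["character"] + 1,
    }


# Made-up range: first line of a file, characters 4 up to (not including) 9
sample_range = {
    "start": {"line": 0, "character": 4},
    "end": {"line": 0, "character": 9},
}
region = to_sarif_region(sample_range)
print(region)
```

Keeping the offset handling in one place makes it easy to test separately from the rest of the SARIF structure.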
-
Hello 👋🏻
Just wondering if there are any plans or possibilities for pyright to output a SARIF file? I've been using a few tools that do output SARIF, and I have to say the consistency and the ability to aggregate multiple tools' outputs into consistent reports is incredibly useful.
I've been working on a GitHub Action that takes SARIF files and turns them into GitHub Annotations, so warnings from various static analysis tools can be displayed in a more native, user-friendly UI on a PR. It would be great if pyright could do this too!
I know the option is there to output JSON, and the structure of that JSON is very nice and could easily be parsed into GitHub Annotations, but it would be great if more tools could standardise on SARIF, maybe with pyright leading by example!
This is definitely a feature I'd love to see on the roadmap. Curious to hear others' thoughts too 🙂
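To illustrate the "could easily be parsed into GitHub Annotations" point, here is a rough sketch that turns a single pyright diagnostic into a GitHub Actions workflow command (`::error`, `::warning` or `::notice`). The function names and the sample diagnostic are made up for illustration. The sketch assumes the diagnostic has the `file`, `severity`, `message`, `range` and optional `rule` fields found in pyright's `generalDiagnostics` entries, and it assumes the reported path is one that GitHub can match to a file in the checkout.

```python
# Map pyright severities to GitHub workflow command names
SEVERITY_TO_COMMAND = {"error": "error", "warning": "warning", "information": "notice"}


def escape_data(value: str) -> str:
    """Escape the message part of a workflow command."""
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(value: str) -> str:
    """Escape a property value (file, title) of a workflow command."""
    return escape_data(value).replace(":", "%3A").replace(",", "%2C")


def to_annotation(diagnostic: dict) -> str:
    """Format one pyright diagnostic as a workflow command line (hypothetical helper)."""
    start = diagnostic["range"]["start"]
    command = SEVERITY_TO_COMMAND.get(diagnostic["severity"], "notice")
    # pyright positions are zero-based; annotation lines and columns start at 1
    properties = (
        f"file={escape_property(diagnostic['file'])},"
        f"line={start['line'] + 1},col={start['character'] + 1}"
    )
    if "rule" in diagnostic:
        properties += f",title={escape_property(diagnostic['rule'])}"
    return f"::{command} {properties}::{escape_data(diagnostic['message'])}"


# Made-up diagnostic in the shape of a pyright 'generalDiagnostics' entry
sample = {
    "file": "/repo/app.py",
    "severity": "error",
    "message": 'Import "foo" could not be resolved',
    "range": {"start": {"line": 2, "character": 7}, "end": {"line": 2, "character": 10}},
    "rule": "reportMissingImports",
}
print(to_annotation(sample))
```

Printing one such line per diagnostic from a workflow step is enough for the annotations to show up, though a shared format like SARIF avoids writing this kind of glue for every tool.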