-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stardoc: auto-dedent rule doc and attribute docs #13403
Conversation
f58ba31
to
bf72961
Compare
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
FYI @c-parsons could use someone's help to review/merge :) |
Makes them suitable for markdown output. Fixes bazelbuild#13402
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR!
Main question is why we need a new dedentDocstring()
method - can we use the existing DocstringParser
for dedenting non-function docstrings?
Secondary comment - whatever we do to fix rule docstrings, we ought to also do for providers and aspects; so I think we'd want to dedent in FakeStarlarkRuleFunctionsApi.asProviderFieldInfo()
, FakeStarlarkRuleFunctionsApi.forProviderInfo()
, FakeStarlarkRuleFunctionsApi.aspect()
.
Yeah I considered changing the existing DocstringParser class, but of course it has a bunch of fields which are specific to docs with structured sections like It's also somewhat risky to make deep changes to existing code, but the test coverage here is pretty good so I think the main issue is just author/reviewer time. |
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
Friendly ping to all. |
My question for @tetromino is whether you have time to review a big refactoring here and if that's what you want, before I spend time digging in to how is could be done. |
src/tools/starlark/java/com/google/devtools/starlark/common/DocstringUtils.java
Show resolved
Hide resolved
@tetromino I think the ball is still in your court, I don't want to invest in a refactoring here that doesn't get merged |
@tetromino Friendly ping |
Hello @tetromino, Any update on this PR? Please. Thanks! |
🤷🏻♂️ |
@alexeagle Can you rebase to fix the merge conflict? |
@tetromino will be back in office next month. We can discuss broader questions of docstring formatting then. In the meantime I'll leave this PR open. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize for dropping the ball here and not reviewing this sooner. I somehow operated under the impression that this code was only needed for the legacy stardoc extractor - but obviously both extractors need something like this logic.
So overall I agree with the proposed change, but there are a number of corner case problems (see my comments). I suggest limiting this PR to just adding DocstringUtils.dedentDocstring() and a set of java tests for it to make sure the corner cases are addressed.
As a side effect, limiting the PR in this way would automatically eliminate merge conflicts :)
Then as a followup, we can apply dedentation to everywhere it needs to be applied to (both in the old and new doc extractors, for all documentable entities).
if (firstLine) { | ||
description.append(bufLine); | ||
firstLine = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why never dedent the first line? Note that you do use the first line for calculating indentation
if the first line's indentation level is non-zero.
description.append(bufLine); | ||
firstLine = false; | ||
} else if (bufLine.trim().isEmpty()) { | ||
description.append("\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning doc
unchanged on line 110 implies that all-whitespace lines become empty lines (even if they contain whitespace beyond indentation
) if any dedentation is done, but remain unchanged when not dedenting - which is unintuitive behavior.
Would it make more sense to instead dedent all-whitespace lines by up to the same number of spaces as any other line?
do { | ||
endOfLineOffset = doc.indexOf("\n", lineOffset); | ||
String line = endOfLineOffset < 0 ? doc.substring(lineOffset) : doc.substring(lineOffset, endOfLineOffset + 1); | ||
boolean allWhitespace = line.trim().isEmpty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line.trim()
removes tabs (as well as all other characters below 0x20, e.g. windows carriage returns), not just spaces. DocstringParser.getIndentation()
counts only spaces.
Thanks for the review! I'm not sure when I'll get time to return to this, but I'll try to get someone to carry it to the finish. |
I have an in-progress change (depends on other changes which have not been submitted yet) which ought to fix trimming/dedenting everywhere (rules, aspects, providers, repo rules, module extensions, and attributes of any of above). |
That's awesome, I've been sad this one is languishing. Thanks for the update! |
For reference, it's part of https://bazel-review.googlesource.com/c/bazel/+/225435 (unfortunately, Gerrit squashed other in-progress changes into it). |
Documentation processors often require doc strings to be dedented. For example, Markdown interprets indented blocks as code. This means before handing doc strings to any form of API doc processor (and ideally - when storing the doc string), we want to dedent them to a minimal indentation level and trim blank lines. In the Python world, the standard algorithm for doing so is specified in PEP-257. Until now, we dedented multiline function doc strings in DocstringUtils using an algorithm which differed from PEP-257 in a number of corner cases. Meanwhile, all other docs (doc strings for rules, attributes, providers etc.) were not trimmed/dedented at all, despite often containing multiple lines. To fix, we introduce Starlark.trimDocString and use it comprehensively on all doc strings. Whenever possible, we store the docstring in its trimmed form. The one exception is function docstrings, because they are stored at parse time, not eval time; we have to trim them in the accessor method. This change allows us to massively simplify the DocstringUtils parser, since it no longer needs to mix low-level string munging for dedenting/trimming with the task of finding args, returns, and deprecation stanzas. A more comprehensive alternative to #13403 Fixes #13402 PiperOrigin-RevId: 552859517 Change-Id: I225f064c7b38f2fdbf78242d5b4597ec545518d4
Makes them suitable for markdown output.
Fixes #13402