Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stardoc: auto-dedent rule doc and attribute docs #13403

Closed
wants to merge 1 commit into from

Conversation

alexeagle
Copy link
Contributor

Makes them suitable for markdown output.

Fixes #13402

@google-cla google-cla bot added the cla: yes label Apr 25, 2021
@alexeagle alexeagle force-pushed the i13402 branch 2 times, most recently from f58ba31 to bf72961 Compare April 25, 2021 19:25
alexeagle added a commit to aspect-forks/stardoc that referenced this pull request Apr 25, 2021
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
alexeagle added a commit to aspect-forks/stardoc that referenced this pull request Apr 25, 2021
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
alexeagle added a commit to aspect-forks/stardoc that referenced this pull request Apr 25, 2021
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
@alexeagle
Copy link
Contributor Author

FYI @c-parsons could use someone's help to review/merge :)

@aiuto aiuto requested a review from tetromino April 26, 2021 15:35
@aiuto aiuto added the team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols label Apr 26, 2021
Makes them suitable for markdown output.

Fixes bazelbuild#13402
alexeagle added a commit to aspect-forks/stardoc that referenced this pull request Apr 28, 2021
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
Copy link
Contributor

@tetromino tetromino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

Main question is why we need a new dedentDocstring() method - can we use the existing DocstringParser for dedenting non-function docstrings?

Secondary comment - whatever we do to fix rule docstrings, we ought to also do for providers and aspects; so I think we'd want to dedent in FakeStarlarkRuleFunctionsApi.asProviderFieldInfo(), FakeStarlarkRuleFunctionsApi.forProviderInfo(), FakeStarlarkRuleFunctionsApi.aspect().

@alexeagle
Copy link
Contributor Author

Yeah I considered changing the existing DocstringParser class, but of course it has a bunch of fields which are specific to docs with structured sections like Args: and therefore it's probably a bigger change than this. More work to write and more work to review - and given how PRs often don't find anyone with time to review/merge them I didn't want to do all that work up front.

It's also somewhat risky to make deep changes to existing code, but the test coverage here is pretty good so I think the main issue is just author/reviewer time.

alexeagle added a commit to aspect-forks/stardoc that referenced this pull request May 14, 2021
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
alexeagle added a commit to aspect-forks/stardoc that referenced this pull request May 14, 2021
Ran bazel build src/main/java/com/google/devtools/build/skydoc:skydoc_deploy.jar && cp bazel-bin/src/main/java/com/google/devtools/build/skydoc/skydoc_deploy.jar ../stardoc/stardoc/stardoc_binary.jar
@brandjon brandjon added team-Build-Language and removed team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols labels Jun 3, 2021
@aiuto
Copy link
Contributor

aiuto commented Jun 24, 2021

Friendly ping to all.

@alexeagle
Copy link
Contributor Author

My question for @tetromino is whether you have time to review a big refactoring here and if that's what you want, before I spend time digging in to how is could be done.

@alexeagle
Copy link
Contributor Author

@tetromino I think the ball is still in your court, I don't want to invest in a refactoring here that doesn't get merged

@aiuto
Copy link
Contributor

aiuto commented Oct 7, 2021

@tetromino Friendly ping

@sgowroji sgowroji added the awaiting-review PR is awaiting review from an assigned reviewer label Apr 21, 2022
@sgowroji
Copy link
Member

Hello @tetromino, Any update on this PR? Please. Thanks!

@alexeagle
Copy link
Contributor Author

🤷🏻‍♂️

@aiuto
Copy link
Contributor

aiuto commented Jun 3, 2022

@alexeagle Can you rebase to fix the merge conflict?
@brandjon Since Alex is out for a few months, can you take a look?

@aiuto aiuto requested a review from brandjon June 3, 2022 02:28
@brandjon brandjon added team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols and removed team-Build-Language labels Nov 2, 2022
@brandjon
Copy link
Member

@tetromino will be back in office next month. We can discuss broader questions of docstring formatting then. In the meantime I'll leave this PR open.

@brandjon brandjon removed the awaiting-review PR is awaiting review from an assigned reviewer label Dec 19, 2022
Copy link
Contributor

@tetromino tetromino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for dropping the ball here and not reviewing this sooner. I somehow operated under the impression that this code was only needed for the legacy stardoc extractor - but obviously both extractors need something like this logic.

So overall I agree with the proposed change, but there are a number of corner case problems (see my comments). I suggest limiting this PR to just adding DocstringUtils.dedentDocstring() and a set of java tests for it to make sure the corner cases are addressed.

As a side effect, limiting the PR in this way would automatically eliminate merge conflicts :)

Then as a followup, we can apply dedentation to everywhere it needs to be applied to (both in the old and new doc extractors, for all documentable entities).

Comment on lines +116 to +118
if (firstLine) {
description.append(bufLine);
firstLine = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why never dedent the first line? Note that you do use the first line for calculating indentation if the first line's indentation level is non-zero.

description.append(bufLine);
firstLine = false;
} else if (bufLine.trim().isEmpty()) {
description.append("\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning doc unchanged on line 110 implies that all-whitespace lines become empty lines (even if they contain whitespace beyond indentation) if any dedentation is done, but remain unchanged when not dedenting - which is unintuitive behavior.

Would it make more sense to instead dedent all-whitespace lines by up to the same number of spaces as any other line?

do {
endOfLineOffset = doc.indexOf("\n", lineOffset);
String line = endOfLineOffset < 0 ? doc.substring(lineOffset) : doc.substring(lineOffset, endOfLineOffset + 1);
boolean allWhitespace = line.trim().isEmpty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line.trim() removes tabs (as well as all other characters below 0x20, e.g. windows carriage returns), not just spaces. DocstringParser.getIndentation() counts only spaces.

@alexeagle
Copy link
Contributor Author

Thanks for the review! I'm not sure when I'll get time to return to this, but I'll try to get someone to carry it to the finish.

@tetromino
Copy link
Contributor

tetromino commented Jul 26, 2023

I have an in-progress change (depends on other changes which have not been submitted yet) which ought to fix trimming/dedenting everywhere (rules, aspects, providers, repo rules, module extensions, and attributes of any of above).

@alexeagle
Copy link
Contributor Author

That's awesome, I've been sad this one is languishing. Thanks for the update!

@tetromino
Copy link
Contributor

For reference, it's part of https://bazel-review.googlesource.com/c/bazel/+/225435 (unfortunately, Gerrit squashed other in-progress changes into it).

@alexeagle alexeagle closed this Jul 27, 2023
copybara-service bot pushed a commit that referenced this pull request Aug 1, 2023
Documentation processors often require doc strings to be dedented. For example,
Markdown interprets indented blocks as code. This means before handing doc
strings to any form of API doc processor (and ideally - when storing the doc
string), we want to dedent them to a minimal indentation level and trim blank
lines.

In the Python world, the standard algorithm for doing so is specified in
PEP-257.

Until now, we dedented multiline function doc strings in DocstringUtils
using an algorithm which differed from PEP-257 in a number of corner cases.
Meanwhile, all other docs (doc strings for rules, attributes, providers etc.)
were not trimmed/dedented at all, despite often containing multiple lines.

To fix, we introduce Starlark.trimDocString and use it comprehensively on
all doc strings. Whenever possible, we store the docstring in its trimmed
form. The one exception is function docstrings, because they are stored at
parse time, not eval time; we have to trim them in the accessor method.

This change allows us to massively simplify the DocstringUtils parser, since
it no longer needs to mix low-level string munging for dedenting/trimming
with the task of finding args, returns, and deprecation stanzas.

A more comprehensive alternative to #13403
Fixes #13402

PiperOrigin-RevId: 552859517
Change-Id: I225f064c7b38f2fdbf78242d5b4597ec545518d4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Typical formatting of rule docstring is unusable for Markdown output
5 participants