Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document comment representation is inefficient #65714

Closed
nnethercote opened this issue Oct 23, 2019 · 3 comments
Closed

Document comment representation is inefficient #65714

nnethercote opened this issue Oct 23, 2019 · 3 comments

Comments

@nnethercote
Copy link
Contributor

nnethercote commented Oct 23, 2019

Imagine a doc comment like this:

/// aaa
/// `bbb`
/// ccc

This gets treated like this:

#[doc=" aaa"]
#[doc=" `bbb`"]
#[doc=" ccc"]

And each line gets represented with a TokenStream containing two elements: a Eq token and a Lit token. These end up in metadata, and the decoding cost is quite high. Because doc comments can be long, spanning many lines, it would be nice to join them together -- much like rustdoc's collapse-span pass does, but earlier -- to reduce the metadata decoding cost.

I wrote an experimental, not-quite-right patch that condensed to the following in the parser:

#[doc="/// aaa\n/// `bbb`\n/// ccc"]

It failed lots of rustdoc tests, but the perf results were great:

ssue-46449-check
        avg: -16.1%     min: -16.1%     max: -16.1%
helloworld-check
        avg: -12.6%     min: -12.6%     max: -12.6%
unify-linearly-check
        avg: -10.9%     min: -10.9%     max: -10.9%
deeply-nested-check
        avg: -8.7%      min: -8.7%      max: -8.7%
await-call-tree-check
        avg: -8.1%      min: -8.1%      max: -8.1%
regression-31157-check
        avg: -4.0%      min: -4.0%      max: -4.0%
ripgrep-check
        avg: -2.7%      min: -2.7%      max: -2.7%
encoding-check
        avg: -2.4%      min: -2.4%      max: -2.4%
regex-check
        avg: -2.0%      min: -2.0%      max: -2.0%
syn-check
        avg: -1.6%      min: -1.6%      max: -1.6%
futures-check
        avg: -1.5%      min: -1.5%      max: -1.5%
clap-rs-check
        avg: -0.9%      min: -0.9%      max: -0.9%
piston-image-check
        avg: -0.9%      min: -0.9%      max: -0.9%
coercions-check
        avg: -0.8%?     min: -0.8%?     max: -0.8%?
webrender-check
        avg: -0.6%      min: -0.6%      max: -0.6%
token-stream-stress-check
        avg: -0.6%      min: -0.6%      max: -0.6%
html5ever-check
        avg: -0.6%      min: -0.6%      max: -0.6%
inflate-check
        avg: -0.4%      min: -0.4%      max: -0.4%
cargo-check
        avg: -0.4%      min: -0.4%      max: -0.4%
cranelift-codegen-check
        avg: -0.4%      min: -0.4%      max: -0.4%
serde-check
        avg: -0.3%      min: -0.3%      max: -0.3%
wg-grammar-check
        avg: -0.3%      min: -0.3%      max: -0.3%
deep-vector-check
        avg: -0.2%      min: -0.2%      max: -0.2%

I then modified it to be more correct, like this:

#[doc="/// aaa\n `bbb`\n ccc"]

But then I get build failures -- the doc string appears to be tokenized somewhere, and it complains about the backticks not being valid tokens ("error: unknown start of token: `") due to the lack of /// after the first newline. Anyone know where that tokenization might arise from?

So, there is definitely room for improvement in the representation of doc comments, but some care will be needed to keep things working. I'd love to hear suggestions on the right way to do this.

cc @rust-lang/wg-compiler-performance

@nnethercote
Copy link
Contributor Author

Hmm, it's slightly different to how I described it above. In the current code, the /// at the start of each line is actually preserved with the doc attribute, and is only stripped when it is converted into a DocFragment::SugaredDoc by rustdoc::clean::Attributes::from_ast().

So perhaps I can join the lines like I did, keeping the /// markers in there, and then change Attributes::from_ast() to strip /// markers from the middle of the string as well as the beginning.

@nnethercote
Copy link
Contributor Author

I guess a relevant question is whether doc comments need to be put into metadata?

@petrochenkov
Copy link
Contributor

Duplicate of #60936 and #60935.
cc #60930

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants