Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close all tags when rendering rustdoc summaries #79749

Closed
wants to merge 1 commit into from

Conversation

camelid
Copy link
Member

@camelid camelid commented Dec 5, 2020

Otherwise we may generate HTML like <em>hello<b> if the text after the
<b> tag is too long. Now it should correctly generate
<em>hello<b></b></em>. It's perhaps not "ideal" that we generate an
empty <b> tag, but it's at least valid HTML.

It isn't a huge deal that we didn't close all the tags because the
browser is probably smart enough to figure out where the tags should
end, but it's still good to fix :)

r? @GuillaumeGomez
cc @jyn514

Otherwise we may generate HTML like `<em>hello<b>` if the text after the
`<b>` tag is too long. Now it should correctly generate
`<em>hello<b></b></em>`. It's perhaps not "ideal" that we generate an
empty `<b>` tag, but it's at least valid HTML.

It isn't a *huge* deal that we didn't close all the tags because the
browser is probably smart enough to figure out where the tags should
end, but it's still good to fix :)
@camelid camelid added A-markdown-parsing Area: Markdown parsing for doc-comments C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Dec 5, 2020
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 5, 2020
@camelid
Copy link
Member Author

camelid commented Dec 5, 2020

(This needs a test, so the PR is draft.)

@GuillaumeGomez
Copy link
Member

I'm not sure to understand how this code is fixing it You seem to still add text even after you've reached the length limit. Wouldn't it be simpler to keep track of the opened tags instead (using a vec or a queue)? And then, once you've reach the length limit, you close all elements.

Also yes, I think it's mostly useless because web browsers handle that perfectly since a long time already and it'll increase the search-index size once again but not a strong opinion.

@camelid
Copy link
Member Author

camelid commented Dec 7, 2020

You seem to still add text even after you've reached the length limit.

Hmm, what makes you think that? The first match arm matches text or code, and will break the loop if stopped_early is true.

Wouldn't it be simpler to keep track of the opened tags instead (using a vec or a queue)? And then, once you've reach the length limit, you close all elements.

That seems like a viable approach. It would probably make the control-flow simpler. I'll try that.

@camelid camelid added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 19, 2021
@camelid
Copy link
Member Author

camelid commented Feb 1, 2021

This is my alternative approach, based on what you suggested:

diff --git a/src/librustdoc/html/markdown.rs b/src/librustdoc/html/markdown.rs
index 0e4c5410abe..d08ffde201f 100644
--- a/src/librustdoc/html/markdown.rs
+++ b/src/librustdoc/html/markdown.rs
@@ -1051,12 +1051,24 @@ fn markdown_summary_with_limit(md: &str, length_limit: usize) -> (String, bool)
     let mut s = String::with_capacity(md.len() * 3 / 2);
     let mut text_length = 0;
     let mut stopped_early = false;
+    let mut unclosed_tags = SmallVec::<[&str; 4]>::new();
 
-    fn push(s: &mut String, text_length: &mut usize, text: &str) {
+    fn push_text(s: &mut String, text_length: &mut usize, text: &str) {
         s.push_str(text);
         *text_length += text.len();
     };
 
+    fn push_open_tag(s: &mut String, open_tag: &str, close_tag: &str) {
+        s.push_str(open_tag);
+        unclosed_tags.push(close_tag);
+    }
+
+    fn push_close_tag(s: &mut String, expected_close_tag: &str) {
+        let close_tag = unclosed_tags.pop().expect("no unclosed tags left");
+        assert_eq!(close_tag, expected_close_tag);
+        s.push_str(close_tag);
+    }
+
     'outer: for event in Parser::new_ext(md, Options::ENABLE_STRIKETHROUGH) {
         match &event {
             Event::Text(text) => {
@@ -1066,7 +1078,7 @@ fn push(s: &mut String, text_length: &mut usize, text: &str) {
                         break 'outer;
                     }
 
-                    push(&mut s, &mut text_length, word);
+                    push_text(&mut s, &mut text_length, word);
                 }
             }
             Event::Code(code) => {
@@ -1076,18 +1088,18 @@ fn push(s: &mut String, text_length: &mut usize, text: &str) {
                 }
 
                 s.push_str("<code>");
-                push(&mut s, &mut text_length, code);
+                push_text(&mut s, &mut text_length, code);
                 s.push_str("</code>");
             }
             Event::Start(tag) => match tag {
-                Tag::Emphasis => s.push_str("<em>"),
-                Tag::Strong => s.push_str("<strong>"),
+                Tag::Emphasis => push_open_tag(&mut s, "<em>", "</em>"),
+                Tag::Strong => push_open_tag(&mut s, "<strong>", "</strong>"),
                 Tag::CodeBlock(..) => break,
                 _ => {}
             },
             Event::End(tag) => match tag {
-                Tag::Emphasis => s.push_str("</em>"),
-                Tag::Strong => s.push_str("</strong>"),
+                Tag::Emphasis => push_close_tag(&mut s, "</em>"),
+                Tag::Strong => push_close_tag(&mut s, "</strong>"),
                 Tag::Paragraph => break,
                 _ => {}
             },
@@ -1097,7 +1109,7 @@ fn push(s: &mut String, text_length: &mut usize, text: &str) {
                     break;
                 }
 
-                push(&mut s, &mut text_length, " ");
+                push_text(&mut s, &mut text_length, " ");
             }
             _ => {}
         }

I worry that it's bug-prone; note the expect and assert_eq!. Am I missing an obvious way to make this less bug-prone? What do you think of this approach?

@camelid camelid added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 1, 2021
@GuillaumeGomez
Copy link
Member

There shouldn't be a need for the close_tags vec since the open_tags is a queue. ;)

@camelid
Copy link
Member Author

camelid commented Feb 1, 2021

There shouldn't be a need for the close_tags vec since the open_tags is a queue. ;)

What close_tags vec and open_tags queue are you talking about? There's only one, unclosed_tags, which is a vec (and thus a stack).

@GuillaumeGomez
Copy link
Member

I was talking about

+    fn push_open_tag(s: &mut String, open_tag: &str, close_tag: &str) {
+        s.push_str(open_tag);
+        unclosed_tags.push(close_tag);
+    }

And you're right, there is indeed only one vec, my bad. I went through the diff too quickly.

@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 15, 2021
@jyn514
Copy link
Member

jyn514 commented Aug 16, 2021

I'm going to close this as inactive; feel free to reopen when you get time to work on it.

@jyn514 jyn514 closed this Aug 16, 2021
@jyn514 jyn514 added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 16, 2021
@camelid
Copy link
Member Author

camelid commented Aug 17, 2021

I'm going to close this as inactive; feel free to reopen when you get time to work on it.

Yeah, I have a new approach to this that I'll probably open a PR for soon.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Aug 29, 2021
…mit, r=GuillaumeGomez

Refactor Markdown length-limited summary implementation

This PR is a new approach to rust-lang#79749.

This PR refactors the implementation of `markdown_summary_with_limit()`,
separating the logic of determining when the limit has been reached from
the actual rendering process.

The main advantage of the new approach is that it guarantees that all
HTML tags are closed, whereas the previous implementation could generate
tags that were never closed. It also ensures that no empty tags are
generated (e.g., `<em></em>`).

The new implementation consists of a general-purpose struct
`HtmlWithLimit` that manages the length-limiting logic and a function
`markdown_summary_with_limit()` that renders Markdown to HTML using the
struct.

r? `@GuillaumeGomez`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-markdown-parsing Area: Markdown parsing for doc-comments C-bug Category: This is a bug. S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants