Close all tags when rendering rustdoc summaries #79749
Otherwise we may generate HTML like `<em>hello<b>` if the text after the `<b>` tag is too long. Now it should correctly generate `<em>hello<b></b></em>`. It's perhaps not "ideal" that we generate an empty `<b>` tag, but it's at least valid HTML. It isn't a *huge* deal that we didn't close all the tags because the browser is probably smart enough to figure out where the tags should end, but it's still good to fix :)
(This needs a test, so the PR is a draft.)
I'm not sure I understand how this code fixes it. You seem to still add text even after you've reached the length limit. Wouldn't it be simpler to keep track of the opened tags instead (using a vec or a queue)? Then, once you've reached the length limit, you close all elements. Also, yes, I think it's mostly unnecessary, because web browsers have handled this perfectly for a long time and it will increase the search-index size once again, but I don't have a strong opinion.
Hmm, what makes you think that? The first match arm matches text or code, and will
That seems like a viable approach. It would probably make the control flow simpler. I'll try that.
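For readers following along, the suggestion to track opened tags and close them once the limit is hit can be sketched roughly as below. This is a minimal, standalone illustration and not the rustdoc code: the `Event` enum and `render_with_limit` function are hypothetical stand-ins for the Markdown parser's events and for `markdown_summary_with_limit()`, and the text is not HTML-escaped.

```rust
/// Hypothetical, simplified stand-in for a Markdown parser's event stream.
enum Event<'a> {
    /// An opening tag, identified by its name (e.g. "em").
    Open(&'static str),
    /// Closes the most recently opened tag.
    Close,
    /// Plain text that counts toward the length limit.
    Text(&'a str),
}

/// Renders `events` to HTML, stopping before the first text chunk that would
/// push the total text length (in bytes) past `limit`, and then closing every
/// tag that is still open.
fn render_with_limit(events: &[Event<'_>], limit: usize) -> String {
    let mut out = String::new();
    // Stack of tags that have been opened but not yet closed.
    let mut open_tags: Vec<&'static str> = Vec::new();
    let mut text_len = 0;

    for event in events {
        match *event {
            Event::Open(tag) => {
                out.push_str(&format!("<{}>", tag));
                open_tags.push(tag);
            }
            Event::Close => {
                if let Some(tag) = open_tags.pop() {
                    out.push_str(&format!("</{}>", tag));
                }
            }
            Event::Text(text) => {
                if text_len + text.len() > limit {
                    // Limit reached: stop rendering.
                    break;
                }
                out.push_str(text);
                text_len += text.len();
            }
        }
    }

    // Close whatever is still open, innermost tag first.
    while let Some(tag) = open_tags.pop() {
        out.push_str(&format!("</{}>", tag));
    }
    out
}
```

With a limit of 8 and the events `<em>`, `hello`, `<b>`, followed by a long text chunk, this sketch produces `<em>hello<b></b></em>`, which matches the output described in the PR description (including the empty `<b>` tag).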
This is my alternative approach, based on what you suggested:

diff --git a/src/librustdoc/html/markdown.rs b/src/librustdoc/html/markdown.rs
index 0e4c5410abe..d08ffde201f 100644
--- a/src/librustdoc/html/markdown.rs
+++ b/src/librustdoc/html/markdown.rs
@@ -1051,12 +1051,24 @@ fn markdown_summary_with_limit(md: &str, length_limit: usize) -> (String, bool)
let mut s = String::with_capacity(md.len() * 3 / 2);
let mut text_length = 0;
let mut stopped_early = false;
+ let mut unclosed_tags = SmallVec::<[&str; 4]>::new();
- fn push(s: &mut String, text_length: &mut usize, text: &str) {
+ fn push_text(s: &mut String, text_length: &mut usize, text: &str) {
s.push_str(text);
*text_length += text.len();
};
+ fn push_open_tag(s: &mut String, open_tag: &str, close_tag: &str) {
+ s.push_str(open_tag);
+ unclosed_tags.push(close_tag);
+ }
+
+ fn push_close_tag(s: &mut String, expected_close_tag: &str) {
+ let close_tag = unclosed_tags.pop().expect("no unclosed tags left");
+ assert_eq!(close_tag, expected_close_tag);
+ s.push_str(close_tag);
+ }
+
'outer: for event in Parser::new_ext(md, Options::ENABLE_STRIKETHROUGH) {
match &event {
Event::Text(text) => {
@@ -1066,7 +1078,7 @@ fn push(s: &mut String, text_length: &mut usize, text: &str) {
break 'outer;
}
- push(&mut s, &mut text_length, word);
+ push_text(&mut s, &mut text_length, word);
}
}
Event::Code(code) => {
@@ -1076,18 +1088,18 @@ fn push(s: &mut String, text_length: &mut usize, text: &str) {
}
s.push_str("<code>");
- push(&mut s, &mut text_length, code);
+ push_text(&mut s, &mut text_length, code);
s.push_str("</code>");
}
Event::Start(tag) => match tag {
- Tag::Emphasis => s.push_str("<em>"),
- Tag::Strong => s.push_str("<strong>"),
+ Tag::Emphasis => push_open_tag(&mut s, "<em>", "</em>"),
+ Tag::Strong => push_open_tag(&mut s, "<strong>", "</strong>"),
Tag::CodeBlock(..) => break,
_ => {}
},
Event::End(tag) => match tag {
- Tag::Emphasis => s.push_str("</em>"),
- Tag::Strong => s.push_str("</strong>"),
+ Tag::Emphasis => push_close_tag(&mut s, "</em>"),
+ Tag::Strong => push_close_tag(&mut s, "</strong>"),
Tag::Paragraph => break,
_ => {}
},
@@ -1097,7 +1109,7 @@ fn push(s: &mut String, text_length: &mut usize, text: &str) {
break;
}
- push(&mut s, &mut text_length, " ");
+ push_text(&mut s, &mut text_length, " ");
}
_ => {}
}

I worry that it's bug-prone; note the |
There shouldn't be a need for the |
What |
I was talking about
And you're right, there is indeed only one vec, my bad. I went through the diff too quickly.
I'm going to close this as inactive; feel free to reopen when you get time to work on it.
Yeah, I have a new approach to this that I'll probably open a PR for soon.
…mit, r=GuillaumeGomez

Refactor Markdown length-limited summary implementation

This PR is a new approach to rust-lang#79749.

This PR refactors the implementation of `markdown_summary_with_limit()`, separating the logic of determining when the limit has been reached from the actual rendering process.

The main advantage of the new approach is that it guarantees that all HTML tags are closed, whereas the previous implementation could generate tags that were never closed. It also ensures that no empty tags are generated (e.g., `<em></em>`).

The new implementation consists of a general-purpose struct `HtmlWithLimit` that manages the length-limiting logic and a function `markdown_summary_with_limit()` that renders Markdown to HTML using the struct.

r? `@GuillaumeGomez`
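As a rough illustration of how a struct like `HtmlWithLimit` could provide both guarantees (all tags closed, no empty tags), here is a minimal standalone sketch. Only the struct name comes from the description above; the fields, method names, and the idea of queueing opening tags until text is actually written are assumptions made for this example, not a copy of the rustdoc implementation. Text is not HTML-escaped here.

```rust
use std::ops::ControlFlow;

/// Sketch of a length-limited HTML buffer (field and method names are assumed).
struct HtmlWithLimit {
    buf: String,
    /// Bytes of text written so far (tags are not counted).
    len: usize,
    limit: usize,
    /// Tags requested but not yet written; they are only emitted once some
    /// text follows them, so no empty tag is ever produced.
    queued_tags: Vec<&'static str>,
    /// Tags that were written to `buf` and still need a closing tag.
    unclosed_tags: Vec<&'static str>,
}

impl HtmlWithLimit {
    fn new(limit: usize) -> Self {
        HtmlWithLimit {
            buf: String::new(),
            len: 0,
            limit,
            queued_tags: Vec::new(),
            unclosed_tags: Vec::new(),
        }
    }

    /// Appends `text` if it fits; otherwise leaves the buffer untouched and
    /// signals that the caller should stop rendering.
    fn push(&mut self, text: &str) -> ControlFlow<()> {
        if self.len + text.len() > self.limit {
            return ControlFlow::Break(());
        }
        self.flush_queue();
        self.buf.push_str(text);
        self.len += text.len();
        ControlFlow::Continue(())
    }

    /// Queues an opening tag; it is written lazily by `push`.
    fn open_tag(&mut self, tag: &'static str) {
        self.queued_tags.push(tag);
    }

    /// Closes the innermost open tag. A tag that is still queued was never
    /// written, so it is simply dropped.
    fn close_tag(&mut self) {
        if self.queued_tags.pop().is_some() {
            return;
        }
        if let Some(tag) = self.unclosed_tags.pop() {
            self.buf.push_str(&format!("</{}>", tag));
        }
    }

    /// Writes all queued opening tags and records them as unclosed.
    fn flush_queue(&mut self) {
        for tag in self.queued_tags.drain(..) {
            self.buf.push_str(&format!("<{}>", tag));
            self.unclosed_tags.push(tag);
        }
    }

    /// Closes every tag that is still open and returns the finished HTML.
    fn finish(mut self) -> String {
        while let Some(tag) = self.unclosed_tags.pop() {
            self.buf.push_str(&format!("</{}>", tag));
        }
        self.buf
    }
}
```

In this sketch, opening `em`, pushing `hello`, opening `b`, and then pushing text that exceeds the limit yields `<em>hello</em>`: the `<b>` tag is never written because no text followed it within the limit.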
Otherwise we may generate HTML like `<em>hello<b>` if the text after the `<b>` tag is too long. Now it should correctly generate `<em>hello<b></b></em>`. It's perhaps not "ideal" that we generate an empty `<b>` tag, but it's at least valid HTML.

It isn't a huge deal that we didn't close all the tags because the browser is probably smart enough to figure out where the tags should end, but it's still good to fix :)
r? @GuillaumeGomez
cc @jyn514