Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustfmt does not understand cfg_attr used to start a rustdoc example/test #5420

Closed
mqudsi opened this issue Jun 29, 2022 · 12 comments
Closed

Comments

@mqudsi
Copy link

mqudsi commented Jun 29, 2022

With wrap_comments = true, rustfmt mangles source code (in different ways) within a test/example in rustdoc if #![cfg_attr] is used to enable the start of a code block:

Sample input:

#![cfg_attr(not(feature = "std"), doc = "```ignore")]
#![cfg_attr(feature = "std", doc = "```")]
//! use size::Size;
//!
//! // This is a short comment. It causes no problems.
//! let s1 = Size::from_mib(13) / 2;
//! assert_eq!(s1, Size::from_mib(6.5_f32));
//!
//! // This is a contrived long comment line within what should be detected as a rustdoc section. With `wrap_comments = true`, this comment is mangled.
//! let s3 = Size::from_mib(12) - Size::from_mib(14.2_f64);
//! assert_eq!(s3, Size::from_kib(-2252.8));
//! ```
//!
//! I think `rustfmt` treats everything here as code and doesn't touch it via either of `wrap_comments` or `max_width`, although one or the other should?
//! My guess is that the closing ``` above are treated as opening ``` because the original ``` via
//! the cfg_attr macro isn't recognized.

The result after a format run:

#![cfg_attr(not(feature = "std"), doc = "```ignore")]
#![cfg_attr(feature = "std", doc = "```")]
//! use size::Size;
//!
//! // This is a short comment. It causes no problems.
//! let s1 = Size::from_mib(13) / 2;
//! assert_eq!(s1, Size::from_mib(6.5_f32));
//!
//! // This is a contrived long comment line within what should be detected as a
//! rustdoc section. With `wrap_comments = true`, this comment is mangled.
//! let s3 = Size::from_mib(12) - Size::from_mib(14.2_f64);
//! assert_eq!(s3, Size::from_kib(-2252.8));
//! ```
//! 
//! I think `rustfmt` treats everything here as code and doesn't touch it via either of `wrap_comments` or `max_width`, although one or the other should?
//! My guess is that the closing ``` above are treated as opening ``` because the original ``` via
//! the cfg_attr macro isn't recognized.

As you can see,

  • regardless of whether it's being run with or without the std feature, the section should be recognized as code, just that it will either be ignored or executed by the test harness
  • the comment in the code within the rustdoc has been wrapped to the next line, but not continued with // so it will be treated as code (and fail).
  • the text after the closing ``` seems to be treated like code, but it's not reformatted via either of wrap_comments or max_width despite being in violation of both. As I mention, I think it's because the closing ``` is treated as opening one instead (because the actual opening one is not seen)

But I'm not sure that's actually the explanation because with a different input, rustfmt mangles the closing ``` as well (for some reason it doesn't happen with the example above):

input:

//! This is a rustdoc comment that goes past the default allowed width of eighty characters. rustfmt will re-wrap this comment when run with `wrap_comments`
//! enabled.
//!
//! The preceding line does not have any whitespace and is a total of three backslashes followed by a new line (`\n`). This does not change after a
//! `rustfmt` run.
//!
//! Some examples of supported mathematical operations:
#![cfg_attr(not(feature = "std"), doc = "```ignore")]
#![cfg_attr(feature = "std", doc = "```")]
//! use size::Size;
//!
//! // No trailing whitespace is introduced on the line above
//! let s1 = Size::from_mib(13) / 2;
//! assert_eq!(s1, Size::from_mib(6.5_f32));
//!
//! // Nor is any trailing whitespace introduced on the line above or the line below, even though this comment exceeds the maximum supported width
//! let s3 = Size::from_mib(12) - Size::from_mib(14.2_f64);
//! assert_eq!(s3, Size::from_kib(-2252.8));
//! ```
//!
//! In this section of the module documentation, `rustfmt` will introduce a trailing whitespace on the line above after
//! it is executed with `wrap_comments = true`.
//!
//! `rustfmt` does not introduce trailing whitespace on the line above this comment, even though it too is rewrapped from its long length.

formatted output:

//! This is a rustdoc comment that goes past the default allowed width of eighty
//! characters. rustfmt will re-wrap this comment when run with `wrap_comments`
//! enabled.
//!
//! The preceding line does not have any whitespace and is a total of three
//! backslashes followed by a new line (`\n`). This does not change after a
//! `rustfmt` run.
//!
//! Some examples of supported mathematical operations:
#![cfg_attr(not(feature = "std"), doc = "```ignore")]
#![cfg_attr(feature = "std", doc = "```")]
//! use size::Size;
//!
//! // No trailing whitespace is introduced on the line above
//! let s1 = Size::from_mib(13) / 2;
//! assert_eq!(s1, Size::from_mib(6.5_f32));
//!
//! // Nor is any trailing whitespace introduced on the line above or the line
//! below, even though this comment exceeds the maximum supported width let s3 =
//! Size::from_mib(12) - Size::from_mib(14.2_f64); assert_eq!(s3,
//! Size::from_kib(-2252.8)); ```
//! 
//! In this section of the module documentation, `rustfmt` will introduce a trailing whitespace on the line above after
//! it is executed with `wrap_comments = true`.
//!
//! `rustfmt` does not introduce trailing whitespace on the line above this comment, even though it too is rewrapped from its long length.

Notice how here the text after the closing ``` is still not reformatted but the closing ``` was treated as text and merged with the (incorrectly) wrapped contents of the previous line. So it might be more complicated than that.

@calebcartwright
Copy link
Member

Thank you for expanding with these details.

Yes, fundamentally the problem is that you're trying to have rustfmt format something that doesn't actually exist at the phase rustfmt operates within.

rustfmt works with the absolute earliest stages of the compiler's processing, fully before any and all macros expansion and semantic processing of attributes, features, etc. All rustfmt sees is the AST representation of a sequence of attributes, and the comment text exactly as authored in the input source file.

As such, rustfmt will never be able to see and process your comment fully correctly (it recognizes certain code block languages and acts differently accordingly.

There's probably some additional things at play which we can try to dig into, but want to stress that there's likely a relatively low ceiling of what can be done and that you may want to consider skipping formatting or perhaps refactoring things a bit (albeit at the expense of likely having to duplicate comment text)

@ytmimi
Copy link
Contributor

ytmimi commented Jun 30, 2022

I think your thought process is correct here. I'm also inclined to believe that we're treating the first occurrence of ``` as an opening code fence. As @calebcartwright pointed out rustfmt operates on the AST pre-expansion. The relevant code is in impl Rewrite for [ast::Attribute]

rustfmt/src/attr.rs

Lines 382 to 501 in b3d4fb4

impl Rewrite for [ast::Attribute] {
fn rewrite(&self, context: &RewriteContext<'_>, shape: Shape) -> Option<String> {
if self.is_empty() {
return Some(String::new());
}
// The current remaining attributes.
let mut attrs = self;
let mut result = String::new();
// Determine if the source text is annotated with `#[rustfmt::skip::attributes(derive)]`
// or `#![rustfmt::skip::attributes(derive)]`
let skip_derives = context.skip_context.skip_attribute("derive");
// This is not just a simple map because we need to handle doc comments
// (where we take as many doc comment attributes as possible) and possibly
// merging derives into a single attribute.
loop {
if attrs.is_empty() {
return Some(result);
}
// Handle doc comments.
let (doc_comment_len, doc_comment_str) =
rewrite_initial_doc_comments(context, attrs, shape)?;
if doc_comment_len > 0 {
let doc_comment_str = doc_comment_str.expect("doc comments, but no result");
result.push_str(&doc_comment_str);
let missing_span = attrs
.get(doc_comment_len)
.map(|next| mk_sp(attrs[doc_comment_len - 1].span.hi(), next.span.lo()));
if let Some(missing_span) = missing_span {
let snippet = context.snippet(missing_span);
let (mla, mlb) = has_newlines_before_after_comment(snippet);
let comment = crate::comment::recover_missing_comment_in_span(
missing_span,
shape.with_max_width(context.config),
context,
0,
)?;
let comment = if comment.is_empty() {
format!("\n{}", mlb)
} else {
format!("{}{}\n{}", mla, comment, mlb)
};
result.push_str(&comment);
result.push_str(&shape.indent.to_string(context.config));
}
attrs = &attrs[doc_comment_len..];
continue;
}
// Handle derives if we will merge them.
if !skip_derives && context.config.merge_derives() && is_derive(&attrs[0]) {
let derives = take_while_with_pred(context, attrs, is_derive);
let derive_str = format_derive(derives, shape, context)?;
result.push_str(&derive_str);
let missing_span = attrs
.get(derives.len())
.map(|next| mk_sp(attrs[derives.len() - 1].span.hi(), next.span.lo()));
if let Some(missing_span) = missing_span {
let comment = crate::comment::recover_missing_comment_in_span(
missing_span,
shape.with_max_width(context.config),
context,
0,
)?;
result.push_str(&comment);
if let Some(next) = attrs.get(derives.len()) {
if next.is_doc_comment() {
let snippet = context.snippet(missing_span);
let (_, mlb) = has_newlines_before_after_comment(snippet);
result.push_str(mlb);
}
}
result.push('\n');
result.push_str(&shape.indent.to_string(context.config));
}
attrs = &attrs[derives.len()..];
continue;
}
// If we get here, then we have a regular attribute, just handle one
// at a time.
let formatted_attr = attrs[0].rewrite(context, shape)?;
result.push_str(&formatted_attr);
let missing_span = attrs
.get(1)
.map(|next| mk_sp(attrs[0].span.hi(), next.span.lo()));
if let Some(missing_span) = missing_span {
let comment = crate::comment::recover_missing_comment_in_span(
missing_span,
shape.with_max_width(context.config),
context,
0,
)?;
result.push_str(&comment);
if let Some(next) = attrs.get(1) {
if next.is_doc_comment() {
let snippet = context.snippet(missing_span);
let (_, mlb) = has_newlines_before_after_comment(snippet);
result.push_str(mlb);
}
}
result.push('\n');
result.push_str(&shape.indent.to_string(context.config));
}
attrs = &attrs[1..];
}
}
}

What's happening is (based on your first snippet):

  • Both #![cfg_attr(...)] are treated as normal attributes and are formatted one by one (line 473).
  • Only the lines starting with //! are treated as doc comments. We collect as many doc comments as we can and then format them all together (line 405-406).

But I'm not sure that's actually the explanation because with a different input, rustfmt mangles the closing ``` as well (for some reason it doesn't happen with the example above):

The code fence issue not being wrapped to the next line when wrap_comments=true is a duplicate of #5244 and #5267 is open to fix that

@mqudsi
Copy link
Author

mqudsi commented Jun 30, 2022

Thanks both for taking the time to dig into this. #5244 makes it much easier to deduce what is happening here as that had me confused.

@calebcartwright I would have guessed that means that this isn't actually limited to #![cfg_attr(..., doc = "...")] but would also apply to the simpler case of #![doc("...")] were it not for the snippet @ytmimi posted which seems to clarify that there's an attribute-to-string substitution pass that takes place where this kind of magic can be done. (Edit: I think I misunderstood the ast handling in that snippet and didn't realize /// foo was considered an attribute at the ast level rather than a comment)

A heuristic could be used to naively transform #![cfg_attr(..., doc = "foo")] to #![doc("foo")] as a first approximation (since code isn't being evaluated and we're just interested in its overall shape) except that in cases like the one demonstrated in this issue, the biggest reason to use #![cfg_attr(...)] would be to toggle between multiple cases, meaning if you just naively applied such a transform you'd see an opening ``` fence followed by a closing ```ignore fence. Perhaps coalescing all consecutive #![cfg_attr(...)] (taking the first and ignoring the others) would get closer?

The cost of doing that would be negligible (only if such an attribute were encountered) but the real question would be how well it would handle the majority of cases using such a construct and how other cases might break differently with such a hack heuristic in place. Probably worth doing a GitHub code search to see what others are doing!

@ytmimi
Copy link
Contributor

ytmimi commented Jun 30, 2022

I would have guessed that means that this isn't actually limited to #![cfg_attr(..., doc = "...")] but would also apply to the simpler case of #![doc("...")] were it not for the snippet @ytmimi posted which seems to clarify that there's an attribute-to-string substitution pass that takes place where this kind of magic can be done.

Just to clarify //! my comment is just syntax sugar for #![doc="my comment"]. at the AST level //! my comment, #![doc="my comment"] and #![cfg_attr(..)] are all just crate level ast::Atribute nodes. I think the difference is that //! get assigned a kind of ast::AttrKind::DocComment.

The docs for ast::AttrKind::DocComment state:

A doc comment (e.g. /// ..., //! ..., /** ... /, /! ... */). Doc attributes (e.g. #[doc="..."]) are represented with the Normal variant (which is much less compact and thus more expensive).

@mqudsi
Copy link
Author

mqudsi commented Jun 30, 2022

I edited my previous comment just as you were posting yours. Thanks for explaining what I had missed.

I was using astexplorer.net (which uses syn) to generate an ast for a code sample with the various doc comment types, but it turns out that syn does its own thing and that doesn't match what rustc_ast does at all. I'm using rustc -Z ast-json foo.rs now and I can see the differences.

(I also realize that my suggested heuristic above was strictly one-way and didn't provide for how rustfmt would need to be modified to be able to emit rustdoc back in the same form it ingested it.)

@ytmimi
Copy link
Contributor

ytmimi commented Jun 30, 2022

Would the advice from this Stack Overflow be an acceptable workaround for your issue?

Something like this:

//! ```
//! use size::Size;
//!
//! // This is a short comment. It causes no problems.
//! let s1 = Size::from_mib(13) / 2;
//! # #![cfg_attr(feature = "std")]
//! assert_eq!(s1, Size::from_mib(6.5_f32));
//!
//! // This is a contrived long comment line within what should be detected as a rustdoc section. With `wrap_comments = true`, this comment is mangled.
//! let s3 = Size::from_mib(12) - Size::from_mib(14.2_f64);
//! # #![cfg_attr(feature = "std")]
//! assert_eq!(s3, Size::from_kib(-2252.8));
//! ```
//!
//! I think `rustfmt` treats everything here as code and doesn't touch it via either of `wrap_comments` or `max_width`, although one or the other should?
//! My guess is that the closing ``` above are treated as opening ``` because the original ``` via
//! the cfg_attr macro isn't recognized.

If not, then I think the easiest way to deal with this limitation right now is to slap a #![rustfmt::skip] attribute on the doc comments that need to be conditionally compiled.

@mqudsi
Copy link
Author

mqudsi commented Jun 30, 2022

That's an option, although semantically it would be incorrect as the entire block is no longer valid. (In this case, without the std feature floating point values cannot be passed to Size functions, so each usage would need to be decorated. Since a good number of statements are not one-offs and contribute to downstream code, ignoring one line would need all subsequent lines to be ignored one-by-one).

I tried to use rustfmt::skip but ran into some limitations, one of which is that rustc refused to compile some call sites since external attributes are not supported for expressions and the other being that I had no idea it was possible to apply a rustfmt::skip attribute to another (doc) attribute.

e.g.

#![rustfmt::skip]
//! This is a module-level documentation with an accompanying test

My naive reading would be that the #![rustfmt::skip] above would apply to the entirety of the module and what's in it and not just to the module docs (please correct me if I'm wrong!); and

#[rustfmt::skip]
//! This is a module-level documentation with an accompanying test

wasn't valid because attributes don't apply to attributes but rather the first non-attribute thing after them.

A #[rustfmt::skip(doc_comments)] or #![rustfmt::skip(doc_comments)] attribute would solve that, though it might involve a tiny bit of ast traversal trickery to associate the skip attribute with the doc comment attributes to be skipped.

@ytmimi
Copy link
Contributor

ytmimi commented Jun 30, 2022

Looks like there's an open issue for rustdoc to implement a feature like this rust-lang/rust#93083. For now, maybe you could try wrapping all the code that needs to be conditinally compiled in a block that will be hidden as suggested in the issue:

/// 
/// ```
/// # #[cfg(feature = "foo)]
/// # {
///  ... doctest goes here ...
/// # }
/// ```

@mqudsi
Copy link
Author

mqudsi commented Jun 30, 2022

That's a much cleaner suggestion, thanks! Note that tests will still report "run" instead of "skipped" when using an unmatching feature with this approach vs the original way outlined in the issue.

Would there be interest in implementing #[rustfmt::skip(doc_comments)]?

@calebcartwright
Copy link
Member

Would there be interest in implementing #[rustfmt::skip(doc_comments)]?

Sorry, but no, absolutely zero. Naturally we don't want to spend cycles to introduce rustfmt features that would be a relatively short term workaround for something that should (and ostensibly will) be addressed upstream.

Regardless, there's myriad reasons both technical/internal and user facing as to why we wouldn't want to pursue an attribute-driven strategy.

Thanks again bringing this to our attention and elevating from Discord with relevant details, though I'm going to close as it seems there's sufficient workarounds available already and no lasting/strategic action on our end that would make sense to pursue at this time.

We can revisit down the road if there's no upstream movement, but we'd almost certainly need to go down the config route instead of attributes.

@calebcartwright calebcartwright closed this as not planned Won't fix, can't repro, duplicate, stale Jul 1, 2022
@mqudsi
Copy link
Author

mqudsi commented Aug 12, 2022

@calebcartwright Thanks for the info and no hard feelings on my end.

I do have but one follow-up however, evoked by your saying

but we'd almost certainly need to go down the config route instead of attributes.

I want to clarify that the example attribute (#[rustfmt::skip(doc_comments)]) wasn't intended to be a crate-level attribute explaining what rustfmt should or shouldn't be processing in general (i.e. as configuration) but rather an item-level attribute only to cover a specific limitation of rust/#[rustfmt::skip] (that you cannot decorate an attribute with an attribute, i.e. all attributes are tacked on to the next non-attribute token in the ast), because as was said earlier in the thread

... I think the easiest way to deal with this limitation right now is to slap a #![rustfmt::skip] attribute on the doc comments that need to be conditionally compiled.

which isn't actually possible, because you can't apply an attribute to the attribute, so the entire module would have been skipped from formatting in this case.

To the best of my knowledge, .rustfmt.toml is independent of any code and can't include overrides or different configurations for items specified by their module/ast path. (happy to be wrong on this one!)

In all cases, I agree that if there's some solution forthcoming from the rust side of things that it would obviate the need for any of this and be preferred to any workarounds we would deploy in rustfmt itself or the codebase being formatted. 👍

@calebcartwright
Copy link
Member

because you can't apply an attribute to the attribute,

Thanks, but I'm aware 😉 I think you may have misinterpreted a few of my points, or perhaps I didn't include adequate detail to fully explain them, but either way it's moot at this point.

The notion of dynamic formatting driven via outer attributes isn't new. There's been plenty of prior discussion, and suffice to say it's not happening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants