Reduce formatting `width` and `precision` to 16 bits #136932

m-ou-se · 2025-02-12T16:46:32Z

This is part of #99012

This reduces the width and precision fields in format strings to 16 bits. They are currently full usizes, but it's a bit nonsensical that we need to support the case where someone wants to pad their value to eighteen quintillion spaces and/or have eighteen quintillion digits of precision.

By reducing these fields to 16 bits, we can reduce FormattingOptions to 64 bits (see #136974) and improve the in-memory representation of format_args!(). (See additional context below.)

This also fixes a bug where the width or precision is silently truncated when cross-compiling to a target with a smaller usize. By reducing the width and precision fields to the minimum guaranteed size of usize, 16 bits, this bug is eliminated.

This is a breaking change, but affects almost no existing code.

Details of this change:

There are three ways to set a width or precision today:

1. Directly in a formatting string, e.g. println!("{a:1234}")
2. Indirectly in a formatting string, e.g. println!("{a:width$}", width=1234)
3. Through the unstable FormattingOptions::width method.

This PR:

Adds a compiler error for 1. (println!("{a:9999999}") no longer compiles and gives a clear error.)
Adds a runtime check for 2. (println!("{a:width$}", width=9999999) will panic.)
Changes the signatures of the (unstable) FormattingOptions::[get_]width methods to use a u16 instead.
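For callers that pass a width computed at runtime (case 2), one way to avoid the new panic is to validate the value before it reaches the formatter. The following is a minimal sketch, not part of this PR; the helper name `pad_right_aligned` is hypothetical.

```rust
use std::num::TryFromIntError;

/// Right-aligns `value` in a field of `width` columns.
/// Hypothetical helper: rejects widths above `u16::MAX` up front
/// instead of relying on the formatter's runtime check.
fn pad_right_aligned(value: &str, width: usize) -> Result<String, TryFromIntError> {
    // `u16::try_from` fails for anything above 65535.
    let checked: u16 = u16::try_from(width)?;
    // The format machinery still takes a `usize` argument.
    let width = usize::from(checked);
    Ok(format!("{value:>width$}"))
}

fn main() {
    assert_eq!(pad_right_aligned("ab", 5).unwrap(), "   ab");
    assert!(pad_right_aligned("ab", 9_999_999).is_err());
}
```

Returning a `Result` keeps the decision about oversized widths with the caller rather than turning it into a panic.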

Additional context for improving FormattingOptions and fmt::Arguments:

All the formatting flags and options are currently:

The + flag (1 bit)
The - flag (1 bit)
The # flag (1 bit)
The 0 flag (1 bit)
The x? flag (1 bit)
The X? flag (1 bit)
The alignment (2 bits)
The fill character (21 bits)
Whether a width is specified (1 bit)
Whether a precision is specified (1 bit)
If used, the width (a full usize)
If used, the precision (a full usize)

Everything except the last two can simply fit in a u32 (those add up to 31 bits in total).

If we can accept a max width and precision of u16::MAX, we can make a FormattingOptions that is exactly 64 bits in size; the same size as a thin reference on most platforms.

If, additionally, we also limit the number of formatting arguments, we can also reduce the size of fmt::Arguments (that is, of a format_args!() expression).
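The bit budget above can be checked with a few lines of arithmetic. This sketch only illustrates the counting; `PackedOptions` is a hypothetical stand-in (with `#[repr(C)]` so its size is predictable), not the real `FormattingOptions`.

```rust
use std::mem::size_of;

// Bit widths of each option, as listed above.
const FLAG_BITS: u32 = 6; // +, -, #, 0, x?, X?
const ALIGN_BITS: u32 = 2;
const FILL_BITS: u32 = 21; // enough for any `char` (max 0x10FFFF)
const PRESENCE_BITS: u32 = 2; // "width set" and "precision set"

/// Hypothetical packed layout: 32 bits of flags plus two 16-bit fields.
#[allow(dead_code)]
#[repr(C)]
struct PackedOptions {
    flags: u32,
    width: u16,
    precision: u16,
}

/// Total number of bits needed for everything except width and precision.
fn packed_bits() -> u32 {
    FLAG_BITS + ALIGN_BITS + FILL_BITS + PRESENCE_BITS
}

fn main() {
    // Everything except width and precision fits in a u32.
    assert_eq!(packed_bits(), 31);
    // 21 bits are enough to hold the largest `char`.
    assert!(char::MAX as u32 <= (1u32 << FILL_BITS) - 1);
    // u32 + u16 + u16 is exactly 64 bits.
    assert_eq!(size_of::<PackedOptions>(), 8);
}
```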

m-ou-se · 2025-02-12T16:48:10Z

@bors try

bors · 2025-02-12T16:49:22Z

⌛ Trying commit 7355738 with merge 7af7790...


bors · 2025-02-12T18:41:27Z

☀️ Try build successful - checks-actions
Build commit: 7af7790 (7af779037716ae4125ceabb429791b4cf5dd0a43)

m-ou-se · 2025-02-12T19:16:27Z

@craterbot run mode=build-and-test

craterbot · 2025-02-12T19:16:55Z

👌 Experiment pr-136932 created and queued.
🤖 Automatically detected try build 7af7790
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

craterbot · 2025-02-12T19:17:23Z

🚧 Experiment pr-136932 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

oxalica · 2025-02-13T01:39:12Z

> This is a breaking change, but probably affects virtually no code. Let's do a crater run to find out. Marking this as experiment for now.

As a data point, I do use a usize width to generate indentation in the pretty-printers of some of my crates.
It's used as a shortcut for " ".repeat(width) without allocation (at least I think so?). I can accept enforcing u16 here because nobody actually uses 65536 indentation levels, but usize is definitely a more natural choice for "length" semantics than u16.

write!(self.w, "\n{:1$}]", "", self.cur_indent)?;
write!(f, "{:len$}", "", len = (stride - 1) * 4)?;

m-ou-se · 2025-02-13T09:44:23Z

The type will stay a usize; changing that would break a lot of things. This PR adds a runtime check (panic) that the usize you give it isn't above u16::MAX. (And then stores everything internally as a u16.)
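If a pretty-printer ever did need more than u16::MAX columns of indentation, the padding trick shown earlier could be kept by splitting the request into chunks that each stay within the limit. This is a sketch under that assumption; `write_indent` and `MAX_CHUNK` are hypothetical names, not something this PR adds.

```rust
use std::fmt::{self, Write};

/// Largest width handed to the formatter in a single call.
const MAX_CHUNK: usize = u16::MAX as usize;

/// Writes `indent` spaces to `w` without building a temporary string.
fn write_indent<W: Write>(w: &mut W, mut indent: usize) -> fmt::Result {
    while indent > 0 {
        let chunk = indent.min(MAX_CHUNK);
        // Padding an empty string to `chunk` columns emits `chunk` spaces.
        write!(w, "{:1$}", "", chunk)?;
        indent -= chunk;
    }
    Ok(())
}

fn main() -> fmt::Result {
    let mut out = String::new();
    write_indent(&mut out, 4)?;
    out.push('x');
    assert_eq!(out, "    x");

    // More than u16::MAX spaces, written in two chunks.
    let mut big = String::new();
    write_indent(&mut big, 70_000)?;
    assert_eq!(big.len(), 70_000);
    Ok(())
}
```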

Reduce FormattingOptions to 64 bits

This reduces FormattingOptions from 6-7 machine words (384 bits on 64-bit platforms, 224 bits on 32-bit platforms) to just 64 bits (a single register on 64-bit platforms).

This PR includes rust-lang#136932, which reduces the width and precision options to 16 bits, to make it all fit.

Before:

```rust
pub struct FormattingOptions {
    flags: u32, // only 6 bits used
    fill: char,
    align: Option<Alignment>,
    width: Option<usize>,
    precision: Option<usize>,
}
```

After:

```rust
pub struct FormattingOptions {
    /// Bits:
    ///  - 0: `+` flag [rt::Flag::SignPlus]
    ///  - 1: `-` flag [rt::Flag::SignMinus]
    ///  - 2: `#` flag [rt::Flag::Alternate]
    ///  - 3: `0` flag [rt::Flag::SignAwareZeroPad]
    ///  - 4: `x?` flag [rt::Flag::DebugLowerHex]
    ///  - 5: `X?` flag [rt::Flag::DebugUpperHex]
    ///  - 6-7: Alignment (0: Left, 1: Right, 2: Center, 3: Unknown)
    ///  - 8: Width flag (if set, the width field below is used)
    ///  - 9: Precision flag (if set, the precision field below is used)
    ///  - 10: unused
    ///  - 11-31: fill character (21 bits, a full `char`)
    flags: u32,
    /// Width if width flag above is set. Otherwise, always 0.
    width: u16,
    /// Precision if precision flag above is set. Otherwise, always 0.
    precision: u16,
}
```
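To make the bit layout in the doc comment above concrete, here is a small sketch that packs and unpacks the fill character and the width flag. The `Packed` type and its methods are hypothetical and only follow the documented bit positions; the real implementation in core may differ.

```rust
// Bit positions taken from the doc comment above.
const ALIGN_SHIFT: u32 = 6;
const ALIGN_UNKNOWN: u32 = 3;
const WIDTH_FLAG: u32 = 1 << 8;
const FILL_SHIFT: u32 = 11;

/// Hypothetical stand-in for the packed `FormattingOptions`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Packed {
    flags: u32,
    width: u16,
    precision: u16, // handled the same way as `width`; omitted for brevity
}

impl Packed {
    /// Defaults: the given fill, unknown alignment, nothing else set.
    fn new(fill: char) -> Self {
        Packed {
            flags: ((fill as u32) << FILL_SHIFT) | (ALIGN_UNKNOWN << ALIGN_SHIFT),
            width: 0,
            precision: 0,
        }
    }

    /// Sets or clears the width; an unset width is stored as 0.
    fn with_width(mut self, width: Option<u16>) -> Self {
        match width {
            Some(w) => {
                self.flags |= WIDTH_FLAG;
                self.width = w;
            }
            None => {
                self.flags &= !WIDTH_FLAG;
                self.width = 0;
            }
        }
        self
    }

    fn width(self) -> Option<u16> {
        ((self.flags & WIDTH_FLAG) != 0).then_some(self.width)
    }

    /// The top 21 bits hold the fill character.
    fn fill(self) -> char {
        char::from_u32(self.flags >> FILL_SHIFT).expect("fill was stored from a valid char")
    }
}

fn main() {
    let opts = Packed::new(' ').with_width(Some(12));
    assert_eq!(opts.width(), Some(12));
    assert_eq!(opts.fill(), ' ');
    assert_eq!(opts.with_width(None), Packed::new(' '));
}
```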

craterbot · 2025-02-13T22:15:07Z

🎉 Experiment pr-136932 is completed!
📊 172 regressed and 113 fixed (582049 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the denylist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

m-ou-se · 2025-02-14T08:35:30Z

Crater results: Only two regressions are caused by an error or panic from this change.

~~gh/vasekp/stream-rust - Runtime panic - Using a value close to usize::MAX as precision~~ - Update: fixed
~~gh/uutils/coreutils - Compiler error in a unit test~~ - Update: fixed

m-ou-se · 2025-02-14T08:55:22Z

This reduces the width and precision fields from a usize to a u16, with the goal of reducing FormattingOptions to 64 bits.

This is technically a breaking change, in two ways:

format!("{a:9999999}") will become a compiler error.
format!("{a:width$}") with let width=999999; will become a runtime panic.

This is expected to have virtually no impact on real world code. See crater run results above: #136932 (comment)

@rfcbot merge

rfcbot · 2025-02-14T08:55:25Z

Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

m-ou-se · 2025-02-14T11:33:07Z

Fun fact: This also fixes a previously unknown bug that can occur when you run rustc on a 64-bit platform but target a 32-bit platform.

$ cat src/main.rs 
fn main() {
    println!("[{:4294967306}]", "hello");
}
$ cargo run --target=i686-unknown-linux-gnu 
   Compiling scratchpad v0.1.0
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.26s
     Running `target/i686-unknown-linux-gnu/debug/scratchpad`
[hello     ]

A width of 4294967306 was accepted because it fits in the compiler's usize, but it is then silently truncated to 10 when cast to the target usize later, resulting in much less padding than requested. :)
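The arithmetic behind that truncation can be reproduced directly: 4294967306 is 2^32 + 10, so keeping only the low 32 bits leaves 10. This sketch models the host and target usize with `u64` and `u32`; the function names are illustrative only.

```rust
/// Models the lossy cast: a width parsed on a 64-bit host,
/// narrowed to a 32-bit target `usize`.
fn truncate_to_32_bit(width: u64) -> u32 {
    width as u32 // keeps only the low 32 bits
}

/// A checked conversion reports the overflow instead of hiding it.
fn checked_for_32_bit(width: u64) -> Option<u32> {
    u32::try_from(width).ok()
}

fn main() {
    let requested: u64 = 4_294_967_306; // 2^32 + 10
    assert_eq!(requested, (1u64 << 32) + 10);
    assert_eq!(truncate_to_32_bit(requested), 10);
    assert_eq!(checked_for_32_bit(requested), None);
    assert_eq!(checked_for_32_bit(10), Some(10));
}
```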

joshtriplett · 2025-02-14T21:45:50Z

@m-ou-se I updated the release notes based on that bug, which I think makes this change even more well-justified.

rfcbot · 2025-02-15T02:13:56Z

🔔 This is now entering its final comment period, as per the review above. 🔔

matthieu-m · 2025-02-20T18:21:52Z

Has any thought been given to reducing the maximum width/precision to u16::MAX - 1 rather than u16::MAX?

In terms of breakage, I'd expect it'd be mostly similar: 65534 or 65535 is about one and the same.

On the other hand, from a "compaction" point of view, this allows using NonMaxU16 instead of u16 + presence bit, thereby shaving off two bits.

Even if not changing to NonMaxU16 internally, it may be worth just setting the maximum value to u16::MAX - 1, to reserve the possibility of switching the internal representation later on without further impacting the API.

m-ou-se · 2025-02-20T19:34:49Z

> Has any thought been given to reducing the maximum width/precision to u16::MAX - 1 rather than u16::MAX?

Yes, I've considered that.

I agree that it'd also work fine, but here are the reasons I didn't go for that option:

We have three bits left over if we pack all the flags/alignment/fill in 32 bits, so we're not really saving any space.
We don't actually have a NonMaxU16 type. So the FormattingOptions methods would still need to take/return a regular u16 and have a runtime check. If we accept all u16, then FormattingOptions::width/precision won't need any runtime checks.
With those two bits part of the 32-bit flags field, one can check if a FormattingOptions is fully set to defaults by only checking that one 32-bit field, ignoring the width and precision fields. I don't know if that's very useful, but it seems it might be a nice property.
It might be more surprising/unexpected if the limit is one less than u16::MAX. u16::MAX makes sense because it's the lowest guaranteed usize::MAX. But u16::MAX-1 feels more arbitrary, although I agree that it's unlikely to ever matter in real world code.
There are a few cases where an unset width and a width of zero result in the same behaviour. In those cases, it could be nice if you can just ignore the 'width set' flag and just use the 'width' field directly (which will store zero when unused). (Although that might only save one or two instructions. 🤷‍♀️)
Initializing the width and precision fields to zero rather than 0xFFFF might also save an instruction or two. (The 32-bit flags field will be nonzero anyway, if it includes the ' ' fill character.) With the width and precision fields set to zero, an entire default FormattingOptions can be loaded into a 64-bit register in a single ARM64 instruction.

All of these arguments are quite weak. But unless there are clear benefits for u16::MAX - 1, I think it's simpler to stick to the full u16 range.
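The trade-off between the two encodings can be sketched as follows. Both types are hypothetical illustrations of the discussion, not code from this PR: `SentinelWidth` models the NonMaxU16-style idea with 0xFFFF meaning "unset", and `FlaggedWidth` models a separate presence bit with zero stored when unset.

```rust
/// Sentinel encoding: `u16::MAX` means "no width set".
#[derive(Clone, Copy)]
struct SentinelWidth(u16);

impl SentinelWidth {
    const UNSET: u16 = u16::MAX;

    /// Needs a runtime check, because 65535 is not representable.
    fn new(width: Option<u16>) -> Option<Self> {
        match width {
            Some(w) if w == Self::UNSET => None,
            Some(w) => Some(SentinelWidth(w)),
            None => Some(SentinelWidth(Self::UNSET)),
        }
    }

    fn get(self) -> Option<u16> {
        (self.0 != Self::UNSET).then_some(self.0)
    }
}

/// Presence-bit encoding: every `u16` is accepted; unset stores 0.
#[derive(Clone, Copy)]
struct FlaggedWidth {
    set: bool,
    width: u16,
}

impl FlaggedWidth {
    /// No runtime check needed.
    fn new(width: Option<u16>) -> Self {
        FlaggedWidth { set: width.is_some(), width: width.unwrap_or(0) }
    }

    fn get(self) -> Option<u16> {
        self.set.then_some(self.width)
    }

    /// Where "unset" and "zero" behave the same, the flag can be ignored.
    fn width_or_zero(self) -> u16 {
        self.width
    }
}

fn main() {
    assert!(SentinelWidth::new(Some(u16::MAX)).is_none());
    assert_eq!(SentinelWidth::new(Some(7)).unwrap().get(), Some(7));
    assert_eq!(FlaggedWidth::new(Some(u16::MAX)).get(), Some(u16::MAX));
    assert_eq!(FlaggedWidth::new(None).width_or_zero(), 0);
}
```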

tspiteri · 2025-02-22T10:29:44Z

While printing primitives doesn't need large width and precision, they can sometimes be used when dealing with arbitrary precision. Would this code to get 100,000 digits of pi break?

    let pi = rug::Float::with_val(332200, rug::float::Constant::Pi);
    let pi_s = format!("{pi:.100000}");

Edit: for what it's worth, Rug does have an alternative method, so that code could be changed to let pi_s = pi.to_string_radix(10, Some(100_000));, but this change would still make the Rust formatting system less suitable for arbitrary precision numbers.

rfcbot · 2025-02-25T02:23:38Z

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

m-ou-se · 2025-02-25T11:44:14Z

> While printing primitives doesn't need large width and precision, they can sometimes be used when dealing with arbitrary precision. Would this code to get 100,000 digits of pi break?
>
>     let pi = rug::Float::with_val(332200, rug::float::Constant::Pi);
>     let pi_s = format!("{pi:.100000}");

That code will give a compiler error:

error: invalid format string: integer `100000` does not fit into the type `u16` whose range is `0..=65535`
  |
  | let pi_s = format!("{pi:.100000}");
  |                          ^^^^^^ integer out of range for `u16` in format string

m-ou-se added T-libs Relevant to the library team, which will review and decide on the PR/issue. A-fmt Area: `core::fmt` S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. labels Feb 12, 2025

m-ou-se self-assigned this Feb 12, 2025

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2025

m-ou-se mentioned this pull request Feb 12, 2025

core: use FormattingOptions by reference #136862

Closed

m-ou-se removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2025

scottmcm reviewed Feb 12, 2025

tests/mir-opt/funky_arms.float_to_exponential_common.GVN.panic-abort.diff

craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. labels Feb 12, 2025

m-ou-se force-pushed the fmt-width-precision-u16 branch from 7355738 to ccb9429 Compare February 12, 2025 20:05

m-ou-se mentioned this pull request Feb 13, 2025

Reduce FormattingOptions to 64 bits #136974

Draft

craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Feb 13, 2025

m-ou-se added I-libs-nominated Nominated for discussion during a libs team meeting. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. labels Feb 14, 2025

m-ou-se marked this pull request as ready for review February 14, 2025 08:45

rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Feb 14, 2025

m-ou-se added the relnotes Marks issues that should be documented in the release notes of the next release. label Feb 14, 2025

rustbot mentioned this pull request Feb 14, 2025

Tracking issue for release notes of #136932: Reduce formatting width and precision to 16 bits #137014

Open

3 tasks

m-ou-se removed I-libs-nominated Nominated for discussion during a libs team meeting. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. labels Feb 14, 2025

m-ou-se mentioned this pull request Feb 14, 2025

split: fix bug with large arguments to -C uutils/coreutils#7128

Merged

rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Feb 15, 2025

Amanieu removed the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Feb 18, 2025

m-ou-se mentioned this pull request Feb 19, 2025

Future compatiblity warning: {:.*} precision will be limited to u16::MAX in future Rust version vasekp/stream-rust#1

Closed

vasekp added a commit to vasekp/stream-rust that referenced this pull request Feb 19, 2025

Use Option to express no limit on precision or item count

f6ee7e8

Fixes #1 (see rust-lang/rust#136932).

m-ou-se mentioned this pull request Feb 19, 2025

Tracking issue for improving std::fmt::Arguments and format_args!() #99012

Open

50 tasks

m-ou-se added 3 commits February 20, 2025 17:33

Limit formatting width and precision to 16 bits.

1fbb7ec

Update tests.

ee6d703

Fix rust-analyzer for 16-bit fmt width and precision.

b61458a

m-ou-se force-pushed the fmt-width-precision-u16 branch from ccb9429 to b61458a Compare February 20, 2025 16:33
