Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fmt: Make sure the goal is always positive when given a positive width #6094

Merged
merged 1 commit into from
Mar 19, 2024

Conversation

sargas
Copy link
Contributor

@sargas sargas commented Mar 19, 2024

Currently, the fmt utility will panic with an integer overflow when given a small enough width:

$ cargo run -q -- fmt -w3 Cargo.toml
thread 'main' panicked at src/uu/fmt/src/linebreak.rs:242:21:
attempt to subtract with overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
#%

GNU coreutil's fmt doesn't have a problem with this:

$ fmt -w3 Cargo.toml|head -n 3
#
coreutils
(uutils)

This PR adds a test to tests/by-util/test_fmt.rs reproducing this panic. The root cause is this calculation:

    let g = (w * DEFAULT_GOAL_TO_WIDTH_RATIO / 100).min(w - 3);

This sets the goal g to be zero when the width w is three, or negative for smaller widths. This also contradicts the behavior in the argument help and in the GNU fmt info page, both of which just say the goal will be 93% of the width.

This PR makes the goal set when only specifying the width be the value expected by the documentation. This avoids the overflow later on in linebreak.rs, which comes about from essentially calculating 2*goal - width as a minimum length to consider breakpoints (this can't be correct if, for example, the goal is 2 and the width is 12, but I'm leaving addressing that issue out of this PR).

A .max(..) call is added to ensure the goal is 1 if the width is positive or zero if the width is zero. This ensures that 2*goal - width is always nonnegative for the code path where a width (but not a goal) is provided.

There are a few other things in this PR:

  • A few warnings about unneeded parenthesis and the use of an almost deprecated std::i64 module were fixed.
  • A debug assertion was added to enforce "width >= goal" to catch that case before a panic in linebreak.rs. This is mainly relevant for the codepaths where the user has specified the width or the goal and we calculate the unspecified one.
  • A few warnings in linebreak.rs were addressed as well, and some isize's that should always be positive (if there's no width/goal bugs) were changed to usizes to catch bugs earlier. In some of these cases the math was being done with usizes and then converted to isize (making it impossible for them to store negative values, they would have panicked during the usize operations).
  • test_fmt_width tests the output with a set width, but the expected result differs from what I get on my system running fmt -10 tests/fixtures/fmt/one-word-per-line.txt. With this PR, I get the same as the fmt command, so I've updated the test.

Copy link

GNU testsuite comparison:

GNU test failed: tests/timeout/timeout. tests/timeout/timeout is passing on 'main'. Maybe you have to rebase?

@tertsdiepraam
Copy link
Member

Looks good, thanks! I wonder why that - 3 was there in the first place? Maybe it was leftover from an implementation that set the goal to width - 3 without taking the percentages into account?


#[test]
fn test_small_width() {
for width in ["1", "2", "3"] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one value missing that unfortunately makes the test fail: 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I don't see this on the man or info pages, but I'm guessing GNU fmt treats -w0 as -w1?

Running this in the uutils/coreutils repo gave no output:

fd -t f -0 | while IFS= read -r -d '' file; do diff <(fmt -0 "$file") <(fmt -1 "$file"); done

I'm not sure whether to special case treating a zero width as one width, or treating the goal as zero in that case (both fix this test when run with "0"). I'm pushing a PR that changes the calculated goal since that seems like the smaller change, but I'm not sure either are different

@sargas
Copy link
Contributor Author

sargas commented Mar 19, 2024

Looks good, thanks! I wonder why that - 3 was there in the first place? Maybe it was leftover from an implementation that set the goal to width - 3 without taking the percentages into account?

Did a bit of git archeology: it looks like that was added in this commit from 2014 by @kwantam , where it was changed from a - 4 to a - 3 🤷 . The code used constants that weren't consistent with GNU's implementation/documentation, although some of that was fixed in @cakebaker's commit 09db8d8af9.

A debug assertion was added to enforce "width >= goal" to catch
that case before a panic in linebreak.rs. A few warnings in linebreak.rs
were addressed as well, and some isize's that should always be positive
(if there's no width/goal bugs) were changed to usizes to catch bugs
earlier.

test_fmt_width is updated to test for the same result as GNU fmt
@sargas sargas force-pushed the fmt-small-widths branch from 11c063a to f456b95 Compare March 19, 2024 13:52
@cakebaker cakebaker merged commit 7c8dfca into uutils:main Mar 19, 2024
62 checks passed
@cakebaker
Copy link
Contributor

Thanks for the PR :)

@sargas sargas changed the title fmt: Make sure goal is always positive fmt: Make sure the goal is always positive when given a positive width Mar 19, 2024
@sargas
Copy link
Contributor Author

sargas commented Mar 19, 2024

Thanks for the PR :)

Thanks for the review 🙂

@kwantam
Copy link
Contributor

kwantam commented Mar 21, 2024

Did a bit of git archeology: it looks like that was added in this commit from 2014 by @kwantam , where it was changed from a - 4 to a - 3 🤷 . The code used constants that weren't consistent with GNU's implementation/documentation, although some of that was fixed in @cakebaker's commit 09db8d8af9.

Yeah, sorry about the lack of comments. I have little recollection of why I ended up on those particular numbers, but I do seem to recall experimenting with the constants in an attempt to make the output reasonable over a range of widths...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants