Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #4264: --line-ranges formats entire file when ranges are at EOF #4273

Merged
merged 3 commits into from
Mar 15, 2024

Conversation

sumezulike
Copy link
Contributor

@sumezulike sumezulike commented Mar 12, 2024

Description

This fixes #4264. The issue was that the empty last line does not count as a line to adjusted_lines because it is not in the list returned by str.splitlines. Since adjusted_lines is only called on the second pass of _format_str_once in format_str, the first pass would format the code correctly, then the "invalid" line would get removed from lines, and the second pass would format the whole code.

I added a small change to adjusted_lines to cap the end value of any line tuple. For example, --line-ranges 1-100 gets reduced to (1, 4) if the code is only four lines long.
One could change if end > original_line_count to if end == original_line_count + 1 to only allow this one additional line, but I think allowing an oversized range to just cover the rest of the code is not surprising behavior, slices act similarly.

I also added a call to adjusted_lines with the unmodified source code before the first pass of _format_str_once. This is an additional computational expense but ensures consistency.

Checklist - did you ...

  • Add an entry in CHANGES.md if necessary?
  • Add / update tests if necessary?
  • [-] Add new / update outdated documentation?

@sumezulike sumezulike changed the title Fix for #4264 Fix for #4264: --line-ranges formats entire file when ranges are at EOF Mar 12, 2024
Copy link

github-actions bot commented Mar 12, 2024

diff-shades reports zero changes comparing this PR (a1ba877) to main (1abcffc).


What is this? | Workflow run | diff-shades documentation

@JelleZijlstra
Copy link
Collaborator

cc @yilei

@yilei
Copy link
Contributor

yilei commented Mar 13, 2024

Thanks for tagging, @JelleZijlstra!

I also added a call to adjusted_lines with the unmodified source code before the first pass of _format_str_once. This is an additional computational expense but ensures consistency.

Why is this still necessary, after the change to adjusted_lines to cap the end value?

Could you also add a test file next to https://github.com/psf/black/blob/main/tests/data/cases/line_ranges_basic.py ?

This opens up a question on what happens when you specify a --line-ranges= that's outside of the unformatted file. This change makes it valid when you specify a larger <END>, but if the entire range is outside then the entire file is still formatted. How about make it format nothing if everything is outside of the range too? The implementation could be, instead of changing adjusted_lines, to call a new sanitize_lines(lines, src_contents) once in format_str:

def format_str(...):
    if lines:
        lines = sanitize_lines(lines, src_contents)
        if not lines:
            return src_contents  # Nothing to format
    dst_contents = _format_str_once(src_contents, mode=mode, lines=lines)
    ...

@sumezulike
Copy link
Contributor Author

Thank you so much for the feedback!

Calling adjusted_lines beforehand is not really necessary. I just used it like one would sanitize_lines to make sure that any lines that would be removed on the second pass would already be removed on the first. Writing a new function for that definitely makes more sense!

Thank you also for noticing that moving the entire range out of the file still formats everything, I'll fix that as suggested.

Copy link
Contributor

@yilei yilei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good to me overall, left a few minor comments.

2. def func(arg1,
3. arg2, arg3):
4. pass
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a case for source not ending with a newline?

def foo3(parameter_1, parameter_2, parameter_3, parameter_4, parameter_5, parameter_6, parameter_7): pass
def foo4(parameter_1, parameter_2, parameter_3, parameter_4, parameter_5, parameter_6, parameter_7): pass

# Adding some unformated code covering a wide range of syntaxes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simply remove the lines below, as this test case just need to verify a completely out-of-range input doesn't format

if not src_contents:
return []
good_lines = []
src_line_count = len(src_contents.splitlines())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too strong an opinion, it can be more efficient to do a count("\n") but then you also need to add 1 when it doesn't end with a new line.

Copy link
Contributor Author

@sumezulike sumezulike Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's really quite a difference

$ python -m timeit -n 10000 -s "f = open('src/black/__init__.py'); src=f.read(); f.close()" "len(src.splitlines())"
10000 loops, best of 5: 171 usec per loop
$ python -m timeit -n 10000 -s "f = open('src/black/__init__.py'); src=f.read(); f.close()" "src.count('\n')"  
10000 loops, best of 5: 36.6 usec per loop

I resisted the temptation to write src_contents.count("\n") + src_contents[-1] != "\n" 😄

@JelleZijlstra JelleZijlstra merged commit 7b5a657 into psf:main Mar 15, 2024
46 checks passed
@sumezulike
Copy link
Contributor Author

sumezulike commented Mar 15, 2024

Thanks for reviewing and approving my PR! Glad to be able to contribute to one of my all-time favorite Python projects :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

--line-ranges formats entire file when ranges are at EOF
3 participants