Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply ignore logic before symlink resolution #3846

Merged
merged 3 commits into from
Sep 7, 2023

Conversation

hauntsaninja
Copy link
Collaborator

@hauntsaninja hauntsaninja commented Aug 19, 2023

This means, for instance, that a gitignored symlink cannot affect your formatting. Fixes #3527, fixes #3826

This means, for instance, that a gitignored symlink cannot affect your
formatting. Fixes psf#3527, fixes psf#3826
@github-actions
Copy link

diff-shades reports zero changes comparing this PR (2a8804c) to main (c6a031e).


What is this? | Workflow run | diff-shades documentation

@hauntsaninja hauntsaninja marked this pull request as ready for review August 19, 2023 18:11
@JelleZijlstra JelleZijlstra merged commit df50fee into psf:main Sep 7, 2023
@hauntsaninja hauntsaninja deleted the better-symlink branch September 7, 2023 05:00
sth added a commit to sth/black that referenced this pull request Oct 27, 2023
This adjusts exclusion checking for command line parameters the same way
as psf#3846 did for other files. Specifically, --force-exclude is now
checked before symlink resolution. This way symlink handling is
consistent between command line parameters and other collected files.

Fixes psf#3826.
sth added a commit to sth/black that referenced this pull request Oct 27, 2023
This adjusts exclusion checking for command line parameters the same way
as psf#3846 did for other files. Specifically, --force-exclude is now
checked before symlink resolution. This way symlink handling is
consistent between command line parameters and other collected files.

Fixes psf#3826.
hauntsaninja added a commit to hauntsaninja/black that referenced this pull request Feb 11, 2024
This relates to psf#4015, psf#4161 and the behaviour of os.getcwd()

Black is a big user of pathlib and as such loves doing `.resolve()`,
since for a long time it was the only good way of getting an absolute
path in pathlib. However, this has two problems:

The first minor problem is performance, e.g. in psf#3751 I (safely) got rid
of a bunch of `.resolve()` which made Black 40% faster on cached runs.

The second more important problem is that always resolving symlinks
results in unintuitive exclusion behaviour. For instance, a gitignored
symlink should never alter formatting of your actual code. This was
reported by users a few times.

In psf#3846, I improved the exclusion rule logic for symlinks in
`gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which
handles user specified paths directly (instead of files Black
discovers). So in psf#4015, I made a very similar change to psf#3846 for
`get_sources`, and this is where some problems began.

The core issue was the line:
```
root_relative_path = path.absolute().relative_to(root).as_posix()
```
The first issue is that despite root being computed from user inputs, we
call `.resolve()` while computing it (likely unecessarily). Which means
that `path` may not actually be relative to `root`. So I started off
this PR trying to fix that, when I ran into the second issue. Which is
that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or
`Path.cwd`) also often resolves symlinks!
```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```
This also meant that the breakage often would not show up when input
relative paths.

This doesn't affect `gen_python_files` / psf#3846 because things are always
absolute and known to be relative to `root`.

Anyway, it looks like psf#4161 fixed the crash by just swallowing the error
and ignoring the file. Instead, we should just try to compute the actual
relative path. I think this PR should be quite safe, but we could also
consider reverting some of the previous changes; the associated issues
weren't too popular.

At the same time, I think there's still behaviour that can be improved
and I kind of want to make larger changes, but maybe I'll save that for
if we do something like psf#3952

Hopefully fixes psf#4205, fixes psf#4209, actual fix for psf#4077
hauntsaninja added a commit to hauntsaninja/black that referenced this pull request Feb 11, 2024
This relates to psf#4015, psf#4161 and the behaviour of os.getcwd()

Black is a big user of pathlib and as such loves doing `.resolve()`,
since for a long time it was the only good way of getting an absolute
path in pathlib. However, this has two problems:

The first minor problem is performance, e.g. in psf#3751 I (safely) got rid
of a bunch of `.resolve()` which made Black 40% faster on cached runs.

The second more important problem is that always resolving symlinks
results in unintuitive exclusion behaviour. For instance, a gitignored
symlink should never alter formatting of your actual code. This kind of
thing was reported by users a few times.

In psf#3846, I improved the exclusion rule logic for symlinks in
`gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which
handles user specified paths directly (instead of files Black
discovers). So in psf#4015, I made a very similar change to psf#3846 for
`get_sources`, and this is where some problems began.

The core issue was the line:
```
root_relative_path = path.absolute().relative_to(root).as_posix()
```
The first issue is that despite root being computed from user inputs, we
call `.resolve()` while computing it (likely unecessarily). Which means
that `path` may not actually be relative to `root`. So I started off
this PR trying to fix that, when I ran into the second issue. Which is
that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or
`Path.cwd`) also often resolves symlinks!
```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```
This also meant that the breakage often would not show up when input
relative paths.

This doesn't affect `gen_python_files` / psf#3846 because things are always
absolute and known to be relative to `root`.

Anyway, it looks like psf#4161 fixed the crash by just swallowing the error
and ignoring the file. Instead, we should just try to compute the actual
relative path. I think this PR should be quite safe, but we could also
consider reverting some of the previous changes; the associated issues
weren't too popular.

At the same time, I think there's still behaviour that can be improved
and I kind of want to make larger changes, but maybe I'll save that for
if we do something like psf#3952

Hopefully fixes psf#4205, fixes psf#4209, actual fix for psf#4077
hauntsaninja added a commit that referenced this pull request Feb 12, 2024
This relates to #4015, #4161 and the behaviour of os.getcwd()

Black is a big user of pathlib and as such loves doing `.resolve()`,
since for a long time it was the only good way of getting an absolute
path in pathlib. However, this has two problems:

The first minor problem is performance, e.g. in #3751 I (safely) got rid
of a bunch of `.resolve()` which made Black 40% faster on cached runs.

The second more important problem is that always resolving symlinks
results in unintuitive exclusion behaviour. For instance, a gitignored
symlink should never alter formatting of your actual code. This kind of
thing was reported by users a few times.

In #3846, I improved the exclusion rule logic for symlinks in
`gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which
handles user specified paths directly (instead of files Black
discovers). So in #4015, I made a very similar change to #3846 for
`get_sources`, and this is where some problems began.

The core issue was the line:
```
root_relative_path = path.absolute().relative_to(root).as_posix()
```
The first issue is that despite root being computed from user inputs, we
call `.resolve()` while computing it (likely unecessarily). Which means
that `path` may not actually be relative to `root`. So I started off
this PR trying to fix that, when I ran into the second issue. Which is
that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or
`Path.cwd`) also often resolves symlinks!
```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```
This also meant that the breakage often would not show up when input
relative paths.

This doesn't affect `gen_python_files` / #3846 because things are always
absolute and known to be relative to `root`.

Anyway, it looks like #4161 fixed the crash by just swallowing the error
and ignoring the file. Instead, we should just try to compute the actual
relative path. I think this PR should be quite safe, but we could also
consider reverting some of the previous changes; the associated issues
weren't too popular.

At the same time, I think there's still behaviour that can be improved
and I kind of want to make larger changes, but maybe I'll save that for
if we do something like #3952

Hopefully fixes #4205, fixes #4209, actual fix for #4077
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Force Exclude does not work with symlinks Apply excludes before normalising symlinks
2 participants