warn before overwriting files owned by previously installed wheels #8119

dhellmann · 2020-04-23T19:00:25Z

Before installing a wheel, look for other distributions that have also
written files into the same top level directory. Warn if any of the
same files will be modified by the new wheel.

Addresses #4625

dhellmann · 2020-04-23T22:56:35Z

The Windows tests are failing because they're seeing a deprecation warning for python 2.7 instead of the new warning about the conflicting file. Should that test be marked python 3 only?

The Travis job seems to have failed with an error for PyPI.org. I can push a trivial change to rerun those, but maybe if someone has time to give meaningful feedback there's some other real change to be made so I'll hold off.

uranusjr · 2020-04-24T06:45:19Z

> The Windows tests are failing because they're seeing a deprecation warning for python 2.7 instead of the new warning about the conflicting file.

They should see both the deprecation warning and the conflicting file. This likely indicates a bug in the implementation.

dhellmann · 2020-04-24T14:18:17Z

> > The Windows tests are failing because they're seeing a deprecation warning for python 2.7 instead of the new warning about the conflicting file.

> They should see both the deprecation warning and the conflicting file. This likely indicates a bug in the implementation.

Yes, you’re right. That’s fixed now.

BrownTruck · 2020-05-20T01:15:01Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

dhellmann · 2020-05-20T16:29:18Z

rebased

pradyunsg

A couple of minor comments, more about communication around the change rather than the change itself.

src/pip/_internal/operations/install/wheel.py

news/4625.bugfix

BrownTruck · 2020-05-22T01:45:04Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

dhellmann · 2020-05-23T10:58:40Z

Rebased and comments addressed

BrownTruck · 2020-07-01T11:30:03Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

McSinyx · 2020-07-01T13:40:29Z

Hi there, I've just noticed the nice oxford_comma_join. What do you think about making it take values as variadic arguments, @dhellmann?

dhellmann · 2020-07-01T14:16:52Z

> Hi there, I've just noticed the nice oxford_comma_join. What do you think about making it take values as variadic arguments, @dhellmann?

I used a sequence argument because I expected the input to come from splitting inputs or filtering some other sequence and have an unknown number of values. Passing a single such value to a function, rather than exploding it to use a variadic calling pattern, seemed like the right fit.

Did you have a different sort of use case in mind?
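To illustrate the sequence-argument design described above, here is a minimal sketch. It is not the code from this PR: only the name oxford_comma_join comes from the thread, while the `conjunction` parameter and the function body are assumptions made for illustration.

```python
from typing import Sequence


def oxford_comma_join(values: Sequence[str], conjunction: str = "and") -> str:
    """Join strings into a readable list, using an Oxford comma for 3+ items.

    Hypothetical sketch: the signature is assumed, not taken from the PR.
    """
    values = list(values)
    if not values:
        return ""
    if len(values) == 1:
        return values[0]
    if len(values) == 2:
        return "{} {} {}".format(values[0], conjunction, values[1])
    # Three or more items: comma-separate all but the last, then add
    # ", <conjunction> <last>".
    return "{}, {} {}".format(", ".join(values[:-1]), conjunction, values[-1])


# The input typically comes from splitting or filtering another sequence,
# so it is passed through as a single value rather than unpacked.
owners = [name.strip() for name in "foo, bar, baz".split(",")]
print(oxford_comma_join(owners))  # prints: foo, bar, and baz
```

With a variadic signature the call above would need to be written as `oxford_comma_join(*owners)`, which is the "exploding" step the sequence argument avoids.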

pradyunsg

I think this looks basically good to go -- could you rebase this + (optionally) make a small change to the tests?

tests/unit/test_messages.py

McSinyx · 2020-07-01T15:43:05Z

> I used a sequence argument because I expected the input to come from splitting inputs or filtering some other sequence and have an unknown number of values.

That's right, I was thinking about manual calls, but that doesn't make much sense to me anymore after reading this 😄

dhellmann · 2020-07-01T19:52:30Z

I get completely different results when I run the linter locally (lots more errors, in unrelated files). Is there some trick to running it beyond using tox -e lint?

deveshks · 2020-07-01T20:02:34Z

> I get completely different results when I run the linter locally (lots more errors, in unrelated files). Is there some trick to running it beyond using tox -e lint?

I just run that, but my tox runs inside a virtualenv. Could you share the output you see? I have never seen local linter results differ from the linter results on CI.

src/pip/_internal/operations/install/wheel.py

src/pip/_internal/utils/messages.py

deveshks · 2020-07-01T20:12:57Z

tests/functional/test_install_wheel.py

+    is reported and that the second wheel does overwrite
+    the first.
+    """
+    from tests.lib import TestFailure


I see a few other tests in this file importing this inside the test. Maybe it's a good time to move this import along with the others at the top?

dhellmann · 2020-07-01

I'm just following the local coding patterns in this PR. Changing all of that seems like a task for another PR.

dhellmann · 2020-07-01T22:21:45Z

> > I get completely different results when I run the linter locally (lots more errors, in unrelated files). Is there some trick to running it beyond using tox -e lint?

> I just run that, but my tox runs inside a virtualenv. Could you share the output you see? I have never seen local linter results differ from the linter results on CI.

I have tox in a virtualenv, too.

$ tox --version
3.14.6 imported from /Users/dhellmann/Envs/pip/lib/python3.8/site-packages/tox/__init__.py

https://gist.github.com/dhellmann/4bba77827a56e6890f5e3633153f5153

tests/unit/test_messages.py

chrahunt

Can we take a slightly different approach, like:

1. pre-compute a list of the paths that will be installed for the current wheel, from the files in the source directory and the anticipated generated script files (this would be easier with some refactoring, which could be done in separate PRs ahead-of-time)
2. if any file already exists, then determine what package owns it using RECORD or installed_files.txt for the installed packages in the environment
3. report the conflict as a warning, after filtering out __init__.py associated with namespace packages, as mentioned in the original issue

This has several benefits:

- 1 will be mostly compatible with installation directly from wheel (see Extract wheels directly to install location #6030)
- 1 also covers generated script files
- 2 should be compatible with more distributions and have fewer false negatives than looking at top_level.txt (which I believe is non-standard)
- doing 2 lazily (when we see a file which will be overwritten) means in the majority of cases we will not need to process every installed package
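The steps proposed above could be sketched roughly as follows. This is an illustration under stated assumptions, not pip's implementation: `record_paths` and `find_conflicts` are hypothetical names, only `*.dist-info/RECORD` is consulted (installed_files.txt, generated scripts, and the namespace-package `__init__.py` filter are left out), and the owner label is simply the `.dist-info` directory name without its suffix.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable, List, Set


def record_paths(dist_info: Path) -> Set[str]:
    """Return the paths listed in a distribution's RECORD file (hypothetical helper)."""
    record = dist_info / "RECORD"
    if not record.is_file():
        return set()
    with record.open(newline="", encoding="utf-8") as fd:
        # Each RECORD row is "path,hash,size"; only the path is needed here.
        return {row[0] for row in csv.reader(fd) if row}


def find_conflicts(lib_dir: Path, new_files: Iterable[str]) -> Dict[str, List[str]]:
    """Map each path that already exists to the distributions that list it."""
    existing = [path for path in new_files if (lib_dir / path).exists()]
    if not existing:
        # Lazy ownership lookup: no RECORD file is read when nothing
        # would be overwritten.
        return {}
    conflicts: Dict[str, List[str]] = {}
    for dist_info in sorted(lib_dir.glob("*.dist-info")):
        owned = record_paths(dist_info)
        owner = dist_info.name[: -len(".dist-info")]
        for path in existing:
            if path in owned:
                conflicts.setdefault(path, []).append(owner)
    return conflicts
```

Calling `find_conflicts(lib_dir, paths_from_the_new_wheel)` before copying files would give the data needed for a single warning; the common case of no pre-existing files returns early without scanning any installed distribution.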

chrahunt · 2020-07-02T04:47:34Z

src/pip/_internal/operations/install/wheel.py

+    with open(installing_top_level_path, 'r') as fd:
+        installing_top_level = fd.read().strip()
+    files_from_other_owners = _get_file_owners(
+        lib_dir, installing_top_level, name)


I don't think it should be necessary to ignore the installing package, since we should have uninstalled the existing distribution before trying to install a new version.

dhellmann · 2020-07-02

Eventually a file conflict will be an error that prevents installation. We want to report the error before removing the existing version and breaking the environment.

chrahunt · 2020-07-02

We currently have a rollback mechanism that reverts an upgrading package on any installation failure. I think this should have us covered, if I understand the concern you're raising.


dhellmann · 2020-07-02T16:19:44Z

> Can we take a slightly different approach, like:
>
> 1. pre-compute a list of the paths that will be installed for the current wheel, from the files in the source directory and the anticipated generated script files (this would be easier with some refactoring, which could be done in separate CRs ahead-of-time)
> 2. if any file already exists, then determine what package owns it using RECORD or installed_files.txt for the installed packages in the environment
> 3. report the conflict as a warning, after filtering out __init__.py associated with namespace packages, as mentioned in the original issue
>
> This has several benefits:
>
> 1 will be mostly compatible with installation directly from wheel (see Extract wheels directly to install location #6030)

Can you describe how the approach that has already been implemented is incompatible with installing directly?

> 1 also covers generated script files

Yes, that's a shortcoming of this implementation. It's still an incremental improvement.

> 2 should be compatible with more distributions and have fewer false negatives than looking at top_level.txt (which I believe is non-standard)

Is it? It's in all of the installed packages I see on my system and it's there in the test suite when the test wheels are installed. What's creating it?

> doing 2 lazily (when we see a file which will be overwritten) means in the majority of cases we will not need to process every installed package

If we're going to turn this into a blocking error, we need to examine the state of the system before making changes. To handle the upgrade case, we need to look for conflicts while we may have an old version of the package installed, which would automatically trigger looking for the owner of (effectively) every file that would be installed. So I don't think we would buy much time by doing the ownership evaluation lazily in a lot of cases. The current implementation only looks at distributions that claim to have written files to the same top level directory as the one being installed, so it ignores most of them already.
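For comparison, the top_level.txt filtering described in this comment might look roughly like the sketch below. The body of `_get_file_owners` is not shown in this thread, so this is not that function: `dists_sharing_top_level` is a hypothetical helper, it only scans `*.dist-info` directories, and it derives the distribution name from the directory name.

```python
from pathlib import Path
from typing import List


def dists_sharing_top_level(lib_dir: Path, top_level: str, installing: str) -> List[str]:
    """List other installed distributions whose top_level.txt names `top_level`.

    Hypothetical sketch; distributions without a top_level.txt are skipped,
    which is where false negatives can come from.
    """
    sharing = []
    for meta in sorted(lib_dir.glob("*.dist-info")):
        # Directory names look like "<name>-<version>.dist-info".
        name = meta.name[: -len(".dist-info")].split("-")[0]
        if name == installing:
            continue  # ignore the distribution being (re)installed
        top_level_file = meta / "top_level.txt"
        if not top_level_file.is_file():
            continue
        # The file holds one top-level package or module name per line.
        if top_level in top_level_file.read_text(encoding="utf-8").split():
            sharing.append(name)
    return sharing
```

Only the distributions returned here would then need their file lists inspected, which is the narrowing effect the comment refers to.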

pfmoore · 2020-07-02T18:18:35Z

top_level.txt is, I believe, a setuptools thing. It's certainly not a standardised metadata file (of which the only ones that I can recall are as per PEP 376 - RECORD, REQUESTED and INSTALLER)

dhellmann · 2020-07-02T19:25:31Z

So I guess to be accurate this would need to read the RECORD file for every installed distribution. That seems like something we want to only do one time, if we can, but wheels are processed individually. Would it be safe to use a module-level global to hold a cache?

chrahunt · 2020-07-02T22:45:38Z

> So I guess to be accurate this would need to read the RECORD file for every installed distribution.

Yes. Since pip itself generates installed_files.txt I think it is also reasonable to support, but that could be done in a separate PR.

> Would it be safe to use a module-level global to hold a cache?

Yes, that should be OK! A few things that may help:

- see the approach taken here for how we normally control those kinds of globals
- we could scope it with a with statement around this block
- where we are expecting it in the code we would want to assert _global_file_cache is not None to satisfy the type checker and because there is no point testing the case when it is not set (since we expect it to always be set for an installation)
- to provide this in unit tests automatically one approach could be to have an autoused pytest fixture, similar to what we do here. Ideally we would want to keep the number of impacted tests to a minimum, but I don't think it's a big deal to have it global like this one if it impacts a lot of tests.

BrownTruck · 2020-07-11T14:00:03Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

There have been other (valid) concerns raised that I agree with. :)

dhellmann force-pushed the warn-on-file-owner-confict branch from e864d31 to c03530d April 23, 2020 19:05

dhellmann mentioned this pull request Apr 23, 2020

pip overwrites existing files unconditionally during installation #4625

Open

dhellmann force-pushed the warn-on-file-owner-confict branch from 5d66c75 to bf7b22b April 24, 2020 12:57

pradyunsg added the "state: needs discussion", "state: needs eyes", and "type: enhancement" labels Apr 25, 2020

BrownTruck added the "needs rebase or merge" label May 20, 2020

dhellmann force-pushed the warn-on-file-owner-confict branch from bf7b22b to 7a2abf4 May 20, 2020 16:29

pypa-bot removed the "needs rebase or merge" label May 20, 2020

pradyunsg reviewed May 21, 2020

src/pip/_internal/operations/install/wheel.py (outdated)

news/4625.bugfix (outdated)

BrownTruck added the "needs rebase or merge" label May 22, 2020

dhellmann force-pushed the warn-on-file-owner-confict branch from 7a2abf4 to f45a974 May 23, 2020 10:57

pypa-bot removed the "needs rebase or merge" label May 23, 2020

dhellmann force-pushed the warn-on-file-owner-confict branch from f45a974 to c8d819f May 23, 2020 11:07

pradyunsg previously approved these changes May 23, 2020

pradyunsg removed the "state: needs discussion" and "state: needs eyes" labels May 23, 2020

pradyunsg mentioned this pull request May 31, 2020

Parallelizing the install process + PoC! #8187

Open

BrownTruck added the "needs rebase or merge" label Jul 1, 2020

pradyunsg reviewed Jul 1, 2020

tests/unit/test_messages.py (outdated)

dhellmann force-pushed the warn-on-file-owner-confict branch from c8d819f to 2dae22b July 1, 2020 18:32

pypa-bot removed the "needs rebase or merge" label Jul 1, 2020

dhellmann force-pushed the warn-on-file-owner-confict branch 3 times, most recently from 3396da0 to ace9b21 July 1, 2020 19:52

deveshks reviewed Jul 1, 2020

src/pip/_internal/operations/install/wheel.py (outdated)

deveshks reviewed Jul 1, 2020

src/pip/_internal/operations/install/wheel.py (outdated)

deveshks reviewed Jul 1, 2020

src/pip/_internal/utils/messages.py (outdated)

deveshks reviewed Jul 1, 2020

deveshks reviewed Jul 2, 2020

tests/unit/test_messages.py (outdated)

deveshks reviewed Jul 2, 2020

tests/unit/test_messages.py (outdated)

chrahunt suggested changes Jul 2, 2020

dhellmann added 3 commits July 2, 2020 12:04

add a utility function for building nice messages from lists of strings

23f152e

Signed-off-by: Doug Hellmann <[email protected]>

emit a single warning for overwriting files

cca8cfb

Rather than emitting a separate warning for each owner, produce one warning with all of the other owner names in the message. Signed-off-by: Doug Hellmann <[email protected]>

dhellmann force-pushed the warn-on-file-owner-confict branch from 7bb5beb to cca8cfb July 2, 2020 16:04

BrownTruck added the "needs rebase or merge" label Jul 11, 2020

dhellmann closed this Nov 19, 2020

pradyunsg mentioned this pull request Feb 26, 2021

Should pip uninstall before updating dependencies? #9655

Open

1 task

github-actions bot locked as resolved and limited conversation to collaborators Oct 6, 2021
