Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't print full paths for cached wheels when using them #7815

Open
pradyunsg opened this issue Mar 3, 2020 · 11 comments
Open

Don't print full paths for cached wheels when using them #7815

pradyunsg opened this issue Mar 3, 2020 · 11 comments
Labels
resolution: deferred till PR Further discussion will happen when a PR is made state: needs discussion This needs some more discussion type: bug A confirmed bug or unintended behavior

Comments

@pradyunsg
Copy link
Member

Environment

  • pip version: 20.0.2
  • Python version: 3.8.0
  • OS: MacOS

Description

Currently, during installation, occasionally prints "Processing <full-wheel-name-in-cache>"

Expected behavior

pip should say "Processing <wheel-basename> (cached)" instead of printing the entire path.

How to Reproduce

Not sure what exact cache state needs to be but I can consistently reproduce this with: pip install 'django-rest-swagger[reST]==0.3.10'.

Output

Collecting django-rest-swagger[reST]==0.3.10
  Using cached django_rest_swagger-0.3.10-py2.py3-none-any.whl (212 kB)
  Ignoring docutils: markers 'extra == "reST"' don't match your environment
Processing /Users/pradyunsg/Library/Caches/pip/wheels/e4/76/4d/a95b8dd7b452b69e8ed4f68b69e1b55e12c9c9624dd962b191/PyYAML-5.3-cp38-cp38-macosx_10_14_x86_64.whl
Collecting djangorestframework>=2.3.8
  Using cached djangorestframework-3.11.0-py3-none-any.whl (911 kB)
Collecting Django>=1.8
  Using cached Django-3.0.3-py3-none-any.whl (7.5 MB)
Collecting sqlparse>=0.2.2
  Using cached sqlparse-0.3.1-py2.py3-none-any.whl (40 kB)
Collecting asgiref~=3.2
  Using cached asgiref-3.2.3-py2.py3-none-any.whl (18 kB)
Collecting pytz
  Using cached pytz-2019.3-py2.py3-none-any.whl (509 kB)
Installing collected packages: PyYAML, sqlparse, asgiref, pytz, Django, djangorestframework, django-rest-swagger
Successfully installed Django-3.0.3 PyYAML-5.3 asgiref-3.2.3 django-rest-swagger-0.3.10 djangorestframework-3.11.0 pytz-2019.3 sqlparse-0.3.1
@pradyunsg pradyunsg added type: bug A confirmed bug or unintended behavior state: needs discussion This needs some more discussion resolution: deferred till PR Further discussion will happen when a PR is made labels Mar 3, 2020
@microcat49
Copy link

Could I take this one? I think I know how to fix it.

@uranusjr
Copy link
Member

Go ahead! I’d be very interested in learning what the problem is.

@McSinyx
Copy link
Contributor

McSinyx commented Mar 26, 2020

Unless I miss something (which is likely), I don't think the PyYAML wheel should be logged with Processing at all, i.e. it cannot appear as a (local) file to the preparer. However, there is some sort of black magic happening behind the scene that caused a cached wheel to be seen as a file, and I'm not sure how that only happens to that particular wheel.

@sbidoul
Copy link
Member

sbidoul commented Mar 27, 2020

how that only happens to that particular wheel

Probably because PyYAML has no wheel for linux on PyPI and that cached wheel is the result of a local build.

@sbidoul
Copy link
Member

sbidoul commented Mar 28, 2020

I think this issue is a bit deeper than simply changing the file name displayed. @McSinyx has a point here.

Looking at the code where Processing... is logged:

link = req.link
# TODO: Breakup into smaller functions
if link.scheme == 'file':
path = link.file_path
logger.info('Processing %s', display_path(path))
else:
logger.info('Collecting %s', req.req or req)

We see that it tests if req.link.scheme is a file. When the original link was a direct url, link may point to a local wheel in cache which is why we end up with the cache entry path in the log.

Is it not better to test for req.original_link.scheme == 'file' to diplay Processing ...? In that case the full path makes sense. And it also makes sense to display Collecting... if the original link is not a local file, and then display Using cached wheel ....

It could look like this:

        assert req.link
        link = req.link

        # TODO: Breakup into smaller functions
        if req.original_link and req.original_link.is_file:
            path = req.original_link.file_path
            logger.info('Processing %s', display_path(path))
        else:
            logger.info('Collecting %s', req.req or req)

        if link.is_file and link.is_wheel and link != req.original_link:
            with indent_log():
                logger.info("Using cached wheel %s", link.filename)
                logger.debug("Cached wheel is located at %s", link.file_path)

The user-visible result looks much more coherent like this.

Also I note that I personally sometimes find useful to have the full path of the cached wheel displayed. That could remain as a debug-level log though.

@McSinyx
Copy link
Contributor

McSinyx commented Mar 28, 2020

Since the info is to notice users on what's being carried out, does it make sense to separate the two cases where

  1. The wheel is downloaded and cached
  2. The wheel is build from source distribution, then cached

@uranusjr
Copy link
Member

It does make sense, but the current code base does not make the distinction. It will be some significant refactoring to achieve this 😞

@McSinyx
Copy link
Contributor

McSinyx commented Mar 28, 2020

Is it a welcoming change? I mean not just to solve this particular issue but to improve the codebase in general. To handle GH-825 I may touch many things done before/during dependency resolution.

@uranusjr
Copy link
Member

Code base improvement is definitely welcomed. I would suggest filing the refactorings as a separate PR from the actual feature, so the feature does not unnecessarily block the refactoring.

@sbidoul
Copy link
Member

sbidoul commented Mar 29, 2020

@uranusjr would it make sense to first move the cache check from populate_link to prepare_linked_requirement? That would make it easier to do meaningful logging in prepare_linked_requirement. Relatedly, it seems populate_link (and thus the cache check) is currently not called in the new resolver code path?

BTW there is a significant refactoring of the cache access in #7612 so I hope we can get that one in before going further in this area of the code base.

@uranusjr
Copy link
Member

uranusjr commented Mar 29, 2020

Actually I think it should be moved further out, either when the resolver is sure the link will be installed, or when the link is actually used to install. See #7801 and its linked issues for discussion. The reason is unrelated to this issue, but would incidentally provide a solution 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
resolution: deferred till PR Further discussion will happen when a PR is made state: needs discussion This needs some more discussion type: bug A confirmed bug or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants