Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better identification of broken Distribution objects #508

Open
henryiii opened this issue Oct 1, 2024 · 3 comments
Open

Better identification of broken Distribution objects #508

henryiii opened this issue Oct 1, 2024 · 3 comments

Comments

@henryiii
Copy link
Contributor

henryiii commented Oct 1, 2024

Currently, I'm getting an error on Python 3.8 and 3.9 in pypa/build#820:

docker run --rm -it python:3.9 bash
pip install tox
git clone https://github.com/pypa/build
cd build
tox -e py39 -- -k test_metadata_path_no_prepare -v
...
File "/build/.tox/py39/lib/python3.9/site-packages/importlib_metadata/compat/py39.py", line 23, in normalized_name
    return Prepared.normalize(getattr(dist, "name", None) or dist.metadata['Name'])
  File "/build/.tox/py39/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 889, in normalize
    return re.sub(r"[-_.]+", "-", name).lower().replace('-', '_')
  File "/usr/local/lib/python3.9/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

The problem is dist.__dict__={'_path': PosixPath('/build/tests/packages/test-no-prepare/test_no_prepare.egg-info')}. I think this is tripping up on tests/packages/test-no-prepare/test_no_prepare.egg-info/ and the local backend. But I don't know what updated to cause this to start happening.

Regardless of the solution, though, I think the handling here for a missing name should be better, there wasn't any useful info in the error message to tell me about the dist that was failing. I had to add print(f"{dist.__dict__=}") to see it. Or maybe the normalize name could return None, and let the failure happen elsewhere.

@jaraco
Copy link
Member

jaraco commented Oct 6, 2024

I believe this issue is fixed in importlib_metadata >= 8, where a missing Name key will raise a KeyError.

Add another line to the repro, I am able to reproduce the issue.

git checkout a73ecbdf16d8a8abb44cbbe95e9ab5f8f2a7c9b9

And with the issue reproduced, I still see the TypeError under importlib_metadata 8.5.0.

@jaraco
Copy link
Member

jaraco commented Oct 6, 2024

Unfortunately, the repro doesn't provide a very good understanding of the factors that lead to the missed expectation, as pyproject-hooks encapsulates all of the behavior. I don't know a good way to debug that behavior.

I did find that importlib_metadata is in fact raising the appropriate KeyError in the test environment:

 build a73ecbd 🐚 .tox/py39/bin/python -q
>>> import importlib_metadata
>>> dist, = importlib_metadata.distributions(path=['tests/packages/test-no-prepare'])
>>> dist.name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jaraco/code/pypa/build/.tox/py39/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 504, in name
    return self.metadata['Name']
  File "/Users/jaraco/code/pypa/build/.tox/py39/lib/python3.9/site-packages/importlib_metadata/_adapters.py", line 54, in __getitem__
    raise KeyError(item)
KeyError: 'Name'
>>> dist.metadata['Name']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jaraco/code/pypa/build/.tox/py39/lib/python3.9/site-packages/importlib_metadata/_adapters.py", line 54, in __getitem__
    raise KeyError(item)
KeyError: 'Name'

@jaraco
Copy link
Member

jaraco commented Oct 6, 2024

Oh, right. I remember now - the issue stems from some other provider producing importlib.metadata.Distribution objects... and it's those objects that will return None for missing keys (violating the protocol).

In #486, we're working on addressing that concern (where older interfaces are inadequate for newer behaviors).

I'm still not sure that issue would address the concern reported here, which is that the implicated distribution (bad metadata) isn't apparent. Even after a KeyError is (more correctly) raised, it won't include which Dist was impacted. There is, however, there is an importlib_metadata.diagnose script that's designed to help diagnose a broken environment. It doesn't help much for a case like the one reported here, where the environment is broken inside an opaque build tool.

We could as you suggest wrap the normalize_name call in a special exception that includes the details about the affected dist. And while that might address this immediate need, it implies that basically any unhandled exception in any operation on a dist should somehow wrap it to reveal which dist is relevant. I don't think that's viable.

Or maybe the normalize name could return None, and let the failure happen elsewhere.

I'm disinclined to go that route. A lot of the interfaces are working from the assumption that the dists are valid. I'd like to continue to rely on that assumption and not create another space for "null" distributions (or names or similar).


Another option could be to emit a warning whenever a broken distribution is encountered. Maybe it's opt-in so the warnings are only emitted on demand (for performance reasons).

@jaraco jaraco changed the title Improve error for missing name in Python 3.8-3.9 compat module Better identification of broken Distribution objects Oct 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants