Commit

PEP 427 mandates UTF-8, we don't need the fallback
uranusjr committed Aug 3, 2020
1 parent d4995cb commit a12e2f1
Showing 2 changed files with 6 additions and 19 deletions.
4 changes: 2 additions & 2 deletions news/8684.bugfix
@@ -1,2 +1,2 @@
-Use the same encoding logic from Python 3 to handle ZIP archive entries on
-Python 2, so non-ASCII paths can be resolved as expected.
+Use UTF-8 to handle ZIP archive entries on Python 2 according to PEP 427, so
+non-ASCII paths can be resolved as expected.
21 changes: 4 additions & 17 deletions src/pip/_internal/operations/install/wheel.py
@@ -425,23 +425,10 @@ def _getinfo(self):
         # type: () -> ZipInfo
         if not PY2:
             return self._zip_file.getinfo(self.src_record_path)
-
-        # Python 2 does not expose a way to detect a ZIP's encoding, so we
-        # "guess" with the heuristics below:
-        # 1. Try encoding the path with UTF-8.
-        # 2. Check the matching info's flags for language encoding (bit 11).
-        # 3. If the flag is set, assume UTF-8 is correct.
-        # 4. If any of the above steps fails, fallback to getting an info with
-        #    CP437 (matching Python 3).
-        try:
-            arcname = self.src_record_path.encode("utf-8")
-            info = self._zip_file.getinfo(arcname)
-            if info.flag_bits & 0x800:
-                return info
-        except (KeyError, UnicodeEncodeError):
-            pass
-        arcname = self.src_record_path.encode("cp437")
-        return self._zip_file.getinfo(arcname)
+        # Python 2 does not expose a way to detect a ZIP's encoding, but the
+        # wheel specification (PEP 427) explicitly mandates that paths should
+        # use UTF-8, so we assume it is true.
+        return self._zip_file.getinfo(self.src_record_path.encode("utf-8"))
 
     def save(self):
         # type: () -> None

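For readers who want to see the flag the removed heuristic inspected, here is a minimal Python 3 sketch. It is not part of the commit. The helper names (`build_archive`, `lookup`), the constant `UTF8_FLAG`, and the sample path are hypothetical and exist only for illustration. It builds an in-memory archive with a non-ASCII entry name, reads it back, and checks general-purpose bit 11 (`0x800`), then shows how the same UTF-8 bytes would read if they were decoded as CP437 instead.

```python
import io
import zipfile

# General-purpose bit 11: the entry name is stored as UTF-8.
UTF8_FLAG = 0x800


def build_archive(name):
    """Return an in-memory ZIP holding a single entry called `name`."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"payload")
    buf.seek(0)
    return buf


def lookup(buf, name):
    """Reopen the archive for reading and return the ZipInfo for `name`."""
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo(name)


# Hypothetical non-ASCII path, similar in spirit to a wheel RECORD entry.
path = "pkg/données.txt"
info = lookup(build_archive(path), path)

# Python 3 writes non-ASCII names as UTF-8 and sets bit 11 to say so.
uses_utf8 = bool(info.flag_bits & UTF8_FLAG)

# The same bytes read as CP437 give a different string, which is why a
# CP437 lookup cannot match an archive whose names are really UTF-8.
mojibake = path.encode("utf-8").decode("cp437")
```

On Python 2 the archive name is a byte string, so the lookup has to encode the path first, as the added line in the diff does. Assuming UTF-8 only holds for archives that follow PEP 427; this sketch does not cover archives that violate it.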