utils: decode requirement files according to their BOM if present #3485
LGTM. Remarkably close to what I have in my local checkout at the moment. But cleaner :-)
Sorry to see we worked on the same thing :-/
https://docs.python.org/2/library/shlex.html
No problem. I pray for the day when we can drop 2.6 support :-(
Well I tried being more forgiving and encoded to utf8 before I'm trying to add some code cleaning (always make
3b1dde2 to 667593c
        assert auto_decode(data) == "Django==1.4.2"

    def test_auto_decode_no_bom(self):
        data = b"foobar"
decoding text produces text, not bytes
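To make the text/bytes point concrete, here is a minimal sketch of BOM-based decoding. It is not the code from this PR: `decode_with_bom`, its `fallback` parameter, and the `BOMS` table are hypothetical names used only for illustration. The one firm detail is the ordering, since the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM and has to be checked first.

```python
import codecs

# Hypothetical table; UTF-32 entries come first because the UTF-32 LE BOM
# (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(data, fallback):
    """Decode bytes to text, honouring a leading BOM if one is present.

    `fallback` is the encoding used when no BOM is found; which encoding
    that should be is exactly what this thread is debating.
    """
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # Strip the BOM so it does not leak into the decoded text.
            return data[len(bom):].decode(encoding)
    return data.decode(fallback)
```

Whatever the input, the result is text (`str`), so a test for the no-BOM case would compare against `"foobar"` rather than `b"foobar"`.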
6648558 to 6a87e0c
6a87e0c to dafe020
Is this still a WIP? Or ready to go?
dafe020 to 0b01167
Ok, from a simple
I could reproduce the issue of #1441 with pip 1.5.0.
locale.getpreferredencoding(False) is not always utf8
and hope pypa#1441 is behind us
0b01167 to 6cc6f7b
utils: decode requirement files according to their BOM if present
Hi, this change causes an error when reading a requirements.txt encoded as UTF-8 without a BOM.
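One plausible way to hit this, sketched under assumptions: if the no-BOM fallback is the locale's preferred encoding and that encoding is not UTF-8, any non-ASCII byte in the file can fail to decode. The snippet below uses `"ascii"` as a stand-in for such a locale; the file contents and the `try_decode` helper are made up for illustration.

```python
# Bytes of a hypothetical requirements file saved as UTF-8 without a BOM.
data = "# dépendances\nDjango==1.4.2\n".encode("utf-8")


def try_decode(raw, encoding):
    """Return the decoded text, or the exception if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


# "ascii" stands in for a non-UTF-8 locale encoding (an assumption).
ascii_result = try_decode(data, "ascii")
utf8_result = try_decode(data, "utf-8")

print(type(ascii_result).__name__)
print(utf8_result.splitlines()[1])
```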
It looks like pip should be calling Or maybe we could accept this kind of header:
I've implemented belt and suspenders in #3547
For the simple cases, I think pip should treat the requirements file as a text file (i.e. open it in the system default encoding), with BOM detection being a useful convenience for Windows users whose tools have a tendency to use things like UTF-16 with BOM in spite of the default encoding. I'm -0.5 on defaulting to UTF-8, as Windows tools need extra effort to specify UTF-8, so we'll likely just end up with Windows users complaining that pip isn't handling their requirements files properly. I don't know what incantations are needed with For cross-platform requirements files, an encoding header seems like a plausible approach. (And I see that while I've been writing this response, you've created a PR implementing this. Looks like a good solution to me, go for it, although there's a text/bytes problem I've commented on in the PR.)
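A rough sketch of what an encoding header could look like in practice, assuming a PEP 263-style comment (`# -*- coding: latin-1 -*-`) in the first two lines of the file. The header format, the `CODING_RE` pattern, and `decode_with_header` are assumptions for illustration, not necessarily what #3547 does. The pattern is a bytes regex matched against bytes, which sidesteps the text/bytes mix-up.

```python
import re

# Bytes pattern, because it is applied to the raw file contents.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def decode_with_header(data, fallback):
    """Decode bytes using a PEP 263-style header if one is present.

    Only comment lines among the first two lines are inspected; without
    a header, `fallback` (e.g. the system default encoding) is used.
    """
    for line in data.split(b"\n")[:2]:
        match = CODING_RE.search(line)
        if line.startswith(b"#") and match:
            # The encoding name is ASCII by construction of the pattern.
            return data.decode(match.group(1).decode("ascii"))
    return data.decode(fallback)
```

With a header, a cross-platform file decodes the same way regardless of the machine's locale; without one, behaviour stays whatever the fallback dictates.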
Work in progress, should fix #2865