utf-8 decode bytes inputs for python3 compat + python3 test fix #157
Use `isinstance(s, bytes)` to check if the input should be utf-8 decoded before passing it to json.loads. Some places in the code always tried to utf-8 decode the inputs without checking their type (which can cause decoding failures on Python 2 unicode inputs and AttributeErrors on Python 3 strings) while others failed to decode them causing the code paths to fail on Python 3 where HTTP responses are returned as `bytes`.
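A minimal sketch of the pattern described above, assuming a standalone helper (the name `loads_maybe_bytes` is hypothetical and not part of oauth2client): decode only when the input is `bytes`, then hand the text to `json.loads`. Python 3 releases before 3.6 reject `bytes` in `json.loads`, which is why the check matters there.

```python
import json


def loads_maybe_bytes(s):
    """Parse JSON from either bytes or text (hypothetical helper)."""
    # Only byte strings are decoded; text input is passed through untouched,
    # so unicode input is never decoded a second time.
    if isinstance(s, bytes):
        s = s.decode('utf-8')
    return json.loads(s)


# An HTTP response body typically arrives as bytes on Python 3.
print(loads_maybe_bytes(b'{"access_token": "foo"}'))
# Text input takes the same path without the decode step.
print(loads_maybe_bytes(u'{"access_token": "foo"}'))
```

Keeping the type check in one place avoids both failure modes mentioned in the description: unconditional decoding of text, and missing decoding of `bytes`.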
@@ -272,7 +272,7 @@ def new_from_json(cls, s):
An instance of the subclass of Credentials that was serialized with
to_json().
"""
if six.PY3 and isinstance(s, bytes):
`bytes` was introduced in Python 2.6 as an alias for `str`. `six.binary_type` is equivalent to it, but it makes sense to use the native type instead of a compatibility library type when there is no difference between them. `bytes` was already being used in most places; this commit replaces the three remaining uses.
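To illustrate the alias claim, here is a small sketch using only the standard library. The helper name `is_binary` and the constant `BYTES_IS_STR` are made up for this example. It relies only on the fact that `bytes` is the same object as `str` on Python 2.6+ and a distinct type on Python 3.

```python
import sys

# On Python 2.6+ `bytes` is simply another name for `str`; on Python 3 the
# two are distinct types.
BYTES_IS_STR = bytes is str


def is_binary(value):
    """Return True for byte strings (hypothetical helper)."""
    # The native name works on both major versions, so no compatibility
    # alias is needed for this check.
    return isinstance(value, bytes)


print(sys.version_info[0], BYTES_IS_STR)
print(is_binary(b'abc'), is_binary(u'abc'))
```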
Added a commit to convert the last remaining uses of |
+1 Are there any other considerations around this PR that are keeping it from being merged? I'm completely unable to use Python 3 in my infrastructure solely due to this issue.
@sullivanmatt Have you updated |
I'm using httplib2 0.9.1 (latest). Here's the error I believe this would resolve (haven't checked out the branch to test, I admit):
>>> access_token = get_access_token_from_cache()
>>> credentials = AccessTokenCredentials(access_token, user_agent="httpclient/1.0")
>>> http = httplib2.Http()
>>> http = credentials.authorize(http)
Traceback (most recent call last):
  File "/opt/project-service/project_service/gproject_service/__init__.py", line 160, in _get_history
    labelId="INBOX").execute(http=http)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/googleapiclient/http.py", line 722, in execute
    body=self.body, headers=self.headers)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/client.py", line 549, in new_request
    self.apply(headers)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/client.py", line 615, in apply
    headers['Authorization'] = 'Bearer ' + self.access_token
TypeError: Can't convert 'bytes' object to str implicitly
@sullivanmatt Your stacktrace doesn't seem to match the code snippet. After running
$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from oauth2client.client import AccessTokenCredentials
>>>
>>>
>>> creds_unicode = AccessTokenCredentials(u'foo', user_agent='UA')
>>> headers = {}
>>> creds_unicode.apply(headers)
>>> print(headers)
{'Authorization': 'Bearer foo'}
>>>
>>> creds_bytes = AccessTokenCredentials(b'bar', user_agent='UA')
>>> headers = {}
>>> creds_bytes.apply(headers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/oauth2client/client.py", line 615, in apply
    headers['Authorization'] = 'Bearer ' + self.access_token
TypeError: Can't convert 'bytes' object to str implicitly
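The failing case in this session comes down to concatenating the text `'Bearer '` with a `bytes` token. A hedged sketch of how a caller could normalize the token before building the header follows. The helper `bearer_header` is hypothetical and is not part of oauth2client, and it assumes the token is UTF-8 encoded.

```python
def bearer_header(access_token):
    """Build an Authorization header from a bytes or text token.

    Hypothetical helper for illustration; not an oauth2client API.
    """
    # Assumption: a bytes token is UTF-8 encoded. Decoding first keeps the
    # concatenation below text + text on Python 3.
    if isinstance(access_token, bytes):
        access_token = access_token.decode('utf-8')
    return {'Authorization': 'Bearer ' + access_token}


print(bearer_header(u'foo'))
print(bearer_header(b'bar'))
```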
For whatever reason I didn't realize this until I saw your reproduction above, but simply doing |
@sullivanmatt Do you mind if we close out. A lot of the changes are just But there are some places where you add explicit decodes:
resp, content = http_request(token_revoke_uri)
if isinstance(content, bytes):
    content = content.decode('utf-8')
...
def from_json(cls, s):
    if isinstance(s, bytes):
        s = s.decode('utf-8')
...
if isinstance(content, bytes):
    content = content.decode('utf-8')
if resp.status == 200:
    # WAS certs = json.loads(content.decode('utf-8'))
    certs = json.loads(content)
How did you go about determining these were needed?
I am not the author of this PR (@saaros is), so I can't comment on his methodology.
D'oh! The passage of time has made me weak and dumb :)
We've all been there man. Thanks!
It's to make sure the input to json.loads() is always a unicode string; it's not always clear where the input comes from and whether it's |
I agree with you there. Do you mind breaking this into two PRs: one with the Also, how did you locate all the code paths you edited? |
I just ran |
@nathanielmanistaatgoogle I am going through this to see if there is anything we missed in #250 |
Closing since #272 is in |