utf-8 decode bytes inputs for python3 compat + python3 test fix #157

saaros · 2015-03-31T14:10:43Z

json loading: always utf-8 decode bytes inputs
tests: use six.moves.urllib for py2/3 compatibility

Use `isinstance(s, bytes)` to decide whether an input needs to be utf-8 decoded before it is passed to json.loads. Some places in the code always tried to utf-8 decode their inputs without checking the type, which can cause decoding failures on Python 2 unicode inputs and AttributeErrors on Python 3 strings. Others never decoded them, so those code paths failed on Python 3, where HTTP responses are returned as `bytes`.
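To make the two failure modes concrete, here is a minimal sketch contrasting an unconditional decode with a type-guarded one. The function names `load_unconditionally` and `load_guarded` are hypothetical and are not taken from oauth2client; the behaviour noted in the comments is for Python 3.

```python
import json


def load_unconditionally(s):
    # Hypothetical stand-in for the old pattern: assumes `s` is always bytes.
    return json.loads(s.decode('utf-8'))


def load_guarded(s):
    # Hypothetical stand-in for the guarded pattern: decode only bytes.
    if isinstance(s, bytes):
        s = s.decode('utf-8')
    return json.loads(s)


text = u'{"access_token": "abc"}'
raw = text.encode('utf-8')  # the shape of an HTTP response body on Python 3

# The guarded version gives the same result for both input types.
same_result = load_guarded(text) == load_guarded(raw)

# On Python 3, str has no .decode(), so the unconditional version fails on text.
try:
    load_unconditionally(text)
    failed_on_text = False
except AttributeError:
    failed_on_text = True
```

The guard costs one `isinstance` call per load and leaves text input untouched, so callers do not need to know which type they were handed.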

coveralls · 2015-03-31T14:14:16Z

Coverage increased (+0.15%) to 64.95% when pulling ed36586 on saaros:py3-compat into 0a6241c on google:master.

oauth2client/client.py

@@ -272,7 +272,7 @@ def new_from_json(cls, s):
      An instance of the subclass of Credentials that was serialized with
      to_json().
    """
-    if six.PY3 and isinstance(s, bytes):


saaros · 2015-04-01T07:48:02Z

This PR fixes #150 which also had a proposed but slightly broken fix in #152.

`bytes` was introduced in Python 2.6 as an alias for `str`. `six.binary_type` is equivalent to it, but it makes sense to use the native type instead of a compatibility library type when there is no difference between them. `bytes` was already being used in most places; this commit replaces the three remaining uses.

saaros · 2015-04-07T09:22:27Z

Added a commit to convert the last remaining uses of six.binary_type to bytes, which was already being used in most places.

coveralls · 2015-04-07T09:24:58Z

Coverage increased (+0.13%) to 64.93% when pulling 3584b07 on saaros:py3-compat into 0a6241c on google:master.

sullivanmatt · 2015-05-19T03:09:30Z

+1

Are there any other considerations around this PR that are keeping it from being merged? I'm completely unable to use Python 3 in my infrastructure solely due to this issue.

dhermes · 2015-05-19T19:44:28Z

@sullivanmatt Have you updated httplib2? Is it still an issue with the latest httplib2 (which now does support Python 3 correctly, and fixed many issues here)?

sullivanmatt · 2015-05-19T20:30:51Z

I'm using httplib2 0.9.1 (latest).

Here's the error I believe this would resolve (haven't checked out the branch to test, I admit) -

>>> access_token = get_access_token_from_cache()
>>> credentials = AccessTokenCredentials(access_token, user_agent="httpclient/1.0")
>>> http = httplib2.Http()
>>> http = credentials.authorize(http)
Traceback (most recent call last):
  File "/opt/project-service/project_service/gproject_service/__init__.py", line 160, in _get_history
    labelId="INBOX").execute(http=http)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/googleapiclient/http.py", line 722, in execute
    body=self.body, headers=self.headers)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/client.py", line 549, in new_request
    self.apply(headers)
  File "/home/msulliv/mb-project-sdk-py3/lib/python3.4/site-packages/oauth2client/client.py", line 615, in apply
    headers['Authorization'] = 'Bearer ' + self.access_token
TypeError: Can't convert 'bytes' object to str implicitly

dhermes · 2015-05-20T18:51:24Z

@sullivanmatt Your stacktrace doesn't seem to match the code snippet.

After running pip3 install -U oauth2client (which also updates the dependencies) and running a related snippet with both bytes and unicode, I can reproduce the failure:

$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from oauth2client.client import AccessTokenCredentials
>>> 
>>> 
>>> creds_unicode = AccessTokenCredentials(u'foo', user_agent='UA')
>>> headers = {}
>>> creds_unicode.apply(headers)
>>> print(headers)
{'Authorization': 'Bearer foo'}
>>> 
>>> creds_bytes = AccessTokenCredentials(b'bar', user_agent='UA')
>>> headers = {}
>>> creds_bytes.apply(headers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/oauth2client/client.py", line 615, in apply
    headers['Authorization'] = 'Bearer ' + self.access_token
TypeError: Can't convert 'bytes' object to str implicitly

sullivanmatt · 2015-05-22T12:46:03Z

For whatever reason I didn't realize this until I saw your reproduction above, but simply doing access_token.decode('utf-8') was sufficient to get around the issue for the time being.
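A minimal sketch of that workaround, assuming a hypothetical helper named `bearer_header` (it is not part of oauth2client): the token is normalized to text before it is concatenated into the header value, which avoids the TypeError shown in the tracebacks above.

```python
def bearer_header(access_token):
    """Build an Authorization header value from a str or bytes token.

    Hypothetical helper for illustration only.
    """
    if isinstance(access_token, bytes):
        # Mixing bytes and str in `+` raises TypeError on Python 3,
        # so decode the token to text first.
        access_token = access_token.decode('utf-8')
    return 'Bearer ' + access_token


headers = {'Authorization': bearer_header(b'bar')}
```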

dhermes · 2015-05-26T15:33:06Z

@sullivanmatt Do you mind if we close this out? A lot of the changes are just bytes "grammar" changes.

But there are some places where you add explicit decodes:

    resp, content = http_request(token_revoke_uri)
    if isinstance(content, bytes):
      content = content.decode('utf-8')
...
  def from_json(cls, s):
    if isinstance(s, bytes):
      s = s.decode('utf-8')
...
  if isinstance(content, bytes):
    content = content.decode('utf-8')
  if resp.status == 200:
    # WAS certs = json.loads(content.decode('utf-8'))
    certs = json.loads(content)

How did you go about determining these were needed?

sullivanmatt · 2015-05-26T15:38:21Z

I am not the author of this PR (@saaros is), so I can't comment on his methodology.

dhermes · 2015-05-26T15:50:00Z

D'oh! The passage of time has made me weak and dumb :)

sullivanmatt · 2015-05-26T15:51:12Z

We've all been there man. Thanks!

saaros · 2015-05-26T17:15:18Z

It's to make sure the input to json.loads() is always a unicode string; it's not always clear where the input comes from and whether it's bytes or unicode. Having if isinstance(s, bytes): s = s.decode("utf-8") in every place that loads json isn't pretty, though. A function in util.py to do it would probably look cleaner.
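One way such a helper could look, as a sketch only: `decode_if_bytes`, `parse_certs` and `credentials_data_from_json` are hypothetical names chosen for illustration and do not reflect what was eventually merged.

```python
import json


def decode_if_bytes(value, encoding='utf-8'):
    """Return `value` as text, decoding only when it is bytes.

    Hypothetical helper; the name and signature are assumptions.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical call sites: each JSON load goes through the helper
# instead of repeating the isinstance check inline.
def parse_certs(content):
    return json.loads(decode_if_bytes(content))


def credentials_data_from_json(s):
    return json.loads(decode_if_bytes(s))
```

Centralizing the check means a later change to the decoding policy (for example, a different encoding or error handling) touches one function rather than every call site.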

dhermes · 2015-05-26T17:29:55Z

I agree with you there. Do you mind breaking this into two PRs, one with the bytes grammar fix and the other with the actual code path fixes? (Or I can do it and CC you on the PRs.)

Also, how did you locate all the code paths you edited?

saaros · 2015-05-26T17:35:51Z

I just ran git grep to find them after I ran into various problems using this library on Python 3. Do you want the commit 3584b07 with the six.binary_type -> bytes change in a PR of its own, leaving the other two ("fix urllib tests on py3" and "utf8-decode before json.loads") in this PR?

dhermes · 2015-05-26T17:42:46Z

The bytes grammar stuff I'm referring to is removing six.PY3 and just using bytes instead of six.binary_type.
The "fix urllib tests on py3" commit is somewhat moot since App Engine only supports Python 2.7.
The utf8-decode before json.loads change is fine in this PR, but it seems worth just closing this one and opening a new one, given your suggestion to move the functionality into util.py.

dhermes · 2015-08-14T17:19:23Z

@nathanielmanistaatgoogle I am going through this to see if there is anything we missed in #250

dhermes · 2015-08-19T04:07:08Z

Closing since #272 is in

saaros added 2 commits March 31, 2015 17:07

tests: use six.moves.urllib for py2/3 compatibility

ed36586

googlebot added the cla: yes label Mar 31, 2015

nathanielmanistaatgoogle reviewed Mar 31, 2015

dhermes mentioned this pull request Aug 14, 2015

Unifying all conversions to bytes. #250

Merged

dhermes mentioned this pull request Aug 14, 2015

Adding _from_bytes helpers as a foil for _to_bytes. #272

Merged

dhermes closed this Aug 19, 2015
