support bodies with unicode #33

Open
wants to merge 3 commits into
base: master

@obruns obruns commented Feb 10, 2015

Hi @peritus,

please consider merging these fixes for Unicode handling in request/response bodies.
This PR likely also fixes the issues mentioned by @jinlxz in #18 (comment).

Btw, while debugging this I found that the special treatment regarding 'Content-Type' originally
added in 7cf67c8 for POST and in 04c9552 for PUT may have become superfluous.

Thanks
Oliver

Keep get_response_body() symmetric with set_request_body(). The latter
encodes the unicode object (Robot Framework defaults to unicode for all
strings) into an instance of str, so get_response_body() now decodes the
response into an instance of unicode.

Failure to do so causes UnicodeDecodeErrors in all keywords that compare
the response body against a given string if either of the two contains
non-ASCII characters.
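The symmetry described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper names `encode_request_body` and `decode_response_body` and the UTF-8 default are assumptions made for the example.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers: the names and the UTF-8 default are assumptions,
# not the actual implementation in robotframework-httplibrary.

def encode_request_body(text, charset="utf-8"):
    """Turn a text body into bytes before handing it to the HTTP layer."""
    return text.encode(charset)


def decode_response_body(raw, charset="utf-8"):
    """Turn raw response bytes back into text so comparisons are text vs. text."""
    return raw.decode(charset)


body = u"gr\u00fc\u00dfe"            # text containing non-ASCII characters
wire = encode_request_body(body)     # bytes, as sent over the connection
round_tripped = decode_response_body(wire)
```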
Robot Framework defaults to unicode objects for all strings it creates.
As a result, httplib.py:848 performs unicode += str (msg += message_body),
which implicitly decodes the str operand as ASCII and crashes with a
UnicodeDecodeError when the body contains non-ASCII bytes:

  File "/usr/lib64/python2.7/httplib.py", line 1001, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib64/python2.7/httplib.py", line 1035, in _send_request
    self.endheaders(body)
  File "/usr/lib64/python2.7/httplib.py", line 997, in endheaders
    self._send_output(message_body)
  File "/usr/lib64/python2.7/httplib.py", line 848, in _send_output
    msg += message_body
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 32: ordinal not in range(128)
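On Python 2, concatenating a unicode object with a str implicitly decodes the str using the ASCII codec. The sketch below makes that step explicit so that it also runs on Python 3. The body is hypothetical, built only to place a 0xef byte at offset 32 like in the trace above; it is not the body from the failing test.

```python
# Hypothetical body: 32 ASCII bytes followed by UTF-8 encoded U+FF21,
# whose first encoded byte is 0xef.
raw_body = b"x" * 32 + u"\uff21".encode("utf-8")

try:
    # Explicit form of the implicit ASCII decode that Python 2 performs
    # for `unicode += str`.
    raw_body.decode("ascii")
except UnicodeDecodeError as err:
    failure = err
else:
    failure = None

print(failure)
```

Decoding with the charset the body was actually encoded in (here UTF-8) instead of ASCII avoids the error.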

A similar stack trace was posted as part of Python bug #11898 [1]
(duplicated by #12398 [2]), which was later closed as a programming
error.

Both robotframework-httplibrary and webtest treat the 'Content-Type'
header specially, which results in surprising behavior in the tests:
had the 'Accept' header been removed from the provided test case, no
UnicodeDecodeError would have been thrown in httplib.py:848.

Turning unicode objects into str objects is the right thing to do
according to the HTTP standard ([3], updated by [5]) because it requires
conformity to MIME [4] (hint from [6]).

[1] http://bugs.python.org/issue11898
[2] http://bugs.python.org/issue12398
[3] https://tools.ietf.org/html/rfc2616
[4] https://tools.ietf.org/html/rfc2047
[5] https://tools.ietf.org/html/rfc7230
[6] https://stackoverflow.com/a/5426648