Pass through bytes in path and querystring #304

Merged
merged 18 commits into from
Jan 15, 2021

@twm (Contributor) commented Nov 30, 2020

Per discussion on #303, arbitrary bytes in the path and query should pass through even if they aren't UTF-8 encoded. The params argument should also accept arbitrary bytes, rather than just ASCII.

The approach here is to use hyperlink.EncodedURL (a.k.a. hyperlink.URL) rather than DecodedURL. The latter has strong UTF-8 opinions. treq continues to accept either type (or bytes or str) as the url argument.
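To illustrate what "passing bytes through" means for the query string, here is a minimal sketch using only the standard library. It is not treq's implementation; `encode_query` is a hypothetical helper, and the assumption is that `str` values are UTF-8 encoded before percent-encoding while `bytes` values are percent-encoded exactly as given.

```python
from urllib.parse import quote


def encode_query(params):
    """Percent-encode (name, value) pairs into a query string.

    Hypothetical helper for illustration only: str is encoded as UTF-8
    first, bytes are percent-encoded byte for byte, never decoded.
    """
    def enc(value):
        raw = value.encode("utf-8") if isinstance(value, str) else value
        # safe="" so that "&", "=", and "/" inside a value are escaped too.
        return quote(raw, safe="")

    return "&".join("{}={}".format(enc(k), enc(v)) for k, v in params)


query = encode_query([
    ("text", "A\u03a9"),          # non-ASCII text, sent as UTF-8
    (b"bytes", b"\x00\xff\xfb"),  # not valid UTF-8, still representable
])
print(query)  # text=A%CE%A9&bytes=%00%FF%FB
```

The point of the sketch is that percent-encoding operates on bytes, so a value that is not valid UTF-8 can still appear in a URL as long as nothing tries to decode it along the way.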

Per discussion on #303, bytes should pass through even if they aren't
UTF-8 encoded.
@twm twm mentioned this pull request Dec 23, 2020
@twm twm force-pushed the nonutf8-url-params branch from 9d87c5a to 777d97b Compare December 29, 2020 07:19
@twm twm requested review from wsanchez and a team January 11, 2021 07:39
@adiroiban (Member) left a comment

Looks good. Only minor comments.
Please check the comments, update as you like, and then merge.

Thanks!

@@ -162,7 +206,7 @@ def test_request_tuple_query_param_coercion(self):
         """
         self.client.request('GET', 'http://example.com/', params=[
             (u'text', u'A\u03a9'),
-            (b'bytes', ['ascii']),
+            (b'bytes', ['ascii', b'\x00\xff\xfb']),
Member

# A value used to test that it is never encoded or decoded.
# It should be invalid UTF-8 or UTF-32 (at least).
raw_bytes = b'\x00\xff\xfb'

```suggestion
            (b'bytes', ['ascii', raw_bytes]),
```

.. literalinclude:: examples/query_params.py
    :linenos:
    :lines: 7-37

Full example: :download:`query_params.py <examples/query_params.py>`

If you prefer a strictly-typed API, try :class:`hyperlink.DecodedURL`.
Member

Should this be hyperlink.EncodedURL?

I think an example of using hyperlink.EncodedURL together with treq should be added, either in examples/query_params.py or inline here.

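import treq
from hyperlink import EncodedURL

# The EncodedURL constructor takes URL components, so parse text with from_text().
resp = yield treq.get(EncodedURL.from_text(u'https://httpbin.org/get?raw-value-here'))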

Contributor Author

It very much should not be EncodedURL. IMO that API is dangerous because it doesn't handle percent-encoding for you. If you use it with untrusted input you'll get security injection vulnerabilities! Sometimes you need that lower-level interface, but it's not something I'd want to suggest. If you need it, you'll know.

Contributor Author

(I think this question speaks to the way that EncodedURL and DecodedURL are confusing names. Particularly since URL is EncodedURL and so seems like the default.)
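The injection concern raised in this thread can be sketched with the standard library alone. This is not hyperlink's API; it only mimics the difference between an interface that expects already-encoded text and one that percent-encodes for you. The `untrusted` value is a hypothetical attacker-controlled input.

```python
from urllib.parse import parse_qsl, quote

untrusted = "x&admin=1"  # hypothetical attacker-controlled value

# Splicing raw text into an already-encoded query lets the value
# smuggle in a second parameter.
naive_query = "q=" + untrusted

# Percent-encoding the value first keeps it inside the single "q" parameter.
safe_query = "q=" + quote(untrusted, safe="")

print(parse_qsl(naive_query))  # [('q', 'x'), ('admin', '1')]
print(parse_qsl(safe_query))   # [('q', 'x&admin=1')]
```

An API that handles percent-encoding itself corresponds to the second case, which is the reasoning given above for pointing readers at `DecodedURL`.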

@twm (Contributor Author) commented Jan 15, 2021

Thank you very much for the review, @adiroiban! I'll merge and release shortly.

@twm twm merged commit 96870c3 into master Jan 15, 2021
@twm twm deleted the nonutf8-url-params branch January 15, 2021 07:13
Successfully merging this pull request may close these issues.

Klein #339 Can't make request with non-ascii/utf8 query params Update "Query Parameters" howto