-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for python3 http client when accept-encoding is gzip #5
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is a good catch, it would be nice to tidy this up. I've made a couple of comments here but if you're happy to fix those up I'm definitely interested in including this.
src/targets/python/python3.js
Outdated
|
||
|
||
// Decode response | ||
if (headers['accept-encoding'] == 'gzip') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is close, but it's not quite right I'm afraid.
In this code you're checking the request headers, but that doesn't mean that the response is actually using gzip. accept-encoding
just means that client told the server it could use gzip - there's lots of cases where the server won't actually use gzip in the response despite this.
To do this correctly, we'll need to check the content-encoding
value in the response headers, at runtime, so outputting this if
into the Python code itself. Does that make sense? I think we do only need to do that if accept-encoding
is set (and that'll help to keep the snippets simpler in the cases where this isn't required.
Do you want to take a look at that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it makes sense to check the content-encoding
in the response. i will change to check for encoding in response headers. But in the case there is no response and accept-encoding
is gzip, we will have to decide whether to decompress the response or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no, I don't mean the response in the HAR! I mean the actual response the Python code receives when you run the snippet. There will always be a response, because the Python code has to wait for the response before it does the check & decodes the body.
Basically, if there is an accept-encoding header set, then we need to output python code like:
if res.headers['content-encoding'] == 'gzip':
print(gzip.decompress(data).decode("utf-8"))
else:
print(data.decode("utf-8"))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see. this makes more sense. i will add the check. I think the response data should still be kept while parsing the HAR file. There might be use of it in the future.
doing runtime check also means import gzip
will be added to every script. might need to change test cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pimterry content-encoding check should be good now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. Yes - the check looks good now! I think that approach will work nicely.
But the new change does mean that we change the output for every case, and it's a bit of extra noise & confusion that most requests won't need...
I think we should add this only if there is an accept-encoding request header that contains gzip
. Does that make sense? That would keep all the other test cases the same, but add this only when we might get a gzip result. Can you add that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added and updated test case
c38aac2
to
26b2e46
Compare
nice, |
dd795bd
to
8432fac
Compare
Yep! I've simplified that a little, updated all the other tests, added support for a few other languages in here too, and squashed all these commits down a bit. Now merged! Thanks for this 👍. I'll pull it through into HTTP Toolkit itself shortly. |
Overview
Added handling of gzip encoded response for python3 http client.
Problem
Current framework generates python3 code without checking the encoding of the response and handles the network response by this line of code:
print(data.decode("utf-8"))
This always results in UnicodeDecodeError when running the generated script.
Solution
This can be fixed by decompressing the gzip data:
print(gzip.decompress(data).decode("utf-8"))