Support chunked transfer encoding #1198

Merged
merged 13 commits into from
Nov 28, 2017
Conversation

jan-auer
Contributor

@jan-auer jan-auer commented Nov 27, 2017

This PR introduces support for requests with chunked transfer encoding in WSGIRequestHandler by wrapping the input in a DechunkedInput. Chunks are unrolled in the following way:

  • Read a line to get the chunk's size in bytes
  • Read the chunk data, but only to the length of the buffer
  • Reduce the chunk size by the number of bytes read
  • At the end of the chunk, skip the chunk's final newline

Once a chunk of size zero is encountered, a final newline is read, and then the input is marked as done. All further requests to readinto will exit early.
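The unrolling steps above can be sketched roughly like this. This is a minimal illustration of the approach, not Werkzeug's actual implementation; the class shape and the `rfile` argument are assumptions:

```python
import io


class DechunkedInput(io.RawIOBase):
    """Sketch: decode a chunked request body from ``rfile``."""

    def __init__(self, rfile):
        self._rfile = rfile
        self._done = False
        self._len = 0  # bytes remaining in the current chunk

    def readable(self):
        return True

    def readinto(self, buf):
        read = 0
        while not self._done and read < len(buf):
            if self._len == 0:
                # Read a line to get the chunk's size in bytes (hex).
                line = self._rfile.readline().strip()
                self._len = int(line, 16)
                if self._len == 0:
                    # A zero-sized chunk ends the body: consume the
                    # final newline and mark the input as done.
                    self._rfile.readline()
                    self._done = True
                    break
            # Read chunk data, but only up to the space left in buf.
            n = min(len(buf) - read, self._len)
            buf[read:read + n] = self._rfile.read(n)
            read += n
            # Reduce the chunk size by the number of bytes read.
            self._len -= n
            if self._len == 0:
                # At the end of the chunk, skip its trailing newline.
                self._rfile.readline()
        return read
```

Once the input is marked done, every further `readinto` call falls through the loop and returns 0, which downstream readers treat as EOF.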

There is only a test for the success case. It needs to issue a manual HTTP request to produce a proper chunked multipart payload.
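The chunk framing for such a manual request could be produced with a small helper like the following. This is a hypothetical illustration, not the PR's test code:

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked HTTP body:
    each part is prefixed with its hex length and followed by CRLF,
    and a zero-sized chunk terminates the body."""
    out = b"".join(b"%x\r\n%s\r\n" % (len(p), p) for p in parts)
    return out + b"0\r\n\r\n"
```

The resulting bytes would then be sent after a `Transfer-Encoding: chunked` header, e.g. via `http.client`'s `putrequest`/`putheader`/`send`.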

Note: For some reason, werkzeug crashes when parsing the request body if a Content-Length header is present. If you would like this fixed in this PR too, let me know. RFC 2616, Section 4.4 states that the Content-Length header must be ignored with any Transfer-Encoding other than identity.
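In WSGI terms, that RFC rule translates to something like this hypothetical environ cleanup (the helper name is illustrative, not from the PR):

```python
def sanitize_environ(environ):
    """Per RFC 2616 section 4.4, ignore Content-Length when a
    non-identity Transfer-Encoding is present, so downstream body
    parsers do not trust a length that counts the chunk framing."""
    if environ.get("HTTP_TRANSFER_ENCODING", "identity") != "identity":
        environ.pop("CONTENT_LENGTH", None)
    return environ
```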

Fixes #1149

@davidism
Member

Thanks for working on this!

@davidism
Member

There were some issues with our tests on Travis, you'll need to merge in master to fix the unrelated failing tests.

CONTRIBUTING.rst Outdated
@@ -54,7 +54,7 @@ Install Werkzeug in editable mode::
 
 Install the minimal test requirements::
 
-    pip install pytest pytest-xprocess requests
+    pip install pytest pytest-xprocess requests six

@davidism davidism requested a review from mitsuhiko November 27, 2017 19:55
@mitsuhiko
Contributor

Generally looks good. Needs some fixes for the test failures and I will look into what uses the content length currently.

@davidism
Member

davidism commented Nov 27, 2017

Hold off merging until I roll back the changes mentioned in #1149 (#1126), since they're related.

 - Try to read multiple chunks until the buffer is filled
 - Add comments to make control flow more clear
 - Add doc comment for DechunkedInput
@jan-auer
Contributor Author

Pushed some updates:

  • DechunkedInput used to read only one chunk at a time and then immediately return the buffer. Now it tries to fill the buffer with as many chunks (or parts of chunks) as possible.
  • Added comments and a docstring to make the control flow clearer. It still looks a bit odd at first sight, but anyone familiar with the spec should be able to follow it.
  • Fixed the bug where the Content-Length header crashed the form parser. The content length covers the chunk headers and newlines on the wire, while the payload returned by DechunkedInput does not include them. The form parser, however, wraps the input in a LimitedStream whenever a Content-Length header is present, which assumes the two match. With the content length removed, the form parser no longer applies LimitedStream and the problem goes away. This is also more spec compliant.
  • Replaced six with an explicit import based on the Python version. It could have gone in _compat as suggested, but since httplib/http.client is only used in the test, this is easier.
  • Finally, removed some unused code from the test.
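The version-based import mentioned above might look like this (a sketch of the approach, not the PR's exact code):

```python
import sys

# Test-only compatibility shim: pick the stdlib HTTP client module by
# Python version instead of depending on six.
if sys.version_info[0] >= 3:
    import http.client as httplib
else:
    import httplib
```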

@@ -38,7 +38,7 @@
 def default_stream_factory(total_content_length, filename, content_type,
                            content_length=None):
     """The stream factory that is used per default."""
-    if total_content_length > 1024 * 500:
+    if total_content_length is None or total_content_length > 1024 * 500:
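The change above makes the factory spill to a temporary file when the total length is unknown, as it is for chunked requests, instead of comparing None against the threshold. A minimal sketch of that decision, with assumed names:

```python
import tempfile
from io import BytesIO

MAX_IN_MEMORY = 1024 * 500  # 500 KB threshold, as in the diff above

def choose_stream(total_content_length):
    # With chunked requests the total length is unknown (None), so
    # write to a temporary file rather than buffering in memory.
    if total_content_length is None or total_content_length > MAX_IN_MEMORY:
        return tempfile.TemporaryFile("wb+")
    return BytesIO()
```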

@davidism
Member

Reverted #1126 in #1200, please merge master. Still working on unrelated test failures.

@jan-auer
Contributor Author

Alright, here you go. FWIW, I also ran some manual tests with uwsgi 2.0.15 and master on both Python 2.7 and 3.6, and it seemed to work just fine.

One thing I'm still wondering about: do we need to take care of infinite streams in DechunkedInput, or is something else already dealing with that?

@davidism
Member

This shouldn't affect other WSGI server support of chunked encoding, but thanks for testing it.

#1126 was supposed to handle infinite streams by setting a hard limit on the content read in, but that was based on some incorrect assumptions from me. I can't recall any special handling in Gunicorn from when I was looking a while ago, but might be good to check.

@davidism
Member

I merged master again and fixed some 2.6 and style issues. Travis should pass now.

@jan-auer
Contributor Author

Whoop, it's all green! 🎉

@davidism
Member

Would you mind looking at how this behaves with infinite streams (not chunked, but no content length) versus how Gunicorn behaves?

@jan-auer
Contributor Author

Sure 👍 Please let me know if you or @mitsuhiko have a perspective on this, though.

@mitsuhiko
Contributor

I will merge this in the current state now. I think we can look at the edge cases once we see how others are doing it. This is basically just the local dev setup anyways.

@ghost ghost mentioned this pull request Oct 19, 2018
Successfully merging this pull request may close these issues.

Chunked requests still don't really work