WIP: Streaming #1791
Conversation
Status update: rewriting the server with async/await instead of a lot of callbacks made the code much easier to work with, and actually faster despite also removing the MagicHttp stack (which doesn't support async). The current pure-Python parser pushes 50000 req/s where the master branch only does 44000 req/s. Quite a bit of work remains to be done, especially if all corner cases are to be covered. A new parser means that it is very difficult to make it behave identically to the old one. At this point I've got 6 failing tests and 636 passing.
Wow! You took out …
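(For context on the req/s figures above: throughput numbers like these are typically measured by pointing a load generator at a minimal hello-world app. The sketch below shows the kind of app usually used for that; it is an assumption for illustration, not the actual benchmark setup from this PR.)

```python
from sanic import Sanic
from sanic.response import text

app = Sanic("bench")

# A trivial handler: figures like 50000 req/s measure raw
# request/response throughput, not application work.
@app.route("/")
async def handler(request):
    return text("OK")

if __name__ == "__main__":
    # access_log=False keeps logging overhead out of the numbers.
    app.run(host="127.0.0.1", port=8000, access_log=False)
```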
Codecov Report
@@ Coverage Diff @@
## master #1791 +/- ##
==========================================
- Coverage 92.17% 90.09% -2.08%
==========================================
Files 23 24 +1
Lines 2312 2525 +213
Branches 424 475 +51
==========================================
+ Hits 2131 2275 +144
- Misses 141 190 +49
- Partials 40 60 +20
So ... Since we might be moving away from …
Agreed, we definitely need comprehensive testing of request handling. There are also interesting cases around streaming, like a handler finishing its response before the request body is even (fully) sent. I'd appreciate it if someone else was willing to write tests, because I'll probably overlook the assumptions made in my own implementation. Next milestones:
…
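(A hedged sketch of the corner case mentioned above — a handler finishing its response before the request body is fully sent — using the stream=True opt-in Sanic already exposes. What the new code does with the unread remainder of the body is exactly what needs testing.)

```python
from sanic import Sanic
from sanic.response import text

app = Sanic("early-response")

# The handler replies immediately without ever reading request.stream,
# so the server must decide what to do with the rest of an incoming body.
@app.post("/upload", stream=True)
async def upload(request):
    return text("rejected before the body was read", status=413)
```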
…nger be supported. Chunked mode is now autodetected, so do not put a content-length header if chunked mode is preferred.
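(Illustrating the autodetection described in that commit, using the existing sanic.response.stream helper — a sketch under the assumption that this API is kept: since the app never sets content-length, the response should now go out with chunked transfer-encoding automatically.)

```python
from sanic import Sanic
from sanic.response import stream

app = Sanic("chunked")

@app.route("/feed")
async def feed(request):
    async def body(response):
        # No content-length header is set anywhere here, so per this
        # change the response is sent with chunked transfer-encoding.
        await response.write("part 1\n")
        await response.write("part 2\n")
    return stream(body)
```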
…), all requests made streaming. A few compatibility issues remain and a lot of cleanup is still to be done; 16 tests failing.
All responses are made streaming now, too. That was quite a bit more work than anticipated, and there is still work to be done, but this is taking good shape now and doesn't break existing interfaces much. Looks like this could be aimed at the Sanic 20.06 release, possibly with some deprecations in 20.03 to prepare for it. Next milestones:
…
Awesome work, @Tronic! Regarding the parser (yet), I see that you already transferred most of the logic to the …
@vltr Accessing protocol members from Http is only a temporary workaround until tests are passing and the code can be cleaned up some more. I also considered a sans-I/O implementation, but async I/O calls allow for automatic flow control without extra tasks for copying data, and presumably higher performance. After cleanup you should be able to insert mock functions …
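(A rough illustration of what inserting mock functions for testing could look like after the cleanup. All names here are hypothetical stand-ins, not Sanic's actual API.)

```python
import asyncio

class HttpStateMachine:
    """Hypothetical stand-in for the Http class: it touches the network
    only through injected async callables, so tests can feed it canned
    bytes instead of a real transport."""

    def __init__(self, receive, send):
        self.receive = receive  # async () -> bytes
        self.send = send        # async (bytes) -> None

    async def serve_one(self):
        request = await self.receive()
        # ... parsing and handler dispatch would happen here ...
        await self.send(b"HTTP/1.1 200 OK\r\ncontent-length: 2\r\n\r\nOK")

async def main():
    sent = []

    async def receive():
        return b"GET / HTTP/1.1\r\nhost: test\r\n\r\n"

    async def send(data):
        sent.append(data)

    await HttpStateMachine(receive, send).serve_one()
    assert sent[0].startswith(b"HTTP/1.1 200")

asyncio.run(main())
```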
…timeout=<value>, which wasn't the case with the existing implementation; and secondly, none of the other web servers I tried include this header.
For some reason, on Windows the payload-too-large test fails because the test client cannot read the response. I suspected a deadlock between writing the request and reading the response, but increasing buffer sizes didn't help, and I can't currently think of other reasons why this might be happening.
I've traced how that goes: the HTTP handler function writes its response to the asyncio protocol and exits cleanly, after which transport.close() is called. According to the docs, this function should write out any pending data before actually closing. My bet is that this failure occurs because httpx is still trying to send the request and doesn't notice that there is a response coming. Adding a delay allows the request to flow into Sanic's receive buffer, after which httpx manages to read the response without any error. I do wonder how this is supposed to be solved...
Yep, confirmed working, although that is not really a fix, just a hack around httpx in this particular test, where the request happens to fit in network buffers despite being larger than the maximum request size.
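(The sequence described above, reduced to a minimal asyncio sketch. This is illustrative only, not Sanic's actual protocol class.)

```python
import asyncio

class EarlyCloseProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        # Respond as soon as the request is seen to be too large,
        # without waiting for the rest of its body.
        self.transport.write(b"HTTP/1.1 413 Request Entity Too Large\r\n\r\n")
        # Per the asyncio docs, close() flushes buffered data before
        # closing. The Windows failure appears to be on the client side:
        # httpx is still busy writing the body and misses this response.
        self.transport.close()
```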
Deadlocks are possible in bidirectional streaming handlers if the HTTP client does not support streaming but insists on sending all of the request before reading any response. This leads to a request timeout, but Sanic could also detect when this happens and issue a proper diagnostic. Unfortunately the problem itself cannot be avoided; it is up to the application whether or not to implement this kind of handler. This does not affect:
Non-streaming clients fail:
The problems are not new to this branch and also occur with any other HTTP server implementing this functionality. Notably, browsers don't do bidirectional streaming, but they have large network buffers to avoid issues with moderately sized bodies. Curl, Nginx, Express and others implement proper bidirectional streaming.
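(For concreteness, a sketch of the failing client pattern. It assumes a bidirectional echo handler at localhost:8000; the endpoint and sizes are made up.)

```python
import socket

sock = socket.create_connection(("localhost", 8000))
sock.sendall(
    b"POST /echo HTTP/1.1\r\nhost: localhost\r\n"
    b"content-length: 100000000\r\n\r\n"
)
# A non-streaming client writes the whole body before reading anything.
# Once the server's send buffer and this client's send buffer both fill
# up (the server is echoing but nobody reads), this blocks until the
# request timeout fires.
sock.sendall(b"x" * 100_000_000)
print(sock.recv(4096))
```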
@Tronic Can you squash these commits? |
@ahopkins You mean rebase and force-push? And all of them, or just the merge commits? There's further work to be done on this prior to merging to master, so such squashing should probably only be done then; and I believe this can be done more cleanly in the GitHub UI while merging.
Since the new policy is not to introduce any breaking changes in .09 and .12 releases, I guess this should target 20.06 (given the delay with 20.03, I would have been more comfortable with 20.09). To make that happen, I'll need help with proper code review, API discussion and writing comprehensive tests. Is anyone up for that? Also, can I expect 20.06 to be delayed at least until the end of June, or better yet, July?
I think this is a big enough change that the more @huge-success/sanic-core-devs put their eyes on this and chime in, the better.
Closing PR as development is moved to huge-success/sanic:streaming |
This is an effort to clean up the server code and make streaming requests and responses first-class citizens, with non-streaming implemented on top of them, instead of having duplicate code and separate modes for each.
So far request streaming has been looked at, and it seems to work fairly well. 7 tests fail, which might be due to the expectation that all of the request body would be read prior to sending a response, because the new code sends a response as soon as possible.
A possible bug was discovered: I think the old code would get confused if more than one request was sent without waiting for the prior ones to be responded to first. Additionally, such parallel streaming responses could get entirely mixed up. Although pipelining in this way is non-standard, it should now be fixed (see the reproduction sketch below).
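(A hedged reproduction sketch for that pipelining scenario, with made-up endpoint names: two requests are written back to back before any response is read, and the responses must come back complete and in order.)

```python
import socket

sock = socket.create_connection(("localhost", 8000))
sock.sendall(
    b"GET /a HTTP/1.1\r\nhost: localhost\r\n\r\n"
    b"GET /b HTTP/1.1\r\nhost: localhost\r\n\r\n"
)
# With the old code the two responses could get confused or interleaved;
# after this change they should arrive serialized, /a before /b.
print(sock.recv(65536).decode())
```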
This is NOT complete or meant to be merged yet; the idea is to finish request handling and then attempt streaming responses without callbacks, as prototyped earlier in the Trio branch.
Testing for the adventurous:
Bug reports resolved in this:
Fix #1722
Fix #1730
Fix #1785
Fix #1790
Fix #1804