
Request Streaming is extremely slow #1587

Closed
Leonidimus opened this issue May 22, 2019 · 11 comments · Fixed by #1600

Comments

@Leonidimus

Leonidimus commented May 22, 2019

Describe the bug
I created a simple POC app to accept streaming binary data. The syntax is compliant with the official docs. The app works as expected, but the response time is more than 1000 ms even for small data sizes (1.1K). In the snippet below I tried uploading 2 files: tiny.txt (57 B) and small.txt (1137 B). tiny.txt took 0.021 s and small.txt took 1.026 s on average. Testing was done on the same host via loopback, so no network delay was involved. I think the issue is caused by Sanic not responding with 100-continue, so the client wastes time waiting for it.

Code snippet
Source code:

from sanic import Sanic
from sanic.response import stream, text

app = Sanic('request_stream')

async def detect_handler(request):
    result = bytes()
    while True:
        body = await request.stream.read()
        if body is None:
            break
        result += body  # accumulate the streamed chunks
    return text('done waiting!')

app.add_route(detect_handler, '/api/v1/detect', methods=['POST'], stream=True)

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8000)

curl testing with Sanic:

$ time curl -vvv -H "Transfer-Encoding: chunked"  --data-binary @small.txt  http://127.0.0.1:8000/api/v1/detect
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/v1/detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.54.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
* Done waiting for 100-continue
< HTTP/1.1 200 OK
< Connection: keep-alive
< Keep-Alive: 5
< Content-Length: 13
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
done waiting!
real	0m1.028s

curl testing with a similar Golang/go-chi based app that returns 100-continue:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.54.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Date: Wed, 22 May 2019 22:40:40 GMT
< Content-Length: 38
< Content-Type: text/plain; charset=utf-8
<
{
    "predictions": [0.911111]
* Connection #0 to host 127.0.0.1 left intact
}
real	0m0.040s

Expected behavior
I'd expect all small POSTs to take less than 50 ms.

Environment (please complete the following information):

  • OS: MacOS High Sierra 10.13.6, Python 3.7.3
  • Sanic Version 19.3.1
@ahopkins
Member

Which version of Sanic did you run this on?

@Leonidimus
Author

Sanic version 19.3.1. As was mentioned in #1535, the correct behavior would be to honor the client's request for 100-Continue. Python's requests lib doesn't expect 100-continue, but curl does, and there may be other clients that do. I'm also going to test with a JavaScript request lib.

@ahopkins
Member

@Leonidimus Sorry, I didn't see you had provided the details in your original post. I edited it to move the ``` so that it would format properly.

I am taking a look at this also as it relates to ASGI #1475.

@ahopkins
Member

ahopkins commented May 23, 2019

My machine:

Python 3.7.3
5.1.3-arch1-1-ARCH
AMD Ryzen 5 2600 Six-Core Processor

My results when I run your same code. Granted, I am not sure how big your small.txt is; mine is only 13 B.

time curl -vvv -H "Transfer-Encoding: chunked"  --data-binary @small.txt  http://127.0.0.1:8000/api/v1/detect
Warning: Couldn't read data from file "small.txt", this makes an empty POST.
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/v1/detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.64.1
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 5 out of 0 bytes
< HTTP/1.1 200 OK
< Connection: keep-alive
< Keep-Alive: 5
< Content-Length: 13
< Content-Type: text/plain; charset=utf-8
< 
* Connection #0 to host 127.0.0.1 left intact
done waiting!
* Closing connection 0
curl -vvv -H "Transfer-Encoding: chunked" --data-binary @small.txt   0.00s user 0.00s system 84% cpu 0.004 total

Regardless, I am not seeing the long delay times on the Sanic server that you are.

I also tried the test using the ASGI servers uvicorn and hypercorn.

uvicorn server:app

It does not work out of the box without sending the Expect: 100-continue header (for now; I still have some work to do on streaming). But if I add the Expect header to the request, I get the 100 Continue interim response.

time curl -vvv -H "Transfer-Encoding: chunked" -H "Expect: 100-continue"  --data-binary @small.txt  http://127.0.0.1:8000/api/v1/detect
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/v1/detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.64.1
> Accept: */*
> Transfer-Encoding: chunked
> Expect: 100-continue
> Content-Type: application/x-www-form-urlencoded
> 
< HTTP/1.1 100 Continue
* Signaling end of chunked upload via terminating chunk.
< HTTP/1.1 200 OK
< date: Thu, 23 May 2019 07:29:10 GMT
< server: uvicorn
< content-length: 13
< 
* Connection #0 to host 127.0.0.1 left intact
done waiting!
* Closing connection 0
curl -vvv -H "Transfer-Encoding: chunked" -H "Expect: 100-continue"     0.01s user 0.00s system 87% cpu 0.008 total

Same results with hypercorn

hypercorn server:app

time curl -vvv -H "Transfer-Encoding: chunked" -H "Expect: 100-continue"  --data-binary @small.txt  http://127.0.0.1:8000/api/v1/detect
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/v1/detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.64.1
> Accept: */*
> Transfer-Encoding: chunked
> Expect: 100-continue
> Content-Type: application/x-www-form-urlencoded
> 
< HTTP/1.1 100 
< date: Thu, 23 May 2019 07:29:46 GMT
< server: hypercorn-h11
* Signaling end of chunked upload via terminating chunk.
< HTTP/1.1 200 
< content-length: 13
< date: Thu, 23 May 2019 07:29:46 GMT
< server: hypercorn-h11
< 
* Connection #0 to host 127.0.0.1 left intact
done waiting!
* Closing connection 0
curl -vvv -H "Transfer-Encoding: chunked" -H "Expect: 100-continue"     0.00s user 0.00s system 54% cpu 0.006 total

As a side note, hypercorn is also okay without the Expect header:

time curl -vvv -H "Transfer-Encoding: chunked" --data-binary @small.txt  http://127.0.0.1:8000/api/v1/detect                           
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/v1/detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.64.1
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> 
> e
* upload completely sent off: 21 out of 14 bytes
< HTTP/1.1 200 
< content-length: 13
< date: Thu, 23 May 2019 07:30:30 GMT
< server: hypercorn-h11
< 
* Connection #0 to host 127.0.0.1 left intact
done waiting!
* Closing connection 0
curl -vvv -H "Transfer-Encoding: chunked" --data-binary @small.txt   0.00s user 0.00s system 79% cpu 0.006 total

So, when 19.6 is released (assuming we complete ASGI support by then), this will be "fixed" by using one of the ASGI servers.

The questions that I believe still need to be answered:

  1. Should we add this to the Sanic server?
  2. Should there be a 100 Continue response even if the client does not request it?

My thoughts are that (1) yes, we should respond to Expect: 100-continue, and (2) no, we should not add the response if it is not requested.

I could be wrong on point 2, and I am open to debate, but my reading of RFC 7231 is that the response is not required if it is not requested.

6.2.1. 100 Continue

The 100 (Continue) status code indicates that the initial part of a
request has been received and has not yet been rejected by the
server. The server intends to send a final response after the
request has been fully received and acted upon.

When the request contains an Expect header field that includes a
100-continue expectation, the 100 response indicates that the server
wishes to receive the request payload body, as described in
Section 5.1.1. The client ought to continue sending the request and
discard the 100 response.

If the request did not contain an Expect header field containing the
100-continue expectation, the client can simply discard this interim
response.

--

I am removing the Bug label because I do not think this is a bug per se; it is more of a feature request.
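
For illustration, point (1) amounts to writing an interim response as soon as the request headers have been parsed. A minimal sketch of what that could look like (the hook and attribute names here are hypothetical, not Sanic's actual internals):

EXPECT_CONTINUE = b"HTTP/1.1 100 Continue\r\n\r\n"

def on_headers_complete(self):
    # Hypothetical hook, called once the request headers have been parsed.
    expect = self.request.headers.get("expect", "")
    if self.request.version == "1.1" and expect.lower() == "100-continue":
        # Tell the client to go ahead and send the body; the handler's
        # final response still follows once the body has been received.
        self.transport.write(EXPECT_CONTINUE)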

@sjsadowski
Contributor

It should also be noted that in the last go-round with "Sanic is slow when I test it with curl", the culprit ended up being curl; when people tested with other methods the slowness could not be reproduced, but it could be reproduced with curl.

@yunstanford
Member

yunstanford commented May 23, 2019

We could probably add this support.

@Leonidimus
Author

@ahopkins Thanks for quick responses! Now that I see the project is actively supported, I can safely continue using the framework :)

My small.txt file was about 1.1K. The issue doesn't happen for really tiny files. I agree it's a client-specific issue (curl), but it would still be correct behavior to respect the client's request for 100-continue.

BTW, I tested with a popular JS library and a 22 MB file, and it worked very quickly.
Code:

const request = require('request-promise');
const fs = require('fs');

const start = Date.now();
fs.createReadStream('terraform_0.11.13_darwin_amd64.zip')
    .pipe(request.post('http://127.0.0.1:8000/api/v1/detect'))
    .then( result => {
        console.log(result);
        console.log("Elapsed time, ms:", Date.now()-start);
    });

Output:

$ node request.js
done waiting!
Elapsed time, ms: 59
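
For completeness, roughly the same client test can be driven from Python (a sketch assuming the requests library is installed; passing a generator forces chunked transfer encoding, and requests does not send Expect: 100-continue, so it should not hit curl's wait):

import time
import requests

def chunks(path, size=64 * 1024):
    # Yield the file in blocks so requests uses chunked transfer encoding.
    with open(path, "rb") as f:
        while True:
            block = f.read(size)
            if not block:
                return
            yield block

start = time.time()
resp = requests.post(
    "http://127.0.0.1:8000/api/v1/detect",
    data=chunks("terraform_0.11.13_darwin_amd64.zip"),
)
print(resp.text)
print("Elapsed time, ms:", int((time.time() - start) * 1000))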

@ahopkins
Member

ahopkins commented May 23, 2019

Now that I see the project is actively supported, I can safely continue using the framework :)

There is a core team of developers working on Sanic as a community. Part of our reasoning for moving to the community-supported model was to foster an ongoing group of developers so that it would stay active. I am glad you feel this way 😄

I updated my small.txt to a larger file. It still runs OK for me. 🤔 If anyone else has any experience, I'd be interested to hear it.

╭─adam@thebrewery ~/Projects/Sanic/playground/issue1587  
╰─$ ls -l
drwxr-xr-x  adam  users       -   Thu May 23 17:23:34 2019    __pycache__
.rw-r--r--  adam  users   516 B   Thu May 23 17:23:34 2019    server.py
.rw-r--r--  adam  users  29.9 MB  Thu May 23 23:08:42 2019    small.txt
╭─adam@thebrewery ~/Projects/Sanic/playground/issue1587  
╰─$ time curl -vvv -H "Transfer-Encoding: chunked" -H "Expect: 100-continue"  --data-binary @small.txt  http://127.0.0.1:8000/api/v1/detect
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/v1/detect HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.64.1
> Accept: */*
> Transfer-Encoding: chunked
> Expect: 100-continue
> Content-Type: application/x-www-form-urlencoded
> 
< HTTP/1.1 100 Continue
* Signaling end of chunked upload via terminating chunk.
< HTTP/1.1 200 OK
< date: Thu, 23 May 2019 20:08:49 GMT
< server: uvicorn
< content-length: 13
< 
* Connection #0 to host 127.0.0.1 left intact
done waiting!
* Closing connection 0
curl -vvv -H "Transfer-Encoding: chunked" -H "Expect: 100-continue"     0.01s user 0.02s system 54% cpu 0.042 total

@yunstanford 💪

@ahopkins ahopkins added this to the 19.9 milestone May 23, 2019
@ahopkins
Member

I added this to 19.9, but if anyone thinks they can handle providing 100 Continue responses before then, maybe we can get it into 19.6.

@yunstanford yunstanford self-assigned this May 28, 2019
@LTMenezes
Contributor

It should also be noted that the last go round with 'sanic is slow when I test it with curl' the culprit ended up being curl

Just to give more insight, libcurl by default waits 1 second for a 100 response before timing out and continuing the request. There's a thread about this behaviour: https://curl.haxx.se/mail/lib-2017-07/0013.html
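
For anyone who wants to see that wait without curl, here is a rough Python illustration of the client-side behaviour (raw socket; not libcurl's actual code path, and the host, port, and payload are just placeholders matching the test above): send the headers with Expect: 100-continue, wait up to one second for an interim response, then send the body anyway.

import socket
import time

HOST, PORT = "127.0.0.1", 8000
# One 5-byte chunk ("hello") followed by the terminating zero-length chunk.
BODY = b"5\r\nhello\r\n0\r\n\r\n"

with socket.create_connection((HOST, PORT)) as sock:
    sock.sendall(
        b"POST /api/v1/detect HTTP/1.1\r\n"
        b"Host: 127.0.0.1:8000\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"Expect: 100-continue\r\n"
        b"\r\n"
    )
    sock.settimeout(1.0)  # libcurl's default wait for the interim response
    start = time.monotonic()
    try:
        interim = sock.recv(4096)  # ideally b"HTTP/1.1 100 Continue\r\n\r\n"
    except socket.timeout:
        interim = b""  # nothing arrived within 1 s; send the body anyway
    print("waited %.3fs, interim: %r" % (time.monotonic() - start, interim))
    sock.sendall(BODY)
    sock.settimeout(5.0)
    print(sock.recv(4096).decode(errors="replace"))

As a client-side workaround, curl can also be told not to send the header at all by passing an empty one: -H "Expect:".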

@ahopkins Is anyone currently working on this? I would be glad to help.

@ahopkins
Member

ahopkins commented May 28, 2019

@LTMenezes Would be happy to have you help take a stab at this.

Check with @yunstanford, looks like he self-assigned this so he may already be working on it.

@yunstanford yunstanford modified the milestones: 19.9, 19.6 Jun 6, 2019
seemethere added a commit that referenced this issue Jun 10, 2019
Fix #1587: add support for handling Expect Header