
Ability for AsgiHandler and AsgiRequest to process the request body in chunks. #1269

Closed

Conversation

hozblok
Contributor

@hozblok hozblok commented Mar 26, 2019

#1001

To prevent the request body from being loaded completely into RAM, I suggest adding io.IOBase stream support to the AsgiRequest class. django.http.HttpRequest already supports working with streams, so we only have to construct the stream correctly and implement a read() method for Django.

If loading of the request body terminates early, because of a disconnect or an error (for example, exceeding the allowed file size), we must close the stream correctly and still allow a response to be sent if necessary.

The error-handling mechanisms and response generation are left unchanged. The ability to work with chunks is essential for file uploads, which are usually implemented as a POST request with multipart/form-data.
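
For context, this is the ASGI pattern the stream wraps (an illustrative sketch with assumed names, not the code in this PR): the protocol server delivers the body as a series of http.request events, and each chunk can be handled as it arrives instead of being joined into a single bytes object in memory.

    # Illustrative sketch only (assumed names, not this PR's code).
    async def iter_body_chunks(receive):
        """Yield request body chunks as the protocol server delivers them."""
        more_body = True
        while more_body:
            message = await receive()  # one ASGI ``http.request`` event
            yield message.get("body", b"")
            more_body = message.get("more_body", False)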

Thanks in advance for your feedback. I'm interested in making it better.

@hozblok hozblok requested a review from carltongibson as a code owner March 26, 2019 21:01
@carltongibson
Member

Hi @hozblok. Super thanks. I'm already working on this in #1251 so I shall compare.

@hozblok
Contributor Author

hozblok commented Mar 27, 2019

Hi @carltongibson. Great. Please, feel free to ask anything about implementation.

@carltongibson
Member

Hi @hozblok. Good stuff. Help me understand it... 🙂

So can you tell me why we need the more complex task-based approach to consuming the receive callable, rather than (essentially) just await-ing it in read(), as I have going in #1251? (What sort of test case shows the simpler approach to be inadequate?)

Make sense?

Thanks.

@hozblok
Contributor Author

hozblok commented Mar 28, 2019

Hi @carltongibson

  1. I'm looking at your implementation now. I'm ready to admit that your way (passing receive to the stream) looks simpler and more suitable in this case, although I have not tested your version yet.

  2. Isn't a truncate() call needed for self.buffer? Otherwise, will the chunks stay in RAM until the buffer is truncated?

  3. Since RequestBodyWrapper is a file-like object, I would inherit it from io.IOBase, io.RawIOBase, or io.BufferedIOBase. That way we automatically get the ability to close the stream, the inherited readline() method, and the context-manager interface (see the sketch after this list).

  4. In def __init__(self, scope, body): I propose renaming body -> stream.

  5. Why did you decide to remove the check?

        # Limit the maximum request data size that will be handled in-memory.
        if (
            settings.DATA_UPLOAD_MAX_MEMORY_SIZE is not None
            and self._content_length > settings.DATA_UPLOAD_MAX_MEMORY_SIZE
        ):
            raise RequestDataTooBig(
                "Request body exceeded settings.DATA_UPLOAD_MAX_MEMORY_SIZE."
            )
  6. Perhaps an interesting test case: self._content_length does not match the size of the body actually transferred. This should raise an exception, and we should abort the upload correctly.
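
A minimal sketch for point 3 (class and method names are assumptions, not the PR's actual code): extending io.RawIOBase and implementing readinto() is enough to inherit read(), readline(), close(), and the context-manager protocol from the standard library.

    import io

    class RequestBodyWrapper(io.RawIOBase):
        """Sketch only: a fixed in-memory body stands in for the real chunked source."""

        def __init__(self, data: bytes):
            self._data = memoryview(data)
            self._pos = 0

        def readable(self):
            return True

        def readinto(self, b):
            # Copy the next slice of the body into the caller's buffer.
            chunk = self._data[self._pos:self._pos + len(b)]
            b[:len(chunk)] = chunk
            self._pos += len(chunk)
            return len(chunk)

    # read(), readline(), close(), and ``with RequestBodyWrapper(...) as f:``
    # now come for free from io.RawIOBase / io.IOBase.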

@carltongibson
Member

So the idea was that if we can just make receive into a (sync) file-like object, then we can basically leverage django.http.request.HttpRequest rather than duplicating all of that functionality. That's also why we remove the settings.DATA_UPLOAD_MAX_MEMORY_SIZE check: Django already does it (and its implementation is much more battle-tested, so we won't hit issues like #1240).
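
A minimal sketch of that idea, assuming the Django request is handled in a worker thread (the names here are invented for illustration, not the actual #1251 code): bridge the async receive callable into a synchronous read() that HttpRequest can consume, so Django's own body handling, including the DATA_UPLOAD_MAX_MEMORY_SIZE check, applies unchanged.

    from asgiref.sync import async_to_sync

    class ReceiveStream:
        """Sketch only: a sync, file-like facade over the ASGI receive callable."""

        def __init__(self, receive):
            self._receive = receive
            self._buffer = b""
            self._more_body = True

        def read(self, size=-1):
            # Pull chunks until we have enough data or the body is complete.
            while self._more_body and (size < 0 or len(self._buffer) < size):
                message = async_to_sync(self._receive)()
                self._buffer += message.get("body", b"")
                self._more_body = message.get("more_body", False)
            if size < 0:
                data, self._buffer = self._buffer, b""
            else:
                data, self._buffer = self._buffer[:size], self._buffer[size:]
            return data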

I agree with your other points. RequestBodyWrapper should probably be named something like RequestIO and extend one of the io base classes, as you say, and so on.

Fancy jumping onto #1251 and co-authoring it with me? I have time scheduled for the sprints at DjangoCon Europe but if you have capacity we could get it finished and a release out before then?

The code side is nearly there (it just needs the _read_started checks removed, as per the comments), and the changes you've suggested here make sense. Then it's just a matter of tidying the tests and so on (the pattern of creating the wrapper in the tests could be factored better, etc.). Let me know; I'm happy to provide input if you can assist! Thanks.

@hozblok
Contributor Author

hozblok commented Mar 29, 2019

Of course, I'm ready to participate. Let's finish it together.

The scope of work is clear. First I will reread the entire history of the ticket more carefully. 🙂

@hozblok
Contributor Author

hozblok commented Apr 2, 2019

Hi @carltongibson,

Just in case, I will duplicate the new PR here:
carltongibson#1

@carltongibson
Member

Super. Thanks. I’ll have a look tomorrow. Good work!

@carltongibson
Member

Closing in favour of #1251 — Thanks again @hozblok!
