Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reverseproxy: only change the content-length header when the full request was buffered #5830

Merged
merged 1 commit into from
Jan 9, 2024

Conversation

mcfedr
Copy link
Contributor

@mcfedr mcfedr commented Sep 20, 2023

fixes: #5829

currently if a request body was longer than request_buffers the content-length header sent to the upstream will be wrong, it will be the length of the buffer.

With this change, only in the case that the full request was read into the buffer, because it was set to unlimited length then the content-length is changed

@mholt
Copy link
Member

mholt commented Sep 21, 2023

Ohh I see, this is a good point. Thanks for finding this.

So I guess we need to set Content-Length to keep php-fpm (and other upstreams) from hanging. We only know the ContentLength in these cases:

  • Buffer size is -1 (unlimited)
  • Buffer is larger than body

We don't know it in these cases:

  • Buffer is smaller than body
  • There's an error reading the body up to the buffer size

In the first two cases, we should be setting ContentLength.

This change sets ContentLength if buffer size is unlimited, but not if a limited-size buffer contains the entire body. Have you tested and accounted for that case as well?

@mcfedr
Copy link
Contributor Author

mcfedr commented Sep 21, 2023

in the case of 'Buffer is larger than body' - I did think about it - and in the current version, it wont set the content-length - but my thinking is this - if you are relying on the content-length always being set and the body being buffered - then using anything other than an unlimited buffer is going to cause you problems - as the buffer size doesnt limit the size of the request, and these other bytes will be in the body stream.

I would suggest for php-fpm you want to use unlimited buffer size, and possible limit the request size using the setting from request_body

@mholt
Copy link
Member

mholt commented Sep 21, 2023

Sorry, I don't think I follow:

in the case of 'Buffer is larger than body' - I did think about it - and in the current version, it wont set the content-length - but my thinking is this - if you are relying on the content-length always being set and the body being buffered - then using anything other than an unlimited buffer is going to cause you problems - as the buffer size doesnt limit the size of the request, and these other bytes will be in the body stream.

We're talking about the case where the ENTIRE body fits and is read into the buffer, right? There's no bytes remaining on the wire, and we know the exact ContentLength. I think in that case we could set the ContentLength (esp. if it's not set already).

Now if the body size does exceed the buffer size, then it's true: we don't know the full length and probably shouldn't change the ContentLength.

Does that make sense?

@mcfedr
Copy link
Contributor Author

mcfedr commented Sep 22, 2023

We're talking about the case where the ENTIRE body fits and is read into the buffer, right? There's no bytes remaining on the wire, and we know the exact ContentLength. I think in that case we could set the ContentLength (esp. if it's not set already).

You are completely right - and it could be the right way to go - but - my thought it that if Caddy does that it allows you to create a dangerous situation.

  • if php-fpm requires content-length to be set - so you enable buffering, and now its set - but as soon as someone sends your server a request that is longer than your buffer, php-fpm is going to crash.
  • so if you are required to have a content-length header, you must have an unlimited size buffer
  • so maybe its best to only 'fix' the content-length when you set the buffer to unlimited

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! That does make sense 🤔

I guess we can try it and see how it goes -- I suppose this is still better than what we currently have. :)

If @francislavoie is good with it then we can merge this in.

@mcfedr
Copy link
Contributor Author

mcfedr commented Oct 2, 2023

interesting thing I've noticed going back to the original issue (#5420) from php-fpm - nginx will fully buffer requests for fpm backends by default, https://nginx.org/en/docs/http/ngx_http_fastcgi_module.html?#fastcgi_request_buffering and i actually cannot see an option to limit the request buffer size (there are options for response buffers only)

@francislavoie
Copy link
Member

/cc @trea in case you want to take a look at this, since you dug quite deep into the behaviour last time we touched this area.

@mholt mholt added help wanted 🆘 Extra attention is needed under review 🧐 Review is pending before merging labels Oct 3, 2023
@mcfedr
Copy link
Contributor Author

mcfedr commented Oct 11, 2023

Just rebased to keep up with master.

Is there anything I can do to help get this merged? I'd love to switch back to a release build of caddy

@francislavoie
Copy link
Member

I'd still like to hear from @trea to make sure it still works for their usecase with this change. I don't want to radio-button between two different usecases being either working or broken.

@trea
Copy link
Contributor

trea commented Oct 11, 2023

Thanks Francis, I will check this out today and get an update to you all. 🤝

@ueffel
Copy link
Contributor

ueffel commented Oct 12, 2023

I had a similar issue with the cgi plugin. In order to set the correct Content-Length, which is required by the CGI protocol, the whole body of the request had to be buffered. An additional challenge was that the original http requests had Transfer-Encoding: chunked, so there was no Content-Length initially.
I had a look how nginx deals with this problem. Of course: request buffering. But how to limit memory usage?

so if you are required to have a content-length header, you must have an unlimited size buffer

Theoretically yes

nginx will fully buffer requests for fpm backends by default

Yes, but nginx buffers request bodies in temporary files. I don't know if there even is buffering in memory.1

To solve the problem for the cgi plugin, I implemented a hybrid solution: Buffer in memory if it fits within the limits. If it doesn't, buffer in a temporary file.

Maybe this idea could be implemented here as well? It would limit the memory usage and use hard disk space temporarily, which is typically more available :)

What would interest me: How does caddy deal with requests that are Transfer-Encoding: chunked, when reverse proxying especially with a fastcgi transport?

Footnotes

  1. I looked at the nginx documentation again. It does memory buffering up to a limit as well.

@mholt
Copy link
Member

mholt commented Dec 13, 2023

@ueffel That's interesting. Yeah, writing to files could be a thing, but then we risk exhausting disk space 🤷‍♂️ AND it's basically a swap file, so there's a huge performance hit by orders of magnitude. If enough people would find it useful, though, perhaps we could support it and make it opt-in...

@mcfedr
Copy link
Contributor Author

mcfedr commented Jan 4, 2024

@mholt what would it take to get this PR merged? The current behaviour is buggy, and this should fix it. There is clearly room for future work, but it would appear to not be a blocker

@mholt
Copy link
Member

mholt commented Jan 8, 2024

@mcfedr I was mainly waiting for @trea to to make sure it works for them, due to the sensitivity of this change.

@trea
Copy link
Contributor

trea commented Jan 8, 2024

Sorry friends, I did try to verify this before but I wasn't able to recreate the previous circumstances in my test setup so I didn't have any confidence in being able to verify or not. I should have updated this thread at that time. I'd recommend moving forward with this and hope you'll accept my apologies!

@mholt
Copy link
Member

mholt commented Jan 9, 2024

No worries, thanks for letting us know :)

I don't see any immediate issues with this change other than the "what if the body fits within the limited-size buffer" case, which was discussed above. We can try it this way for now -- it's still an improvement probably -- and hopefully @mcfedr can help us patch it if needed later 😃

@mholt mholt merged commit d9ff7b1 into caddyserver:master Jan 9, 2024
23 checks passed
@mholt mholt removed help wanted 🆘 Extra attention is needed under review 🧐 Review is pending before merging labels Jan 9, 2024
@mholt mholt added this to the v2.7.7 milestone Jan 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reverseproxy: content-length is wrong when the request doesnt fit in the request_buffer
5 participants