
Optimize initial size in ContentCachingRequestWrapper when contentCacheLimit set #29775

Conversation

@ryanrupp (Contributor) commented Jan 5, 2023

To avoid over-allocating the initial buffer for content caching:

  1. If the content size is known and is smaller than the content limit, use that
  2. If the content size is unknown use the default initial buffer size (or content limit if smaller) and allow it to grow as needed

This supports scenarios where a limit is desired to guard against overly large caching, but the majority of requests are nowhere near that limit. For example, you may allow up to 1MB of content caching while most requests are in the single-digit-KB range. Rather than allocating the worst-case 1MB byte array per request, the buffer can be sized more appropriately.
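The two-rule heuristic described above can be sketched as follows (a hypothetical helper for illustration, not the actual Spring implementation; the constant and method names are assumptions):

```java
// Sketch of the proposed initial-size heuristic (hypothetical helper).
public class InitialSizeHeuristic {

    static final int DEFAULT_INITIAL_SIZE = 1024;

    // contentLength < 0 means the request did not declare a Content-Length.
    static int initialSize(int contentLength, int contentCacheLimit) {
        if (contentLength >= 0) {
            // Rule 1: known size, capped at the cache limit.
            return Math.min(contentLength, contentCacheLimit);
        }
        // Rule 2: unknown size, start small and let the buffer grow.
        return Math.min(DEFAULT_INITIAL_SIZE, contentCacheLimit);
    }

    public static void main(String[] args) {
        // Known 8 KB body with a 1 MB limit: allocate just 8 KB.
        System.out.println(initialSize(8 * 1024, 1024 * 1024)); // 8192
        // Unknown size with a 1 MB limit: start at the 1024-byte default.
        System.out.println(initialSize(-1, 1024 * 1024));       // 1024
        // Unknown size with a tiny limit: cap at the limit.
        System.out.println(initialSize(-1, 50));                 // 50
    }
}
```

With this logic, the 1MB worst case is only ever allocated up front when a request actually declares a 1MB (or larger) body.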

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Jan 5, 2023
@ryanrupp (Contributor, Author) commented Jan 5, 2023

Ran into this because we're using this similarly to AbstractRequestLoggingFilter, except with slightly different formatting (we just wanted to capture the contents), and we're trying to capture up to a larger size (say 1MB) while safeguarding ourselves against overly large requests. These large requests are an edge case, though, so we want to avoid eagerly allocating the content limit size, since that would over-allocate quite a bit in most cases. Alternatively, there could be another overloaded constructor that takes the initial buffer size, but that seemed like exposing too much implementation detail.

The only real downside of this is that for requests of unknown size, the initial buffer will now be the smaller of 1024 or the content size limit. In theory, someone who set the limit to something larger and knew most of their requests would reach that size will now see some byte buffer resizing. Spring's FastByteArrayOutputStream could be used if this is a concern. AbstractRequestLoggingFilter currently defaults the content size limit to 50 bytes, so it would always take that over the 1024 default for unknown sizes.

```java
 * would cause an OOM Error likely otherwise.
 */
@Test
void cachedContentWithLimitKnownLengthAvoidOverAllocation() throws Exception {
```
@ryanrupp (Contributor, Author) commented on the test, Jan 5, 2023
This is a bit awkward to test without exposing some internals in a package private method or something (maybe reflection to grab the byte array size). Basically, using a really large content limit before this change could cause an OOM Error if you ran this test with < 2GB heap or so. After this change it wouldn't use the 2GB limit, it would just allow growing to it (but the tests don't exercise growing to it as that's not relevant here).
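One way to check the allocated size without reflection: `ByteArrayOutputStream` exposes its backing array as the protected field `buf`, so a test-local subclass can report its capacity directly. This is a sketch of that idea, not the actual test from this PR; the class names are made up:

```java
import java.io.ByteArrayOutputStream;

// Test-local subclass: `buf` is protected in ByteArrayOutputStream, so a
// subclass can report the allocated capacity without reflection.
class InspectableBuffer extends ByteArrayOutputStream {

    InspectableBuffer(int initialSize) {
        super(initialSize);
    }

    int capacity() {
        return this.buf.length;
    }
}

public class BufferCapacityDemo {
    public static void main(String[] args) {
        InspectableBuffer out = new InspectableBuffer(1024);
        System.out.println(out.capacity()); // 1024: only the initial size is allocated

        // Writing past the initial capacity grows the backing array on demand.
        out.write(new byte[8192], 0, 8192);
        System.out.println(out.capacity()); // now at least 8192
    }
}
```

An assertion on `capacity()` after wrapping a known-length request would then verify that a huge contentCacheLimit no longer inflates the initial allocation.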

@sbrannen sbrannen added in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement labels Jan 31, 2023
@sbrannen sbrannen marked this pull request as draft January 31, 2023 15:25
@rstoyanchev (Contributor) left a comment

I suppose the higher one needs to set contentCacheLimit for outliers, the less suitable it is as an initial size for the average request body.

The change seems reasonable to me, treating contentCacheLimit mostly as an upper boundary, but otherwise using the same default for an initial size. Or we could add one more constructor that takes defaultInitialSize in addition.

I would like more opinions from other team members.

@rstoyanchev rstoyanchev changed the title Optimize the initial buffer size when using content cache limiting Optimize initial size in ContentCachingRequestWrapper when contentCacheLimit set Feb 21, 2023
@bclozel bclozel marked this pull request as ready for review March 8, 2023 08:42
@bclozel (Member) commented Mar 8, 2023

I think this change is a good tradeoff, avoiding over-allocation in many cases. If it causes too much resizing, we can always consider using FastByteArrayOutputStream, as pointed out by @ryanrupp. Maybe we should do that right away? With FastByteArrayOutputStream we could even drop the initial capacity value, since allocating more is very efficient.

As for the tests, I'm not convinced we should use the OOM approach to validate the changes. I'd rather use ReflectionUtils to get the backing cachedContent and check its actual allocated size in the test.

@ryanrupp (Contributor, Author) commented Mar 8, 2023

@bclozel

  1. I'm good with moving to FastByteArrayOutputStream; I'm mainly just less familiar with this class (I know it exists) and not sure if there are any tradeoffs around its usage.
  2. I'm good with using ReflectionUtils to grab internal state for the test assertions; TBD how this changes, or whether it's still necessary, if using FastByteArrayOutputStream (probably still some internal state that could be confirmed on it).

Will play with this when I get a chance, thanks.

@bclozel bclozel self-assigned this Sep 11, 2023
@bclozel bclozel removed the status: waiting-for-triage An issue we've not yet triaged or decided on label Sep 11, 2023
@bclozel bclozel added this to the 6.1.0-RC1 milestone Sep 11, 2023
@bclozel bclozel closed this in 6de0be1 Sep 11, 2023
bclozel added a commit that referenced this pull request Sep 11, 2023
This commit replaces the content caching buffer with a
`FastByteArrayOutputStream`. With this variant, allocations are cheap
and we no longer need heuristics to guess the best initial buffer size.

See gh-29775
@bclozel (Member) commented Sep 11, 2023

Thanks @ryanrupp for this PR. I've polished it a bit and used the FastByteArrayOutputStream as a way to make allocations cheap and avoid the initial buffer size issue altogether. As for tests, it was not possible to reflect on ByteArrayOutputStream to test for the internal buffer size. With the latest change, it's not really necessary anymore.

This will be released with the first 6.1 release candidate.

@lonre (Contributor) commented Nov 15, 2023

Hi @bclozel, can we port f83c609 to the 5.3.x branch?

@bclozel (Member) commented Nov 15, 2023

@lonre Is this causing significant problems in your applications? This is a light performance improvement, and we don't usually backport those to maintenance branches, as only bug fixes apply there.

@lonre (Contributor) commented Nov 16, 2023

@bclozel Yes. According to JFR recordings (screenshots attached in the original comment), ContentCachingRequestWrapper seems to allocate a lot of memory.

@bclozel (Member) commented Nov 16, 2023

Thanks for your response. Allocations by the ContentCachingRequestWrapper are very much expected; here, we're just optimizing for the case where the actual request body is smaller than the default. Backporting such a change to a long-standing maintenance branch is too risky, so I'm afraid upgrading to 6.1 is the best way to get this change.

@kilink (Contributor) commented Dec 12, 2023

Could we reconsider aspects of this change? I think switching to FastByteArrayOutputStream is good for the case where the content size is unknown or there is no contentCacheLimit; however, when the content size is known, it seems to me to always be preferable to size the backing array precisely. I assume most usage of ContentCachingRequestWrapper will involve either calling getContentAsByteArray or getContentAsString, both of which will perform better if the backing buffer is either larger than or precisely the size of the contents.

Example scenario:

We have a 10KB request payload, and the content size is known. If we pass this size to the FastByteArrayOutputStream, no "collation" is needed if the user then calls getContentAsString or getContentAsByteArray. If we go with the current approach of passing in a default size of 1024, we'll end up with several buffers that must be combined (which happens in the resize method of FastByteArrayOutputStream).

In practice we do have some services that have such large request / response payloads, so I think this may be a regression in practice from simply using a ByteArrayOutputStream with a precisely sized buffer.
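The "collation" cost kilink describes can be illustrated with a minimal model of a segmented buffer. This is a deliberately simplified, hypothetical class in the spirit of Spring's FastByteArrayOutputStream, not its actual implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal model of a segmented output buffer: growth adds a new segment
// (no copying on write), but toByteArray() must collate all segments.
// Illustration only, NOT Spring's FastByteArrayOutputStream.
class SegmentedBuffer {

    private final Deque<byte[]> segments = new ArrayDeque<>();
    private int nextSegmentSize;
    private int usedInLast; // bytes written into the last segment
    private int total;      // total bytes written

    SegmentedBuffer(int initialSegmentSize) {
        this.nextSegmentSize = initialSegmentSize * 2;
        this.segments.add(new byte[initialSegmentSize]);
    }

    void write(byte[] data) {
        int off = 0;
        while (off < data.length) {
            byte[] last = segments.peekLast();
            if (usedInLast == last.length) {
                segments.add(new byte[nextSegmentSize]); // grow without copying
                nextSegmentSize *= 2;
                usedInLast = 0;
                continue;
            }
            int n = Math.min(data.length - off, last.length - usedInLast);
            System.arraycopy(data, off, last, usedInLast, n);
            usedInLast += n;
            total += n;
            off += n;
        }
    }

    int segmentCount() {
        return segments.size();
    }

    // Collation: every segment must be copied into one contiguous array.
    byte[] toByteArray() {
        byte[] result = new byte[total];
        int pos = 0;
        for (byte[] segment : segments) {
            int n = Math.min(segment.length, total - pos);
            System.arraycopy(segment, 0, result, pos, n);
            pos += n;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[10 * 1024]; // 10 KB request body

        // Default-sized first segment: several segments accumulate, so
        // reading the content back requires collating them.
        SegmentedBuffer small = new SegmentedBuffer(1024);
        small.write(payload);
        System.out.println(small.segmentCount()); // 4

        // Known content size: one precisely sized segment, no collation.
        SegmentedBuffer sized = new SegmentedBuffer(payload.length);
        sized.write(payload);
        System.out.println(sized.segmentCount()); // 1
    }
}
```

This is why passing the known content length as the first segment size, as kilink suggests, keeps getContentAsByteArray on the cheap single-segment path.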

@bclozel (Member) commented Dec 13, 2023

@kilink what do you think about this change?

@kilink (Contributor) commented Dec 13, 2023

> @kilink what do you think about this change?

@bclozel That looks good to me. Thanks!

bclozel added a commit that referenced this pull request Dec 13, 2023
This commit builds on top of changes made in gh-29775 and gh-31737.
Before this change, we would allocate several byte arrays even in cases
of known request size. This could decrease performance when getting the
cached content as it requires merging several arrays and data is not
colocated in memory.

This change ensures that we create a `FastByteArrayOutputStream`
instance with the known request size so that the first allocated segment
can contain the entire content.
If the request size is not known, we fall back to the default
allocation size for the `FastByteArrayOutputStream`.

Closes gh-31834