Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize parsing headers for pekko-http and akka-http #3575

Merged
merged 5 commits into from
Mar 8, 2024

Conversation

kciesielski
Copy link
Member

@kciesielski kciesielski commented Mar 6, 2024

Fixes #3544

CPU samples registered in profiler:
Before:

sttp.tapir.server.pekkohttp.PekkoToResponseBody.parseContentType(String) 3.53%
sttp.tapir.server.pekkohttp.PekkoModel$.parseHeadersOrThrowWithoutContentHeaders(HasHeaders) 2.13%

After:

sttp.tapir.server.pekkohttp.PekkoToResponseBody.parseContentType(String) 0.18%
sttp.tapir.server.pekkohttp.PekkoModel$.parseHeadersOrThrowWithoutContentHeaders(HasHeaders) 0.14%

Throughput in SimpleGet Simulation for 128 users:
Before:
image
After:
image
(Yes, the timestamps are OK, I ran the test on the optimized server first :))
Note:
The issue mentions one more hot frame:

sttp.tapir.server.pekkohttp.PekkoServerRequest.acceptsContentTypes() 2.4%

It's a common issue caused by the base ServerRequest.acceptContentTypes calling Accepts.parse from sttp-model, involving some pattern matching. I tried two approaches, without great results:

  1. A low-level string-based parser instead of using sttp-model. This lowered CPU samples to ~1.8%, which isn't a great gain, while the code was pretty complex and didn't cover some error corner cases.
  2. Switching to Pekko's HttpHeader.parse which is more efficient, then repacking Pekko datatypes to Tapir's. This lowered the load to ~1.9%. Again, doesn't seem worthwhile, the repacking was also pretty complex and error prone, it probably involved unnecessary allocations as well. HttpHeader.parse is not very efficient by itself, because internally it calls
val parser = new HeaderParser(value, settings)

every time. The full power of HeaderParser could be leveraged if the parser was instantiated once and reused within a connection, then passed to new connections once the connection is closed (it has its own internal cache). This is what happens underneath if one uses raw pekko-http.

@kciesielski kciesielski marked this pull request as ready for review March 6, 2024 15:59
@kciesielski kciesielski requested a review from adamw March 6, 2024 15:59
@adamw
Copy link
Member

adamw commented Mar 6, 2024

Maybe we can optimize Accepts.parse somehow then?

@adamw
Copy link
Member

adamw commented Mar 6, 2024

But otherwise nice gains! :)

@@ -132,7 +132,7 @@ private[akkahttp] class AkkaToResponseBody(implicit m: Materializer, ec: Executi
}

private def parseContentType(ct: String): ContentType =
ContentType.parse(ct).getOrElse(throw new IllegalArgumentException(s"Cannot parse content type: $ct"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if can get some "wild" media types, e.g. with some parameters, which could fill our cache with junk. We have a list of (currently) 1208 mime types in mimeByExtension.txt, but it has some very weird media types, and parsing them upfront would take significant time. Or maybe we could cache only if contentType.mediaType.params.isEmpty? (well .. maybe except for the charset)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of storing only values which have empty params (except the charset), this should significantly limit the cases where the cache is not useful after being filled with a lot of types.

@kciesielski
Copy link
Member Author

Maybe we can optimize Accepts.parse somehow then?

I can work some more on this, it's kinda outside of scope of this pekko-specific PR anyway.

@kciesielski kciesielski merged commit 937a96c into master Mar 8, 2024
28 checks passed
@kciesielski kciesielski deleted the pekko-http-perf-optimization branch March 8, 2024 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize costly operations for pekko-http server
2 participants