Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enabling SSL can double RSS due to Netty off-heap arenas #42224

Open
franz1981 opened this issue Jul 30, 2024 · 19 comments
Open

Enabling SSL can double RSS due to Netty off-heap arenas #42224

franz1981 opened this issue Jul 30, 2024 · 19 comments
Labels
area/netty kind/bug Something isn't working

Comments

@franz1981
Copy link
Contributor

franz1981 commented Jul 30, 2024

This is due to eclipse-vertx/vert.x#5168 (comment)
and will be "fixed" via eclipse-vertx/vert.x#5262

TLDR: enabling SSL make Vertx to use a custom off-heap pooled allocator for I/O (e.g. while reading or copying heap to off-heap to send data on the wire), but Jackson (and other quarkus parts) uses the Netty default one, making both to be resident in memory and doubling the required capacity.

This is the Jackson one:

tmpBuf = PooledByteBufAllocator.DEFAULT.directBuffer(chunkCapacity);

A complete fix should be able to correctly pass the allocator associated to the vertx connection to the above append buffer, saving to directly reference it.

A simple(r) and less invasive fix, instead, should change the above code to use the vertx pooled allocator, instead, given that via eclipse-vertx/vert.x#5262 it should be the default used.

@franz1981 franz1981 added the kind/bug Something isn't working label Jul 30, 2024
Copy link

quarkus-bot bot commented Jul 30, 2024

/cc @cescoffier (netty), @jponge (netty)

@franz1981
Copy link
Contributor Author

franz1981 commented Sep 5, 2024

This issue wasn't accurate enough: eclipse-vertx/vert.x#5292 (comment) describe a non-SSL case where the pooled allocators can still be duplicated - although it shouldn't be used, making the cost to just be in the empty arenas/data structures of the allocator.

@franz1981 franz1981 changed the title SSL can double RSS due to Netty off-heap arenas Enabling SSL can double RSS due to Netty off-heap arenas Sep 5, 2024
@franz1981
Copy link
Contributor Author

franz1981 commented Sep 7, 2024

Ping for @mabartos : keycloak uses the JDK SSL engine by default? Is it a common use for users?

@mabartos
Copy link
Contributor

mabartos commented Sep 9, 2024

@franz1981 For the server, we just rely on Quarkus internals with regards to TLS. For other stuff (HTTP client, remote Infinispan,...) we use SSL context obtained from JDK. AFAIK there is no explicit usage of SSLEngine in Keycloak, as mentioned here, and no other impls used as OpenSSL.

Perhaps @pedroigor or @vmuzikar might have more information if necessary.

@franz1981
Copy link
Contributor Author

For the server, we just rely on Quarkus internals with regards to TLS

which means the JDK SSL engine sadly .-.
see #41880
we don't yet support exposing any other ATM

@cescoffier
Copy link
Member

@franz1981 What's the status here? I'm planning to introduce Vert.x 4.5.11 next week.

@franz1981
Copy link
Contributor Author

franz1981 commented Nov 13, 2024

@cescoffier we need to:

  1. Decide if the new SSL option to enable heap buffer pooling is exposed to users and which default it got
  2. Do the same with the sys property vertx.reuseNettyAllocators (see Try reuse existing Netty pooled allocator singleton (Fixes #5168) eclipse-vertx/vert.x#5292)

I suggest the latter (sys property) to always be enabled, since is lower risk to me, while the former, we have to decide.

@cescoffier
Copy link
Member

@franz1981

  1. I would avoid diverging between the LTS and the current version - so, we need to keep the old behavior. This can be changed before the next LTS
  2. System properties are a bit tricky, but I believe something like this would work:
# Check if the system property is set by the user
# If not -> Apply default (define the system property)
# If is it -> Use the property

I'm wondering if this would work in native. In native, we could also make the decision at build time (during the compilation)—@zakkak WDYT?

@zakkak
Copy link
Contributor

zakkak commented Nov 14, 2024

I'm wondering if this would work in native.

I believe we can do it (both at build time or run time if needed).

In native, we could also make the decision at build time (during the compilation)—@zakkak WDYT?

I think this is not specific to this case. We have had this question before regarding things that can be configured through system properties. The trade off being that if we do things at build time we increase the chances of generating more optimized code, while doing thing at run time allows us to be more JVM-like. Note however that we have not measured the actual performance impact of the two options.

@cescoffier
Copy link
Member

@jponge FYI.

@mabartos
Copy link
Contributor

@franz1981 Any progress on this?

@franz1981
Copy link
Contributor Author

The changes are already in Quarkus but requires an additional PR to enable 2 properties

@franz1981
Copy link
Contributor Author

@mabartos if I provide a Quarkus branch with the required changes do you have any chances to try it?

@mabartos
Copy link
Contributor

@franz1981 Probably yes... The best would be to have some Quarkus snapshot release to have all artifacts available online to leverage our Keycloak benchmark.

@franz1981
Copy link
Contributor Author

That's up to @cescoffier and @gsmet if they want my changes (which can be disabled by default) in a snapshot release

@cescoffier
Copy link
Member

Do you have the link to the PRs?

@gsmet
Copy link
Member

gsmet commented Dec 18, 2024

If the PR is not too invasive and something we want to merge... except if something goes very wrong in the benchmarks, I'm fine merging so that you can trigger your benchmarks.

If it's experimental code not so much :).

@franz1981
Copy link
Contributor Author

@gsmet let me send it quickly so can be reviewed

@franz1981
Copy link
Contributor Author

franz1981 commented Dec 18, 2024

@gsmet @cescoffier #45185

@mabartos Do you use reactive SQL clients too? or gRPC?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/netty kind/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants