-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xds: fix the race condition in SslContextProviderSupplier's updateSslContext and close #8294
Conversation
if (!shutdown) { | ||
if (sslContextProvider == null) { | ||
sslContextProvider = getSslContextProvider(); | ||
} | ||
} | ||
// we want to increment the ref-count so call findOrCreate again... | ||
final SslContextProvider toRelease = getSslContextProvider(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just return if shutdown
is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you return when shutdown
is true, the caller's callback will never be called and that would be bad.
Let me give you some context for this change. This is for a very rare race condition: when the control plane sends a DownstreamTlsContext
we translate it to an SslContextProviderSupplier
(which internally performs lazy loading and ref-counting etc). For a new incoming connection to the server, gRPC will figure out the SslContextProviderSupplier
to use and give it to the protocol negotiator. Before the protocol negotiator has a chance to use it if the control plane replaces the DownstreamTlsContext
value for the server, then we call close
on the existing SslContextProviderSupplier
which will set shutdown
to true
and release the SslContextProvider (thereby making its ref-count 0). Let's say after this event the protocol negotiator wants to get the SslContext for the connection so it calls updateSslContext
on the SslContextProviderSupplier
which is now in shut-down state. We can either throw an exception via the callback's onException
(but not silently return) or fall through here to get an SslContextProvider
and use the callback to provide a proper SslContext to the protocol negotiator (even if it is related to the old DownstreamTlsContext
value). So this change does the latter to allow the protocol negotiator to succeed. There will be delays due to fresh loading of the SslContextProvider
but that is better than failing the connection just because it came in the middle of switching of the DownstreamTlsContext
values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope this answers your question
@dapengzhang0 @ejona86 I think I have addressed the comments so far. A gentle reminder to move this PR along... |
xds/src/main/java/io/grpc/xds/internal/sds/SslContextProviderSupplier.java
Outdated
Show resolved
Hide resolved
…Context and close (grpc#8294)
This PR fixes the race condition where a
SslContextProviderSupplier
is returned for an inbound-connection and before the handshake for the connection starts, xDS replaces the listener content because of which theSslContextProviderSupplier
is closed. With this change, theSslContextProviderSupplier
will just re-allocate the SslContextProvider and release after the callback.