xds: fix the race condition in SslContextProviderSupplier's updateSslContext and close #8294

Merged Jul 9, 2021 (2 commits)

sanjaypujare
Contributor

This PR fixes a race condition in which an SslContextProviderSupplier is returned for an inbound connection but, before the handshake for that connection starts, xDS replaces the listener content, which closes the SslContextProviderSupplier. With this change, a closed SslContextProviderSupplier simply re-allocates the SslContextProvider and releases it after the callback.

Comment on lines +58 to 64
if (!shutdown) {
  if (sslContextProvider == null) {
    sslContextProvider = getSslContextProvider();
  }
}
// we want to increment the ref-count so call findOrCreate again...
final SslContextProvider toRelease = getSslContextProvider();
Member

Why not just return if shutdown is true?

Contributor Author

If you return when shutdown is true, the caller's callback will never be called and that would be bad.

Let me give you some context for this change. This is for a very rare race condition. When the control plane sends a DownstreamTlsContext, we translate it to an SslContextProviderSupplier (which internally performs lazy loading, ref-counting, etc.). For a new incoming connection to the server, gRPC figures out which SslContextProviderSupplier to use and gives it to the protocol negotiator. If the control plane replaces the server's DownstreamTlsContext value before the protocol negotiator has a chance to use the supplier, we call close on the existing SslContextProviderSupplier, which sets shutdown to true and releases the SslContextProvider (thereby making its ref-count 0). Let's say that after this event the protocol negotiator wants the SslContext for the connection, so it calls updateSslContext on the SslContextProviderSupplier, which is now in the shut-down state.

We can either throw an exception via the callback's onException (but not silently return), or fall through here to get an SslContextProvider and use the callback to provide a proper SslContext to the protocol negotiator (even if it is related to the old DownstreamTlsContext value). This change does the latter so that the protocol negotiator succeeds. There will be a delay due to the fresh loading of the SslContextProvider, but that is better than failing the connection just because it arrived in the middle of a switch between DownstreamTlsContext values.
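
To make the fall-through behavior concrete, here is a minimal, self-contained sketch of the idea. It is not the grpc-java code: RefCountedStore, SupplierSketch, and the String stand-ins for SslContextProvider and SslContext are hypothetical names made up for illustration, and the callback is invoked synchronously here, which is a simplification. It only shows that taking an extra reference per call lets updateSslContext still serve a caller after close(), at the cost of a fresh load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a ref-counted provider store (not a grpc-java class). */
final class RefCountedStore {
  private int refCount;
  private int loads; // number of fresh (slow) loads performed so far

  synchronized String acquire() {
    if (refCount == 0) {
      loads++; // nothing holds a reference, so the provider is loaded again
    }
    refCount++;
    return "provider";
  }

  synchronized void release() {
    if (refCount == 0) {
      throw new IllegalStateException("release without matching acquire");
    }
    refCount--;
  }

  synchronized int refCount() { return refCount; }

  synchronized int loads() { return loads; }
}

/** Hypothetical supplier that keeps working after close(), as described above. */
final class SupplierSketch {
  private final RefCountedStore store;
  private String provider; // long-lived reference, null before first use and after close()
  private boolean shutdown;

  SupplierSketch(RefCountedStore store) { this.store = store; }

  synchronized void updateSslContext(Consumer<String> callback) {
    if (!shutdown && provider == null) {
      provider = store.acquire(); // lazy load; this reference is held until close()
    }
    // Take an extra reference for this call so it succeeds even when shut down.
    String toRelease = store.acquire();
    try {
      callback.accept("sslContext-from-" + toRelease);
    } finally {
      store.release(); // drop the per-call reference once the callback has run
    }
  }

  synchronized void close() {
    if (provider != null) {
      store.release();
      provider = null;
    }
    shutdown = true;
  }
}

final class SupplierRaceDemo {
  public static void main(String[] args) {
    RefCountedStore store = new RefCountedStore();
    SupplierSketch supplier = new SupplierSketch(store);
    List<String> seen = new ArrayList<>();

    supplier.updateSslContext(seen::add); // normal path
    supplier.close();                     // control plane replaced the config
    supplier.updateSslContext(seen::add); // late caller is still served

    System.out.println("callbacks=" + seen.size()
        + " refCount=" + store.refCount()
        + " loads=" + store.loads());
  }
}
```

In this sketch the late call after close() triggers a second load and leaves the ref-count at 0 afterwards, which mirrors the trade-off described above: some extra delay instead of a failed connection.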

Contributor Author

Hope this answers your question

@sanjaypujare
Contributor Author

@dapengzhang0 @ejona86 I think I have addressed the comments so far. A gentle reminder to move this PR along...

sanjaypujare merged commit 629748d into grpc:master Jul 9, 2021
sanjaypujare deleted the supplier-changes branch July 9, 2021 17:48
YifeiZhuang pushed a commit to YifeiZhuang/grpc-java that referenced this pull request Jul 25, 2021
github-actions bot locked as resolved and limited conversation to collaborators Oct 8, 2021