Spanner: Re-create sessions that have been invalidated by the server #4734

Merged

Conversation

@olavloite olavloite commented Mar 24, 2019

Sessions in the session pool that have been invalidated by the server may cause SessionNotFoundExceptions to bubble up to the user, even though the user has no control over the session management. When this occurs, the session should be removed from the SessionPool and a new session should be fetched from the SessionPool without the user having to intervene.
In case of a SessionNotFoundException being thrown while using a TransactionManager, the client library will replace the session with a new one from the pool, and trigger a restart of the transaction by throwing an AbortedException (it is not possible to replace the session without also restarting the transaction).
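The TransactionManager behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Session`, `Pool`, `Manager` and `runWithRestart` are hypothetical stand-ins, and the two exception classes only mimic the names of the client library's exceptions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class TransactionManagerSketch {
  /** Hypothetical stand-ins for the client library's exception types. */
  static class SessionNotFoundException extends RuntimeException {
    SessionNotFoundException(String message) {
      super(message);
    }
  }

  static class AbortedException extends RuntimeException {
    AbortedException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Hypothetical session that fails once the server has invalidated it. */
  static final class Session {
    final String name;
    final boolean valid;

    Session(String name, boolean valid) {
      this.name = name;
      this.valid = valid;
    }

    String execute(String statement) {
      if (!valid) {
        throw new SessionNotFoundException("Session not found: " + name);
      }
      return name + " executed " + statement;
    }
  }

  /** Hypothetical pool that hands out sessions in FIFO order. */
  static final class Pool {
    private final Deque<Session> sessions = new ArrayDeque<>();

    Pool(Session... initial) {
      sessions.addAll(Arrays.asList(initial));
    }

    Session take() {
      return sessions.removeFirst();
    }
  }

  /** Swaps an invalidated session and signals the caller to restart. */
  static final class Manager {
    private final Pool pool;
    private Session session;

    Manager(Pool pool) {
      this.pool = pool;
      this.session = pool.take();
    }

    String execute(String statement) {
      try {
        return session.execute(statement);
      } catch (SessionNotFoundException e) {
        // Drop the invalidated session and take another one from the pool.
        session = pool.take();
        // The transaction cannot continue on a different session, so the
        // caller is told to restart it, as for any aborted transaction.
        throw new AbortedException("Session was replaced, restart the transaction", e);
      }
    }
  }

  /** Caller-side loop that restarts the transaction whenever it is aborted. */
  static String runWithRestart(Manager manager, String statement) {
    while (true) {
      try {
        return manager.execute(statement);
      } catch (AbortedException e) {
        // Restart: the manager already holds a replacement session.
      }
    }
  }
}
```

The point of the design is that the caller only ever handles the abort it already has to handle for ordinary transaction conflicts; the session swap itself stays invisible.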

This change has been initiated on request from the Spanner team.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 24, 2019
codecov bot commented Mar 24, 2019

Codecov Report

Merging #4734 into master will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #4734   +/-   ##
=========================================
  Coverage     50.36%   50.36%           
  Complexity    23665    23665           
=========================================
  Files          2233     2233           
  Lines        225856   225856           
  Branches      24956    24956           
=========================================
  Hits         113742   113742           
  Misses       103517   103517           
  Partials       8597     8597

@olavloite olavloite marked this pull request as ready for review March 26, 2019 10:39
@olavloite olavloite requested a review from a team as a code owner March 26, 2019 10:39
@olavloite olavloite changed the title [WIP] Spanner: Re-create sessions that have been invalidated by the server Spanner: Re-create sessions that have been invalidated by the server Mar 26, 2019
Contributor

@sduskis sduskis left a comment

I have stylistic preferences. I'm not going to block this PR for those comments.

@sduskis sduskis requested a review from kolea2 March 26, 2019 16:33
@sduskis sduskis added the api: spanner Issues related to the Spanner API. label Mar 26, 2019
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Mar 31, 2019
Contributor

@snehashah16 snehashah16 left a comment

One high-level comment: customers should ideally never see a SessionNotFound with this change. But your change only has one retry for the error case. I would change it so that SessionNotFound is always retried.


private void maybeRecreateUnderlyingSession(SessionNotFoundException e) {
  if (beforeFirstServerCall && allowDelegateRecreation) {
    delegate = spanner.createSession(db);
Contributor

Could we get another session from the pool instead of creating a new one?

I am afraid that the latency on this request will be too high when we retry internally.

try {
  T res = method.get();
Contributor

Should this be after markUsed()?

} catch (SessionNotFoundException e) {
  maybeRecreateUnderlyingSession(e);
  lastException = null;
  return internalRunWithSessionRetry(method);
Contributor

There is a chance we get a SessionNotFound here as well.

Have you considered putting this in a loop?
Pros: customers never see a SessionNotFound.
Cons: higher latency, but our timeouts should still take effect.
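A rough sketch of the loop suggested here, assuming a hypothetical `Pool` whose `replace` method discards the invalidated session and hands out the next one (and creates a new session when nothing is idle). None of these types are the library's real classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Function;

final class SessionRetryLoopSketch {
  /** Hypothetical stand-in for the client library's exception. */
  static class SessionNotFoundException extends RuntimeException {
    SessionNotFoundException(String message) {
      super(message);
    }
  }

  /** Hypothetical session; `valid` is false once the server has dropped it. */
  static final class Session {
    final int id;
    final boolean valid;

    Session(int id, boolean valid) {
      this.id = id;
      this.valid = valid;
    }
  }

  /** Hypothetical pool of idle sessions, some of which may be stale. */
  static final class Pool {
    private final Deque<Session> idle = new ArrayDeque<>();
    private int nextId = 0;
    int discarded = 0;

    void addIdle(boolean valid) {
      idle.addLast(new Session(nextId++, valid));
    }

    Session take() {
      // With no idle session left, a new (valid) session is created.
      return idle.isEmpty() ? new Session(nextId++, true) : idle.removeFirst();
    }

    Session replace(Session invalid) {
      // The invalidated session is never returned to the pool.
      discarded++;
      return take();
    }
  }

  /** Retries the callable for as long as the pool keeps handing out stale sessions. */
  static <T> T runWithSessionRetry(Pool pool, Function<Session, T> callable) {
    Session session = pool.take();
    while (true) {
      try {
        return callable.apply(session);
      } catch (SessionNotFoundException e) {
        session = pool.replace(session);
      }
    }
  }

  /** Hypothetical request that fails on an invalidated session. */
  static String query(Session session) {
    if (!session.valid) {
      throw new SessionNotFoundException("Session not found: " + session.id);
    }
    return "ok from session " + session.id;
  }
}
```

In this sketch the loop ends because each stale session is discarded rather than reused, so the pool eventually hands out a valid or newly created one. Any other exception leaves the loop immediately.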

} catch (SessionNotFoundException e) {
  maybeRecreateUnderlyingSession(e);
  TransactionRunner newRunner = delegate.readWriteTransaction();
  result = newRunner.run(callable);
Contributor

Same as above: this could potentially still see a SessionNotFound (SNF).

@olavloite olavloite force-pushed the spanner-retry-on-invalidated-session branch from 10ccbb2 to a05d6f1 Compare April 9, 2019 10:35
@sduskis
Contributor

sduskis commented Apr 11, 2019

@olavloite, it looks like there are conflicts.

@sduskis sduskis added needs work This is a pull request that needs a little love. and removed 🚨 This issue needs some love. labels Apr 11, 2019
@olavloite olavloite changed the title Spanner: Re-create sessions that have been invalidated by the server [WIP] Spanner: Re-create sessions that have been invalidated by the server Apr 11, 2019
@olavloite
Author

@sduskis Yeah, I know. I'll start to rebase it once #4895 has been merged, as that will otherwise also cause further merge conflicts with this change.

@olavloite olavloite force-pushed the spanner-retry-on-invalidated-session branch 2 times, most recently from e0224e2 to a63929e Compare April 16, 2019 05:27
@olavloite olavloite changed the title [WIP] Spanner: Re-create sessions that have been invalidated by the server Spanner: Re-create sessions that have been invalidated by the server Apr 16, 2019
try {
  return callable.apply(session);
} catch (SessionNotFoundException e) {
  session = pool.replaceReadWriteSession(e, session);
Contributor

Following up from my last comment, if the session is READ, is this correct, as replaceReadWriteSession eventually returns 'getReadWriteSession()'?

Author

Good point, it wasn't right, even though the method was never called with READ. But always calling it with READ_WRITE was not correct either: executePartitionedUpdate does not need a read/write session, because it always starts a new transaction of its own special type.
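One way to keep the replacement consistent with the mode of the original session is sketched below. The `SessionMode` names come from the diff in this PR; `Session`, `Pool` and `replaceSession` are hypothetical and do not claim to match the library's actual signatures.

```java
final class ReplaceByModeSketch {
  enum SessionMode {
    READ,
    READ_WRITE
  }

  /** Hypothetical session that remembers which mode it was taken for. */
  static final class Session {
    final int id;
    final SessionMode mode;

    Session(int id, SessionMode mode) {
      this.id = id;
      this.mode = mode;
    }
  }

  /** Hypothetical pool with separate getters per mode. */
  static final class Pool {
    private int nextId = 0;

    Session getReadSession() {
      return new Session(nextId++, SessionMode.READ);
    }

    Session getReadWriteSession() {
      return new Session(nextId++, SessionMode.READ_WRITE);
    }

    /** Drops the invalid session and returns a replacement of the same mode. */
    Session replaceSession(Session invalid) {
      switch (invalid.mode) {
        case READ:
          return getReadSession();
        case READ_WRITE:
          return getReadWriteSession();
        default:
          throw new IllegalArgumentException("Unknown mode: " + invalid.mode);
      }
    }
  }
}
```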

@@ -188,7 +188,7 @@ public long executePartitionedUpdate(final Statement stmt) {
     Span span = tracer.spanBuilder(PARTITION_DML_TRANSACTION).startSpan();
     try (Scope s = tracer.withSpan(span)) {
       return runWithSessionRetry(
-          SessionMode.READ_WRITE,
+          SessionMode.READ, // PartitionedUpdate does not need a prepared tx.
Contributor

Can we do this in a separate PR? Because this PR is quite large, I'd like to separate out the recreating sessions implementation from the refactorings/fixes.

Author

I'll move it to a separate PR later this afternoon.

}

@VisibleForTesting
int getNumberOfAvailableWritePreparedSessions() {
Contributor

This is not DBClient's responsibility. I wouldn't expose it here.

Author

Method is now removed and the visibility of the session pool changed to package-private to make it visible for test cases.

@@ -69,7 +99,7 @@ public Timestamp writeAtLeastOnce(Iterable<Mutation> mutations) throws SpannerEx
   public ReadContext singleUse() {
     Span span = tracer.spanBuilder(READ_ONLY_TRANSACTION).startSpan();
     try (Scope s = tracer.withSpan(span)) {
-      return pool.getReadSession().singleUse();
+      return getReadSession().singleUse();
Contributor

Curious: what is the behavior when read sessions see SNF errors?

Author

The SNF is detected at the first actual database request. For read-only transactions (both single use and actual read transactions), that is at the first query. The query is sent to the server at the first call to ResultSet#next(), so the SNF for a read transaction is handled here:

  } catch (RuntimeException e) {
    TraceUtil.endSpanWithFailure(span, e);
    throw e;
  }
}

private <T> T runWithSessionRetry(Function<Session, T> callable) {
  PooledSession session = getReadWriteSession();
  while (true) {
Contributor

Will this respect the client-side timeout values that are set?
If our backend is down and does not return any sessions, what is the outcome for the customer application?

Author

The short answer: It will behave the same as now, and the behavior will depend on whether the pool has a session ready to be returned or not. The runWithSessionRetry will only retry on the specific condition of a SessionNotFoundException being returned.

There are two possible scenarios:

  1. The application requests a session and the session pool has a session that it thinks is valid. This session is handed to the application and the application tries to execute the transaction. This will cause the query or begin transaction statement to time out.
  2. The pool does not have a session available that can be returned and starts the creation of a new session asynchronously. The requesting thread is placed in a waiter. The create session request will fail and notify the waiter of the error. The waiting thread will return with the create session error.

Neither of the above is treated as a SessionNotFoundException by the runWithSessionRetry method, so the method returns with the error.
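To illustrate the scope of the retry, here is a minimal sketch in which only a SessionNotFoundException is retried and any other failure surfaces to the caller on the attempt where it occurs. `DeadlineExceededException` is a hypothetical stand-in for a timeout or a failed create-session call, and the session swap between attempts is omitted.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class RetryScopeSketch {
  /** Hypothetical stand-in for the client library's exception. */
  static class SessionNotFoundException extends RuntimeException {
    SessionNotFoundException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for a timeout or a failed create-session call. */
  static class DeadlineExceededException extends RuntimeException {
    DeadlineExceededException(String message) {
      super(message);
    }
  }

  static <T> T runWithSessionRetry(Supplier<T> attempt) {
    while (true) {
      try {
        return attempt.get();
      } catch (SessionNotFoundException e) {
        // Only this error is retried. A fuller version would first take
        // another session from the pool. Every other exception propagates.
      }
    }
  }

  /** Returns how many attempts ran before the non-retryable error surfaced. */
  static int attemptsBeforeTimeoutSurfaces(int staleSessionsFirst) {
    AtomicInteger attempts = new AtomicInteger();
    Supplier<String> attempt =
        () -> {
          if (attempts.incrementAndGet() <= staleSessionsFirst) {
            throw new SessionNotFoundException("Session not found");
          }
          throw new DeadlineExceededException("Backend did not respond in time");
        };
    try {
      runWithSessionRetry(attempt);
      throw new IllegalStateException("Not reached in this sketch");
    } catch (DeadlineExceededException e) {
      return attempts.get();
    }
  }
}
```

With no stale sessions the timeout surfaces after a single attempt; with two stale sessions first, it surfaces on the third.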

olavloite added 22 commits May 9, 2019 17:24
Replaced while(true) loops with internal methods.
Added method for creating ReadOnlyTransactions.

When a SessionNotFoundException is thrown, a new session is now picked
from the pool, instead of creating a new session on the server. This
reduces latency in calls that get an invalidated session from the pool.
It does however also require that the retry-logic for SessionNotFound is
put in a loop, as the second session returned from the pool could also
possibly be an invalidated session.

When a SessionNotFoundException occurs, a new session should be
requested from the SessionPool instead of creating a new one.
Labels
api: spanner Issues related to the Spanner API. cla: yes This human has signed the Contributor License Agreement. needs work This is a pull request that needs a little love.
7 participants