Add SessionPoolOptions.setFailIfPoolExhausted in SpannerAccessor #31663

Merged
merged 1 commit into from
Jul 16, 2024

@manitgupta manitgupta commented Jun 21, 2024

Configures SpannerAccessor to set the setFailIfPoolExhausted session pool option, so that the Spanner client throws an error when the session pool is exhausted.



@manitgupta manitgupta marked this pull request as ready for review June 21, 2024 10:53

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @robertwb for label java.
R: @damondouglas for label io.
R: @nielm for label spanner.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).


nielm commented Jun 21, 2024

Can you explain why you believe this is necessary to add?

The default session pool options in Java are:

  • MinSessions = 100
  • MaxSessions = 400
  • WriteSessionsFraction = 0.2

These apply to each worker individually.
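To make the per-worker point concrete, here is a rough sketch of the aggregate arithmetic. It assumes one Spanner client (and so one session pool) per worker, and the worker count of 10 is a hypothetical input; the class and method names are illustrative only.

```java
/** Aggregate session-count arithmetic for per-worker session pools. */
final class SessionFootprint {
  // Defaults quoted in the comment above.
  static final int DEFAULT_MIN_SESSIONS = 100;
  static final int DEFAULT_MAX_SESSIONS = 400;

  /** Sessions held across all workers when every pool sits at its minimum. */
  static long minTotal(int workers) {
    return (long) workers * DEFAULT_MIN_SESSIONS;
  }

  /** Upper bound on sessions across all workers when every pool is full. */
  static long maxTotal(int workers) {
    return (long) workers * DEFAULT_MAX_SESSIONS;
  }

  public static void main(String[] args) {
    int workers = 10; // hypothetical worker count, assuming one pool per worker
    System.out.println(
        workers + " workers: " + minTotal(workers) + " to " + maxTotal(workers) + " sessions");
  }
}
```

Under these assumptions the pipeline as a whole can hold far more sessions than a single pool's limit suggests.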

@olavloite for comment.

@manitgupta

This is my primary motivation -
https://github.com/googleapis/java-spanner/blob/main/session-and-channel-pool-configuration.md#automatically-clean-inactive-transactions

Currently, there is no way to set setWarnAndCloseIfInactiveTransactions(). SpannerAccessor is used in Dataflow templates such as DatastreamToSpanner, which customers use as a black box, so having this enabled by default makes sense.


nielm commented Jun 21, 2024

In theory there should never be an inactive transaction left open by SpannerIO.
If there is, then it is a bug in SpannerIO.

Writes use blind writes by calling DatabaseClient.writeAtLeastOnce, which internally uses the single_use_transaction feature of the Commit API so that a transaction is automatically started and closed by the Spanner server.

Reads use a single read transaction, which is shared across all workers. By their nature these are long-running transactions.

SpannerIO is intended to isolate users from needing to worry about transactions, so that they can just pass their data in and let the connector deal with it.

If you are experiencing issues with session pool exhaustion, and believe that there may be a bug, then please raise an issue.

@manitgupta

SpannerAccessor is not consumed only by SpannerIO; it is also used directly by Flex templates such as DatastreamToSpanner. A pipeline may therefore use the accessor from Beam directly, and the application code written against it may leak sessions.
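As an illustration of why a leak matters, the following toy model shows a bounded pool where leaked checkouts either make the next caller wait or fail immediately. It is a sketch using a plain semaphore, not the Spanner client's implementation; `ToySessionPool`, `OnExhausted`, and the method names are all hypothetical.

```java
import java.util.concurrent.Semaphore;

/** Toy model of a bounded session pool; NOT the Spanner client's implementation. */
final class ToySessionPool {
  /** What checkOut() does when no session is idle. */
  enum OnExhausted { BLOCK, FAIL }

  private final Semaphore available;
  private final OnExhausted policy;

  ToySessionPool(int maxSessions, OnExhausted policy) {
    this.available = new Semaphore(maxSessions);
    this.policy = policy;
  }

  /** Checks out one session, failing fast or waiting depending on the policy. */
  void checkOut() {
    if (policy == OnExhausted.FAIL) {
      if (!available.tryAcquire()) {
        throw new IllegalStateException("session pool exhausted");
      }
    } else {
      // Waits until some other caller returns a session.
      available.acquireUninterruptibly();
    }
  }

  /** Returns a session to the pool; leaky code never reaches this call. */
  void checkIn() {
    available.release();
  }

  int idleSessions() {
    return available.availablePermits();
  }

  public static void main(String[] args) {
    ToySessionPool pool = new ToySessionPool(2, OnExhausted.FAIL);
    pool.checkOut(); // leaked: never checked in
    pool.checkOut(); // leaked: never checked in
    try {
      pool.checkOut();
    } catch (IllegalStateException e) {
      System.out.println("third checkout failed: " + e.getMessage());
    }
  }
}
```

With the BLOCK policy the third checkout in `main` would wait indefinitely, which is how a leak can show up as a stuck pipeline rather than as an error.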


Reminder, please take a look at this pr: @robertwb @damondouglas @nielm

@manitgupta

Bump! @nielm @olavloite


github-actions bot commented Jul 4, 2024

Assigning a new set of reviewers because the PR has gone too long without review. If you would like to opt out of this review, comment assign to next reviewer:

R: @kennknowles for label java.
R: @ahmedabu98 for label io.
R: @nielm for label spanner.


@manitgupta manitgupta force-pushed the session_pool_options branch from dc74994 to efee92b Compare July 5, 2024 08:47
@manitgupta manitgupta changed the title Add SessionPoolOptions to SpannerConfig Add SessionPoolOptions to SpannerAccessor Jul 5, 2024

nielm commented Jul 5, 2024

Result of out-of-band discussion between @manitgupta and @olavloite - Change the PR such that SpannerAccessor always sets

SessionPoolOptions.setFailIfPoolExhausted()

to avoid expanding the SpannerIO public API surface while also catching potential session leak bugs.
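For readers unfamiliar with the option, a minimal sketch of setting it on a Spanner client might look like the following. This is not the actual SpannerAccessor change from this PR; it assumes the google-cloud-spanner library is on the classpath, and the class name, method name, and project ID parameter are placeholders.

```java
import com.google.cloud.spanner.SessionPoolOptions;
import com.google.cloud.spanner.SpannerOptions;

/** Illustrative only; not the code merged in this PR. */
final class FailFastSpannerOptions {
  /** Builds client options whose session pool fails when it is exhausted. */
  static SpannerOptions build(String projectId) {
    SessionPoolOptions sessionPoolOptions =
        SessionPoolOptions.newBuilder()
            // Throw an error when no session is available.
            .setFailIfPoolExhausted()
            .build();

    return SpannerOptions.newBuilder()
        .setProjectId(projectId) // placeholder project ID supplied by the caller
        .setSessionPoolOption(sessionPoolOptions)
        .build();
  }
}
```

Because the option is applied inside the accessor, callers get the fail-fast behavior without any new setter appearing on the SpannerIO public API.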

@manitgupta manitgupta changed the title Add SessionPoolOptions to SpannerAccessor Add SessionPoolOptions.setFailIfPoolExhausted to SpannerAccessor Jul 5, 2024
@manitgupta manitgupta changed the title Add SessionPoolOptions.setFailIfPoolExhausted to SpannerAccessor Add SessionPoolOptions.setFailIfPoolExhausted in SpannerAccessor Jul 5, 2024

nielm commented Jul 5, 2024

LGTM

Precommit fails for unrelated reason in
org.apache.beam.sdk.io.gcp.spanner.changestreams.it.SpannerChangeStreamOrderedWithinKeyIT > restOrderedWithinKey

@kennknowles kennknowles requested a review from Abacn July 8, 2024 20:27

Reminder, please take a look at this pr: @kennknowles @ahmedabu98 @nielm

@Abacn Abacn merged commit 53409cc into apache:master Jul 16, 2024
18 checks passed

nielm commented Oct 8, 2024

reverted in #32694
