Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: don't ErrorIfMissing on pushed intent during QueryIntent, just increase write timestamp #95225

Closed
nvanbenschoten opened this issue Jan 13, 2023 · 0 comments · Fixed by #98183
Assignees
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) P-0 Issues/test failures with a fix SLA of 2 weeks T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Jan 13, 2023

QueryIntent requests contain a flag called ErrorIfMissing that causes them to return an error and short-circuit their BatchRequest when the intent they are searching for is found to be missing. "Missing" is defined as either the intent not being present or it being at a higher timestamp than expected.

We use this flag in two cases:

  1. when a transaction attempts to read its own pipelined writes, before committing
  2. when a transaction commits (parallel commit or otherwise)

This means that when a transaction queries its own intent and finds that it is present but that it has been pushed, we return an error. However, we have an "optimization" that returns a TransactionRetryError: retry txn (RETRY_SERIALIZABLE - intent pushed) error instead of an IntentMissingError:

if ownTxn && curIntentPushed {
// If the transaction's own intent was pushed, go ahead and return a
// TransactionRetryError immediately with an updated transaction proto.
// This is an optimization that can help the txn use refresh spans more
// effectively.
return result.Result{}, roachpb.NewTransactionRetryError(roachpb.RETRY_SERIALIZABLE, "intent pushed")
}

This is overly restrictive. In both cases, we should be ok with bumping the querying transaction's WriteTimestamp, but otherwise not returning an error immediately. For case 1, read-your-writes is still preserved if an intent is pushed. For case 2, parallel commits already considers a successful batch that has had its WriteTimestamp pushed to be an unsuccessful parallel commit that does not qualify for the implicit commit state (needTxnRetryAfterStaging).

This feels like an important change to avoid starvation when txn priorities allow one class of txns to repeatedly push the intent of another class of txn. It will become even more important if/when we allow readers with equal priority to a writer to push its timestamp. The change should be made in conjunction with #95227.

Jira issue: CRDB-23675

Epic CRDB-26539

@nvanbenschoten nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-transactions Relating to MVCC and the transactional model. T-kv KV Team labels Jan 13, 2023
@nvanbenschoten nvanbenschoten self-assigned this Jan 23, 2023
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 7, 2023

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
…imestamp

Fixes cockroachdb#95225.

This commit updates QueryIntent's behavior when `ErrorIfMissing` to only return
an error if no matching intent is found. Notably, the request no longer returns
an error if a matching intent is found, but has been pushed.

Previously, this case would result in a `RETRY_SERIALIZABLE` ("intent pushed")
error, causing the querying transaction to immediately refresh and retry. At the
time, we thought that this behavior was "an optimization" because it allowed a
transaction to notice that it had been pushed and may abort earlier, instead of
waiting until commit time. In practice, this does not seem to be a meaningful
improvement. Meanwhile, it makes any solution to cockroachdb#95227 more difficult because
it allows a stream of high-priority reading transaction to starve a low-priority
writing transaction before that writing transaction even reaches its commit
phase. We'd like to solve these starvation cases through some form of
commit-time coordination (either blocking pushers temporarily or refreshing into
the future). This change helps unlock those solutions.

Release note: None

Release justification: None, do not merge this until after the release-23.1
branch has been cut.
@nvanbenschoten nvanbenschoten added the A-read-committed Related to the introduction of Read Committed label Mar 30, 2023
@exalate-issue-sync exalate-issue-sync bot added A-read-committed Related to the introduction of Read Committed and removed A-read-committed Related to the introduction of Read Committed labels Apr 5, 2023
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 18, 2023
…imestamp

Fixes cockroachdb#95225.

This commit updates QueryIntent's behavior when `ErrorIfMissing` to only return
an error if no matching intent is found. Notably, the request no longer returns
an error if a matching intent is found, but has been pushed.

Previously, this case would result in a `RETRY_SERIALIZABLE` ("intent pushed")
error, causing the querying transaction to immediately refresh and retry. At the
time, we thought that this behavior was "an optimization" because it allowed a
transaction to notice that it had been pushed and may abort earlier, instead of
waiting until commit time. In practice, this does not seem to be a meaningful
improvement. Meanwhile, it makes any solution to cockroachdb#95227 more difficult because
it allows a stream of high-priority reading transaction to starve a low-priority
writing transaction before that writing transaction even reaches its commit
phase. We'd like to solve these starvation cases through some form of
commit-time coordination (either blocking pushers temporarily or refreshing into
the future). This change helps unlock those solutions.

Release note: None

Release justification: None, do not merge this until after the release-23.1
branch has been cut.
craig bot pushed a commit that referenced this issue Apr 20, 2023

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
98183: kv: don't error on pushed intent during QueryIntent, increase write timestamp r=arulajmani a=nvanbenschoten

Fixes #95225.

This commit updates QueryIntent's behavior when `ErrorIfMissing` to only return an error if no matching intent is found. Notably, the request no longer returns an error if a matching intent is found, but has been pushed.

Previously, this case would result in a `RETRY_SERIALIZABLE` ("intent pushed") error, causing the querying transaction to immediately refresh and retry. At the time, we thought that this behavior was "an optimization" because it allowed a transaction to notice that it had been pushed and may abort earlier, instead of waiting until commit time. In practice, this does not seem to be a meaningful improvement. Meanwhile, it makes any solution to #95227 more difficult because it allows a stream of high-priority reading transaction to starve a low-priority writing transaction before that writing transaction even reaches its commit phase. We'd like to solve these starvation cases through some form of commit-time coordination (either blocking pushers temporarily or refreshing into the future). This change helps unlock those solutions.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 77640e5 Apr 20, 2023
@exalate-issue-sync exalate-issue-sync bot added the P-0 Issues/test failures with a fix SLA of 2 weeks label Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) P-0 Issues/test failures with a fix SLA of 2 weeks T-kv KV Team
Projects
None yet
1 participant