Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage,kv: make transaction deadline exceeded errors retriable #35284

Merged
merged 2 commits into from
Mar 18, 2019

Conversation

andreimatei
Copy link
Contributor

@andreimatei andreimatei commented Feb 28, 2019

Before this patch, they were opaque TransactionStatusErrors.
The belief is that we should only be seeing such errors when a
transaction is pushed by minutes. Shockingly, this seems to hapen enough
in our tests, for example as described here: #18684 (comment)

This patch marks the error as retriable, since it technically is.

This patch also changes the semantics of the
EndTransactionRequest.Deadline field to make it exclusive so that it
matches the nature of SQL leases. No migration needed.

Touches #18684

Release note (sql change): "transaction deadline exceeded" errors are
now returned to the client with a retriable code.

@andreimatei
Copy link
Contributor Author

cc @vivekmenezes

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 20 of 20 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)


pkg/roachpb/errors.proto, line 215 at r2 (raw file):

  // An asynchronous write was observed to have failed.
  RETRY_ASYNC_WRITE_FAILURE = 5;
	// The transaction exceeded its deadline.

tab instead of space


pkg/storage/batcheval/cmd_end_transaction.go, line 278 at r2 (raw file):

		}

		if IsEndTransactionExceedingDeadline(reply.Txn.Timestamp, *args) {

Shouldn't we delete this?

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/roachpb/errors.proto, line 215 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

tab instead of space

done


pkg/storage/batcheval/cmd_end_transaction.go, line 278 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Shouldn't we delete this?

yes, it was unreachable, thanks.

@jordanlewis
Copy link
Member

Heroes!

Copy link
Contributor

@vivekmenezes vivekmenezes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/txn_coord_sender.go, line 617 at r3 (raw file):

	deadline := ba.Requests[0].GetEndTransaction().Deadline
	txn := tc.mu.txn
	if deadline != nil && deadline.Less(txn.Timestamp) {

This should be !txn.Timestamp.Less(deadline). The deadline is not included as a valid timestamp.


pkg/kv/txn_interceptor_committer.go, line 96 at r3 (raw file):

	// Check if the (read-only) txn was pushed above its deadline.
	deadline := et.Deadline
	if deadline != nil && deadline.Less(br.Txn.Timestamp) {

same


pkg/kv/txn_interceptor_committer.go, line 106 at r3 (raw file):

		txnCpy := txn.Clone()
		return nil, roachpb.NewErrorWithTxn(
			roachpb.NewTransactionRetryError(roachpb.RETRY_COMMIT_DEADLINE_EXCEEDED, extraMsg), &txnCpy)

probably makes sense to abstract this out as a function. It's repeated here.

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @vivekmenezes)


pkg/kv/txn_coord_sender.go, line 617 at r3 (raw file):

Previously, vivekmenezes wrote…

This should be !txn.Timestamp.Less(deadline). The deadline is not included as a valid timestamp.

well that's not how EndTransactionRequest is documented and not what the server enforces:

return args.Deadline != nil && args.Deadline.Less(t)

But indeed the leasing code documents to be returning exclusive deadlines.

// commit-timestamp < expiration-time. Care must be taken to not modify

So... which one should it be? I'm happy to change the server-side enforcement too.

Copy link
Contributor

@vivekmenezes vivekmenezes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/txn_coord_sender.go, line 617 at r3 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

well that's not how EndTransactionRequest is documented and not what the server enforces:

return args.Deadline != nil && args.Deadline.Less(t)

But indeed the leasing code documents to be returning exclusive deadlines.

// commit-timestamp < expiration-time. Care must be taken to not modify

So... which one should it be? I'm happy to change the server-side enforcement too.

The deadline is not an acceptable timestamp. Let's fix it to accommodate that concept. Thanks!

@vivekmenezes
Copy link
Contributor

@andreimatei wanna update this PR and get it into a mergeable state. Would love to see it get into 19.1

@andreimatei
Copy link
Contributor Author

Don't know if I can do that today or tomorrow.
Feel free to take it over if you have the time.

Release note: None
Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @vivekmenezes)


pkg/kv/txn_coord_sender.go, line 617 at r3 (raw file):

Previously, vivekmenezes wrote…

The deadline is not an acceptable timestamp. Let's fix it to accommodate that concept. Thanks!

done


pkg/kv/txn_interceptor_committer.go, line 96 at r3 (raw file):

Previously, vivekmenezes wrote…

same

Done.


pkg/kv/txn_interceptor_committer.go, line 106 at r3 (raw file):

Previously, vivekmenezes wrote…

probably makes sense to abstract this out as a function. It's repeated here.

Done.

Copy link
Contributor

@vivekmenezes vivekmenezes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @vivekmenezes)


pkg/kv/txn_coord_sender.go, line 705 at r5 (raw file):

		// txnCpy := txn.Clone()
		// pErr := roachpb.NewErrorWithTxn(
		//   roachpb.NewTransactionRetryError(roachpb.RETRY_COMMIT_DEADLINE_EXCEEDED, extraMsg), &txnCpy)

Nit: remove comments here and below


pkg/storage/batcheval/cmd_end_transaction.go, line 399 at r5 (raw file):

			retry, reason = true, roachpb.RETRY_COMMIT_DEADLINE_EXCEEDED
		}
	}

I suppose you could use the common function here too.

Before this patch, they were opaque TransactionStatusErrors.
The belief is that we should only be seeing such errors when a
transaction is pushed by minutes. Shockingly, this seems to hapen enough
in our tests, for example as described here: cockroachdb#18684 (comment)

This patch marks the error as retriable, since it technically is.

This patch also changes the semantics of the
EndTransactionRequest.Deadline field to make it exclusive so that it
matches the nature of SQL leases. No migration needed.

Touches cockroachdb#18684

Release note (sql change): "transaction deadline exceeded" errors are
now returned to the client with a retriable code.
Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @vivekmenezes)


pkg/kv/txn_coord_sender.go, line 705 at r5 (raw file):

Previously, vivekmenezes wrote…

Nit: remove comments here and below

done. not a nit, thanks


pkg/storage/batcheval/cmd_end_transaction.go, line 399 at r5 (raw file):

Previously, vivekmenezes wrote…

I suppose you could use the common function here too.

not easily. Unfortunately on the server side most code (like here) deals with errors, and the transaction is attached to the error later.

craig bot pushed a commit that referenced this pull request Mar 18, 2019
35284: storage,kv: make transaction deadline exceeded errors retriable r=andreimatei a=andreimatei

Before this patch, they were opaque TransactionStatusErrors.
The belief is that we should only be seeing such errors when a
transaction is pushed by minutes. Shockingly, this seems to hapen enough
in our tests, for example as described here: #18684 (comment)

This patch marks the error as retriable, since it technically is.

This patch also changes the semantics of the
EndTransactionRequest.Deadline field to make it exclusive so that it
matches the nature of SQL leases. No migration needed.

Touches #18684

Release note (sql change): "transaction deadline exceeded" errors are
now returned to the client with a retriable code.

35793: storage: fix TestRangeInfo flake and re-enable follower reads by default r=ajwerner a=ajwerner

This PR addresses a test flake introduced by enabling follower reads in
conjunction with #35130 which makes follower reads more generally possible
in the face of lease transfer.

Fixes #35758.

Release note: None

35865: roachtest: Skip flaky jepsen nemesis r=tbg a=bdarnell

See #35599

Release note: None

Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Ben Darnell <[email protected]>
@craig
Copy link
Contributor

craig bot commented Mar 18, 2019

Build succeeded

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Sep 6, 2019
As of cockroachdb#35284, this was already checked by IsEndTransactionTriggeringRetryError.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Sep 6, 2019
As of cockroachdb#35284, this was already checked by IsEndTransactionTriggeringRetryError.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants