Fix for transactions not allowed to finish during PlannedReparentShard #8089

Merged
merged 3 commits into from
May 11, 2021

Conversation

harshit-gangal
Member

@harshit-gangal harshit-gangal commented May 10, 2021

Description

A recent fix (#7879) introduced a regression.

Before sending queries to a tablet, #7879 changed the behaviour to check whether the tablet is ready to answer, by checking its ServingStatus and that the tablet type hasn't changed.

While a PlannedReparentShard is in progress, this check should not be done for transactions in flight.
The vttablet waits for in-flight transactions to be committed or rolled back, i.e. queries belonging to an existing transaction must still be sent down so that the transaction can complete, even if the tablet currently reports NotServing.

The test that exposed this issue was already in the code base: go/test/endtoend/tabletgateway/buffer/buffer_test.go became flaky after #7879 was merged.

To fix the issue, the Gateway no longer performs the pre-check when getting the tablet connection for an existing, active shard_session with a vttablet.

This also implies that a reserved connection, which used to be reset by this pre-check logic, now has to hit the vttablet first. The shard session is reset only after the expected error comes back, which makes it two round trips.

Related Issue(s)

Bug introduced in #7879

Checklist

  • Tests were added or are not required
  • Documentation was added or is not required

Member

@deepthi deepthi left a comment

Can you link the previous issue and PR? Is this a straight revert of the previous fix?

@harshit-gangal
Member Author

harshit-gangal commented May 10, 2021

The fix for the old issue is still in place. This PR removes the additional check done when retrieving the queryservice by tablet alias.

@systay
Collaborator

systay commented May 11, 2021

This should be backported to both 9.0 and 10.0, since the PR that introduced this problem was backported to both: #8041 & #7935

@harshit-gangal
Member Author

Yes, once this is merged it needs to be backported to both releases.

@systay systay merged commit 285c729 into vitessio:master May 11, 2021
@systay systay deleted the qs-fix branch May 11, 2021 13:59
systay pushed a commit to planetscale/vitess that referenced this pull request May 11, 2021
Backport of vitessio#8089
This is a combination of 3 commits.

* remove precheck of tablet serving and target
* remove the additional logic and return error if queryservice not found to serve query
* fix test as per new change

Signed-off-by: Harshit Gangal <[email protected]>

Signed-off-by: Andres Taylor <[email protected]>
Comment on lines +568 to +572
// ChangeTabletType changes the tablet type.
func (sbc *SandboxConn) ChangeTabletType(typ topodatapb.TabletType) {
	sbc.tablet.Type = typ
}

Member

Where is this being used?

@systay systay changed the title Queryservice fix Fix for transactions not allowed to finish during PlannedReparentShard May 25, 2021
3 participants