snapshot(ticdc): fix ddl puller and ddl manager stuck caused by dead lock #11886
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: 3AceShowHand, CharlesCheung96.
What problem does this PR solve?
Issue Number: close #11884
What is changed and how it works?
Summary
This PR fixes a deadlock issue in the `Snapshot` implementation:

Deadlock in Recursive Read Locking: Go's `sync.RWMutex` does not reject recursive read locks, but they can deadlock if a write lock is requested after the outer read lock is taken and before the inner one is acquired. The pending write lock blocks the inner read lock, so the outer read lock is never released and the write lock can never proceed.

This PR refactors lock usage patterns to avoid recursive read locking.
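One common way to avoid recursive read locking is to let exported methods take the lock exactly once and delegate to unexported helpers that assume the lock is already held. The sketch below illustrates that general pattern only; it is not the code changed in this PR, and the `registry` type and its method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a structure guarded by an RWMutex.
type registry struct {
	mu    sync.RWMutex
	names map[int64]string
}

// nameLocked reads without locking; callers must already hold r.mu.
func (r *registry) nameLocked(id int64) (string, bool) {
	name, ok := r.names[id]
	return name, ok
}

// Name is an exported entry point: it takes the read lock exactly once.
func (r *registry) Name(id int64) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.nameLocked(id)
}

// Describe also takes the read lock once and reuses the unlocked helper
// instead of calling Name, which would read-lock r.mu a second time.
func (r *registry) Describe(id int64) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	name, ok := r.nameLocked(id)
	if !ok {
		return fmt.Sprintf("table %d: unknown", id)
	}
	return fmt.Sprintf("table %d: %s", id, name)
}

// Rename takes the write lock; it never waits on a nested reader
// because no reader acquires r.mu twice.
func (r *registry) Rename(id int64, name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.names[id] = name
}

func main() {
	r := &registry{names: map[int64]string{1: "orders"}}
	r.Rename(1, "orders_v2")
	fmt.Println(r.Describe(1))
	fmt.Println(r.Describe(2))
}
```

With this split, every call path acquires the mutex at most once, so a pending writer can only ever wait for readers that are able to finish.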
Root Causes of the Deadlocks
Recursive Read Lock Issue
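Recursive calls involving `RWMutex.RLock()` can deadlock when a write lock is requested between the outer and the inner read lock. This happens because Go's `sync.RWMutex` prioritizes write locks over read locks: once a writer is waiting, new `RLock()` calls block until that writer has acquired and released the lock.

Here is an example that illustrates the problem: `Operation` acquires the read lock and, while still holding it, calls `NestedOperation`, which acquires the read lock on the same mutex again.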
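A minimal sketch of this pattern, written for illustration rather than taken from the PR: the `store` type and the `nestedRLockWouldBlock` helper are hypothetical. To keep the program from actually hanging, the helper holds the outer read lock, lets a writer queue up, and then probes with `TryRLock` (available since Go 1.18) to check whether a second, recursive `RLock` would be refused.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// store is a hypothetical type used only for this illustration.
type store struct {
	mu sync.RWMutex
}

// NestedOperation takes the read lock on its own.
func (s *store) NestedOperation() {
	s.mu.RLock()
	defer s.mu.RUnlock()
}

// Operation holds the read lock and then calls NestedOperation, which
// read-locks the same mutex again. This works only while no writer is waiting.
func (s *store) Operation() {
	s.mu.RLock()
	defer s.mu.RUnlock()
	s.NestedOperation()
}

// nestedRLockWouldBlock reports whether a recursive read lock is refused
// once a writer is pending behind the outer read lock.
func nestedRLockWouldBlock() bool {
	var mu sync.RWMutex

	mu.RLock() // outer read lock, as taken by Operation

	writerDone := make(chan struct{})
	go func() {
		mu.Lock() // writer queues behind the outer reader
		mu.Unlock()
		close(writerDone)
	}()

	// Poll until the writer is pending; from then on TryRLock fails,
	// which is where a plain RLock in NestedOperation would block forever.
	blocked := false
	for i := 0; i < 1000; i++ {
		if !mu.TryRLock() {
			blocked = true
			break
		}
		mu.RUnlock()
		time.Sleep(time.Millisecond)
	}

	mu.RUnlock() // release the outer lock so the writer can finish
	<-writerDone
	return blocked
}

func main() {
	s := &store{}
	s.Operation() // no writer is waiting, so the nested RLock succeeds
	fmt.Println("nested RLock refused while writer pending:", nestedRLockWouldBlock())
}
```

In the real deadlock the inner call uses a blocking `RLock` instead of `TryRLock`, so the outer read lock is never released and all three parties wait indefinitely.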
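If a write lock is requested while `Operation` holds its read lock and before `NestedOperation` acquires its own, the following chain of events occurs:

1. The pending write lock blocks the recursive `RLock` in `NestedOperation`.
2. `NestedOperation` cannot complete until its `RLock` is granted.
3. The outer `RLock` in `Operation` cannot be released until `NestedOperation` completes.
4. The write lock cannot be granted until the outer `RLock` is released, so the outer `RLock` and the recursive `RLock` are mutually dependent and nothing can proceed.

Check List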
Tests
Questions
Will it cause performance regression or break compatibility?
No.
Do you need to update user documentation, design documentation or monitoring documentation?
No.
Release note