refactor: Change recursive_mutex to mutex in DatabaseRotatingImp #5276
base: develop
Conversation
- Follow-up to #4989, which stated "Ideally, the code should be rewritten so it doesn't hold the mutex during the callback and the mutex should be changed back to a regular mutex."
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff           @@
##           develop   #5276   +/-  ##
=======================================
  Coverage     78.1%   78.1%
=======================================
  Files          790     790
  Lines        67607   67613     +6
  Branches      8164    8163     -1
=======================================
+ Hits         52828   52836     +8
+ Misses       14779   14777     -2
* Use a second mutex to protect the backends from modification
* Remove a bunch of warning comments
// backendMutex_ is only needed when the *Backend_ members are modified.
// Reads are protected by the general mutex_.
std::mutex backendMutex_;
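For illustration, here is a minimal sketch of how such a two-mutex arrangement might look. The member names, callback signature, and overall structure are assumptions for the sketch, not the actual DatabaseRotatingImp code: readers take only the general mutex_, while the rotation path takes backendMutex_ and acquires mutex_ just for the pointer swap.

```cpp
#include <functional>
#include <memory>
#include <mutex>

// Stand-in backend type; the real code uses NodeStore backends.
struct Backend {};

class RotatingStoreSketch
{
    // Reads of the backend pointers are protected by the general mutex_.
    mutable std::mutex mutex_;
    // backendMutex_ is only taken on the rotation path, where the
    // *Backend_ members are modified.
    std::mutex backendMutex_;

    std::shared_ptr<Backend> writableBackend_ = std::make_shared<Backend>();
    std::shared_ptr<Backend> archiveBackend_;

public:
    std::shared_ptr<Backend> fetchBackend() const
    {
        std::lock_guard lock(mutex_);
        return writableBackend_;
    }

    void rotateWithLock(
        std::function<std::shared_ptr<Backend>(std::shared_ptr<Backend> const&)> const& f)
    {
        // Serialize rotations against each other without blocking readers
        // while the callback runs.
        std::lock_guard rotationLock(backendMutex_);

        // The callback runs with mutex_ not held, so it may call readers.
        auto newWritable = f(writableBackend_);

        // Take the general mutex only for the pointer swap, so readers
        // never observe a half-finished rotation.
        std::lock_guard readLock(mutex_);
        archiveBackend_ = std::move(writableBackend_);
        writableBackend_ = std::move(newWritable);
    }
};
```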
As this sounds like a typical single-writer, one-or-more-readers scenario, is it possible to use a single shared_mutex here instead of these two mutexes?
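For reference, a hedged sketch of what the single-shared_mutex suggestion could look like, with readers taking shared ownership and the writer taking exclusive ownership. The names and signatures are assumptions for the sketch, not the actual rippled API:

```cpp
#include <memory>
#include <mutex>
#include <shared_mutex>

struct Backend {};  // stand-in type for the sketch

class SharedMutexSketch
{
    // One shared_mutex replaces the mutex_/backendMutex_ pair:
    // readers take shared ownership, modifications take exclusive ownership.
    mutable std::shared_mutex mutex_;
    std::shared_ptr<Backend> writableBackend_ = std::make_shared<Backend>();

public:
    std::shared_ptr<Backend> fetchBackend() const
    {
        std::shared_lock lock(mutex_);  // many concurrent readers
        return writableBackend_;
    }

    void replaceBackend(std::shared_ptr<Backend> next)
    {
        std::unique_lock lock(mutex_);  // single writer excludes all readers
        writableBackend_ = std::move(next);
    }
};
```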
It's possible, but there are risks. The biggest one is that I'd have to take a shared_lock at the start of rotateWithLock and upgrade it to a unique_lock after the callback. If there is somehow ever a second caller to that function, or even a different caller that upgrades the lock, there is a potential deadlock.
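A sketch of the hazard being described, assuming std::shared_mutex (which has no atomic upgrade). This is deliberately broken code that only shows the shape of the problem:

```cpp
#include <mutex>
#include <shared_mutex>

std::shared_mutex mtx;

// HAZARD SKETCH - do not copy. std::shared_mutex has no atomic upgrade:
// lock() below waits for *every* shared owner to release, including the
// shared_lock this same thread still holds, so it never returns (and the
// recursive acquisition is formally undefined behaviour). Even with a lock
// type that supports upgrading, two concurrent callers that each hold
// shared ownership and wait to upgrade block each other.
void rotateWithUpgrade()
{
    std::shared_lock readLock(mtx);   // taken at the start of rotateWithLock

    // ... run the callback under shared ownership ...

    std::unique_lock writeLock(mtx);  // the "upgrade" after the callback

    // ... swap the backends ...
}
```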
@bthomee @vvysokikh1 Ok, it took waaaaaaay longer than it should have because I kept trying clever things that didn't work or turned out unsupported, but I rewrote the locking, and changed to a shared mutex, and I think I've got a pretty foolproof solution here. And a unit test to exercise it.
But don't take my word for it. The point of code reviews is to spot the stuff I didn't consider.
I think your solution does not completely solve the issue. It's still technically possible to deadlock: calling rotateWithLock from inside the callback will deadlock on your new mutex.
If it's good enough for now, please add comments to rotateWithLock() warning users against calling rotateWithLock(), directly or indirectly, from the callback.
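One possible shape for such a warning comment. The Doxygen wording and the signature shown here are assumptions for illustration, not the actual declaration:

```cpp
#include <functional>
#include <memory>
#include <string>

struct Backend;  // forward declaration for the sketch

/** Rotate the backend underneath the database.

    @param f Callback that builds and returns the new writable backend.

    @note The callback MUST NOT call rotateWithLock(), directly or
          indirectly. The rotation mutex is not recursive, so a nested
          call would deadlock on the lock already held by the outer call.
*/
void rotateWithLock(
    std::function<std::unique_ptr<Backend>(std::string const& writableName)> const& f);
```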
* upstream/develop: Updates Conan dependencies (5256)
Force-pushed from 913df26 to 9f564bc
- Rewrite the locking in DatabaseRotatingImp::rotateWithLock to use a shared_lock, and write a unit test to show (as much as possible) that it won't deadlock.
Force-pushed from 13fb47c to d912b50
* upstream/develop: fix: Do not allow creating Permissioned Domains if credentials are not enabled (5275); fix: issues in `simulate` RPC (5265)
High Level Overview of Change
Follow-up to #4989, which stated "Ideally, the code should be rewritten so it doesn't hold the mutex during the callback and the mutex should be changed back to a regular mutex."
This rewrites the code so that the lock is not held during the callback. Instead, the mutex is locked twice: once before the callback and once after. This is safe due to the structure of the code, and the assumption is verified after the second lock is taken. This allows mutex_ to be changed back to a regular mutex.
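A minimal sketch of the lock-twice pattern described above, using assumed member names and a simplified callback signature rather than the actual DatabaseRotatingImp code:

```cpp
#include <functional>
#include <memory>
#include <mutex>

struct Backend {};  // stand-in type for the sketch

class LockTwiceSketch
{
    std::mutex mutex_;  // a plain mutex again: never held across the callback
    std::shared_ptr<Backend> writableBackend_ = std::make_shared<Backend>();
    std::shared_ptr<Backend> archiveBackend_;

public:
    void rotateWithLock(
        std::function<std::shared_ptr<Backend>(std::shared_ptr<Backend> const&)> const& f)
    {
        // First lock: capture the backend the callback will work against.
        std::shared_ptr<Backend> expected;
        {
            std::lock_guard lock(mutex_);
            expected = writableBackend_;
        }

        // The callback runs with no lock held, so it may freely call back
        // into functions that take mutex_.
        auto newWritable = f(expected);

        // Second lock: verify that no other rotation slipped in while the
        // lock was released, then perform the swap.
        std::lock_guard lock(mutex_);
        if (writableBackend_ != expected)
        {
            // The structure of the calling code should make this impossible;
            // the check documents and enforces that assumption.
            return;
        }
        archiveBackend_ = std::move(writableBackend_);
        writableBackend_ = std::move(newWritable);
    }
};
```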
Context of Change
From #4989:
Type of Change
Test Plan
Testing can be the same as that for #4989, plus ensure that there are no regressions.