Move confirming_set on to ledger and add a shared_mutex for keeping memory objects in sync with the ledger #4567
Closed
… and can be expanded to include memory locking.
…r held in memory.
… within the block operation in order to break lock cycle to rolled_back observers.
# Conflicts:
#	nano/core_test/active_transactions.cpp
#	nano/core_test/conflicts.cpp
#	nano/core_test/election.cpp
#	nano/core_test/election_scheduler.cpp
#	nano/core_test/ledger.cpp
#	nano/core_test/network.cpp
#	nano/core_test/node.cpp
#	nano/node/node.cpp
#	nano/node/scheduler/priority.cpp
#	nano/node/scheduler/priority.hpp
#	nano/rpc_test/rpc.cpp
# Conflicts:
#	nano/core_test/backlog.cpp
#	nano/node/backlog_population.hpp
#	nano/node/node.cpp
This adds a shared_mutex to ledger transactions, which allows shared read access and exclusive write access to the ledger. This should fix the problem identified in #4540, and we will have a similar need when synchronizing bounded backlog memory objects. I will break this PR up into discrete chunks, starting with #4566.
The existing ledger operations use store transactions as the synchronization method, but the store uses an MVCC strategy which doesn't play well when memory objects also need to be synchronized. Adding a shared_mutex to the transaction means ledger writes are exclusive with respect to ledger reads, while reads can coexist with other reads.
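A minimal sketch of the locking scheme, using hypothetical wrapper types (the actual nano::store transaction API differs): read transactions take a std::shared_lock on a ledger-wide std::shared_mutex, write transactions take a std::unique_lock, so writers exclude readers while readers coexist.

```cpp
// Minimal sketch, hypothetical types only (not the real nano::store transactions).
#include <shared_mutex>

class read_transaction_sketch
{
public:
	explicit read_transaction_sketch (std::shared_mutex & mutex) :
		lock{ mutex } // shared: coexists with other readers
	{
	}

private:
	std::shared_lock<std::shared_mutex> lock;
};

class write_transaction_sketch
{
public:
	explicit write_transaction_sketch (std::shared_mutex & mutex) :
		lock{ mutex } // exclusive: blocks readers and other writers
	{
	}

private:
	std::unique_lock<std::shared_mutex> lock;
};

class ledger_sketch
{
public:
	read_transaction_sketch tx_begin_read ()
	{
		return read_transaction_sketch{ mutex };
	}

	write_transaction_sketch tx_begin_write ()
	{
		return write_transaction_sketch{ mutex };
	}

private:
	std::shared_mutex mutex; // one ledger-wide mutex shared by all transactions
};
```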
There are several seemingly unrelated changes included, but they share a common purpose: they address deadlocks that occur under the new mutually exclusive write transaction behavior. The changes fall under two major categories:
Non-obvious recursive acquisition of ledger read transaction.
While ledger transactions support shared/concurrent read access, they no longer support recursive read transaction acquisition, e.g. acquiring a ledger read transaction while an existing one is already on the thread's stack. The reason is that if a write transaction with exclusive lock access is requested elsewhere in the program between the two read transaction requests, the second, inner read transaction will deadlock. This is due to the behavior of shared_mutex: once unique_lock access is requested, new shared_lock acquisitions are blocked until the unique_lock can be serviced.
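A minimal sketch of the hazard (hypothetical free function, not node code): typical std::shared_mutex implementations queue new readers behind a waiting writer, so a recursive read acquisition can end up waiting on its own outer transaction.

```cpp
// Minimal sketch of the recursive read acquisition deadlock; hypothetical code.
#include <shared_mutex>

std::shared_mutex ledger_mutex;

// Thread A:
void recursive_read ()
{
	std::shared_lock<std::shared_mutex> outer{ ledger_mutex }; // outer read transaction
	// --- thread B now requests std::unique_lock{ ledger_mutex } (a write transaction)
	//     and blocks until `outer` is released ---
	std::shared_lock<std::shared_mutex> inner{ ledger_mutex };
	// Typical implementations queue `inner` behind the waiting writer; the writer is
	// waiting on `outer`, which this thread still holds, so thread A deadlocks on itself.
}
```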
Lock order inversion by holding a read transaction across component boundaries.
A common pattern is for a node component to have a thread that accepts input guarded by a mutex, with the thread performing some database operation during its loop. As a general rule, components therefore have a lock order of: component-mutex, then ledger-mutex.
Components also issue notifications or callbacks during their processing loop in order to send data to other components. We run into issues when a component needs to pass data to another component: if the database mutex is still held while dispatching, this creates a ledger-mutex then component-mutex lock order inversion, which TSAN identifies.
To solve these cases, the component must not hold a database transaction while making a callback or notifying observers, as sketched below.
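A minimal sketch of the fix, using a hypothetical component rather than actual node code: database work is done while holding the component mutex (and, in the real code, a ledger transaction), both are released, and only then are observers notified, so the receiving component is free to take its own mutex and open its own transaction without inverting the lock order.

```cpp
// Minimal sketch of notifying observers outside of all locks; hypothetical names.
#include <deque>
#include <functional>
#include <mutex>

struct block_stub
{
	// stand-in for a rolled back block
};

class component_sketch
{
public:
	// Observer called with the blocks rolled back during one loop iteration.
	std::function<void (std::deque<block_stub> const &)> rolled_back;

	void run_one_iteration ()
	{
		std::deque<block_stub> rolled_back_blocks;
		{
			std::lock_guard<std::mutex> guard{ mutex }; // component mutex first...
			// auto transaction = ledger.tx_begin_write (); // ...then the ledger lock
			rolled_back_blocks.swap (queue); // placeholder for the database work
		} // component mutex and ledger transaction are released here

		// Dispatch outside of both locks: the observer may take its own component
		// mutex and open its own ledger transaction without a lock order inversion.
		if (rolled_back && !rolled_back_blocks.empty ())
		{
			rolled_back (rolled_back_blocks);
		}
	}

	void add (block_stub block)
	{
		std::lock_guard<std::mutex> guard{ mutex };
		queue.push_back (block);
	}

private:
	std::mutex mutex;
	std::deque<block_stub> queue;
};
```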
Fixing up all of these cases was inconvenient at first; however, it generally addressed places where read transactions might otherwise have stayed open for a long time or produced unexpected results because of MVCC behavior rather than mutex behavior.