-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix error happening when online compaction kicks in on async commits #6962
Conversation
1a4bcd1
to
9a25b01
Compare
9a25b01
to
1ffa87c
Compare
test/object-store/realm.cpp
Outdated
|
||
#if REALM_HAVE_UV | ||
namespace realm::util { | ||
class UvLoopScheduler final : public util::Scheduler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, do rewrite this test before merging to be scheduler agnostic so that we could verify our abstractions for Scheduler and EventLoop as they are supposed to be used on all platforms with sdks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that can be done. Our UV scheduler only supports one thread (As I understand it). But perhaps we don't need different threads. I will investigate that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now rewritten the test case. It was not required to create a second thread. Just a second realm.
Fixes realm/realm-dart#1396 |
src/realm/db.cpp
Outdated
@@ -909,7 +909,7 @@ void DB::open(const std::string& path, bool no_create_file, const DBOptions& opt | |||
REALM_ASSERT(path.size()); | |||
|
|||
m_db_path = path; | |||
m_path_hash = StringData(path).hash() & 0xffff; | |||
m_path_hash = (size_t(this) >> 4) & 0xffff; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this line doing? It is not clear...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just hashing the file name was not good enough. If two DBs were opening the same file, they would get the same hash. Here we just take some bytes out of the object pointer. That should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make that into a comment :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, maybe I am wrong, but it seems that this is basically an important part for fixing this issue. And it is really really difficult to make sense of this line just reading it, without any added context.
Can you please add a comment or move this logic for hashing the db object into some helper method or some class.
Also, m_path_hash
is no longer a good name for this, since you are basically hashing the ptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_path_hash
is only used for the prefix of the logger. It was changed for debugging purposes so that the log output can be differentiated between DB instances. In that way it is not critical to the fix, but it helped find out what was happening. The essence of the fix is separation of GroupWriter and GroupCommitter, so that async commits can just make a commit without doing any of the online compaction.
test/object-store/realm.cpp
Outdated
}; | ||
} // namespace realm::util | ||
|
||
TEST_CASE("Concurrent commits") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment about what this test do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rock'n roll, Baby!
What, How & Why?
The issue here is that online compaction kicks in at the same time as async commits is used from 2 different threads. When the changes are going to be committed to disk, a GroupWriter objects is created. It discovers that online compaction is in progress and creates some additional info in the top level array. It causes a COW on the to array which is why it is being reported released in the SlabAllocator. Those changes are never written to disk, as the commit has already been created and just waiting to be written. When another transaction starts a new write transaction and makes some modifications, the same top level array will be COW again, which will trigger the assertion.
Fixes #6340
Fixes #6399
Fixes #6536
Fixes #6593
Fixes #6922
☑️ ToDos