Fix crash when opening FLX realm after client reset failure #6671
Merged
Commits (6), all by danieltabacaru:
cf58a7e  Fix crash when opening FLX realm after client reset failure
3bce4a0  Update changelog
dfc3b65  Don't superceed pending subscriptions in case of a client reset failure
31e4a81  Add test
161b76b  Merge branch 'master' into dt/fix_client_reset_crash
538fdab  Changes after code review
@@ -903,6 +903,8 @@ void SubscriptionStore::supercede_prior_to(TransactionRef tr, int64_t version_id

void SubscriptionStore::supercede_all_except(MutableSubscriptionSet& mut_sub) const
{
    // 'mut_sub' can only supersede the other subscription sets if it is in Complete state.
    REALM_ASSERT_EX(mut_sub.state() == SubscriptionSet::State::Complete, mut_sub.state());
    auto version_to_keep = mut_sub.version();
    supercede_prior_to(mut_sub.m_tr, version_to_keep);

Inline review comment on the supercede_all_except signature line: 💯
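To make the precondition in this hunk concrete, here is a minimal standalone sketch of the same idea. It is not realm-core code: ToySets, State, and toy_supersede_all_except are hypothetical names, the store is modeled as a plain map from version to state, and the REALM_ASSERT_EX is replaced by an exception so the failure path can be observed.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>

// Hypothetical stand-in for SubscriptionSet::State.
enum class State { Pending, Complete, Error };

// Hypothetical stand-in for the subscription store: version -> state.
using ToySets = std::map<int64_t, State>;

// Drops every subscription set except `keep`, but only if `keep` is Complete.
// Throwing here plays the role of the assertion added in the hunk above.
void toy_supersede_all_except(ToySets& sets, int64_t keep)
{
    auto it = sets.find(keep);
    if (it == sets.end() || it->second != State::Complete) {
        throw std::logic_error("the set to keep must be in Complete state");
    }
    sets.clear();
    sets[keep] = State::Complete;
}
```

Calling it with a set in Error state throws and leaves the store untouched, which is the situation the rest of this thread is concerned with.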
This appears to work fine for this corner case. But I'm a little concerned that we are papering over a real underlying problem here. The subscription is actually not complete; it is in an error state. Is there some invariant of the subscription store that has been violated by setting the active subscription to an error?
If I undo these changes and run the test you added, it looks like the actual problem is at https://github.com/realm/realm-core/blob/master/src/realm/sync/subscriptions.cpp#L727-L730, where an assumption has been made that there is an object with primary key 0. This is probably true in the initial state, but not later on. I wonder if it is possible to hit this in normal flow somehow?
If I understand correctly, the assumption is that there is always going to be a subscription with pk==0. If that is true, we should not remove it during a supersede state change. What do you think of these changes instead? https://github.com/realm/realm-core/compare/dt/fix_client_reset_crash...js/flx-client-reset?expand=1
So the assumption is that there is always going to be a subscription with pk==0 if there is no complete subscription. And that makes sense when opening a fresh realm. But once we have a complete subscription, we supersede and hence remove all the previous ones. I had the same idea to keep around the subscription with the primary key 0, but I think it's wrong. We always supersede subscriptions when one is marked Complete (https://github.com/realm/realm-core/blob/master/src/realm/sync/subscriptions.cpp#L393-L398). If one is marked Error, we should still have the last complete one. And I think that should also be the case here. The new subscription is just a copy of the active/complete one, so it should not be in Error state since there is no issue with it.
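The sequence described above can be reproduced with a small toy model. This is only a sketch of the reasoning in this thread, not the realm-core implementation: ToySubscriptionStore and its members are hypothetical, and the "crash" is modeled as an empty optional returned from the active-set lookup.

```cpp
#include <cstdint>
#include <map>
#include <optional>

enum class State { Pending, Complete, Error };

// Hypothetical model: a fresh store starts with a version 0 set.
class ToySubscriptionStore {
public:
    ToySubscriptionStore() { m_sets[0] = State::Pending; }

    int64_t add_pending()
    {
        int64_t version = m_next_version++;
        m_sets[version] = State::Pending;
        return version;
    }

    // Marking a set Complete supersedes (removes) every earlier version.
    void mark_complete(int64_t version)
    {
        m_sets.at(version) = State::Complete;
        m_sets.erase(m_sets.begin(), m_sets.lower_bound(version));
    }

    // Marking a set Error does not supersede anything.
    void mark_error(int64_t version) { m_sets.at(version) = State::Error; }

    bool contains(int64_t version) const { return m_sets.count(version) == 1; }

    // Latest Complete set, else fall back to version 0 (assumed to exist).
    // An empty optional stands for the failing lookup discussed above.
    std::optional<int64_t> active_version() const
    {
        for (auto it = m_sets.rbegin(); it != m_sets.rend(); ++it) {
            if (it->second == State::Complete) {
                return it->first;
            }
        }
        if (contains(0)) {
            return int64_t{0};
        }
        return std::nullopt;
    }

private:
    std::map<int64_t, State> m_sets;
    int64_t m_next_version = 1;
};
```

In this model, completing version 1 removes version 0. If version 1 is then flipped to Error, there is neither a Complete set nor a version 0 to fall back to, so active_version() comes back empty.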
I cannot think of any other scenario. supercede_prior_to is called only when a subscription is marked complete, while supercede_all_except is the issue here (called only in this specific case).
Okay, what you say about removing the pk==0 subscription makes sense; ignore my suggested change for that.
I am just wary of setting the subscription to complete when that is not the actual state of things. I feel that this may create other edge case bugs. I've tried to come up with a scenario where that matters, but am having a hard time.
Now I am questioning if it is correct to be superseding subscriptions in this way at all. It may be fine for discard local, but I think it may be wrong for recovery mode in a scenario where the server breaks the schema but then changes it back such that FLX queries start to work again. In that situation we wouldn't want to supersede subscriptions made offline because those should be recovered.
I wondered why we don't leave the subscriptions as they are.
Superseding the subscriptions will lead to compensating writes. We could supersede them for DiscardLocal and keep them for the recovery mode. I would keep them in all modes (in the worst case we'll create subscriptions for tables or objects that don't exist, but that's not an issue for the server afaik).
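The two options being weighed here can be written down as a tiny decision function. This is an illustrative sketch only: ResetMode, surviving_versions, and the keep_in_all_modes flag are hypothetical names, and subscription sets are reduced to their version numbers.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the client reset modes mentioned in this thread.
enum class ResetMode { DiscardLocal, Recover };

// Returns the subscription set versions that remain after a failed client reset.
// keep_in_all_modes == false: supersede everything except `keep` in DiscardLocal,
//                             but leave the sets alone in Recover.
// keep_in_all_modes == true:  never supersede, whatever the mode.
std::vector<int64_t> surviving_versions(std::vector<int64_t> versions, int64_t keep,
                                        ResetMode mode, bool keep_in_all_modes)
{
    const bool supersede = !keep_in_all_modes && mode == ResetMode::DiscardLocal;
    if (supersede) {
        versions.erase(std::remove_if(versions.begin(), versions.end(),
                                      [keep](int64_t v) { return v != keep; }),
                       versions.end());
    }
    return versions;
}
```

With versions 1, 2, 3 and version 1 as the set to keep, only the first option in DiscardLocal mode drops versions 2 and 3 (the ones made offline); every other combination leaves all three in place.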
I think you are right, and we should just delete https://github.com/realm/realm-core/blob/master/src/realm/object-store/sync/sync_session.cpp#L580-L589 and nuke the method supercede_all_except(). The comment says that the intent is to remove all later versioned (made offline) subscriptions, but that isn't necessary in the case of a client reset at this point in discard mode and it is plain wrong in recovery mode.
I think you can test the recovery scenario with something like this:
Thanks for the test suggestion. I'll give it a try.
I looked at https://github.com/realm/realm-core/blob/master/test/object-store/sync/flx_sync.cpp#L754-L800 for inspiration, but the outcome is not always deterministic. So, I added a new test based on the other test I added for this PR to test for this specific case.