-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RCORE-788 Fix prior_size issues when replacing an embedded object in a list #4844
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach here looks good to me. I will add a test in the applier now to make sure that this will work (though I suspect it wont be a problem) and I can work on adding a new E2E test that would have caught this
src/realm/list.cpp
Outdated
@@ -371,6 +371,16 @@ Obj LnkLst::create_and_insert_linked_object(size_t ndx) | |||
Obj LnkLst::create_and_set_linked_object(size_t ndx) | |||
{ | |||
Table& t = *get_target_table(); | |||
if (t.is_embedded() && m_list.get(ndx) != ObjKey{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach looks good to me. The other option would be to just clear out the existing contents of the object here and then just update it with the new object, but not sure if that is possible
// object gets cleared, otherwise you'll only see the Update ObjectValue instruction, which is idempotent, | ||
// and that will lead to corrupted prior size for array operations inside the embedded object during | ||
// changeset application. | ||
if (value.is_type(type_Link, type_TypedLink) && list.get_target_table()->is_embedded()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the type is a list of mixed, then list.get_target_table()
will return null. In that case, the mixed value will be a typed link and we should get the target table from value.get_link().get_table_key()
. Could you add a test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I now handle type_Link and type_TypedLink separately.
Not familiar with the code, so might be a dumb question, but are dictionaries susceptible to the same problem? I.e. if we have: |
@nirinchev, if you look at the code in this PR, I added a test for dictionaries. Is it insufficient? |
Dictionary instructions don't have the prior_size field and this bug is specifically related to failing to reset that to 0, so this isn't applicable to Dictionaries (or Sets). It's not particularly clear to me why Array instructions have prior_size. There's a whole lot of code which validates it and carefully merges it, but there doesn't appear to be anything which actually uses it. It seems like something which was just accidentally left in the sync instruction log when sync started using a different instruction format from change notifications? I guess it provides some validation that the merge converged correctly? |
@jbreams I can see it now - I got confused by the naming and the general C++ obnoxiousness. The fact that things just work for dicts, coupled with @tgoyne's comment, makes me wonder if I understand the issue. If I read the description correctly, the |
@nirinchev , you are correct. My description sucks. Sorry. It's not the prior_size of the array containing the embedded objects, it's the arrays inside the embedded object. So if you have a nested object like I don't know why this doesn't effect dictionaries. I will look into it and make sure my test case - that so far has worked out of the box - is actually testing the right thing. However, this is a P2 bug for a customer so I don't want to drag this out too far. |
Okay, this makes a lot more sense now, thanks! Agreed, no need to delay that, but it'll definitely be interesting to come back to dictionaries at some point and figure out why they're working since, intuitively, they should suffer from the same issue. I did briefly look at |
@nirinchev, I think the implementation for |
I think I've addressed all comments. We're trying to get this merged so that we can unblock a customer so I'd appreciate a review ASAP. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole thing makes me a bit uneasy since it seems to be pointing at a bug in the protocol or merge algorithm that we're just working around, but that's not a reason not to ship the workaround.
@tgoyne, totally agree. When we introduced the ObjectValue{} payload type to indicate that an object should be created at a path, we made it idempotent. It turns out we need like a ClearObjectValue{} that isn't idempotent to indicate that we want to replace the embedded object rather than update its fields. For regular links, we can set them to null - and this is what the server side Atlas translator does - but for arrays we can't do that. I think there will be follow-up work to this, but it will involve bumping the protocol version and forcing everyone to do a client reset, and I don't want to do that under the gun when there's a reasonable workaround available. |
Agreed with everyone here. The ObjectValue being idempotent has been a bit tricky to work with and there are definitely a few issues with it. I think that it would definitely be interesting to add a new Payload type that clears an object because having to set to null or erase and re-insert is a bit hacky but sadly what has to happen now |
This is similar to RCORE-734, but with lists. If you have a list of embedded objects
a = {b: [{c: [1, 2]}]}
and update them in the sdk like thisa.b[0] = { c: [ 3, 4 ] }
, you'll end up replacing the object ata.b[0]
without clearing the old one, so any arrays in the sub-object won't be empty when they get updated on other devices.In RCORE-734 we fixed this by setting the object to null before doing the replacement. We can't set the array element to null, so instead this transforms the Update into an ArrayErase/ArrayInsert.
☑️ ToDos