-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support added for cascading row removal #561
Conversation
/// | ||
/// Note that if a link is replaced by itself (a link to the same target | ||
/// row), then the link is *not* considered broken, and no rows will be | ||
/// cascade-removed by such an operation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might also want to add a method to "detach" a link. For those cases where you have a strong link to an object, but you want to explicitly remove the link, while keeping the object (kinda as a manual override to circumvent the cascade rules).
This could especially be needed if you need to move sub-objects from one parent object to another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, I would however prefer to postpone the addition of the 'detach' operation, until after I have presented my full GC idea, as it would render that operation meaningless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, it is already easy to move a subobject from one parent to another. For example, if you want to move A from B to C:
table.set_link(col_ndx, row_ndx_c, row_ndx_a);
table.nullify_link(col_ndx, row_ndx_b);
The point is, that you have to "clone" the link, before removing it, just as you would have to do in a regular reference counting situation.
To a very large extent, the simple cascade removal scheme, that is now implemented, behaves just like regular reference counting. And, to an almost equally large extent, my full GC idea behaves just like a regular tracing garbage collector. The only exception is that objects are removed immediately, rather than eventually.
Here are my suggestions. Mostly they reflect which part of the code I found hard to grasp:
|
Finn, I now renamed vector typedef from |
Overall this looks good. It's a bit different than what we originally had in mind for the binding api, but the more I think about it the more I think this model may work better. The biggest concern I have though is that users will run into errors when changing link values (unsetting before re-setting). It would be very nice if we could delay deletion in the case there is an outstanding object accessor until the transaction is committed. That way a user can pull an object out, unset the link, and then have the choice to set another link with that object before the transaction ends. This model would also allow us to add another method to allow a user to explicitly preserve an object which was unset from a link if so desired. |
Conflicts: release_notes.md
The implementation of cascading row removal in this branch has undergone a major update to fix a severe design flaw in the first version. The new version breaks backlinks as it recurses into the link graph such that it knows the true resulting link counts. It postpones the breaking of forward links until the final row removals, which effectively means that the breaking of forward links can be skipped entirely. For a more detailed description, see the documentation of There are a few other significant changes at this point:
Also, there is a new general pattern for API functions that do replication and also cascade removal. I expect that it will apply well to other replicated functions too. The general idea is that for each operation
|
Again, there is Python-like pseudo code, if you are interested:
|
Testing is incomplete, but please go ahead with the review. |
These new changes sound good, but my previous concerns still stand - ie it is very odd from the user's perspective to have an object get deleted in the case they are holding a valid accessor to that object. For example lets say I want to move an object to a new index in a LinkView where the LinkView holds the only strong link. I would expect to be able to get a reference to an object, remove that object from the LinkView, then add it back to the LinkView without issue. Unfortunately with the current design such an object would get deleted. In this case you can order your operations to prevent this from occurring, but expecting all users to understand these nuances may be too much to ask, and there may be other scenarios where there is no sane way to prevent an object from being deleted without creating 'fake' temporary links to the object. Users will without a doubt run into this issue unless we provide a way to allow users to prevent object from being deleted. Counting outstanding row accessors as strong links to objects would seem to be a natural way to do this, as by holding onto a reference a user is denoting interest in that object, and most likely intends to use it sometime in the future. If we can delay deletion until the accessor is freed this would give the user an opportunity to create a new link if desired, and would avoid this issue altogether. |
I see you point Ari, and I agree that we need a way to 'detach' an object I think that the right way would be to offer specific methods to detach /a On Tue, Oct 28, 2014 at 12:21 PM, Ari Lazier [email protected]
|
An alternative would be to delay any cascading deletes until the end of a transaction. Object could go into a 'pending delete' state would be finalized when the transaction is committed. Until deletion is finalized existing accessors would remain attached and it would be possible to create new strong links to those objects to prevent them from being deleted. I think asking the user to explicitly hold on to an object wont work from an api perspective, as most users wont understand when they need to do this. That being said, its may be possible to hide some of the complexity in the binding if we go that route, although that may have negative impacts on performance. |
The key problem here is the confusion over what happens when you remove an
With the current setup removing an object will have behavior #3. By just /a On Tue, Oct 28, 2014 at 1:07 PM, Ari Lazier [email protected]
|
Ari, Alexander, I'll get back to you later on this. I have already been For the next couple of weeks, I will focus on sync, but then I will get On Tue, Oct 28, 2014 at 11:50 PM, astigsen [email protected] wrote:
|
The convention of having a xxx and a do_xxx matches to some degree similar choices in other portions of the code. But in the long term, I'd prefer a more meaningful prefix instead of "do_". |
Support added for cascading row removal
The documentation of
Descriptor::set_link_type()
explains what this is all about, so please start by reading that.Ari, Thomas, please pay special attention to this paragraph as it may apply to the binding too, i.e., the binding may be vulnerable to rows disappearing while they are modified.
Unit testing is still missing, but please go ahead with your review anyway.
I have added pseudo code below for the non-trivial parts, since I wrote it, and since there is a chance that it can help you understand my intentions.
@finnschiermer @astigsen @rrrlasse
@alazier @tgoyne (especially the API and the documentation)