-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stabilize drop order #1857
Stabilize drop order #1857
Conversation
So, while y'all been discussing what drop order would be the best and whatnot my comment pointing out that one can (and should, if its relied on) implement arbitrary drop ordering in their destructors trivially seems to have been mostly ignored. While doing so would require some unsafe code, it is explicit, and is much better than relying on any implicit order. At the very least explicitly dropping fields in your destructor in the order programmer specifies very clearly demonstrates the fact that ordering is relied on. It is also possible to only involve select fields and let compiler drop the rest. Now it may not be exactly obvious how that happens, so I’ll prototype some code: // Potentially defined in libcore.
union ManuallyDropped<T> { value: T };
impl<T> ManuallyDropped<T> {
fn new(v: T) -> ManuallyDropped<T> { ManuallyDropped { value: v } }
unsafe fn manually_drop(&mut self) { ptr::drop_in_place(&mut self.value) }
}
struct SomethingWithInterfieldDependencies {
like_that: ManuallyDropped<postgres::Transaction<'static>>,
example: ManuallyDropped<Connection<...>>,
rest_of_the: StuffNobodyCaresAbout,
}
impl Drop for SomethingWithInterfieldDependencies {
fn drop(&mut self) {
unsafe {
// oh look, particular drop order is important for this struct,
// I better watch out and not reorder those fields… except it
// doesn’t matter anymore because drop order is clearly and
// explicitly encoded here.
self.like_that.manually_drop();
self.example.manually_drop();
}
// rest_of_the gets dropped by compiler automatically
}
} Need specific drop order within vector/array/slice? struct ReverseDropVec<T>(Vec<ManuallyDropped<T>>);
impl Drop for ReverseDropVec {
fn drop(&mut self) {
unsafe {
for i in self.iter_mut().rev() { i.manually_drop() }
}
}
} So if anything, I’m very mildly against stabilising any order just so this explicitness would begin happening somehow. If something like this was already a widely accepted practice, there would be no backcompat hazard in rust changing drop order whatsoever, everybody would have saved hours of their life that they spent arguing about what’s mostly a non-problem, and life would be generally more ponies and rainbows. |
@nagisa I like your idea very much and I think it would be a great follow-up RFC! I also dislike the current situation, but still think this RFC is necessary. As you said:
Unfortunately, it isn't 😞 |
@nagisa I definitely agree with you that writing a dtor of this style is best practice. But I'd also like to hear an affirmative case for why we should change the drop order -- as a rule of thumb, I think that execution order should be well-defined unless there is a strong reason for it not to be, and drop code seems to me to be not so different from any other code that executes. Put another way, changing drop order seems to me to be primarily a vector by which we can surprise people and cause bugs in their code. (It doesn't, e.g., affect performance all that much, and if it did, people could re-order their fields to accommodate that.) Is there an example of why we might want to change the drop order? |
@nagisa As written, if you replace
You need to call |
I very specifically mentioned it is a prototype. Also known as a rough sketch or “I didn’t compile or run it, but am sure the idea is implementable with roughly this code”. I did specifically intend an |
I really don't have enough experience, so please correct me if it makes no sense :). Should drop order be defined?I like @nikomatsakis words "I think that execution order should be well-defined unless there is a strong reason for it not to be". What happens if the drop order is reversed?This is one of the alternatives mentioned in the rfc. I think that if the drop order is changed, every crate that wants to compile with both a nightly and stable version will have to follow @nagisa's method to manually specify the drop order. That will basically make all crates that rely on a specific drop order use a custom dtor. This will continue to be so until rustc versions with the old drop order become obsolete. So changing the drop order would only makes less crates depend on it. What are the advantages of reversing the drop order? To me it seems the only advantage is to make teaching it a little bit easier. Just like all variables are dropped in reverse order of when they are defined, so are the fields of structs, tuples etc (with the notes about custom dtors and panics during construction, which this rfc explains very nice). But I like a certain simplicity... How great are the consequences of this rfc?The difficult thing with this rfc is of course that there are a few creates relying on something that is currently unspecified. So there is the right to make a breaking change. But because this is something that can not be statically checked for, some code may silently break (if there are not enough test in a crate). Is there any guess about how many crates would be affected by a change? Is it more like 10, of more like 100? Or are we in danger of spending more time thinking about it than testing and fixing a few crates would take? Overall I am slightly in favor of this rfc. Changes just don't really seem worth it. |
Could there be an associated pragma for specifying the drop order? That way the crates in question would just have to do something like the following on the structs that require it and they would be good to go. This way they could be specific AND forwards compatible.
We could even have a Hopefully a crator run (with unit tests) could help tell which (essential) libraries need to be updated, and they could be updated before the changes were released. We could release the pragma early, warning crate authors that in 2 or 3 releases the drop order will be stabilized as FILO. Then we could have our cake and eat it too. |
@vitiral Unfortunately, adding an attribute to specify drop order would cause problems when compiling the old version of the library with a recent compiler. In that case, there would be nothing in the code indicating the preferred drop order and the program could break in subtle ways. Therefore, we need to at least determine that the default behavior should be what we have now. Opt-in drop orders (possibly in the form of attributes) can be added later on. |
|
@briansmith at that point the introduced complexity is not worth it anymore IMO (that is why I wrote the RFC) |
One of the reasons destruction order is fixed in C++ as the reverse of construction is that it is used in multiple contexts. When construction fails during member construction, the members are destroyed in the same order as in the destructor. When copying a vector fails, successfully copied elements are destroyed in order. Arguably, depending on a fixed order is a design error, because you have made a dependency that you should have expressed. In effect, the system must then assume each subobject depends on all that were constructed before it, not just on ones you identified. But there is not always a way to express it, and expressing it for everything would add intolerable clutter. Anyway we are used to code at the bottom of a block depending on state from above it. Demanding annotations of those dependencies risks open insurrection. Actually the compiler analyzes blocks and identifies dependencies itself, but what it can't analyze, it had to assume lexical-order dependency So a universal drop order would not be unfamiliar; it would make members act more like locals. |
@ncm I can't tell if your comment is supporting a defined drop order or not =) |
I still feel like no one has proposed a concrete advantage to arbitrary drop orders, right? I think it's mainly "maybe one will happen in the future"? |
If "dropping in in-memory order" aka "order after field reordering" can be considered arbitrary, then "no pessimization by default" is the concrete advantage. Structures can be quite large especially if they are auto-generated and/or contain other nested structs. Who knows what benefits "optimization-by-default" can bring occasionally, it's better to give it chance. Regarding "you can manually reorder if performance bottleneck is found", the whole point of these layout optimizations is that they accumulate, they affect each and every struct by default. You may not know anything about caches or prefetch and you'll still benefit from them. |
I am in strongly in favor of some amount of nailing down initialization order. That order need not match the physical layout-order of members in a struct (presuming the compiler is allowed to re-arrange those, as petrochenkov seems to suggest). C++ inherited from C that physical layout must match member lexical declaration order; it equated that order to initialization order, but not for any especially good reason. Any specified order of both initialization and destruction should be easily visible to the programmer, matching some prominent lexical order. In C++ the lexical order of supplying arguments to initializers does not affect order of initialization, but every compiler warns if it doesn't match the other three. The lexical order of initializers in the source code should match the temporal order in which they are run. It doesn't seem necessary for that order to match layout order, or to match member declaration order, or for those two to match. But, if there are two places where the same members are initialized, those lexical orders should be enforced to match. That might leave the order of default-initialized members unspecified, but it seems to suffice that if you care about the order, you must mention the members whose order matters; any not mentioned could be allowed to be initialized in any order, or even in parallel with one another and with mentioned members. If the compiler can determine that two mentioned initializations don't interact, it can rearrange or interleave them just like it can rearrange statements in a block, by the as-if rule. Elements in array-like objects probably should be initialized in index order. Again, of course, if the compiler can prove that a program can't tell what the order was, the compiler is free to re-order or do them in parallel. |
The reason why different explicit initializations have to match one another's sequence is that Drop can't know which of possible sequences was used. If Drop cannot know anything about the initialization sequences, then it might need to rely on the lexical order of declaration of the members; in that case, explicit initializations would need to match that order, although layout could still be independent. |
Also: My reasons for wanting some initialization order determinacy have nothing to do with performance. They are about correctness, and implicit dependency. That anyone could reliably predict the performance effect of any given change to initialization order strikes me as absurd. The only way to know if it has any effect is to measure, but there's no way to guess how portable the result is even between two apparently identical machines, or under different loads on the same machine. |
@ncm First, the initialization order is already defined and, unlike in C++, it's left-to-right order of fields in struct initializer expression, regardless of field order in struct definition. So, this RFC is only about drop order. |
Thank you, petrochenkov. I should have read more carefully above before posting. |
@ncm Before initialization is complete, the structure doesn't exist yet, so the rules are (or should be?) identical to calls (which don't happen until after their arguments are fully evaluated). |
There's a way to do this with some indirection and global mutable context.
In this setup 1) initialization order is correct by definition and not tied to field order in struct definition 2) drop order in case of panic is correct because |
I want to make clear up front: My preference is for a semantics based on the source text; not in-memory order after potentially arbitrary reordering. But I'm going to start off attempting an analogy with order-of-evaluation in another context. (Perhaps this will be torn apart as a strawman, but I think the analogy is valid.) @petrochenkov wrote:
I have seen similar arguments put forward to support unspecified evaluation order for the expressions used in function call arguments. That particular lack of specification has always been of interest to me because it is one property that is shared, oddly enough, by both the C and Scheme standards (at least up through r6rs; not sure about r7rs). Furthermore: I have even seen concrete evidence presented of a win due to unspecified evaluation order of call arguments. In particular, a post from 1995 to the comp.compilers newsgroup. (Disclaimer: I used to work on Larceny and Will Clinger was my Ph.D advisor.) http://compilers.iecc.com/comparch/article/95-08-080
(Quoting to ensure the relevant paragraph remains even if link goes dead.) A report of a 7.5% code size reduction is nothing to sneeze at.
So, okay, that's my attempt to point out that unspecified evaluation order for certain language forms can indeed yield performance wins (at least in certain contexts). Nonetheless: My subjective opinion is that it is far more important to Rust's own goals that a human reader can predict what a piece of code will do (especially if that code never uses an So for something like drop order, I would prefer that we specify some reasonable evaluation order that is based on the source text. If the compiler is rearranging structural representation and that causes a change to the drop evaluation order, then I think that is in conflict with the goal of human predictability.
|
Wouldn't that assume there exists an optimal order for all architectures and use-cases? |
Architecture-dependence for optimal performance is problem at all levels of programming, not just something like drop order. For example, see discussion in this paper (pdf); they point out in section 4 of the paper that you have to select entirely different algorithms depending on whether you are targeting a Pentium IV or a Pentium III.
Anyway: No, I am not presuming that there exists a single ordering that will be optimal for all architectures and use cases. I am assuming that any cost in execution time here will pale in comparison to the benefit of having predictable semantics. (I am also assuming that a developer who really wants to attempt to extract optimal performance will be able to either 1. provide manual drop implementations or 2. multiple structure definitions, under different cfg switches, that provide the optimal ordering for each target architecture.) |
I previously cited my advisor's post to comp.compilers earlier in this thread; however, looking over the comp.compiler's postings now, I do think that David Chase's many messages there are echoing the kind of attitude that I have here today. Consider e.g. this:
That one is especially funny given it starts out saying "in this decade", and the message is at this point over twenty years old. |
After having read this entire thread, here's my viewpoint, and it's largely the same as what I expressed 3 years ago. I understand why a lot of people think vectors should drop in reverse order. It's really handy if you're using the vector like a stack. But I still think that it doesn't actually make intuitive sense for vectors, arrays, tuples, or structs to drop in reverse order, because they aren't stacks. Local variables are a stack, not just technically but also conceptually. Variables declared first have a longer lifetime, maybe just technically if declared in the same scope, but very obviously if declared in a parent scope. But structs, tuples, arrays, and vectors have no intrinsic stack nature to them. Structs and tuples are simply values, and arrays and vectors are ordered collections of values, where most processing on these ordered collections happens in-order. Having values get dropped in reverse-order would be pretty surprising to me, especially if you consider nagisa's comment where they point out that, if Drops happen in reverse order, then inserting a single And finally, I think there's a lot of value in standardizing on the current implemented behavior today, because that means we won't break any code, and has been pointed out repeatedly, if we were to flip the order, then there's not really a great way to fix existing code (that relies on Drop order) such that it's still compatible with older compilers. So given all that, I'm strongly in favor of this RFC. Except there's one tweak that needs to be made: The RFC currently says "slices and vectors" match drop order of structs. Except slices don't have a drop order, and the RFC as written doesn't mention arrays. I assume that section was meant to say that "arrays and vectors" should match the drop order of structs. |
Also, regarding the whole notion of intra-struct references, if we do extend the language to let you express those, then I think it's perfectly reasonable to also adjust the Drop order to take those dependencies into account, rather than forcing the writer to reorder their fields. In addition, I think there's value in adding support later on for attributes like |
I feel I should also point out that while I agree in principle that leaving drop order undefined would let us do nice things down the road, in practice it's a bad idea because, even without defining it, people will (and already have) write code that depends on the current behavior. Yes, that code is technically "broken", but that doesn't mean we can ignore the issue, because it's perfectly reasonable for people to write code like this without even realizing that they're depending on unspecified behavior. |
If we agree that the current code is not big enough to warrant setting this for all future Rust code, we could just randomize the drop order of a struct during compile time. That way it won't be "perfectly reasonable" for people to write code like that. |
Is there some example (real-world) code that relies on drop order? |
@tbu- Randomizing the drop order would also introduce random code failures though:
I agree that code that relies on drop order should be explicit about it, but I think randomizing drop order will only create sour experiences for those that do rely on it (given that they may not know or remember that they are relying on it in a specific place). If we could statically verify that the user relies on drop order and give them an error, then I would be fine with leaving it undefined, but that doesn't seem plausible. |
If you're already running the code to check, then you apparantly already know about drop order.
Well, this is better than implicitly relying on the drop order if it is unspecified, right? |
I'm not saying you run your software to test for drop order specifically; I'm just saying you won't ever know the software is wrong if you don't run the specific code under the right conditions, and because of the randomized drop order, those conditions will be slightly different every time you build.
Of course, if nothing is done to stop people from implicitly relying on it, then it may as well become defined, since changing it will then cause too much breakage. But I'm not sure randomizing the drop order is the right answer, since it feels like something that could introduce very subtle errors. Best case is that the random ordering causes a panic immediately, but since |
Just while everyone is thinking about it : Is there enough need for dropping in reverse order to be worth giving a standard way to do it?
I'd expect the answer to be no. And merely adding |
@tbu- Real-world code that relies on current drop order has already been linked upthread. And I completely agree with @KasMA1990, randomizing drop order is a bad idea, because it means you can test your code locally and everything will work, but then it will randomly fail for users, simply because when you tested it the compiler picked a compatible drop order and when your users used it the compiler picked a different drop order. |
The final comment period is now complete. |
Huzzah! At long last, this RFC has been merged. Thanks, @aochagavia, and the many others who participated in this discussion. |
Given that this is just a matter of policy rather than implementation, was this RFC effectively serving the role that stabilization tracking issues on the |
@glaebhoerl Oops, I neglected to link the tracking issue. I'm not totally sure what work is needed, but at the very least there are some documents that should be changed to reflect that we're now specifying a drop order. |
@aturon - WRT,
At the very least, shouldn't there be unit tests for the compiler to ensure regressions don't occur with respect to drop order? If it is going to be guaranteed to follow th order of today (forward), shouldn't there be tests to confirm that this doesn't inadvertently change in some future version of the compiler. |
@gbutler69 Indeed! I'll add that to the tracking issue. |
Rendered