Static drop semantics #210

Closed
wants to merge 39 commits into from

Conversation

pnkfelix
Member

Switch to static drop semantics to remove the drop-flag and memory zeroing.

rendered view

Summary

Three step plan:

  1. Revise language semantics for drop so that all branches move or drop the same pieces of state ("drop obligations"). To satisfy this constraint, the compiler has freedom to move the drop code for some state to earlier points in the control flow ("early drops").

  2. Add lints to inform the programmer of situations when this new drop-semantics could cause side-effects of RAII-style code (e.g. releasing locks, flushing buffers) to occur sooner than expected.

    Types that have side-effectful drop implement a marker trait, NoisyDrop, that drives a warn-by-default lint; another marker trait, QuietDrop, allows types to opt out. An allow-by-default lint provides a way for programmers to request notification of all auto-inserted early-drops.

  3. Remove the dynamic tracking of whether a value has been dropped or not; in particular, (a) remove implicit addition of a drop-flag by Drop impl, and (b) remove implicit zeroing of the memory that occurs when values are dropped.
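
As a rough illustration of step 1, the sketch below shows what "balanced drop obligations" look like when written by hand in today's Rust. The names `Resource`, `consume`, and `balanced` are hypothetical and not part of the RFC; the explicit `drop(r)` in the `else` arm stands in for the early drop that the compiler would insert under the proposal.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the drop order can be inspected afterwards.
type Log = Rc<RefCell<Vec<String>>>;

/// A value whose destructor has a visible side effect (hypothetical type).
struct Resource {
    name: &'static str,
    log: Log,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Takes ownership; the resource is dropped when this function returns.
fn consume(r: Resource) {
    r.log.borrow_mut().push(format!("consume {}", r.name));
}

/// One arm moves `r`, so the other arm drops it by hand. After the `if`,
/// `r` is gone on every path: both arms discharge the same obligation.
fn balanced(flag: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let r = Resource { name: "r", log: log.clone() };
    if flag {
        consume(r);
    } else {
        // Hand-written stand-in for the compiler-inserted early drop.
        drop(r);
    }
    log.borrow_mut().push("after if".to_string());
    let events = log.borrow().clone();
    events
}
```

On both paths the "drop r" event is recorded before "after if", which is the ordering the lints in step 2 are meant to flag when the drop has side effects.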

Added a hasty section
"match expressions and enum variants that copy (or do not bind)"
but while I was writing it I realized that it does not make much
sense as written and does not actually reflect my current strategy,
so I am going to remove it now.
As a drive-by, spell out what's happening explicitly,
mostly so that the sets involved appear textually
near each other.
…drop-rfc

Conflicts:
	active/0000-remove-drop-flag-and-zeroing.md
* Added examples for `break` and `return`

* Renamed marker traits and lints to use "early"/"loud" terminology.

* Removed unneeded text that should no longer be necessary now that
  "loud" is the default.
@CloudiDust
Contributor

@rkjnsn, I think we can solve the problem by asking one question: why only look at a single code path?

One reason may be that an object may be dropped along one path but not another.

This is a problem with dynamic drops, and can be solved with static drops.

Another reason may be that it is simpler.

But consider, all codepaths may be executed at runtime (or they are dead code), so in practice, a programmer should be aware of them anyway.

With the current semantics, he/she is not forced to get the big picture, but he/she should.

With the new semantics, he/she is practically forced to do so, which, I believe is actually a good thing in the long run.

@CloudiDust
Contributor

This is like how, in C/C++ and to a lesser degree in GC'd languages, a programmer should be aware of object lifetimes but is not forced to be.

Rust begs to differ.

@CloudiDust
Contributor

@kballard, from the move semantics point of view, eager drops can be seen as a measure to satisfy the guarantee: "objects get moved out of scope as soon as they are no longer used in scope", so either the programmer explicitly does so, or the compiler helps by inserting implicit eager drops.

Balanced eager move semantics can be interesting.

And it is simple to control lifetimes with eager drop semantics, by explicitly adding a drop call at the point you see fit.

No matter what semantics we are working with, "no move" is always the only way to ensure an object is alive.

Actually I think the solution to unwanted implicit drops is a set of attributes: #[forbid(implicit_eager_drop)], #[forbid(implicit_balancing_drop)], #[forbid(implicit_scope_drop)].

There is no need to use traits like NoisyDrop or QuietDrop then, because, if you do care about the lifetime of an object, you will not implicitly give up control, and once you give up control, you cannot be sure what the consumer would do to it anyway.

So tagging a local variable would suffice.

@CloudiDust
Contributor

C++'s move semantics is bolted on, just like so many other things, so it is constrained by C++'s forced implicit scoped drop semantics.

But Rust is a clean slate built upon ownership and move semantics.

So we should not be following C++ here. Our perspective should be centered around moves.

There is no early drop, only implicit balancing drop.

@pnkfelix
Member Author

pnkfelix commented Sep 1, 2014

@CloudiDust before I go through the exercise of reviewing the many comments here (and potentially the posts in discuss.rust-lang.org), I'd just start off with a quick Yes-or-No question:

The question: Is your "implicit balancing drop" merely a terminological distinction, in the sense that in the end, the semantics of "implicit balancing drop" ends up being the same as what is proposed in the "static drop semantics" RFC, apart from what names one chooses for various lints and/or whether one includes the NoisyDrop trait at all?

That is, I want to know up front if I should be expecting to see some deep difference in the underlying semantics, or if a lot of this is just about 1. perspective and 2. terminology?

(And really, all I want is a Yes/No answer. Or "I don't know" would be acceptable too. No need for an essay on this one. ;) )

@rkjnsn
Contributor

rkjnsn commented Sep 2, 2014

@kballard, I don't believe that's fair, as I don't think I've made any dubious or false claims, here. (At least, I have tried very hard not to.) I do not claim that incorrect behavior caused by early drop will be common or even likely, only that it is possible. From my personal experience, it feels similar to things in other languages that have bitten me because they are rare, and thus I don't think about them until after I have spent a good chunk of time trying to figure out why I'm seeing some unexpected behavior. C++ has a lot of rules that fall into this category, and one of the things that attracts me to Rust is that it doesn't.

Also, I stated in a previous comment that I'd be okay with fully eager drops, as they would be consistent, and the programmer would know that whenever they needed an object to last past its last use, they'd have to annotate that explicitly. Plus, eager drops could provide optimization benefits, as you point out.

My objection to the RFC as written is that the result is that variables are almost always dropped at the end of their scope unless explicitly moved, except in one specific corner case, where the compiler silently adds an early drop. This is what makes it surprising. Furthermore, since early drops only occur in this corner case, you miss out on the advantages of fully eager drops.

In response to the second part of your post, I disagree that eager vs. scope-based drops should be determined by the variable's type. Whether or not I care about a given object getting dropped early is very dependent on the context. For example, I usually won't care about when a file is closed as long as I'm done writing to it, but there are situations where I might. This even applies to memory-only objects: I wouldn't want to take the time to free a large tree in the middle of a real-time operation. Because of this, I would prefer to have consistent behavior for all objects. If we were to go with fully eager drop, we could provide a trait or attribute for types of objects about whose lifetimes the programmer will always care (such as a primitive mutex), and add a lint that warns/errors if the programmer isn't explicit about the lifetime of such an object.

TL;DR
I'm okay with either fully eager drops or at-end-of-scope-unless-moved drops (with out-of-band flags where needed), as long as the rule is always true. Each has advantages and disadvantages. What I really want to avoid are rare corner cases where the behavior is different in a way that can bite you. Further, I think the lifetime of an object should be determined by the user of the object, not by the implementer of the type.

@lilyball
Contributor

lilyball commented Sep 2, 2014

Having the type define scope-based lifetime does not preclude marking individual variables as having a scope-based lifetime as well.

@rkjnsn
Contributor

rkjnsn commented Sep 2, 2014

True, but we both seem to agree that having a type that should always have a scope-based lifetime is pretty rare. (As you point out, not even Mutex qualifies.) I feel like this would be the same kind of inconsistent corner case against which I was arguing, above, just in the other direction. Also, having a type able to determine how long it lives in certain situations just feels odd, to me.

@CloudiDust
Contributor

@pnkfelix Sorry for the late reply, busy with my job last week.

I think this is "just about 1. perspective and 2. terminology".

I would like our terminology to encourage people to think outside the box of C++ here.

@CloudiDust
Contributor

@rkjnsn, I think early drops/implicit balancing drops are also predictable in their own way.

And we already have to pay attention to object movements anyway. Once an object is moved out of scope, no matter in linear code or in a branch, we cannot know when it is dropped in general. If we do care about when an object is dropped, I think we should explicitly pin the object to the scope/enable a warning that fires when the object gets moved out of scope in any manner. We should do this even now, when we have dynamic drops.

And I agree that we should not tie the pinning/warning semantics to the library types, but let the library users decide.

@rkjnsn
Contributor

rkjnsn commented Sep 11, 2014

@CloudiDust, I'm not trying to say implicit balancing drops aren't deterministic, only that figuring out when they happen takes a lot more effort, information, and care than with any of the other three options that have been discussed (unbalanced-moves-is-an-error, out-of-band dynamic drops, and eager drops).

Also, if we want to say that one should explicitly pin an object whose lifetime they care about independently from when it is last used, is there any reason not to go with eager drops? To me, the advantage of the out-of-band dynamic drops and unbalanced-moves-is-an-error is that you know that an object is dropped at the end of scope unless the code path explicitly moves it, before then. If there is going to be any situation where this is not the case (requiring pinning to catch unexpected drops), why not go all the way to eager drops to get the advantages that provides?

@rkjnsn
Contributor

rkjnsn commented Sep 11, 2014

I was thinking about pinning, and I think there's a relatively simple way to do it without adding any additional syntax or attributes: add an explicit drop at the end of scope. This will count as a use in the case of eager drops, and makes the lifetime of the object explicit. Furthermore, if one accidentally moves the object away before then (even conditionally) without replacing it, it will be a compiler error. In the rare case where one wants to move an object in one case and keep it to the end of scope in another, one would have to use an Option.
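
A minimal sketch of this pinning idiom, written against today's Rust. `Guard` and `critical_section` are hypothetical names chosen for illustration; the point is only that the trailing `drop(guard)` is both the last use and a compile-time check that no path moved the guard earlier.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical lock-like guard whose release must not happen early.
struct Guard {
    log: Log,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push("release".to_string());
    }
}

fn critical_section(flag: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let guard = Guard { log: log.clone() };
    if flag {
        log.borrow_mut().push("branch work".to_string());
        // Moving `guard` in this arm (for example with `drop(guard);`)
        // would make the final `drop(guard)` below fail to compile with
        // a "use of moved value" error.
    }
    log.borrow_mut().push("work done".to_string());
    // The pin: an explicit drop at the end of the scope.
    drop(guard);
    let events = log.borrow().clone();
    events
}
```

Under today's scope-based drops the explicit call does not change the timing; its value is that it documents the intent and rejects any accidental move on any path.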

@CloudiDust
Contributor

@rkjnsn I believe eager drops are harder to reason about than static drops. With eager drops, we have to look out for all mentions of the value that we are interested in, with static drops, we only need to look out for moves, and we know that static drops only happen at block boundaries.

Dynamic dropping has a disadvantage compared to the other two: there is no way to statically determine whether a value is dropped at the end of the scope if unbalanced moves are involved, whereas both static and eager dropping work statically.

Also, just because a value is not dropped, doesn't mean it can be used. An unbalancedly moved value is unusable after the branching operation, no matter which drop semantics is used.

So the semantics can be seen as follows:

Eager dropping: if a value is not used afterwards, drop it;
Static dropping: if a value cannot be used afterwards, drop it;
Dynamic dropping: even if a value cannot be used any more, delay the drop till the end of the scope if it is not explicitly dropped on this code path.
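
The three rules can be compared with a small sketch. Only `dynamic_drop` shows behavior that Rust actually has today; `static_drop` and `eager_drop` emulate the other two rules with hand-written `drop` calls, and all names here are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

struct Noisy {
    log: Log,
}

impl Noisy {
    fn new(log: &Log) -> Noisy {
        Noisy { log: log.clone() }
    }
    fn touch(&self) {
        self.log.borrow_mut().push("use".to_string());
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop".to_string());
    }
}

fn note(log: &Log, msg: &str) {
    log.borrow_mut().push(msg.to_string());
}

fn snapshot(log: &Log) -> Vec<String> {
    let events = log.borrow().clone();
    events
}

/// Takes ownership and drops the value on return.
fn sink(_n: Noisy) {}

/// Today's Rust (dynamic): an unmoved `n` lives to the end of its scope.
fn dynamic_drop(flag: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let n = Noisy::new(&log);
        n.touch();
        if flag {
            sink(n);
        }
        note(&log, "after if");
    } // `n` is dropped here only if it was not moved above.
    snapshot(&log)
}

/// Static drop, emulated: `n` cannot be used after the `if`, so the
/// non-moving arm drops it.
fn static_drop(flag: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let n = Noisy::new(&log);
    n.touch();
    if flag {
        sink(n);
    } else {
        drop(n);
    }
    note(&log, "after if");
    snapshot(&log)
}

/// Eager drop, emulated: `n` is dropped right after its last use, even in
/// straight-line code with no moves at all.
fn eager_drop() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let n = Noisy::new(&log);
    n.touch();
    drop(n);
    note(&log, "later work");
    snapshot(&log)
}
```

When `flag` is true, the dynamic and static versions behave identically. They differ only on the non-moving path, where the dynamic version records "after if" before "drop" and the static version records them in the opposite order.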

The third is the most familiar one, but I'd say the first and the second make more sense than the third. After all, the dynamic drop at the end of the scope is still an implicit one, why is this implicit drop better than the implicit drops under eager/static dropping semantics?

And yes, eager drops make the simple pinning solution possible.

EDIT: explicit dropping is a valid solution no matter which drop semantics we use.

@CloudiDust
Contributor

@rkjnsn there are two problems with the simple solution:

  1. when unexpected moves occur, the compile error messages do not reflect the intentions;
  2. only guaranteeing that unexpected moves of the entire value do not occur, is insufficient. Outbound partial moves should be forbidden as well.

But in practice those may not be serious problems, and I have an RFC in the works that can help deal with the second problem. (Forbidding partial moves from immutable objects.) I'll add this use case to the RFC.

That'll be good enough. If we ever want more general and more fine grained value movement control, I also have one proposal in the works which supersedes the scoped keyword proposal in the discuss forum. scoped, or a #[lifetime(scope)] attribute, is too tightly coupled to a specific use case.

@rkjnsn
Contributor

rkjnsn commented Sep 14, 2014

@CloudiDust

> I believe eager drops are harder to reason about than static drops. With eager drops, we have to look out for all mentions of the value that we are interested in, with static drops, we only need to look out for moves, and we know that static drops only happen at block boundaries.

I see what you're saying, and it is true that with eager drops it's harder to tell exactly when an object is dropped just by looking at the code. However, I disagree that this makes it harder to reason about the code. The vast majority of the time in Rust, you don't care exactly how long an object lives as long as it lives at least as long as its last use. With eager drops, you are leaving the lifetimes of such objects up to the compiler so you don't have to worry about it, which I believe would actually decrease the cognitive burden. Also, when reading code, if you see explicit control of an object's lifetime (e.g., through the use of drop), you know that the lifetime of that object is important for some reason.

> After all, the dynamic drop at the end of the scope is still an implicit one, why is this implicit drop better than the implicit drops under eager/static dropping semantics?

I agree that the motivation for end-of-scope drops is much weaker for Rust than it is for C++. In C++ it is absolutely essential in order to allow certain objects to refer to others. To construct type B with a reference to an object of type A, you must be sure that object b is destroyed before object a. C++ does this by tying lifetime to scope and ensuring that objects are destroyed in the reverse order of their construction. Rust's type system is much stronger, and allows the fact that object a must outlive object b to be specified much more directly. The more I think about it, the more I am of the opinion that the compiler should generally be free to choose the best time to drop objects within the bounds of lifetime dependencies, and the programmer should specify the lifetime explicitly in the rare occasion that they care.
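
To make the "a must outlive b" point concrete, here is a small sketch in today's Rust. The types `A` and `B` and both functions are hypothetical; the borrow checker enforces the dependency, so `b` may be dropped at any point before `a` but never after it.

```rust
use std::cell::RefCell;

/// Logs when it is destroyed.
struct A<'l> {
    log: &'l RefCell<Vec<&'static str>>,
}

/// Holds a reference to an `A`, so it must not outlive it.
struct B<'a, 'l> {
    a: &'a A<'l>,
}

impl<'l> Drop for A<'l> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop a");
    }
}

impl<'a, 'l> Drop for B<'a, 'l> {
    fn drop(&mut self) {
        // B's destructor reads through its reference to A.
        self.a.log.borrow_mut().push("drop b");
    }
}

/// Scope-based order: locals are dropped in reverse declaration order.
fn scope_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let a = A { log: &log };
        let b = B { a: &a };
        b.a.log.borrow_mut().push("use b");
        // drop(a); // rejected: `a` is still borrowed by `b`
    }
    log.into_inner()
}

/// Dropping `b` earlier is fine: the dependency only says "b before a".
fn early_b() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let a = A { log: &log };
        let b = B { a: &a };
        drop(b);
        log.borrow_mut().push("after b");
    }
    log.into_inner()
}
```

Both orders are accepted because each destroys `b` before `a`; only the commented-out `drop(a)` would violate the dependency.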

> When unexpected moves occur, the compile error messages do not reflect the intentions.

The error will look something like:

test.rs:13:10: 13:19 error: use of moved value: `my_struct`
test.rs:13     drop(my_struct);
                    ^~~~~~~~~
test.rs:11:14: 11:23 note: `my_struct` moved here because it has type `MyStruct`, which is non-copyable (perhaps you meant to use clone()?)
test.rs:11         drop(my_struct);
                        ^~~~~~~~~

While this might not perfectly match the intent, I think it makes it pretty clear what has gone wrong and how to fix it.

> Only guaranteeing that unexpected moves of the entire value do not occur, is insufficient. Outbound partial moves should be forbidden as well.

The explicit drop solution I mentioned does ensure that no partial moves have occurred. drop is a normal function that takes its argument by value, causing the value to be moved into the function and then destroyed when the function ends. The compiler will not let you move or otherwise use a whole value that has been partially moved from, so you'll get a compiler error (error: use of partially moved value). See this example in the play pen.
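
Since the playpen link is not reproduced here, the following is an independent sketch of the same idea rather than the original example. `my_drop` is a hypothetical function with the same shape as the standard `drop`, and `Part` and `MyStruct` are illustrative types.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// A field type that logs its own destruction.
struct Part {
    name: &'static str,
    log: Log,
}

impl Drop for Part {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

struct MyStruct {
    first: Part,
    second: Part,
}

/// Same shape as the standard `drop`: the argument is moved in and
/// destroyed when the function returns.
fn my_drop<T>(_value: T) {}

fn explicit_drop() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let my_struct = MyStruct {
        first: Part { name: "first", log: log.clone() },
        second: Part { name: "second", log: log.clone() },
    };
    // let stolen = my_struct.first;
    // Uncommenting the line above is a partial move; the call below would
    // then be rejected because `my_struct` has been partially moved.
    my_drop(my_struct);
    log.borrow_mut().push("after drop".to_string());
    let events = log.borrow().clone();
    events
}
```

Because the whole value has to be intact to be passed by value, a successful compile of the `my_drop` call shows that neither field was moved out earlier.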

EDIT: Fix playpen link

@CloudiDust
Contributor

@rkjnsn Thanks for pointing out my mistake. But your playpen example doesn't seem complete. I played around a bit with sample code I wrote myself and confirmed that I was wrong.

So, explicit drops alone are enough, and if we don't need to care, we should not care at all. :)

EDIT: wording and mentioning that the playpen code is not complete.

@CloudiDust CloudiDust mentioned this pull request Sep 15, 2014
@rkjnsn
Contributor

rkjnsn commented Sep 15, 2014

I put together an RFC for eager drop semantics: #239

@nikomatsakis
Contributor

I've been thinking about this a lot and I think I've come around to preferring the dynamic drop semantics. The arguments I find most persuasive are:

  • Closer to an idealized version of what C++ does -- all else being equal, being similar to C++ makes sense, as it's what many people will expect.
  • Static drop represents a middle ground between the "eager drop" RFC (Allow Eager Drops, #239) and dynamic drop. All else being equal, being at an extreme is often better than a middle ground.
  • Dynamic drop can be converted into static drop with a lint and manually inserted drop calls. Converting static drop to dynamic drop requires Option or changes to the API itself, and is hence harder.
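
The Option-based conversion mentioned in the last point might look like the sketch below. `Guard`, `hand_off`, and `maybe_move` are hypothetical names; the `Option` acts as a hand-written drop flag, so "moved or not" is tracked in the program's own data rather than by the compiler.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

struct Guard {
    log: Log,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push("release".to_string());
    }
}

/// Takes ownership; the guard is released when this function returns.
fn hand_off(g: Guard) {
    g.log.borrow_mut().push("handed off".to_string());
}

fn maybe_move(flag: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // `slot` itself is never moved, so it needs no balancing drop.
        let mut slot = Some(Guard { log: log.clone() });
        if flag {
            // `take` leaves `None` behind, recording that the guard is gone.
            if let Some(g) = slot.take() {
                hand_off(g);
            }
        }
        log.borrow_mut().push("after if".to_string());
    } // Whatever is still in `slot` is released here.
    let events = log.borrow().clone();
    events
}
```

On the non-moving path the guard survives until the end of the block, which is the dynamic-drop timing recovered by hand.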

The fact that dynamic drop is kind of backwards compatible and hence less of a 1.0 blocker doesn't hurt either. ;)

@pnkfelix
Member Author

withdrawing in favor of #320.
