Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Eager Drops #239

Closed
wants to merge 1 commit into from
Closed

Conversation

rkjnsn
Copy link
Contributor

@rkjnsn rkjnsn commented Sep 15, 2014

Allow the compiler to drop objects that are no longer used before the
end of their scope. Rust's type system ensures that objects aren't
dropped too soon. The programmer can explicitly specify an object's
lifetime if necessary. Removes the need for drop flags.

Rendered View

Allow the compiler to drop objects that are no longer used before the
end of their scope. Rust's type system ensures that objects aren't
dropped too soon.  The programmer can explicitly specify an object's
lifetime if necessary.  Removes the need for drop flags.
@CloudiDust
Copy link
Contributor

+1, and as I said in the discuss forum and the comments in #210, scoped (my superseded proposal) is unnecessary.

And if we decide against eager drops, I think we should go with static drops but without NoisyDrop/QuietDrop and the lints.

@arcto
Copy link

arcto commented Sep 15, 2014

👍 Thanks, that was a terrific write-up!

I think that Rust's lifetime specifications again prove their worth. They can guarantee correctness where C++ has to rely on the less refactoring-robust block scope trick. While at the same time providing correctness this also enables stack optimisations and reduced resource contention by not blocking resources for longer than what is necessary.

@rkjnsn rkjnsn mentioned this pull request Sep 15, 2014
@mahkoh
Copy link
Contributor

mahkoh commented Sep 16, 2014

I think people are used to "variables going out of scope" at the end of a block. And objects being dropped at the end of the block is the RAII version of this.

With this RFC people have to carefully check that they don't accidentally drop anything. This cannot happen with the Mutex in the stdlib, but what if someone implements a Mutex that works differently? Even today there is code that would break:

let tmp_dir = std::io::TempDir::new("");
let tmp_file = tmp_dir.file().join("my_file");
// tmp_dir will be deleted here
// work with tmp_file here

On the other hand I certainly agree that

let mut x = 1i;
let y = &mut x;
// y is dropped here
let z = &mut x;

would be useful.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 16, 2014

@mahkoh, this proposal could certainly result in cases where an object is freed earlier than intended, especially as people get use to the new rules. Even with the current system, however, an object can accidentally be dropped too early if it is moved unintentionally. Luckily, Rust's type system can prevent this type of error as long as lifetime dependencies are properly specified between objects. Eager drops encourage the programmer to make better use of the type system instead of relying on specific drop rules. This can help catch errors that may be introduced later, e.g., by refactoring.

One way to handle the temporary directory problem would be to have a TempFile type with a contravariant lifetime parameter referring to a TempDir. This would ensure that all TempFiles are closed and cleaned up before their respective directory is removed, preventing any accidental moves/mutations/drops to the TempDir while there are still files using it.

A less ideal approach would be to explicitly drop the TempDir when done with it. However, I believe this is still superior to the current situation because it is now explicit that the object is being "used" even though no other objects refer to it. Further, it will catch any accidental drops or moves (but not mutations) that may be introduced before this point in the future.

Both approaches maintain RAII guarantee that the TempDir will be cleaned up no matter how the scope is exited, be it by reaching the end of the block, a break, a return, or a task failure.

@Ericson2314
Copy link
Contributor

This is great. More power to the optimizer! EDIT: Also shouldn't that lifetime restriction for TempFile be added either way? Is there a bug for it?

@arthurprs
Copy link

Maybe, just maybe, Rust is pushing safety too far and losing ergonomics and practicality along the way.

@Thiez
Copy link

Thiez commented Sep 18, 2014

I keep seeing these proposals about early/eager dropping, but has anyone investigated how much of a difference this would actually make in practice? I imagine LLVM will already reuse stackspace for all types that do not implement Drop, and in cases where it can prove that dropping has no side-effects. How much stackspace would be saved by early dropping?

@Manishearth
Copy link
Member

If this is going to be the case, I think there should be a trait FullDrop which prevents this optimization. Some abstractions rely on extra clean-up functionalities at drop time which might be intended to only happen at the end of the block.

@jdm, would this affect Servo's rooting strategy? We un-root on Drops, IIRC.

@thestinger
Copy link

This wouldn't interact well with unsafe code. Rust can't track the lifespan of raw pointers. This would also break low-level RAII locking and similar patterns. Micro-optimizations shouldn't come at the expense of an unusable language.

By allowing types to be dropped as soon as they are no longer used, we would
further encourage programmers to properly utilize the type system to specify
lifetime dependencies instead of implicitly relying on the current drop
semantics. This would result in safer code, since the compiler would be able to
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give a concrete example of this? The ony way one object could have a lifetime dependency on another object would be if it either owns that object, in which case there can be no problem, or it has a reference, in which case the current borrow checker should catch any problems. Obviously I am misunderstanding something here, and I think an example would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was thinking of something like this, where an object is borrowed by another without actually needing a reference, internally. The ergonomics could probably be improved.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 18, 2014

@Manishearth, I don't like the idea of the lifetime of a variable being determined by a trait of its type, as it seems prone to error. What about a trait such as ExplicitDrop that results in a warning or error if you use the type without explicitly specifying its lifetime? This would also apply to @thestinger's concern about low-level RAII locking. However, I do think such types should be very rare, since Rust has the power to express dependencies through the type system, and doing so helps prevent other errors, as well.

@thestinger
Copy link

There's still the issue of raw pointers not extending the life of the allocation. Do you need to use an explicit drop call to forcefully extend the lifetime?

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 18, 2014

@thestinger, in the case of creating a raw pointer to an owned object within a block and then using that pointer, you would have to call drop to manually extend the lifetime. Taking a raw pointer to an object that could be implicitly dropped within the same block seems like it should be detectable by a lint.

Is it an issue in any other cases? In the case of holding a raw pointer in an object, you already want to have an appropriate lifetime parameter so the object can be used by safe code. In the case of taking a pointer to an object passed by reference, the object is guaranteed to be alive for the duration of the function call.

@ghost
Copy link

ghost commented Sep 18, 2014

Can we please not make writing unsafe code any more painful?

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 18, 2014

@Jurily, I don't think most unsafe code would be affected. The only thing I can think of is that something like

unsafe {
    let x = SomeStruct::new();
    let y: *mut SomeStruct = &x;
    // Do stuff with y
}

would have to become

unsafe {
    let x = SomeStruct::new();
    let y: *mut SomeStruct = &x;
    // Do stuff with y
    drop(x);
}

Is this common?

@shadowmint
Copy link

I categorically oppose this RFC. Adding drop() to the language feels like adding free().

You may as well just use C.

Additionally, I'm deeply skeptical about adding these rules:

The compiler may drop an object an any time as long as the following conditions are met:

    There are no future uses of the object
    The object is not currently borrowed
    There is no other existing object which the object in question must outlive

How could you implement this in a way that is zero-cost?

You're basically proposing to replace the scope-based dropping with a garbage collector that sits watching references in a function and drops them when there are no references; that's fundamentally adverse to rusts memory model.

@thestinger
Copy link

The true last use of a lock object isn't known by the compiler. It only knows about the last access to the variable. It can't know how long it needs to hold onto the lock in order to prevent race conditions. Several locks may need to be locked in sequence and held at the same time in order to prevent race conditions. Eliminating data races in safe code is only solving part of the problem. RAII provides a contract based on scopes, and without that it is far easier to screw up stuff like this.

The compiler doesn't know how long it needs to hold onto things that aren't simple resources, and it doesn't know how long it needs to hold onto resources in the context of unsafe code using raw file descriptors, raw pointers or other low-level code.

Low-level code in C++ is more more clear and concise due to the available sugar, which means it is already more difficult to write correct low-level abstractions in Rust. Fundamentally breaking the RAII contract would be a large step backwards for low-level code when what needs to happen is closing the existing gap. There are plenty of other high-level languages, and Rust only stands out because of the low-level capabilities. Building safe abstractions via clear and concise unsafe blocks needs to be a strength for the language to succeed.

Removing the drop flag and using a local boolean to handle conditional moves would preserve correct scope-based RAII semantics. It would only have overhead in cases where a conditional move is actually used, and LLVM would optimize it out in many cases where it is not semantically required. It would always be possible to write if cond { foo(x) } else { drop(x) } to drop earlier than the end of the scope as a micro-optimization but it's hardly ever going to make a difference. Simplicity and clear semantics is more important. Rust has far bigger performance problems like bounds checks and less ideal algorithms due to aliasing restrictions. Conditional move flags on the stack wouldn't register at all on a list of the top 100 performance issues in the language.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 19, 2014

@shadowmint, you seem to be misunderstanding the proposal. drop exists today, and its functionality wouldn't change under this proposal. Further, I am not proposing any kind of dynamic garbage collection. The proposal simply gives the compiler more flexibility with respect to emitting static drops in the generated code. This actually allows the elimination of the current dynamic drop logic that uses flags to determine whether an object needs to be dropped.

@shadowmint
Copy link

@rkjnsn Fair enough, but it certainly sounds like if you write unsafe code you'll have to switch to manually managing the drops to get consistent behavior to me.

This may make sense from a compiler point of view, but from a programmer point of view, having non deterministic memory release (even if it is resolved at compile time, there's no way of knowing what it actually is other than running the code afaik?) as the default seems highly dubious to me.

Perhaps if scoped blocks 'a : { ... } existed so you could lifetime an object explicitly to a scope this would be plausible, but I don't believe that's currently the case.

What if you're writing low level (specifically ffi) code and you encounter a segmentation fault as a result of an early drop?

How do you debug that under this proposal?

@arcto
Copy link

arcto commented Sep 19, 2014

Actually, in the case where you take a raw pointer, y, to an object, x, with a lifetime, as in:

unsafe {
    let x = SomeStruct::new();
    let y: *mut SomeStruct = &x;
    // Do stuff with y
}

The compiler could, by static analysis, make sure that the lifetime of x matches that of the local pointer. No big deal, really. It's just a lost opportunity for the optimiser.

Using Rust's lifetime mechanism, as proposed by this RFC, is in fact a strength gained by the type system. I think we should use it now that we have it.

@shadowmint
Copy link

@arcto How would that work over the ffi boundary?

 let y:*const c_void = &foo;
 extern_call_async(y);
 ...  <--- What prevents the early drop of foo here?
 let r = extern_call_recover_y();

@arcto
Copy link

arcto commented Sep 19, 2014

@shadowmint Right, never mind that. You'd have to resort to manual resource management when you use raw pointers.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 19, 2014

@thestinger, I agree that moving dynamic drops from being part of the object to being on the stack would be a more conservative choice relative to the current implementation, keeping the same semantics with less space and time overhead (I argued as much in the discussion of #210). However, there was a lot of resistance to any kind of dynamic operation (notably from @pcwalton, among others). If the goal is to get rid of dynamic drop logic altogether, I think eager drops are more consistent and less prone to surprises than the solution put forth in that RFC.

Also, I consider the primary advantage of RAII to be that resources are cleaned up regardless of how the scope is exited, which is still maintained with this proposal.

The compiler doesn't know how long it needs to hold onto things that aren't simple resources

One of the big advantages of Rust is the ability of the type system to check and enforce lifetime requirements, which is much more robust than the C++ approach of relying on the programmer getting the order right. If there are currently lifetime dependencies that are hard to express, I think that should be addressed regardless of what drop semantics end up getting used.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 19, 2014

@shadowmint

From a programmer point of view, having non deterministic memory release (even if it is resolved at compile time, there's no way of knowing what it actually is other than running the code afaik?) as the default seems high dubious to me.

Most of the time in rust, you don't really care exactly how long an object lives as long as it fulfills all of its obligations. Needing to know exactly when an object is dropped is the exception, so I think it's good for this case to be explicitly expressed. This makes it much less likely for something to get accidentally broken by a later change or refactor.

@shadowmint
Copy link

@rkjnsn I'm happy to acknowledge that most of the time in purely rust code you don't really care exactly how long object lives for as long as it fulfills the lifetime bounds.

...but I care very much how long things live when I'm writing ffi code, or asm.

Basically my objections boil down to two points, very simply:

  1. How exactly will the compiler determine there are 'no future uses of the object' when the object has been temporarily passed to an ffi call?

  2. If I make a mistake and the compiler does drop an object I'm still using in ffi code, how do I backtrack to exactly where it was dropped and specify a lifetime on the object to prevent that from happening?

@yuriks
Copy link

yuriks commented Sep 19, 2014

-1, for the same reasons stated in the discussion thread for #210

@Ericson2314
Copy link
Contributor

Raw pointers are just fundamentally unsafe. Keeping them alive until the the end of the block is just a heuristic that doesn't buy one safety in any formal way.

@shadowmint
Copy link

@Ericson2314 It would certainly be nice to pretend no one writes unsafe rust code or uses unsafe pointers, but that simply isn't the case. This proposal will make writing correct (ie. segfault free) unsafe code significantly more annoying.

I really don't see the performance benefits of this (or whatever the guiding motivation for it is?) as being compelling in exchange.

@pythonesque
Copy link
Contributor

Please let's not do this. The fact that the worst of it only affects unsafe code is not an argument in its favor. Safe concurrent code is by nature going to be pretty easy to follow, since it can outsource all the tricky parts. Unsafe concurrent code (a.k.a. the tricky parts) is going to be subtle, and subtle concurrent code is really hard to reason about even when everything is explicit (due to the necessity for read barriers, critical sections, etc.). Adding one more thing upon which to trip up is not the way to go.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 19, 2014

@shadowmint

How exactly will the compiler determine there are 'no future uses of the object' when the object has been temporarily passed to an ffi call?

In the case of an object used through raw pointers and FFI, the compiler would have to rely on the programmer to specify the appropriate lifetime. Considering object lifetimes is already important when working with raw pointers. When taking a raw pointer, one must always think "how do I know this object lasts long enough." This proposal changes the answer from "because the language specifies that the object lasts until the end of scope" to "because I explicitly drop the object at this point."

A lint can help with this by triggering a warning or error if you take a raw pointer to a locally-owned object that doesn't have an explicit drop.

If I make a mistake and the compiler does drop an object I'm still using in ffi code, how do I backtrack to exactly where it was dropped and specify a lifetime on the object to prevent that from happening?

You shouldn't need to know exactly where the compiler drops it to know where it ought to be dropped. Just add an appropriate drop at the right spot.

I'll again point out that this only applies to code that looks like

let x = SomeStruct::new();
let y: *mut SomeStruct = &x;
// Do stuff with y

Other uses of of raw pointers, such as storing them in structs, will be no different from today with respect to getting lifetimes right.

How common is code like the above? My gut tells me it would be a minority of unsafe code, but I don't currently have data to back that up.

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 19, 2014

I see a lot of comments saying we should stick with always dropping objects at the end of their scope, so I want to reiterate part of the motivation for this proposal.

Because Rust has move semantics, it's always possible for an object to be moved or dropped before the end of its scope. This means that when the end of the block is reached (or the block is left through other means), a given object may or may not need to be dropped. Rust currently deals with this by implicitly including an extra field (the drop flag) in every object of a type implementing Drop and zeroing out moved-from memory. This is bad for both space and performance, and isn't really acceptable for a systems language.

One possible solution for this is for the compiler to instead track when an object is conditionally moved and insert a flag on the stack to track whether the object still needs to be freed when its scope ends. This allows the current semantics to be maintained while fixing the worst problems of the current situation. When compared with the current solution, the overhead would be pretty negligible, since flags and checks are only inserted in cases where there is actually an implicit move. Nevertheless, several people (including at least one member of the core team) were opposed to any kind of dynamic behavior, considering it to be too magical, potentially surprising, and behavior unbecoming for a systems language.

The solution put forth in RFC #210 was to say that objects are usually dropped at the end of scope, but may be dropped earlier if necessary to avoid the need for dynamic drop tracking. This strikes me as a dangerous middle ground, as it allows one to rely on semantics that aren't true in all cases.

I wrote up this RFC because I believe the solution it presents is superior to that provided by #210. I do not claim it is necessarily better than stack-based drop flags, as that seems to be more of a philosophical debate. However, if Rust goes with a static solution (which seems likely), some form of early drops will be required. I believe the approach here is more consistent, easier to explain and understand, and easier to reason about than the #210 approach. I also believe it will lead to code less likely to silently break in the face of changes and refactors by encouraging programmers to make use of Rust's strong type system and its borrowing and ownership semantics. Any reduced memory usage or additional optimization possibilities would just be a bonus.

In conclusion, please read this RFC knowing that maintaining the status quo isn't really an option, and that this RFC is presented as what I believe is a better approach to static drop semantics, if static drop is chosen over dynamic drop.

@thestinger
Copy link

@Ericson2314: It's not a heuristic, it's the basic contract of RAII / scoping. It doesn't just apply to raw pointers, raw file descriptors and other low-level code. It heavily impacts types representing transactions like lock scopes or database transactions. The need to overlap more than one transaction scope is very common. Rust is intended to be a systems language, and the ergonomics of low-level code are important in that niche.

C++ uses scope-based resource management, and C++ programmers are the primary audience that Rust is trying to convert. Implementing a very unimportant micro-optimization (avoiding a flag on the stack when a conditional drop is used) at the expense of low-level ergonomics is going to result in more memory corruption bugs, including more vulnerabilities.

@shadowmint
Copy link

In conclusion, please read this RFC knowing that maintaining the status quo isn't really an option

Obviously its a trade off between performance and expressiveness.

These RFCs are trying to turn lifetimes which are currently lints that do not control drop behaviour into runtime contracts.

Stack based drop flags will maintain the status quo; theres no compelling reason to do this instead. You cant know the runtime drop requirements at compile time with out manual memory management with unsafe code (ie. explicitly binding lifetimes to objects and dummy calls to drop() to ensure objects live for the required scope)

Manually managing memory isn't something you should have to do in rust code.

@CloudiDust
Copy link
Contributor

@thestinger @shadowmint I see.

Indeed invoking undesired behaviour in unsafe code is bad.

This also makes #210 (static drop semantics) undesirable when unsafe code are involved, right?

Because for some reason the unsafe code need an unbalancedly moved value to live till the end of its scope in the code path where it is not explicitly moved, even if it cannot be used in safe code. Is there a use case? (I am just curious.)

@shadowmint
Copy link

@CloudiDust Tangibly, the following sort of unsafe code using callbacks would now segfault as I understand it:

extern {
  fn set_callback(cb:|c_int|, state:*const c_void);
  fn check(a:c_int, b:c_int);
}

fn main() {
  let r = |x:c_int, data:*const c_void| {
    let foo:&mut Foo = transmute(data);
    foo.add(x as int);
    println!("{} was the output", x as int);
  };
  let data = Foo::new();
  unsafe { set_callback(r, &data as *const Foo as *const c_void); }
  for i in range(0, 10) {
    unsafe {
      check(10, i);
    }
  }
  // Now we must manually drop(data); and drop(r) here, othewise check will segfault.      
  // because data will already be dropped. 
}

Libuv is an example of library that uses this pattern (bind callbacks and state, poll periodically and pass state back into the callback). Leveldb does it too.

I don't speak for anyone else, but my issue with eager drops is they make it easier to write unsafe code that is in fact subtly broken

(also, who says one set of 'best drop order' will be the same as another on different systems / architectures? I imagine you could easily have code that runs on one system and segfaults on another because of drop location optimization choices).

@rkjnsn
Copy link
Contributor Author

rkjnsn commented Sep 20, 2014

@shadowmint, let me clarify: maintaining the current situation of storing drop flags in structs and zeroing on moves is not an option. I think stack-based drop flags are a perfectly acceptable choice, and would more or less keep drop semantics the same while eliminating most of the overhead from the current situation. Some unsafe code that directly manipulates drop flags for performance would have to be changed, but that's probably it.

I wrote this RFC because several people were opposed to dynamic flags of any kind, and I think this approach for enabling static drops is much better than the one proposed in #210.

@Ericson2314
Copy link
Contributor

@thestinger I don't mean to say disagree that RAII is well-defined and due to it's lexical nature easy to reason about. My point is that it is a heuristic regarding safety with unsafe data because it doesn't actually enforce correctness, but just makes it easier for the programmer to not make mistakes.

@simias
Copy link

simias commented Oct 7, 2014

Wouldn't it be possible to use early drop in the vast majority of safe code and fallback to lexical scoping when there's unsafe code in the current scope?

@aidancully
Copy link

Much of the negative feedback around this RFC seems to be about the additional difficulty it creates in writing unsafe code: if the compiler can clean up a variable before it leaves scope, how can we use an unsafe pointer to that variable? Would this problem be solved by adding "moveonly" types to the language? That is, a way of telling the compiler that a variable of a given type cannot be implicitly dropped, but must be explicitly free'd? This deserves it's own RFC to address a lot of things I'll handwave past here, but consider:

unsafe {
    let x = MoveOnly(MyStruct);
    let mut y: *MyStruct = &*x;
    //will not compile, needs the following line uncommented:
    //drop(x)
}

(note as well that this sort of facility is requested by issue #523.)

EDIT: perhaps this sort of thing is what's intended to be supported by &move pointers? as in:

unsafe {
    let x: &move = &MyStruct;
    let y: *mut MyStruct = &*x;
    //will not compile, needs the following line uncommented:
    //drop(x)
}

Either way, I'd like to add my voice to strongly support this proposal, or any other proposal in this domain that enforces a statically-determined drop semantics. Dynamic drop of any form makes code transformations much harder to reason about, in such a way that it seems likely to cripple tooling for automatic refactorings. If rust 1.0 is incompatible with an extract method refactoring, I'm not sure how extract method could be made to work in a backwards-compatible way for rust 2.0.

@shadowmint
Copy link

or any other proposal in this domain that enforces a statically-determined drop semantics.

While I agree in principal that entirely static drops would be a major win for the language, eager drops are, for all the reasons above, not the answer that gets there.

@aidancully
Copy link

@shadowmint I'm inclined to agree that eager drops are not a complete solution for reasons you and others describe. But I think they are part of the solution, with the issues you raise dealt with via some other mechanism (I think something based on linear types, though not exactly linear types since they don't interact well with unwinding).

What I'm really worried about is that the status quo (dynamic drops) might be technically incompatible with a highly important part of modern development practice (automated refactorings), and that if this issue isn't addressed by version 1.0, with rust's promise of backwards compatibility at version 1.0, it may become impossible to address in the future. For all the concern I see about towards converting developers to rust, it seems like this issue could turn many of the language's most powerful potential allies away.

@aturon
Copy link
Member

aturon commented Mar 5, 2015

ping @pnkfelix @nikomatsakis

@pnkfelix
Copy link
Member

pnkfelix commented Mar 5, 2015

I think we should close this ticket. We have pretty much committed to supporting the current dynamic drop semantics, or at least an approximation thereof.

In particular the core team has accepted RFC #320.

@nikomatsakis
Copy link
Contributor

Agreed. Thanks to the author for the suggestion, in any case!

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017
Similar to merge but requires the items/errors to be the same and then only
yields one at a time.

Closes rust-lang#239
wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019
@ByteEater-pl
Copy link

Would it be feasible to let programmers opt into this behaviour by using some syntax to create blocks (possibly around their entire programme) in which eager drops are permitted?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.