Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unsafe Option::unwrap_unchecked() #1095

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 73 additions & 0 deletions text/0000-option-unwrap-unchecked.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
- Feature Name: option_unwrap_unchecked
- Start Date: 2015-4-28
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

This RFC adds a method to the `Option` enum which allows it to be
unsafely unwrapped without checking for `None`.

# Motivation

In cases where a developer can determine that an `Option` is `Some(v)`
but the compiler cannot, this method allows a developer to unsafely
opt-in to an optimization which will avoid a conditional check. In my
own uses I find cases where I know that an `Option` is `Some(v)`. I
present a case from the `libcollections linked_list.rs`:

fn next(&mut self) -> Option<&'a A> {
if self.nelem == 0 {
return None;
}
self.head.as_ref().map(|head| {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this is a curious one. It seems like the first if isn't necessary at all and is there either by accident or it's an optimisation itself.

I believe the community's consensus is that in principle unwrap() is not a good idiom, and in all likelihood neither would unwrap_unchecked(). If there's an implicit contract about an Option value always being a Some or None based on some other state (or, in general, an enum value always being of a certain variant based on some other state), then the state machine is perhaps not represented properly (the two semi-related enumerated datums should in those cases be consolidated into one).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first if is necessary because the iterator is double ended. Without checking it, next will only return None at the tail. The argument against unwrap() is about the fact that it panics (as far as I can tell). This is not a concern for this proposal.

I'd be curious to see your recommendation as to how we modify the linked_list and its iterator to consolidate the semi-related datums. Otherwise, I see unwrap_unchecked or an alternative as the right way to do it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument against unwrap() is about the fact that it panics (as far as I can tell). This is not a concern for this proposal.

A unwrap() would lead to a panic, unwrap_unchecked() would lead to memory override or other corruption in case of a bug. Memory corruption sounds worse than panic to me. Therefore i would think, that to use unwrap_unchecked() is a even worse idiom than pure unwrap().

self.nelem -= 1;
self.head = &head.next;
&head.value
})
}

The check for `nelem == 0` already ensures that `self.head.as_ref()`
cannot be `None` but the `map` method is unlikely to optimize out the
check. Here, an unsafe method which could unwrap the `Option` would
allow the compiler to optimize. I expect that this can be used in many
more cases throughout both the standard library and other code. The
outcome is likely a minor performance improvement, but one that should
be possible for developers to make.

# Detailed design

I choose to name this method `unwrap_unchecked()` (commence
bike-shedding). It is implemented quite simply by using the
`unreachable` intrinsic to indicate that the `Option` cannot be `None`
and then calling `unwrap()`. I expect that the compiler can them
optimize out the conditional. A prototype implement can be found
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you verified that the compiler does in fact optimize it away?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dependent on a lot of different cases. If I build a library as follows:

#![feature(core)]

use std::intrinsics;

pub fn my_unwrap(o: Option<i32>) -> i32 {
    match o {
        Some(v) => v,
        None => panic!("panic")
    }
}

pub unsafe fn my_unwrap_unsafe(o: Option<i32>) -> i32 {
    match o {
        Some(v) => v,
        None => intrinsics::unreachable()
    }
}


$ rustc -O --crate-type=lib unwrap.rs
$ objdump -d libunwrap.rlib
0000000000000000 <_ZN16my_unwrap_unsafe20h8f4545557bc4e296MaaE>:
   0:   48 c1 ef 20             shr    $0x20,%rdi
   4:   89 f8                   mov    %edi,%eax
   6:   c3                      retq   

You can see no conditionals in the unwrap_unsafe asm. Further testing on the actual standard library implementation would be needed to confirm, but at least one method of implementing it seems to work as I expect.

[here](https://github.com/rust-lang/rust/pull/24905).

# Drawbacks

The biggest drawback appears to be that it adds yet another function
to the `Option` enum and it may not be worth the additional cognitive
load it puts on developers to understand when and why they should use
this method.

# Alternatives

Rather than a method on `Option` this can be implemented as a
free-standing method (perhaps in an external library) or instead
developers could just use the `unreachable` intrinsic without a method:

match x {
Some(x) => {
...
},
None => unsafe { intrinsics::unreachable() };
}

I feel that this is less desirable as it causes right shift in the
code and has poor ergonomics. Forcing developers down this path will
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ergonomics are only really important for common use cases, by definition. It seems like it's not a very compelling argument given that I believe this is nonetheless a rather uncommon one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ergonomics are about efficiency (by definition). A common use case or not, all else being equal I would prefer something to be more efficient to type than not. I also presented an example from the standard library where it seems that the developer either forgot this optimization was possible or avoided it due to the ugliness involved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know one such place, the chars iterator. I've checked it again and I didn't make any performance gains by trying to do this kind of unchecked unwrap.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we agree that the situation where the programmer has extra information to prove that the option is not None is relatively uncommon, but suppose the programmer has this information. Then the case where avoiding the check will have a performance benefit will be even more uncommon. (If this is called only once, why bother?) But suppose there is a performance benefit here. Then the case where the programmer cares about performance will be even more uncommon. (Just that Rust is a systems language does not mean all code is performance-critical.)

If you are in this really specific situation where you have extra information, there is a performance benefit, you have measured it, and it is critical, then it wouldn’t be so bad to manually do the match, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree @ruud-v-a. And there are more enums than just Option that you want to do this with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruud-v-a Consider this article, it advocates knowing where the performance-critical parts of programs are and not requiring measuring to allow optimizing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruud-v-a I disagree with the assertion that the case where avoiding the check will have a binary performance benefit or not and I also disagree that the amount a programmer cares is similarly a binary value.

Our goal is to make efficient code easy to write. I think in a vacuum, having this method makes it more likely that developers will do this optimization when possible. The drawback is that it's yet another method of Option (of which there are many). So we must ask, does the advantage gained outweigh the disadvantage? I don't see it being simplified any further.

@bluss Yes, there are more cases to apply this. I felt that the RFC should cover a small case where I have a concrete example to have this debate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cases where it could make no difference is when llvm is able to optimize 2 checks into 1 (such as you already check is_some() earlier in the function).

cause them to avoid the simple optimization to keep the code cleaner.

# Unresolved questions

None