-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unsound TrustedRandomAccess implementations #85874
Conversation
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override) |
Maybe such a PR should also add some form of test that reproduces a setting similar to the unsoundness in the PR and asserts that things don’t behave incorrectly? If this seems appropriate / necessary, please say so. This PR will most likely have negative performance impact since it removes certain optimizations. Thus it might be interesting to do a benchmark to get a better feeling of how much performance is lost. Since it’s fixing a soundness issue, losing some performance shouldn’t be too bad, especially considering that the relevant optimizations (involving There is the future possibility to re-add part of the removed |
Added some update to the documentation of |
/// * `std::iter::Iterator::__iterator_get_unchecked()` | ||
/// * `std::iter::TrustedRandomAccess::size()` | ||
/// 3. After `self.__iterator_get_unchecked(idx)` has been called, then `self.next_back()` will | ||
/// only be called at most `self.size() - idx - 1` times. If `Self: Clone` and `self` is cloned, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this assume that idx
is incremented in forwards direction? I.e. next_back
may not do the right thing if you're calling __iterator_get_unchecked(self.size() - 1)
Note that the trait is called TrusedRandomAccess, so theoretically some future new iterator method could take some shortcut by directly indexing into an iterator in a non-linear fashion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, first of all note that I didn’t write the original text, I just clarified it w.r.t. Clone
, but second: if __iterator_get_unchecked(self.size() - 1)
, then the requirement will be that
self.next_back()
will only be called at mostself.size() - idx - 1
times
so…
idx == size-1
size - idx - 1 == size - (size - 1) - 1 == 0
There you go, now you’re officially no longer allowed to call next_back
at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess for all of this to make just a little bit more sense, there should also be the requirement (currently undocumented) that calling next_back
decreases the size of the iterator exactly by 1.
/// | ||
/// 1. `0 <= idx` and `idx < self.size()`. | ||
/// 2. If `self: !Clone`, then `get_unchecked` is never called with the same | ||
/// 2. If `Self: !Clone`, then `self.__iterator_get_unchecked(idx)` is never called with the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have never quite understood the way this is phrased. I mean of course an index will be accessed multiple times if an iterator gets cloned, but that's happens across the different clones.
What would be the reason of having an index accessed multiple times without actually cloning an iterator?
Expected merge conflict with my own PR #85888. I won't rebase for now as to not interfere with review, the rebase will be trivial though because the other PR was just a small fix for a typo on the documentation of |
☔ The latest upstream changes (presumably #85984) made this pull request unmergeable. Please resolve the merge conflicts. |
nominating for (emergency) backport to 1.53-beta |
Can (/should?) we beta-approve just the commit c5390e3adf2402368ddbd089983ef11397125627 that removes the unsound impls, without concerning ourselves with the documentation changes for the beta...? |
@pnkfelix The documentation is completely hidden anyway, it's compiler internal / only visible of you manually run The main open question regarding this PR is whether the implementations that this PR removes are unsound or the usage of If a fix is wanted for 1.53, applying this PR is probably easier than any alternative because this PR already exists. Leaving out the documentation commit is probably easier because there's less potential for merge conflict and the internal documentation changes are irrelevant for a release version of the compiler. The effects of this PR beyond fixing the unsound case are only affecting performance and any alternative approach that fixes the unsoundness, too, and reintroduces the removed impls in some way, can still be chosen by future PRs without any problems. So merging this PR into nightly doesn't come with any form of commitment / drawbacks long-term. |
ping from triage; |
328c35f
to
99f95e5
Compare
done @rustbot label S-waiting-on-review, -S-waiting-on-author |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
☔ The latest upstream changes (presumably #86595) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did my best to review the full backlog of comments in all the related issues and then review this PR and while I can't say I fully understand how these specializations work yet the explanations you gave do seem to make sense. Particularly the bit about adding the supertrait that only gets used in locations where we don't coerce to sub types which seems like a valid fix to the perf regression without breaking the Zip unsoundness fixes.
I'd love to sit down sometime with either @the8472 or @steffahn to more fully understand how this code all fits together but for now I'm satisfied with this fix. I have one question I'd like to answer before approving which hopefully will clarify much of the bits I don't currently understand then we should be able to get this merged.
{ | ||
// Safety: The TrustedRandomAccess contract requires that callers only pass an index | ||
unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item { | ||
// Safety: The TrustedRandomAccess contract requires that callers only pass an index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disclaimer: I'm still unfamiliar with __iterator_get_unchecked
and the related specializations so I'm going to be asking some probably silly questions:
Is this comment still relevant?
Also, why doesn't this need a where Self: TrustedRandomAccessNoCoerce
bound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment still relevant?
It's unsafe code so it should have a SAFETY
comment. And it's still correct since this method should only be called by specializing impls relying on TrustedRandomAccess
.
Also, why doesn't this need a where Self: TrustedRandomAccessNoCoerce bound?
TrustedRandomAccess
is implemented unconditionally for this iterator so the bound would be always true.
But maybe it would still make sense for consistency or in case the trait impls are removed at some point. But I don't think it's needed for safety here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aah, I see. I'm assuming that means that all the other __iterator_get_unchecked
impls that leave of that bound also have it similarly implied? If that's the case I'd like to see those all have the bound added explicitly though that doesn't need to happen in this PR, but them being left off in some cases and not others seems somewhat confusing and could add to the maintenance burden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But maybe it would still make sense for consistency
Actually consistency is the reason I removed the bound. Initially I had left the Self: TrustedRandomAccess
bounds in place and relied on type errors to point out all the places that needed changing (to Self: TrustedRandomAccessNoCoerce
). Afterwards, grepping for and going through all the remaining Self: TrustedRandomAccess
occurrences (just to be sure) I came acoss this case (and another one, too IIRC) that didn’t produce an error message: Surprised me as well, I eventually figured out that the bound was entirely useless anyways. Looking at other iterators in std
, it seems to have be commonly done without any Self: TrustedRandomAccess
bound before when those weren’t necessary. On all kinds of iterators, e.g. all the ones on slices (e.g. also the ones from .chunks(…)
and similar). IIRC, with the changes of this PR all the superfluous bounds are consistently gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case I'd like to see those all have the bound added explicitly though that doesn't need to happen in this PR, but them being left off in some cases and not others seems somewhat confusing and could add to the maintenance burden.
So as far as I know, after this PR only those implementations that need the bound will have it. Don’t quote me on that, I’d need to double check if this is indeed true. This very PR demonstrated that leaving the bounds off can actually help maintainance in some cases, like it helped me being able to use error messages to find all the places that needed to change, like I described above. The redundant Self: TrustedRandomAccess
binding didn’t generate any error messages though, so their existence made the refactor a bit harder (since I needed to grep
for things in order to fix those redundant bindings (for now, by removing them instead of changing them to Self: TrustedRandomAccesNoCoerce
)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here’s a “grep” example. Looks like there are about 15 Iterators without the bound on the method (on the branch of this PR, so before this PR there were 13 Iterators without the bound)
~/forks/rust/library fix_unsound_zip_optimization rg 'fn __iterator_' -A3
core/src/iter/range.rs
677: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item
678- where
679- Self: TrustedRandomAccessNoCoerce,
680- {
core/src/str/iter.rs
299: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> u8 {
300- // SAFETY: the caller must uphold the safety contract
301- // for `Iterator::__iterator_get_unchecked`.
302- unsafe { self.0.__iterator_get_unchecked(idx) }
core/src/iter/adapters/fuse.rs
119: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item
120- where
121- Self: TrustedRandomAccessNoCoerce,
122- {
core/src/iter/adapters/enumerate.rs
117: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> <Self as Iterator>::Item
118- where
119- Self: TrustedRandomAccessNoCoerce,
120- {
core/src/iter/adapters/map.rs
128: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> B
129- where
130- Self: TrustedRandomAccessNoCoerce,
131- {
core/src/iter/adapters/cloned.rs
64: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> T
65- where
66- Self: TrustedRandomAccessNoCoerce,
67- {
core/src/slice/iter.rs
1268: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1269- // SAFETY: since the caller guarantees that `i` is in bounds,
1270- // which means that `i` cannot overflow an `isize`, and the
1271- // slice created by `from_raw_parts` is a subslice of `self.v`
--
1423: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1424- let start = idx * self.chunk_size;
1425- let end = match start.checked_add(self.chunk_size) {
1426- None => self.v.len(),
--
1588: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1589- let start = idx * self.chunk_size;
1590- let end = match start.checked_add(self.chunk_size) {
1591- None => self.v.len(),
--
1759: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1760- let start = idx * self.chunk_size;
1761- // SAFETY: mostly identical to `Chunks::__iterator_get_unchecked`.
1762- unsafe { from_raw_parts(self.v.as_ptr().add(start), self.chunk_size) }
--
1911: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1912- let start = idx * self.chunk_size;
1913- // SAFETY: see comments for `ChunksMut::__iterator_get_unchecked`.
1914- unsafe { from_raw_parts_mut(self.v.as_mut_ptr().add(start), self.chunk_size) }
--
2172: unsafe fn __iterator_get_unchecked(&mut self, i: usize) -> &'a [T; N] {
2173- // SAFETY: The safety guarantees of `__iterator_get_unchecked` are
2174- // transferred to the caller.
2175- unsafe { self.iter.__iterator_get_unchecked(i) }
--
2289: unsafe fn __iterator_get_unchecked(&mut self, i: usize) -> &'a mut [T; N] {
2290- // SAFETY: The safety guarantees of `__iterator_get_unchecked` are transferred to
2291- // the caller.
2292- unsafe { self.iter.__iterator_get_unchecked(i) }
--
2435: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2436- let end = self.v.len() - idx * self.chunk_size;
2437- let start = match end.checked_sub(self.chunk_size) {
2438- None => 0,
--
2597: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2598- let end = self.v.len() - idx * self.chunk_size;
2599- let start = match end.checked_sub(self.chunk_size) {
2600- None => 0,
--
2761: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2762- let end = self.v.len() - idx * self.chunk_size;
2763- let start = end - self.chunk_size;
2764- // SAFETY:
--
2919: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2920- let end = self.v.len() - idx * self.chunk_size;
2921- let start = end - self.chunk_size;
2922- // SAFETY: see comments for `RChunksMut::__iterator_get_unchecked`.
core/src/iter/adapters/copied.rs
80: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> T
81- where
82- Self: TrustedRandomAccessNoCoerce,
83- {
core/src/iter/adapters/zip.rs
92: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item
93- where
94- Self: TrustedRandomAccessNoCoerce,
95- {
core/src/iter/traits/iterator.rs
3492: unsafe fn __iterator_get_unchecked(&mut self, _idx: usize) -> Self::Item
3493- where
3494- Self: TrustedRandomAccessNoCoerce,
3495- {
core/src/slice/iter/macros.rs
321: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
322- // SAFETY: the caller must guarantee that `i` is in bounds of
323- // the underlying slice, so `i` cannot overflow an `isize`, and
324- // the returned references is guaranteed to refer to an element
alloc/src/collections/vec_deque/iter.rs
107: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
108- // Safety: The TrustedRandomAccess contract requires that callers only pass an index
109- // that is in bounds.
110- unsafe {
alloc/src/collections/vec_deque/iter_mut.rs
93: unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
94- // Safety: The TrustedRandomAccess contract requires that callers only pass an index
95- // that is in bounds.
96- unsafe {
alloc/src/vec/into_iter.rs
169: unsafe fn __iterator_get_unchecked(&mut self, i: usize) -> Self::Item
170- where
171- Self: TrustedRandomAccessNoCoerce,
172- {
I'm ready to merge this once the conflicts have been resolved. @bors delegate+ |
✌️ @steffahn can now approve this pull request |
I’ll handle the merge conflict tomorrow. |
Removes the implementations that depend on the user-definable trait `Copy`.
Include new details regarding coercions to a subtype. These conditions also explain why the previously removed implementations for {array, vec, vec_deque}::IntoIter<T> were unsound, because they introduced an extra `T: Clone` for the TrustedRandomAccess impl, even though their parameter T is covariant.
…arantees about subtype coercions Update all the TrustedRandomAccess impls to also implement the new supertrait
18a6d45
to
6d9c0a1
Compare
📌 Commit 6d9c0a1 has been approved by |
☀️ Test successful - checks-actions |
…=dtolnay Fix intra doc link in hidden doc of Iterator::__iterator_get_unchecked Recently, I edited the import list of the `core::iter::traits::iterator` module (in rust-lang#85874). This results in a broken intra doc link in a hidden documentation with the effect that `RUSTDOCFLAGS='--document-private-items --document-hidden-items' x doc library/std` fails. (This can be worked around by adding `-Arustdoc::broken-intra-doc-links`; still, it’s a broken link so let’s fix it.) `@rustbot` label C-cleanup, T-libs
…=dtolnay Fix intra doc link in hidden doc of Iterator::__iterator_get_unchecked Recently, I edited the import list of the `core::iter::traits::iterator` module (in rust-lang#85874). This results in a broken intra doc link in a hidden documentation with the effect that `RUSTDOCFLAGS='--document-private-items --document-hidden-items' x doc library/std` fails. (This can be worked around by adding `-Arustdoc::broken-intra-doc-links`; still, it’s a broken link so let’s fix it.) ``@rustbot`` label C-cleanup, T-libs
Removes the implementations that depend on the user-definable trait
Copy
.Fixes #85873 in the most straightforward way.
Edit: This PR now contains additional trait infrastructure to avoid performance regressions around in-place collect, see the discussion in this thread starting from the codegen test failure at #85874 (comment).
With this PR,
TrustedRandomAccess
gains additional documentation that specifically allows for and specifies the safety conditions around subtype coercions – those coercions can happen in safe Rust code with theZip
API’s usage ofTrustedRandomAccess
. This PR introduces a new supertrait ofTrustedRandomAccess
(currently namedTrustedRandomAccessNoCoerce
) that doesn’t allow such coercions, which means it can be still be useful for optimizing cases such as in-place collect where no iterator is handed out to a user (who could do coercions) after aget_unchecked
call; the benefit of the supertrait is that it doesn’t come with the additional safety conditions around supertraits either, so it can be implemented for more types thanTrustedRandomAccess
.The
TrustedRandomAccess
implementations forvec::IntoIter
,vec_deque::IntoIter
, andarray::IntoIter
are removed as they don’t conform with the newly documented safety conditions, this way unsoundness is removed. But this PR in turn (re-)adds aTrustedRandomAccessNoCoerce
implementation forvec::IntoIter
to avoid performance regressions from stable in a case of in-place collecting ofVec
s [the above-mentioned codegen test failure]. Re-introducing the (currently nightly+beta-only) impls forVecDeque
’s and[T; N]
’s iterators is technically possible, but goes beyond the scope of this PR (i.e. it can happen in a future PR).