-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify invariants and fix constructors and is_standard_layout #543
Clarify invariants and fix constructors and is_standard_layout #543
Conversation
1b5d312
to
f3c4441
Compare
let dim_size = dim | ||
.size_checked() | ||
.ok_or_else(|| d.error("overflow calculating array size"))?; | ||
if len != dim_size { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This reminds me, it's more than time to drop rustc-serialize I think :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was thinking the same thing. :)
src/dimension/dimension_trait.rs
Outdated
/// length `size`, and `strides` are created with `self.default_strides()` | ||
/// or `self.fortran_strides()`, then the invariants are met to construct | ||
/// an owned `Array` from the `Vec`, `dim`, and `strides`. (See the docs of | ||
/// `can_index_slice_not_custom` for a slightly more general case.) | ||
fn size_checked(&self) -> Option<usize> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see why this is needed, yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this seems good.
Isn't there an option to let this method just work with the overflow check and then check the isize::MAX requirement in constructors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, whoops. I didn't realize that size_checked
was a public method.
Initially, I didn't do this because I thought size_checked
was private, and I noticed that all places but one where .size_checked()
was called needed to perform the isize::MAX
check too. (The only exception is this line in indices_iter_f
.)
I'll change it back to the original behavior and add a separate checking function for internal use.
I agree with your thoughts about the tradeoffs. We don't need to prioritize zero size elements much at all, and I'll look if we can do this without making docs and code so much more complicated. |
_ => true, | ||
} | ||
}); | ||
debug_assert!(dimension::can_index_slice(&v, &dim, &strides).is_ok()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this now correct? There were some negative stride issues we couldn't check properly before (we'd have a false positive, which is fine, it was a "missing feature", but we can't debug assert on that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment below. If we want to allow negative strides, we need to provide some way for the user to specify the ptr
since it's always different from vec.as_mut_ptr()
when there are negative strides for axes with length > 1.
/// | ||
/// 2. The product of non-zero axis lengths must not exceed `isize::MAX`. | ||
/// | ||
/// 3. For axes with length > 1, the stride must be nonnegative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about why we require this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be unchecked? We don't implement the proper debug check for negative strides, but if the strides are correct, they should be fine to be used here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If any axes with length > 1 have negative strides, moving along those axes would result in offsetting the array's pointer backwards outside the Vec
. (from_shape_vec_unchecked
is implemented in terms of from_vec_dim_stride_unchecked
, which creates the array's pointer from v.as_mut_ptr()
.) We could allow negative strides if we allow the user to specify the array's ptr
(or, equivalently, the offset of ptr
relative to v.as_mut_ptr()
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh great point, I just never considered that about these constructors. Thanks for the explanation.
/// A pointer into the buffer held by data, may point anywhere | ||
/// in its range. | ||
/// A non-null and aligned pointer into the buffer held by `data`; may | ||
/// point anywhere in its range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good
Very nice work on this PR 😄 |
// bounds or one byte past the end of a single allocation with element | ||
// type `A`. The only exceptions are if the array is empty or the element | ||
// type is zero-sized. In these cases, `ptr` may be dangling, but it must | ||
// still be safe to [`.offset()`] the pointer along the axes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I guess we must accept a dangling pointer, as in the one we get from an empty Vec
// methods/functions on the array while it violates the constraints. | ||
// | ||
// Users of the `ndarray` crate cannot rely on these constraints because they | ||
// may change in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whole block is super
By the way, one thing that surprised me was that before this PR, |
447f654
to
6adb1e4
Compare
@jturner314 ok about can_index_slice, so the argument for allowing more general strides for zero length axes would be that they can come up in normal operation somehow? Otherwise it seems good to ban them |
Here's an example demonstrating that arbitrary strides can occur for zero-length axes in normal operation: extern crate ndarray;
use ndarray::{array, s, Axis};
fn main() {
let mut a = array![[1, 2, 3], [4, 5, 6]];
a.invert_axis(Axis(0));
println!("a =\n{:?}", a);
let s = a.slice(s![0..0, ..]);
println!("s =\n{:?}", s);
} It prints:
The stride of axis 0 is We could still ban them in constructors, but I don't see a good reason to if they can occur in normal operation. |
@jturner314 sounds good |
8df9cfb
to
8850395
Compare
I think I've resolved all of the comments, and it finally passes CI. :)
Making the assumption that slices contain at most |
Monumental effort, it's great! I'll try my hand at adding a comment more |
8850395
to
9435c8f
Compare
ready to merge when you think so |
9435c8f
to
4bcbce5
Compare
4bcbce5
to
da99962
Compare
da99962
to
8727b8a
Compare
I fixed some minor issues (comments/docs) and merged the PR. 🎉 |
Various method and iterator implementations in
ndarray
implicitly made some assumptions that were not enforced. For example,In a number of places, methods/iterators violated the safety constraints on
.offset()
. For example,AxisIterCore
offsets the pointer along the axis of iteration even if one of the other axes is zero-length. As a result, before this PR, this program would have undefined behavior:The stride of axis 1 is 1, so
AxisIter
callsptr.offset(0 * 1)
,ptr.offset(1 * 1)
, andptr.offset(2 * 1)
. The first and second offsets are fine because they're "in bounds or one byte past the end of the same allocated object", but the last offset is undefined behavior.This PR states the invariant that offsets along all axes must be safe, even if there are zero-length axes, and it checks this property in array constructors.
In some places, implementations made the assumption that the axis lengths would always be less than or equal to
isize::MAX
, such as in thestride_offset
function in thedimension
module. However, it is possible to create aVec
/slice with length greater thanisize::MAX
of zero-size elements, and the implementation ofsize_checked
allowed creation of arrays with axis lengths greater thanisize::MAX
(e.g.ArrayView2::<i32>::from_shape((isize::MAX as usize + 1, 0), &[]).unwrap()
. I have trouble finding a place where this would cause undefined behavior, but these cases could cause unexpected panics in debug mode due to overflow, and the fact that I've seen code making this invalid assumption makes me uncomfortable that there may be a soundness issue I haven't found.This PR changes the constructors to enforce that the product of non-zero axis lengths must not exceed
isize::MAX
.This PR explicitly states the invariants that
ArrayBase
must uphold and methods can depend on. It also adds the necessary checks to enforce these invariants when arrays are created. The invariants are carefully designed so that slices/subviews/reshapes/etc. preserve them, so checks are needed only in the constructors and not in method/iterator implementations.This PR does make some changes to the behavior of constructors, the default strides for empty arrays, and
is_standard_layout
in some edge cases. However, I would still consider it a backwards-compatible change because it's enforcing assumptions that were made in previous versions that should have been checked. In other words, it's a bug fix.A few notes about possible alternatives:
AxisIter
and.subview_inplace()
. These checks would be easy-to-forget, which is problematic because missing a check could result in undefined behavior. I much prefer performing the necessary checks in constructors so that all the other method/iterator implementations can be simple.isize
. This would be slightly more flexible because, for example, it be possible possible to create a 1-D array ofusize::MAX
zero-size elements. However, this additional flexibility doesn't have practical applications and would require us to explicitly check for overflow in a lot more places.ArrayPtr
/ArrayPtrMut
in Add raw array pointer types #496 more flexible. However, method implementations already make this assumption (e.g..as_slice()
), allowing null pointers isn't significantly more useful than allowing dangling pointers, and forbidding null pointers will allow us to switchArrayBase.ptr
to useNonNull
in the future (which should remove the need for "reborrow" methods (Add .reborrow() methods to array views #412)).