
Inconsistency in into_shape with identity #766

Closed
elslooo opened this issue Dec 17, 2019 · 4 comments · Fixed by #767

Comments


elslooo commented Dec 17, 2019

In addition to the limits of into_shape() described in #390, I think the following is a bug with the current implementation.

I'm trying to reshape a tensor into its own shape (which should obviously always succeed).

let array = Array::from_shape_vec((224, 224, 3), vec![...])?;
let array = array.permuted_axes([2, 0, 1]);
let shape = array.shape().to_owned(); // [3, 224, 224]

let array = array.into_shape(shape)?; // This should not fail.
// Err(ShapeError/IncompatibleLayout: incompatible memory layout)

This use case is purely hypothetical (why would anyone want to reshape into the current shape?), but I still think it might be a bug.

The next step would be to support reshaping from (A...) to (1, A...), which I assume would also be possible once this bug is fixed.

let array = Array::from_shape_vec((224, 224, 3), vec![...])?;
let array = array.permuted_axes([2, 0, 1]);

let array = array.into_shape([1, 3, 224, 224])?; // This should not fail.
// Err(ShapeError/IncompatibleLayout: incompatible memory layout)
@jturner314 (Member)

Good catch! The first paragraph of the docs is incorrect. The correct condition is specified in the second paragraph ("Errors if the input array is not c- or f-contiguous."). In the example you've provided, the array is contiguous but not in standard or Fortran layout; that's why .into_shape() returns an error. I've created #767 to fix the docs.

If you actually want to do what your example illustrates (re-interpreting an array with contiguous but not standard or Fortran layout) using the current version of ndarray, the best way to do this is to convert the array into a slice with .as_slice_memory_order(), and then turn it into a view with the correct shape/layout using ::from_shape(). If you want to reshape an owned array instead of getting a view, you could use .into_raw_vec() instead of .as_slice_memory_order(), but that's tricky because it's possible (after e.g. a slicing operation) that an owned array may only contain a portion of the Vec's elements. Please feel free to ask for help if that's what you actually want to do using the current version of ndarray.

The .into_shape() method really needs to be overhauled (see the discussion in #390). I have a proposal here that I think is good; it just needs to be implemented.


elslooo commented Dec 18, 2019

Got it! If you don't mind me asking, what makes reshaping an existing array nontrivial? To my knowledge the reshape is essentially "virtual"; no memory is actually moved, right? For example, I've seen implementations that only check if the product of the shape (i.e. n*m) remains the same and don't do anything other than that.


jturner314 commented Dec 19, 2019

> the reshape is essentially "virtual"; no memory is actually moved, right? For example, I've seen implementations that only check if the product of the shape (i.e. n*m) remains the same and don't do anything other than that.

This approach works well if the implementation forces a particular memory layout of the arrays. For example, MATLAB arrays are always column-major, so reshape can just check that the number of elements match, and the result will always be predictable.

For implementations like NumPy and ndarray, however, which allow arbitrary strided memory layout, this approach doesn't work as well. In many cases, it can be difficult to predict the memory layout of an array by just reading the source code, so if reshaping is just reinterpreting the data in-place, the output can be difficult to predict.

The current implementation of .into_shape() is flawed for that reason. It currently works by reinterpreting the existing data as an array with new shape/strides (no data moved/copied), with the following behavior:

  • If the input is a row-major array, the output array will also be row-major.
  • If the input array is a column-major array, the output array will also be column-major.

Consider this example which just flattens various arrays into 1-D arrays:

use ndarray::prelude::*;

fn main() {
    let a = Array2::from_shape_vec((2, 3), vec![0, 1, 2, 3, 4, 5]).unwrap();
    let b = Array2::from_shape_vec((2, 3).f(), vec![0, 3, 1, 4, 2, 5]).unwrap();
    let c = (&a + &b) / 2;
    let d = (&b + &a) / 2;

    assert_eq!(a, b);
    assert_eq!(a, c);
    assert_eq!(a, d);

    println!("a_flat = {}", a.into_shape(6).unwrap());
    println!("b_flat = {}", b.into_shape(6).unwrap());
    println!("c_flat = {}", c.into_shape(6).unwrap());
    println!("d_flat = {}", d.into_shape(6).unwrap());
}

It has the following output:

a_flat = [0, 1, 2, 3, 4, 5]
b_flat = [0, 3, 1, 4, 2, 5]
c_flat = [0, 1, 2, 3, 4, 5]
d_flat = [0, 3, 1, 4, 2, 5]

Observe that all the arrays are equal (see the assert_eq! lines), but the results of reshaping them depend on their memory layout. The problem is that in a large program, it can be difficult to determine the memory layout of an array by reading the source code. Even in this short example, it's not obvious what the layout of c or d will be by just reading the source code. In fact, ndarray doesn't currently guarantee any particular memory layout for the result of arithmetic operations.

I would hate for a project to use .into_shape() and have it always work, but then some time later have a developer change &a + &b to &b + &a (or some other similar, minor change) and have that break the code in a surprising way.

So, IMO it's very important to change .into_shape() to behave in an easily-predictable way. My proposal is basically to do the same thing as numpy.reshape -- take an order parameter that describes how the reshaping operation should be performed; see the docs of numpy.reshape for details. Note that if the input array is not contiguous in the specified order, the data will need to be copied/rearranged to obtain the desired result.


elslooo commented Dec 19, 2019

Got it, thanks!

elslooo closed this as completed Dec 19, 2019