Reshaping array in numpy scrambles values #263

dmrd · 2016-04-20T08:49:45Z

In short: why does reshaping an array on the NumPy side touch the underlying memory ([1,2,3,4] -> [1,3,2,4])? I thought it would only require changing the stride/shape metadata. Is the reshuffling necessary?

I am using PyCall to call into Keras for my project to write a Go bot in Julia. I am running into problems converting arrays for use with Keras.

On the Julia side, the training example array is of size (19, 19, 6, N), corresponding to N examples of 6 layers of 19x19 features. On the Python side, Keras expects the array to be (N, 6, 19, 19).

I have successfully trained models using the following function to process arrays before passing them into Keras (train is simplified here).

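using PyCall
@pyimport numpy as np

# Flatten in Julia (column-major) order, then have NumPy reshape the flat
# vector to the reversed dimensions. Note that arr[:] makes a copy.
function to_python(arr::AbstractArray)
    np.reshape(arr[:], reverse(size(arr)))
end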

function train(X, Y)
    X = to_python(X)
    Y = to_python(Y)
    model.fit(X,Y)
end

I strongly believe this to be correct because it produces a reasonably strong Go bot and achieves good validation accuracy. Unfortunately, it takes an absurd amount of memory and maxes out RAM for large X because it actually reshuffles the underlying data. Stranger still, the resulting order changes if you flatten the array first:

R = reshape(collect(1:10), (2,5))
println(np.reshape(R, reverse(size(R)))[:])
println(np.reshape(R[:], reverse(size(R)))[:])
> [1,5,9,4,8,3,7,2,6,10]
> [1,3,5,7,9,2,4,6,8,10]

--------- ^ important part above ^ ----------
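
For reference, the two orderings printed above can be reproduced without NumPy by modeling the two memory orders explicitly. This is only a sketch in plain Python: it assumes that NumPy sees R with its Julia shape (2, 5), that np.reshape uses its default row-major (C) order, and that Julia's [:] flattens column by column. The helper names to_rows and flatten are hypothetical.

```python
def to_rows(flat, shape, order):
    """Build a 2-D nested list of the given shape from a flat list.
    order="F": column-major fill (Julia); order="C": row-major fill (NumPy default)."""
    rows, cols = shape
    if order == "F":
        return [[flat[r + c * rows] for c in range(cols)] for r in range(rows)]
    return [[flat[r * cols + c] for c in range(cols)] for r in range(rows)]


def flatten(mat, order):
    """Flatten a 2-D nested list column by column ("F") or row by row ("C")."""
    rows, cols = len(mat), len(mat[0])
    if order == "F":
        return [mat[r][c] for c in range(cols) for r in range(rows)]
    return [mat[r][c] for r in range(rows) for c in range(cols)]


# Julia: R = reshape(collect(1:10), (2,5)), filled column by column.
R = to_rows(list(range(1, 11)), (2, 5), "F")

# Case 1: np.reshape(R, (5, 2)) reads R row by row and refills row by row;
# Julia's [:] then flattens the result column by column.
out1 = flatten(to_rows(flatten(R, "C"), (5, 2), "C"), "F")

# Case 2: R[:] flattens column by column first, then NumPy refills row by row.
out2 = flatten(to_rows(flatten(R, "F"), (5, 2), "C"), "F")

print(out1)
print(out2)
```

Under those assumptions, the first case has to visit elements in an order that differs from how they sit in memory, which would explain why NumPy cannot get away with a metadata-only change there.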

I also experimented with #85, but it does not seem to help in this case. Here are a few things I tried (each was fed into model.fit to test). to_python is the only one that actually fits the data. The first two have the correct dimensionality, but the data is effectively shuffled and nonsensical.

----
    X2 = reshape(X, reverse(size(X)))
    Y2 = reshape(Y, reverse(size(Y)))
    Y = PyObject(Y2, false)
    X = PyObject(X2, false)
Does not work, but the size is correct.
----
    X2 = reshape(X, reverse(size(X)))
    Y2 = reshape(Y, reverse(size(Y)))
    Y = PyObject(Y2, true)
    X = PyObject(X2, true)
Does not work, but the size is correct.
----
    Y = PyObject(Y, true)
    X = PyObject(X, true)
Exception('Input arrays should have the same number of samples as target arrays. Found 9 input samples and 81 target samples.',)
----
    Y = PyObject(Y, false)
    X = PyObject(X, false)
Exception('Input arrays should have the same number of samples as target arrays. Found 9 input samples and 81 target samples.',)
----
    function to_python(arr::AbstractArray)
        np.reshape(arr[:], reverse(size(arr)))
    end

    X = to_python(X)
    Y = to_python(Y)
Works!

stevengj · 2016-04-20T12:24:02Z

Once #85 is merged that will provide a good way to do this.

dmrd · 2016-04-21T03:29:40Z

Here is a minimal reproduction of the issue.

Converting BitArrays with #85 fails, so it falls back to the other conversion method, which doesn't take revdims.

stevengj · 2016-04-23T00:07:42Z

eb8d70a adds BitArray conversions, so you can do PyReverseDims(b) on a b::BitArray to get a NumPy boolean array with reversed dimensions. (It makes a copy of the underlying data, however, because Python doesn't have a standard bitarray type.)
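
As a rough illustration of why reversing the dimensions needs no data movement for ordinary (non-bit) arrays: the column-major offset of an index in the Julia shape equals the row-major offset of the reversed index in the reversed shape, so the same buffer can be reinterpreted as is. The sketch below checks this in plain Python. It is not PyCall's implementation; the helper names are hypothetical, and N = 2 is an arbitrary example count.

```python
from itertools import product


def col_major_offset(idx, shape):
    """Linear offset when the first index varies fastest (Julia's layout)."""
    offset, stride = 0, 1
    for i, n in zip(idx, shape):
        offset += i * stride
        stride *= n
    return offset


def row_major_offset(idx, shape):
    """Linear offset when the last index varies fastest (NumPy's default layout)."""
    offset, stride = 0, 1
    for i, n in zip(reversed(idx), reversed(shape)):
        offset += i * stride
        stride *= n
    return offset


julia_shape = (19, 19, 6, 2)       # (19, 19, 6, N) with a hypothetical N = 2
numpy_shape = julia_shape[::-1]    # (N, 6, 19, 19), as Keras expects

# Every Julia index maps to the same buffer offset as the reversed index
# in the reversed, row-major shape.
all_match = all(
    col_major_offset(idx, julia_shape) == row_major_offset(idx[::-1], numpy_shape)
    for idx in product(*(range(n) for n in julia_shape))
)
print(all_match)
```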

dmrd · 2016-04-24T07:42:32Z

Looks great. Thanks for adding that in!

Closing this out.

dmrd added a commit to dmrd/go.jl that referenced this issue Apr 21, 2016

Julia -> Numpy array conversion fixed

4203b2a

Does not modify underlying memory Documented: JuliaPy/PyCall.jl#263

dmrd closed this as completed Apr 24, 2016
