Reshaping array in numpy scrambles values #263

Closed
dmrd opened this issue Apr 20, 2016 · 4 comments

Comments

dmrd commented Apr 20, 2016

In short: why does reshaping an array on the NumPy side touch the underlying memory ([1,2,3,4] -> [1,3,2,4])? I thought it should only require changing the stride/shape metadata. Is the reshuffling necessary?

I am using PyCall to call into Keras for my project to write a Go bot in Julia. I am running into problems converting arrays for use with Keras.

On the Julia side, the training example array is of size (19, 19, 6, N), corresponding to N examples of 6 layers of 19x19 features. On the Python side, Keras expects the array to be (N, 6, 19, 19).
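To make the stride/shape expectation concrete, here is a minimal pure-Python sketch of the memory-layout arithmetic. It assumes Julia stores the array in column-major order and NumPy's default layout is row-major. The helper names and `N = 4` are hypothetical and chosen only for illustration. It models layouts only and says nothing about what PyCall does internally.

```python
from itertools import product

def column_major_strides(shape):
    """Element strides for Julia/Fortran layout: first index varies fastest."""
    strides, step = [], 1
    for dim in shape:
        strides.append(step)
        step *= dim
    return tuple(strides)

def row_major_strides(shape):
    """Element strides for C layout (NumPy default): last index varies fastest."""
    return tuple(reversed(column_major_strides(tuple(reversed(shape)))))

def offset(index, strides):
    """Position of an element in the flat buffer, counted in elements."""
    return sum(i * s for i, s in zip(index, strides))

N = 4  # hypothetical number of training examples
julia_shape = (19, 19, 6, N)
numpy_shape = tuple(reversed(julia_shape))  # (N, 6, 19, 19)

f_strides = column_major_strides(julia_shape)
c_strides = row_major_strides(numpy_shape)

# Element (i, j, k, n) on the Julia side and element (n, k, j, i) on the
# Python side land on the same offset, so the buffer itself need not change.
same_offsets = all(
    offset(idx, f_strides) == offset(tuple(reversed(idx)), c_strides)
    for idx in product(*(range(d) for d in julia_shape))
)
print(same_offsets)
```

If the check holds, reinterpreting the same buffer with reversed dimensions is a pure metadata change, which is the behaviour the question above expects.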

I have successfully trained models using the following function (in code here) to process arrays before passing them into Keras (train is simplified; the actual code is here).

using PyCall
@pyimport numpy as np
function to_python(arr::AbstractArray)
    np.reshape(arr[:], reverse(size(arr)))
end

function train(X, Y)
    X = to_python(X)
    Y = to_python(Y)
    model.fit(X,Y)
end

I strongly believe this to be correct because it produces a reasonably strong Go bot and achieves good validation accuracy. Unfortunately, it uses an excessive amount of memory and maxes out RAM for large X, because it actually reshuffles the underlying data. Stranger still, the order changes if you flatten the array first:

R = reshape(collect(1:10), (2,5))
println(np.reshape(R, reverse(size(R)))[:])
println(np.reshape(R[:], reverse(size(R)))[:])
> [1,5,9,4,8,3,7,2,6,10]
> [1,3,5,7,9,2,4,6,8,10]

--------- ^ important part above ^ ----------
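The two outputs above can be reproduced without Julia or NumPy by modelling the two orderings by hand. The sketch below assumes that `np.reshape` reads and fills in row-major (C) order by default and that Julia's `A[:]` flattens in column-major order. The helper names are hypothetical.

```python
def julia_flatten(rows):
    """Flatten a 2-D list column by column, like Julia's A[:]."""
    n_rows, n_cols = len(rows), len(rows[0])
    return [rows[i][j] for j in range(n_cols) for i in range(n_rows)]

def c_order_flatten(rows):
    """Read a 2-D list row by row, the logical order np.reshape uses by default."""
    return [x for row in rows for x in row]

def c_order_reshape(flat, shape):
    """Fill a (rows, cols) matrix row by row from a flat list."""
    n_rows, n_cols = shape
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]

# R = reshape(collect(1:10), (2,5)) in Julia: columns are filled first.
R = [[1, 3, 5, 7, 9],
     [2, 4, 6, 8, 10]]

# Case 1: np.reshape(R, (5, 2)) walks the 2x5 matrix in row-major logical order.
case1 = c_order_reshape(c_order_flatten(R), (5, 2))

# Case 2: np.reshape(R[:], (5, 2)) is handed Julia's column-major flattening.
case2 = c_order_reshape(julia_flatten(R), (5, 2))

# The trailing [:] in the Julia snippet flattens column-major again.
print(julia_flatten(case1))  # [1, 5, 9, 4, 8, 3, 7, 2, 6, 10]
print(julia_flatten(case2))  # [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
```

Under these assumptions the difference comes from which logical order the reshape reads its input in: case 1 reads the matrix row by row, while case 2 receives the values already in Julia's memory order.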

I also experimented with #85, but it does not seem to help in this case. Here are a few things I tried (each was fed into model.fit to test). to_python is the only one that actually fits the data. Even though the first two produce the correct dimensions, the data is effectively shuffled and nonsensical.

----
    X2 = reshape(X, reverse(size(X)))
    Y2 = reshape(Y, reverse(size(Y)))
    Y = PyObject(Y2, false)
    X = PyObject(X2, false)
Result: does not work, although the size is correct.
----
    X2 = reshape(X, reverse(size(X)))
    Y2 = reshape(Y, reverse(size(Y)))
    Y = PyObject(Y2, true)
    X = PyObject(X2, true)
Result: does not work, although the size is correct.
----
    Y = PyObject(Y, true)
    X = PyObject(X, true)
Result: Exception('Input arrays should have the same number of samples as target arrays. Found 9 input samples and 81 target samples.',)
----
    Y = PyObject(Y, false)
    X = PyObject(X, false)
Result: Exception('Input arrays should have the same number of samples as target arrays. Found 9 input samples and 81 target samples.',)
----
    function to_python(arr::AbstractArray)
        np.reshape(arr[:], reverse(size(arr)))
    end

    X = to_python(X)
    Y = to_python(Y)
Result: works!

stevengj (Member) commented

Once #85 is merged, it will provide a good way to do this.

dmrd (Author) commented Apr 21, 2016

Here is a minimal reproduction of the issue

Converting BitArrays with #85 fails, so the conversion falls back to the other method, which doesn't take revdims.

dmrd added a commit to dmrd/go.jl that referenced this issue Apr 21, 2016
Does not modify underlying memory
Documented: JuliaPy/PyCall.jl#263
stevengj (Member) commented

eb8d70a adds BitArray conversions, so you can do PyReverseDims(b) on a b::BitArray to get a NumPy boolean array with reversed dimensions. (It makes a copy of the underlying data, however, because Python doesn't have a standard bitarray type.)

dmrd (Author) commented Apr 24, 2016

Looks great. Thanks for adding that in!

Closing this out.

dmrd closed this as completed Apr 24, 2016