Safe, lazy unwrapping #65

mcabbott · 2019-08-22T14:14:01Z

It would be nice to have an unwrapping function which is guaranteed to leave the dimensions in the specified order. This would be like parent(permutedims(A, (:i, :j)) but I think it should:

return always a view of the data, not a copy
only return a wrapper Transpose or PermutedDimsArray if this is necessary
eventually understand multiple wildcards

My proposal was to call this unname(A, (:i, :j)). Then 2 can't strictly be type-stable but perhaps some wizardry can be done to make it work well.

The text was updated successfully, but these errors were encountered:

oxinabox · 2019-08-22T14:39:08Z

I wonder if a better way to do this might be entirely seperate to NamedDims.jl
Have a package with 1 function: simplify_wrapper_arrays
and it knows about all the Base/LinearAlgebra
wrapper array types,
and it applies rules to simplify them.

For example making this transform to remove a rundant permutation that is mixed in with views.

julia> x
3×4 Array{Int64,2}:
 11  12  13  14
 21  22  23  24
 31  32  33  34

julia> PermutedDimsArray(view(PermutedDimsArray(x, (2,1)), :, 1:2), (2,1))
2×4 PermutedDimsArray(view(PermutedDimsArray(::Array{Int64,2}, (2, 1)), :, 1:2), (2, 1)) with eltype Int64:
 11  12  13  14
 21  22  23  24

julia> PermutedDimsArray(view(PermutedDimsArray(x, (2,1)), :, 1:2), (2,1)) |> typeof
PermutedDimsArray{Int64,2,(2, 1),(2, 1),SubArray{Int64,2,PermutedDimsArray{Int64,2,(2, 1),(2, 1),Array{Int64,2}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},false}}

julia> view(x, 1:2, :)
2×4 view(::Array{Int64,2}, 1:2, :) with eltype Int64:
 11  12  13  14
 21  22  23  24

julia> view(x, 1:2, :) |> typeof
SubArray{Int64,2,Array{Int64,2},Tuple{UnitRange{Int64},Base.Slice{Base.OneTo{Int64}}},false}

Then one would largely just do:
unname(simplify_wrapper_arrays(unname(A)))
or something to that effect.

Further more if simplifying arrays it is written to be extensible NamedDIms.jl
could plug into it, and have rules that say when simplifying wrapper types that have a NamedDimsArray as one of the layers,
try and push that layer to be futher out.
Which is generally preferable as it means the named indexing works.

nickrobinson251 · 2019-08-22T15:03:13Z

yes AFAICT this is orthogonal to NamedDims (but a cool idea, and one that NamedDims should play nice with, if/when it exists)

mcabbott · 2019-08-22T15:16:17Z

views are pretty smart about overlapping, but PermutedDimsArray (and also reshape) not so much. The package Strided.jl handles such things pretty well, although not in quite the style you suggest:

julia> using Strided

julia> y = @strided permutedims(view(permutedims(x, (2,1)), :, 1:2), (2,1))
2×4 StridedView{Int64,2,Array{Int64,1},typeof(identity)}:
 11  12  13  14
 21  22  23  24

julia> y[1,1] = 99
99

julia> x
3×4 Array{Int64,2}:
 99  12  13  14
 21  22  23  24
 31  32  33  34

mcabbott · 2019-08-22T15:26:13Z

One more NamedDims-like issue is whether there ought to be a way of normalising so that the parent is an Array, not a lazy wrapper.

julia> NamedDimsArray(ones(3,1)', (:i, :j)) 
1×3 NamedDimsArray{(:i, :j),Float64,2,LinearAlgebra.Adjoint{Float64,Array{Float64,2}}}:
 1.0  1.0  1.0

Should there be a function name_the_parent(A, names) which unwraps on creation, to give 3×1 NamedDimsArray{(:j, :i),...? So that code like sum(nda, dims=:j) always gets the simplest operation.

Or should we regard things like this as bugs to be solved upstream:

rr = rand(100,100,100);
@btime sum($rr, dims=(1,3)) # 279.192 μs (6 allocations: 1.06 KiB)
pp = PermutedDimsArray(rr, (3,2,1));
@btime sum($pp, dims=(1,3)) # 2.078 ms (6 allocations: 1.06 KiB)

Edit: JuliaLang/julia#33029 is the issue; and here is a go at the normalising function suggested.

mcabbott · 2019-08-28T11:16:49Z

Maybe we got off track here. While better stacking of views would be nice, my point here is something different. I'd like this to work:

m = NamedDimsArray(rand(2,3), (:i, :j))
tm = rand()<0.5 ? m : copy(transpose(m))
unname(tm, (:i, :j))     # un-named array, always 2 x 3, sometimes ::Transpose

That is, given something with the right names, but in unknown order, I want to produce an AbstractArray with indices guaranteed to be in the order I specify, to pass to some calculation which cares about order not names.

More broadly I guess I'd like ways to deal with indices only by name. This package allows some things in this direction (like tm[i=2]) but it could go further, to write things not only clearly but independent of storage order.

nickrobinson251 · 2019-08-28T12:45:52Z

I want to produce an AbstractArray with indices guaranteed to be in the order I specify

I certainly see the appeal of this!

...to pass to some calculation which cares about order not names.

But i wonder what functions you have in mind here? Do you have a specifc example? I'm wondering if these functions could just be extended to handle names?

nickrobinson251 · 2019-08-28T12:49:45Z

For posterity, here's the current situation (where unname doesn't take a second argument):

julia> m = NamedDimsArray(rand(2, 3), (:i, :j))
2×3 NamedDimsArray{(:i, :j),Float64,2,Array{Float64,2}}:
 0.596994  0.0550656  0.204249
 0.211499  0.734237   0.364742

julia> tm = transpose(m)
3×2 NamedDimsArray{(:j, :i),Float64,2,LinearAlgebra.Transpose{Float64,Array{Float64,2}}}:
 0.596994   0.211499
 0.0550656  0.734237
 0.204249   0.364742

julia> unname(tm)
3×2 LinearAlgebra.Transpose{Float64,Array{Float64,2}}:
 0.596994   0.211499
 0.0550656  0.734237
 0.204249   0.364742

julia> tm = copy(transpose(m))
3×2 NamedDimsArray{(:j, :i),Float64,2,Array{Float64,2}}:
 0.596994   0.211499
 0.0550656  0.734237
 0.204249   0.364742

julia> unname(tm)
3×2 Array{Float64,2}:
 0.596994   0.211499
 0.0550656  0.734237
 0.204249   0.364742

mcabbott · 2019-08-28T13:26:58Z

...to pass to some calculation which cares about order not names.

But i wonder what functions you have in mind here? Do you have a specifc example? I'm wondering if these functions could just be extended to handle names?

For instance A * B contracts neighbouring indices. It does handle names (passes them forward & errors on a clash), but it doesn't decide what to do based on the names.

Which is correct, for the rule that code not mentioning names is supposed to treat named and un-named arrays alike (apart from passing along, and checking for clashes). You could imagine other behaviours for *, but that's another issue really.

If I need to call someone else's Package.fun which contains x ./ sum(x, dims=1) somewhere, then writing fun(unname(tm, (:i, :j))) is a safe way to get the right answer. If Package is sufficiently generic, perhaps I can call fun(permutelabels(tm, (:i, :j))) with some function permutelabels which returns a NamedDimsArray guaranteed to be in the order mentioned, and hope to get back something with names.

nickrobinson251 · 2019-08-28T13:42:35Z

If Package is sufficiently generic, perhaps I can call fun(permutelabels(tm, (:i, :j))) with some function permutelabels which returns a NamedDimsArray guaranteed to be in the order mentioned, and hope to get back something with names.

is this not permutedims (optionally combined with unname if you don't want a NamedDimsArray back)?

julia> nda = NamedDimsArray{(:w, :x, :y, :z)}(ones(10, 20, 30, 40));

julia> pnda = permutedims(nda, (:x, :y, :z, :w));

julia> typeof(pnda)
NamedDimsArray{(:x, :y, :z, :w),Float64,4,Array{Float64,4}}

julia> size(pnda)
(20, 30, 40, 10)

julia> typeof(unname(pnda))
Array{Float64,4}

mcabbott · 2019-08-28T13:51:10Z

It's close, but permutedims always copies -- so it fails my point 1 above. It's no good for Package.fun!, and in general may add avoidable memory usage.

nickrobinson251 · 2019-08-28T14:08:45Z

with apologies for going round in a circle here... how it this unname related?

In this case don't we just want

julia> nda = NamedDimsArray{(:w, :x, :y, :z)}(ones(10, 20, 30, 40));

julia> pnda = PermutedDimsArray(nda, (2, 3, 4, 1));  # should add a method for Symbol names

julia> typeof(pnda)
PermutedDimsArray{Float64,4,(2, 3, 4, 1),(4, 1, 2, 3),NamedDimsArray{(:w, :x, :y, :z),Float64,4,Array{Float64,4}}}

julia> size(pnda)
(20, 30, 40, 10)

mcabbott · 2019-08-28T14:15:07Z

This is my point 2 -- unfortunately quite a few operations on PermutedDimsArray fall back to slower variants than Transpose or Array.

It could be spelled something like parent(permutenames(tm, (:i, :j))), I agree. But if you have that, then you may as well have a method for unname as well.

Edit: Sometimes PermutedDimsArray is so slow that a copy is worth it. I don't think there's a great built-in way of saying "copy this only if it's not an array". Perhaps that's another question to consider here, lazy=false or something.

oxinabox · 2019-09-01T20:38:02Z

Was this closed on purpose?

mcabbott mentioned this issue Aug 22, 2019

PermutedDimsArray etc. #64

Closed

nickrobinson251 mentioned this issue Aug 28, 2019

Indexing by name when multiple dimensions have same name #52

Open

mcabbott closed this as completed Sep 1, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Safe, lazy unwrapping #65

Safe, lazy unwrapping #65

mcabbott commented Aug 22, 2019

oxinabox commented Aug 22, 2019

nickrobinson251 commented Aug 22, 2019 •

edited

Loading

mcabbott commented Aug 22, 2019

mcabbott commented Aug 22, 2019 •

edited

Loading

mcabbott commented Aug 28, 2019

nickrobinson251 commented Aug 28, 2019

nickrobinson251 commented Aug 28, 2019

mcabbott commented Aug 28, 2019

nickrobinson251 commented Aug 28, 2019

mcabbott commented Aug 28, 2019 •

edited

Loading

nickrobinson251 commented Aug 28, 2019

mcabbott commented Aug 28, 2019 •

edited

Loading

oxinabox commented Sep 1, 2019

Safe, lazy unwrapping #65

Safe, lazy unwrapping #65

Comments

mcabbott commented Aug 22, 2019

oxinabox commented Aug 22, 2019

nickrobinson251 commented Aug 22, 2019 • edited Loading

mcabbott commented Aug 22, 2019

mcabbott commented Aug 22, 2019 • edited Loading

mcabbott commented Aug 28, 2019

nickrobinson251 commented Aug 28, 2019

nickrobinson251 commented Aug 28, 2019

mcabbott commented Aug 28, 2019

nickrobinson251 commented Aug 28, 2019

mcabbott commented Aug 28, 2019 • edited Loading

nickrobinson251 commented Aug 28, 2019

mcabbott commented Aug 28, 2019 • edited Loading

oxinabox commented Sep 1, 2019

nickrobinson251 commented Aug 22, 2019 •

edited

Loading

mcabbott commented Aug 22, 2019 •

edited

Loading

mcabbott commented Aug 28, 2019 •

edited

Loading

mcabbott commented Aug 28, 2019 •

edited

Loading