make BatchedAdjOrTrans return correct BroadcastStyle #424

chengchingwen · 2022-06-26T00:03:24Z

This makes batched_transpose or batched_adjoint of a GPU array broadcastable, so broadcasting over the wrapper no longer falls back to scalar indexing.

Without this patch:

julia> using NNlib, CUDA

julia> x = cu(randn(3,4,1));

julia> CUDA.allowscalar(false)

julia> batched_transpose(x) + batched_transpose(x)  
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:                                                
[...]

With this patch:

julia> using NNlib, CUDA

julia> x = cu(randn(3,4,1));

julia> CUDA.allowscalar(false)    

julia> batched_transpose(x) + batched_transpose(x)
4×3×1 CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}:
[:, :, 1] =
 0.804003  -1.25596    1.6399
 2.09514    0.652395   2.89468
 0.938654   1.72338   -4.85532
 0.247084   0.30947   -2.44189
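
For readers unfamiliar with the mechanism, here is a minimal CPU-only sketch of how a wrapper array can forward its BroadcastStyle to its parent. It is not the code in this PR: TaggedArray, TaggedStyle, and ToyBatchedTranspose are hypothetical stand-ins for a GPU array type, its broadcast style, and the batched wrapper, and the last method definition only illustrates the general idea of delegating to the parent's style.

```julia
# Hypothetical stand-in for a GPU array type with its own broadcast style.
struct TaggedArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end
Base.size(a::TaggedArray) = size(a.data)
Base.getindex(a::TaggedArray, i::Int...) = a.data[i...]

# Hypothetical custom style, playing the role of the GPU array's style.
struct TaggedStyle{N} <: Broadcast.AbstractArrayStyle{N} end
TaggedStyle{M}(::Val{N}) where {M,N} = TaggedStyle{N}()
Base.BroadcastStyle(::Type{<:TaggedArray{T,N}}) where {T,N} = TaggedStyle{N}()

# Hypothetical lazy wrapper that swaps the first two dimensions of a 3-d parent.
struct ToyBatchedTranspose{T,S} <: AbstractArray{T,3}
    parent::S
end
ToyBatchedTranspose(p::AbstractArray{T,3}) where {T} =
    ToyBatchedTranspose{T,typeof(p)}(p)
Base.size(a::ToyBatchedTranspose) =
    (size(a.parent, 2), size(a.parent, 1), size(a.parent, 3))
Base.getindex(a::ToyBatchedTranspose, i::Int, j::Int, k::Int) = a.parent[j, i, k]

x = ToyBatchedTranspose(TaggedArray(rand(3, 4, 1)))

# Before forwarding, the wrapper gets the generic AbstractArray fallback style,
# so broadcasting would index it element by element.
style_before = Base.BroadcastStyle(typeof(x))

# Forward the style query to the parent array type S.
Base.BroadcastStyle(::Type{<:ToyBatchedTranspose{T,S}}) where {T,S} =
    Base.BroadcastStyle(S)

style_after = Base.BroadcastStyle(typeof(x))

@assert style_before == Broadcast.DefaultArrayStyle{3}()
@assert style_after == TaggedStyle{3}()
```

With a real GPU parent, the same kind of forwarding is what would let the broadcast machinery pick the GPU array's own implementation instead of the iterating fallback that triggers the scalar indexing error shown above.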

make BatchedAdjOrTrans return correct BroadcastStyle

da1f09c

ToucheSir approved these changes Jun 26, 2022

chengchingwen merged commit c9faa64 into FluxML:master Jun 26, 2022
