-
-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
possible huge performance regression in batched_mul
#282
Comments
Can you provide an MWE? |
here's an MWE showing a 1000x performance regression for
|
cc @mcabbott |
Oh that's not good. I can't try this out now, but the obvious guess is that it's using the generic fallback (which makes slices) rather than a CUBLAS call. If you run Apart from that... does it return a |
More compact demonstration: using NNlib
ENV["JULIA_DEBUG"] = NNlib
x, y = randn(64,3,5), randn(64,7,5); # fine
xT = batched_transpose(x); xT2 = copy(xT);
batched_mul(xT,y) ≈ batched_mul(xT2,y)
x, y = randn(64,1,5), randn(64,7,5); # uses fallback
xT = batched_transpose(x); xT2 = copy(xT);
batched_mul(xT,y) ≈ batched_mul(xT2,y) # still correct Test all cases, from #268: quite a few use fallback. using NNlib, Test
@testset "awkward strides, $T" for T in [Float64] #, ComplexF64]
@testset "$tA(rand$((sA...,2))) ⊠ $tB(rand$((sB...,2)))" for tA in [identity, batched_adjoint, batched_transpose], sA in [(1,3), (3,1), (3,3)], tB in [identity, batched_adjoint, batched_transpose], sB in [(1,3), (3,1), (3,3)]
A = tA(rand(T, sA..., 2))
B = tB(rand(T, sB..., 2))
size(A,2) == size(B,1) || continue
C = cat(A[:,:,1] * B[:,:,1], A[:,:,2] * B[:,:,2]; dims=3)
@test A ⊠ B ≈ C
end
end; |
thanks for looking into this! would it be worth adding automatic performance regression testing to NNlib's CI like exists for julia base? |
It might be, although tricky. There are many cases to test, and I'm not sure this would be a huge slowdown on the CPU, where the fast path isn't so different from the fallback. Now #299 at least tests whether |
i'm still looking into the cause and quantifying the magnitude of the effect, but i think it is caused by #271. has anyone else observed this?
The text was updated successfully, but these errors were encountered: