Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfixes when using \ operator with non square matrices #1584

Merged
merged 3 commits into from
Sep 13, 2022

Conversation

GVigne
Copy link
Contributor

@GVigne GVigne commented Sep 5, 2022

Hi,
I was experiencing issues with the \ operator and QR decomposition in CUDA. I checked the issues and saw #138, but it is quite old and outdated. Currently, here is the bug I am facing

julia> A = CUDA.rand(4);
julia> B = CUDA.rand(4);
julia> M = qr(A);
julia> M \ B
ERROR: DimensionMismatch("trsm!")
Stacktrace:
 [1] trsm!
   @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/wrappers.jl:1375 [inlined]
 [2] ldiv!
   @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/linalg.jl:359 [inlined]
 [3] \(A::UpperTriangular{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, B::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
   @ LinearAlgebra /opt/julia-1.7.3/share/julia/stdlib/v1.7/LinearAlgebra/src/triangular.jl:1661
 [4] ldiv!(_qr::CUDA.CUSOLVER.CuQR{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, b::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
   @ CUDA.CUSOLVER ~/.julia/packages/CUDA/DfvRa/lib/cusolver/linalg.jl:213
 [5] \(F::CUDA.CUSOLVER.CuQR{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
   @ LinearAlgebra /opt/julia-1.7.3/share/julia/stdlib/v1.7/LinearAlgebra/src/factorization.jl:107
 [6] top-level scope
   @ REPL[14]:1
 [7] top-level scope
   @ ~/.julia/packages/CUDA/DfvRa/src/initialization.jl:52

This also fails if A is a rectangular matrix, or if it is a square matrix and B is a rectangular matrix.
I took a look at how it's done in LinearAlgebra and it seems that there are two specialized versions of ldiv!, one for vectors and one for matrices. These two ldiv! functions select only the relevant part of the input elements by using a view, and this is how it avoids dimension mismatch.

I tried to reproduce this in CUDA by breaking down ldiv! into two functions. With this, I can run the code above and it gives the correct result.

julia> A = CUDA.rand(4);
julia> B = CUDA.rand(4);
julia> M = qr(A);
julia> M \ B
4-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
  0.64181817
  0.905216
 -3.1400292
  0.83969826
julia> qr(Array(A)) \ Array(B)
4-element Vector{Float32}:
  0.64181787
  0.90521634
 -3.1400278
  0.83969784

Fixes #1584

@maleadt
Copy link
Member

maleadt commented Sep 8, 2022

Thanks! I'm not too familiar with this implementation, so I'll trust you on it. Can you add a test?

The error reported here is the same as what now happens in #1584, so I guess this fixes that issue?

@maleadt maleadt added cuda array Stuff about CuArray. needs tests Tests are requested. bugfix This gets something working again. labels Sep 8, 2022
@GVigne
Copy link
Contributor Author

GVigne commented Sep 12, 2022

I updated the PR by adding a few tests. This indeed fixes the issue I was talking about in my PR in DFTK: we use the \ operator for our Anderson solver, so we need that to work in CUDA.

@maleadt
Copy link
Member

maleadt commented Sep 12, 2022

CI failures are relevant.

@GVigne
Copy link
Contributor Author

GVigne commented Sep 13, 2022

I think I found out what was happening.
When using julia 1.7 and computing the QR decompostion of a CuArray, we get a CuQR object. However, when using julia 1.8 and computing a QR decomposition, we get a QR object. I don't really know why this is the case, but this is why the tests would fail on julia 1.8. I updated the PR, it should work for both versions.

@maleadt
Copy link
Member

maleadt commented Sep 13, 2022

That's because JuliaLang/julia#42594 is only in 1.8

@maleadt maleadt merged commit fae0a0b into JuliaGPU:master Sep 13, 2022
@GVigne GVigne deleted the qr_ldiv! branch September 13, 2022 15:31
simonbyrne pushed a commit to simonbyrne/CUDA.jl that referenced this pull request Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This gets something working again. cuda array Stuff about CuArray. needs tests Tests are requested.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants