Support sparse matrix and sparse linear solvers #2906
Comments
Will you consider supporting sparse matrix solvers like SuiteSparse?
Will you consider including some engineering-grade packages like PETSc and Trilinos? These packages not only support different matrix formats but also provide a lot of preconditioners.
@mzy2240 Thanks for your suggestion. We just wrap Eigen for now. SuiteSparse is in our plan. If you have interest, you are welcome to join us.
Our plan is to design a general sparse linear system solver. For now, we intend to implement a basic and usable version based on Eigen. More packages like SuiteSparse, Pardiso, cuSOLVER, etc. are in our plan, and we welcome you to join us during the process.
Maybe KokkosKernels could help your work.
Do you consider supporting conversion from and to scipy.sparse matrices? SciPy is probably the most used library for dealing with sparse matrices in Python.
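One possible route is to go through COO triplets. Below is a minimal sketch, assuming the `ti.linalg.SparseMatrixBuilder` API; the `fill` kernel is an illustrative helper, not an existing converter:

```python
import numpy as np
import scipy.sparse as sp
import taichi as ti

ti.init(arch=ti.cpu)

# A small scipy.sparse matrix to convert.
A_scipy = sp.random(8, 8, density=0.3, format="coo", dtype=np.float32)

# Feed its COO triplets into a Taichi sparse matrix builder.
builder = ti.linalg.SparseMatrixBuilder(8, 8, max_num_triplets=A_scipy.nnz)

@ti.kernel
def fill(builder: ti.types.sparse_matrix_builder(),
         rows: ti.types.ndarray(), cols: ti.types.ndarray(),
         vals: ti.types.ndarray()):
    for k in range(rows.shape[0]):
        builder[rows[k], cols[k]] += vals[k]

fill(builder, A_scipy.row.astype(np.int32),
     A_scipy.col.astype(np.int32), A_scipy.data)
A_taichi = builder.build()
```

The reverse direction could read the built matrix back element by element, though a bulk triplet export would be more practical.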
Hi there! I am now working on implementing the basic operations such as +, -, *, @, etc. on GPU. Is it OK to use cuSparse routines?
Issue: #2906 ### Brief Summary Using the ndarray `read_int` function to obtain triplets is not efficient. The triplets can instead be copied to host memory using `memcpy_device_to_host`. In addition, the triplets stored in the map container are already ordered, yet the function `build_csr_from_coo` sorts the COO triplets again. To improve performance, we could use an unordered map to store the triplets.
Issue: #2906 Co-authored-by: Olinaaaloompa <[email protected]> Co-authored-by: Yi Xu <[email protected]>
Issue: #2906 ### Brief Summary The `read_int` function of ndarray consumes more than 100 MB of GPU memory. It's better to use the `memcpy_device_to_host` function to obtain `num_triplets_`.
Issue: #2906 ### Brief Summary When solving a sparse linear system without reordering the system matrix, `analyze_pattern` and `factorize` consume a lot of GPU memory. For example, the `res` in the `stable fluid` example could not be larger than 500. After reordering the sparse system matrix, we can set `res=512`. Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
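The effect of reordering can be illustrated outside Taichi (a general sketch with SciPy, not the PR's internal code): a fill-reducing permutation such as reverse Cuthill-McKee clusters nonzeros near the diagonal, which reduces fill-in, and hence memory, during factorization:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee

# A symmetric sparse matrix standing in for the Poisson system
# in the stable fluid example.
n = 1000
A = sp.random(n, n, density=0.001, format="csr")
A = (A + A.T + sp.eye(n) * n).tocsr()  # symmetric, diagonally dominant

# Reverse Cuthill-McKee returns a permutation that clusters
# nonzeros near the diagonal.
perm = reverse_cuthill_mckee(A, symmetric_mode=True)
A_reordered = A[perm][:, perm]

def bandwidth(M):
    coo = M.tocoo()
    return np.abs(coo.row - coo.col).max()

print(bandwidth(A), bandwidth(A_reordered))  # reordering shrinks the bandwidth
```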
Issue: #2906 ### Brief Summary To be consistent with the API on the CPU backend, this PR provides an LU sparse solver on the CUDA backend. cuSOLVER only provides a CPU API for LU sparse solving, which is what this PR uses. cuSolverRF provides a GPU LU solver, but it only supports the `double` datatype, so it is not used here. Besides, `print_triplets` is refactored to work around the ndarray `read` constraint (the `read` and `write` data must have the same datatype). Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Issue: taichi-dev#2906 taichi-dev#6082 + Support element access for sparse matrix on GPU. + Add tests for GPU sparse matrix operators. Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Yi Xu <[email protected]>
Issue: taichi-dev#2906 ### Brief Summary The previous spmv API was `A.spmv(x, y)`, meaning `y = A @ x`. The spmv function is now removed; we can directly write `y = A @ x` to compute the result on the CUDA backend.
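A usage sketch of the operator form (assuming the CUDA backend and an ndarray vector; the diagonal matrix here is just a placeholder):

```python
import taichi as ti

ti.init(arch=ti.cuda)

n = 4
builder = ti.linalg.SparseMatrixBuilder(n, n, max_num_triplets=n, dtype=ti.f32)

@ti.kernel
def fill(builder: ti.types.sparse_matrix_builder()):
    for i in range(n):
        builder[i, i] += 2.0  # a simple diagonal matrix

fill(builder)
A = builder.build()

x = ti.ndarray(ti.f32, shape=n)
x.fill(1.0)

y = A @ x  # sparse matrix-vector product; replaces the old A.spmv(x, y)
```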
Issue: taichi-dev#2906 ### Brief Summary Replace the array storing triplets in sparse matrix builder with ndarray. It unifies the sparse matrix builder creation on the CPU and GPU. Co-authored-by: Yi Xu <[email protected]>
…atrix (taichi-dev#6605) Issue: taichi-dev#2906 ### Brief Summary When building a GPU sparse matrix, the cuSparse API requires three separate arrays: a row index ptr, a col index ptr, and a values ptr. However, the sparse matrix builder uses a single ndarray to store all triplets, so the memory layout looks like [row, col, value, row, col, value, ...]. In this PR, I retrieve all data from the ndarray and merge the triplets that address the same position of the sparse matrix. The triplets are then stored in three separate arrays, which are finally used to build the sparse matrix with the cuSparse API.
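The layout transform described above can be sketched in NumPy (illustrative only, not the actual C++ implementation): deinterleave the flat triplet buffer, then merge duplicates that address the same position:

```python
import numpy as np

# Flat triplet buffer as stored by the builder: [row, col, value, ...].
# In practice values are bit-cast into the buffer's dtype; everything
# is kept as float here for simplicity.
flat = np.array([0, 1, 1.0, 0, 1, 2.0, 1, 0, 3.0])

rows = flat[0::3].astype(np.int32)
cols = flat[1::3].astype(np.int32)
vals = flat[2::3]

# Sort by (row, col), then sum triplets at the same position.
order = np.lexsort((cols, rows))
rows, cols, vals = rows[order], cols[order], vals[order]
keep = np.concatenate(([True], (rows[1:] != rows[:-1]) | (cols[1:] != cols[:-1])))
merged_vals = np.add.reduceat(vals, np.flatnonzero(keep))
rows, cols = rows[keep], cols[keep]

print(rows, cols, merged_vals)  # [0 1] [1 0] [3. 3.]
```

The three resulting arrays correspond to the row, column, and value arrays that cuSparse takes when assembling the matrix.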
…chi-dev#6651) Issue: taichi-dev#2906 Previously, ndarrays couldn't be used as vectors in spmv and sparse linear solve operations on the CPU backend. In this PR, the ndarray is [mapped](https://eigen.tuxfamily.org/dox/classEigen_1_1Map.html) as an Eigen vector and then used for the math computation. In addition, the implicit mass spring example is modified to use ndarrays, which enables us to run this example on both the CPU and GPU backends.
Concisely describe the proposed feature
Many problems in computer graphics (especially physical simulation) need to solve sparse linear systems. While some of these problems can be addressed with matrix-free solvers (e.g., conjugate gradients; see the sketch below), many require explicitly constructing a sparse matrix in standard formats such as CSR or COO.
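For contrast, a matrix-free conjugate gradient solver only needs a routine that applies A to a vector, so the matrix is never stored; a minimal NumPy sketch for a symmetric positive-definite operator:

```python
import numpy as np

def cg(apply_A, b, tol=1e-8, max_iter=1000):
    """Conjugate gradient with A available only as a matvec callback."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Example: a 1D Laplacian applied without ever forming the matrix.
def laplacian_1d(v):
    out = 2.0 * v
    out[:-1] -= v[1:]
    out[1:] -= v[:-1]
    return out

b = np.ones(64)
x = cg(laplacian_1d, b)
```

Direct factorizations such as LLT, LDLT, or LU, however, do need the explicit matrix, which is what this proposal is about.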
Currently, Taichi does not have first-class support for these matrices. Note that the sparse computation system in Taichi is designed for spatial sparsity instead of matrix sparsity.
Describe the solution you'd like (if any)
- Sparse matrix operations: `+`, `-`, `*`, `@`, transpose, etc.
- Sparse linear solvers: `LLT`, `LDLT`, `LU`, etc.
- Sparse matrix operations on GPU: `+`, `-`, `*`, `@`, transpose, etc.
- Unify the CPU and GPU API: merge `solve_cu` and `solve_rf` into a `solve` function.
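Put together, the intended workflow looks roughly like this (a sketch based on the current `ti.linalg` API; names may still change as the design evolves):

```python
import taichi as ti

ti.init(arch=ti.cpu)

n = 4
builder = ti.linalg.SparseMatrixBuilder(n, n, max_num_triplets=3 * n)

@ti.kernel
def fill(builder: ti.types.sparse_matrix_builder()):
    # A 1D Laplacian: tridiagonal, symmetric positive-definite.
    for i in range(n):
        builder[i, i] += 2.0
        if i > 0:
            builder[i, i - 1] += -1.0
            builder[i - 1, i] += -1.0

fill(builder)
A = builder.build()

b = ti.ndarray(ti.f32, shape=n)
b.fill(1.0)

# One `solve` entry point on every backend, instead of the
# backend-specific solve_cu / solve_rf.
solver = ti.linalg.SparseSolver(solver_type="LLT")
solver.analyze_pattern(A)
solver.factorize(A)
x = solver.solve(b)
print(solver.info())  # True if the solve succeeded
```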