Commit

[lang] Use less gpu memory when building sparse matrix (#6781)
Issue: #2906 

### Brief Summary
The `read_int` function of ndarray consumes more than 100 MB of GPU memory.
It is better to obtain `num_triplets_` with the `memcpy_device_to_host`
function instead.
FantasyVR authored Dec 2, 2022
1 parent d491a10 commit dd70ce8
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion taichi/program/sparse_matrix.cpp
@@ -156,7 +156,8 @@ std::unique_ptr<SparseMatrix> SparseMatrixBuilder::build_cuda() {
   built_ = true;
   auto sm = make_cu_sparse_matrix(rows_, cols_, dtype_);
 #ifdef TI_WITH_CUDA
-  num_triplets_ = ndarray_data_base_ptr_->read_int(std::vector<int>{0});
+  CUDADriver::get_instance().memcpy_device_to_host(
+      &num_triplets_, (void *)get_ndarray_data_ptr(), sizeof(int));
   auto len = 3 * num_triplets_ + 1;
   std::vector<float32> trips(len);
   CUDADriver::get_instance().memcpy_device_to_host(
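The pattern in this change (copy only the 4-byte count header, then size the host buffer from it) can be sketched on the host alone. This is a minimal illustration, not the Taichi implementation: `copy_device_to_host`, `make_example_buffer`, `read_triplet_count`, and `read_triplets` are hypothetical names, `std::memcpy` stands in for the real device-to-host copy, and the buffer layout (one integer count followed by three floats per triplet) is an assumption inferred from `len = 3 * num_triplets_ + 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::int32_t),
              "sketch assumes 4-byte float and int");

// Hypothetical stand-in for a device-to-host copy. A real CUDA build would
// call into the driver; std::memcpy keeps this sketch runnable on the host.
void copy_device_to_host(void *host, const void *device, std::size_t bytes) {
  assert(host != nullptr && device != nullptr);
  std::memcpy(host, device, bytes);
}

// Assumed layout: one 4-byte integer count, then 3 floats per triplet.
std::vector<float> make_example_buffer() {
  std::vector<float> buf(1 + 3 * 2);
  const std::int32_t count = 2;
  std::memcpy(buf.data(), &count, sizeof(count));  // header holds raw int bits
  const float triplets[] = {0.0f, 1.0f, 2.5f, 1.0f, 0.0f, -4.0f};
  std::memcpy(buf.data() + 1, triplets, sizeof(triplets));
  return buf;
}

// Read only the 4-byte header rather than using a general element accessor.
std::int32_t read_triplet_count(const void *device_base) {
  std::int32_t count = 0;
  copy_device_to_host(&count, device_base, sizeof(count));
  return count;
}

// Copy the header plus all triplets into a host vector sized from the count.
std::vector<float> read_triplets(const void *device_base) {
  const std::int32_t n = read_triplet_count(device_base);
  std::vector<float> trips(3 * static_cast<std::size_t>(n) + 1);
  copy_device_to_host(trips.data(), device_base, trips.size() * sizeof(float));
  return trips;
}
```

Calling `read_triplets(buf.data())` on the buffer from `make_example_buffer()` yields seven elements: the header slot followed by the two triplets. The sketch does not model GPU memory use; it only shows the single small copy used to fetch the count.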
