Pullback for sparse-array vector product very inefficient #803
Our current rrule for sparse matrix-vector products is very inefficient and causes out-of-memory errors with large sparse CPU or GPU arrays. Our current `rrule(*, sparse(A), x)` is implemented so that we first compute a non-sparse `Ȳ * B'` (which may easily exceed memory if `A` is very large but very sparse) and then project back to a sparse tangent.

The best way to fix this (at least if `Ȳ` and `B` are vectors) might be adding a specific "vector-outer-product" array type for read-only vector * adjoint-vector products (which might be useful in general) that computes `getindex` on the fly. Or maybe we already have that somewhere?
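A minimal sketch of the kind of type proposed above, using only plain Julia and SparseArrays; `LazyOuterProduct` and `project_to_pattern` are made-up names for illustration, not existing ChainRulesCore machinery:

```julia
using SparseArrays

# Hypothetical read-only wrapper for u * v' that never materialises the dense
# m×n matrix; entries are computed on demand in getindex.
struct LazyOuterProduct{T,U<:AbstractVector,V<:AbstractVector} <: AbstractMatrix{T}
    u::U
    v::V
end

LazyOuterProduct(u::AbstractVector, v::AbstractVector) =
    LazyOuterProduct{promote_type(eltype(u), eltype(v)),typeof(u),typeof(v)}(u, v)

Base.size(A::LazyOuterProduct) = (length(A.u), length(A.v))
Base.getindex(A::LazyOuterProduct, i::Int, j::Int) = A.u[i] * conj(A.v[j])

# A sparsity-pattern-aware projection then only reads the entries that are
# structurally nonzero in A, so memory stays O(nnz(A)) rather than O(m*n).
function project_to_pattern(A::SparseMatrixCSC, Ā::AbstractMatrix)
    I, J, _ = findnz(A)
    vals = [Ā[i, j] for (i, j) in zip(I, J)]
    return sparse(I, J, vals, size(A)...)
end
```

In the pullback, the dense `Ȳ * B'` could then be replaced by `LazyOuterProduct(Ȳ, B)` before projecting, mirroring the existing "project the outer product" step without ever allocating the dense matrix.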
Comments

Alternatively, there could be a more specialised …

Note that all things sparse are very crude! We rushed to include the semantics someone said they wanted in the 1.0 release. It would probably have been better to leave them as errors, until someone who cared could take on the task.

A dispatch-based solution at the …

Yea, I don't know! Do any such types currently work? I'd be pretty surprised if …

I did a quick test with `dA = @thunk(project_A(LowRankMatrix(Ȳ, B)))`. (If LowRankMatrices is an acceptable dependency for ChainRulesCore, it's very lightweight, though.)

It might be interesting in general, also for non-sparse matrices, though I guess that would be a pretty big change in current behavior. But one could dispatch the implementation of the pullback based on …
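The last comment above is cut off, but a dispatch-based version could look roughly like the sketch below; `cotangent_A` is a hypothetical helper name, not part of ChainRules, and the sparse method simply restricts the outer product to `A`'s stored entries:

```julia
using SparseArrays

# Dense fallback: materialise Ȳ * B' exactly as the current rule does.
cotangent_A(A::AbstractMatrix, Ȳ::AbstractVector, B::AbstractVector) = Ȳ * B'

# Sparse method: evaluate Ȳ[i] * conj(B[j]) only on A's stored entries and
# return a sparse cotangent, never allocating the dense m×n outer product.
function cotangent_A(A::SparseMatrixCSC, Ȳ::AbstractVector, B::AbstractVector)
    I, J, _ = findnz(A)
    vals = [Ȳ[i] * conj(B[j]) for (i, j) in zip(I, J)]
    return sparse(I, J, vals, size(A)...)
end
```

A pullback could then call `cotangent_A(A, Ȳ, B)` instead of `Ȳ * B'`, leaving the behaviour for dense `A` unchanged while keeping the sparse case at O(nnz(A)) memory.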