MAINT: Refactor differences between cblas_matrixproduct and PyArray_MatrixProduct2 #11432
Very nice cleanup! Apart from a total nitpick - the unneeded change in numpy/core/src/multiarray/cblasfuncs.h - this all looks OK to me.
Good to squash before merging, though; that way the commit message is more likely to be what you want.
Squashed, rebased, removed the gratuitous change, and realized that adding an include directive to one file meant I should remove it from another.
 *
 * If `out` is non-NULL, memory overlap is checked with ap1 and ap2, and an
 * updateifcopy temporary array may be returned. If `result` is non-NULL, the
 * output array to be returned (`out` if non-NULL and the newly allocated array
I must admit I do not understand what is the point of possibly passing on `result` to store the output in when the output is also returned. Since this PR is just about moving stuff so it can be used in multiple places, I think it is better to leave it as is here, but if you have the energy for a clean-up follow-up PR, that might be an idea...
`result` is the output ndarray to be returned - either `out` or a freshly allocated ndarray (let's call it `retval`). `retval` is the ndarray to be iterated over. If writeback semantics are active then `retval` is not `out`; rather, it is a new ndarray for iterating, and `retval->base == out`. Then toward the end of the function the writeback semantics are resolved, `retval` is discarded, and the function returns `result`.
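The relationship between `result`, `retval`, and `out` described above can be sketched with a toy model. This is not the NumPy C API: `ToyArray`, `toy_new_result`, `toy_resolve_writeback`, and the other `toy_` helpers are hypothetical names invented for illustration, and the sketch assumes that `out`, when given, holds at least `n` elements.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an ndarray (hypothetical, NOT the NumPy C API). */
typedef struct ToyArray {
    double *data;
    size_t n;
    struct ToyArray *base;  /* non-NULL only for a writeback temporary */
} ToyArray;

static ToyArray *toy_alloc(size_t n) {
    ToyArray *a = malloc(sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    a->data = calloc(n, sizeof *a->data);
    if (a->data == NULL) {
        free(a);
        return NULL;
    }
    a->n = n;
    a->base = NULL;
    return a;
}

static void toy_free(ToyArray *a) {
    if (a != NULL) {
        free(a->data);
        free(a);
    }
}

/* Do the two buffers share any bytes? Compared as integers to stay portable. */
static int toy_overlaps(const ToyArray *x, const ToyArray *y) {
    uintptr_t xs = (uintptr_t)x->data, xe = (uintptr_t)(x->data + x->n);
    uintptr_t ys = (uintptr_t)y->data, ye = (uintptr_t)(y->data + y->n);
    return xs < ye && ys < xe;
}

/*
 * Returns the array to compute into (retval). *result receives the array
 * the caller should eventually hand back: `out` if given, else the new array.
 */
static ToyArray *toy_new_result(ToyArray *out, const ToyArray *in1,
                                const ToyArray *in2, size_t n,
                                ToyArray **result) {
    ToyArray *retval;
    if (out == NULL) {
        retval = toy_alloc(n);
        *result = retval;
        return retval;
    }
    *result = out;
    if (!toy_overlaps(out, in1) && !toy_overlaps(out, in2)) {
        return out;  /* safe to write into out directly */
    }
    /* Overlap: compute into a temporary that remembers where to write back. */
    retval = toy_alloc(n);
    if (retval != NULL) {
        retval->base = out;
    }
    return retval;
}

/* Copy a writeback temporary into its base and discard the temporary. */
static void toy_resolve_writeback(ToyArray *retval) {
    if (retval != NULL && retval->base != NULL) {
        memcpy(retval->base->data, retval->data,
               retval->n * sizeof *retval->data);
        toy_free(retval);
    }
}
```

In this model a caller computes into the returned `retval`, calls `toy_resolve_writeback(retval)`, and then returns `*result`, which is why both pointers are needed when `out` overlaps an input.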
Thanks for the explanation! If in a future PR you get to touch this file again, maybe add it to the comments...
I'll merge now. I guess you can copy your table on top for the next clean-up!!
The two functions do very similar things. `cblas_matrixproduct` is used both in `PyArray_MatrixProduct2` (which is the actual implementation of `dot`) and in `array_matmul` if `ndim` <= 2 and the dtype is fitting. Otherwise `array_matmul` goes through `einsum`, while `PyArray_MatrixProduct2` has its own continuation. I worked through the logic to compare the two and found points of code duplication:

`PyArray_MatrixProduct2`
`cblas_matrixproduct`
`PyArray_NewCopy` if non-aligned or non-contiguous
`PyArray_IterAllButAxis`
`if` blocks for each case
`out->dtype->vdot`
`if` blocks for each of the level 2 BLAS cases, use `out->dtype->vdot` for vector/scalar (which will use level 1 BLAS dot functions)

This PR makes the two functions share the output array logic and uses the same semantics for the level 1 BLAS dot functions (scalar-vector).
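
The routing described in the first paragraph can be summarized in a few lines of C. This is only a sketch: the enum and function names are hypothetical, and it assumes `PyArray_MatrixProduct2` applies the same `ndim` <= 2 and dtype test as `array_matmul` before calling `cblas_matrixproduct`, which the description implies but does not state outright.

```c
#include <stdbool.h>

/* Hypothetical labels for the two entry points and the three code paths. */
typedef enum { TOY_CALLER_DOT, TOY_CALLER_MATMUL } ToyCaller;
typedef enum { TOY_PATH_CBLAS, TOY_PATH_EINSUM, TOY_PATH_OWN_LOOP } ToyPath;

/*
 * Pick the code path for a product of two operands.
 * blas_dtype: true when the dtype is one BLAS can handle (assumed to be
 * float, double, and their complex counterparts).
 */
static ToyPath toy_choose_path(ToyCaller caller, int ndim1, int ndim2,
                               bool blas_dtype) {
    if (blas_dtype && ndim1 <= 2 && ndim2 <= 2) {
        return TOY_PATH_CBLAS;      /* shared cblas_matrixproduct path */
    }
    if (caller == TOY_CALLER_MATMUL) {
        return TOY_PATH_EINSUM;     /* array_matmul falls back to einsum */
    }
    return TOY_PATH_OWN_LOOP;       /* PyArray_MatrixProduct2 continues itself */
}
```

For example, `toy_choose_path(TOY_CALLER_MATMUL, 3, 2, true)` selects the einsum path, while the same shapes coming from `dot` stay inside `PyArray_MatrixProduct2`.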