Enable `copy` to be used with `MatrixRef` #1021
Just for the record, I will benchmark these changes: master...msimberg:DLA-Future:matrix-ref-algorithm-input-shortcut-trivial-ref. Hopefully by early next week we'll know more about whether we can possibly only instantiate algorithms with
namespace ex = pika::execution::experimental;

if constexpr (Source == Destination) {
  if (&src == &dst)
I don't know if this needs to be resolved in this PR, but since this compares `MatrixRef` addresses and not `Matrix` addresses this could in the worst case lead to deadlocks if the `MatrixRef`s refer to the same `Matrix` (`read` and `readwrite` into the same task). I think one step in the right direction would be to have `operator==` on `MatrixRef` (which compares the addresses of the contained `Matrix` and sub distributions), but I'm not sure if it covers all possible issues.
I'd be fine with dealing with this outside of this PR.
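To make the suggestion concrete, here is a minimal sketch of what such an equality could look like. It does not use the DLA-Future types: `ToyMatrix`, `ToyMatrixRef`, `Index2D` and `Size2D` are hypothetical stand-ins, and the real `MatrixRef` may store its sub-distribution differently. The idea is only that two distinct ref objects compare equal when they point at the same underlying matrix and describe the same sub-region.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only (NOT the DLA-Future types).
struct Index2D {
  int row;
  int col;
};
struct Size2D {
  int rows;
  int cols;
};

inline bool operator==(const Index2D& a, const Index2D& b) {
  return a.row == b.row && a.col == b.col;
}
inline bool operator==(const Size2D& a, const Size2D& b) {
  return a.rows == b.rows && a.cols == b.cols;
}

struct ToyMatrix {
  Size2D size;
};

// A non-owning view on a rectangular sub-region of a ToyMatrix.
class ToyMatrixRef {
public:
  ToyMatrixRef(ToyMatrix& matrix, Index2D origin, Size2D size)
      : matrix_(&matrix), origin_(origin), size_(size) {
    // Assumed precondition: the sub-region has a non-negative extent.
    assert(size.rows >= 0 && size.cols >= 0);
  }

  // Equal if both refs view the same region of the same underlying matrix,
  // regardless of the addresses of the ref objects themselves.
  friend bool operator==(const ToyMatrixRef& lhs, const ToyMatrixRef& rhs) {
    return lhs.matrix_ == rhs.matrix_ && lhs.origin_ == rhs.origin_ &&
           lhs.size_ == rhs.size_;
  }
  friend bool operator!=(const ToyMatrixRef& lhs, const ToyMatrixRef& rhs) {
    return !(lhs == rhs);
  }

private:
  ToyMatrix* matrix_;
  Index2D origin_;
  Size2D size_;
};
```

With this, a check like `if (src == dst)` would catch two different ref objects over the same region, where `&src == &dst` would not. In this sketch, refs that overlap without being identical still compare unequal, so it does not address every aliasing case.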
I think that the Submatrix pipeline should be considered as well, so it is better to work on it separately.
However, it should be taken care of, as the tridiagonal solver might be impacted by it.
(device <-> host mirror copies in the multicore case are skipped as they are the same matrix; after the cleanup, however, they will be two equal refs but different objects)
Thanks @msimberg for opening the issue!
I won't resolve this comment, hoping that will make it easier to find in the future if we need to refer to it.
Please start to use `snake_case` for functions.
cscs-ci run
Codecov Report

@@            Coverage Diff             @@
##           master    #1021      +/-   ##
==========================================
- Coverage   94.10%   94.09%   -0.02%
==========================================
  Files         146      146
  Lines        8843     8856      +13
  Branches     1121     1125       +4
==========================================
+ Hits         8322     8333      +11
- Misses        326      327       +1
- Partials      195      196       +1
template <class ElementGetter>
auto subValues(ElementGetter&& fullValues, const GlobalElementIndex& offset) {
  return [fullValues, offset = sizeFromOrigin(offset)](const GlobalElementIndex& ij) {
    return fullValues(ij + offset);
  };
}

template <class OutsideElementGetter, class InsideElementGetter>
auto mixValues(const GlobalElementIndex& offset, const GlobalElementSize& sub_size,
               InsideElementGetter&& insideValues, OutsideElementGetter&& outsideValues) {
  return [outsideValues, insideValues, offset, sub_size](const GlobalElementIndex& ij) {
    if (ij.isInSub(offset, sub_size))
      return insideValues(ij - common::sizeFromOrigin(offset));
    else
      return outsideValues(ij);
  };
}
These helpers might be useful elsewhere, and I will probably move them to some "common" place so that they can be used in other tests too. (Probably this will happen in #969)
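For readers without the test sources at hand, the behaviour of the two helpers can be sketched in a self-contained way. The `Index` and `Size` types and the `snake_case` names below are hypothetical stand-ins, not the DLA-Future `GlobalElementIndex`/`GlobalElementSize` API; the sketch only assumes that an element getter maps a 2D index to a value.

```cpp
#include <cassert>

// Hypothetical 2D size/index types, for illustration only.
struct Size {
  int rows;
  int cols;
};
struct Index {
  int row;
  int col;
  // True if this index lies in the rectangle starting at `offset` with extent `sub_size`.
  bool is_in_sub(const Index& offset, const Size& sub_size) const {
    return row >= offset.row && row < offset.row + sub_size.rows &&
           col >= offset.col && col < offset.col + sub_size.cols;
  }
};

inline Index operator+(const Index& a, const Index& b) {
  return {a.row + b.row, a.col + b.col};
}
inline Index operator-(const Index& a, const Index& b) {
  return {a.row - b.row, a.col - b.col};
}

// Getter for a sub-matrix: index (0, 0) of the result maps to `offset` in the full matrix.
template <class ElementGetter>
auto sub_values(ElementGetter full_values, Index offset) {
  return [full_values, offset](const Index& ij) { return full_values(ij + offset); };
}

// Getter for a full matrix whose sub-region comes from `inside_values`
// (indexed relative to the sub-region) and whose remainder comes from `outside_values`.
template <class InsideGetter, class OutsideGetter>
auto mix_values(Index offset, Size sub_size, InsideGetter inside_values,
                OutsideGetter outside_values) {
  assert(sub_size.rows >= 0 && sub_size.cols >= 0);
  return [inside_values, outside_values, offset, sub_size](const Index& ij) {
    if (ij.is_in_sub(offset, sub_size))
      return inside_values(ij - offset);
    else
      return outside_values(ij);
  };
}
```

In a copy test, something like `sub_values` would describe the expected content of the source sub-matrix, while `mix_values` would describe the expected destination: copied values inside the sub-region, untouched values everywhere else. Both getters passed to `mix_values` must return the same type, since the lambda's return type is deduced from both branches.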
In particular, a SubMatrix test for MC has been added (GPU yet to be adapted)
cscs-ci run

Rebased on new master (to get the matrix base `snake_case`) and addressed `snake_case` in the implementation too. After this it is ready to be merged 👍🏻
Basic implementation (mainly copy-paste) for a copy method involving `MatrixRef`s.

Currently `Matrix` and `MatrixRef` are still two independent objects, and this implies that we would have to duplicate all the code for `Matrix` (more or less; e.g. `MatrixRef` might have an offset, which creates some new corner cases) for `MatrixRef` as well.

This represents a problem

- `Matrix` vs `MatrixRef` calls), trivial template solution might cost a lot in terms of compile times and binary sizes.
- `Matrix` tests and replicate them for `MatrixRef`?!

Nothing new, but since it does not sound useful to spend time replicating things like that, in this PR I started with just the basic implementation of `copy`, and as soon as we take a decision on how to proceed, I will try to apply it here too. 😉

Waiting for your ideas and suggestions!
EDIT:
After a discussion with others, we opted for using `copy` as a "testbed" also for the sub-matrix/pipeline mechanisms, as a "real" use-case. This means that the tests for `copy` will be more "extensive" than those for other algorithms (e.g. gemm) and we will have:

For the rest of the algorithms (so `gemm` included), we will just test with `MatrixRef` both the full and sub-matrix cases (where applicable).