-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix several bugs in the team_split_2D operation #920
Fix several bugs in the team_split_2D operation #920
Conversation
The 2D team split operation was neglected a bit during development of shmem_team_split_strided, so it's missing some important properties: * PE numbers must be w.r.t. the team indexing * updates to the PE strides must be w.r.t. the team's stride * The returned team object(s) should be the one(s) containing my PE Signed-off-by: David M. Ozog <[email protected]>
start += xrange; | ||
|
||
if (my_xteam != SHMEMX_TEAM_INVALID) | ||
*xaxis_team = my_xteam; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we want to add an assert here to ensure that xaxis_team
is not assigned to more than once? Also, you can move the declaration of my_xteam
into the loop body above and my_yteam
into the loop body below. It looks like neither should be referenced outside of the loop body.
int ret = 0; | ||
|
||
shmem_internal_team_t *my_xteam, *my_yteam; | ||
|
||
for (int i = 0; i < num_xteams; i++) { | ||
int xsize = (i == num_xteams - 1 && parent_size % xrange) ? parent_size % xrange : xrange; | ||
|
||
ret = shmem_internal_team_split_strided(parent_team, start, parent_stride, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the parent team's stride the correct thing to use here? I suspect we need some more unit tests to cover the various cases in here, for example where the parent team stride is more than 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I think it's correct. I believe the "3D" split test (spec-examples/shmem_team_split_2D.c
) covers this case during the 2nd split: In the 1st split, the parent stride (team_world) is 1, but in the 2nd split, the parent team has a stride of 3 in the x-splits, and a stride of 12 in the y-splits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the stride argument supposed to be relative to the PE numbers of the parent team? And isn't the parent team's stride relative to the global PE number space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any unit tests (in the branch of this PR, ignoring the awful spec test) that call shmem_team_split_2d
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Yes, and... Yes?
This stride is relative to the parent team, which is relative to the global PE space... maybe we should discuss in person, since I don't see an alternative stride that would make sense here.
Right, there are no unit tests with split_2d
after moving the spec example, but I'm working on a 2D heat program, which might actually be a better fit in apps
? Unfortunately, I don't believe a 2D heat example would exercise a non-unit stride in the parent, IIUC, but I believe the 3D split does.
Signed-off-by: David M. Ozog <[email protected]>
Signed-off-by: James Dinan <[email protected]>
Signed-off-by: James Dinan <[email protected]>
Signed-off-by: James Dinan <[email protected]>
Signed-off-by: James Dinan <[email protected]>
Add 2d split unit test
…-shmem into pr/teams_split_2d-fixes
* Set xrange to be less than or equal to the parent_team size * Return nonzero in strided split when parent is invalid * Return nonzero in 2D split when parent is invalid * Correct shmemx_team_reuse_teams example accordingly Signed-off-by: David M. Ozog <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢 🇮🇹 👍
The 2D team split operation was neglected a bit during the development of
shmem_team_split_strided
, so it's missing some important properties:I believe these changes more-or-less fix the issues we are seeing in the 2D example in #918.