Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several bugs in the team_split_2D operation #920

Merged

Conversation

davidozog
Copy link
Member

@davidozog davidozog commented Dec 16, 2019

The 2D team split operation was neglected a bit during the development of shmem_team_split_strided, so it's missing some important properties:

  • PE numbers must be w.r.t. the team indexing
  • updates to the PE strides must be w.r.t. the team's stride
  • The returned team object(s) should be the one(s) containing my PE

I believe these changes more-or-less fix the issues we are seeing in the 2D example in #918.

The 2D team split operation was neglected a bit during development of
shmem_team_split_strided, so it's missing some important properties:

* PE numbers must be w.r.t. the team indexing
* updates to the PE strides must be w.r.t. the team's stride
* The returned team object(s) should be the one(s) containing my PE

Signed-off-by: David M. Ozog <[email protected]>
start += xrange;

if (my_xteam != SHMEMX_TEAM_INVALID)
*xaxis_team = my_xteam;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want to add an assert here to ensure that xaxis_team is not assigned to more than once? Also, you can move the declaration of my_xteam into the loop body above and my_yteam into the loop body below. It looks like neither should be referenced outside of the loop body.

int ret = 0;

shmem_internal_team_t *my_xteam, *my_yteam;

for (int i = 0; i < num_xteams; i++) {
int xsize = (i == num_xteams - 1 && parent_size % xrange) ? parent_size % xrange : xrange;

ret = shmem_internal_team_split_strided(parent_team, start, parent_stride,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the parent team's stride the correct thing to use here? I suspect we need some more unit tests to cover the various cases in here, for example where the parent team stride is more than 1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I think it's correct. I believe the "3D" split test (spec-examples/shmem_team_split_2D.c) covers this case during the 2nd split: In the 1st split, the parent stride (team_world) is 1, but in the 2nd split, the parent team has a stride of 3 in the x-splits, and a stride of 12 in the y-splits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the stride argument supposed to be relative to the PE numbers of the parent team? And isn't the parent team's stride relative to the global PE number space?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any unit tests (in the branch of this PR, ignoring the awful spec test) that call shmem_team_split_2d?

Copy link
Member Author

@davidozog davidozog Dec 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Yes, and... Yes?

This stride is relative to the parent team, which is relative to the global PE space... maybe we should discuss in person, since I don't see an alternative stride that would make sense here.

Right, there are no unit tests with split_2d after moving the spec example, but I'm working on a 2D heat program, which might actually be a better fit in apps? Unfortunately, I don't believe a 2D heat example would exercise a non-unit stride in the parent, IIUC, but I believe the 3D split does.

davidozog and others added 8 commits December 18, 2019 16:04
Signed-off-by: James Dinan <[email protected]>
Signed-off-by: James Dinan <[email protected]>
* Set xrange to be less than or equal to the parent_team size
* Return nonzero in strided split when parent is invalid
* Return nonzero in 2D split when parent is invalid
* Correct shmemx_team_reuse_teams example accordingly

Signed-off-by: David M. Ozog <[email protected]>
Copy link
Member

@jdinan jdinan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢 🇮🇹 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants