
Implement tensor redistribution for binary operations #167

Open
2 of 4 tasks
Markus-Goetz opened this issue Mar 12, 2019 · 12 comments
Labels
MPI (Anything related to MPI communication), redistribution (Related to distributed tensors)

Comments

@Markus-Goetz
Member

Markus-Goetz commented Mar 12, 2019

After PR #165 is merged, implement binary operators for two tensors with set but different split axes (e.g. split=0 and split=1).

Think of a rule to decide which tensor should be resplit. Proposal (might require modification; a rough sketch follows the list):

  1. attempt to redistribute the tensor with fewer elements
  2. should both have the same number of elements, redistribute the tensor that is split along a minor axis, i.e. split >= 1.
  3. should both have the same number of elements and both be split along a minor axis, redistribute the one with the larger minor axis
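
A rough sketch of this rule, assuming two Heat DNDarrays a and b with set (non-None) but different split attributes; the helper name choose_operand_to_resplit is hypothetical and not part of Heat's API:

```python
from math import prod

def choose_operand_to_resplit(a, b):
    """Pick the operand to redistribute according to the proposed rule."""
    a_elems, b_elems = prod(a.shape), prod(b.shape)
    # Rule 1: redistribute the tensor with fewer elements.
    if a_elems != b_elems:
        return a if a_elems < b_elems else b
    # Rule 2: with equal element counts, prefer redistributing the tensor that
    # is split along a minor axis (split >= 1).
    if (a.split >= 1) != (b.split >= 1):
        return a if a.split >= 1 else b
    # Rule 3: if both are split along a minor axis, redistribute the one with
    # the larger split axis.
    return a if a.split >= b.split else b
```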

EDITED July 31st 2023 by @ClaudiaComito
Here's the current status of binary operations heat.core._operations.__binary_op():

  • ensure that operand types are promoted correctly
  • ensure that binary ops are possible between operands split along the same axis but with misaligned lshape_maps (e.g. as a result of indexing); see the sketch below this list
  • ensure that operands are on the same MPI communicator
  • ensure that binary ops are possible between operands split along different axes
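
A hedged sketch of the misaligned-lshape_maps item above (operands split along the same axis but with different local chunk sizes), relying on DNDarray.create_lshape_map() and DNDarray.redistribute_() from Heat's current API; the helper function itself is illustrative only:

```python
import torch
import heat as ht

def align_same_split(a: ht.DNDarray, b: ht.DNDarray) -> None:
    """Redistribute b in place so its local chunks match those of a."""
    if a.split is None or a.split != b.split:
        return  # different or unset splits are handled by the other items
    a_map = a.create_lshape_map()
    b_map = b.create_lshape_map()
    if not torch.equal(a_map, b_map):
        # make b's chunking along the shared split axis identical to a's
        b.redistribute_(lshape_map=b_map, target_map=a_map)
```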

Reviewed within #1109

@Cdebus
Contributor

Cdebus commented Apr 10, 2019

@Markus-Goetz I will try out these rules. However, what happens in 2) if neither of the tensors is split along a minor axis?

@Markus-Goetz
Member Author

In this case you may simply add them locally as already implemented and working

@Cdebus
Contributor

Cdebus commented Apr 10, 2019

No, I mean if they both have a split > 1, but along different axes... or is that unrealistic?

@Markus-Goetz
Member Author

That would be rule 3): if both have a minor axis, then the one with the larger minor axis will be redistributed.

@Markus-Goetz
Member Author

A couple more things that came to my mind and have to be considered (a minimal sketch follows the list):

  • __binary_op should ensure that the comms of the two operands match. We cannot perform the operation if the data is distributed across different comms. Exception: the split of one of the operands is None; local data can always be used.

  • Ensure that the operand types are properly promoted. See also types.promote_types.
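
A minimal sketch of these two checks, assuming DNDarray operands a and b; the error message and the way the communicators are compared are illustrative, while ht.promote_types is the existing type-promotion helper:

```python
import heat as ht

def check_binary_op_operands(a: ht.DNDarray, b: ht.DNDarray):
    # Operands must live on the same communicator, unless one of them is not
    # split at all (local data can always be used).
    if a.split is not None and b.split is not None and a.comm != b.comm:
        raise NotImplementedError(
            "operands distributed over different communicators are not supported"
        )
    # Determine the common output type of the operands, cf. types.promote_types.
    return ht.promote_types(a.dtype, b.dtype)
```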

@Cdebus
Contributor

Cdebus commented Jul 25, 2019

Why was this closed? The issue is not fixed; PR #268 was bumped due to issues with MPI communication and resplit.

@Cdebus Cdebus reopened this Jul 25, 2019
@coquelin77
Member

My apologies. I think I saw that the PR was closed (at the time it was a month after the bump, and I think that you had closed the PR). I am not sure, but I believe I thought that the PR had been merged and this was a stray issue that had not been closed at the time of merging. Sorry @Cdebus

@Cdebus
Contributor

Cdebus commented Jul 25, 2019

No worries ^^
I was just surprised, because fixing __binary_op was my original intention, but it didn't work due to resplit relying on allgather, and that's how I ended up with that mess.

@ClaudiaComito
Contributor

Status? @coquelin77 @Markus-Goetz can we break this into smaller issues?

@ClaudiaComito ClaudiaComito added the redistribution, MPI, and heat-dev week labels Apr 1, 2022
@ClaudiaComito ClaudiaComito added this to the Repo Clean-Up milestone Jul 31, 2023
@ClaudiaComito
Contributor

ClaudiaComito commented Jul 31, 2023

Alright, here's the current status of binary operations heat.core._operations.__binary_op():

  • ensure that operand types are promoted correctly
  • ensure that binary ops are possible between operands split along the same axis but with misaligned lshape_maps (e.g. as a result of indexing)
  • ensure that operands are on the same MPI communicator
  • ensure that binary ops are possible between operands split along different axes
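
For the last item, a hedged sketch of how operands split along different axes could be aligned by resplitting one of them; which operand to resplit would follow the rule proposed at the top of this issue, here b is resplit for simplicity. DNDarray.resplit_() exists in Heat; everything else is illustrative:

```python
import heat as ht

def align_different_split(a: ht.DNDarray, b: ht.DNDarray) -> None:
    """Resplit b in place so both operands share the same split axis."""
    if a.split is not None and b.split is not None and a.split != b.split:
        b.resplit_(a.split)  # potentially expensive: full redistribution of b

# usage sketch
a = ht.zeros((4, 4), split=0)
b = ht.ones((4, 4), split=1)
align_different_split(a, b)
result = a + b  # both operands are now split along axis 0
```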

Reviewed within #1109

@ClaudiaComito ClaudiaComito self-assigned this Aug 21, 2023
@mrfh92
Copy link
Collaborator

mrfh92 commented Aug 21, 2023

Still active and ongoing work.

(Reviewed within #1109)

@mrfh92 mrfh92 removed this from the Repo Clean-Up milestone Aug 21, 2023
@ClaudiaComito ClaudiaComito removed their assignment Oct 9, 2023