Feature/669 bincount #670

Merged 6 commits from Feature/669-bincount into master on Sep 22, 2020

Conversation

@mtar (Collaborator) commented Sep 11, 2020

Description

Introduces a bincount function analogous to the respective NumPy and PyTorch functions.

Issue/s resolved: #669
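
For orientation, a quick sketch of the intended usage, mirroring numpy.bincount and torch.bincount; the exact call pattern here is an assumption based on that correspondence, and the values are illustrative:

import heat as ht

a = ht.array([0, 1, 1, 2, 2, 2])

# count occurrences of each non-negative integer: bins 0, 1, 2 -> [1, 2, 3]
print(ht.bincount(a))

# with weights, each bin accumulates the weights of its elements instead
# of a plain count; minlength pads the result with trailing empty bins
w = ht.array([0.5, 1.0, 1.0, 0.25, 0.25, 0.5])
print(ht.bincount(a, weights=w, minlength=5))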

Changes proposed:

  • new feature bincount

Type of change

  • New Feature

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

no

@codecov (bot) commented Sep 14, 2020

Codecov Report

Merging #670 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #670   +/-   ##
=======================================
  Coverage   97.37%   97.38%           
=======================================
  Files          87       87           
  Lines       16711    16758   +47     
=======================================
+ Hits        16273    16320   +47     
  Misses        438      438           
Impacted Files                        Coverage Δ
heat/core/statistics.py               97.02% <100.00%> (+0.10%) ⬆️
heat/core/tests/test_statistics.py    100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mtar (Collaborator, Author) commented Sep 14, 2020

rerun tests

5 similar comments

@mtar force-pushed the Feature/669-bincount branch from 68f081f to 6807be1 on September 14, 2020 12:24
@mtar marked this pull request as ready for review on September 14, 2020 12:36
@mtar (Collaborator, Author) commented Sep 14, 2020

rerun tests

1 similar comment

@ClaudiaComito previously approved these changes Sep 17, 2020

@ClaudiaComito (Contributor) left a comment

This looks really good to me, thanks for implementing it!

Comment on lines 420 to 423
# local bincount on each process' underlying torch tensor
counts = torch.bincount(x._DNDarray__array, weights, minlength)

# the global number of bins is the largest local result length
size = counts.size()[0]
maxlength = x.comm.allreduce(size, op=MPI.MAX)
@ClaudiaComito (Contributor):

Interesting. I would have just calculated maxlength = x.max() + 1. You're bypassing all the __reduce_op() checks, so you're probably more efficient :)

@mtar (Collaborator, Author):

I was curious. In a very simple test, this version is roughly 0.0001–0.0002 s faster.
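
For context, a minimal sketch of the two ways to obtain the global output length being compared here, written against mpi4py and torch directly rather than Heat's internals; the data and names are illustrative:

import torch
from mpi4py import MPI

comm = MPI.COMM_WORLD

# process-local chunk of a distributed non-negative integer array
local_x = torch.tensor([0, 1, 1, 2, 2, 2]) + comm.rank

# variant used in the PR: bincount locally, then one MAX-allreduce
# over the local result lengths
local_counts = torch.bincount(local_x)
maxlength = comm.allreduce(local_counts.shape[0], op=MPI.MAX)

# variant suggested in the review: global maximum of the data plus one
# (Heat's x.max() + 1, which routes through the __reduce_op() machinery)
maxlength_alt = comm.allreduce(int(local_x.max()), op=MPI.MAX) + 1

assert maxlength == maxlength_alt

Both variants reduce a single integer per process; the PR's version simply skips the extra bookkeeping that x.max() performs through __reduce_op().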

Raises
------
ValueError
    If split axes of `x` and `weights` are not the same.
@ClaudiaComito (Contributor):

Technically this can't happen, because both arrays only have one axis.

@mtar (Collaborator, Author):

It happens in the one case where one split is None and the other is 0, but PyTorch would complain anyway because the input tensors would then have different lengths.

@ClaudiaComito (Contributor):

Yes, I understood what you were referring to; I was just suggesting rewording this line, because None is not a split axis. Suggestions (a sketch of the case in question follows the list):

  • "if the split attribute is not the same"?
  • "if x is distributed and weights isn't, or vice versa"?

@mtar (Collaborator, Author) commented Sep 22, 2020

rerun tests

@ClaudiaComito (Contributor) left a comment

Great, thanks!

@mtar merged commit 7cee8f0 into master on Sep 22, 2020
@mtar deleted the Feature/669-bincount branch on September 22, 2020 09:06