
Computing metrics per-class for imbalanced data #204

Merged: 54 commits merged into Lightning-AI:master on Jun 14, 2021

Conversation

@AnselmC (Contributor) commented Apr 27, 2021

Before submitting

  • ✅ Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • ✅ Did you read the contributor guideline, Pull Request section?
  • ✅ Did you make sure to update the docs?
  • ✅ Did you write any new necessary tests?

What does this PR do?

Draft PR as discussed in #174.
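
To make the intent concrete, here is a rough sketch of the per-class usage this PR is aiming at (toy data; argument names follow the existing `average`/`num_classes` conventions of the functional API, so treat it as illustrative rather than the final interface):

```python
import torch
from torchmetrics.functional import accuracy, precision

# Imbalanced 3-class toy data: class 2 is rare.
preds  = torch.tensor([0, 0, 1, 1, 2, 0, 0, 1])
target = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])

# A single micro-averaged number looks healthy (7/8 correct = 0.875)...
overall = accuracy(preds, target)

# ...while per-class scores expose that the rare class 2 is the one
# being missed.
per_class_acc = accuracy(preds, target, average="none", num_classes=3)
per_class_prec = precision(preds, target, average="none", num_classes=3)
```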

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks (bot) commented Apr 27, 2021

Hello @AnselmC! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-06-14 12:17:10 UTC

@AnselmC marked this pull request as draft April 27, 2021 11:07
@Borda (Member) commented Apr 28, 2021

@AnselmC How is it going here? Ready for review?

@AnselmC (Contributor, Author) commented Apr 29, 2021

@Borda Hi, I believe it's at least ready for an initial discussion.
One point I was struggling with/unsure about is how to handle multi-dimensional multi-class data (and its different accumulations). As I understand it (please correct me if I'm mistaken), this means that a sample may belong to multiple classes, encoded by a multi-dimensional binary vector. Hence, when mdmc_average="global", these multi-dim vectors need to be treated as a label; this seems analogous to how I've dealt with integer labels.
In the samplewise case, however, each sample is treated as a separate "batch", the metrics are computed per "batch", and the results are then averaged. I'm not sure how to correctly address this, so I'm inclined to keep the existing behaviour for this case.

I'd appreciate some input on this, or pointers to more documentation for these scenarios.
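
To make the distinction concrete, here is a small sketch of the two accumulation modes as I currently understand them, using the existing `mdmc_average` argument of the functional metrics (toy data; expected values worked out by hand):

```python
import torch
from torchmetrics.functional import precision

# Multi-dim multi-class: 2 samples, 4 positions each (e.g. pixels),
# integer labels from 3 classes.
preds  = torch.tensor([[0, 1, 2, 0],
                       [0, 1, 2, 2]])
target = torch.tensor([[0, 2, 1, 1],
                       [0, 1, 2, 2]])

# "global": flatten the extra dimension, treat every position as an
# independent prediction, then reduce once over all 8 positions.
# Hand-computed: per-class precisions (2/3, 1/2, 2/3) -> macro ~0.611.
p_global = precision(preds, target, average="macro",
                     mdmc_average="global", num_classes=3)

# "samplewise": compute the metric within each sample, then average
# the per-sample scores. Hand-computed: sample macros are 1/6 and 1.0
# -> ~0.583, so the two accumulation modes genuinely diverge.
p_samplewise = precision(preds, target, average="macro",
                         mdmc_average="samplewise", num_classes=3)
```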

@codecov (bot) commented May 4, 2021

Codecov Report

Merging #204 (5d7a937) into master (ad5e360) will increase coverage by 0.03%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #204      +/-   ##
==========================================
+ Coverage   96.78%   96.82%   +0.03%     
==========================================
  Files          94       94              
  Lines        3084     3115      +31     
==========================================
+ Hits         2985     3016      +31     
  Misses         99       99              
Flag         Coverage Δ
Linux        78.70% <70.17%> (-0.31%) ⬇️
Windows      78.70% <70.17%> (-0.31%) ⬇️
cpu          78.70% <70.17%> (-18.02%) ⬇️
gpu          96.78% <100.00%> (+0.03%) ⬆️
macOS        78.70% <70.17%> (-18.02%) ⬇️
pytest       96.82% <100.00%> (+0.03%) ⬆️
python3.6    ?
python3.8    ?
python3.9    ?
torch1.3.1   ?
torch1.4.0   ?
torch1.8.1   ?

Flags with carried forward coverage won't be shown.

Impacted Files                                           Coverage Δ
torchmetrics/classification/accuracy.py                  96.15% <ø> (ø)
torchmetrics/classification/f_beta.py                    100.00% <100.00%> (ø)
torchmetrics/classification/precision_recall.py          100.00% <100.00%> (ø)
torchmetrics/classification/specificity.py               100.00% <100.00%> (ø)
torchmetrics/classification/stat_scores.py               100.00% <100.00%> (ø)
torchmetrics/functional/classification/accuracy.py       94.36% <100.00%> (+0.42%) ⬆️
torchmetrics/functional/classification/f_beta.py         100.00% <100.00%> (ø)
...rics/functional/classification/precision_recall.py    100.00% <100.00%> (ø)
...chmetrics/functional/classification/specificity.py    100.00% <100.00%> (ø)

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad5e360...5d7a937.

@Borda added the "enhancement (New feature or request)" and "Important milestonish" labels May 11, 2021
@Borda added this to the v0.4 milestone May 11, 2021
@Borda marked this pull request as ready for review May 11, 2021 07:40
@SkafteNicki (Member) left a comment

Overall LGTM.

torchmetrics/functional/classification/f_beta.py: 2 outdated review threads (resolved)
@mergify (bot) removed the "has conflicts" label Jun 8, 2021
@AnselmC (Contributor, Author) commented Jun 13, 2021

@Borda Sorry, I was out on vacation for a week. I fixed the valid issues from DeepSource, but I'm not sure how to deal with "torch.tensor is not callable" (see here).
Anything else you need from me on this at the moment?
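
For reference, the "torch.tensor is not callable" warning is a known pylint false positive (E1102 / not-callable, which DeepSource surfaces) rather than a real bug; the usual workaround is an inline suppression, sketched below rather than taken from this PR:

```python
import torch

# pylint (and tools built on it, such as DeepSource) can flag
# torch.tensor as "not callable"; this is a known false positive
# and can be silenced inline:
t = torch.tensor([1.0, 2.0])  # pylint: disable=not-callable
```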

@Borda (Member) commented Jun 13, 2021

@SkafteNicki @maximsch2 mind reviewing?

@SkafteNicki (Member) left a comment

Great job, LGTM!
Please add an entry to the changelog noting which metrics are affected by this enhancement :]

@AnselmC (Contributor, Author) commented Jun 14, 2021

@SkafteNicki done!

@Borda enabled auto-merge (squash) June 14, 2021 12:15
@Borda merged commit 517a611 into Lightning-AI:master Jun 14, 2021
Labels: enhancement (New feature or request), Important milestonish, ready
Projects: None yet
5 participants