F1, Accuracy, Precision and Recall all output the same value consistently in a binary classification setting. #746
Comments
Hi! Thanks for your contribution, great first issue!
Seems to be a duplicate of #543. If you still find this in need, feel free to reopen 🐰
Sorry if I'm reopening the issue, but I think this is at the very least an issue with the documentation. The way the metrics are set up here is confusing. From what I know, regardless of [...]
Hi, I'm having the same issue on: [...]
I have encountered this issue and here is a Colab notebook to replicate the issue and the solution. I agree with @FeryET, the setup is confusing and it would be great if there were a warning or a better example to showcase the difference.
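The linked notebook is not reproduced here, but the following is a hedged sketch of the kind of workaround that applied to the pre-refactor API (assuming a torchmetrics release around 0.7-0.10, where the classification metrics accept a `multiclass` argument):

```python
import torch
from torchmetrics import Precision, Recall

# Same toy labels as the sklearn example further down in the thread.
target = torch.tensor([1, 1, 0, 0, 0, 1])
preds = torch.tensor([0, 1, 1, 1, 0, 1])

# With the pre-refactor defaults (statistics pooled over both classes),
# precision and recall collapse to the same accuracy-like number.
# Passing multiclass=False restricts the statistics to the positive class.
precision = Precision(multiclass=False)
recall = Recall(multiclass=False)
print(precision(preds, target))  # expected 0.5000 (2 TP out of 4 predicted positives)
print(recall(preds, target))     # expected 0.6667 (2 TP out of 3 actual positives)
```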
Also, can we add an option that matches sklearn's `average='binary'` behaviour? Like:

```python
import sklearn.metrics as metrics
import numpy as np

a = np.array([1, 1, 0, 0, 0, 1])
b = np.array([0, 1, 1, 1, 0, 1])
metrics.recall_score(a, b, average='binary')  # 0.6666666666666666
```
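For comparison, here is a minimal sketch of the same check against the post-refactor functional API (assuming torchmetrics >= 0.11, where the `binary_*` functions used in the maintainer's recap below are available):

```python
import torch
from torchmetrics.functional import binary_recall

a = torch.tensor([1, 1, 0, 0, 0, 1])  # ground truth, as in the sklearn snippet above
b = torch.tensor([0, 1, 1, 1, 0, 1])  # predictions

# Note the argument order: torchmetrics expects (preds, target),
# the reverse of sklearn's (y_true, y_pred).
print(binary_recall(b, a))  # tensor(0.6667), matching sklearn's average='binary'
```

The pre-refactor defaults instead pool statistics over both classes, which is why this issue sees the same accuracy-like number for every metric.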
I am also having this issue - is there a simple way to fix it?
Hi, I encountered a similar issue when using the Precision metric in MetricCollection. However, the output was always zero rather than being consistent with the other metrics.
I encountered the same issue. As @lucienwang1009 said, initializing [...]

Some detailed observations: [...]

The bug is not easy to observe, and it took me hours to check other places like data pre-processing, training and testing scripts, package versions, and so on.
Issue will be fixed by the classification refactor: see this issue #1001 and this PR #1195 for all changes.

Small recap: this issue describes that Accuracy, Precision, Recall and F1 all report the same value in a binary setting. Using the new `binary_*` functional interface:

```python
from torch import tensor
from torchmetrics.functional import binary_accuracy, binary_precision, binary_recall, binary_f1_score

preds = tensor([0.4225, 0.5042, 0.1142, 0.4134, 0.0978, 0.1402, 0.9422, 0.4846, 0.1639, 0.6613])
target = tensor([1, 1, 1, 1, 1, 1, 1, 0, 1, 1])

binary_accuracy(preds, target)   # tensor(0.4000)
binary_recall(preds, target)     # tensor(0.3333)
binary_precision(preds, target)  # tensor(1.)
binary_f1_score(preds, target)   # tensor(0.5000)
```

which also corresponds to what sklearn computes.
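For the modular setup from the original report, a hedged sketch of the equivalent post-refactor usage (assuming torchmetrics >= 0.11, where the `Binary*` classes live in `torchmetrics.classification`; the dict keys shown in the comment are the default class-name keys):

```python
import torch
from torchmetrics import MetricCollection
from torchmetrics.classification import BinaryAccuracy, BinaryF1Score, BinaryPrecision, BinaryRecall

# Same toy predictions/targets as in the functional example above.
preds = torch.tensor([0.4225, 0.5042, 0.1142, 0.4134, 0.0978, 0.1402, 0.9422, 0.4846, 0.1639, 0.6613])
target = torch.tensor([1, 1, 1, 1, 1, 1, 1, 0, 1, 1])

metrics = MetricCollection([BinaryAccuracy(), BinaryPrecision(), BinaryRecall(), BinaryF1Score()])
print(metrics(preds, target))
# Expected to mirror the functional results above:
# {'BinaryAccuracy': 0.4, 'BinaryPrecision': 1.0, 'BinaryRecall': 0.3333, 'BinaryF1Score': 0.5}
```

With the explicit `Binary*` classes, none of the averaging ambiguity from the old API applies.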
🐛 Bug
I am trying to report F1, Accuracy, Precision and Recall for a binary classification task. I have collected these metrics in a `MetricCollection` module and run them for my `train`, `val` and `test` stages. Upon inspecting the results, I can see that all of these metrics are showing the exact same value.

To Reproduce
Create a random binary classification task and add these metrics together in a metric collection.
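A minimal sketch of such a reproduction, assuming a pre-refactor torchmetrics release (roughly 0.7-0.10) where `Accuracy`, `Precision`, `Recall` and `F1Score` can be constructed without a `task` argument; the claim that all four values coincide is what this thread reports, not something re-verified here:

```python
import torch
from torchmetrics import Accuracy, F1Score, MetricCollection, Precision, Recall

# Random binary classification "task": random hard predictions and targets.
torch.manual_seed(0)
preds = torch.randint(0, 2, (100,))
target = torch.randint(0, 2, (100,))

# Pre-refactor defaults: statistics pooled over both classes.
metrics = MetricCollection([Accuracy(), Precision(), Recall(), F1Score()])
print(metrics(preds, target))
# Per this issue, all four entries print the exact same number,
# because the defaults pool statistics over both the 0 and the 1 class.
```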
Code sample
I have uploaded a very minimal example in this notebook. As you can see, the values reported by `torchmetrics` don't align with `classification_report`.

Expected behavior
F1, Precision, Recall and Accuracy should usually differ. It should be very unlikely to see all of them match exactly.
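For context on why the values coincide under the old defaults: when statistics are pooled (micro-averaged) over both classes, every misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so pooled precision, recall and F1 all reduce to plain accuracy. A small illustration with numpy, reusing the toy arrays from the comments above:

```python
import numpy as np

target = np.array([1, 1, 0, 0, 0, 1])
preds = np.array([0, 1, 1, 1, 0, 1])

tp = fp = fn = 0
for cls in (0, 1):  # treat the binary problem as two classes and pool the counts
    tp += np.sum((preds == cls) & (target == cls))
    fp += np.sum((preds == cls) & (target != cls))
    fn += np.sum((preds != cls) & (target == cls))

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
accuracy = np.mean(preds == target)
print(micro_precision, micro_recall, accuracy)  # all 0.5
```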
Environment
Additional context
I also asked this question in a discussion forum yesterday thinking it was a problem on my part, but after looking into the situation, I think this might be a bug.
#743