Classification metrics overhaul: input formatting standardization (1/n) #4837
Conversation
Thanks for splitting off the PR! Reviewing now
Overall looks like good changes, just a few small things to fix.
Codecov Report
@@           Coverage Diff           @@
##           master   #4837    +/-   ##
=======================================
  Coverage      93%      93%
=======================================
  Files         129      130     +1
  Lines        9397     9527   +130
=======================================
+ Hits         8713     8843   +130
  Misses        684      684
overall very good :)
Is there anything else that needs to be done before this PR can be merged?
@tadejsv, thanks for the further description of the
@tadejsv mind resolving the conflicts? :] probably after #4549
LGTM, it would just be nice to also have tests for all the helper functions raising some kind of exception...
Alright, merge conflicts resolved, ready for final review. @SkafteNicki please double check that the docs are ok (git diff is not useful there).
LGTM, docs look fine :]
High-level review, looks good, nice docs
This PR is a spin-off from #4835. It should be merged before any other spin-offs, as it provides a base for all of them.
What does this PR do?
General (fundamental) changes
I have created a new `_input_format_classification` function (in `metrics/classification/utils`). The job of this function is to a) validate, and b) transform the inputs into a common format. This common format is a binary label indicator array: either `(N, C)`, or `(N, C, X)` (only for multi-dimensional multi-class inputs).

I believe that having such a "central" function is crucial, as it gets rid of code duplication (which was present in PL metrics before), and enables metric developers to focus on developing the metrics themselves, and not on standardizing and validating inputs.
The validation performed on the inputs basically makes sure that they fall into one of the possible input type cases, and that the values are consistent with both the type of the inputs and the additional parameters set (e.g. that there is no label higher than `num_classes` in the target). The docstrings (and the new "Input types" section in the documentation) give all the details about how the standardization and validation are performed.

Here I'll list the parameters of this function (many of which are also present on some metrics), and why I decided to use them:
- `threshold`: The probability threshold for binarizing binary and multi-label inputs.
- `num_classes`: The number of classes. Used either to decide the `C` dimension of the inputs, or, if this is already implicitly given, to ensure consistency between the inputs and the number of classes the user specified when creating the metric (thus avoiding either having to check this manually in `update` for each metric, or raising an error when updating the state, which may not be very clear to the user).
- `top_k`: For (multi-dimensional) multi-class inputs, if predictions are given as probabilities, selects the top k highest probabilities per sample. It's a generalization of the usual procedure, with `k=1` (a toy sketch of this is shown below). This will be used by the `Accuracy` metric in subsequent PRs.
- `is_multiclass`: Used for transforming binary or multi-label inputs to 2-class multi-class and 2-class multi-dimensional multi-class inputs, respectively, and vice versa.

Why? This is similar to the `multilabel` argument that was (is?) present on some metrics. I believe this is a better name for it, as it also deals with transforming to/from binary. But why is it needed? There are cases where it is not clear what the inputs are: for example, say that both preds and target are of the form `[0, 1, 0, 1]`. This actually appears to be multi-class (it could be the case that this batch simply happened to contain only 0s and 1s), so an explicit instruction is needed to tell the metric that the inputs are in fact binary. On the other hand, sometimes we would like to treat binary inputs as two-class inputs - this is the case in the confusion matrix.

I also experimented with using `num_classes` to determine this. Besides this being a very confusing approach, requiring several paragraphs to explain clearly, it also does not resolve all ambiguities (is setting `num_classes=1` with 2-class probability predictions a request to treat the data as binary, or an inconsistency of inputs that should raise an error?). So I think `is_multiclass` is the best approach here.
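Below is the toy `top_k` sketch referenced in the list above. The helper name `topk_indicator` is an illustrative assumption, not part of the actual API; it only reimplements the idea of keeping the k highest-probability classes per sample as a binary indicator:

```python
import torch


def topk_indicator(probs: torch.Tensor, k: int) -> torch.Tensor:
    """Return an (N, C) binary indicator with 1s at the k largest probabilities per sample."""
    indicator = torch.zeros_like(probs, dtype=torch.long)
    return indicator.scatter(1, probs.topk(k, dim=1).indices, 1)


probs = torch.tensor([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3]])

print(topk_indicator(probs, k=1))
# tensor([[1, 0, 0],
#         [0, 1, 0]])
print(topk_indicator(probs, k=2))
# tensor([[1, 1, 0],
#         [0, 1, 1]])
```

With `k=1` this reduces to the usual argmax one-hot conversion, which is why `top_k` is described as a generalization of the standard procedure.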
Documentation
Instead of metrics being organized into "Class Metrics" and "Functional Metrics", they are now organized by topic (Classification, Regression, ...), and within each topic split into class and functional metrics, if necessary. This allows adding special topic-related sections - in this case I have added a section on what types of inputs are used for classification metrics - a section that metrics can link to, in order to not repeat the same thing 100 times, and to keep docstrings short and to the point.
The second half of the Input types section, with examples from the StatScores metric, will be added in that metric's PR.