
Deprecate HFCrossEntropy and Perplexity #1857

Merged: 32 commits merged into mosaicml:dev from xent on Feb 6, 2023

Conversation

@dakinggg (Contributor) commented Jan 5, 2023

What does this PR do?

This PR adds DeprecationWarnings to HFCrossEntropy and Perplexity, since the separation between these and LanguageCrossEntropy is confusing. To smooth their eventual removal, this PR also adds support for Mapping input to LanguageCrossEntropy.update and adds LanguagePerplexity(LanguageCrossEntropy). A sketch of the changes follows below.
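For illustration only, here is a minimal sketch of the changes described above. The class and method names follow the PR description, but the state names, ignore_index handling, and deprecation message are assumptions here, not composer's actual implementation:

    import warnings
    from collections.abc import Mapping

    import torch
    from torchmetrics import Metric


    class LanguageCrossEntropy(Metric):
        """Sketch: token-level cross entropy with a consistent sum-over-tokens reduction."""

        def __init__(self):
            super().__init__()
            self.loss_fn = torch.nn.CrossEntropyLoss(reduction='sum', ignore_index=-100)
            self.add_state('sum_loss', default=torch.tensor(0.), dist_reduce_fx='sum')
            self.add_state('total_items', default=torch.tensor(0), dist_reduce_fx='sum')

        def update(self, output, target):
            # New in this PR: accept an HF-style Mapping (a model output dict
            # with a 'logits' key) as well as a raw logits Tensor.
            logits = output['logits'] if isinstance(output, Mapping) else output
            target = target.view(-1)
            logits = logits.view(target.size(0), -1)
            self.sum_loss += self.loss_fn(logits, target)
            self.total_items += (target != -100).sum()

        def compute(self):
            return self.sum_loss / self.total_items


    class LanguagePerplexity(LanguageCrossEntropy):
        def compute(self):
            # Perplexity is exp(cross entropy), so it matches
            # LanguageCrossEntropy by construction.
            return torch.exp(super().compute())


    class HFCrossEntropy(LanguageCrossEntropy):
        def __init__(self):
            # Sketch of the deprecation path: warn on construction.
            warnings.warn(
                'HFCrossEntropy is deprecated; please use LanguageCrossEntropy instead.',
                DeprecationWarning)
            super().__init__()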

More context:
There is a slight difference between LanguageCrossEntropy and HFCrossEntropy due to how the loss is reduced, which creates confusion in the examples repo (it uses LanguageCrossEntropy and Perplexity). There is a possible small cost to this change: HFCrossEntropy uses output['loss'] from HF (if available) rather than recomputing the loss, whereas LanguageCrossEntropy always recomputes the loss so that the reduction is consistent and LanguagePerplexity always matches LanguageCrossEntropy. The examples repo was already returning the logits from forward, so this slight cost was already present there.
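As a concrete illustration of the reduction difference (a standalone sketch in plain torch, not composer code): averaging HF's per-batch mean losses weights each batch equally, while a token-level reduction weights each token equally, so the two disagree whenever batches contain different numbers of tokens.

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)

    # Two batches with different token counts (e.g. due to padding/filtering).
    logits_a, targets_a = torch.randn(4, 10), torch.randint(10, (4,))
    logits_b, targets_b = torch.randn(16, 10), torch.randint(10, (16,))

    # Averaging per-batch means, roughly what accumulating HF's output['loss'] does:
    per_batch_mean = (F.cross_entropy(logits_a, targets_a) +
                      F.cross_entropy(logits_b, targets_b)) / 2

    # Token-level reduction, what recomputing the loss from logits gives:
    token_mean = F.cross_entropy(torch.cat([logits_a, logits_b]),
                                 torch.cat([targets_a, targets_b]))

    print(per_batch_mean.item(), token_mean.item())  # generally not equal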

What issue(s) does this change relate to?

Closes CO-1616

Before submitting

  • Have you read the contributor guidelines?
  • Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
  • Did you update any related docs and document your change?
  • Did you update any related tests and add any new tests related to your change? (see testing)
  • Did you run the tests locally to make sure they pass?
  • Did you run pre-commit on your change? (see the pre-commit section of prerequisites)


@dakinggg requested a review from a team as a code owner on January 5, 2023 22:35
@vchiley (Contributor) commented Jan 6, 2023

LanguageCrossEntropy's init requires vocab_size; it's only used here, like this:

        assert isinstance(output, Tensor)
        output = output.view(-1, self.vocab_size)
        target = target.view(-1)
        losses = self.loss_fn(output, target)

can we instead do something like:

        assert isinstance(output, Tensor)
        target = target.view(-1)
        output = output.view(target.size(0), -1)
        losses = self.loss_fn(output, target)

This removes the need to pass vocab_size.

I'm also not sure whether there is a fundamental difference between these and our generic CE metric.

@dakinggg (Contributor, Author) commented Jan 6, 2023

It looks like the difference is that CrossEntropy supports one-hot targets, does not do the reshaping for you, and (with this PR) does not support Mapping input. On your comment about eliminating the need for vocab_size: yeah, that seems good.
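As a small aside on the one-hot point (plain torch, not composer's metric): F.cross_entropy accepts class probabilities as targets since PyTorch 1.10, so class-index and one-hot targets yield the same loss value.

    import torch
    import torch.nn.functional as F

    logits = torch.randn(4, 10)
    targets = torch.randint(10, (4,))
    one_hot = F.one_hot(targets, num_classes=10).float()

    # Class-index and one-hot targets produce the same loss.
    assert torch.allclose(F.cross_entropy(logits, targets),
                          F.cross_entropy(logits, one_hot))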

@review-notebook-app commented: Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@mvpatel2000 (Contributor) left a review comment

Please don't merge until review from an NLP person as well. LGTM on the eng side.

@dakinggg (Contributor, Author) commented

@abhi-mosaic @vchiley would one of you mind taking a look to approve from the NLP side?

@abhi-mosaic (Contributor) left a review comment

Looks great! After this gets merged and released, I will change our imports in examples/[llm, bert], likely with the 0.13 release.

@mvpatel2000 (Contributor) commented

@dakinggg let's hold until after 12.1

@dakinggg (Contributor, Author) commented Feb 1, 2023

@mvpatel2000 yeah, that was my plan

@dakinggg enabled auto-merge (squash) February 6, 2023 18:48
@dakinggg merged commit 48d40f9 into mosaicml:dev Feb 6, 2023
@dakinggg deleted the xent branch September 9, 2023 22:47