
Add a gather_for_metrics capability #540

Merged
muellerzr merged 14 commits into main from dset_len on Jul 21, 2022

Conversation

muellerzr
Collaborator

@muellerzr muellerzr commented Jul 20, 2022

Introduce a gather_for_metrics function

What does this add?

This PR adds a new function to Accelerator called gather_for_metrics, which assists with calculating metrics correctly in distributed setups

Who is it for?

Users of Accelerate who want to ensure their reported metrics are fully accurate

Why is it needed?

To make sure every process receives batches of the same size, Accelerate pads the last batch of a prepared dataloader with duplicates of the final samples. These duplicates need to be dropped when calculating metrics on the last batch, and currently that looks something like:

samples_seen = 0
for step, batch in enumerate(eval_dataloader):
    # ... run the model and gather `predictions` and `references` ...
    if accelerator.use_distributed:
        # Then see if we're on the last batch of our eval dataloader
        if step == len(eval_dataloader) - 1:
            # The last batch needs to be truncated on distributed systems
            # as it contains additional (duplicated) samples
            predictions = predictions[: len(eval_dataloader.dataset) - samples_seen]
            references = references[: len(eval_dataloader.dataset) - samples_seen]
        else:
            # Otherwise we add the number of samples seen
            samples_seen += references.shape[0]

This PR adds a new utility, Accelerator.gather_for_metrics, which handles this check for us automatically, entirely thanks to the GradientState capability.

Note: this check previously didn't work on TPUs, where the dataset needs to be accessed as eval_dataloader._loader.dataset; this PR fixes that as well.

What parts of the API does this impact?

User-facing:

  • A new Accelerator.gather_for_metrics function was added

Internal structure:

  • Preprocessed dataloaders now have a new total_dataset_length attribute
  • GradientState now keeps track of the number of samples seen; a sketch of how these fit together follows below
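
As a rough illustration of how these two pieces combine, here is a minimal sketch of the truncation logic. The function name and exact control flow are assumptions based on the description above, not the merged implementation:

def gather_for_metrics_sketch(accelerator, tensor, dataloader):
    # Gather the tensor from all processes as usual
    gathered = accelerator.gather(tensor)
    state = accelerator.gradient_state
    if state.end_of_dataloader:
        # The last batch was padded with duplicates of the final samples,
        # so keep only the entries that belong to the real dataset
        remaining = dataloader.total_dataset_length - state.samples_seen
        gathered = gathered[:remaining]
    else:
        # Otherwise, record how many samples have been processed so far
        state.samples_seen += gathered.shape[0]
    return gathered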

Basic Usage Example(s):

When calculating metrics, users can now do the following to properly calculate their metrics:

ddp_input, ddp_target = next(iter(dataloader))
with torch.no_grad():
    logits = ddp_model(ddp_input)
    # Gather from all processes, dropping the duplicated samples on the last batch
    logits, ddp_target = accelerator.gather_for_metrics((logits, ddp_target), dataloader)
    accuracy_multi = accuracy(logits.argmax(dim=-1), ddp_target)

When would I use it, and when wouldn't I?

Since this works on both distributed and non-distributed systems, it can always be used whenever the evaluation dataloader has been prepared by the Accelerator. Users should just add this to any script that calculates metrics; a sketch of a full evaluation loop follows below.
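
For reference, here is a sketch of a complete evaluation loop using the new API. The model, dataloader, and batch structure are placeholders standing in for the user's own objects; only gather_for_metrics comes from this PR:

model.eval()
all_predictions, all_references = [], []
for batch in eval_dataloader:
    inputs, references = batch
    with torch.no_grad():
        logits = model(inputs)
    # No manual samples_seen bookkeeping needed: the duplicated samples on
    # the final batch are dropped automatically
    predictions, references = accelerator.gather_for_metrics(
        (logits.argmax(dim=-1), references), eval_dataloader
    )
    all_predictions.append(predictions)
    all_references.append(references)
accuracy = (torch.cat(all_predictions) == torch.cat(all_references)).float().mean()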

TODO:

  • Update the other examples to use this new API. multiprocess_metrics will serve as a lower-level example

@muellerzr muellerzr added the enhancement New feature or request label Jul 20, 2022
@muellerzr muellerzr requested a review from sgugger July 20, 2022 15:15
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 20, 2022

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@sgugger sgugger left a comment


Very clever! It's really great, just make sure to update the docs and the examples :-)

@muellerzr muellerzr requested a review from sgugger July 20, 2022 16:23
@muellerzr muellerzr changed the title Add a gather_metrics capability Add a gather_for_metrics capability Jul 20, 2022
Copy link
Collaborator

@sgugger sgugger left a comment


Nice doc!

docs/source/quicktour.mdx (resolved)
@muellerzr muellerzr merged commit 164943c into main Jul 21, 2022
@muellerzr muellerzr deleted the dset_len branch July 21, 2022 11:40
@plamb-viso

Is this functionality in 0.11.0?

@muellerzr
Collaborator Author

It is not; we suggest not using it for now and doing the check manually, as shown in the metric example script, since some bugs were discovered: #575

@plamb-viso

Cool, thank you, excited for this change when it happens
