
Automatically find metric attribute when logging #8656

Closed
samgelman opened this issue Jul 30, 2021 · 7 comments · Fixed by #8675
Labels: feature (Is an improvement or enhancement) · help wanted (Open to be worked on) · waiting on author (Waiting on user action, correction, or update)

Comments

@samgelman

🚀 Feature

Motivation

Logging metrics stored in a ModuleList is somewhat clunky, because you have to manually specify the metric_attribute in the call to self.log().

def __init__(self, ...):
    self.num_tasks = 50
    self.metrics = nn.ModuleList([torchmetrics.PearsonCorrcoef() for _ in range(self.num_tasks)])

def test_step(self, data_batch, batch_idx):
    inputs, labels = data_batch
    outputs = self(inputs)

    for task_num in range(self.num_tasks):
        outputs_task = outputs[:, task_num]
        labels_task = labels[:, task_num]
        self.metrics[task_num](outputs_task, labels_task)

        self.log("test/pearson_{}".format(task_num), self.metrics[task_num], metric_attribute="metrics.{}".format(task_num))

Pitch

It seems like Lightning should be able to find the metric_attribute automatically, given the Metric object. This would make logging metrics stored in a ModuleList a bit cleaner.
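As a rough sketch of how such a lookup could work (this is not Lightning's actual implementation, and find_metric_attribute is a hypothetical helper name), Lightning could walk the LightningModule's sub-modules and match the logged Metric by object identity:

from torch import nn
from torchmetrics import Metric

def find_metric_attribute(module: nn.Module, metric: Metric) -> str:
    # Hypothetical helper, not Lightning's actual code: return the dotted
    # attribute path of `metric` inside `module`, e.g. "metrics.3".
    for name, child in module.named_modules():
        if child is metric:  # match by object identity
            return name
    raise ValueError("The logged Metric is not a sub-module of this module")

With something along those lines, self.log("test/pearson_0", self.metrics[0]) could resolve metric_attribute="metrics.0" on its own.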

Alternatives

Additional context


If you enjoy Lightning, check out our other projects! ⚡

  • Metrics: Machine learning metrics for distributed, scalable PyTorch applications.

  • Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning

  • Bolts: Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

  • Lightning Transformers: Flexible interface for high performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra.

@samgelman added the feature (Is an improvement or enhancement) and help wanted (Open to be worked on) labels on Jul 30, 2021
@tchaton
Contributor

tchaton commented Jul 31, 2021

Hey @samgelman,

Good catch. I will do that next week unless you want to. Should be fairly simple

Best,
T.C

@tchaton tchaton self-assigned this Aug 2, 2021
@tchaton
Contributor

tchaton commented Aug 2, 2021

Dear @samgelman,

It seems to work fine. Am I missing something?

def test_metric_collections(tmpdir):

    class TestModel(BoringModel):

        def __init__(self):
            super().__init__()
            self.metrics_list = ModuleList([DummyMetric() for _ in range(2)])
            self.metrics_dict = ModuleDict({"a": DummyMetric(), "b": DummyMetric()})
            self.metrics_collection_dict = MetricCollection({"a": DummyMetric(), "b": DummyMetric()})

        def training_step(self, batch, batch_idx):
            loss = super().training_step(batch, batch_idx)
            self.metrics_list[0](batch_idx)
            self.metrics_list[1](batch_idx)

            self.metrics_dict["a"](batch_idx)
            self.metrics_dict["b"](batch_idx)

            self.metrics_collection_dict["a"](batch_idx)
            self.metrics_collection_dict["b"](batch_idx)

            self.log("a", self.metrics_list[0])
            self.log("b", self.metrics_list[1])

            self.log("c", self.metrics_dict["a"])
            self.log("d", self.metrics_dict["b"])

            self.log("e", self.metrics_collection_dict["a"])
            self.log("f", self.metrics_collection_dict["b"])

            return loss

        def on_train_epoch_end(self) -> None:
            results = self.trainer.fit_loop.epoch_loop._results
            assert results['training_step.a'].meta.metric_attribute == 'metrics_list.0'
            assert results['training_step.b'].meta.metric_attribute == 'metrics_list.1'

            assert results['training_step.c'].meta.metric_attribute == 'metrics_dict.a'
            assert results['training_step.d'].meta.metric_attribute == 'metrics_dict.b'

            assert results['training_step.e'].meta.metric_attribute == 'metrics_collection_dict.a'
            assert results['training_step.f'].meta.metric_attribute == 'metrics_collection_dict.b'

    model = TestModel()

    trainer = Trainer(
        default_root_dir=tmpdir,
        max_epochs=2,
        limit_train_batches=2,
        limit_val_batches=2,
    )
    trainer.fit(model)

@carmocca added the waiting on author (Waiting on user action, correction, or update) label on Aug 2, 2021
@samgelman
Author

Hmm, thanks for looking into it. In my setup, I was getting:

pytorch_lightning.utilities.exceptions.MisconfigurationException: Could not find the LightningModule attribute for the torchmetrics.Metric logged. You can fix this by calling self.log(test/pearson_fa_elec, ..., metric_attribute=name) where name is one of ['test_pearson_collection.0', 'test_pearson_collection.1']

I'll try to recreate it with a simple example.

@tchaton
Contributor

tchaton commented Aug 2, 2021

Dear @samgelman,

I added a test. Feel free to add a test which currently breaks.

Best,
T.C

@samgelman
Author

Hey @tchaton,

It seems this fails:

def test_metric_collections(tmpdir):
    class TestModel(BoringModel):
        def __init__(self):
            super().__init__()
            self.num_tasks = 50
            self.metrics_list = ModuleList([PearsonCorrcoef() for _ in range(self.num_tasks)])

        def training_step(self, batch, batch_idx):
            loss = super().training_step(batch, batch_idx)

            for i in range(self.num_tasks):
                self.metrics_list[i](torch.tensor([1., 2., 3.]), torch.tensor([2., 1., 3.]))
                self.log("a_{}".format(i), self.metrics_list[i])

            return loss

        def on_train_epoch_end(self) -> None:
            results = self.trainer.fit_loop.epoch_loop._results

            for i in range(self.num_tasks):
                assert results["training_step.a_{}".format(i)].meta.metric_attribute == "metrics_list.{}".format(i)

Two things to note.

  1. I have num_tasks=50. It passes with num_tasks=2 but fails with num_tasks=50.
  2. I am using PearsonCorrcoef() instead of DummyMetric(). It passes with DummyMetric() but fails with PearsonCorrcoef().

Thanks,
Sam

@tchaton tchaton reopened this Aug 5, 2021
@SkafteNicki
Member

After looking a bit at this, I found the following:

import torchmetrics
from torchmetrics import Accuracy, PearsonCorrcoef
from torch.nn import ModuleList

m1 = ModuleList([Accuracy() for _ in range(5)])
m2 = ModuleList([PearsonCorrcoef() for _ in range(5)])

list(m1.named_modules())
#[('',
#  ModuleList(
#    (0): Accuracy()
#    (1): Accuracy()
#    (2): Accuracy()
#    (3): Accuracy()
#    (4): Accuracy()
#  )),
# ('0', Accuracy()),
# ('1', Accuracy()),
# ('2', Accuracy()),
# ('3', Accuracy()),
# ('4', Accuracy())]

list(m2.named_modules())
# [('',
#  ModuleList(
#    (0): PearsonCorrcoef()
#    (1): PearsonCorrcoef()
#    (2): PearsonCorrcoef()
#    (3): PearsonCorrcoef()
#    (4): PearsonCorrcoef()
#  )),
# ('0', PearsonCorrcoef())]

It seems that for metrics where the metric state is a list (PearsonCorrcoef), only one copy is recognised by PyTorch as a sub-module, whereas it works as expected for metrics where the metric state is a tensor (Accuracy).
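A quick way to check whether an installed torchmetrics version is affected (a diagnostic sketch, not part of either library) is to count how many of the list-state metrics actually show up as sub-modules:

from torch.nn import ModuleList
from torchmetrics import PearsonCorrcoef

# Count how many PearsonCorrcoef instances in a ModuleList are visible as
# sub-modules. On an affected version this prints fewer than 5 names;
# on a fixed version it prints all 5 ("0" through "4").
metrics = ModuleList([PearsonCorrcoef() for _ in range(5)])
visible = [name for name, mod in metrics.named_modules() if isinstance(mod, PearsonCorrcoef)]
print(len(visible), visible)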

@SkafteNicki
Copy link
Member

This was an issue due to a bug in the hashing of metrics, which has been fixed in Lightning-AI/torchmetrics#478.
Please upgrade to torchmetrics master (pip install git+https://github.com/PytorchLightning/metrics.git@master) and the issue should disappear.
Closing issue.
