Ignore step param in Neptune logger's log_metric method #5510

Merged · 12 commits · Jan 25, 2021
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -15,6 +15,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Increased TPU check timeout from 20s to 100s ([#5598](https://github.com/PyTorchLightning/pytorch-lightning/pull/5598))


- Ignored `step` param in Neptune logger's log_metric method ([#5510](https://github.com/PyTorchLightning/pytorch-lightning/pull/5510))


### Deprecated


8 changes: 5 additions & 3 deletions pytorch_lightning/loggers/neptune.py
@@ -252,13 +252,15 @@ def log_metrics(

Args:
metrics: Dictionary with metric names as keys and measured quantities as values
step: Step number at which the metrics should be recorded, must be strictly increasing
step: Step number at which the metrics should be recorded, currently ignored
"""
assert rank_zero_only.rank == 0, 'experiment tried to log from global_rank != 0'

metrics = self._add_prefix(metrics)
for key, val in metrics.items():
self.log_metric(key, val, step=step)
# `step` is ignored because Neptune expects strictly increasing step values, which
# Lightning does not always guarantee.
self.log_metric(key, val)
@awaelchli (Contributor) commented on lines +261 to +263, Jan 14, 2021:

It normally is, unless the user adds new logs manually, in which case Neptune's internal step may increase while Lightning's step does not.

It's probably the same as in the WandB logger, see #5194.
If more flexibility is needed in the future, we could consider something similar to what was proposed in #5351.

Note: the global step in Lightning should not increase during val or test. It is not a bug.

The PR author (Contributor) replied:

Thanks for this comment - could you just have a look at this related issue (and the specific linked comment) and confirm that this behaviour of PTL is expected? #5130 (comment)

@awaelchli (Contributor) replied on Jan 20, 2021:

In reply to the comment in the link:
Multiple calls to fit will not reset the global step. However, if the user creates a new trainer after fit and then calls fit again on the new trainer, that one will start from global step 0 (but then this would probably also create a new Neptune experiment).

Using the global step for logging is one possibility. I understand that some loggers may want to resume and append new data points with a monotonic step. I think that's perfectly fine. If the usability of Neptune benefits from tracking an internal step separate from the Trainer's global step, then that should be fine in my opinion. However, the user should be careful, because if they do multiple log calls per actual training step, they may run into issues comparing experiments across runs because the step axes don't match up.
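
To make the fresh-Trainer case concrete, here is a minimal sketch against the PL 1.x-era API. The `TinyModel` and `RandomDataset` classes are invented for illustration, and the exact step counts depend on the data and batch size:

```python
import torch
from torch.utils.data import DataLoader, Dataset
from pytorch_lightning import LightningModule, Trainer


class RandomDataset(Dataset):
    """Eight random feature vectors, enough for two batches of four."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return torch.randn(4)


class TinyModel(LightningModule):
    """Smallest possible module: one linear layer, sum of outputs as the loss."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        return self.layer(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        return DataLoader(RandomDataset(), batch_size=4)


# First Trainer: the global step advances once per optimizer step.
trainer = Trainer(max_epochs=1, logger=False, checkpoint_callback=False)
trainer.fit(TinyModel())
print(trainer.global_step)  # e.g. 2 (two batches in one epoch)

# A new Trainer starts counting from 0 again, as described in the comment
# above. With the Neptune logger it would typically also open a new
# experiment, so the step axes of the two runs would not line up.
fresh_trainer = Trainer(max_epochs=1, logger=False, checkpoint_callback=False)
fresh_trainer.fit(TinyModel())
print(fresh_trainer.global_step)  # e.g. 2 again, counted from 0
```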


@rank_zero_only
def finalize(self, status: str) -> None:
@@ -318,7 +320,7 @@ def log_text(self, log_name: str, text: str, step: Optional[int] = None) -> None
text: The value of the log (data-point).
step: Step number at which the metrics should be recorded, must be strictly increasing
"""
self.log_metric(log_name, text, step=step)
self.experiment.log_text(log_name, text, step=step)

@rank_zero_only
def log_image(self,
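For reference, a short usage sketch of the patched logger is below. It assumes the PL 1.x Neptune logger constructor (`offline_mode`, `project_name`) and requires `neptune-client` to be installed, so treat it as illustrative rather than definitive:

```python
from pytorch_lightning.loggers import NeptuneLogger

# Offline mode so no Neptune credentials are needed (assumed parameter name).
logger = NeptuneLogger(offline_mode=True, project_name="dry-run/project")

# After this PR the `step` argument is dropped before the value reaches
# Neptune, which then assigns its own strictly increasing step internally.
logger.log_metrics({"train/loss": 0.42}, step=7)  # step=7 is ignored

# Text logs, by contrast, are now forwarded to the experiment's log_text
# together with the step instead of being routed through log_metric.
logger.log_text("notes", "finished warmup", step=7)
```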
2 changes: 1 addition & 1 deletion tests/loggers/test_all.py
@@ -353,7 +353,7 @@ def test_logger_with_prefix_all(tmpdir, monkeypatch):
with mock.patch('pytorch_lightning.loggers.neptune.neptune'):
logger = _instantiate_logger(NeptuneLogger, save_dir=tmpdir, prefix=prefix)
logger.log_metrics({"test": 1.0}, step=0)
logger.experiment.log_metric.assert_called_once_with("tmp-test", x=0, y=1.0)
logger.experiment.log_metric.assert_called_once_with("tmp-test", 1.0)

# TensorBoard
with mock.patch('pytorch_lightning.loggers.tensorboard.SummaryWriter'):
4 changes: 2 additions & 2 deletions tests/loggers/test_neptune.py
@@ -76,8 +76,8 @@ def test_neptune_additional_methods(neptune):
created_experiment.log_metric.reset_mock()

logger.log_text('test', 'text')
created_experiment.log_metric.assert_called_once_with('test', 'text')
created_experiment.log_metric.reset_mock()
created_experiment.log_text.assert_called_once_with('test', 'text', step=None)
created_experiment.log_text.reset_mock()

logger.log_image('test', 'image file')
created_experiment.log_image.assert_called_once_with('test', 'image file')