EarlyStopping not working / wrong keys in log #3338
Comments
Hi! Thanks for your contribution! Great first issue!
Hi @undertherain, currently the

    # Assumes `import pytorch_lightning as pl` and `import torch.nn.functional as F`.
    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.l1_loss(y_hat, y)
        # changes here: pass the loss as early_stop_on as well as checkpoint_on
        result = pl.EvalResult(checkpoint_on=loss, early_stop_on=loss)
        result.log("val_loss", loss, sync_dist=True)
        return result

Yes,
API docs: https://pytorch-lightning.readthedocs.io/en/latest/results.html#evalresult-api
Aha, I saw
I guess we can close this if it is considered "not a bug, but a feature" :)
Glad to hear that. We also have some discussion going on in #3286.
OK to close it? You found a workaround, and the ignored monitor is a known issue that is being discussed in the other issue, as pointed out by @ydcjeff.
🐛 Bug
I’m trying to implement EarlyStopping so that training stops when the validation loss stops decreasing. I add the callback as follows:
This does not work: the _validate_condition_metric function returns False.
When I checked what is in the log dictionary, the values looked like
{'val_early_stop_on': None, 'val_checkpoint_on': tensor(0.5601, device='cuda:0')}
which is slightly confusing. Where does “val_checkpoint_on” come from, and why is it not called “val_loss”?
It feels like it might be connected to the line
result = pl.EvalResult(checkpoint_on=loss)
I was reading the documentation, but frankly I found
checkpoint_on (Union[Tensor, bool, None]) – Metric to checkpoint on.
not very intuitive. What does it mean for a metric to be checkpointed on? And is that really why the keys in the log are renamed in this strange way?
Code sample
https://github.com/matsuokalab/cosmoflow/blob/ac75fe317f8daf3444c96b837bb109064aa81dab/main.py
Expected behavior
I expect EarlyStopping to work and the log to contain a val_loss key.
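To make the failure concrete, here is a minimal pure-Python sketch of the kind of check that would reject the log dictionary reported above. It is an illustration only: the function name can_monitor is hypothetical and this is not PyTorch Lightning's actual implementation of _validate_condition_metric. It simply assumes a monitored metric is usable only when its key is present and its value is not None. A plain float stands in for the CUDA tensor.

```python
from typing import Any, Mapping


def can_monitor(logs: Mapping[str, Any], monitor: str) -> bool:
    """Hypothetical check: the metric must be present and not None."""
    return logs.get(monitor) is not None


# Keys as reported in this issue; 0.5601 replaces tensor(0.5601, device='cuda:0').
logs = {"val_early_stop_on": None, "val_checkpoint_on": 0.5601}

print(can_monitor(logs, "val_loss"))           # key is absent from the logs
print(can_monitor(logs, "val_early_stop_on"))  # key is present but its value is None
print(can_monitor(logs, "val_checkpoint_on"))  # key is present with a value
```

Under that assumption, neither "val_loss" nor "val_early_stop_on" passes with the reported dictionary, which would be consistent with early stopping never triggering.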
Environment