
Warn different size inputs in loss functions #1522

Closed
collodi opened this issue Mar 1, 2021 · 7 comments · Fixed by #1636


@collodi commented Mar 1, 2021

Say I have a model that outputs a scalar, and the batch size is 5.
m(x) gives me an array of size (1, 5), while y is of size (5,).
If I pass them to Flux.Losses.mae, the value is different from what I expect due to broadcasting.

In PyTorch, I get a warning message.

```
dense.py:43: UserWarning: Using a target size (torch.Size([5])) that is different to the input size (torch.Size([5, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
```

It would be nice to have a warning message in Flux so I know I'm doing something wrong.
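
For concreteness, here is a minimal reproduction of the shapes above (values chosen arbitrarily):

```julia
using Flux, Statistics

y = [1.0, 2.0, 3.0, 4.0, 5.0]   # targets, size (5,)
ŷ = reshape(y, 1, :)            # model output, size (1, 5), same values

# ŷ .- y broadcasts to a 5×5 matrix, so the mean runs over 25
# elements instead of 5; no error is thrown, just a wrong value.
Flux.Losses.mae(ŷ, y)           # nonzero, even though ŷ and y agree
mean(abs.(ŷ .- y))              # the same silently-broadcast computation
```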

@ToucheSir (Member)

It's a good idea, but we'd have to define what range of sizes and dimensionalities is valid for each loss function. I can't think of any scenario where ndims(ŷ) != ndims(y) would be intended, but we've been bitten in the past by being too restrictive around dims.

@DhairyaLGandhi (Member)

Flux models (conv, pooling, batchnorm, etc.) can already take variable-sized inputs, and the loss functions shouldn't alter that. The dimension-mismatch error that would occur in most cases is probably signal enough, and it may not be worth injecting these messages into Flux for the more specific cases.

Re this MWE: outer products can be handled by broadcasting, but keeping things general would be a bigger design win, especially for non-array uses.
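
For example, this kind of intentional mixed-dimensionality broadcasting would be ruled out by a strict size check (a minimal illustration, not from the MWE):

```julia
# A column vector broadcast against a row vector yields their outer
# product; a "sizes must match" rule would forbid patterns like this.
a = [1.0, 2.0, 3.0]   # size (3,)
b = [10.0 20.0]       # size (1, 2)
a .* b                # 3×2 outer product
```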

@ToucheSir (Member)

@collodi is referring to the case where a dimension mismatch error doesn't occur and Julia silently promotes the rank of one argument to match the other. There are certainly valid use cases for that, but all of the loss examples we have in the docs use inputs with a uniform number of dimensions. The question is whether we can get away with defining loss functions like mae(ŷ::AbstractArray{<:Any,N}, y::AbstractArray{<:Any,N}; ...) where {N} = .... I'm sure there are exceptions I'm missing, but thus far none have come to mind.
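
A minimal sketch of that dispatch-level restriction, assuming the intent is to match dimensionality (in AbstractArray{T,N}, the second parameter N is the number of dimensions); mae_strict is a hypothetical name, not a proposed patch:

```julia
using Statistics

# Both arguments must share dimensionality N, whatever their eltypes.
mae_strict(ŷ::AbstractArray{<:Any,N}, y::AbstractArray{<:Any,N}) where {N} =
    mean(abs.(ŷ .- y))

mae_strict(rand(5), rand(5))        # ok: both 1-dimensional
# mae_strict(rand(1, 5), rand(5))   # MethodError: ndims differ
```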

@DhairyaLGandhi (Member)

I addressed that in the "Re" above :)

@ToucheSir (Member)

What non-array uses do we foresee, though? I would argue scalar-scalar loss calculations are wholly out of scope, so maybe array-scalar? Do we have any strategy other than documentation to prevent the silently incorrect behaviour demonstrated up top?

@collodi (Author) commented Jun 30, 2021

Would it be a bad idea to simply warn the user instead of restricting the loss function inputs? That way, people who know what they're doing can simply ignore the message.
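
Something like this minimal sketch, where _size_check and mae_warned are hypothetical names, not an actual Flux API:

```julia
using Statistics

# Warn (once) on mismatched sizes, then proceed as before.
function _size_check(ŷ, y)
    if size(ŷ) != size(y)
        @warn "Loss inputs have mismatched sizes: $(size(ŷ)) vs $(size(y)); broadcasting may give unexpected results." maxlog=1
    end
    return nothing
end

function mae_warned(ŷ, y)
    _size_check(ŷ, y)
    mean(abs.(ŷ .- y))
end
```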

@darsnack linked a pull request on Jun 30, 2021 that will close this issue
@darsnack (Member)

The linked PR should address what you want
