check shape #859

Closed
MordehayM opened this issue Feb 24, 2022 · 9 comments · Fixed by #864
Labels: question (Further information is requested)

Comments

@MordehayM

https://github.com/PyTorchLightning/metrics/blob/21ba6502418f537a7ca3618be0b19f617f83a062/torchmetrics/functional/audio/pit.py#L148
Hi,
I don't understand why the target shape and the pred shape must be equal.
This question arises because the loss can be categorical cross-entropy with multiple outputs (one per speaker, for instance), in which case this constraint does not hold (see categorical cross-entropy in PyTorch).
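For illustration, a minimal sketch of the shapes involved when each speaker's output is scored with categorical cross-entropy; the sizes B, num_speaker, C, and T below are hypothetical and only meant to show that pred and target cannot have equal shapes in this setting:

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: batch, speakers, classes, frames
B, num_speaker, C, T = 4, 2, 10, 100

preds = torch.randn(B, num_speaker, C, T)           # [B, num_speaker, C, T]
target = torch.randint(0, C, (B, num_speaker, T))   # [B, num_speaker, T]

# F.cross_entropy expects the class dim right after the batch dim,
# so the speaker dim is folded into the batch dim here.
loss = F.cross_entropy(
    preds.reshape(B * num_speaker, C, T),
    target.reshape(B * num_speaker, T),
)

print(preds.shape, target.shape, loss.item())  # shapes differ, loss is still well-defined
```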

@github-actions

Hi! Thanks for your contribution, great first issue!

@SkafteNicki
Member

cc: @quancs

SkafteNicki added the question (Further information is requested) label on Feb 25, 2022
@quancs
Member

quancs commented Feb 25, 2022

Hi, you are right, this constraint is not designed correctly for all possible use cases. Could you give some example input and output shapes for your use case?

@quancs
Member

quancs commented Feb 27, 2022

I don't understand why the target shape and pred shape must be equal?

  1. The batch dim of pred and target should be the same by nature.
  2. The speaker dim should be the same, as required by PIT.
  3. For speech separation, metrics like SDR, SI-SDR, and PESQ require pred and target to have the same shape along the time dim.

I guess it is (3) that cannot be applied to other audio sub-domains. Is that right? @MordehayM
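As a point of reference, here is a minimal sketch of the speech-separation case that the current check targets, where pred and target share all three dims; the import path and function names follow the torchmetrics functional audio API and may differ slightly between versions:

```python
import torch
from torchmetrics.functional.audio import (
    permutation_invariant_training,
    scale_invariant_signal_distortion_ratio,
)

B, num_speaker, T = 4, 2, 16000  # batch, speakers, time samples
preds = torch.randn(B, num_speaker, T)
target = torch.randn(B, num_speaker, T)

# SI-SDR compares pred and target sample by sample, so pred and target
# must also have the same shape along the time dim.
best_metric, best_perm = permutation_invariant_training(
    preds, target, scale_invariant_signal_distortion_ratio, eval_func="max"
)
print(best_metric.shape, best_perm.shape)  # torch.Size([4]), torch.Size([4, 2])
```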

@MordehayM
Author

Yes, exactly.
When the loss is categorical cross-entropy, target and pred do not have the same shape: the target's shape is [B, num_speaker, d1..dk], while the pred's shape is [B, num_speaker, C, d1..dk], where C is the number of categories.

@quancs
Member

quancs commented Feb 27, 2022

Yes, exactly. When the loss is categorical cross-entropy, target and pred do not have the same shape: the target's shape is [B, num_speaker, d1..dk], while the pred's shape is [B, num_speaker, C, d1..dk], where C is the number of categories.

@MordehayM So, it works for your case if we just check the first two dimensions, batch and speaker?
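For illustration only, a relaxed check along those lines might look like the following; this is a sketch of the proposal in this comment, not necessarily the check that ended up in #864:

```python
from torch import Tensor


def _check_batch_and_speaker_dims(preds: Tensor, target: Tensor) -> None:
    """Hypothetical relaxed validation: only batch and speaker dims must match."""
    if preds.ndim < 2 or target.ndim < 2:
        raise ValueError("preds and target must be at least 2D: [batch, spk, ...]")
    if preds.shape[:2] != target.shape[:2]:
        raise ValueError(
            f"preds and target should have the same batch and speaker dims, "
            f"but got {tuple(preds.shape[:2])} and {tuple(target.shape[:2])}"
        )
```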

@MordehayM
Author

Sorry, but I didn't check that. I chose another implementation of PIT.
In my opinion it should work.

@quancs
Member

quancs commented Feb 27, 2022

In my opinion it should work.

Thanks for your opinion. I will fix this problem.

Sorry, but I didn't check that. I chose another implementation of PIT.

Just a little bit curious, could you tell me which one? If we find it's faster, we could implement it in TorchMetrics ^^

@MordehayM
Author

This one:
https://github.com/asteroid-team/pytorch-pit/blob/master/torch_pit/pit_wrapper.py
