fix a bug in k2.ragged.normalize_scores. #563
Conversation
If k2.ragged.RaggedFloat is constructed from a shape and a value, its scores should be set to the given value to make autograd work.
There already exists such an assignment (line 887 in 75f9378), so I am removing the one introduced a few days ago.
Looks OK to me.
k2/python/k2/ragged/tensor.py
Outdated
```python
elif isinstance(ragged, _k2.RaggedShape):
    assert values is not None
    ragged = _k2.RaggedFloat(ragged, values)

assert isinstance(ragged, _k2.RaggedFloat)

self.ragged = ragged
self._scores = ragged.values()
if values is not None:
    self._scores = values
```
Wait, why does RaggedFloat have a _scores member?
... it is supposed to be a generic interface, not tied into class Fsa.
> Wait, why does RaggedFloat have a _scores member?

While doing `normalize_scores`, if we use `_k2.RaggedFloat` directly it is not possible to register the operation in the autograd graph since there is no `torch.Tensor`. `k2.ragged.RaggedFloat` is a wrapper around `_k2.RaggedFloat`, and the field `_scores` exists to make autograd work.
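The idea can be sketched as follows. This is a hypothetical, simplified stand-in for what a wrapper like `k2.ragged.RaggedFloat` does; the class name and members are illustrative, not the actual k2 API:

```python
import torch


class RaggedFloatSketch:
    """Hypothetical wrapper that keeps its values as a torch.Tensor.

    Holding the values in a tensor attribute lets operations on the
    ragged data participate in PyTorch's autograd graph.
    """

    def __init__(self, values: torch.Tensor):
        # Keep the caller's tensor so gradients flow back to it.
        self._values = values

    @property
    def values(self) -> torch.Tensor:
        return self._values


# Gradients flow through the wrapper because `_values` is a torch.Tensor.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
r = RaggedFloatSketch(x)
r.values.sum().backward()
```

After `backward()`, `x.grad` is populated, which is exactly what a C++-only ragged object cannot provide.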
k2/python/k2/ragged/autograd.py, lines 15 to 16 in 75f9378:

```python
def forward(ctx, src: RaggedFloat, out: List[RaggedFloat],
            unused_scores: torch.Tensor) -> torch.Tensor:
```
For the autograd to work, we need to pass a `torch.Tensor` to it. One alternative is to let the user pass a tensor when calling `normalize_scores`.
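A minimal sketch of such a `torch.autograd.Function` may help. This is hypothetical and much simpler than k2's op: it normalizes a flat tensor with log-softmax as a stand-in for the per-sublist normalization, but it shows why a `torch.Tensor` argument is what lets the graph be built:

```python
import torch


class _NormalizeSketch(torch.autograd.Function):
    """Hypothetical stand-in for k2's normalization autograd op."""

    @staticmethod
    def forward(ctx, scores: torch.Tensor) -> torch.Tensor:
        # log-softmax over the whole tensor; k2 would do this per sublist.
        out = scores.log_softmax(dim=0)
        ctx.save_for_backward(out)
        return out

    @staticmethod
    def backward(ctx, grad_out: torch.Tensor) -> torch.Tensor:
        (out,) = ctx.saved_tensors
        # Gradient of log-softmax applied to grad: g - softmax * sum(g).
        return grad_out - out.exp() * grad_out.sum()


x = torch.tensor([0.5, 1.5, -0.5], requires_grad=True)
y = _NormalizeSketch.apply(x)
y.sum().backward()
```

Because `x` is a `torch.Tensor`, `apply` records the op and `x.grad` is filled in by `backward()`.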
> ... it is supposed to be a generic interface, not tied into class Fsa.

The class `RaggedFloat` is not tied to class `Fsa`; it is a wrapper around `_k2.RaggedFloat`.
I think the name _values would be better than _scores; we don't want to imply the values have any specific meaning.
will change it.
And please make sure the documentation of class RaggedFloat is as self-contained as possible and explains anything that needs to be explained about autograd... (it may already be; just a reminder).
I am adding more documentation for it.
Ready for review.
k2/python/k2/fsa.py
Outdated
```diff
-def set_scores_stochastic_(self) -> None:
-    '''Set `scores` to random numbers.
+def set_scores_stochastic_(self, scores) -> None:
+    '''Normalize the given `scores` and assign it to self.
```
I'd write this as:

```
Normalize the given `scores` and assign the result to `self.scores`.

Args:
  scores:
    Tensor of scores of dtype torch.float32, and shape equal to
    `self.scores.shape` (one axis). Will be normalized so the sum,
    after exponentiating, of the scores leaving each state that has
    at least one arc leaving it is 1.
```
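The normalization described above is a log-softmax within each sublist of scores. A plain-PyTorch sketch, assuming a `row_splits` list marking sublist boundaries in the style of k2's ragged layout (the function name and signature are illustrative):

```python
import torch


def normalize_per_sublist(scores: torch.Tensor, row_splits: list) -> torch.Tensor:
    """Apply log-softmax within each sublist [row_splits[i], row_splits[i+1])."""
    out = scores.clone()
    for i in range(len(row_splits) - 1):
        s, e = row_splits[i], row_splits[i + 1]
        if e > s:  # skip empty sublists (states with no leaving arcs)
            out[s:e] = scores[s:e].log_softmax(dim=0)
    return out


# Two non-empty sublists ([0, 3) and [3, 5)) and one empty sublist.
scores = torch.tensor([0.0, 1.0, 2.0, -1.0, 3.0])
normalized = normalize_per_sublist(scores, [0, 3, 3, 5])
```

After normalization, exponentiating the scores within each non-empty sublist sums to 1, which is the property the docstring describes.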
thanks, changed.
```python
ans.scores = self.scores.detach()
return ans
# Note we use `to` here since `scores` and `self.scores` may not
# be on the same device.
```
I find it odd that scores and self.scores might not be on the same device, surely we should ensure that they always are?
It happens in our current ASG training:

- `model.P_scores` is on CUDA
- `P` is on CPU since `k2.intersect` works only on CPU

So in this case `model.P_scores` and `P.scores` are on different devices.
> surely we should ensure that they always are?

I assume that log-sum per sublist runs faster on CUDA; otherwise, we can require that `scores.device == self.device`.
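The device-tolerant assignment under discussion can be sketched like this (illustrative names, not the actual fsa.py code). `Tensor.to` is a no-op when the tensor already lives on the target device, so the copy only happens on a mismatch:

```python
import torch


def assign_scores(fsa_scores: torch.Tensor, new_scores: torch.Tensor) -> torch.Tensor:
    # `to` is a no-op if new_scores is already on fsa_scores.device;
    # otherwise it copies across devices (e.g. CUDA -> CPU).
    return new_scores.to(fsa_scores.device)


cpu_scores = torch.zeros(4)
result = assign_scores(cpu_scores, torch.ones(4))
```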
Merging. If there are any objections, please leave your comments here and I will resolve them.