
fix a bug in k2.ragged.normalize_scores. #563

Merged 8 commits on Jan 4, 2021

Conversation

csukuangfj
Collaborator

If k2.ragged.RaggedFloat is constructed from a shape and a value,
its scores should be set to the given value to make autograd work.

@csukuangfj
Collaborator Author

There already exists k2.Fsa.detach()

    def detach(self) -> 'Fsa':

so I am removing the one introduced a few days ago.

@danpovey
Collaborator

danpovey commented Jan 1, 2021

looks OK to me..

    elif isinstance(ragged, _k2.RaggedShape):
        assert values is not None
        ragged = _k2.RaggedFloat(ragged, values)

    assert isinstance(ragged, _k2.RaggedFloat)

    self.ragged = ragged
    self._scores = ragged.values()
    if values is not None:
        self._scores = values
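The effect of this fix can be sketched in plain Python, using a stand-in for the C++ `_k2.RaggedFloat` (the class `_FakeRaggedFloat` below is illustrative, not k2's actual API):

```python
import torch


class _FakeRaggedFloat:
    """Stand-in for _k2.RaggedFloat: the C++ side can only hand back
    a tensor with no autograd history."""

    def __init__(self, shape, values: torch.Tensor):
        self.shape = shape
        self._v = values.detach()  # the autograd link is lost here

    def values(self) -> torch.Tensor:
        return self._v


class RaggedFloat:
    def __init__(self, ragged, values: torch.Tensor = None):
        if not isinstance(ragged, _FakeRaggedFloat):
            assert values is not None
            ragged = _FakeRaggedFloat(ragged, values)
        self.ragged = ragged
        self._scores = ragged.values()
        if values is not None:
            # The fix: keep the caller's tensor, which still carries
            # requires_grad and its autograd history.
            self._scores = values
```

With the fix, `RaggedFloat(shape, v)._scores is v`, so gradients flowing into `_scores` reach `v`; without the final two lines they would stop at the detached copy returned by `values()`.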
Collaborator

Wait, why does RaggedFloat have a _scores member?

Collaborator

... it is supposed to be a generic interface, not tied into class Fsa.

Collaborator Author
@csukuangfj Jan 1, 2021

Wait, why does RaggedFloat have a _scores member?

While doing normalize_scores, if we used _k2.RaggedFloat directly it would not be possible to register the operation in the autograd graph, since there is no torch.Tensor involved.

k2.ragged.RaggedFloat is a wrapper around _k2.RaggedFloat, and the field _scores is what makes autograd work.

    def forward(ctx, src: RaggedFloat, out: List[RaggedFloat],
                unused_scores: torch.Tensor) -> torch.Tensor:

For autograd to work, we need to pass a torch.Tensor to it. One alternative would be to let the user pass a tensor when calling normalize_scores.
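The pattern being described, a custom autograd Function whose differentiable argument must be a plain torch.Tensor, can be sketched in pure PyTorch. This is not k2's implementation: `NormalizeScores` and the Python-list `row_splits` are stand-ins, and the forward is a per-sublist log-softmax:

```python
import torch


class NormalizeScores(torch.autograd.Function):
    """Per-sublist log-softmax written as a custom Function.

    The torch.Tensor argument is what lets autograd record the
    operation; a ragged object alone carries no autograd history.
    """

    @staticmethod
    def forward(ctx, scores: torch.Tensor, row_splits) -> torch.Tensor:
        out = torch.empty_like(scores)
        for i in range(len(row_splits) - 1):
            s, e = row_splits[i], row_splits[i + 1]
            if s < e:
                # Subtract the sublist's log-sum-exp so the
                # exponentiated scores of the sublist sum to 1.
                out[s:e] = scores[s:e] - scores[s:e].logsumexp(dim=0)
        ctx.save_for_backward(out)
        ctx.row_splits = row_splits
        return out

    @staticmethod
    def backward(ctx, grad_out: torch.Tensor):
        (out,) = ctx.saved_tensors
        grad_in = grad_out.clone()
        for i in range(len(ctx.row_splits) - 1):
            s, e = ctx.row_splits[i], ctx.row_splits[i + 1]
            if s < e:
                # Gradient of log-softmax:
                # g_j = grad_out_j - softmax_j * sum(grad_out)
                grad_in[s:e] -= out[s:e].exp() * grad_out[s:e].sum()
        return grad_in, None  # no gradient for row_splits
```

Calling `NormalizeScores.apply(scores, row_splits)` with a `requires_grad` tensor records the op in the graph, which is exactly why the wrapper class keeps a torch.Tensor member.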

Collaborator Author

... it is supposed to be a generic interface, not tied into class Fsa.

The class RaggedFloat is not tied to class Fsa; it is a wrapper around _k2.RaggedFloat.

Collaborator

I think the name _values would be better than _scores; we don't want to imply the values have any specific meaning.

Collaborator Author

will change it.

Collaborator

And please make sure the documentation of class RaggedFloat is as self-contained as possible and explains anything that needs to be explained about autograd... (it may already be; just a reminder).

Collaborator Author

I am adding more documentation for it.

@csukuangfj
Collaborator Author

Ready for review.

Before:

    def set_scores_stochastic_(self) -> None:
        '''Set `scores` to random numbers.

After:

    def set_scores_stochastic_(self, scores) -> None:
        '''Normalize the given `scores` and assign it to self.
Collaborator

I'd write this as:

    Normalize the given scores and assign the result to self.scores.

    Args:
      scores:
        Tensor of scores of dtype torch.float32, and shape equal to
        self.scores.shape (one axis). Will be normalized so the sum,
        after exponentiating, of the scores leaving each state that has
        at least one arc leaving it is 1.
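The post-condition in this suggested docstring can be checked with a small pure-PyTorch sketch. The 2-state example and the grouping by a `state` index tensor are made up for illustration:

```python
import torch

# Hypothetical arcs: `state` is the source state of each arc,
# `scores` the corresponding arc scores (one axis).
state = torch.tensor([0, 0, 1, 1, 1])
scores = torch.tensor([0.5, -1.0, 2.0, 0.0, 1.0])

# Per-state normalizer: log of the summed exponentials of the scores
# leaving each state.  (A real implementation would use a numerically
# stable segment log-sum-exp rather than exp() followed by log().)
num_states = 2
denom = torch.zeros(num_states).index_add_(0, state, scores.exp()).log()
normalized = scores - denom[state]
```

After this, `normalized[state == s].exp().sum()` is 1 for every state `s` with at least one outgoing arc, which is the post-condition the docstring states.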

Collaborator Author

thanks, changed.


    ans.scores = self.scores.detach()
    return ans

    # Note we use `to` here since `scores` and `self.scores` may not
    # be on the same device.
Collaborator

I find it odd that scores and self.scores might not be on the same device, surely we should ensure that they always are?

Collaborator Author

It happens in our current ASG training:

  • model.P_scores is on CUDA
  • P is on CPU since k2.intersect works only on CPU

So in this case model.P_scores and P.scores are on different devices.

Collaborator Author

surely we should ensure that they always are?

I assume that the log-sum per sublist runs faster on CUDA; otherwise, we could require that scores.device == self.device.
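The device-handling choice under discussion can be sketched as follows. The helper name is hypothetical, and a single whole-tensor normalization stands in for the per-sublist one; the point is that `to` makes the assignment a no-op when devices already match:

```python
import torch


def normalize_and_assign(self_scores: torch.Tensor,
                         scores: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: normalize `scores` on whatever device it
    lives on (e.g. CUDA, where the log-sum is assumed to be faster),
    then move the result to the destination's device and dtype."""
    # Single-sublist demo of the normalization.
    normalized = scores - scores.logsumexp(dim=0)
    # `to` is a no-op when device and dtype already match, so this is
    # safe in both the same-device and cross-device (CUDA -> CPU) cases.
    return normalized.to(device=self_scores.device, dtype=self_scores.dtype)
```

This mirrors the ASG situation above: `model.P_scores` on CUDA, `P.scores` on CPU, with `to` bridging the two.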

@csukuangfj
Copy link
Collaborator Author

Merging. If there are any objections, please leave your comments here and I will resolve them.

@csukuangfj csukuangfj merged commit 4c629e1 into k2-fsa:master Jan 4, 2021
@csukuangfj csukuangfj deleted the fangjun-fix-normalize-scores branch January 12, 2021 10:56