-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge Dan's PR of determinize #41
Conversation
k2/csrc/determinize.cc
Outdated
int32_t next_state_id; | ||
bool is_new_state = state_map->GetOutputState(state); | ||
arcs_out->push_back({this->state_id, next_state_id, label}); | ||
arc_weights_out->push_back(arc_weight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this
point to? and what's the next_state_id
, seems we need to change the signature of GetOutputState
to return a pair of <bool, next_state_id>
.
And what's the label
k2/csrc/determinize.cc
Outdated
if (det_state->forward_backward_prob >= prune_cutoff) { | ||
bool is_new_state = state_map->GetOutputState(det_state); | ||
arcs_out->push_back({this->state_id, next_state_id, label}); | ||
arc_weights_out->push_back(arc_weight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, next_state_id
, label
and arc_weight
k2/csrc/determinize.cc
Outdated
arc_derivs_out->clear(); | ||
|
||
bool ans = map.GetOutputState(start_state.get()); | ||
CHECK(ans && ans->state_id == 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, seems we should change the signature of GetOutputState
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think so (add wfsa_in.fsa)
int32_t seq_len = d.seq_len; | ||
const auto &arcs = fsa.arcs; | ||
for (int32_t i = 0; i < seq_len; ++i) { | ||
int32_t symbol = arcs[elem->arc_id].label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as MaxTracebackState
and LogSumTracebackState
has differnt structure, we may need to define differnt functions for them, For LogSum
, how can we compute the hash pair as there are a vector of prev_elements vector<LogSumTracebackLink> prev_elements
, iterate over all elements in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No you can just pick an arbitrary one, the symbol-sequence will be the same no matter which you pick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I just picked the first one.
} | ||
} | ||
|
||
int32_t state_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danpovey , I added a property of state_id
as it seems it is used in many places (but do definition)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK but it needs to be documented. E.g.
// State-id in the output FSA. Only defined after DetStateMap::GetOutputState()
// is called. A DetState that is in the queue should have this defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
int32_t seq_len = d.seq_len; | ||
const auto &arcs = fsa.arcs; | ||
for (int32_t i = 0; i < seq_len; ++i) { | ||
int32_t symbol = arcs[elem->prev_elements[0].arc_index].label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just pick the first element to get the hash
OK, fixed all compiler errors and style issues, ready to merge (so there will be no style issues for following PR from other guys). Definitely there will be bugs (and memory leaks as I noticed there are usages of |
db2ed3b
to
cfd3c06
Compare
Great. Would you mind making sure the code runs? E.g. making some test cases? |
I think we can merge it first to make other PRs pass the style check. I'll make another PR to debug with test cases this weekend, is that OK? |
Sure, thanks! |
Note: this PR is totally from @danpovey https://github.com/danpovey/k2/pull/37, I'm just fixing some compiler errors.
Nearly fixed with 3 group errors left. Please see comments below.