[speechx]add wfst decoder #2886

Merged
merged 1 commit into from
Feb 7, 2023

SmileGoat
Contributor

PR types

New features

PR changes

Others

Describe

add wfst decoder

mergify bot added the Deployment label on Feb 7, 2023
SmileGoat added this to the r1.4.0 milestone on Feb 7, 2023
@@ -45,7 +45,7 @@ class CTCPrefixBeamSearch : public DecoderBase {

void FinalizeSearch();

- const std::shared_ptr<fst::SymbolTable> VocabTable() const {
+ const std::shared_ptr<fst::SymbolTable> WordSymbolTable() const override {
Collaborator

The unit table should reflect the model's output units, which are not necessarily words.

@@ -15,7 +15,7 @@
#include "decoder/ctc_tlg_decoder.h"
namespace ppspeech {

- TLGDecoder::TLGDecoder(TLGDecoderOptions opts) {
+ TLGDecoder::TLGDecoder(TLGDecoderOptions opts) : opts_(opts) {
fst_.reset(fst::Fst<fst::StdArc>::Read(opts.fst_path));
Collaborator

opts -> opts_
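
A minimal sketch of the idea behind this comment, using hypothetical stand-in types (`DemoOptions`, `DemoDecoder`) rather than the real `TLGDecoderOptions` and `TLGDecoder`. Once the constructor stores the options in `opts_`, the body reads the stored member instead of the parameter. The `std::move` is an assumption that is not in the diff; with it, reading the parameter afterwards would be a bug rather than just an inconsistency.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for TLGDecoderOptions; only fst_path appears in the diff.
struct DemoOptions {
    std::string fst_path;
};

// Hypothetical stand-in for TLGDecoder.
class DemoDecoder {
  public:
    // Store the options through the member initializer list.
    explicit DemoDecoder(DemoOptions opts) : opts_(std::move(opts)) {
        // Read from the member (opts_), not from the moved-from parameter.
        loaded_path_ = opts_.fst_path;
    }

    const std::string& LoadedPath() const { return loaded_path_; }
    const DemoOptions& Options() const { return opts_; }

  private:
    DemoOptions opts_;
    std::string loaded_path_;
};
```

Constructing `DemoDecoder(DemoOptions{"TLG.fst"})` leaves both `Options().fst_path` and `LoadedPath()` equal to the path that was passed in.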

LOG(INFO) << "chunk size (frame): " << chunk_size;
LOG(INFO) << "chunk stride (frame): " << chunk_stride;
LOG(INFO) << "receptive field (frame): " << receptive_field_length;
new ppspeech::Decodable(nnet_producer, FLAGS_acoustic_scale));
Collaborator

make_shared

speechx/speechx/asr/decoder/ctc_tlg_decoder.h (resolved)
speechx/speechx/asr/nnet/nnet_producer.cc (resolved)
- decoder_.reset(new CTCPrefixBeamSearch(
- resource.vocab_path, resource.decoder_opts.ctc_prefix_search_opts));
+ if (resource.decoder_opts.tlg_decoder_opts.fst_path == "") {
+ decoder_.reset(new CTCPrefixBeamSearch(
Collaborator

make_shared
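
A rough sketch of what the `make_shared` suggestion could look like for the decoder selection. The types below (`DemoDecoderBase`, `DemoPrefixBeamSearch`, `DemoTlgDecoder`) and the `MakeDecoder` helper are hypothetical stand-ins, not the classes in this PR. The else branch that builds a TLG decoder is also an assumption, since the excerpt only shows the `fst_path == ""` branch.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for DecoderBase.
struct DemoDecoderBase {
    virtual ~DemoDecoderBase() = default;
    virtual std::string Name() const = 0;
};

// Hypothetical stand-in for CTCPrefixBeamSearch.
struct DemoPrefixBeamSearch : DemoDecoderBase {
    explicit DemoPrefixBeamSearch(std::string vocab_path)
        : vocab_path_(std::move(vocab_path)) {}
    std::string Name() const override { return "ctc_prefix_beam_search"; }
    std::string vocab_path_;
};

// Hypothetical stand-in for TLGDecoder.
struct DemoTlgDecoder : DemoDecoderBase {
    explicit DemoTlgDecoder(std::string fst_path)
        : fst_path_(std::move(fst_path)) {}
    std::string Name() const override { return "tlg"; }
    std::string fst_path_;
};

// An empty fst_path selects the prefix beam search; otherwise (assumed here)
// the WFST-based decoder is used. make_shared replaces reset(new ...).
std::shared_ptr<DemoDecoderBase> MakeDecoder(const std::string& vocab_path,
                                             const std::string& fst_path) {
    if (fst_path.empty()) {
        return std::make_shared<DemoPrefixBeamSearch>(vocab_path);
    }
    return std::make_shared<DemoTlgDecoder>(fst_path);
}
```

Compared with `reset(new T(...))`, `std::make_shared` performs a single allocation for the object and its control block and avoids a naked `new`; if the member stays a `std::unique_ptr`, `std::make_unique` plays the same role.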


unit_table_ = decoder_->VocabTable();
Collaborator

unit_table refers to the model's vocab, while symbol_table should refer to the table for the FST's olabels (output labels), right?

For English, these two must not be mixed up.

Contributor Author

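Not quite. Both the input and the output labels of an FST can map to a symbol table; the main point is that "vocab" is not a good name. The unit_table (the modeling units) can be seen as part of the symbol table. For Chinese, characters and words are somewhat conflated, but that is not a problem here.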
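
To make the distinction in this thread concrete, here is a small illustrative sketch. `DemoTable` is a hypothetical id-to-symbol map standing in for `fst::SymbolTable`, and the table contents are made up: one table holds modeling units (sub-word pieces), the other holds word-level output labels. Both are symbol tables in the general sense, which is the author's point, while the reviewer's point is that the two id spaces are not interchangeable.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical id-to-symbol table; a stand-in for fst::SymbolTable.
using DemoTable = std::unordered_map<int, std::string>;

// Join the symbols for a sequence of label ids; unknown ids become "<unk>".
std::string IdsToText(const DemoTable& table, const std::vector<int>& ids,
                      const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (i > 0) out += sep;
        auto it = table.find(ids[i]);
        out += (it != table.end()) ? it->second : std::string("<unk>");
    }
    return out;
}

// Modeling units emitted by the acoustic model (made-up sub-word pieces).
DemoTable MakeUnitTable() { return {{1, "hel"}, {2, "lo"}, {3, "world"}}; }

// Output labels of a decoding graph (made-up whole words).
DemoTable MakeWordTable() { return {{1, "hello"}, {2, "world"}}; }
```

The same id means different things in each table: id 2 is the piece "lo" in the unit table but the word "world" in the word table, so a result decoded against one table must not be mapped through the other.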

@@ -214,7 +218,7 @@ void U2Recognizer::UpdateResult(bool finish) {

void U2Recognizer::AttentionRescoring() {
decoder_->FinalizeSearch();
- UpdateResult(true);
+ UpdateResult(false);
Collaborator

The recognition result should be updated after the second pass.

Contributor Author

This only skips updating the timeline; that will be added in a follow-up, since the timeline has not been corrected yet.

@@ -154,10 +165,9 @@ class U2Recognizer {

std::shared_ptr<NnetProducer> nnet_producer_;
std::shared_ptr<Decodable> decodable_;
- std::unique_ptr<CTCPrefixBeamSearch> decoder_;
+ std::unique_ptr<DecoderBase> decoder_;
Collaborator

Let's replace all of these with shared_ptr.

Contributor Author

This pointer is not shared; during the refactor, pointers like this will all become unique_ptr.
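
For context on the ownership argument, a minimal sketch with hypothetical types (`SketchDecoder`, `SketchWfstDecoder`, `SketchRecognizer`), not the classes in this PR. A `std::unique_ptr` to the base class is enough when the recognizer is the only owner, and it can still be handed over to a `std::shared_ptr` later if sharing ever becomes necessary. The `ReleaseAsShared` method exists only to illustrate that one-way conversion.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for DecoderBase.
struct SketchDecoder {
    virtual ~SketchDecoder() = default;
    virtual int Id() const { return 0; }
};

// Hypothetical stand-in for a derived decoder.
struct SketchWfstDecoder : SketchDecoder {
    int Id() const override { return 1; }
};

// Hypothetical stand-in for the recognizer: sole owner of its decoder.
class SketchRecognizer {
  public:
    explicit SketchRecognizer(std::unique_ptr<SketchDecoder> decoder)
        : decoder_(std::move(decoder)) {}

    int DecoderId() const { return decoder_->Id(); }
    bool HasDecoder() const { return decoder_ != nullptr; }

    // Give up ownership. unique_ptr converts to shared_ptr, never the reverse.
    std::shared_ptr<SketchDecoder> ReleaseAsShared() {
        return std::move(decoder_);
    }

  private:
    std::unique_ptr<SketchDecoder> decoder_;
};
```

Starting from `unique_ptr` keeps the stricter ownership model without closing the door on sharing, whereas going from `shared_ptr` back to a single owner is not possible.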


// e2e unit symbol table
std::shared_ptr<fst::SymbolTable> unit_table_ = nullptr;
Collaborator

This needs to be kept.

SmileGoat merged commit 21183d4 into PaddlePaddle:speechx on Feb 7, 2023
Projects
Status: Done
2 participants