Add Wav2Vec2BertProcessorWithLM #30671

Closed
FredHaa opened this issue May 6, 2024 · 3 comments
Labels
Audio, Feature request (Request for a new feature)

Comments

@FredHaa

FredHaa commented May 6, 2024

Feature request

Wav2Vec2-Bert was open-sourced and integrated into Transformers at the end of last year. However, it lacks an easy integration with pyctcdecode comparable to Wav2Vec2ProcessorWithLM. This should be fairly straightforward to implement, since Wav2Vec2BertProcessor is very similar to Wav2Vec2Processor; the only difference is that they use different feature extractors.

Motivation

Having a Wav2Vec2BertProcessorWithLM class would make it possible to use Wav2Vec2-Bert with a KenLM language model in a Transformers ASR pipeline.

Your contribution

I can submit a PR.

@FredHaa changed the title from "Wav2Vec2BertProcessorWithLM" to "Add Wav2Vec2BertProcessorWithLM" on May 6, 2024
@LysandreJik
Member

cc @sanchit-gandhi @ylacombe

@amyeroberts added the "Feature request" and "Audio" labels on May 7, 2024
@ylacombe
Contributor

Hey @FredHaa, #28706 should fix this, so I'm reopening it! Note that you would have to use Wav2Vec2ProcessorWithLM, not Wav2Vec2BertProcessorWithLM!
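
To illustrate why one LM-aware processor class can serve both models when only the feature extractor differs, here is a toy sketch in plain Python. Every name in it (`ToyProcessorWithLM`, `raw_waveform_extractor`, `framed_extractor`, `greedy_ctc_decoder`, `VOCAB`) is a hypothetical stand-in and not part of Transformers or pyctcdecode. The greedy CTC decoder is swapped in for pyctcdecode's LM-backed beam search only to keep the example dependency-free, and this is not a description of how #28706 is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical toy vocabulary; index 0 plays the role of the CTC blank token.
VOCAB = ["<pad>", "a", "b"]


@dataclass
class ToyProcessorWithLM:
    """Bundles a feature extractor with a decoder.

    The wrapper never inspects the extractor's type, so the same class
    works with either extractor below.
    """
    feature_extractor: Callable[[Sequence[float]], List[List[float]]]
    decoder: Callable[[List[List[float]]], str]

    def __call__(self, audio: Sequence[float]) -> List[List[float]]:
        return self.feature_extractor(audio)

    def decode(self, logits: List[List[float]]) -> str:
        return self.decoder(logits)


def raw_waveform_extractor(audio: Sequence[float]) -> List[List[float]]:
    """Stand-in for a waveform-style extractor: peak-normalize, one sample per frame."""
    peak = max(abs(x) for x in audio) or 1.0  # avoid dividing by zero on silence
    return [[x / peak] for x in audio]


def framed_extractor(audio: Sequence[float], frame: int = 2) -> List[List[float]]:
    """Stand-in for a frame-based extractor: group consecutive samples into frames."""
    return [list(audio[i:i + frame]) for i in range(0, len(audio), frame)]


def greedy_ctc_decoder(logits: List[List[float]]) -> str:
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    ids = [max(range(len(row)), key=row.__getitem__) for row in logits]
    out, prev = [], None
    for i in ids:
        if i != prev and i != 0:
            out.append(VOCAB[i])
        prev = i
    return "".join(out)


# The same wrapper class is reused; only the feature extractor changes.
audio = [0.5, -1.0, 0.25, 0.0]
wav2vec2_like = ToyProcessorWithLM(raw_waveform_extractor, greedy_ctc_decoder)
wav2vec2_bert_like = ToyProcessorWithLM(framed_extractor, greedy_ctc_decoder)

logits = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.9, 0.05, 0.05], [0.1, 0.2, 0.7]]
text = wav2vec2_bert_like.decode(logits)
```

The decoding step is identical for both toy processors, which is the sense in which a separate Wav2Vec2BertProcessorWithLM class would be redundant.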

@ylacombe
Contributor

#28706 has been merged, so I'm closing the issue for now. Feel free to ask questions.
