Multiple improvements: language detection per segment, VAD min duration on/off, unique speakers, pyproject.toml and more. #900

Open
wants to merge 8 commits into
base: main

@cvl01 cvl01 commented Oct 17, 2024

WhisperX repo with multiple improvements combined:

  • Silero VAD added, from "Silero VAD support" (#888)
  • Diarization improvements, from "assign_word_speakers fix" (#590)
  • Unique speakers added to the result (inspired by "Update diarize.py", #126)
  • Option to detect the language per segment, which is very useful for longer audio with frequent language switches
  • Replaced setup.py with pyproject.toml
  • Added VAD min duration on and min duration off parameters for pyannote. The current implementation splits even on sub-second pauses, which is sometimes rather ineffective.
  • pyannote.audio bumped to 3.3.2
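
To illustrate what the min duration on/off parameters are meant to achieve, here is a minimal sketch in plain Python. It is not the code from this PR and not pyannote's implementation; the function name `apply_min_durations` and the segment representation are assumptions made for illustration. The idea, as I understand these two parameters, is that silences shorter than `min_duration_off` are bridged and speech regions shorter than `min_duration_on` are dropped.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def apply_min_durations(
    segments: List[Segment],
    min_duration_on: float = 0.0,
    min_duration_off: float = 0.0,
) -> List[Segment]:
    """Hypothetical helper: post-process speech segments from a VAD.

    1. Merge neighbouring segments when the pause between them is
       shorter than min_duration_off.
    2. Drop any remaining segment shorter than min_duration_on.
    """
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] < min_duration_off:
            # Pause is too short to count as silence: extend the previous segment.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    # Remove speech regions that are too short to keep.
    return [(s, e) for s, e in merged if e - s >= min_duration_on]


speech = [(0.0, 1.2), (1.5, 3.0), (5.0, 5.1), (7.0, 9.0)]
result = apply_min_durations(speech, min_duration_on=0.5, min_duration_off=0.5)
print(result)  # [(0.0, 3.0), (7.0, 9.0)]
```

With both parameters left at 0.0 the segments come back unchanged, which corresponds to the behaviour described above, where even sub-second pauses cause a split.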

Feel free to check out my repo and suggest improvements.

@3manifold 3manifold left a comment

I don't see the purpose of gathering multiple pull requests under a single pull request (in addition, your commits do not depend on the commits from those other pull requests) 😕. Please change this pull request and submit only the changes that relate to your own contribution.

@cvl01 (Author) commented Oct 24, 2024

I agree with you in the sense that you'd normally open one pull request per feature.
Since this repo is unsupported, I created my own fork with all the changes that are useful for my personal whisperx usage.
I did not create it with a pull request in mind, but rather as a fully working, up-to-date whisperx package to use in various projects.
I opened the pull request here so that others can see the changes and check out my repository if they want to see how they are implemented.
If you insist, I can close the pull request. But since this repo is unmaintained and no changes have been merged for a long time, I don't think splitting this into multiple PRs with one change/feature per PR is worth the extra effort.

@federicotorrielli

Hi @cvl01, can you create PRs for my project? I plan to support WhisperX with the help of the community.

https://github.com/federicotorrielli/BetterWhisperX
