Can the accuracy of the timestamp be improved? #255

Open
czkoko opened this issue Dec 10, 2022 · 3 comments
Labels
question Further information is requested

Comments

czkoko commented Dec 10, 2022

Whisper's timestamps are not very accurate.
Here is a comparison of the first two segments from Microsoft Cognitive Services Speech and from whisper:

1
00:00:00,120 --> 00:00:01,379 (Microsoft)
[00:00:00.000 --> 00:00:02.000] (whisper)

2
00:00:02,120 --> 00:00:06,320 (Microsoft)
[00:00:02.000 --> 00:00:07.500] (whisper)
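
To put numbers on the comparison above, here is a minimal Python sketch that parses both timestamp styles and reports how far each whisper boundary is from the Microsoft one. The helper names `to_ms` and `boundary_offsets` are made up for this illustration, and the Microsoft output is used as the reference only for the sake of comparison, not because it is known to be ground truth.

```python
import re

def to_ms(stamp: str) -> int:
    """Convert 'HH:MM:SS,mmm' (SRT style) or 'HH:MM:SS.mmm' to milliseconds."""
    h, m, s, ms = (int(x) for x in re.split(r"[:,.]", stamp.strip()))
    return ((h * 60 + m) * 60 + s) * 1000 + ms

def boundary_offsets(reference, candidate):
    """Return (start_offset_ms, end_offset_ms), computed as candidate minus reference."""
    return (to_ms(candidate[0]) - to_ms(reference[0]),
            to_ms(candidate[1]) - to_ms(reference[1]))

# The two segments quoted in this issue, as (Microsoft, whisper) pairs.
segments = [
    (("00:00:00,120", "00:00:01,379"), ("00:00:00.000", "00:00:02.000")),
    (("00:00:02,120", "00:00:06,320"), ("00:00:02.000", "00:00:07.500")),
]

offsets = [boundary_offsets(ms_seg, wh_seg) for ms_seg, wh_seg in segments]
for i, (start_off, end_off) in enumerate(offsets, 1):
    print(f"segment {i}: start {start_off:+d} ms, end {end_off:+d} ms")
```

For the values quoted here, whisper starts both segments 120 ms earlier than Microsoft and ends them 621 ms and 1180 ms later.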

misutoneko commented Dec 11, 2022

Yes, this would be much appreciated. I'm not sure how much can be done without retraining the model(s), though.
I suppose you are using the large model?
I've found the smaller models to be less accurate.

By the way, for the original whisper there is the stable-ts fork, which may provide some inspiration. See
openai/whisper#435

ggerganov added the "question" label Dec 11, 2022

ggerganov (Owner) commented

The timestamp precision is a limitation of the model. You would need some sort of pre- or post-processing to improve the timestamps, but at the moment it is not clear what the best approach is.
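
Purely as an illustration of what "post-processing" could mean here, the sketch below snaps a coarse segment boundary to the quietest audio frame nearby. This is an assumption made for the example, not something whisper.cpp does and not a description of how stable-ts works; `frame_energies`, `snap_to_quietest`, the 100 ms frame size, and the 700 ms search window are all hypothetical. The audio is synthetic, at 1 sample per millisecond, to keep the arithmetic easy to follow.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def snap_to_quietest(t_ms, energies, frame_ms, window_ms):
    """Move boundary t_ms to the start of the quietest frame within +/- window_ms.

    Ties are broken in favour of the frame closest to the original boundary.
    """
    center = t_ms // frame_ms
    radius = window_ms // frame_ms
    lo = max(0, center - radius)
    hi = min(len(energies) - 1, center + radius)
    if lo > hi:  # boundary lies outside the analysed audio; leave it alone
        return t_ms
    best = min(range(lo, hi + 1), key=lambda i: (energies[i], abs(i - center)))
    return best * frame_ms

# Synthetic signal, 1 sample per ms: speech, a 200 ms pause, more speech.
samples = [1.0] * 1400 + [0.0] * 200 + [1.0] * 1400
energies = frame_energies(samples, frame_len=100)  # 100 ms frames

# A coarse segment end at 2000 ms is pulled back into the pause.
snapped = snap_to_quietest(2000, energies, frame_ms=100, window_ms=700)
print(snapped)
```

In this toy input the boundary moves from 2000 ms to 1500 ms, the silent frame nearest the original position. Real audio would need a proper voice activity detector and a tuned window, so treat this only as a starting point for experiments.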

pneyrinck commented

Apparently, work has already been done to improve the timestamps: https://github.com/jianfch/stable-ts
