Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be more tolerant of bad VTT #2358

Closed
bloomtom opened this issue Jan 25, 2020 · 3 comments · Fixed by #2394
Closed

Be more tolerant of bad VTT #2358

bloomtom opened this issue Jan 25, 2020 · 3 comments · Fixed by #2394
Labels
component: captions/subtitles The issue involves captions or subtitles component: WebVTT The issue involves WebVTT subtitles specifically flag: seeking PR We are actively seeking PRs for this; we do not currently expect the core team will resolve this priority: P4 Nice to have / wishful thinking status: archived Archived and locked; will not be updated type: enhancement New feature or request
Milestone

Comments

@bloomtom
Copy link
Contributor

Have you read the FAQ and checked for duplicate open issues?
Yes

Is your feature request related to a problem? Please describe.
Small defects in a vtt file can cause the entire thing to be dropped. Specifically, linefeeds in cue text and multiple linefeeds between a cue timestamp and its text. Trying to parse such a file in the demo player results in an error this.data_ is undefined.

I have my ever-useful two second clip demo prepared with such a vtt file. It's the Italian language.
Demo Link

The content of that vtt file is

WEBVTT

00:00.000 --> 00:00.500
<b>Alla tua destra puoi vedere...

|...ma pensa...</b>

00:00.500 --> 00:01.000

Visible

00:01.000 --> 00:02.000
Visible

Describe the solution you'd like
Technically a vtt file structured in this way is not compliant with the standard, yet there is much of it around due to ffmpeg. If you have extraneous linefeeds in an srt or ass file, and you ask ffmpeg to turn it into a vtt, it'll make a file that looks like the above. It would be nice if Shaka could render such files.

Describe alternatives you've considered
I have a patch in for review on the ffmpeg mailing list, but even if future files are compliant, that doesn't fix the issue for vtt files which have already been created.

Additional context
This is a feature request because I'm unsure if strict parsing was intended by the original implementer. Although other players display such malformed vtt files, they are technically not obliged to do so.

@joeyparrish
Copy link
Member

I'm open to this, assuming it doesn't over-complicate the parser. We would need some additional tests to cover the new functionality, but it could be as easy as adjusting some regexes in the VTT parser.

Are you interested in building a PR for this?

@joeyparrish joeyparrish added component: captions/subtitles The issue involves captions or subtitles type: enhancement New feature or request flag: seeking PR We are actively seeking PRs for this; we do not currently expect the core team will resolve this labels Jan 29, 2020
@shaka-bot shaka-bot added this to the Backlog milestone Jan 29, 2020
@bloomtom
Copy link
Contributor Author

Yes. I'll take a look at handling this one as well. Thanks for the input.

@theodab theodab added component: WebVTT The issue involves WebVTT subtitles specifically priority: P3 Useful but not urgent priority: P4 Nice to have / wishful thinking and removed priority: P3 Useful but not urgent labels Oct 2, 2021
@didof
Copy link

didof commented Nov 17, 2021

Is this PR going to be accepted?

joeyparrish pushed a commit that referenced this issue Feb 9, 2023
Bad linebreaks will now cause cues to be skipped (with a warning),
rather than throwing an error.

Closes #2358
Co-authored-by: Álvaro Velad Galván <[email protected]>
@avelad avelad modified the milestones: Backlog, v4.4 Feb 9, 2023
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Apr 10, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
component: captions/subtitles The issue involves captions or subtitles component: WebVTT The issue involves WebVTT subtitles specifically flag: seeking PR We are actively seeking PRs for this; we do not currently expect the core team will resolve this priority: P4 Nice to have / wishful thinking status: archived Archived and locked; will not be updated type: enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants