Use of edit lists and timed text tracks #40
Comments
I've had a think and a discussion about this and would add the following points:
Overall, the simplest option for typical use (e.g. for a use case like CMAF) is probably to state that edit lists cannot be used with TTML, so that the composition timeline and the presentation timeline are effectively identical. I agree with what I think is the intent behind your proposal @cconcolato, which is to define more clearly what it means to apply an edit list to a TTML sample, and what the consequences might be, without changing the current sense that they can be applied. Then it is down to profiles and applications to define whether edit lists are permitted in a particular context.
@cconcolato You give two possibilities about TTML timing: i) TTML times correlate to Composition Times, and ii) TTML times correlate to Presentation Times. I think your recommendation is OK, but it needs to explain explicitly that option ii) is the correct interpretation. Previously (before all these discussions around this amendment) I'd expected it was option i), but with the drawback that most clients were unlikely to support edit lists at all with TTML.
I think we should start an amendment to 14496-30 with this item.
If this is taken forwards then I recommend (request) that test content be produced (or at least extremely detailed instructions written for it) that distinguishes between 1) a correct implementation of what is specified, 2) implementations not supporting edit lists on TTML at all, 3) the most obvious way or ways in which an implementation might support edit lists with TTML but incompatibly with what is specified. A description of what would be seen in each case would also be really helpful. Obviously some test content that behaves visibly differently in each of these may need to be very artificial.
I want to explore the use of edit lists of TTML in MP4 (and WebVTT in MP4).
Reminder on edit lists
As defined in 14496-12, a track has:
The edit list of a track can provide a very complex transformation of the composition timeline. For example, it can indicate that for a presentation time interval, no samples are played (empty edit); or that for a presentation time interval, a sample is paused; or that for a presentation time interval, a composition time interval is played at a given speed ... Edit lists can even select parts of a sample to be in the presentation timeline, e.g. a sample that starts at composition time 5 and ends at composition time 7 is only played from its 6s to 6.5s.
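As a rough illustration of the transformations described above, here is a minimal Python sketch of how a player might map a presentation time to a composition time through an edit list. The `Edit` class and `presentation_to_composition` function are hypothetical names chosen for this example. Times are plain seconds for readability, whereas a real file stores integer values in the movie and media timescales and a fixed-point rate. The conventions used (a media time of -1 for an empty edit, a rate of 0 for a paused sample) follow the edit list box in 14496-12.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Edit:
    segment_duration: float  # length of the edit on the presentation timeline (s)
    media_time: float        # start on the composition timeline (s), -1 = empty edit
    media_rate: float = 1.0  # 1.0 = normal speed, 0.0 = dwell (pause) on media_time


def presentation_to_composition(edits: List[Edit], t: float) -> Optional[float]:
    """Return the composition time shown at presentation time t, or None if nothing plays."""
    start = 0.0
    for edit in edits:
        end = start + edit.segment_duration
        if start <= t < end:
            if edit.media_time < 0:  # empty edit: no sample is played
                return None
            return edit.media_time + (t - start) * edit.media_rate
        start = end
    return None  # past the end of the edit list


edits = [
    Edit(4.0, -1),         # presentation 0..4: empty edit
    Edit(2.0, 11.0),       # presentation 4..6: composition 11..13 at normal speed
    Edit(3.0, 13.0, 0.0),  # presentation 6..9: paused on composition time 13
    Edit(0.5, 6.0),        # presentation 9..9.5: only 6..6.5 of a sample spanning 5..7
]

print(presentation_to_composition(edits, 2.0))   # None
print(presentation_to_composition(edits, 5.0))   # 12.0
print(presentation_to_composition(edits, 7.5))   # 13.0
print(presentation_to_composition(edits, 9.25))  # 6.25
```

The sketch treats presentation times beyond the last edit as "nothing plays"; a real player would also have to handle fragmented files and rounding between timescales.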
Simple (typical) use cases of edit lists are:
Note that when one track has such an edit list, the other tracks don't need to have one. For example, if a file has 3 tracks (audio, video, subs) and the audio track has priming but the video track has no B-frames, the video track and the subs track don't need an edit list.
Advanced use cases for edit lists are editing operations in non-linear authoring tools (cut, insert, reorder, ...).
CMAF restricts the use of edit lists to the typical cases above.
Timed text and edit lists
As an example, if a timed text file is authored against a video file and later on that video file is modified to add a bumper (i.e. an ident, an intro, a title card, ...), effectively shifting the dialogs in the audio and the video forward in time, it could be interesting to adjust the timing of the timed text track. This may be done by inserting an empty edit with an edit list.
WebVTT
For WebVTT tracks, given that the WebVTT cue time is derived from the sample time, shifting the sample time with an edit list actually affects the cue time as expected. The following cue:
gets stored in a sample with CTS = 11s, duration 2s, with the payload:
If the sample CTS is changed to 15s (adding a 4s empty edit), the effect is the same as if the initial cue had been:
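The arithmetic behind this shift can be sketched in a few lines of Python. This is only an illustration using the values quoted above (CTS = 11s, duration 2s, 4s empty edit); `effective_cue_times` and `fmt` are hypothetical helpers, and the sketch assumes the edit list consists of a single leading empty edit followed by normal-speed playback.

```python
def effective_cue_times(sample_cts: float, sample_duration: float, empty_edit: float):
    """Cue start/end on the presentation timeline when the cue time comes from the sample."""
    start = sample_cts + empty_edit
    return start, start + sample_duration


def fmt(seconds: float) -> str:
    """Format seconds as a WebVTT-style hh:mm:ss.ttt timestamp."""
    hours, rem = divmod(int(round(seconds * 1000)), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


start, end = effective_cue_times(sample_cts=11.0, sample_duration=2.0, empty_edit=4.0)
print(f"{fmt(start)} --> {fmt(end)}")  # 00:00:15.000 --> 00:00:17.000
```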
Note that additional care (i.e. use of the ctim box) needs to be taken if the cue payload contained a WebVTT Cue Timestamp.
TTML
For a TTML track, times in the TTML document are relative to the start of the track, but a question is: is it the start of the composition timeline or the start of the presentation timeline?
Let's assume that the following document:
is stored in a sample (CTS = 11s, duration = 2s).
i. if TTML document times are interpreted as delta from the start of the composition timeline, the behavior would be the same as for the WebVTT case. When applying the same edit list, and playing presentation time 15s, the player would know that it is actually playing composition time 11s, which would match the time values in the document.
ii. if TTML document times are interpreted as delta from the start of the presentation timeline, when applying the same edit list, at presentation time 15s, when the TTML parser is fed the same document and seeked to time 15s, there is no active element. Nothing plays. To make it work, when adding the edit list, one has to adjust the TTML document in the sample to be:
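The difference between interpretations i and ii can be checked numerically with a small Python sketch. It assumes, for illustration only, that the document contains a single element active from 11s to 13s (matching the sample's CTS and duration) and that the edit list is the same 4s empty edit; all function and constant names are hypothetical.

```python
EMPTY_EDIT = 4.0         # leading empty edit, in seconds
BEGIN, END = 11.0, 13.0  # assumed begin/end of the element in the TTML document


def is_active(begin: float, end: float, doc_time: float) -> bool:
    """True if an element with the given interval is active at doc_time."""
    return begin <= doc_time < end


def doc_time_option_i(presentation_time: float) -> float:
    # Option i: document times are on the composition timeline,
    # so the player maps the presentation time back through the edit list.
    return presentation_time - EMPTY_EDIT


def doc_time_option_ii(presentation_time: float) -> float:
    # Option ii: document times are on the presentation timeline,
    # so the parser is seeked directly to the presentation time.
    return presentation_time


t = 15.0  # presentation time being played

print(is_active(BEGIN, END, doc_time_option_i(t)))   # True: document time 11s
print(is_active(BEGIN, END, doc_time_option_ii(t)))  # False: nothing is active at 15s
# Under option ii, the document times must be shifted by the empty edit:
print(is_active(BEGIN + EMPTY_EDIT, END + EMPTY_EDIT, doc_time_option_ii(t)))  # True
```

Under option ii the edit list alone is not enough: whoever adds the empty edit also has to rewrite the times inside the sample.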
Current spec text
ISO/IEC 14496-30 2nd edition, Section 4.2 says:
This means, as usual, that the presentation of a timed text track behaves like a video or audio track and is driven by the presentation time, from which a composition time and a sample number is derived.
It then says:
Section 5.3 (TTML) says:
So clearly edit lists are meant to apply to TTML, but nothing warns about the issue described above.
Note that the text from the second edition has other flaws/ambiguities and is rephrased in Amendment 1 to the second edition.
Recommendation
My recommendation would then be to update the TTML section and add something along the lines of: