Shaka 2.0 and stream interruption #555

Closed
agelun opened this issue Oct 17, 2016 · 18 comments
Labels: status: archived (archived and locked; will not be updated), type: enhancement (new feature or request)

Comments

@agelun

agelun commented Oct 17, 2016

Latest master (6d0f081)

According to http://dashif.org/wp-content/uploads/2015/04/DASH-IF-IOP-v3.0.pdf "4.8.3" and "4.8.5" (page 67) we have two options to indicate temporary stream unavailability:

  1. "Short Periods": "This expresses that for this Period no media is present at least for the time as expressed by the @duration attribute". As I understand it, this means Periods without any AdaptationSets or Representations inside.
  2. Using "dummy content" and, according to "4.8.5", returning a 404 to indicate that content is unavailable for the specified interval.

But in the first case, Shaka reports an invalid MPD (no AdaptationSet/Representation) or gets stuck before such a "dummy" period (when stream info is absent).

In the second case (404 response), Shaka retries the chunk request repeatedly and doesn't continue playback until we move the scrollbar (or until the chunk suddenly appears).

What do you recommend for indicating temporary stream unavailability?

Are you going to handle 404 responses and "dummy periods" according to DASH-IF-IOP?

Thank you.

@agelun
Author

agelun commented Oct 17, 2016

And the simplest way is to support discontinuity between periods (when period1.start + period1.duration < period2.start). I've already made a patch for this, and it would be great if you could review it.
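
The discontinuity condition above (period1.start + period1.duration < period2.start) can be checked in a few lines. This is only a sketch: the `findPeriodGaps` name and the plain `{start, duration}` objects are assumptions made for illustration, not Shaka Player's manifest types and not the linked patch.

```javascript
// Hypothetical helper. Periods are plain objects with `start` and
// `duration` in seconds, sorted by start time.
function findPeriodGaps(periods) {
  const gaps = [];
  for (let i = 0; i + 1 < periods.length; i++) {
    const end = periods[i].start + periods[i].duration;
    const nextStart = periods[i + 1].start;
    // A discontinuity exists when one period ends before the next begins.
    if (end < nextStart) {
      gaps.push({start: end, end: nextStart});
    }
  }
  return gaps;
}
```

For periods at 0-10 s and 15-25 s, this reports a single gap from 10 s to 15 s.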

@joeyparrish
Member

We don't currently support discontinuity at all, but I'd love to see your patch. Perhaps I have been over-complicating it.

As for short periods and dummy content, we don't have any such content available. Can you provide a test stream so that we can take a look?

@agelun
Author

agelun commented Oct 17, 2016

Thanks, Joey. You can find my commit here:
https://github.com/agelun/shaka-player/commit/d0b3fe58bf58f8ca72f904b37e199db104513402

I didn't make a pull request because it breaks a lot of test cases and it's not ready for merge.

@joeyparrish joeyparrish added the type: question A question from the community label Oct 17, 2016
@joeyparrish joeyparrish self-assigned this Oct 17, 2016
@joeyparrish
Member

@baconz, this seems related to a conversation we were having about gaps.

@agelun, it appears that you detect when the playhead is in a gap, and move it to the other side. This is something we have considered before and decided not to do.

I took a look at the sections of the IOP you referenced. (The same content appears in the same sections in the most up-to-date version of IOP, v3.3.)

My reading of the IOP is that there are three mechanisms:

  1. Dummy content. This means that the server generates or uses pre-made media to fill the gap in the input. The output DASH stream has no gaps. Dummy content requires no player support, so this is a definite option for you. (Note that in my interpretation, this is not the same as serving a 404.)
  2. "Short Periods". These are periods with no media at all. We do not support these currently.
  3. 404s. The Period element points to media which is not there on the server. When the segment is requested, we get a 404 and we know that there is no such segment. We do not support this either.

It appears that using 404s is the same as short periods in concept. Both indicate that segments are not available and will not be available. The main difference is in when the player finds out. A short period tells the player at manifest parse time that there's no content. A 404 when fetching a segment tells the player later, when it starts trying to buffer the relevant section of the presentation.

I dislike using 404s for this because there are other ways to encounter them, such as bad clock sync or a misbehaving server. If the player is expected to happily move past 404s, it may make debugging certain server problems more difficult.

Short periods are more clear, but still don't fit into our current architecture. We don't expect to be able to play through gaps, and these empty periods just signal "here's a gap".

I'm not sure if/when we would be able to support gaps. Because of the way MediaSource works, I believe it is difficult for JS-based players to deal with gaps elegantly. I'm averse to extra complexity in the player.

@baconz
Contributor

baconz commented Oct 17, 2016

@joeyparrish there's one more option that we discussed, which would be to leave out the empty period, but apply an adjustment to the live head, pulling it back by a duration equivalent to the gap length. This method is not yet supported in the spec, but I think that barring a true server-side solution (ie filling the gaps with a "slate"), it might be the simplest fix.
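
The arithmetic behind this proposal is simple enough to sketch. The function name and the idea of passing the omitted gap lengths as an array are assumptions for illustration; as noted above, the approach itself is not part of the spec.

```javascript
// Hypothetical helper. All values are in seconds on the presentation
// timeline. `omittedGapDurations` lists the lengths of the gaps that
// were left out of the manifest.
function adjustLiveEdge(nominalLiveEdge, omittedGapDurations) {
  const totalGap = omittedGapDurations.reduce((sum, d) => sum + d, 0);
  // Pull the live edge back by the total amount of missing media.
  return nominalLiveEdge - totalGap;
}
```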

I'm happy to submit a ticket to the IOP if others think that this would be helpful.

@sandersaares
Contributor

sandersaares commented Oct 18, 2016

I would like to point out some additional complexity, especially with live streams. Empty periods and dummy content might be technically feasible for on-demand content but for live content, the backend often might not even have the ability to detect missing pieces, let alone do something about them. Many packagers simply deliver segments over FTP to a CDN origin and if one segment gets lost then too bad - it will just skip the segment and carry on.

This should not happen but real world network conditions can be nonideal, so I have seen it quite a lot in practice - there's a gap of one or more segments, after which the stream continues just fine. Maybe not very often but perhaps just once a day. If players do not skip past 404s, this means that live streams will suddenly stall at random moments in time, for all (or a large subset of) users simultaneously. This makes users rather unhappy.

@agelun
Author

agelun commented Oct 19, 2016

@joeyparrish can you explain why you decided not to "jump over" the gap? Are there technical reasons, or is it just code style/simplicity/etc.?

I thought about #462 as well, and I think both cases can be solved by creating something like an "event bus" for stream events. MPD events from the EventStream tag and discontinuity events (from period interruptions, empty periods, and 404 responses, all from different sources) could be pushed to it, and the application could choose when and how to react (for example, for discontinuity events: jump over, stall, show an error message, etc.). It could also be used to draw an informative timeline.

I need to understand how we can make progress on this issue: should we maintain our own branch to cover such "real life" requirements, or can this be merged into Google's code? Or are there more serious reasons I don't know about why it cannot work well at all?

@joeyparrish
Member

We decided not to jump over the gap for many reasons, but I think they are becoming increasingly irrelevant. Let's figure out the best solution and work toward getting it merged into Shaka Player.

In my view, there are many reasons for gaps:

  1. Broken content. The fact is that we get many support requests that turn out to be bugs in the encoder or bad input to the encoder. See DASH stream playing on DASH IF but not in JW Player #562. Broken content and non-IOP-conforming content are things we don't intend to support. I think failing on broken content is appropriate and helps to highlight and fix bugs in encoders. Jumping gaps here may hide these content issues, but that shouldn't be a deal-breaker.
  2. Browser bugs. As of this writing, Chrome is still mixing up DTS and PTS, which can sometimes cause gaps to appear at the start or end of a Period even when there are none in the content. Jumping gaps in this case might be a good thing, in that it would work around browser bugs when possible.
  3. True discontinuity. This is permitted by IOP and a reality of live streaming for encoders which don't generate dummy content. Jumping gaps in this case is a clear win, as it allows us to play live streams we couldn't otherwise play.

I don't see any good way to differentiate between these three cases, so probably it's best to try to handle all gaps.

One thing that worries me is the use of 404s to signal discontinuities. There are other times a 404 may occur:

  1. Again, broken content. If I mis-encode VOD content, segments might be missing or their URLs might be wrong in the manifest. Ideally, we would fail on such content so that the app developer sees an error and can start investigating.
  2. Live content when clock sync is broken or misconfigured. If the app config is wrong, the player might request past the proper live edge and get a 404 from the server. In this case, treating a 404 as a discontinuity would result in the player requesting the next segment after that, and so on. The rest of the stream would appear as one big gap and would generate no errors. Ideally, we would fail with an error in this case, too.
  3. True discontinuity. These could appear at the live edge or elsewhere, depending on whether a time shift buffer is used. Ideally, we should handle this gracefully somehow.

I'm unsure how to differentiate the third case from the other two. Perhaps we could have a heuristic for that, such as isLive && numConsecutive404s < errorThreshold. If there are too many in a row, we would decide we were in case 2 and fail. If it was VOD, we would decide we were in case 1 and fail.
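
A minimal sketch of that heuristic, assuming the caller feeds in each segment response's HTTP status. The class name, the `'ok'`/`'skip'`/`'fail'` return values, and the choice to count the current 404 toward the threshold are all illustrative assumptions, not Shaka Player behavior.

```javascript
// Hypothetical tracker for the heuristic
// `isLive && numConsecutive404s < errorThreshold`.
class Consecutive404Tracker {
  constructor(isLive, errorThreshold) {
    this.isLive = isLive;
    this.errorThreshold = errorThreshold;
    this.numConsecutive404s = 0;
  }

  // Returns 'ok' for a non-404 response, 'skip' when the 404 is treated
  // as a discontinuity, and 'fail' when it should surface as an error.
  onSegmentResponse(status) {
    if (status !== 404) {
      this.numConsecutive404s = 0;
      return 'ok';
    }
    this.numConsecutive404s++;
    const treatAsGap =
        this.isLive && this.numConsecutive404s < this.errorThreshold;
    return treatAsGap ? 'skip' : 'fail';
  }
}
```

With a threshold of 3, a live stream skips the first two consecutive 404s and fails on the third; VOD content fails on the first.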

We've also discussed two different strategies for handling gaps:

  1. Seek over gaps automatically. I'm concerned about the complexity of this, but it could work. Playhead seems the most reasonable place to detect and jump gaps. It could be similar to how we keep the playhead inside the seek range.
  2. Dispatch events to allow the application to decide what to do. If the app isn't expecting gaps, it could fail. This offers greater flexibility, but requires more effort on the part of the app developer. Extra work is required for any app expecting gaps, to listen for that and jump over them. Extra work is also required for any app not expecting gaps, to listen for them and fail.
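
Strategy 2 could look roughly like the following, using the standard `EventTarget` and `Event` classes. Every name here (`GapEvent`, `GapReporter`, `attachGapPolicy`, the `'gap'` event type, the `gapStart`/`gapEnd` fields) is hypothetical and chosen only to illustrate the division of work between player and application.

```javascript
// Hypothetical event carrying the bounds of a detected gap, in seconds.
class GapEvent extends Event {
  constructor(gapStart, gapEnd) {
    super('gap');
    this.gapStart = gapStart;
    this.gapEnd = gapEnd;
  }
}

// Stand-in for the player side: it only reports gaps, it never seeks.
class GapReporter extends EventTarget {
  reportGap(gapStart, gapEnd) {
    this.dispatchEvent(new GapEvent(gapStart, gapEnd));
  }
}

// Example application policy: jump gaps up to `maxJump` seconds long,
// and report anything longer as an error.
function attachGapPolicy(reporter, video, maxJump, onError) {
  reporter.addEventListener('gap', (event) => {
    const size = event.gapEnd - event.gapStart;
    if (size <= maxJump) {
      video.currentTime = event.gapEnd;
    } else {
      onError(new Error(`Gap of ${size}s exceeds limit of ${maxJump}s`));
    }
  });
}
```

An app that expects no gaps would register a listener that always calls `onError`, which is the extra work mentioned above for both kinds of app.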

Thoughts? Specifically on:

  • decision not to differentiate causes of gaps in JavaScript
  • heuristic to differentiate between good 404s and bad ones
  • pros/cons of strategies for jumping gaps

@joeyparrish joeyparrish added type: enhancement New feature or request and removed type: question A question from the community labels Nov 9, 2016
@joeyparrish joeyparrish added this to the vLater milestone Nov 9, 2016
@agelun
Author

agelun commented Nov 9, 2016

Hello, Joey. Thanks for the reply.

Step by step:

  1. Broken content. Yes, it's an encoder and/or IP delivery (in the case of multicast delivery) problem. But the end user doesn't want to know that )) So it's sometimes better to "hide" such a problem instead of shouting "Hey, we/our streaming partner have a dumb network admin who can't configure QoS" )
  2. 404 with VoD content. I agree with you on this part. Usually the content provider is able to check content consistency before publishing, especially with the "SegmentBase" profile.
  3. 404 with Live. Maybe it would be better to check the server time from the HTTP response header (?)
  4. Handling strategies. I think only the second one will be "production friendly" because:
  • An automatic seek over a gap can be surprising for the end user. Usually the app developer wants to notify the user about the problem somehow, switch to a "stub" advertisement, or disable such seeking entirely (for example, when you would be switched to another film. Imagine your kid was watching cartoons and was suddenly switched to some horror blockbuster )) )
  • The generated event can also be used for monitoring purposes (to notify a monitoring system about such a discontinuity/gap)
  • Discontinuity events can be unified with other stream events (e.g. EventStream/Event from MPD). So it'll be more generic and more elegant.

And I don't understand why you think it'll be more complex for the app developer. The event can be ignored and playback will stop (as it does now).
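
Point 3 in the list above (checking server time from the HTTP response header) might be sketched as follows, assuming the server sends a standard `Date` header. The function name is hypothetical, and how a player would act on the offset is left open. Note that the `Date` header only has one-second resolution.

```javascript
// Hypothetical helper: estimates how far the server clock is ahead of
// the local clock, in milliseconds. Returns null if the header value
// cannot be parsed.
function estimateClockOffsetMs(dateHeader, localNowMs) {
  const serverMs = Date.parse(dateHeader);
  if (Number.isNaN(serverMs)) {
    return null;
  }
  return serverMs - localNowMs;
}
```

A large offset alongside a 404 at the live edge would point toward broken clock sync rather than a true discontinuity.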

@sandersaares
Contributor

This topic is currently also under discussion/development in the dash.js player. Perhaps some collaboration might be beneficial. I link here the dash.js discussion on the topic.

Dash-Industry-Forum/dash.js#651

@dobrusev

dobrusev commented Feb 6, 2017

I have an issue with a DASH + Widevine live stream where the player makes requests for a segment with segment number -1, and the response from the server is of course a 404.

@joeyparrish Can the reason for that be temporary stream unavailability as well?

@TheModMaker
Contributor

@dobrusev That seems unlikely. If the player is requesting a bad segment number, that seems like a bug. Please file a new issue, preferably with a link to the manifest.

@aztlan2k

aztlan2k commented Mar 3, 2017

@joeyparrish @TheModMaker Seems this was pushed out to vLater (not sure what that means) ... any chance this is still being actively considered/pursued? Last update looks to be sometime late last year?

If we're generating a live manifest and we experience a stream loss, it would be nice to be able to include an empty <Period duration="PTxS"> element (as the spec indicates is acceptable) to account for that gap and know that Shaka is going to properly jump that gap.

@joeyparrish
Member

joeyparrish commented Mar 3, 2017

We plan to implement what we are calling "gap jumping": automatically jumping the play head over gaps in the timeline. vLater meant that it is not scheduled for a specific release yet. (This milestone has since been renamed to "Backlog".) We have decided to wait until after v2.1.0 is released, to avoid any risk to that release.

We will release a design doc in the near future, and link to it here.

Thank you for your patience with this issue. Tolerating gaps in the timeline is a big change in behavior for Shaka Player, and we are being very careful.

@aztlan2k

aztlan2k commented Mar 3, 2017

@joeyparrish Thanks for the update! Good to know. We'll keep an eye out for that design doc and progress on this in the future. For our purposes, we plan on creating a period with placeholder content to bridge the gap. Hopefully that'll get us past this for now.

@joeyparrish
Member

There have been so many related issues that it is difficult to track. Since github's issue tracker doesn't have a field for related issues, I have manually compiled a list for posterity:

#180 #377 #472 #654 #656 #661 #668 #670 #672 #731 #732

@joeyparrish joeyparrish modified the milestones: Backlog, v2.1.0 Mar 31, 2017
shaka-bot pushed a commit that referenced this issue Apr 3, 2017
This adds the new configuration options that will be used in gap
jumping.  This also changes Playhead to accept the config over the
rebuffering goal directly.

Issue #555

Change-Id: I467690ad1f417e69634087e04e0b44c98e1c9b81
shaka-bot pushed a commit that referenced this issue Apr 3, 2017
Issue #555

Change-Id: Ifb8525ad8924f2581f1aa72bc2278b8bc9d857f3
@TheModMaker
Contributor

We just pushed a design doc for our implementation of gap jumping. I have also started working on it and hope to have something in within the next week or two.

The basic idea is for encoders to introduce gaps in the manifest (e.g. missing segments, or Periods with both a start and duration). Then we will jump any of the resulting gaps. If the gap is larger than a configurable threshold, we will raise an event.

shaka-bot pushed a commit that referenced this issue Apr 5, 2017
Now the DASH parser will remove gaps that appear between Periods.
It removes the gaps by adjusting the duration of the previous Period
to fill the gap.  Because it affects the duration, the last segment of
the Period will also be adjusted. This only affects the segment index;
if this adjustment results in gaps in the media, they will not be
jumped (yet).

Issue #555

Change-Id: Idd2cd7ad960855be01565615c8917f7191b29503
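
The adjustment described in this commit message, extending the previous Period's duration to cover the gap, can be illustrated with a short sketch. It is not the actual parser code: `closePeriodGaps` and the plain `{start, duration}` objects are assumptions, and the follow-on adjustment of the last segment is omitted.

```javascript
// Hypothetical helper. Periods are plain objects with `start` and
// `duration` in seconds, sorted by start time. Returns new objects and
// leaves the input untouched.
function closePeriodGaps(periods) {
  return periods.map((period, i) => {
    const next = periods[i + 1];
    if (next && period.start + period.duration < next.start) {
      // Stretch this period so that it ends where the next one starts.
      return {...period, duration: next.start - period.start};
    }
    return {...period};
  });
}
```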
shaka-bot pushed a commit that referenced this issue Apr 17, 2017
The bulk of the logic for gap jumping is handled in Playhead.  It
tracks the current buffered ranges and jumps over any gaps that appear.
It listens for a special browser event ('waiting') for when the video
element runs out of playable frames.

This change also removes the logic for jumping gaps at the beginning
of the timeline.  This is handled by the more general gap jumping
logic and works cross-browser.

Finally, this updates the buffering logic to only count the amount of
content buffered (i.e. ignoring the gaps).  This fixes some bugs where
gaps in the content can result in StreamingEngine buffering forever
since it thinks only a little is buffered.

This includes full implementation of the logic, but this doesn't close
the issue since there aren't any integration tests yet.  Those will
be added next.

Issue #555

Change-Id: Id99eb9fe469e8cf2c7464a3d70c3733791e806e0
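
The core idea in this commit message (on `waiting`, look at the buffered ranges and jump to the start of the next one) can be sketched as below. This is a simplified illustration, not Shaka Player's Playhead code: the function names are hypothetical, and thresholds, seeking state, and browser quirks are ignored. `video` is assumed to behave like an `HTMLMediaElement`.

```javascript
// Converts a TimeRanges-like object into an array of {start, end}.
function toRanges(timeRanges) {
  const ranges = [];
  for (let i = 0; i < timeRanges.length; i++) {
    ranges.push({start: timeRanges.start(i), end: timeRanges.end(i)});
  }
  return ranges;
}

// Returns the time to jump to, or null if the playhead is inside
// buffered content or nothing is buffered ahead of it. Ranges are
// assumed to be sorted in ascending order, as TimeRanges are.
function findJumpTarget(currentTime, ranges) {
  for (const range of ranges) {
    if (currentTime >= range.start && currentTime < range.end) {
      return null;
    }
    if (range.start > currentTime) {
      return range.start;
    }
  }
  return null;
}

// Hypothetical wiring: jump whenever the element runs out of frames.
function attachGapJumper(video) {
  video.addEventListener('waiting', () => {
    const target = findJumpTarget(video.currentTime, toRanges(video.buffered));
    if (target !== null) {
      video.currentTime = target;
    }
  });
}
```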
@TheModMaker
Contributor

We just pushed the main implementation of gap jumping. Now any gaps that appear in the media or the manifest will be jumped automatically. You can see it live on the nightly page.

"Small" gaps are jumped automatically. These are gaps smaller than config.streaming.smallGapLimit, default of 0.5 seconds. This can be increased using player.configure(). "Large" gaps result in an event on Player (type largegap). These can also be jumped automatically by setting config.streaming.jumpLargeGaps to true.
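
Putting the settings named above together, an application might configure gap jumping roughly like this. The `setUpGapJumping` wrapper and its parameters are hypothetical; only `player.configure()`, `streaming.smallGapLimit`, `streaming.jumpLargeGaps`, and the `largegap` event type come from the comment above. The shape of the event object is not described here, so the sketch passes it through untouched.

```javascript
// Hypothetical wrapper. `player` is assumed to be a Shaka Player
// instance (or anything with configure() and addEventListener()).
function setUpGapJumping(player, smallGapLimit, jumpLargeGaps, onLargeGap) {
  player.configure({
    streaming: {
      smallGapLimit: smallGapLimit,  // seconds; gaps below this are jumped
      jumpLargeGaps: jumpLargeGaps,  // true to also jump large gaps
    },
  });
  // Large gaps raise an event so that the app can log or react.
  player.addEventListener('largegap', onLargeGap);
}
```

A call such as `setUpGapJumping(player, 1, false, (event) => console.warn('Large gap', event))` would raise the small-gap limit to one second and only log larger gaps.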

@shaka-project shaka-project locked and limited conversation to collaborators Mar 22, 2018
@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Apr 15, 2021
8 participants