Playback not stable/reliable enough: How to pinpoint the problem? #717

Closed
torerikal opened this issue Aug 25, 2015 · 11 comments

@torerikal

We are experiencing several stability and reliability problems with Dash.js. Too often, and in most cases quite randomly, the playback just freezes/stalls, without ever resuming.

It looks like the sources of this symptom can be quite different. The most apparent and reproducible issue occurs with live streams, for example this one:

http://hls-live.akamai.tv2.no/wzlive/_definst_/smil:Wza01_dashtest.smil/manifest.mpd

Repeated playback attempts of this stream in the nightly build, http://vm2.dashif.org/dash.js/samples/dash-if-reference-player/index.html, show that the stream stalls without recovering. Not every time, but in more than 1 of 10 playback attempts.

At this frequency, the stall can occur within the first minute of playback, but we have also seen streams play for half an hour before it happens.

We also experience the issue #716 in some playback attempts, but when that error doesn't occur, the non-resuming stall symptoms still appear more often than every 10th attempt.

The error occurs in Chrome 44.0.2403.155 (64 bit) with good network conditions.

It might be something wrong with the stream characteristics, but how can we pinpoint this? Observing log output, and inspecting state by adding breakpoints, has revealed the following for the mentioned live stream:

  • There are no errors occurring on the video element (like MEDIA_ERR_DECODE etc.)
  • At some point, Dash.js just gives up on adding new segments to be fetched.
  • Despite this, the manifest is updated, and contains new t="" offsets for the first segments in the SegmentTimeline (which should also give a new last segment as the live edge).
  • The browser console log contains no other indications that something wrong has happened, except that the "Buffered range" entries disappear when no more segments are scheduled for download.
  • All fragment and MPD requests are performed without errors (i.e. no 404s or similar).

We can't point at anything wrong with the stream manifest. Nor are we able to point at the code where the "abandonment" of adding new segments happens. When inspecting the executing code with breakpoints, it looks like the condition could originate from anywhere in the rules or controllers.

Any qualified suggestions on what's going wrong, or how to troubleshoot this, would be helpful.

It should be mentioned that the 1.4.0 release version consistently gives up on the live stream example after 15 seconds. This looks like another issue.

We have also observed other examples where on demand streams never resume after the stall state has been set. In this case, the buffer level runs low due to connectivity issues, but when it gets above the threshold again, playback is still not resumed. Unfortunately, this error is not that frequent or easy to reproduce.

For stall conditions that don't resume, even when all requests have completed successfully and there are no bandwidth or connectivity issues, Dash.js could benefit from a more "active" approach to identifying them, and at least report the reason why playback doesn't resume, if it is not able to resume it.
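One way to make such a stall visible from the application side is a small watchdog that samples the video element periodically and reports when the playhead has stopped advancing even though data is buffered ahead of it. The following is a minimal sketch and not part of Dash.js: the names `bufferedAhead` and `classifyStall`, the state labels, and the thresholds are all hypothetical. It relies only on the standard media element properties `currentTime`, `paused`, `ended`, and `buffered`.

```javascript
// Hypothetical helper: seconds of media buffered ahead of `time`.
// `buffered` is any TimeRanges-like object (length, start(i), end(i)).
function bufferedAhead(buffered, time) {
  for (let i = 0; i < buffered.length; i++) {
    if (time >= buffered.start(i) && time < buffered.end(i)) {
      return buffered.end(i) - time;
    }
  }
  return 0; // playhead is not inside any buffered range
}

// Compare two samples taken some interval apart and name the likely state.
// Each sample: { currentTime, paused, ended, buffered }.
// The 2 second and 0.01 second thresholds are assumptions, not Dash.js values.
function classifyStall(prev, curr, minBufferSec = 2) {
  if (curr.ended) return 'ended';
  if (curr.paused) return 'paused';
  if (curr.currentTime - prev.currentTime > 0.01) return 'playing';

  const ahead = bufferedAhead(curr.buffered, curr.currentTime);
  // Playhead is stuck although enough data is buffered: the non-resuming case.
  if (ahead >= minBufferSec) return 'stuck-with-buffer';
  // Playhead is stuck and little or nothing is buffered ahead of it.
  return 'starved';
}
```

In a browser, the two samples would come from a timer that reads the same properties from the video element. A `'starved'` result that persists while requests succeed points at scheduling, whereas `'stuck-with-buffer'` points at the playback side.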

@KozhinM
Contributor

KozhinM commented Aug 26, 2015

> We also experience the issue #716 in some playback attempts, but when that error doesn't occur, the non-resuming stall symptoms still appear more often than every 10th attempt.

I implemented a fix for this one (see #716).

@KozhinM
Contributor

KozhinM commented Aug 26, 2015

@torerikal, the following change should fix one of the causes of the issue:
https://github.com/MSOpenTech/dash.js/commit/556dbc9a871f6fe20a4275662d018f5808aae26f
It seems there are other causes, because I still see playback stop sometimes, but it now happens less frequently.

@bwidtmann
Contributor

Hi @torerikal

We already noticed that 1.4 release has stability problems (stalling/freezing). For me (I can only speak about on demand streaming) there are 2 different cases: Did the stall/freeze happen after seeking or without seeking?

Case 1 (after seeking):
The root of this problem is the silent pruning of the source buffer by the browser itself. Therefore you need either the fix we provided (see PR #633) or the alternate fix provided by BBC (see PR #713). Please try one of these and give us feedback in the corresponding discussions.
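To check whether the browser has pruned data, one option is to snapshot the `buffered` ranges before and after a seek and compute what disappeared. This is a minimal sketch under that assumption, not the approach taken in either PR; `toPairs` and `prunedRanges` are hypothetical names, and the input only needs the standard TimeRanges shape (`length`, `start(i)`, `end(i)`).

```javascript
// Copy a TimeRanges-like object into a plain array of [start, end] pairs,
// so the snapshot does not change when the browser updates the buffer.
function toPairs(buffered) {
  const pairs = [];
  for (let i = 0; i < buffered.length; i++) {
    pairs.push([buffered.start(i), buffered.end(i)]);
  }
  return pairs;
}

// Return the parts of `before` that are no longer covered by `after`.
// Both inputs are sorted, non-overlapping arrays of [start, end] pairs,
// which is how TimeRanges are normalized.
function prunedRanges(before, after) {
  const pruned = [];
  for (const [start, end] of before) {
    let cursor = start;
    for (const [aStart, aEnd] of after) {
      if (aEnd <= cursor || aStart >= end) continue; // no overlap with the remainder
      if (aStart > cursor) pruned.push([cursor, aStart]); // gap before this range
      cursor = Math.max(cursor, aEnd);
    }
    if (cursor < end) pruned.push([cursor, end]); // tail that is no longer buffered
  }
  return pruned;
}
```

A non-empty result for a snapshot pair taken around a seek, without the player having removed anything itself, would be consistent with the browser evicting data on its own.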

Case 2 (without seeking):
The root of this problem is sometimes the complexity of multiple pending and rejected requests. Please try PR #670 (which is already merged in current dev branch).

The combination of both fixes gives us stable on demand streaming on our production system.

@torerikal
Author

@KozhinM I have added the change you mentioned (i.e. the last one, along with the #716 fix), and I agree that the stream seems to stall less frequently. However, I also observe that there are still issues causing the same symptom. Thanks for your help.

@KozhinM
Contributor

KozhinM commented Aug 28, 2015

@torerikal, thanks for testing the fixes. I have already merged them into the dev branch. I will continue investigating the remaining causes.

@KozhinM
Contributor

KozhinM commented Aug 31, 2015

@torerikal, the test mpd that you provided seems to be down. Do you have any other mpds to test? I have a couple, but they have a single representation for audio/video and the playback appears to be stable. I suspect that the issues may be related to the quality switch process, so if you have live content with multiple representations it would be very helpful.

@torerikal
Author

Sorry, we closed down the live stream by the end of last week. It is restarted now, so it'd be great if you could have a look, @KozhinM .

@bwidtmann, The most apparent condition is what you describe in case 1, occurring after seeking. We have applied your fix, but need to isolate and re-run the test cases with and without it, before we are able to give an answer with confidence. Unfortunately we are a bit tight on our schedule, so it might take some more time.

@bwidtmann
Contributor

@torerikal try setting these config params if you want to achieve more stable seeking behavior:
MediaPlayer.dependencies.BufferController.BUFFER_TO_KEEP = 10;
MediaPlayer.dependencies.BufferController.BUFFER_PRUNING_INTERVAL = 10;

@KozhinM
Contributor

KozhinM commented Sep 1, 2015

Update: sometimes the stall occurs because DashHandler.isMediaFinished returns true when it should return false. In this case, I observe that the start time of the segment checked in isMediaFinished is greater than the period end time. The segment time is correct, since we get it from the mpd. The period end time is calculated as:

periodEnd = checkTime = fetchTime + minimumUpdatePeriod = (mpdLoadedTime - availabilityStartTime + clientServerTimeShift) + minimumUpdatePeriod

Since this stream has a SegmentTimeline, the value of clientServerTimeShift is calculated as the difference between the start time of the last segment in the mpd (the actual live edge) and the expected value of the live edge: Math.min(checkTime, now) - segmentDuration. I have not found a bug in these calculations yet, but setting clientServerTimeShift = 0; in TimelineConverter.onLiveEdgeSearchCompleted prevents isMediaFinished from returning true. I will continue investigating.
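To make the arithmetic concrete, the sketch below mirrors the period end formula quoted in this comment and shows how a shift in clientServerTimeShift can push a valid segment past the computed period end. It is not the Dash.js source: `computePeriodEnd` and `looksFinished` are hypothetical names, all numbers are invented, and the sign and size of the real shift were not established in this thread.

```javascript
// All times are in seconds. Mirrors the formula quoted above:
// periodEnd = checkTime = fetchTime + minimumUpdatePeriod
// fetchTime = mpdLoadedTime - availabilityStartTime + clientServerTimeShift
function computePeriodEnd(p) {
  const fetchTime = p.mpdLoadedTime - p.availabilityStartTime + p.clientServerTimeShift;
  return fetchTime + p.minimumUpdatePeriod;
}

// The reported failure mode: a segment that starts after periodEnd
// makes the media look finished, so no further segments are requested.
function looksFinished(segmentStart, periodEnd) {
  return segmentStart > periodEnd;
}

// Invented example values.
const base = {
  mpdLoadedTime: 1600,
  availabilityStartTime: 1000,
  minimumUpdatePeriod: 10,
};
const segmentStart = 605; // taken from the manifest, assumed correct

const endNoShift = computePeriodEnd({ ...base, clientServerTimeShift: 0 });    // 610
const endWithShift = computePeriodEnd({ ...base, clientServerTimeShift: -8 }); // 602

console.log(looksFinished(segmentStart, endNoShift));   // false
console.log(looksFinished(segmentStart, endWithShift)); // true
```

With these numbers, a shift of only 8 seconds is enough to flip the result, which is consistent with the observation that forcing the shift to 0 avoids the stall.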

@torerikal
Author

We have done further tests of the on demand seeking issue. With PR #633 the problem is less apparent, and the BufferController config params made the issue almost impossible to reproduce. However, we still managed to trigger the stall symptom after many attempts.

Our observations show that sometimes Dash.js gives up on fetching segments, and sometimes it never resumes playing when the buffer level becomes good enough again. In the latter case, we can resume playback manually by modifying the playback rate directly on the video DOM element.
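The manual workaround can be sketched as a small helper that changes `playbackRate` briefly and then restores it. This is a hypothetical illustration only: the thread does not say which rate was used or whether restoring it is needed, so `nudgePlaybackRate`, the 0.01 delta, and the 100 ms delay are assumptions. `playbackRate` itself is a standard property of the video element.

```javascript
// Hypothetical workaround helper: briefly change playbackRate, then restore it.
// `video` is an HTMLVideoElement, or any object with a writable playbackRate.
// `schedule` defaults to setTimeout and is injectable to ease testing.
function nudgePlaybackRate(video, delta = 0.01, restoreAfterMs = 100, schedule = setTimeout) {
  const original = video.playbackRate;
  video.playbackRate = original + delta; // assumed to kick the stuck playhead
  schedule(() => {
    video.playbackRate = original; // put the user's rate back
  }, restoreAfterMs);
  return original;
}
```

In a page this would be called with the player's video element, for example `nudgePlaybackRate(document.querySelector('video'))`, once a stall with sufficient buffer has been detected.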

Since this issue was reported, we have switched to another solution for DASH playback in the browser. This makes it less relevant to do any more troubleshooting of the issues described here.

However, I hope these discussions and test reports have brought some value to the Dash.js project, both for other users of the library and for the quality of the project.

@dsparacio
Contributor

@torerikal I encourage you to check out Dash.js 2.0 as we fixed many of the stability issues. I am going to close this issue.
