Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix management of dash segment template index references for multiperiod live DASH streams #3419

Closed

Conversation

caridley
Copy link
Contributor

Description

Fixes #3354

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to
    not work as expected)
  • This change requires a documentation update

Checklist:

  • I have signed the Google CLA https://cla.developers.google.com
  • My code follows the style guidelines of this project
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have verified my change on multiple browsers on different platforms
  • I have run ./build/all.py and the build passes
  • I have run ./build/test.py and all tests pass

Testing on other platforms is in still in progress, but I would like to get some early feedback on the proposed changes.

Could use some guidance on building unit tests for this. I don't currently see any tests the timer based extension and evictions of references for live segment template indexes.

Note one automated test is failing on Safari, but I don't think it has anything to do with my changes
Safari 14.1 (Mac OS 10.15.7) CastUtils serialize/deserialize TimeRanges deserialize into equivalent objects FAILED
Error: Timeout - Async function did not complete within 120000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 7525)

Expected 0 to be greater than 0.

test/cast/cast_utils_unit.js:257:9 <- test/cast/cast_utils_unit.js:367:48

@google-cla
Copy link

google-cla bot commented May 19, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@caridley
Copy link
Contributor Author

I don't know why the CLA check is failing. @joeyparrish manually checked for my e-mail in our contributors group https://groups.google.com/d/forum/charter-contributors

Would a difference in capitalization of e-mail address cause a problem? My e-mail address is [email protected] in the google group [email protected] in my github account?

@caridley
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented May 20, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

// need to evict old references.
const willNeedToEvictReferences = getPeriodEnd() != Infinity &&
(presentationTimeline.getSegmentAvailabilityStart() > periodStart) &&
(presentationTimeline.getSegmentAvailabilityEnd() > getPeriodEnd());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not certain that this is the best way to compute that we will need to evict references.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are right that there are two cases:

  1. We need to add segments. This happens when the period goes beyond the availability window. Your logic for this already looks good to me.
  2. We need to evict segments. This happens when the period is available, but the availability duration is finite. Try this, with the computation done in stages for clarity:
const periodAvailable = availabilityStartTime <= getPeriodEnd();
const availabilityFinite = presentationTimeline.getSegmentAvailabilityDuration() != Infinity;
const willNeedToEvictReferences = periodAvailable && availabilityFinite;

The getter getSegmentAvailabilityDuration() will need to be added to PresentationTimeline. There is an internal member for it already called segmentAvailabilityDuration_.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joeyparrish what do you think about simplifying this to

const willNeedToEvictReferences = presentationTimeline.isLive();

All live streams and only live streams have a segmentAvailabilityStart time that moves forward and therefore have segment references that that need to be evicted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is broader, but some false positives should be okay. A live stream can have an infinite availability duration (technically), in which case nothing would ever be evicted. But it may not matter much if we ignore that case here.

@@ -426,7 +453,7 @@ shaka.dash.SegmentTemplate = class {
references.push(reference);
nextPosition++;
}
if (presentationTimeline.getSegmentAvailabilityEnd() >= periodEnd &&
if (availabilityStartTime > getPeriodEnd() &&
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is needed to make the timer continue firing until the entire period is prior to the segment availability window so that all references will be evicted. Prior to this the timer stopped firing prematurely causing a large number of these error messages

"Assertion failed: Should not see evicted segments after available segments"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see! That makes sense. Good catch, and thank you for the fix!

Can you please put your PR comment into a code comment, so that some well-meaning person doesn't some day revert your change to "optimize" something? Something like this would work for me:

// The timer must continue firing until the entire period is
// unavailable, so that all references will be evicted.

@caridley
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented May 20, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@caridley
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented May 20, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@caridley
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented May 20, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@caridley
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented May 21, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@caridley
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented May 21, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@joeyparrish
Copy link
Member

@caridley, please double-check the email address in your commits to make sure it's a charter.com address. If so, I may just need to override the CLA bot on this one.

@caridley
Copy link
Contributor Author

@joeyparrish double checked e-mail in commits

shaka-player % git log
commit 0624f23 (HEAD -> dashSegmentTemplatePeriodEnd, origin/dashSegmentTemplatePeriodEnd)
Merge: ccdb8a1 ebb4e18
Author: Aidan Ridley [email protected]
Date: Wed May 19 14:36:06 2021 -0600

Merge branch 'master' into dashSegmentTemplatePeriodEnd

commit ebb4e18
Author: Aidan Ridley [email protected]
Date: Wed May 19 14:35:25 2021 -0600

Add Charter Communication Inc to AUTHORS and Aidan Ridley <[email protected]> to CONTRIBUTORS

commit ccdb8a1
Author: Aidan Ridley [email protected]
Date: Wed May 19 14:00:05 2021 -0600

Prevent segment template indexes from continuing to grow after a finite period end time becomes available in the manifest.
Change time comparison that determines if we index extending timer continues to fire, so that it will evict references when availability start time passes the period end time.
Schedule the segment template index timer when it is needed to evict reference event thought it is not needed for adding referen
In streaming engine getSegmentReferenceNeeded_() clear the mediaState.segmentIterator when unable to locate a segment with getIteratorForTime() so that we can recover on the on the next update

commit b426fe8 (origin/master, origin/HEAD, master)
Author: Álvaro Velad Galván [email protected]
Date: Mon May 17 23:24:55 2021 +0200

...

@joeyparrish
Copy link
Member

I see now that earlier in this thread, you mention that I had verified that you are in the charter-contributors group. But that's not true. I apologize for any miscommunication. I don't have access to that group or its membership, so I have no idea if you are in it or not. Someone in Charter should have access to add you, but I don't know who that is. It appears that I can file a ticket within Google to find out who the group admin is at Charter. Please let me know if I need to do this.

In the meantime, I'll go ahead with review of the PR. I've file a ticket internally to ask if there is a way the CLA bot could provide more useful information to understand why a check fails in cases like yours.

@caridley
Copy link
Contributor Author

I know who the charter-contributors group admin is and have verified that I am in that group as [email protected]

Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thank you for your hard work!

lib/dash/segment_template.js Outdated Show resolved Hide resolved
@@ -426,7 +453,7 @@ shaka.dash.SegmentTemplate = class {
references.push(reference);
nextPosition++;
}
if (presentationTimeline.getSegmentAvailabilityEnd() >= periodEnd &&
if (availabilityStartTime > getPeriodEnd() &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see! That makes sense. Good catch, and thank you for the fix!

Can you please put your PR comment into a code comment, so that some well-meaning person doesn't some day revert your change to "optimize" something? Something like this would work for me:

// The timer must continue firing until the entire period is
// unavailable, so that all references will be evicted.

// need to evict old references.
const willNeedToEvictReferences = getPeriodEnd() != Infinity &&
(presentationTimeline.getSegmentAvailabilityStart() > periodStart) &&
(presentationTimeline.getSegmentAvailabilityEnd() > getPeriodEnd());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are right that there are two cases:

  1. We need to add segments. This happens when the period goes beyond the availability window. Your logic for this already looks good to me.
  2. We need to evict segments. This happens when the period is available, but the availability duration is finite. Try this, with the computation done in stages for clarity:
const periodAvailable = availabilityStartTime <= getPeriodEnd();
const availabilityFinite = presentationTimeline.getSegmentAvailabilityDuration() != Infinity;
const willNeedToEvictReferences = periodAvailable && availabilityFinite;

The getter getSegmentAvailabilityDuration() will need to be added to PresentationTimeline. There is an internal member for it already called segmentAvailabilityDuration_.

lib/media/streaming_engine.js Outdated Show resolved Hide resolved
@joeyparrish
Copy link
Member

I know who the charter-contributors group admin is and have verified that I am in that group as [email protected]

Acknowledged. I've added this info to the ticket I filed internally. Hopefully we'll resolve this soon.

@TheModMaker
Copy link
Contributor

Also, I wonder if this complex work is needed for the bug. From your description, it seems the problem is that we remain updating segment indexes forever. Could you just stop the timer once we get a new Period? We only need to update the segment indexes in the last Period; so you could just have the manifest update stop index updates for non-last Periods and call fit one last time to fit to the updated duration?

@joeyparrish
Copy link
Member

Also, I wonder if this complex work is needed for the bug. From your description, it seems the problem is that we remain updating segment indexes forever. Could you just stop the timer once we get a new Period? We only need to update the segment indexes in the last Period; so you could just have the manifest update stop index updates for non-last Periods and call fit one last time to fit to the updated duration?

I think I agree with the conclusion that we should continue evicting until the period is empty. Otherwise, we end up with some old periods that are empty and some that aren't (based on timing of the timer callbacks WRT the moving availability window). This results in assertion failures, and I suspect, the hanging issues some have been seeing.

The other issue is the updating-forever issue, which is wasteful, but otherwise doesn't seem to break anything AFAICT. This is the issue solved by getting updated period durations throughout the session.

@caridley, have I accurately represented all of that from your point of view?

@caridley
Copy link
Contributor Author

The other issue is the updating-forever issue, which is wasteful, but otherwise doesn't seem to break anything AFAICT.

Forever appending to the segment index for a period that does not initially have a finite duration is in fact the biggest problem. It makes the streaming engine attempt to fetch segments that do not exist at the end of an advertising period. Then for reasons I don't entirely understand, when the stalled player position falls our of the availability window, the player starts fetching and playing segments from the main stream instead of the inserted ad periods. Those segment happen to exist even through they are not used by the manifest during an ad break. Video is being played and you think everything is ok, but the wrong content is being played.

I am guessing that when player tries to recover by seeking forward into the availability window the segment index for the first period of main content has extended itself all the way to the availability end time.

@caridley
Copy link
Contributor Author

We are still trying to resolve our CLA issue, but we do still intend to complete this PR

@caridley
Copy link
Contributor Author

caridley commented Jun 1, 2021

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented Jun 1, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@caridley
Copy link
Contributor Author

caridley commented Jun 3, 2021

@googlebot I signed it!

@joeyparrish
Copy link
Member

Related fix for SegmentTimeline content: 3222370

caridley added 2 commits June 14, 2021 07:52
…hen unable to locate a segment at the requested time, per request from Joey
@caridley
Copy link
Contributor Author

@caridley, may we merge this PR now for a much-needed bugfix release, and follow-up with additional work once you find the cause of the assertion failure? May I offer to help debug in some way?

I have not been able to reproduce that assertion failure this morning, and I am OK with merging this PR.

In fact, I would really like to merge this PR as two distinct commits: one to address the DashParser issues around segment reference generation, and a second to resolve the issues with StreamingEngine and SegmentIterator. I think if we are bisecting or cherry-picking in the future, having these as distinct commits will be useful.

Do you need me to create separate PRs for the two issues? or is this something that is done on your end?

@joeyparrish
Copy link
Member

Do you need me to create separate PRs for the two issues? or is this something that is done on your end?

No, I will just rebase and merge the PR manually as two commits. Thanks!

joeyparrish pushed a commit that referenced this pull request Jun 16, 2021
In DASH with SegmentTemplate and a fixed duration, indexes would
continue to grow so long as they had not ended.  However, this was
based on a callback which captured an end time variable that could
become out-of-date in a multi-period live stream.

This prevents indexes from continuing to grow indefinitely by always
using the most up-to-date end time for the period.

This also fixes eviction of segments in the same scenario, which
fixes SegmentIterator access and related assertions later.

Issue #3354 (partial fix)

b/186457868

Change-Id: I2b34ee52dd12b59e1c1237258050b50e3189bee3
joeyparrish pushed a commit that referenced this pull request Jun 16, 2021
In multi-period live DASH streams, the SegmentIterator would sometimes
get reset to a value pointing to the beginning of the segment
references for the period, causing StreamingEngine to re-fetch
segments it already had.

This fixes SegmentIterator's getIteratorForTime() to return null when
a reference can't be found, and fixes StreamingEngine to handle null
returns by waiting and checking again.

Closes #3354

b/186457868

Change-Id: I2b34ee52dd12b59e1c1237258050b50e3189bee3
@joeyparrish
Copy link
Member

I don't know what I should have done differently to make GitHub see the PR as merged. It's possible that my two-commit breakdown made that impossible. Sorry! But it is merged, and you are listed as the author of both commits.

Thank you so much for your contributions and your patience. I'm going to start the process of cherry-picking, testing, and releasing v3.1.1, v3.0.12, and v2.5.23 now.

@joeyparrish joeyparrish reopened this Jun 16, 2021
joeyparrish added a commit that referenced this pull request Jun 16, 2021
Based on discussion of PR #3419

Backported to v3.0.x

Change-Id: I02bfc615a169b54b94cdc624ace77d9ffefd624e
joeyparrish added a commit that referenced this pull request Jun 16, 2021
In DASH with SegmentTimeline, we had previously assumed that the
timeline would only ever have items added to its end.  In practice,
some content adds segments to the beginning of the timeline in an
update.

When this happened, assumptions higher up the stack were broken, and
StreamingEngine began streaming from the wrong position, re-fetching
segments that were already buffered.

It is relatively simple to filter the incoming segment references to
ensure that we only ever grow our list from the end.  This fixes
assumptions in other components and resolves the re-fetching issue for
SegmentTimeline-based content.

PR #3419 will further address issues with SegmentTemplate+duration.

See also issue #3354
b/186457868

Change-Id: Ibc03a697dd6a2a6f5e0f539cb7cf94bd9a63b495
joeyparrish pushed a commit that referenced this pull request Jun 16, 2021
In DASH with SegmentTemplate and a fixed duration, indexes would
continue to grow so long as they had not ended.  However, this was
based on a callback which captured an end time variable that could
become out-of-date in a multi-period live stream.

This prevents indexes from continuing to grow indefinitely by always
using the most up-to-date end time for the period.

This also fixes eviction of segments in the same scenario, which
fixes SegmentIterator access and related assertions later.

Issue #3354 (partial fix)

b/186457868

Change-Id: I2b34ee52dd12b59e1c1237258050b50e3189bee3
joeyparrish pushed a commit that referenced this pull request Jun 16, 2021
In multi-period live DASH streams, the SegmentIterator would sometimes
get reset to a value pointing to the beginning of the segment
references for the period, causing StreamingEngine to re-fetch
segments it already had.

This fixes SegmentIterator's getIteratorForTime() to return null when
a reference can't be found, and fixes StreamingEngine to handle null
returns by waiting and checking again.

Closes #3354

b/186457868

Backported to v3.0.x

Change-Id: I2b34ee52dd12b59e1c1237258050b50e3189bee3
joeyparrish added a commit that referenced this pull request Jun 16, 2021
Based on discussion of PR #3419

Change-Id: I02bfc615a169b54b94cdc624ace77d9ffefd624e
joeyparrish added a commit that referenced this pull request Jun 16, 2021
In DASH with SegmentTimeline, we had previously assumed that the
timeline would only ever have items added to its end.  In practice,
some content adds segments to the beginning of the timeline in an
update.

When this happened, assumptions higher up the stack were broken, and
StreamingEngine began streaming from the wrong position, re-fetching
segments that were already buffered.

It is relatively simple to filter the incoming segment references to
ensure that we only ever grow our list from the end.  This fixes
assumptions in other components and resolves the re-fetching issue for
SegmentTimeline-based content.

PR #3419 will further address issues with SegmentTemplate+duration.

See also issue #3354
b/186457868

Change-Id: Ibc03a697dd6a2a6f5e0f539cb7cf94bd9a63b495
joeyparrish pushed a commit that referenced this pull request Jun 16, 2021
In DASH with SegmentTemplate and a fixed duration, indexes would
continue to grow so long as they had not ended.  However, this was
based on a callback which captured an end time variable that could
become out-of-date in a multi-period live stream.

This prevents indexes from continuing to grow indefinitely by always
using the most up-to-date end time for the period.

This also fixes eviction of segments in the same scenario, which
fixes SegmentIterator access and related assertions later.

Issue #3354 (partial fix)

b/186457868

Change-Id: I2b34ee52dd12b59e1c1237258050b50e3189bee3
joeyparrish pushed a commit that referenced this pull request Jun 16, 2021
In multi-period live DASH streams, the SegmentIterator would sometimes
get reset to a value pointing to the beginning of the segment
references for the period, causing StreamingEngine to re-fetch
segments it already had.

This fixes SegmentIterator's getIteratorForTime() to return null when
a reference can't be found, and fixes StreamingEngine to handle null
returns by waiting and checking again.

Closes #3354

b/186457868

Change-Id: I2b34ee52dd12b59e1c1237258050b50e3189bee3
@caridley
Copy link
Contributor Author

caridley commented Jun 17, 2021

I don't know what I should have done differently to make GitHub see the PR as merged. It's possible that my two-commit breakdown made that impossible. Sorry! But it is merged, and you are listed as the author of both commits.

Thank you so much for your contributions and your patience. I'm going to start the process of cherry-picking, testing, and releasing v3.1.1, v3.0.12, and v2.5.23 now.

@joeyparrish we have not been seeing these problems with the 2.5.x versions

@joeyparrish
Copy link
Member

@joeyparrish we have not been seeing these problems with the 2.5.x versions

This particular change didn't apply to v2.5, but I was waiting to release from all three active branches.

@mlevkov
Copy link

mlevkov commented Jul 20, 2021

@joeyparrish and @caridley Do this fix intended to support "static" or only live? I am seeing this issue with the multi-period static type.

@caridley
Copy link
Contributor Author

@joeyparrish and @caridley Do this fix intended to support "static" or only live? I am seeing this issue with the multi-period static type.

I think this particular problem was specific to live multi-period streams. We never observed this problem with our static VOD multi-period streams.

@joeyparrish
Copy link
Member

That's correct. The change only affected live, multi-period DASH.

@mlevkov
Copy link

mlevkov commented Jul 21, 2021

@caridley I have this case right now and would be open to sharing whatever (within reason) you need to confirm. @joeyparrish Thank you for letting me know.

@TheModMaker
Copy link
Contributor

@mlevkov If you have a problem, please file a new issue, filling out the bug template with your manifest content.

@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Jul 25, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: archived Archived and locked; will not be updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Live DASH multiperiod SSAI not working in shaka 3.x.x because segment template indexes extend forever
4 participants