Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Improve FREEZING mechanisms and add Representation avoidance mechanism #1523

Open
wants to merge 10 commits into
base: dev
Choose a base branch
from

Conversation

peaBerberian
Copy link
Collaborator

@peaBerberian peaBerberian commented Sep 2, 2024

Situation

We have lately seen on some LG and Philips TVs what we call "infinite FREEZING" occurences: the playback position (the HTMLMediaElement's currentTime property) was not advancing and most of the time video media was not playing either (though in some occurences audio was) despite having largely enough media data in media buffers. Sadly our usual trick to restart playback, seeking close to the current position, didn't seem to have an effect.

Most of those freezing cases were happening as playback switched from a video quality to another (sometimes both had to be specific, sometimes only the destination one had an impact).

It is probably better to detect which situations causes which problems on which devices to work-around it either in the RxPlayer (when for example we KNOW that a device has issues with a given codec in general) or inside the application (when we only reproduced it on a specific application and there's many unknowns left). Yet we found out after encountering issues again and again, especially on some smart TVs, that it may not be realistic to catch in advance (before production) all potential breakages that could exist.

So instead, we thought about the possibility to define an heuristic detecting that we're probably encountering this type of issue, to then work-around it automatically.

Rough idea of the solution

So, the idea is to detect when a FREEZING case occurs right when switching from a given video/audio quality to another. When that's the case:

  1. We first try to "un-freeze" playback, by e.g. performing a small seek. This is what we already do today in FREEZING cases.

  2. If that doesn't work (we're still freezing), I added here a second mechanism: we "avoid" the corresponding new quality played, so it isn't played anymore in most cases.

So basically once we see that switching to another quality lead to FREEZING, we avoid playing that quality for the rest of playback.

Unlike other scenarios where we prevent playback of qualities (in the RxPlayer called Representation), here the "avoidance mechanism" is only considered if there's other, non-avoided, qualities we can play on the current track: meaning that if all available qualities are "avoided", we will play them again. This is to protect against errors in cases where all qualities would be avoided: because our heuristic is not 100% sure science, we would here prefer there to keep playback potentially happening (the alternative would have been stopping playback on an error).

As of now, the "Representation avoidance" has no impact on the API: corresponding Representation are still returned as if they are playable through our getVideoTrack API for example, and the user can still choose to call lockVideoRepresentation on avoided Representation without knowing. As of now, the "main thread" part of the code isn't even aware of such avoidance happening, for it we're just curiously never seem to be playing some Representation. This also means that our DEBUG_ELEMENT feature won't even notify for now if there's avoidance happening.

This may change in the future, but for now, this is because I did not want to complexify this experiment too much.

PS: I like this "avoidance" concept, as it seems to me to be portable for much more future cases where we would prefer not playing qualities because they seem to present issues (e.g. if MEDIA_ERR_DECODE or BUFFER_APPEND_ERROR errors seem to only happen with some Representation).

Bonus: also reloading in other cases

There are several other posibilities for infinite FREEZING: Period changes gone wrong, random decoding issues, etc.

To also provide a better solution for those cases, I choose to "reload" (in the RxPlayer, "reloading" means removing and re-creating all the media buffers, leading to a temporary black screen) if the initial "un-freezing" attempt (the seek-based one) didn't have any effect.

Unlike when the freezing seems to be linked to a quality change though, there's no Representation avoidance happening in that case.

Implementation

FreezeResolver

I renamed the very-specific DecipherabilityFreezeDetector class - which only handled freezes linked to DRM - into a more general FreezeResolver class, which will now perform all un-freezing attempts (even the seekinf one, which previously happened in the RebufferingController in main thread whereas it's now in core - thus optionally running in our Worker).

The FreezeResolver's onObservation method has to be called even when playback goes well, at each produced media observation, because it needs to construct its history of played segments (see next chapter).

Quality switch detection

Knowing whether we switched from a given media quality to another is not straightforward and we can never really know for sure as there's edge cases where it might be device-specific (e.g. segment replacements, segment pushing still pending etc.).

However, we have a rough idea by inspecting the HTMLMediaElement's currentTime property (which is roughly the current media position being played), and the content of our SegmentInventory class, which stores metadata about all media segments available in the buffer.

From there, I maintain a short-term history in the FreezeResolver of what seemed to be the current video AND audio quality played. If a FREEZING, which we do not seem to be able to fix by seeking, seems to coincide with a quality switch, we propose to avoid the Representation.

For now this also lead to a RELOADING operation, though it may not even be required in some cases (e.g. replacing segments and seeking in place might do the trick). Yet reloading should have more chance of fixing it (even though it leads to a temporary black screen, where a still video frame would have been less disruptive for users).

Avoiding Representation

Then, an AdaptationStream is able to filter out avoided Representation (unless it has no choice), when asking the AdaptiveRepresentationSelector (managing our Adaptive BitRate logic) which Representation should be played (though I'm still not sure whether the AdaptiveRepresentationSelector should actually be the one doing that?).

Where we go from there

This heuristic seems somewhat risky. as we're essentially blacklisting qualities from ever be playing again on the current content.

So what I thought was to put the avoidance mechanism behind an experimental experimental.enableRepresentationAvoidance loadVideo option, that applications would have to enable (making it also possible to disable at any time easily through some config).

However the reload mechanism if un-freezing fails seems OK to me for enabling it by default.

If we do notice clear improvements, we might think of enabling the avoidance mechanism by default, removing the
experimental.enableRepresentationAvoidance option.

Thoughts?

@peaBerberian peaBerberian added proposal This Pull Request or Issue is only a proposal for a change with the expectation of a debate on it Priority: 2 (Medium) This issue or PR has a medium priority. labels Sep 3, 2024
@peaBerberian peaBerberian force-pushed the misc/lg-detect branch 3 times, most recently from 173db61 to 6287c96 Compare September 5, 2024 09:38
@peaBerberian peaBerberian force-pushed the misc/lg-detect branch 4 times, most recently from a28d152 to 9455db5 Compare October 25, 2024 09:51
Copy link
Collaborator

@Florent-Bouisset Florent-Bouisset left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand "deprecated" representation, it feels like it's mean that it is no longer supported because the representation uses aged codecs or DRM encrypting.
I could see in the future an usage of marking as deprecated H.264 codecs and prefer H.265 one; or even mark a content encrypted with CENC as deprecated to prefer CBCS encrypted one.

What do you think about "Unstable" or "Avoided" ?

@peaBerberian
Copy link
Collaborator Author

it feels like it's mean that it is no longer supported because the representation uses aged codecs or DRM encrypting

I don't see deprecation as a lack of support but more of a discouragement on its usage personally, but I see your point.

I'll check with Avoidance/Avoided/Avoid if it makes sense in the various usages.
We would thus have something like a experimental.enableRepresentationAvoidance option

@peaBerberian peaBerberian added this to the 4.3.0 milestone Nov 6, 2024
@peaBerberian peaBerberian force-pushed the misc/lg-detect branch 3 times, most recently from f00a8aa to 7243ab1 Compare November 6, 2024 21:33
@peaBerberian
Copy link
Collaborator Author

Done (this late, because during very uneventful on-call duties ;p)

I relied on the "Avoidance" name for now, as "unstable" could mean that the issue is with the media itself - yet we don't know that, all we know is that the device seems to have issues decoding it.

@peaBerberian peaBerberian changed the title [Proposal] Improve FREEZING mechanisms and add Representation deprecation [Proposal] Improve FREEZING mechanisms and add Representation avoidance mechanism Nov 6, 2024
src/core/main/worker/worker_main.ts Outdated Show resolved Hide resolved
src/core/main/worker/worker_main.ts Outdated Show resolved Hide resolved
@@ -708,6 +712,34 @@ export default class MediaSourceContentInitializer extends ContentInitializer {
? !playbackObserver.getIsPaused()
: autoPlay;
onReloadOrder({ position, autoPlay: autoplay });
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does triggerReload access playbackObserver.getReference().getValue() where it can use the observation that was emitted in the parent scope ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've done that because as a closure, it would then have an implicit dependency on the time at which it should be called: i.e. it should be called ASAP - and probably synchronously, as at the time of the next observation it would rely on stale data.

To avoid adding such complex implicit dependencies, I get the last observation, even if right now it's only called synchronously so that last observation is the same one than in the parent scope.

Basically to me: there's nothing in that function declaration indicating that we should call it synchronously. Thus by relying on the parent scope's observation, we risk creating future bugs (or more probably: future weird behaviors where we would be reloading in the "past").

src/main_thread/init/media_source_content_initializer.ts Outdated Show resolved Hide resolved
src/core/main/common/FreezeResolver.ts Outdated Show resolved Hide resolved
src/core/main/common/FreezeResolver.ts Outdated Show resolved Hide resolved

const hasDecipherabilityFreezePotential =
(rebufferingForTooLong || frozenForTooLong) &&
((bufferGap < 6 && !fullyLoaded) || readyState > 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the condition bufferGap < 6 does this means that it cannot be a decipherable freeze if there is more than 6 sec of media data in the buffer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right this parenthesis looks reversed it should maybe be (bufferGap >= 6 || fullyLoaded)....

Will look at the previous code because it must be a refactoring mistake.

src/default_config.ts Show resolved Hide resolved
src/core/main/common/FreezeResolver.ts Outdated Show resolved Hide resolved
Comment on lines +349 to +389
for (let i = segmentList.length - 2; i >= 0; i--) {
const segment = segmentList[i];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if segmentList.length := 1 ?
I'm not sure we are checking this correctly

Copy link
Collaborator Author

@peaBerberian peaBerberian Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to find the previous Representation's segment I think here. It makes no sense if there's only one entry in history and in that case currentSegmentEntry is already the earliest reference to the current segment, so I think that loop is OK

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean segmentList[i] will be undefined if i is below 0.
Next line segment.segment will throw an error if segment is undefined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well condition i >= 0; guarantees that i is positive so there will be no error here

peaBerberian and others added 9 commits December 12, 2024 14:49
…recation

Situation
=========

We have lately seen on some LG and Philips TVs what we call "infinite
`FREEZING`" occurences: the playback position (the `HTMLMediaElement`'s
`currentTime` property) was not advancing and most of the time video
media was not playing either (though in some occurences audio was)
despite having largely enough media data in media buffers. Sadly our
usual trick to restart playback, seeking close to the current position,
didn't seem to have an effect.

Most of those freezing cases were happening as playback switched from a
video quality to another (sometimes both had to be specific, sometimes
only the destination one had an impact).

It is probably better to detect which situations causes which problems on
which devices to work-around it either in the RxPlayer (when for example
we KNOW that a device has issues with a given codec in general) or inside
the application (when we only reproduced it on a specific application and
there's many unknowns left). Yet we found out after encountering issues
again and again, especially on some smart TVs, that it may not be
realistic to catch in advance (before production) all potential breakages
that could exist.

So instead, we thought about the possibility to define an heuristic
detecting that we're probably encountering this type of issue, to then
work-around it automatically.

Rough idea of the solution
==========================

So, the idea is to detect when a `FREEZING` case occurs right when
switching from a given video/audio quality to another. When that's the
case:

1. We first try to "un-freeze" playback, by e.g. performing a small seek.
   This is what we already do today in `FREEZING` cases.

2. If that doesn't work (we're still freezing), I added here a second
   mechanism: we "deprecate" (the name can change) the corresponding new
   quality played, so it isn't played anymore in most cases.

So basically once we see that switching to another quality lead to
`FREEZING`, we avoid playing that quality for the rest of playback.

Unlike other scenarios where we prevent playback of qualities (in the
RxPlayer called `Representation`), here the deprecation is only
considered if there's other, non-deprecated, qualities we can play on the
current track: meaning that if all available qualities are deprecated, we
will play them again. This is to protect against errors in cases where
all qualities would be deprecated: because our heuristic is not 100% sure
science, we would here prefer there to keep playback potentially
happening (the alternative would have been stopping playback on an
error).

As of now, the deprecation has no impact on the API: corresponding
`Representation` are still returned as if they are playable through our
`getVideoTrack` API for example, and the user can still choose to call
`lockVideoRepresentation` on deprecated `Representation` without knowing.
As of now, the "main thread" part of the code isn't even aware of
deprecation happening, for it we're just curiously never seem to be
playing some `Representation`. This also means that our `DEBUG_ELEMENT`
feature won't even notify for now if there's deprecation happening.

This may change in the future, but for now, this is because I did not
want to complexify this experiment too much.

PS: I like this "deprecation" concept, as it seems to me to be portable
for much more future cases where we would prefer not playing qualities
because they seem to present issues (e.g. if `MEDIA_ERR_DECODE` or
`BUFFER_APPEND_ERROR` errors seem to only happen with some
`Representation`).

Bonus: also reloading in other cases
====================================

There are several other posibilities for infinite `FREEZING`: Period
changes gone wrong, random decoding issues, etc.

To also provide a better solution for those cases, I choose to "reload"
(in the RxPlayer, "reloading" means removing and re-creating all the
media buffers, leading to a temporary black screen) if the initial
"un-freezing" attempt (the seek-based one) didn't have any effect.

Unlike when the freezing seems to be linked to a quality change though,
there's no deprecation happening in that case.

Implementation
==============

FreezeResolver
--------------

I renamed the very-specific `DecipherabilityFreezeDetector` class - which
only handled freezes linked to DRM - into a more general `FreezeResolver`
class, which will now perform all un-freezing attempts (even the seekinf
one, which previously happened in the `RebufferingController` in main
thread whereas it's now in `core` - thus optionally running in our
`Worker`).

The `FreezeResolver`'s `onObservation` method has to be called even when
playback goes well, at each produced media observation, because it needs
to construct its history of played segments (see next chapter).

Quality switch detection
------------------------

Knowing whether we switched from a given media quality to another is not
straightforward and we can never really know for sure as there's edge
cases where it might be device-specific (e.g. segment replacements,
segment pushing still pending etc.).

However, we have a rough idea by inspecting the `HTMLMediaElement`'s
`currentTime` property (which is roughly the current media position being
played), and the content of our `SegmentInventory` class, which stores
metadata about all media segments available in the buffer.

From there, I maintain a short-term history in the `FreezeResolver` of
what seemed to be the current video AND audio quality played. If a
`FREEZING`, which we do not seem to be able to fix by seeking, seems to
coincide with a quality switch, we propose to deprecate the
`Representation`.

For now this also lead to a `RELOADING` operation, though it may not even
be required in some cases (e.g. replacing segments and seeking in place
might do the trick). Yet reloading should have more chance of fixing it
(even though it leads to a temporary black screen, where a still video
frame would have been less disruptive for users).

Avoiding deprecated `Representation`
------------------------------------

Then, an `AdaptationStream` is able to filter out deprecated
`Representation` (unless it has no choice), when asking the
`AdaptiveRepresentationSelector` (managing our Adaptive BitRate logic)
which `Representation` should be played (though I'm still not sure
whether the `AdaptiveRepresentationSelector` should actually be the one
doing that?).

Where we go from there
======================

This heuristic seems somewhat risky. as we're essentially blacklisting
qualities from ever be playing again on the current content.

So what I thought was to put the deprecation mechanism behind an
experimental `experimental.enableRepresentationDeprecation` `loadVideo`
option, that applications would have to enable (making it also possible
to disable at any time easily through some config).

However the reload mechanism if un-freezing fails seems OK to me for
enabling it by default.

If we do notice clear improvements, we might think of enabling the
deprecation mechanism by default, removing the
`experimental.enableRepresentationDeprecation` option.

Thoughts?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: 2 (Medium) This issue or PR has a medium priority. proposal This Pull Request or Issue is only a proposal for a change with the expectation of a debate on it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants