404'ing segments retry forever and do not respect maxRetries setting #4548

Closed
shakso opened this issue Oct 5, 2022 · 9 comments · Fixed by #4635 or #4424
Labels: priority: P2 (Smaller impact or easy workaround); status: archived (Archived and locked; will not be updated); type: bug (Something isn't working correctly)

Comments

@shakso

shakso commented Oct 5, 2022

Have you read the FAQ and checked for duplicate open issues?
Yes

What version of Shaka Player are you using?
4.2.1

Can you reproduce the issue with our latest release version?
Yes

Can you reproduce the issue with the latest code from main?
Yes

Are you using the demo app or your own custom app?
Demo

If custom app, can you reproduce the issue using our demo app?
N/A

What browser and OS are you using?
Chrome, MacOS

For embedded devices (smart TVs, etc.), what model and firmware version are you using?
N/A

What are the manifest and license server URIs?

https://s3-eu-west-1.amazonaws.com/test-mpds/media-404/test.mpd

What configuration are you using? What is the output of player.getConfiguration()?

Default demo config (see below for JSON stringify)
{"drm":{"retryParameters":{"maxAttempts":2,"baseDelay":1000,"backoffFactor":2,"fuzzFactor":0.5,"timeout":30000,"stallTimeout":5000,"connectionTimeout":10000},"servers":{},"clearKeys":{},"advanced":{"com.widevine.alpha":{"distinctiveIdentifierRequired":false,"persistentStateRequired":false,"videoRobustness":"","audioRobustness":"","sessionType":"","serverCertificate":{},"serverCertificateUri":"","individualizationServer":""},"com.microsoft.playready":{"distinctiveIdentifierRequired":false,"persistentStateRequired":false,"videoRobustness":"","audioRobustness":"","sessionType":"","serverCertificate":{},"serverCertificateUri":"","individualizationServer":""},"com.apple.fps":{"distinctiveIdentifierRequired":false,"persistentStateRequired":false,"videoRobustness":"","audioRobustness":"","sessionType":"","serverCertificate":{},"serverCertificateUri":"","individualizationServer":""},"com.adobe.primetime":{"distinctiveIdentifierRequired":false,"persistentStateRequired":false,"videoRobustness":"","audioRobustness":"","sessionType":"","serverCertificate":{},"serverCertificateUri":"","individualizationServer":""},"org.w3.clearkey":{"distinctiveIdentifierRequired":false,"persistentStateRequired":false,"videoRobustness":"","audioRobustness":"","sessionType":"","serverCertificate":{},"serverCertificateUri":"","individualizationServer":""}},"delayLicenseRequestUntilPlayed":false,"logLicenseExchange":false,"updateExpirationTime":1,"preferredKeySystems":[],"keySystemsMapping":{}},"manifest":{"retryParameters":{"maxAttempts":2,"baseDelay":1000,"backoffFactor":2,"fuzzFactor":0.5,"timeout":30000,"stallTimeout":5000,"connectionTimeout":10000},"availabilityWindowOverride":null,"disableAudio":false,"disableVideo":false,"disableText":false,"disableThumbnails":false,"defaultPresentationDelay":0,"segmentRelativeVttTiming":false,"dash":{"clockSyncUri":"https://shaka-player-demo.appspot.com/time.txt","ignoreDrmInfo":false,"disableXlinkProcessing":false,"xlinkFailGracefully":false,"ignoreMinBufferTime":false,"autoCorrectDrift":true,"initialSegmentLimit":1000,"ignoreSuggestedPresentationDelay":false,"ignoreEmptyAdaptationSet":false,"ignoreMaxSegmentDuration":false,"keySystemsByURI":{"urn:uuid:1077efec-c0b2-4d02-ace3-3c1e52e2fb4b":"org.w3.clearkey","urn:uuid:e2719d58-a985-b3c9-781a-b030af78d30e":"org.w3.clearkey","urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed":"com.widevine.alpha","urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95":"com.microsoft.playready","urn:uuid:79f0049a-4098-8642-ab92-e65be0885f95":"com.microsoft.playready","urn:uuid:f239e769-efa3-4850-9c16-a903c6932efb":"com.adobe.primetime"}},"hls":{"ignoreTextStreamFailures":false,"ignoreImageStreamFailures":false,"defaultAudioCodec":"mp4a.40.2","defaultVideoCodec":"avc1.42E01E","ignoreManifestProgramDateTime":false,"mediaPlaylistFullMimeType":"video/mp2t; codecs=\"avc1.42E01E, mp4a.40.2\""}},"streaming":{"retryParameters":{"maxAttempts":2,"baseDelay":1000,"backoffFactor":2,"fuzzFactor":0.5,"timeout":30000,"stallTimeout":5000,"connectionTimeout":10000},"rebufferingGoal":2,"bufferingGoal":10,"bufferBehind":30,"ignoreTextStreamFailures":false,"alwaysStreamText":false,"startAtSegmentBoundary":false,"gapDetectionThreshold":0.5,"durationBackoff":1,"forceTransmuxTS":false,"safeSeekOffset":5,"stallEnabled":true,"stallThreshold":1,"stallSkip":0.1,"useNativeHlsOnSafari":true,"inaccurateManifestTolerance":2,"lowLatencyMode":false,"autoLowLatencyMode":false,"forceHTTPS":false,"preferNativeHls":false,"updateIntervalSeconds":1,"dispatchAllEmsgBoxes":false,"observeQualityChanges":false,"maxDisabledTime":30},"offline":{"usePersistentLicense":true,"numberOfParallelDownloads":5},"abr":{"enabled":true,"useNetworkInformation":true,"defaultBandwidthEstimate":1000000,"switchInterval":8,"bandwidthUpgradeTarget":0.85,"bandwidthDowngradeTarget":0.95,"restrictions":{"minWidth":0,"maxWidth":null,"minHeight":0,"maxHeight":null,"minPixels":0,"maxPixels":null,"minFrameRate":0,"maxFrameRate":null,"minBandwidth":0,"maxBandwidth":null},"advanced":{"minTotalBytes":128000,"minBytes":16000,"fastHalfLife":2,"slowHalfLife":5},"restrictToElementSize":false,"ignoreDevicePixelRatio":false},"preferredAudioLanguage":"en-GB","preferredTextLanguage":"en-GB","preferredVariantRole":"","preferredTextRole":"","preferredAudioChannelCount":2,"preferredVideoCodecs":[],"preferredAudioCodecs":[],"preferForcedSubs":false,"preferredDecodingAttributes":[],"restrictions":{"minWidth":0,"maxWidth":null,"minHeight":0,"maxHeight":null,"minPixels":0,"maxPixels":null,"minFrameRate":0,"maxFrameRate":null,"minBandwidth":0,"maxBandwidth":null},"playRangeStart":0,"playRangeEnd":null,"cmcd":{"enabled":false,"sessionId":"","contentId":"","useHeaders":false}}

What did you do?

Passed in a manifest with a 404ing segment.

What did you expect to happen?
A request on the 404ing segment to be sent twice, following the maxAttempts default parameter of 2.
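
For illustration, the expectation above can be modelled with a small retry schedule built from the streaming.retryParameters values in the config dump (maxAttempts, baseDelay, backoffFactor, fuzzFactor). This is a simplified sketch under the assumption that the first attempt is immediate and each later attempt waits a fuzzed, exponentially growing delay; `retryDelays` is a hypothetical helper, not Shaka Player's actual implementation.

```javascript
// Simplified model of a bounded retry schedule (hypothetical helper,
// not Shaka Player code). Returns the wait before each retry, in ms.
function retryDelays(params, random = Math.random) {
  const delays = [];
  let unfuzzed = params.baseDelay;
  // Attempt 1 is assumed to be immediate; attempts 2..maxAttempts wait.
  for (let attempt = 2; attempt <= params.maxAttempts; attempt++) {
    // Spread the delay by up to +/- fuzzFactor of its nominal value.
    const fuzz = (random() * 2 - 1) * params.fuzzFactor;
    delays.push(unfuzzed * (1 + fuzz));
    unfuzzed *= params.backoffFactor;
  }
  return delays;
}

const reportedDefaults = {
  maxAttempts: 2,
  baseDelay: 1000,
  backoffFactor: 2,
  fuzzFactor: 0.5,
};
// With maxAttempts of 2 there is exactly one retry, so one delay.
console.log(retryDelays(reportedDefaults).length);
```

Under this model, two requests are sent for the segment and then the request is reported as failed.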

What actually happened?

Requests to retrieve the segment were made ad infinitum.

@shakso shakso added the type: bug Something isn't working correctly label Oct 5, 2022
@github-actions github-actions bot added this to the v4.3 milestone Oct 5, 2022
@joeyparrish
Member

That is working as intended (WAI) for a live stream, because a separate configuration option decides what to do in the event of a streaming failure. The default behavior for live is to try again for certain errors, since live streams are ever-changing and certain problems could resolve themselves. The default behavior for VOD is to stop, since the content won't suddenly update to fix the issue mid-presentation.

Apps can control this behavior, though, and make their own decisions. The default configuration is equivalent to this:

  player.configure('streaming.failureCallback', (error) => {
    const retryErrorCodes = [
      shaka.util.Error.Code.BAD_HTTP_STATUS,
      shaka.util.Error.Code.HTTP_ERROR,
      shaka.util.Error.Code.TIMEOUT,
    ];

    if (player.isLive() && retryErrorCodes.includes(error.code)) {
      error.severity = shaka.util.Error.Severity.RECOVERABLE;

      shaka.log.warning('Live streaming error.  Retrying automatically...');
      player.retryStreaming();
    }
  });

The app can examine the streaming error and decide whether or not to try again. To try again, just call player.retryStreaming().
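
As a minimal sketch of such an app-level decision, the callback below retries only a limited number of times. It assumes the failure actually reaches streaming.failureCallback. The names `makeBoundedRetryCallback`, `maxRetries`, and `reset` are hypothetical and not part of Shaka Player; only isLive() and retryStreaming() are taken from the snippet above. Unlike the default callback, this sketch does not change error.severity.

```javascript
// Sketch: build a failureCallback that gives up after `maxRetries`
// retries. All names here are hypothetical except the two player
// methods used in the thread: isLive() and retryStreaming().
function makeBoundedRetryCallback(player, retryErrorCodes, maxRetries) {
  let retries = 0;

  const callback = (error) => {
    const retryable =
        player.isLive() && retryErrorCodes.includes(error.code);
    if (!retryable || retries >= maxRetries) {
      // Not calling retryStreaming() means streaming is not resumed.
      return;
    }
    retries++;
    player.retryStreaming();
  };

  // The app may call this once playback is healthy again, so that a
  // later failure gets a fresh retry budget.
  callback.reset = () => {
    retries = 0;
  };
  return callback;
}

// Usage with a real player (assumes the shaka global is loaded):
// const codes = shaka.util.Error.Code;
// player.configure('streaming.failureCallback', makeBoundedRetryCallback(
//     player, [codes.BAD_HTTP_STATUS, codes.HTTP_ERROR, codes.TIMEOUT], 3));
```

The player and the error codes are passed in as arguments so the decision logic can be exercised with a stub player in tests.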

Does this help?

@joeyparrish joeyparrish added the status: working as intended The behavior is intended; this is not a bug label Oct 5, 2022
@joeyparrish joeyparrish removed this from the v4.3 milestone Oct 5, 2022
@joeyparrish joeyparrish added the status: waiting on response Waiting on a response from the reporter(s) of the issue label Oct 5, 2022
@github-actions github-actions bot added this to the v4.3 milestone Oct 5, 2022
@shakso
Author

shakso commented Oct 7, 2022

Thanks for the snippet, Joey.

However, when using the following code with the 404ing manifest:

const manifestUri = "https://s3-eu-west-1.amazonaws.com/test-mpds/media-404/test.mpd";

// Must be async: the body awaits player.load().
async function initPlayer() {

  const video = document.getElementById("video");
  const player = new shaka.Player(video);

  player.configure('streaming.failureCallback', (error) => {
    console.log('Streaming failure');
  });

  await player.load(manifestUri);
}

document.addEventListener("DOMContentLoaded", initPlayer);

I am not seeing the error handler's console text, just the 404 errors from http_fetch_plugin.js.

@github-actions github-actions bot removed the status: waiting on response Waiting on a response from the reporter(s) of the issue label Oct 7, 2022
@joeyparrish
Member

I was unaware that there's a special handler for 404 errors that excludes them from being considered a "streaming failure". I'll go back through the commit history and figure out why we do that.

@joeyparrish
Member

joeyparrish commented Oct 31, 2022

The current behavior:

  1. Request segment, get back 404
  2. Try a second time, get back 404
  3. StreamingEngine treats 404s as ignorable, continues

If we "fix" this special case for 404, the behavior would become:

  1. Request segment, get back 404
  2. Try a second time, get back 404
  3. StreamingEngine treats 404s as a streaming error, invokes error callback
  4. Default callback for live stream says "keep going", StreamingEngine continues

If you then configure the error callback to stop on streaming errors, you would get:

  1. Request segment, get back 404
  2. Try a second time, get back 404
  3. StreamingEngine treats 404s as a streaming error, invokes error callback
  4. Custom error callback says "stop", StreamingEngine stops, and the entire playback fails
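
The three flows above can be written out as a toy model. This only mirrors the description in this comment and is not StreamingEngine's real code; `simulateSegment404`, `special404Case`, and the 'retry'/'stop' return values are hypothetical names chosen for illustration.

```javascript
// Toy model of the three flows listed above (hypothetical names, not
// Shaka Player code). Returns a trace of what happens to one segment.
function simulateSegment404({maxAttempts, special404Case, failureCallback}) {
  const trace = [];
  // Every attempt allowed by the retry parameters gets a 404.
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    trace.push(`request ${attempt}: 404`);
  }
  if (special404Case) {
    // Current behavior: the 404 never reaches the callback.
    trace.push('404 ignored, streaming continues');
    return trace;
  }
  trace.push('streaming error, failureCallback invoked');
  const decision = failureCallback();  // 'retry' or 'stop'
  trace.push(decision === 'retry' ? 'streaming continues' : 'playback fails');
  return trace;
}

// 1. Current behavior.
console.log(simulateSegment404({maxAttempts: 2, special404Case: true}));
// 2. Special case removed, default live callback keeps going.
console.log(simulateSegment404({
  maxAttempts: 2, special404Case: false, failureCallback: () => 'retry',
}));
// 3. Special case removed, custom callback stops.
console.log(simulateSegment404({
  maxAttempts: 2, special404Case: false, failureCallback: () => 'stop',
}));
```

In the first two flows streaming continues and the segment is requested again later, so only the third flow stops the stream of requests to the server.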

@shakso, is this what you want? If so, why?

@joeyparrish
Member

Or is the real issue the 404s showing up in monitoring? If so, is that client-side or server-side monitoring surfacing the issue?

@shakso
Author

shakso commented Oct 31, 2022

Correct, @joeyparrish. I believe this is server-side monitoring.

The problem is that with smart speakers, there is no way to tell from inspecting the device that this is happening, and it can continue indefinitely until the user selects another stream. This will obviously cause a lot of 404s to be reported on the server side.

@joeyparrish
Member

Ah, I see. So a playback failure on these live streams is preferable, then?

@shakso
Author

shakso commented Oct 31, 2022

I wouldn't want to make that assumption for all use cases here. It would be best if this were a flag (retryParameters -> liveSegmentMaxAttempts?), defaulting to the current behaviour (infinite) but configurable otherwise.

@joeyparrish joeyparrish added priority: P2 Smaller impact or easy workaround and removed status: working as intended The behavior is intended; this is not a bug labels Oct 31, 2022
@joeyparrish joeyparrish self-assigned this Oct 31, 2022
@joeyparrish
Member

Just discussed with the team, and we have a plan.

The 404 handling sits outside the failureCallback system, which prevents applications from influencing how 404s are handled. failureCallback was always meant to give applications control over error-handling behavior.

We'll move the special case for 404 into the default callback for streaming.failureCallback. This will give you the ability to make all decisions at an application level. If you want to treat all failures as fatal, you can do that. If you want to make a more complex decision, the app can examine the error object and implement whatever business logic it wants.
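
Once 404s are routed through streaming.failureCallback, an application-level policy might look like the sketch below: stop on a 404, keep retrying other live network failures. It assumes that for BAD_HTTP_STATUS errors the HTTP status is available at error.data[1]; please check that against the Shaka Player error documentation for your version. `makeStatusAwareCallback` is a hypothetical name.

```javascript
// Sketch: treat HTTP 404 as fatal, retry other network failures on live.
// ASSUMPTION: error.data[1] holds the HTTP status for BAD_HTTP_STATUS.
// `codes` is expected to expose BAD_HTTP_STATUS, HTTP_ERROR and TIMEOUT,
// for example shaka.util.Error.Code.
function makeStatusAwareCallback(player, codes) {
  const networkCodes = [
    codes.BAD_HTTP_STATUS,
    codes.HTTP_ERROR,
    codes.TIMEOUT,
  ];

  return (error) => {
    if (!player.isLive() || !networkCodes.includes(error.code)) {
      return;  // Not a case this policy retries.
    }
    const status =
        error.code === codes.BAD_HTTP_STATUS ? error.data[1] : null;
    if (status === 404) {
      return;  // Missing segment: do not retry, let playback fail.
    }
    player.retryStreaming();
  };
}

// Usage with a real player (assumes the shaka global is loaded):
// player.configure('streaming.failureCallback',
//     makeStatusAwareCallback(player, shaka.util.Error.Code));
```

For the smart speaker case described earlier, this stops the repeated 404 requests at the cost of ending playback on the first missing segment.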

joeyparrish added a commit to joeyparrish/shaka-player that referenced this issue Oct 31, 2022
In general, streaming.failureCallback is meant to give applications
control over error handling at the level of streaming.  However, there
was a special case for HTTP 404s built into StreamingEngine in a way
that applications could not override.  This was in spite of the fact
that the default failureCallback would already check for and retry on
the error code BAD_HTTP_STATUS.

This removes the special case in StreamingEngine and refactors
failureCallback and retryStreaming to preserve the special delay
imposed in the old 404 handler.  With this, applications can override
failureCallback to have complete control over 404 handling.

Closes shaka-project#4548
joeyparrish added a commit that referenced this issue Nov 1, 2022
joeyparrish added a commit that referenced this issue Nov 8, 2022
joeyparrish added a commit that referenced this issue Nov 8, 2022
joeyparrish added a commit that referenced this issue Nov 8, 2022
joeyparrish added a commit that referenced this issue Nov 8, 2022
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Dec 31, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 31, 2022