Media Source Extensions #6

Open
chrisn opened this issue Jun 22, 2018 · 5 comments
Comments

@chrisn
Member

chrisn commented Jun 22, 2018

Work on updating Media Source Extensions has started in the Web Platform Incubator Community Group (WICG). The repository is here.

The feature addition currently in development is support for codec and container switching. Please see the Intent to Implement and the explainer for details.

This feature adds support for improved cross-codec or cross-bytestream transitions in Chrome HTML5 Media Source Extensions playback using a new changeType method on SourceBuffer that allows the type (bytestream and codec(s)) of media bytes subsequently appended to the SourceBuffer to be changed. We are incubating this idea via the WICG, with the goal of eventually working with WebPlatformWG to get the result of WICG incubation as part of the next version of the Media Source Extensions API (MSE).
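As a rough illustration of how an app might use the changeType method described above, here is a minimal sketch. The helper name `trySwitchType` and the `TypeSwitchableBuffer` and `TypeSupportCheck` types are hypothetical and exist only for this example; `changeType()` and the `updating` attribute are the SourceBuffer members the sketch relies on, and the support check is assumed to wrap `MediaSource.isTypeSupported()`.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
// These names are illustrative only, not part of any specification.
interface TypeSwitchableBuffer {
  updating: boolean;
  // Optional because older implementations may not expose changeType().
  changeType?: (type: string) => void;
}

// Assumed to wrap MediaSource.isTypeSupported() in a real page.
type TypeSupportCheck = (type: string) => boolean;

/**
 * Tries to switch the buffer to a new bytestream/codec type.
 * Returns true if changeType() was called, false if the caller should
 * fall back (for example, by setting up a new MediaSource).
 */
function trySwitchType(
  buffer: TypeSwitchableBuffer,
  isTypeSupported: TypeSupportCheck,
  newType: string,
): boolean {
  // Feature-detect: the method may be missing entirely.
  if (typeof buffer.changeType !== "function") return false;
  // Avoid a NotSupportedError for types the implementation cannot play.
  if (!isTypeSupported(newType)) return false;
  // changeType() throws while an append or remove is still in progress.
  if (buffer.updating) return false;
  buffer.changeType(newType);
  return true;
}
```

In a page this might be called as `trySwitchType(sourceBuffer, (t) => MediaSource.isTypeSupported(t), 'video/webm; codecs="vp9"')` before appending the first segment of the new type. A real player would more likely wait for `updateend` and retry rather than give up when the buffer is still updating.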

The ReadMe adds further guidance:

All specification updates that are part of incubation will be done on a per-feature branch off of gh-pages in this repository. Please base your pull-requests accordingly.

We will use the upstream w3c/media-source repository's issue tracker for feature discussion beyond pull-requests.

I encourage Interest Group members to review the codec switching changes, and also to suggest other new features as needed.

I suggest that we also review the Interest Group's own requirements for Media Source Extensions and coordinate submitting these to the MSE issue tracker. This includes previous submissions from CTA WAVE and Cloud Browser Task Force.

@chrisn
Member Author

chrisn commented Aug 15, 2018

A TAG review has now been requested for the codec and container switching addition to MSE.

@chrisn
Member Author

chrisn commented Oct 19, 2018

During the IG call on 4 Jan 2018 (Minutes, Summary), @squapp from Fraunhofer Fokus presented their 360-degree video playout system, which identified some issues with Media Source Extensions:

  • The need for a low latency buffering capability
  • Reliability of replacing segments using the 'segment' AppendMode in a single SourceBuffer
  • Switching between multiple SourceBuffers attached to a single MediaSource

Now that WICG is incubating new features for MSE, TPAC is an opportunity to review these requirements and propose work on API changes.

@chrisn
Member Author

chrisn commented Oct 19, 2018

The IG call on 2 Oct 2018 about WebRTC also raised some ideas for low latency streaming using MSE.

Peter mentioned possible changes to MSE to provide controls over media playback behaviour for low latency streaming. The Interest Group could further analyse these ideas and advance them as part of the current work on new feature incubation for MSE.

  • Per-frame injection: e.g., SourceBuffer.appendFrame(frame)
  • Buffer delay controls: e.g., SourceBuffer.setLiveness(liveness) or SourceBuffer.setDelays(min, max)
  • Controls for interpolation: e.g., HTMLMediaElement.setInterpolationBehavior

@wolenetz
Member

wolenetz commented Oct 24, 2018

Please see also the FOMS 2018 notes on MSE discussions for vNext feature incubations. Many of the discussions focused on "low-latency" / "live" / "buffer management" / "GC interop-improvements":

Session 1: http://www.foms-workshop.org/foms2018/pmwiki.php/Site/MSE
Session 2: http://www.foms-workshop.org/foms2018/pmwiki.php/Site/LL-MSE
My high-level summary:
Media Source Extensions (MSE) (quite a lot was covered; main points):

  • Status update

    • MSE vNext:
      • WICG setup for incubations, tracking issues on main MSE github
      • changeType: first vNext incubation (Chrome, Firefox, Safari), YT using it for seamless AV1<->VP9 adaptations
    • REC MSE updates: upcoming deprecations
      • multiple tracks in a SourceBuffer set with ‘sequence’ AppendMode
      • possibly createObjectURL (though that’s unlikely to happen since it would be a breaking change)
    • MediaError.message was still unknown to some
  • Upcoming/in-progress work on MSE vNext:

    • MSE-in-workers, including mechanism for improving MSE setup latency (“sourceopen”)
  • Open ended discussion focused heavily on MSE vNext improvements around:

    • “Live” “Low latency” “Gap-skipping” “buffer/GC mgmt” related features
      • These dovetail with the media element latencyHint/jitter buffer control API, but participants didn’t want them tied directly to it (we're looking to specify something like that soon instead of relying on implementation-specific liveness parsing heuristics)
      • SourceBuffer API for specifying a GC mode (e.g. aggressive GC) can enable "infinite" GOP playbacks, roughly:
        • "default" = the currently implemented, implementation-specific heuristic
        • "aggressive" = allow GC/eviction to occur throughout playback (not just at appendBuffer() or remove() synchronous points) and allow eviction of anything prior to current demuxer/playback head (including current GOP's keyframe). Apps would be responsible for seeking responsibly (for example, being aware that seek-to-current-time in this mode might stall). This enables "infinite GOP" scenarios and associated reduced network bandwidth jank in streaming media containing just 1 keyframe followed by nonkeyframes.
        • “LL-CMAF?” = third kind of MSE GC mode that preemptively evicts while playing (not just at appendBuffer() or remove() times), but doesn't evict anything from the start of the current GOP forwards. Such a mode would support the CMAF-low-latency type players which have frequent keyframes, and help reduce memory pressure from already-played media (and also allow the element's decoders to be suspended in scenarios like background-tab/etc unlike the more "aggressive" mode I mentioned, above). Apple/Safari promoted this mode in particular.
      • A way to specify a cap on SourceBuffer resources (time or bytes TBD)
      • SourceBuffer API (tbd) for specifying gap fudge factor (e.g. all gaps < 100ms must coalesce and not be a gap reflected via SourceBuffer.buffered) can reduce interop issues.
      • MediaSource (or perhaps HTMLMediaElement) API (tbd) for modifying playback behavior across surviving gaps:
        • Default: v1 MSE behavior: stall at buffered range intersection gaps. Reflect those gaps in media element buffered ranges.
        • Play-through: play silence for missing audio and show most current video frame. Once client media clock reaches next audio (or video) frame, play that audio (or video). Essentially, try to preserve the media timeline (no auto-seek). If app seeks to a position in such a gap, allow it, and play silence/(no?) video frame until clock reaches end of gap. Reflect no gap in media element buffered ranges.
        • Fancy-seek: Hide gaps. If no audio and no video are available for current media time, seek to earliest of next available buffered audio or video and resume playback from there. Open question: give app some event indicating this "fancy-seek" has occurred? Reflect no gap in media element buffered ranges (though they should be apparent in the intersection of activeSourceBuffers' buffered ranges). Seek to such a partial or both A/V gap should be allowed and not stall (and then auto-fancy-seek from there if the target had no A/V?)
        • Don't engage these modes automatically from the media element "low latency" hint API (tbd) because apps might want to not involve a jitter buffer, but still retain v1 MSE buffering behavior.
        • mediaElement.buffered should show what is expected w.r.t. playback stalls. sourceBuffer.buffered should show what is actually buffered (not hiding gaps)
    • Better debugging/information:
      • “What is the timestamp/resolution/codec/bitrate/profile/etc of what is playing right now?” (possibly via MediaPlaybackQuality? or MCAPI?)
      • “What is the earliest PTS of this GOP’s presentation interval?” (For apps to use to guide their own explicit version of something like the "LL-CMAF?" gc mode, above.)
      • “What media was actually removed by my call to SourceBuffer.remove()?”
      • “Can I haz appended media tags and a way to retrieve a timeline showing where those tags are?”
      • “Can I haz promises with MSE?”
      • Chrome: hot-link from devtools to specific media-internals player log
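To make the "gap fudge factor" idea in the summary above more concrete, here is a minimal sketch of the coalescing an app could do on its own side today. It is not a proposed API: `coalesceGaps`, `toRangeList`, `BufferedRange`, and `TimeRangesLike` are hypothetical names for this example, and the 100 ms threshold is just the figure quoted in the notes.

```typescript
// Plain-object view of one buffered range, in seconds.
type BufferedRange = { start: number; end: number };

// Structural stand-in for the DOM TimeRanges interface
// (length, start(i), end(i)), so the sketch has no DOM dependency.
interface TimeRangesLike {
  readonly length: number;
  start(index: number): number;
  end(index: number): number;
}

// Copies a TimeRanges-like object into a plain array.
function toRangeList(tr: TimeRangesLike): BufferedRange[] {
  const out: BufferedRange[] = [];
  for (let i = 0; i < tr.length; i++) {
    out.push({ start: tr.start(i), end: tr.end(i) });
  }
  return out;
}

/**
 * Merges neighbouring ranges whose gap is strictly smaller than
 * fudgeSeconds. The input array and its objects are not modified.
 */
function coalesceGaps(
  ranges: BufferedRange[],
  fudgeSeconds: number,
): BufferedRange[] {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const merged: BufferedRange[] = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && r.start - last.end < fudgeSeconds) {
      // Small gap (or overlap): extend the previous range.
      last.end = Math.max(last.end, r.end);
    } else {
      merged.push({ start: r.start, end: r.end });
    }
  }
  return merged;
}
```

An app could call `coalesceGaps(toRangeList(sourceBuffer.buffered), 0.1)` to get a view of the buffer in which gaps under 100 ms are hidden. This only changes what the app sees; it does not change whether playback stalls at such a gap, which is what the proposed SourceBuffer API would address.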

Many of these already have associated MSE github issues. Work will be ramping up on specifying solutions to these.
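For readers who want a feel for the "fancy-seek" behaviour summarised above, the following sketch shows one way an app might compute the seek target itself until something is specified. It is an app-level approximation under stated assumptions, not the proposed API: `fancySeekTarget`, `isBuffered`, `nextStart`, and `MediaRange` are hypothetical names, and the audio and video ranges are assumed to come from each SourceBuffer's `buffered` attribute.

```typescript
// One buffered range in seconds; end is treated as exclusive.
type MediaRange = { start: number; end: number };

// True if time t falls inside any of the given ranges.
function isBuffered(ranges: MediaRange[], t: number): boolean {
  return ranges.some((r) => r.start <= t && t < r.end);
}

// Earliest range start strictly after t, or null if there is none.
function nextStart(ranges: MediaRange[], t: number): number | null {
  let best: number | null = null;
  for (const r of ranges) {
    if (r.start > t && (best === null || r.start < best)) {
      best = r.start;
    }
  }
  return best;
}

/**
 * Returns the time to seek to when neither audio nor video is buffered
 * at currentTime, or null when no seek is needed or possible.
 */
function fancySeekTarget(
  audio: MediaRange[],
  video: MediaRange[],
  currentTime: number,
): number | null {
  // If either track has media here, playback can continue without a seek.
  if (isBuffered(audio, currentTime) || isBuffered(video, currentTime)) {
    return null;
  }
  const a = nextStart(audio, currentTime);
  const v = nextStart(video, currentTime);
  if (a === null) return v;
  if (v === null) return a;
  // Seek to the earliest of the next available audio or video.
  return Math.min(a, v);
}
```

A player could run this from a `waiting` event handler and assign a non-null result to `video.currentTime`. Whether a partial gap (audio only or video only) should also trigger a seek is one of the open questions in the notes, so this sketch only handles the case where both are missing.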

p.s. I wish I had been able to make it to TPAC this year to meet f2f with folks there, too, regarding MSE vNext feature ideas, proposals, and incubations. Please file issues or follow-up on existing ones in the main MSE github (https://github.com/w3c/media-source/issues) to help us gain traction on reaching ergonomic APIs that improve usability and interoperability of MSE.

@chrisn
Member Author

chrisn commented Oct 26, 2018

@wolenetz Many thanks for sharing this! I note that similar ideas were mentioned in a recent IG call on low latency streaming.
