<pre class='metadata'>
Title: Media Capabilities
Repository: w3c/media-capabilities
Status: ED
ED: https://w3c.github.io/media-capabilities/
TR: https://www.w3.org/TR/media-capabilities/
Shortname: media-capabilities
Level: None
Group: mediawg
Editor: Jean-Yves Avenard, w3cid 115886, Apple Inc. https://www.apple.com/
Editor: Mark Foltz, w3cid 68454, Google Inc. https://www.google.com/, https://github.com/markafoltz

Former Editor: Will Cassella, w3cid 139598, Google Inc. https://www.google.com/
Former Editor: Mounir Lamouri, w3cid 45389, Google Inc. https://www.google.com/
Former Editor: Chris Cunningham, w3cid 114832, Google Inc. https://www.google.com/
Former Editor: Vi Nguyen, w3cid 116349, Microsoft Corporation https://www.microsoft.com/

Abstract: This specification intends to provide APIs to allow websites to make
Abstract: an optimal decision when picking media content for the user. The APIs
Abstract: will expose information about the decoding and encoding capabilities
Abstract: for a given format but also output capabilities to find the best match
Abstract: based on the device's display.

!Participate: <a href='https://github.com/w3c/media-capabilities'>Git Repository.</a>
!Participate: <a href='https://github.com/w3c/media-capabilities/issues/new'>File an issue.</a>
!Version History: <a href='https://github.com/w3c/media-capabilities/commits'>https://github.com/w3c/media-capabilities/commits</a>
</pre>

<pre class='link-defaults'>
spec:html; type:dfn; for:realm; text:global object
</pre>

<pre class='anchors'>
spec: media-source; urlPrefix: https://www.w3.org/TR/media-source/
    type: interface
        for: MediaSource; text: MediaSource; url: #mediasource
    type: method
        for: MediaSource; text: isTypeSupported(); url: #dom-mediasource-istypesupported

spec: mediastream-recording; urlPrefix: https://www.w3.org/TR/mediastream-recording/#
    type:interface
        text: MediaRecorder; url: mediarecorder

spec: mimesniff; urlPrefix: https://mimesniff.spec.whatwg.org/#
    type: dfn; text: valid mime type; url: valid-mime-type
</pre>

<pre class='biblio'>
{
    "SMPTE-ST-2084": {
        "href": "https://ieeexplore.ieee.org/document/7291452",
        "title": "High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays",
        "publisher": "SMPTE",
        "date": "2014",
        "id": "SMPTE-ST-2084"
    },
    "SMPTE-ST-2086": {
        "href": "https://ieeexplore.ieee.org/document/7291707",
        "title": "Mastering Display Color Volume Metadata Supporting High Luminance and Wide Color Gamut Images",
        "publisher": "SMPTE",
        "date": "2014",
        "id": "SMPTE-ST-2086"
    },
    "SMPTE-ST-2094": {
        "href": "https://ieeexplore.ieee.org/document/7513361",
        "title": "Dynamic Metadata for Color Volume Transform Core Components",
        "publisher": "SMPTE",
        "date": "2016",
        "id": "SMPTE-ST-2094"
    },
    "ENCRYPTED-MEDIA-DRAFT": {
        "href": "https://w3c.github.io/encrypted-media",
        "title": "Encrypted Media Extensions",
        "publisher": "W3C",
        "date": "13 December 2019"
    }
}
</pre>

<section class='non-normative'>
  <h2 id='introduction'>Introduction</h2>
  <em>This section is non-normative</em>

  <p>
    This specification defines an API to query the user agent with regard
    to its audio and video decoding and encoding capabilities,
    based on information such as the codecs, profile, resolution, bitrates,
    etc., of the media. The API indicates if the configuration is supported
    and whether the playback is expected to be smooth and/or power efficient.
  </p>
  <p>
    This specification focuses on encoding and decoding capabilities.
    It is expected to be used with other web APIs that provide information about
    the display properties, such as supported color gamut or dynamic range capabilities,
    which enable web applications to pick the right content for the display and to,
    for example, avoid providing HDR content to an SDR display.
  </p>
</section>

<section>
  <h2 id='decoding-encoding-capabilities'>Decoding and Encoding Capabilities</h2>

  <section>
    <h3 id='media-configurations'>Media Configurations</h3>

    <section>
      <h4 id='mediaconfiguration'>MediaConfiguration</h4>

      <xmp class='idl'>
        dictionary MediaConfiguration {
          VideoConfiguration video;
          AudioConfiguration audio;
        };
      </xmp>

      <xmp class='idl'>
        dictionary MediaDecodingConfiguration : MediaConfiguration {
          required MediaDecodingType type;
          MediaCapabilitiesKeySystemConfiguration keySystemConfiguration;
        };
      </xmp>

      <xmp class='idl'>
        dictionary MediaEncodingConfiguration : MediaConfiguration {
          required MediaEncodingType type;
        };
      </xmp>

      <p>
        The input to the decoding capabilities is represented by a
        {{MediaDecodingConfiguration}} dictionary and the input to the encoding
        capabilities by a {{MediaEncodingConfiguration}} dictionary.
      </p>
      <p>
        For a {{MediaConfiguration}} to be a <dfn>valid
        MediaConfiguration</dfn>, all of the following conditions MUST be true:
        <ol>
          <li>
            <code>audio</code> and/or <code>video</code> MUST [=map/exist=].
          </li>
          <li>
            <code>audio</code> MUST be a <a>valid audio configuration</a> if
            it [=map/exists=].
          </li>
          <li>
            <code>video</code> MUST be a <a>valid video configuration</a> if
            it [=map/exists=].
          </li>
        </ol>
      </p>
      <p>
        For a {{MediaDecodingConfiguration}} to be a <dfn>valid
        MediaDecodingConfiguration</dfn>, all of the following conditions MUST
        be true:
        <ol>
          <li>
            It MUST be a <a>valid MediaConfiguration</a>.
          </li>
          <li>
            If <code>keySystemConfiguration</code> [=map/exists=]:
            <ol>
              <li>
                The <code>type</code> MUST be {{media-source}} or {{file}}.
              </li>
              <li>
                If <code>keySystemConfiguration.audio</code> [=map/exists=],
                <code>audio</code> MUST also [=map/exist=].
              </li>
              <li>
                If <code>keySystemConfiguration.video</code> [=map/exists=],
                <code>video</code> MUST also [=map/exist=].
              </li>
            </ol>
          </li>
        </ol>
      </p>
      <p>
        For a {{MediaDecodingConfiguration}} to describe [[!ENCRYPTED-MEDIA]], a
        {{keySystemConfiguration}} MUST [=map/exist=].
      </p>
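      <div class='example'>
        <p>
          The structural conditions above can be sketched in script as
          follows. This is an illustrative sketch only, not part of the API:
          the function name <code>hasValidDecodingShape</code> is hypothetical,
          and the per-track checks (<a>valid audio configuration</a> and
          <a>valid video configuration</a>) are deliberately left out.
        </p>

```javascript
// Hypothetical helper, not part of the API: checks only the structural
// conditions of a valid MediaDecodingConfiguration. Per-track validity
// (valid audio/video configuration) is intentionally omitted.
function hasValidDecodingShape(configuration) {
  const { audio, video, type, keySystemConfiguration } = configuration;

  // audio and/or video must exist.
  if (audio === undefined && video === undefined) return false;

  if (keySystemConfiguration !== undefined) {
    // Encrypted queries are limited to "media-source" and "file".
    if (type !== 'media-source' && type !== 'file') return false;
    // A key system track configuration needs its matching media track.
    if (keySystemConfiguration.audio !== undefined && audio === undefined) return false;
    if (keySystemConfiguration.video !== undefined && video === undefined) return false;
  }
  return true;
}
```

        <p>
          For instance, a configuration with <code>type</code> set to
          <code>"webrtc"</code> and a <code>keySystemConfiguration</code>
          would be rejected by this sketch, as would a configuration that
          carries neither <code>audio</code> nor <code>video</code>.
        </p>
      </div>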
    </section>

    <section>
      <h4 id='mediadecodingtype'>MediaDecodingType</h4>

      <xmp class='idl'>
        enum MediaDecodingType {
          "file",
          "media-source",
          "webrtc"
        };
      </xmp>

      <p>
        A {{MediaDecodingConfiguration}} can have one of three types:
        <ul>
          <li><dfn for='MediaDecodingType' enum-value>file</dfn> is used to
          represent a configuration that is meant to be used for playback of
          media sources other than {{MediaSource/MediaSource}} as defined in
          [[media-source]] and {{RTCPeerConnection}} as defined in [[webrtc]].</li>
          <li><dfn for='MediaDecodingType' enum-value>media-source</dfn> is used
          to represent a configuration that is meant to be used for playback of
          a {{MediaSource/MediaSource}}. </li>
          <li><dfn for='MediaDecodingType' enum-value>webrtc</dfn> is used to
          represent a configuration that is meant to be received using
          {{RTCPeerConnection}}.</li>
        </ul>
      </p>
    </section>

    <section>
      <h4 id='mediaencodingtype'>MediaEncodingType</h4>

      <xmp class='idl'>
        enum MediaEncodingType {
          "record",
          "webrtc"
        };
      </xmp>

      <p>
        A {{MediaEncodingConfiguration}} can have one of two types:
        <ul>
          <li><dfn for='MediaEncodingType' enum-value>record</dfn> is used to
          represent a configuration for recording of media,
          <span class="informative">e.g., using {{MediaRecorder}} as defined in
          [[mediastream-recording]]</span>.</li>
          <li><dfn for='MediaEncodingType' enum-value>webrtc</dfn> is used to
          represent a configuration that is meant to be transmitted using
          {{RTCPeerConnection}} as defined in [[webrtc]].</li>
        </ul>
      </p>
    </section>

    <section>
      <h4 id='mime-type'>MIME types</h4>

      <p>
        In the context of this specification, a MIME type is also called content
        type. A <dfn>valid media MIME type</dfn> is a string that is a <a>valid
        MIME type</a> per [[mimesniff]].
      </p>

      <p>
        Please note that the definition of MIME subtypes and parameters is
        context dependent. For {{file}}, {{media-source}}, and {{record}}, the
        MIME types are specified as defined for [[#http|HTTP]], whereas for
        {{MediaDecodingType/webrtc}} the MIME types are specified as
        defined for [[#rtp|RTP]].
      </p>

      <p>
        A <dfn>valid audio MIME type</dfn> is a string that is a <a>valid media
        MIME type</a> and for which the <code>type</code> per [[RFC9110]] is
        either <code>audio</code> or <code>application</code>.
      </p>

      <p>
        A <dfn>valid video MIME type</dfn> is a string that is a <a>valid media
        MIME type</a> and for which the <code>type</code> per [[RFC9110]] is
        either <code>video</code> or <code>application</code>.
      </p>

      <section>
        <h5 id='http'>HTTP</h5>

        <p>
          If the MIME type does not imply a codec, the string MUST also have one
          and only one parameter that is named <code>codecs</code> with a value
          describing a single media codec. Otherwise, it MUST contain no
          parameters.
        </p>
      </section>

      <section>
        <h5 id='rtp'>RTP</h5>

        <p>
          The MIME types used with RTP are defined in the specifications of the
          corresponding RTP payload formats [[RFC4855]] [[RFC6838]]. The codec
          name is typically specified as subtype and zero or more parameters
          may be present depending on the codec.
        </p>
      </section>
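
      <div class='example'>
        <p>
          The following sketch illustrates the two styles of content type
          string and the top-level type check used by the <a>valid audio MIME
          type</a> and <a>valid video MIME type</a> definitions. The helper
          names are hypothetical, and the regular expression is a simplified
          stand-in that assumes a plain <code>type/subtype</code> string with
          optional parameters; it is not the [[mimesniff]] parsing algorithm.
        </p>

```javascript
// Hypothetical helper: extracts the top-level type of a content type string.
// Simplified on purpose: it only recognizes "type/subtype[; parameters]" and
// does not implement the MIME Sniffing parser.
function topLevelType(contentType) {
  const match = /^\s*([^\s\/;]+)\/[^\s\/;]+\s*(;.*)?$/.exec(contentType);
  return match ? match[1].toLowerCase() : null;
}

// The type of a video content type is "video" or "application".
function looksLikeVideoMimeType(contentType) {
  const type = topLevelType(contentType);
  return type === 'video' || type === 'application';
}

// The type of an audio content type is "audio" or "application".
function looksLikeAudioMimeType(contentType) {
  const type = topLevelType(contentType);
  return type === 'audio' || type === 'application';
}

// HTTP style (file, media-source, record): a single codecs parameter.
const httpVideo = 'video/webm; codecs="vp09.00.10.08"';
const httpAudio = 'audio/mp4; codecs="mp4a.40.2"';

// RTP style (webrtc): the codec name is typically the subtype.
const rtpVideo = 'video/VP9';
const rtpAudio = 'audio/opus';
```
      </div>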
    </section>

    <section>
      <h4 id='videoconfiguration'>VideoConfiguration</h4>

      <xmp class='idl'>
        dictionary VideoConfiguration {
          required DOMString contentType;
          required unsigned long width;
          required unsigned long height;
          required unsigned long long bitrate;
          required double framerate;
          boolean hasAlphaChannel;
          HdrMetadataType hdrMetadataType;
          ColorGamut colorGamut;
          TransferFunction transferFunction;
          DOMString scalabilityMode;
          boolean spatialScalability;
        };
      </xmp>

      <p>
        The <dfn for='VideoConfiguration' dict-member>contentType</dfn> member
        represents the MIME type of the video track.
      </p>

      <p>
        To check if a {{VideoConfiguration}} <var>configuration</var> is a
        <dfn>valid video configuration</dfn>, the following steps MUST be run:
        <ol>
          <li>
            If <var>configuration</var>'s {{VideoConfiguration/contentType}} is
            not a <a>valid video MIME type</a>, return <code>false</code> and
            abort these steps.
          </li>
          <li>
            If {{VideoConfiguration/framerate}} is not finite or is not greater
            than 0, return <code>false</code> and abort these steps.
          </li>
          <li>
            If an optional member is specified for a {{MediaDecodingType}} or
            {{MediaEncodingType}} to which it is not applicable, return
            <code>false</code> and abort these steps. See applicability rules
            in the member definitions below.
          </li>
          <li>
            Return <code>true</code>.
          </li>
        </ol>
      </p>

      <p>
        The <dfn for='VideoConfiguration' dict-member>width</dfn> and
        <dfn for='VideoConfiguration' dict-member>height</dfn> members represent
        respectively the visible horizontal and vertical encoded pixels in the
        encoded video frames.
      </p>

      <p>
        The <dfn for='VideoConfiguration' dict-member>bitrate</dfn> member
        represents the average bitrate of the video track given in units of bits
        per second. In the case of a video stream encoded at a constant bit rate
        (CBR), this value should be accurate over a short-term window. In the
        case of variable bit rate (VBR) encoding, this value should be usable to
        allocate any necessary buffering and throughput capability to
        provide for the uninterrupted decoding of the video stream over the
        long term based on the indicated {{VideoConfiguration/contentType}}.
      </p>

      <p>
        The <dfn for='VideoConfiguration' dict-member>framerate</dfn> member
        represents the framerate of the video track. The framerate is the number
        of frames used in one second (frames per second). It is represented as a
        double.
      </p>

      <p>
        The <dfn for='VideoConfiguration' dict-member>hasAlphaChannel</dfn> member
        represents whether the video track contains alpha channel information. If
        true, the encoded video stream can produce per-pixel alpha channel information
        when decoded. If false, the video stream cannot produce per-pixel alpha channel
        information when decoded. If undefined, the UA should determine whether the
        video stream encodes alpha channel information based on the indicated
        {{VideoConfiguration/contentType}}, if possible. Otherwise, the UA should
        presume that the video stream cannot produce alpha channel information.
      </p>

      <p>
        If present, the <dfn for='VideoConfiguration' dict-member>hdrMetadataType</dfn>
        member represents that the video track includes the specified HDR
        metadata type, which the UA needs to be capable of interpreting for tone
        mapping the HDR content to a color volume and luminance of the output
        device. Valid inputs are defined by {{HdrMetadataType}}. hdrMetadataType is
        only applicable to {{MediaDecodingConfiguration}} for types {{media-source}}
        and {{file}}.
      </p>

      <p>
        If present, the <dfn for='VideoConfiguration' dict-member>colorGamut</dfn>
        member represents that the video track is delivered in the specified
        color gamut, which describes a set of colors in which the content is
        intended to be displayed. If the attached output device also supports
        the specified color, the UA needs to be able to cause the output device
        to render the appropriate color, or something close enough. If the
        attached output device does not support the specified color, the UA
        needs to be capable of mapping the specified color to a color supported
        by the output device. Valid inputs are defined by {{ColorGamut}}. colorGamut
        is only applicable to {{MediaDecodingConfiguration}} for types
        {{media-source}} and {{file}}.
      </p>

      <p>
        If present, the <dfn for='VideoConfiguration' dict-member>transferFunction</dfn>
        member represents that the video track requires the specified transfer
        function to be understood by the UA. The transfer function describes the
        electro-optical algorithm supported by the rendering capabilities of a
        user agent, independent of the display, to map the source colors in the
        decoded media into the colors to be displayed. Valid inputs are defined
        by {{TransferFunction}}. transferFunction is only applicable to
        {{MediaDecodingConfiguration}} for types {{media-source}} and {{file}}.
      </p>

      <p>
        If present, the <dfn for='VideoConfiguration' dict-member>scalabilityMode</dfn>
        member represents the scalability mode as defined in [[webrtc-svc]]. If
        absent, the implementer defined default mode for this
        {{VideoConfiguration/contentType}} is assumed (i.e., the mode you get if
        you don't specify one via {{RTCRtpSender/setParameters()}}).
        scalabilityMode is only applicable to {{MediaEncodingConfiguration}} for
        type {{MediaEncodingType/webrtc}}.
      </p>

      <p>
        If present, the <dfn for='VideoConfiguration' dict-member>spatialScalability</dfn>
        member represents the ability to do spatial prediction, that is,
        using frames of a resolution different than the current resolution as
        dependencies. If absent, spatialScalability will default to
        <code>false</code>. spatialScalability is closely coupled to
        {{VideoConfiguration/scalabilityMode}} in the sense that streams encoded
        with modes using spatial scalability (e.g. "L2T1") can only be decoded
        if spatialScalability is supported. spatialScalability is only
        applicable to {{MediaDecodingConfiguration}} for types {{media-source}},
        {{file}}, and {{MediaDecodingType/webrtc}}.
      </p>
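
      <div class='example'>
        <p>
          The <a>valid video configuration</a> steps, together with the
          applicability rules stated in the member definitions above, can be
          sketched as follows. This is a non-normative illustration: the
          function names, the <code>direction</code> parameter, and the
          <code>"direction:type"</code> key convention are assumptions made
          for this sketch, and the MIME check is a simplified stand-in for the
          <a>valid video MIME type</a> definition.
        </p>

```javascript
// Optional members and the queries they apply to, transcribed from the
// member definitions. The "<direction>:<type>" keys are a convention
// invented for this sketch.
const APPLICABILITY = {
  hdrMetadataType: ['decoding:media-source', 'decoding:file'],
  colorGamut: ['decoding:media-source', 'decoding:file'],
  transferFunction: ['decoding:media-source', 'decoding:file'],
  scalabilityMode: ['encoding:webrtc'],
  spatialScalability: ['decoding:media-source', 'decoding:file', 'decoding:webrtc'],
};

// Simplified stand-in for the "valid video MIME type" check (not a full parser).
function looksLikeVideoMimeType(contentType) {
  return /^\s*(video|application)\/[^\s\/;]+\s*(;.*)?$/i.test(contentType);
}

// direction is "decoding" or "encoding"; type is a MediaDecodingType or
// MediaEncodingType value.
function isValidVideoConfiguration(video, direction, type) {
  // Step 1: the content type must be a valid video MIME type.
  if (!looksLikeVideoMimeType(video.contentType)) return false;
  // Step 2: framerate must be finite and greater than 0.
  if (!Number.isFinite(video.framerate) || video.framerate <= 0) return false;
  // Step 3: optional members must be applicable to the queried type.
  for (const [member, allowed] of Object.entries(APPLICABILITY)) {
    if (video[member] !== undefined && !allowed.includes(`${direction}:${type}`)) {
      return false;
    }
  }
  // Step 4.
  return true;
}
```

        <p>
          Under these assumptions, a configuration carrying
          <code>hdrMetadataType</code> passes for a <code>"media-source"</code>
          decoding query but fails for a <code>"webrtc"</code> one, and
          <code>scalabilityMode</code> passes only for a <code>"webrtc"</code>
          encoding query.
        </p>
      </div>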
    </section>

    <section>
      <h4 id='hdrmetadatatype'>HdrMetadataType</h4>

      <p>
        <xmp class='idl'>
          enum HdrMetadataType {
            "smpteSt2086",
            "smpteSt2094-10",
            "smpteSt2094-40"
          };
        </xmp>

        <p>
          If present, {{HdrMetadataType}} describes the capability to interpret HDR metadata
          of the specified type.
        </p>

        <p>
          The {{VideoConfiguration}} may contain one of the following types:
          <ul>
            <li>
              <dfn for='HdrMetadataType' enum-value>smpteSt2086</dfn>,
              representing the static metadata type defined by
              [[!SMPTE-ST-2086]].
            </li>
            <li>
              <dfn for='HdrMetadataType' enum-value>smpteSt2094-10</dfn>,
              representing the dynamic metadata type defined by
              [[!SMPTE-ST-2094]].
            </li>
            <li>
              <dfn for='HdrMetadataType' enum-value>smpteSt2094-40</dfn>,
              representing the dynamic metadata type defined by
              [[!SMPTE-ST-2094]].
            </li>
          </ul>
        </p>
      </p>
    </section>

    <section>
      <h4 id='colorgamut'>ColorGamut</h4>

      <p>
        <xmp class='idl'>
          enum ColorGamut {
            "srgb",
            "p3",
            "rec2020"
          };
        </xmp>

        <p>
          The {{VideoConfiguration}} may contain one of the following types:
          <ul>
            <li>
              <dfn for='ColorGamut' enum-value>srgb</dfn>, representing the
              [[!sRGB]] color gamut.
            </li>
            <li>
              <dfn for='ColorGamut' enum-value>p3</dfn>, representing the DCI
              P3 Color Space color gamut. This color gamut includes the
              {{ColorGamut/srgb}} gamut.
            </li>
            <li>
              <dfn for='ColorGamut' enum-value>rec2020</dfn>, representing
              the ITU-R Recommendation BT.2020 color gamut. This color gamut
              includes the {{ColorGamut/p3}} gamut.
            </li>
          </ul>
        </p>
      </p>
    </section>

    <section>
      <h4 id='transferfunction'>TransferFunction</h4>

      <p>
        <xmp class='idl'>
          enum TransferFunction {
            "srgb",
            "pq",
            "hlg"
          };
        </xmp>

        <p>
          The {{VideoConfiguration}} may contain one of the following types:
          <ul>
            <li>
              <dfn for='TransferFunction' enum-value>srgb</dfn>, representing
              the transfer function defined by [[!sRGB]].
            </li>
            <li>
              <dfn for='TransferFunction' enum-value>pq</dfn>, representing the
              "Perceptual Quantizer" transfer function defined by
              [[!SMPTE-ST-2084]].
            </li>
            <li>
              <dfn for='TransferFunction' enum-value>hlg</dfn>, representing the
              "Hybrid Log Gamma" transfer function defined by BT.2100.
            </li>
          </ul>
        </p>
      </p>
    </section>

    <section>
      <h4 id='audioconfiguration'>AudioConfiguration</h4>

      <xmp class='idl'>
        dictionary AudioConfiguration {
          required DOMString contentType;
          DOMString channels;
          unsigned long long bitrate;
          unsigned long samplerate;
          boolean spatialRendering;
        };
      </xmp>

      <p>
        The <dfn for='AudioConfiguration' dict-member>contentType</dfn> member
        represents the MIME type of the audio track.
      </p>

      <p>
        To check if an {{AudioConfiguration}} <var>configuration</var> is a
        <dfn>valid audio configuration</dfn>, the following steps MUST be run:
        <ol>
          <li>
            If <var>configuration</var>'s {{AudioConfiguration/contentType}} is
            not a <a>valid audio MIME type</a>, return <code>false</code> and
            abort these steps.
          </li>
          <li>
            Return <code>true</code>.
          </li>
        </ol>
      </p>

      <p>
        The <dfn for='AudioConfiguration' dict-member>channels</dfn> member
        represents the audio channels used by the audio track. channels is only
        applicable to the decoding types {{media-source}}, {{file}}, and
        {{MediaDecodingType/webrtc}} and the encoding type
        {{MediaEncodingType/webrtc}}.
      </p>

      <p class='issue'>
        The {{AudioConfiguration/channels}} needs to be defined as a
        <code>double</code> (2.1, 4.1, 5.1, ...), an <code>unsigned short</code>
        (number of channels) or as an <code>enum</code> value. The current
        definition is a placeholder.
      </p>

      <p>
        The <dfn for='AudioConfiguration' dict-member>bitrate</dfn> member
        represents the average bitrate of the audio track. The bitrate
        is the number of bits used to encode a second of the audio track.
      </p>

      <p>
        The <dfn for='AudioConfiguration' dict-member>samplerate</dfn>
        member represents the sample rate of the audio track. The sample rate
        is the number of samples of audio carried per second. samplerate is only
        applicable to the decoding types {{media-source}}, {{file}}, and
        {{MediaDecodingType/webrtc}} and the encoding type
        {{MediaEncodingType/webrtc}}.
      </p>

      <p class='note'>
        The {{AudioConfiguration/samplerate}} is expressed in <code>Hz</code>
        (i.e., number of samples of audio per second). Sometimes sample rate
        values are expressed in <code>kHz</code>, which represents the number of
        thousands of samples of audio per second.<br>
        44100 <code>Hz</code> is equivalent to 44.1 <code>kHz</code>.
      </p>

      <p>
        The <dfn for='AudioConfiguration' dict-member>spatialRendering</dfn>
        member indicates that the audio SHOULD be rendered spatially. The
        details of spatial rendering SHOULD be inferred from the
        {{AudioConfiguration/contentType}}. If it does not [=map/exist=], the UA
        MUST presume spatial rendering is not required. When <code>true</code>,
        the user agent SHOULD only report this configuration as
        {{MediaCapabilitiesInfo/supported}} if it can support spatial
        rendering for the current audio output device without falling back to a
        non-spatial mix of the stream. {{AudioConfiguration/spatialRendering}} is only applicable to
        {{MediaDecodingConfiguration}} for types {{media-source}} and {{file}}.
      </p>
    </section>
  </section>

  <section>
      <h4 id='mediacapabilitieskeysystemconfiguration'>
        MediaCapabilitiesKeySystemConfiguration
      </h4>

      <xmp class='idl'>
        dictionary MediaCapabilitiesKeySystemConfiguration {
          required DOMString keySystem;
          DOMString initDataType = "";
          MediaKeysRequirement distinctiveIdentifier = "optional";
          MediaKeysRequirement persistentState = "optional";
          sequence<DOMString> sessionTypes;
          KeySystemTrackConfiguration audio;
          KeySystemTrackConfiguration video;
        };
      </xmp>

      <p class='note'>
        This dictionary refers to a number of types defined by
        [[ENCRYPTED-MEDIA]] (EME). Sequences of EME types are
        flattened to a single value whenever the intent of the sequence was to
        have {{Navigator/requestMediaKeySystemAccess()}} choose a subset it supports.
        With MediaCapabilities, callers provide the sequence across multiple
        calls, ultimately letting the caller choose which configuration to use.
      </p>

      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>keySystem</dfn>
        member represents a {{MediaKeySystemAccess/keySystem}} name as described in
        [[!ENCRYPTED-MEDIA]].
      </p>
      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>initDataType</dfn>
        member represents a single value from the {{MediaKeySystemConfiguration/initDataTypes}} sequence
        described in [[!ENCRYPTED-MEDIA]].
      </p>
      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>distinctiveIdentifier</dfn>
        member represents a {{MediaKeySystemConfiguration/distinctiveIdentifier}} requirement as
        described in [[!ENCRYPTED-MEDIA]].
      </p>
      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>persistentState</dfn>
        member represents a {{MediaKeySystemConfiguration/persistentState}} requirement as described in
        [[!ENCRYPTED-MEDIA]].
      </p>
      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>sessionTypes</dfn>
        member represents a sequence of required {{MediaKeySystemConfiguration/sessionTypes}} as
        described in [[!ENCRYPTED-MEDIA]].
      </p>
      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>audio</dfn> member
        represents a {{KeySystemTrackConfiguration}} associated with the {{AudioConfiguration}}.
      </p>
      <p>
        The <dfn for='MediaCapabilitiesKeySystemConfiguration' dict-member>video</dfn> member
        represents a {{KeySystemTrackConfiguration}} associated with the {{VideoConfiguration}}.
      </p>
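
      <div class='example'>
        <p>
          Because EME sequences are flattened to single values here, a caller
          that would have passed several candidates to
          {{Navigator/requestMediaKeySystemAccess()}} instead issues one query
          per candidate. The sketch below shows one way to build those
          queries. The helper name <code>expandEncryptedQueries</code> is
          hypothetical, and robustness strings are key system specific, so any
          values passed in are the caller's own.
        </p>

```javascript
// Hypothetical helper: expands EME-style candidate lists into one
// MediaDecodingConfiguration per combination. Each returned object is meant
// to be passed to a separate capability query.
function expandEncryptedQueries(base, keySystem, initDataTypes, videoRobustnessLevels) {
  const queries = [];
  for (const initDataType of initDataTypes) {
    for (const robustness of videoRobustnessLevels) {
      queries.push({
        ...base, // type, audio and video are shared by every query
        keySystemConfiguration: {
          keySystem,
          initDataType, // a single value, not a sequence
          video: { robustness },
        },
      });
    }
  }
  return queries;
}
```

        <p>
          The caller then inspects the result of each query and chooses which
          configuration to use, rather than letting the user agent pick a
          subset.
        </p>
      </div>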
  </section>

  <section>
    <h4 id='keysystemtrackconfiguration'>
      KeySystemTrackConfiguration
    </h4>

    <xmp class='idl'>
      dictionary KeySystemTrackConfiguration {
        DOMString robustness = "";
        DOMString? encryptionScheme = null;
      };
    </xmp>

    <p>
      The <dfn for='KeySystemTrackConfiguration' dict-member>robustness</dfn>
      member represents a {{MediaKeySystemMediaCapability/robustness}} level
      as described in [[!ENCRYPTED-MEDIA]].
    </p>

    <p>
      The <dfn for='KeySystemTrackConfiguration' dict-member>encryptionScheme</dfn>
      member represents an {{MediaKeySystemMediaCapability/encryptionScheme}}
      as described in [[!ENCRYPTED-MEDIA-DRAFT]].
    </p>
  </section>

  <section>
    <h3 id='media-capabilities-info'>Media Capabilities Information</h3>

    <xmp class='idl'>
      dictionary MediaCapabilitiesInfo {
        required boolean supported;
        required boolean smooth;
        required boolean powerEfficient;
      };
    </xmp>

    <xmp class='idl'>
      dictionary MediaCapabilitiesDecodingInfo : MediaCapabilitiesInfo {
        required MediaKeySystemAccess? keySystemAccess;
        MediaDecodingConfiguration configuration;
      };
    </xmp>

    <xmp class='idl'>
      dictionary MediaCapabilitiesEncodingInfo : MediaCapabilitiesInfo {
        MediaEncodingConfiguration configuration;
      };
    </xmp>

    <p>
      A {{MediaCapabilitiesInfo}} has associated <dfn dict-member
      for='MediaCapabilitiesInfo'>supported</dfn>, <dfn dict-member
      for='MediaCapabilitiesInfo'>smooth</dfn>, and <dfn dict-member
      for='MediaCapabilitiesInfo'>powerEfficient</dfn> fields, which are
      booleans.
    </p>

    <p>
      Encoding or decoding is considered <dfn>power efficient</dfn> when the
      power draw is optimal. The definition of optimal power draw for encoding
      or decoding is left to the user agent. However, a common implementation
      strategy is to consider hardware usage as indicative of optimal power
      draw. User agents SHOULD NOT mark hardware encoding or decoding as power
      efficient by default, as non-hardware-accelerated codecs can be just as
      efficient, particularly with low-resolution video. User agents SHOULD
      NOT take the device's power source into consideration when determining
      encoding power efficiency unless the device's power source has side
      effects such as enabling different encoding or decoding modules.
    </p>

    <p>
      A {{MediaCapabilitiesDecodingInfo}} has associated
      <dfn dict-member for=MediaCapabilitiesDecodingInfo>keySystemAccess</dfn>
      which is a {{MediaKeySystemAccess}} or <code>null</code> as
      appropriate.
    </p>

    <p class='note'>
      If the encrypted decoding configuration is supported, the
      resulting {{MediaCapabilitiesDecodingInfo}} will include a
      {{MediaKeySystemAccess}}. Authors may use this to create
      {{MediaKeys}} and set up encrypted playback.
    </p>

    <p>
      A {{MediaCapabilitiesDecodingInfo}} has an associated <dfn
      dict-member for='MediaCapabilitiesDecodingInfo'>configuration</dfn> which
      is the decoding configuration properties used to generate the
      {{MediaCapabilitiesDecodingInfo}}.
    </p>

    <p>
      A {{MediaCapabilitiesEncodingInfo}} has an associated <dfn dict-member
      for='MediaCapabilitiesEncodingInfo'>configuration</dfn> which
      is the encoding configuration properties used to generate the
      {{MediaCapabilitiesEncodingInfo}}.
    </p>
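
    <div class='example'>
      <p>
        This specification does not say how a page should act on the three
        booleans. As one possible policy, assumed here purely for
        illustration, a page might prefer results that are supported, smooth,
        and power efficient, then supported and smooth, then merely supported.
        The function names <code>rank</code> and <code>pickPreferred</code>
        are hypothetical.
      </p>

```javascript
// One possible ranking policy (not defined by this specification):
// supported + smooth + powerEfficient > supported + smooth > supported.
function rank(info) {
  if (!info.supported) return 0;
  if (!info.smooth) return 1;
  return info.powerEfficient ? 3 : 2;
}

// infos is an array of MediaCapabilitiesInfo-like objects, listed in the
// page's own order of preference. Returns the highest ranked supported
// entry, the earliest one on ties, or null if none is supported.
function pickPreferred(infos) {
  let best = null;
  for (const info of infos) {
    if (rank(info) > 0 && (best === null || rank(info) > rank(best))) {
      best = info;
    }
  }
  return best;
}
```

      <p>
        The <code>configuration</code> member of the selected result then
        identifies which queried configuration it corresponds to.
      </p>
    </div>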

    <section>
      <h3 id='info-algorithms'>Algorithms</h3>

      <section>
        <h4 id='create-media-capabilities-encoding-info'>
          <dfn>Create a MediaCapabilitiesEncodingInfo</dfn>
        </h4>
        <p>
          Given a {{MediaEncodingConfiguration}} <var>configuration</var>, this
          algorithm returns a {{MediaCapabilitiesEncodingInfo}}. The following
          steps are run:
          <ol>
            <li>
              Let <var>info</var> be a new {{MediaCapabilitiesEncodingInfo}}
              instance. Unless stated otherwise, reading and writing apply to
              <var>info</var> for the next steps.
            </li>
            <li>
              Set {{MediaCapabilitiesEncodingInfo/configuration}} to be a
              new {{MediaEncodingConfiguration}}. For every property in <var>
              configuration</var> create a new property with the same name and
              value in {{MediaCapabilitiesEncodingInfo/configuration}}.
            </li>
            <li>
              If the user agent is able to encode the media represented by
              <var>configuration</var>, set {{MediaCapabilitiesInfo/supported}}
              to <code>true</code>. Otherwise set it to <code>false</code>.
            </li>
            <li>
              If the user agent is able to encode the media represented by
              <var>configuration</var> at the indicated framerate, set
              {{MediaCapabilitiesInfo/smooth}} to <code>true</code>. Otherwise
              set it to <code>false</code>.
            </li>
            <li>
              If the user agent is able to encode the media represented by
              <var>configuration</var> in a [=power efficient=] manner, set
              {{MediaCapabilitiesInfo/powerEfficient}} to <code>true</code>.
              Otherwise set it to <code>false</code>.
            </li>
            <li>
              Return <var>info</var>.
            </li>
          </ol>
        </p>
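        <p>
          The steps above can be read as the following non-normative sketch.
          The <code>probe</code> object and its three methods are hypothetical
          stand-ins for the user agent's internal knowledge of its encoders;
          they are assumptions made for illustration and are not part of this
          API.
        </p>

```javascript
// Non-normative sketch of "Create a MediaCapabilitiesEncodingInfo".
// `probe` is hypothetical: it models what the user agent knows internally.
function createEncodingInfo(configuration, probe) {
  // Copy every property of the input configuration by name and value.
  const info = { configuration: { ...configuration } };

  // Each flag is always set to either true or false.
  info.supported = Boolean(probe.canEncode(configuration));
  info.smooth = Boolean(probe.canEncodeAtFramerate(configuration));
  info.powerEfficient = Boolean(probe.canEncodePowerEfficiently(configuration));

  return info;
}
```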
      </section>

      <section>
        <h4 id='create-media-capabilities-decoding-info'>
          <dfn>Create a MediaCapabilitiesDecodingInfo</dfn>
        </h4>
        <p>
          Given a {{MediaDecodingConfiguration}} <var>configuration</var>, this
          algorithm returns a {{MediaCapabilitiesDecodingInfo}}. The following
          steps are run:
          <ol>
            <li>
              Let <var>info</var> be a new {{MediaCapabilitiesDecodingInfo}}
              instance. Unless stated otherwise, reading and writing apply to
              <var>info</var> for the next steps.
            </li>
            <li>
              Set {{MediaCapabilitiesDecodingInfo/configuration}} to be a new
              {{MediaDecodingConfiguration}}. For every property in <var>
              configuration</var> create a new property with the same name and
              value in {{MediaCapabilitiesDecodingInfo/configuration}}.
            </li>
            <li>
              If <code>configuration.keySystemConfiguration</code> [=map/exists=]:
              <ol>
                <li>
                  Set {{MediaCapabilitiesDecodingInfo/keySystemAccess}}
                  to the result of running the <a>Check Encrypted Decoding
                  Support</a> algorithm with <var>configuration</var>.
                </li>
                <li>
                  If {{MediaCapabilitiesDecodingInfo/keySystemAccess}}
                  is not <code>null</code> set
                  {{MediaCapabilitiesInfo/supported}} to
                  <code>true</code>. Otherwise set it to <code>false</code>.
                </li>
              </ol>
            </li>
            <li>
              Otherwise, run the following steps:
              <ol>
                <li>
                  Set {{MediaCapabilitiesDecodingInfo/keySystemAccess}}
                  to <code>null</code>.
                </li>
                <li>
                  If the user agent is able to decode the media represented
                  by <var>configuration</var>, set
                  {{MediaCapabilitiesInfo/supported}} to <code>true</code>.
                </li>
                <li>Otherwise, set it to <code>false</code>.</li>
              </ol>
            </li>
            <li>
              If the user agent is able to decode the media represented by
              <var>configuration</var> at the indicated framerate
              without dropping frames, set {{MediaCapabilitiesInfo/smooth}}
              to <code>true</code>. Otherwise set it to <code>false</code>.
            </li>
            <li>
              If the user agent is able to decode the media represented by
              <var>configuration</var> in a [=power efficient=]
              manner, set {{MediaCapabilitiesInfo/powerEfficient}} to
              <code>true</code>. Otherwise set it to <code>false</code>.
            </li>
            <li>
              Return <var>info</var>.
            </li>
          </ol>
        </p>
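        <p>
          The branching on <code>keySystemConfiguration</code> above can be
          sketched, non-normatively, as follows. The <code>probe</code> object
          and its methods are hypothetical placeholders for user agent
          internals, including the <a>Check Encrypted Decoding Support</a>
          algorithm; they are assumptions and not part of this API.
        </p>

```javascript
// Non-normative sketch of "Create a MediaCapabilitiesDecodingInfo".
// `probe` is hypothetical: it models what the user agent knows internally.
function createDecodingInfo(configuration, probe) {
  const info = { configuration: { ...configuration } };

  if (configuration.keySystemConfiguration !== undefined) {
    // Encrypted case: support is decided by whether access was obtained.
    info.keySystemAccess = probe.checkEncryptedDecodingSupport(configuration);
    info.supported = info.keySystemAccess !== null;
  } else {
    // Clear case: no key system access, support depends on the decoder alone.
    info.keySystemAccess = null;
    info.supported = Boolean(probe.canDecode(configuration));
  }

  // smooth and powerEfficient are evaluated in both cases.
  info.smooth = Boolean(probe.canDecodeWithoutDroppingFrames(configuration));
  info.powerEfficient = Boolean(probe.canDecodePowerEfficiently(configuration));
  return info;
}
```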
      </section>

      <section>
        <h4 id='is-encrypted-decode-supported'>
          <dfn>Check Encrypted Decoding Support</dfn>
        </h4>
        <p>
          Given a {{MediaDecodingConfiguration}} <var>config</var> where
          {{keySystemConfiguration}} [=map/exists=], this algorithm returns a
          {{MediaKeySystemAccess}} or <code>null</code> as appropriate. The
          following steps are run:
          <ol>
            <li>
              If the {{keySystem}} member of
              <code>config.keySystemConfiguration</code> is not one of the
              [=Key Systems=] supported by the user agent, return
              <code>null</code>. String comparison is case-sensitive.
            </li>
            <li>
              Let <var>origin</var> be the [=/origin=] of the calling context's
              <a>Document</a>.
            </li>
            <li>
              Let <var>implementation</var> be the implementation of <code>config.keySystemConfiguration.keySystem</code>.
            </li>

            <li>
              Let <var>emeConfiguration</var> be a new
              {{MediaKeySystemConfiguration}}, and initialize it as follows:
            </li>
            <ol>
              <li>
                Set the {{MediaKeySystemConfiguration/initDataTypes}} attribute to a sequence containing
                <code>config.keySystemConfiguration.initDataType</code>.
              </li>
              <li>
                Set the {{MediaKeySystemConfiguration/distinctiveIdentifier}} attribute to
                <code>config.keySystemConfiguration.distinctiveIdentifier</code>.
              </li>
              <li>
                Set the {{MediaKeySystemConfiguration/persistentState}} attribute to
                <code>config.keySystemConfiguration.persistentState</code>.
              </li>
              <li>
                Set the {{MediaKeySystemConfiguration/sessionTypes}} attribute to
                <code>config.keySystemConfiguration.sessionTypes</code>.
              </li>
              <li>
                If {{MediaConfiguration/audio}} [=map/exists=] in <var>config</var>, set the
                {{MediaKeySystemConfiguration/audioCapabilities}} attribute to a sequence containing a
                single {{MediaKeySystemMediaCapability}}, initialized as
                follows:
                <ol>
                  <li>
                    Set the {{MediaKeySystemMediaCapability/contentType}} attribute to
                    <code>config.audio.contentType</code>.
                  </li>
                  <li>
                    If <code>config.keySystemConfiguration.audio</code>
                    [=map/exists=]:
                    <ol>
                      <li>
                        Set the {{MediaKeySystemMediaCapability/robustness}} attribute to <code>
                        config.keySystemConfiguration.audio.robustness</code>.
                      </li>
                      <li>
                        Set the {{MediaKeySystemMediaCapability/encryptionScheme}} attribute to <code>
                        config.keySystemConfiguration.audio.encryptionScheme</code>.
                      </li>
                    </ol>
                  </li>
                </ol>
              </li>
              <li>
                If {{MediaConfiguration/video}} [=map/exists=] in <var>config</var>, set the
                {{MediaKeySystemConfiguration/videoCapabilities}} attribute to a sequence containing a single
                {{MediaKeySystemMediaCapability}}, initialized as follows:
                <ol>
                  <li>
                    Set the {{MediaKeySystemMediaCapability/contentType}} attribute to
                    <code>config.video.contentType</code>.
                  </li>
                  <li>
                    If <code>config.keySystemConfiguration.video</code> [=map/exists=]:
                    <ol>
                      <li>
                        Set the {{MediaKeySystemMediaCapability/robustness}} attribute to <code>
                        config.keySystemConfiguration.video.robustness</code>.
                      </li>
                      <li>
                        Set the {{MediaKeySystemMediaCapability/encryptionScheme}} attribute to <code>
                        config.keySystemConfiguration.video.encryptionScheme</code>.
                      </li>
                    </ol>
                  </li>
                </ol>
              </li>
            </ol>
            <li>
              Let <var>supported configuration</var> be the result of
              executing the [=Get Supported Configuration=]
              algorithm on <var>implementation</var>,
              <var>emeConfiguration</var>, and <var>origin</var>.
            </li>
            <li>
              If <var>supported configuration</var> is
              <code>NotSupported</code>, return <code>null</code> and abort
              these steps.
            </li>
            <li>
              Let <var>access</var> be a new {{MediaKeySystemAccess}}
              object, and initialize it as follows:
              <ol>
                <li>
                  Set the {{MediaKeySystemAccess/keySystem}} attribute to
                  <code>config.keySystemConfiguration.keySystem</code>.
                </li>
                <li>
                  Let the <var>configuration</var> value be
                  <var>supported configuration</var>.
                </li>
                <li>
                  Let the <var ignore=''>cdm implementation</var> value be
                  <var>implementation</var>.
                </li>
              </ol>
            </li>
            <li>Return <var>access</var>.</li>
          </ol>
        </p>
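        <p>
          As a non-normative illustration, the initialization of
          <var>emeConfiguration</var> described above could look roughly like
          the sketch below. The function name <code>toEmeConfiguration</code>
          is hypothetical. The sketch operates on plain objects, so it assumes
          that dictionary default values have already been applied to
          <var>config</var>.
        </p>

```javascript
// Non-normative sketch: derive a MediaKeySystemConfiguration-shaped object
// from a decoding configuration. `toEmeConfiguration` is a hypothetical name.
function toEmeConfiguration(config) {
  const ks = config.keySystemConfiguration;
  const eme = {
    initDataTypes: [ks.initDataType],
    distinctiveIdentifier: ks.distinctiveIdentifier,
    persistentState: ks.persistentState,
    sessionTypes: ks.sessionTypes,
  };

  for (const kind of ['audio', 'video']) {
    // Only describe the tracks that exist in the decoding configuration.
    if (config[kind] === undefined) continue;

    const capability = { contentType: config[kind].contentType };
    if (ks[kind] !== undefined) {
      capability.robustness = ks[kind].robustness;
      capability.encryptionScheme = ks[kind].encryptionScheme;
    }
    // A sequence containing a single capability per track kind.
    eme[kind + 'Capabilities'] = [capability];
  }
  return eme;
}
```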
      </section>
  </section>

  <section>
    <h3 id='navigators-extensions'>Navigator and WorkerNavigator extensions</h3>

    <xmp class='idl'>
      [Exposed=Window]
      partial interface Navigator {
        [SameObject] readonly attribute MediaCapabilities mediaCapabilities;
      };
    </xmp>

    <xmp class='idl'>
      [Exposed=Worker]
      partial interface WorkerNavigator {
        [SameObject] readonly attribute MediaCapabilities mediaCapabilities;
      };
    </xmp>
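    <p>
      Because the same attribute is exposed on both {{Navigator}} and
      {{WorkerNavigator}}, a page or worker script can reach it through
      <code>navigator.mediaCapabilities</code>. The following non-normative
      sketch shows one way to feature-detect it; the helper name
      <code>getMediaCapabilities</code> is hypothetical.
    </p>

```javascript
// Non-normative sketch: return the MediaCapabilities object if the current
// environment exposes one, or null otherwise. The helper name is hypothetical.
function getMediaCapabilities(nav = globalThis.navigator) {
  if (nav && 'mediaCapabilities' in nav) {
    return nav.mediaCapabilities;
  }
  return null;
}
```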
  </section>

  <section>
    <h3 id='media-capabilities-interface'>Media Capabilities Interface</h3>

    <xmp class='idl'>
      [Exposed=(Window, Worker)]
      interface MediaCapabilities {
        [NewObject] Promise<MediaCapabilitiesDecodingInfo> decodingInfo(MediaDecodingConfiguration configuration);
        [NewObject] Promise<MediaCapabilitiesEncodingInfo> encodingInfo(MediaEncodingConfiguration configuration);
      };
    </xmp>

  <section>
    <h4 id='task-source'>Media Capabilities Task Source</h4>
    <p>
      The [=task source=] for the tasks mentioned in this specification
      is the <dfn>media capabilities task source</dfn>.
    </p>

    <p>
      When an algorithm <dfn lt="queue a Media Capabilities task">queues
      a Media Capabilities task</dfn> <var>T</var>, the user agent
      MUST [=queue a global task=] <var>T</var> on the [=media capabilities
      task source=] using the [=global object=] of [=the current realm record=].
    </p>
  </section>
  <section>
    <h4 id='decodinginfo-method'>decodingInfo() Method</h4>
    <p>
      The {{decodingInfo()}} method MUST run the following steps:
      <ol>
        <li>
          If <var>configuration</var> is not a <a>valid
          MediaDecodingConfiguration</a>, return a Promise rejected with a
          newly created {{TypeError}}.
        </li>
        <li>
          If <code>configuration.keySystemConfiguration</code> [=map/exists=],
          run the following substeps:
          <ol>
            <li>
              If the [=/global object=] is of type {{WorkerGlobalScope}},
              return a Promise rejected with a newly created {{DOMException}}
              whose name is {{InvalidStateError}}.
            </li>
            <li>
              If the [=/global object's=] <a>relevant settings object</a> is a
              [=non-secure context=], return a Promise rejected with a newly
              created {{DOMException}} whose name is {{SecurityError}}.
            </li>
          </ol>
        </li>
        <li>
          Let <var>p</var> be a new Promise.
        </li>
        <li>
          Run the following steps <a>in parallel</a>:
          <ol>
            <li>
              Run the <a>Create a MediaCapabilitiesDecodingInfo</a> algorithm
              with <var>configuration</var>.
            </li>
            <li>
              <a>Queue a Media Capabilities task</a> to resolve <var>p</var>
              with its result.
            </li>
          </ol>
        </li>
        <li>
          Return <var>p</var>.
        </li>
      </ol>
    </p>

    <p class='note'>
      Calling {{decodingInfo()}} with a {{keySystemConfiguration}} present
      may have user-visible effects, including requests for user consent. Such
      calls should only be made when the author intends to create and use a
      {{MediaKeys}} object with the provided configuration.
    </p>
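    <p>
      The order of the early rejections in the steps above can be summarized
      by the following non-normative sketch. The <code>env</code> object and
      its members are hypothetical; they stand in for checks that the user
      agent performs internally.
    </p>

```javascript
// Non-normative sketch: name of the error decodingInfo() would reject with
// before doing any work in parallel, or null if no early rejection applies.
// `env` is hypothetical and models user agent state.
function decodingInfoRejection(configuration, env) {
  if (!env.isValidDecodingConfiguration(configuration)) {
    return 'TypeError';
  }
  if (configuration.keySystemConfiguration !== undefined) {
    // Encrypted queries are rejected in workers first, then in
    // non-secure contexts.
    if (env.isWorkerGlobalScope) return 'InvalidStateError';
    if (!env.isSecureContext) return 'SecurityError';
  }
  return null;
}
```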
    </section>

    <section>
    <h4 id='encodinginfo-method'>encodingInfo() Method</h4>
    <p>
      The {{encodingInfo()}} method MUST run the following steps:
      <ol>
        <li>
          If <var>configuration</var> is not a <a>valid MediaConfiguration</a>,
          return a Promise rejected with a newly created {{TypeError}}.
        </li>
        <li>
          Let <var>p</var> be a new Promise.
        </li>
        <li>
          Run the following steps <a>in parallel</a>:
            <ol>
              <li>
                Run the <a>Create a MediaCapabilitiesEncodingInfo</a>
                algorithm with <var>configuration</var>.
              </li>
              <li>
               <a>Queue a Media Capabilities task</a> to resolve <var>p</var> with
               its result.
             </li>
           </ol>
        </li>
        <li>
          Return <var>p</var>.
        </li>
      </ol>
    </p>
    </section>
  </section>
</section>

<section class='non-normative'>
  <h2 id='security-privacy-considerations'>
    Security and Privacy Considerations
  </h2>

  <section>
    <p>
      This specification does not introduce any security-sensitive information
      or APIs, but it provides easier access to some information that can be
      used to fingerprint users.
    </p>

    <section>
      <h3 id='decoding-encoding-fingerprinting'>
        Decoding/Encoding and Fingerprinting
      </h3>

      <p>
        The information exposed by the decoding/encoding capabilities can
        already be discovered via experimentation, although the API will
        likely provide more accurate and consistent information. This
        information is expected to correlate highly with other
        information already available to web pages, as a given class of
        device is expected to have very similar decoding/encoding capabilities.
        In other words, high-end devices from a certain year are expected to
        decode some types of video that older devices may not. Therefore, the
        entropy added by this API is not expected to be significant.
      </p>

      <p>
        HDR detection is more nuanced. Adding {{colorGamut}}, {{transferFunction}}, and
        {{hdrMetadataType}} has the potential to add significant entropy. However,
        for UAs whose decoders are implemented in software, and whose
        capabilities are therefore fixed across devices, this feature adds no
        effective entropy. Additionally, in many cases, devices tend to fall
        into large categories within which capabilities are similar, thus
        minimizing effective entropy.
      </p>

      <p>
        If an implementation wishes to implement a fingerprint-proof version of
        this specification, it is recommended to report a fixed set of
        capabilities (e.g., decoding up to 1080p VP9) instead of always
        returning yes or always returning no, as the latter approach could
        considerably degrade the user's experience. Another mitigation could be
        to limit these Web APIs to top-level browsing contexts. Yet another is
        to use a privacy budget that throttles and/or blocks calls to the API
        above a threshold.
      </p>
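      <p>
        As a rough, non-normative illustration of the fixed-capabilities
        mitigation, an implementation could answer every query from a single
        profile shared by all users instead of consulting the actual hardware.
        The profile contents, the helper name
        <code>reportFromFixedProfile</code>, and the simplified
        <code>codecs</code> parsing below are all assumptions made for this
        sketch, which only handles video configurations.
      </p>

```javascript
// Hypothetical profile reported for every user, regardless of hardware.
const FIXED_PROFILE = { maxHeight: 1080, codecPrefixes: ['vp09'] };

// Non-normative sketch: answer a query from the fixed profile only.
function reportFromFixedProfile(configuration, profile = FIXED_PROFILE) {
  const video = configuration.video;
  if (video === undefined) {
    return { supported: false, smooth: false, powerEfficient: false };
  }

  // Simplified extraction of the first codec in the codecs parameter.
  const match = /codecs\s*=\s*"?([^";,]+)/i.exec(video.contentType);
  const codec = match ? match[1].trim().toLowerCase() : '';

  const codecOk = profile.codecPrefixes.some((prefix) => codec.startsWith(prefix));
  const sizeOk = !(video.height > profile.maxHeight);
  const supported = codecOk ? sizeOk : false;

  // The same answer is given on every device, so it carries no entropy.
  return { supported, smooth: supported, powerEfficient: false };
}
```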
    </section>
  </section>
</section>

<section>
  <h2 id='examples'>Examples</h2>

  <section>
    <h3 id='example1'>Query playback capabilities with {{decodingInfo()}}</h3>
      <p>
        The following example shows how to use {{decodingInfo()}} to query
        media playback capabilities when using Media Source Extensions
        [[media-source]].
      </p>

      <div class="example" highlight="javascript">
        <pre>
          &lt;script>
            const contentType = 'video/mp4;codecs=avc1.640028';

            const configuration = {
              type: 'media-source',
              video: {
                contentType: contentType,
                width: 640,
                height: 360,
                bitrate: 2000,
                framerate: 29.97
              }
            };

            navigator.mediaCapabilities.decodingInfo(configuration)
              .then((result) => {
                console.log('Decoding of ' + contentType + ' is'
                  + (result.supported ? '' : ' NOT') + ' supported,'
                  + (result.smooth ? '' : ' NOT') + ' smooth and'
                  + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
              })
              .catch((err) => {
                console.error(err, ' caused decodingInfo to reject');
              });
          &lt;/script>
        </pre>
      </div>

      <p>
        The following examples show how to use {{decodingInfo()}} to query
        WebRTC receive capabilities [[webrtc]].
      </p>

      <div class="example" highlight="javascript">
        <pre>
          &lt;script>
            const contentType = 'video/VP8';

            const configuration = {
              type: 'webrtc',
              video: {
                contentType: contentType,
                width: 640,
                height: 360,
                bitrate: 2000,
                framerate: 25
              }
            };

            navigator.mediaCapabilities.decodingInfo(configuration)
              .then((result) => {
                console.log('Decoding of ' + contentType + ' is'
                  + (result.supported ? '' : ' NOT') + ' supported,'
                  + (result.smooth ? '' : ' NOT') + ' smooth and'
                  + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
              })
              .catch((err) => {
                console.error(err, ' caused decodingInfo to reject');
              });
          &lt;/script>
        </pre>
      </div>

      <div class="example" highlight="javascript">
        <pre>
          &lt;script>
            const contentType = 'video/H264;level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f';

            const configuration = {
              type: 'webrtc',
              video: {
                contentType: contentType,
                width: 640,
                height: 360,
                bitrate: 2000,
                framerate: 25
              }
            };

            navigator.mediaCapabilities.decodingInfo(configuration)
              .then((result) => {
                console.log('Decoding of ' + contentType + ' is'
                  + (result.supported ? '' : ' NOT') + ' supported,'
                  + (result.smooth ? '' : ' NOT') + ' smooth and'
                  + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
              })
              .catch((err) => {
                console.error(err, ' caused decodingInfo to reject');
              });
          &lt;/script>
        </pre>
      </div>
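      <p>
        A common follow-up to the queries above is to try several candidate
        configurations in order of preference and keep the first one that is
        both supported and smooth. The sketch below is one possible way to do
        this; the helper name <code>pickFirstSmooth</code> is hypothetical,
        and the <code>mediaCapabilities</code> argument is expected to be
        <code>navigator.mediaCapabilities</code> or an object with the same
        {{decodingInfo()}} method.
      </p>

```javascript
// Sketch: return the first configuration that decodes smoothly, or null.
// `pickFirstSmooth` is a hypothetical helper, not part of this API.
async function pickFirstSmooth(mediaCapabilities, configurations) {
  for (const configuration of configurations) {
    try {
      const result = await mediaCapabilities.decodingInfo(configuration);
      if (result.supported && result.smooth) {
        return configuration;
      }
    } catch (err) {
      // A rejection (for example a TypeError for an invalid configuration)
      // only disqualifies this candidate; keep trying the others.
    }
  }
  return null;
}

// Possible usage in a page, with candidates ordered from best to worst:
// pickFirstSmooth(navigator.mediaCapabilities, [config1080p, config720p])
//   .then((chosen) => console.log('Chosen configuration:', chosen));
```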

  </section>

  <section>
    <h3 id='example2'>Query recording capabilities with {{encodingInfo()}}</h3>

      <div class="note">
        The following example shows how to use {{encodingInfo()}} to query
        WebRTC send capabilities [[webrtc]] including the optional field
        {{VideoConfiguration/scalabilityMode}}.
      </div>

      <div class="example" highlight="javascript">
        <pre>
          &lt;script>
            const contentType = 'video/VP9';

            const configuration = {
              type: 'webrtc',
              video: {
                contentType: contentType,
                width: 640,
                height: 480,
                bitrate: 10000,
                framerate: 29.97,
                scalabilityMode: "L3T3_KEY"
              }
            };

            navigator.mediaCapabilities.encodingInfo(configuration)
              .then((result) => {
                console.log(contentType + ' is:'
                  + (result.supported ? '' : ' NOT') + ' supported,'
                  + (result.smooth ? '' : ' NOT') + ' smooth and'
                  + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
              })
              .catch((err) => {
                console.error(err, ' caused encodingInfo to reject');
              });
          &lt;/script>
        </pre>
      </div>

      <div class="note">
        The following example can also be found in
        <a href="https://codepen.io/miguelao/pen/bWNwej/left?editors=0010#0">
        this codepen</a> with minimal modifications.
      </div>

      <div class="example" highlight="javascript">
        <pre>
          &lt;script>
            const contentType = 'video/webm;codecs=vp8';

            const configuration = {
              type: 'record',
              video: {
                contentType: contentType,
                width: 640,
                height: 480,
                bitrate: 10000,
                framerate: 29.97
              }
            };

            navigator.mediaCapabilities.encodingInfo(configuration)
              .then((result) => {
                console.log(contentType + ' is:'
                  + (result.supported ? '' : ' NOT') + ' supported,'
                  + (result.smooth ? '' : ' NOT') + ' smooth and'
                  + (result.powerEfficient ? '' : ' NOT') + ' power efficient');
              })
              .catch((err) => {
                console.error(err, ' caused encodingInfo to reject');
              });
          &lt;/script>
        </pre>
      </div>
  </section>
</section>