PlayReady EME "encrypted" PSSH initData payload ignored #6005

Closed
jrivany opened this issue Nov 29, 2023 · 25 comments · Fixed by #6644

Comments

@jrivany

jrivany commented Nov 29, 2023

Is your feature request related to a problem? Please describe.

I'm looking at supporting media encrypted with multiple DRM systems (e.g. PlayReady and Widevine). In doing so, I've generated MP4 init segments containing two PSSH boxes.

Currently the PSSH parsing code that handles the initData payload is written to assume there is only one PSSH box being passed:

https://github.com/video-dev/hls.js/blob/master/src/utils/mp4-tools.ts#L1300

However the CENC initialization data spec states that this could be multiple boxes concatenated:

https://www.w3.org/TR/eme-initdata-cenc/#format
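
For illustration, here is a minimal sketch of how a "cenc" initData payload made of several concatenated PSSH boxes could be walked. `splitPsshBoxes`, `PsshBox` and `toHex` are hypothetical names, not hls.js APIs, and the sketch assumes 32-bit box sizes only (no `size == 0` or 64-bit `largesize` handling).

```typescript
// Hypothetical helper (not part of hls.js): returns one entry per
// top-level 'pssh' box found in a concatenated 'cenc' initData payload.
interface PsshBox {
  systemId: string; // 32 lowercase hex characters
  version: number;
  keyIds: string[]; // only populated for version 1 boxes
  data: Uint8Array; // the box's Data field
  box: Uint8Array; // the complete box, header included
}

const PSSH_FOURCC = 0x70737368; // 'pssh'

function toHex(bytes: Uint8Array): string {
  let hex = '';
  for (let i = 0; i < bytes.length; i++) {
    hex += (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16);
  }
  return hex;
}

function splitPsshBoxes(initData: Uint8Array): PsshBox[] {
  const view = new DataView(
    initData.buffer,
    initData.byteOffset,
    initData.byteLength,
  );
  const boxes: PsshBox[] = [];
  let offset = 0;
  while (offset + 8 <= initData.byteLength) {
    const size = view.getUint32(offset);
    const end = offset + size;
    // Stop on truncated or implausible sizes rather than guessing.
    if (size < 8 || end > initData.byteLength) break;
    // A v0 pssh needs at least size(4) type(4) version/flags(4)
    // systemId(16) dataSize(4) = 32 bytes.
    if (view.getUint32(offset + 4) === PSSH_FOURCC && size >= 32) {
      const version = view.getUint8(offset + 8);
      let cursor = offset + 12;
      const systemId = toHex(initData.subarray(cursor, cursor + 16));
      cursor += 16;
      const keyIds: string[] = [];
      if (version > 0) {
        const kidCount = view.getUint32(cursor);
        cursor += 4;
        for (let i = 0; i < kidCount && cursor + 16 <= end; i++) {
          keyIds.push(toHex(initData.subarray(cursor, cursor + 16)));
          cursor += 16;
        }
      }
      if (cursor + 4 <= end) {
        const dataSize = view.getUint32(cursor);
        cursor += 4;
        boxes.push({
          systemId,
          version,
          keyIds,
          data: initData.subarray(cursor, Math.min(cursor + dataSize, end)),
          box: initData.subarray(offset, end),
        });
      }
    }
    offset = end;
  }
  return boxes;
}
```

With a Widevine + PlayReady init segment, one would expect two entries whose `systemId` values are `edef8ba979d64acea3c827dcd51d21ed` and `9a04f07998404286ab92e65be0885f95` respectively.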

Describe the solution you'd like

I would like it if the EME controller could support media that contains multiple adjacent PSSH boxes for different key systems.

Additional context

No response

jrivany added the Feature proposal and Needs Triage labels Nov 29, 2023
@robwalch
Collaborator

robwalch commented Nov 30, 2023

Hi @jrivany,

Are you providing the correct KEY tags in your HLS playlists?

The parsePssh function you are referencing was added specifically for assets with clear-lead keys like https://storage.googleapis.com/shaka-demo-assets/angel-one-widevine-hls/hls.m3u8 where HLS.js receives an encrypted event from EME before encountering a fragment with a KEY tag in the HLS playlist. If it fails to extract a key ID, that shouldn't prevent the key system from being selected and key session started using playlist keys (assuming parsePssh returns null and not a bad key id).

That being said, I'd be happy to review and test changes that could support additional configurations.

robwalch removed the Needs Triage label Nov 30, 2023
@jrivany
Author

jrivany commented Nov 30, 2023

Hey Rob, thanks for the quick reply.

Are you providing the correct KEY tags in your HLS playlists?

In this instance no. I should provide more context: we're currently writing our own packaging layer, and were experimenting with the viability of different options.

In this particular experiment I was casually ignoring 4.4.4.4 in the spec, specifically:

If the Media Playlist file does not contain an EXT-X-KEY tag, then Media Segments are not encrypted.

In theory there was no technical reason we couldn't rely solely on the PSSH box in the content itself, which currently works fine when there's only a single PSSH box.

This is definitely a more fringe, off-spec use-case that would only be a mild simplification on the manifest generation side, which I'm now reconsidering the value of.

@robwalch
Collaborator

In theory there was no technical reason we couldn't rely solely on the PSSH box in the content itself, which currently works fine when there's only a single PSSH box.

This is definitely a more fringe, off-spec use-case that would only be a mild simplification on the manifest generation side, which I'm now reconsidering the value of.

For clear-lead playlists where the pssh is in the init segment, but the first segment or two are not encrypted, CDM setup can begin in response to appending the init segment and EME dispatching the "encrypted" event.

In this example, the pssh is present in "init.mp4", but first encrypted segment is "s3.mp4":

#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:5
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="init.mp4"
#EXTINF:4.000,
s1.mp4
#EXTINF:4.000,
s2.mp4
#EXT-X-DISCONTINUITY
#EXT-X-KEY:METHOD=SAMPLE-AES-CTR,URI="data:text/plain;base64,AAY=",KEYID=0x800000000,KEYFORMATVERSIONS="1",KEYFORMAT="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"
#EXTINF:4.000,
s3.mp4

If HLS.js didn't begin the key session on the "encrypted" event, it would not begin until just before loading the third segment - something to consider if you are packaging media this way with more than one key system or multiple keys in the init segment.
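
As a rough illustration of the clear-lead shape described above, the sketch below scans a media playlist and reports which segments precede the first EXT-X-KEY tag. `findClearLead` and `ClearLead` are hypothetical names used only for this example; hls.js has its own playlist parser and does not expose such a helper.

```typescript
// Hypothetical helper for illustration only.
interface ClearLead {
  clearSegments: string[]; // segment URIs before the first active EXT-X-KEY
  firstKeyedSegment: string | null;
}

function findClearLead(playlist: string): ClearLead {
  const clearSegments: string[] = [];
  let keyActive = false;
  for (const raw of playlist.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith('#EXT-X-KEY:')) {
      // METHOD=NONE means the following segments are not encrypted.
      keyActive = !/METHOD=NONE\b/.test(line);
    } else if (line !== '' && !line.startsWith('#')) {
      // Any non-empty, non-tag line is treated as a segment URI.
      if (keyActive) {
        return { clearSegments, firstKeyedSegment: line };
      }
      clearSegments.push(line);
    }
  }
  return { clearSegments, firstKeyedSegment: null };
}
```

For the playlist above, this would report `s1.mp4` and `s2.mp4` as clear and `s3.mp4` as the first keyed segment, which is the point at which a playlist-driven key session would otherwise start.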

Thanks for sharing the eme-initdata-cenc/#format details. I don't implement features without sample assets which is why this one only made it so far. I look forward to hearing which way you go.

@cjpillsbury
Collaborator

Hey there @robwalch this actually showed up as an issue for us with multi-drm (widevine + playready pssh'es). We have an "in-app" (aka in the core of Mux Player) solution that accounts for it here muxinc/elements#957, but I'm hoping to dig in and see if there might be some assumptions to unwind in e.g. the mp4 -> mp4 transmuxing. Happy to share test content out of band if you wanna dig in as well.

@robwalch
Collaborator

robwalch commented Jul 23, 2024

Have a look at generateRequestWithPreferredKeySession. Would it help if hls.js included the "reason" for the key-session request in the generateRequest callback? If this is called more than once or with different reasons prior to renewal, that would be because of the request being generated for a playlist-key or an encrypted media element event:

  • 'playlist-key'
  • 'encrypted-event-key-match'
  • 'encrypted-event-no-match'
  • 'expired' (renewal only)

private generateRequestWithPreferredKeySession(
  context: MediaKeySessionContext,
  initDataType: string,
  initData: ArrayBuffer | null,
  reason:
    | 'playlist-key'
    | 'encrypted-event-key-match'
    | 'encrypted-event-no-match'
    | 'expired',
): Promise<MediaKeySessionContext> | never {
  const generateRequestFilter =
    this.config.drmSystems?.[context.keySystem]?.generateRequest;
  if (generateRequestFilter) {
    try {
      const mappedInitData: ReturnType<typeof generateRequestFilter> =
        generateRequestFilter.call(this.hls, initDataType, initData, context);
      if (!mappedInitData) {
        throw new Error(
          'Invalid response from configured generateRequest filter',
        );
      }
      initDataType = mappedInitData.initDataType;
      initData = context.decryptdata.pssh = mappedInitData.initData
        ? new Uint8Array(mappedInitData.initData)
        : null;
    } catch (error) {
      this.warn(error.message);
      if (this.hls?.config.debug) {
        throw error;
      }
    }
  }
  if (initData === null) {
    this.log(`Skipping key-session request for "${reason}" (no initData)`);
    return Promise.resolve(context);
  }
  const keyId = this.getKeyIdString(context.decryptdata);
  this.log(
    `Generating key-session request for "${reason}": ${keyId} (init data type: ${initDataType} length: ${
      initData ? initData.byteLength : null
    })`,
  );

It would help to know which of these you are encountering.

Please also have a look at the context argument for the generateRequest callback. initData and type are passed through from the "encrypted" event. It is the context: MediaKeySessionContext argument that carries decryptdata: LevelKey with parsed pssh which will be used as the pssh response if a generateRequest callback filter is not supplied.
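
To make the callback shape concrete, here is a hedged sketch of a generateRequest-style filter that narrows a multi-box "cenc" payload to the box for the session's key system. The argument order and return shape follow the snippet above; `KeySessionContextLike`, `GenerateRequestResult` and `generateRequestFilter` are local stand-ins written for this example rather than the real hls.js types, and whether narrowing is needed at all depends on the CDM and license server.

```typescript
// Local stand-ins for the hls.js types; only the fields used here are modelled.
interface KeySessionContextLike {
  keySystem: string; // e.g. 'com.microsoft.playready'
}
interface GenerateRequestResult {
  initDataType: string;
  initData: ArrayBuffer | null;
}

// DRM system IDs as lowercase hex.
const SYSTEM_IDS: Record<string, string> = {
  'com.widevine.alpha': 'edef8ba979d64acea3c827dcd51d21ed',
  'com.microsoft.playready': '9a04f07998404286ab92e65be0885f95',
};

function generateRequestFilter(
  initDataType: string,
  initData: ArrayBuffer | null,
  context: KeySessionContextLike,
): GenerateRequestResult {
  const wanted = SYSTEM_IDS[context.keySystem];
  if (initDataType !== 'cenc' || initData === null || !wanted) {
    return { initDataType, initData }; // pass through unchanged
  }
  const bytes = new Uint8Array(initData);
  const view = new DataView(initData);
  let offset = 0;
  while (offset + 32 <= bytes.byteLength) {
    const size = view.getUint32(offset);
    if (size < 32 || offset + size > bytes.byteLength) break;
    let systemId = '';
    for (let i = offset + 12; i < offset + 28; i++) {
      systemId += (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16);
    }
    if (view.getUint32(offset + 4) === 0x70737368 && systemId === wanted) {
      // Hand the CDM only the box that belongs to this key system.
      return { initDataType, initData: initData.slice(offset, offset + size) };
    }
    offset += size;
  }
  return { initDataType, initData }; // no match: leave the payload alone
}
```

Based on the lookup shown in the snippet above (`this.config.drmSystems?.[context.keySystem]?.generateRequest`), a function like this would be supplied per key system in the `drmSystems` config.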

@robwalch
Collaborator

For ArrayBuffer/Uint8Array type issues please thumbs up or comment on #5849.

@cjpillsbury
Collaborator

will follow up in more detail but the tl;dr:

  1. our init segments contain two pssh boxes, one for playready and one for widevine
  2. in a "hack environment" I've set up to do DRM + segmented media testing, the MediaEncryptedEvent will signal an initDataType: 'cenc' with both pssh'es as the initData. Passing these along to the CDM via generateKey() seems to work fine without modification (though I definitely wouldn't claim that will work for every permutation). aka the CDM appears to "figure out" which pssh to use (assuming minimally via its system id) and just tosses the other
  3. I'm almost positive (but will confirm), we're hitting the hls.js generateRequest() config function from the EME signalling via the MediaEncryptedEvent.
  4. in hls.js, by the time generateRequest() is invoked, the initData is actually a pssh box inside another pssh box. This doesn't appear to reflect either EME/CDM behavior or what the actual box structure looks like in the ISO-BMFF init segment. This is why I was assuming it might actually be a breakdown in the mp4 -> mp4 demux + remux. I have yet to dig in on that front though, so this is very much just a tentative theory.

@robwalch
Collaborator

robwalch commented Jul 23, 2024

This is why I was assuming it might actually be a breakdown in the mp4 -> mp4 demux + remux

For pssh extraction (from playlist KEY data URI) see getDecryptData in level-key. For pssh parsing of init data in "encrypted" media events see onMediaEncrypted > parsePssh. I don't think in either case or in any part of hls.js is the mp4 pssh being modified (or read from fragment response data).

I'm almost positive (but will confirm), we're hitting the hls.js generateRequest() config function from the EME signalling via the MediaEncryptedEvent.

The [eme] log lines will contain messages from generateRequestWithPreferredKeySession indicating which path is being taken.

@cjpillsbury
Collaborator

perfect will pull these threads. Thanks, @robwalch!

@robwalch
Collaborator

robwalch commented Jul 23, 2024

Currently the PSSH parsing code that handles the initData payload is written to assume there is only one PSSH box being passed

The description of this issue points to code that only handles PSSH found in Widevine and PlayReady KEY tags (which should not include PSSH data from other key-systems).

Passing these along to the CDM via generateKey() seems to work fine without modification

That is the expected behavior.

@robwalch
Collaborator

robwalch commented Jul 23, 2024

I think what's going on (@jrivany and maybe @cjpillsbury although not sure about the nesting issue) is you want parsePssh to return only the PSSH for the selected key-system (or return a dictionary for selection once a system is selected).

Is this required on a specific UA or by your license server?

@cjpillsbury
Collaborator

@robwalch
We own all of the pssh + manifest generation, so we can make changes as appropriate. I am pretty confident we're spec-compliant with ISO/IEC 23001-7 PSSH box signaling (plus the underlying widevine + playready specs/expectations) and afaik there isn't any official formalization of EXT-X-KEY beyond clearkey + fairplay usage, though we'd happily conform to whatever the norm is for players/playback engines (assuming there is one), including hls.js. I'm going to be diving into where the breakdown occurs under the hood in hls.js.
Re: UA question -
With our current multi-drm setup (which includes EXT-X-KEYs for fairplay, widevine, and playready, pssh boxes for widevine and playready, and a sinf box that can be used for fairplay + MSE signaling), we're only seeing an issue on windows ("Edgeium" with widevine disabled for testing purposes). Also, just to reiterate, this worked cleanly using a bare bones impl directly integrating with MSE + EME and simply relying on MediaEncryptedEvent signaling for PlayReady.

@cjpillsbury
Collaborator

ah actually looks like this is, in fact, from the pssh generated by the EXT-X-KEY URI. We're b64 encoding the full pssh. and it looks like mp4pssh() in hls.js is assuming just the pssh data. If this is an industry standard, we should change our playlists. If this is a bit more "loosey goosey" (which was my understanding, though I could be mistaken), there's probably room for improvement here on the hls.js side, including:

  1. configurable processing of the EXT-X-KEY translation
  2. some dum dum checks on the URI value (including potentially checking if it's the pssh)
  3. better resilience on EME failures that are caused by (presumptuous) URI parsing, since the pssh signaling is more explicitly formalized.
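
The "pssh box inside another pssh box" symptom can be reproduced with a simplified builder. The sketch below is an assumption-laden stand-in for a version 0 PSSH builder (`buildPsshV0` and `isPsshBox` are hypothetical names and only mirror the general box layout, not the exact hls.js `mp4pssh` code): if the decoded URI value is already a complete box, wrapping it again nests it in the Data field.

```typescript
// Simplified version 0 PSSH builder, for illustration only.
function buildPsshV0(systemId: Uint8Array, data: Uint8Array): Uint8Array {
  if (systemId.byteLength !== 16) {
    throw new RangeError('Invalid system id');
  }
  const size = 32 + data.byteLength;
  const box = new Uint8Array(size);
  const view = new DataView(box.buffer);
  view.setUint32(0, size); // box size (big-endian)
  view.setUint32(4, 0x70737368); // 'pssh'
  // bytes 8..11 stay zero: version 0, flags 0
  box.set(systemId, 12);
  view.setUint32(28, data.byteLength); // DataSize
  box.set(data, 32);
  return box;
}

// True when the bytes are exactly one box whose type is 'pssh'.
function isPsshBox(bytes: Uint8Array): boolean {
  if (bytes.byteLength < 32) return false;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return (
    view.getUint32(0) === bytes.byteLength &&
    view.getUint32(4) === 0x70737368
  );
}

// Usage: wrapping an existing box produces a nested box.
const exampleSystemId = new Uint8Array(16).fill(0x9a); // placeholder ID
const inner = buildPsshV0(exampleSystemId, new Uint8Array([1, 2, 3]));
const outer = buildPsshV0(exampleSystemId, inner);
const nested = isPsshBox(outer.subarray(32)); // Data field is itself a pssh
```

A check like `isPsshBox` on the decoded URI value is one way the "is it already the pssh?" test from item 2 could look.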

@cjpillsbury
Collaborator

Would it help if hls.js included the "reason" for the key-session request in the generateRequest callback?

Yeah being able to effectively filter in/out based on where the key is sourced from, I think that would be a solid improvement (now that I've gotten my bearings on root cause in this case)

@cjpillsbury
Collaborator

fwiw just confirmed this minor change to mp4pssh would resolve our issue without changing our playlists:

  if (keyids) {
    version = 1;
    kids = new Uint8Array(keyids.length * 16);
    for (let ix = 0; ix < keyids.length; ix++) {
      const k = keyids[ix]; // uint8array
      if (k.byteLength !== 16) {
        throw new RangeError('Invalid key');
      }
      kids.set(k, ix * 16);
    }
  // Code changes start here. Including above for context
  } else {
    // Respect byteOffset in case `data` is a view into a larger buffer
    const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
    if (
      // Long enough to hold a pssh header and System ID
      data.byteLength >= 28 &&
      // Looks like the data is a pssh box
      view.getUint32(0) === data.byteLength && view.getUint32(4) === 0x70737368 &&
      // with a matching System ID
      data.subarray(12, 28).every((idByte, idx) => {
        return idByte === systemId[idx];
      })
    ) {
      // So just return the data as the pssh in this case
      return data;
    // Code changes end here (bracketing newly nested if)
    } else {
      version = 0;
      kids = new Uint8Array();
    }
  }

current code, for reference:
https://github.com/video-dev/hls.js/blob/master/src/utils/mp4-tools.ts#L1306-L1324

@robwalch
Collaborator

robwalch commented Jul 23, 2024

ah actually looks like this is, in fact, from the pssh generated by the EXT-X-KEY URI.

Right. So you should have a log message that looks like:
[eme]: – Generating key-session request for "playlist-key" ...

We're b64 encoding the full pssh.

Base64 encoding is expected for PlayReady. The encoded data is expected to be a PlayReady Object.

and it looks like mp4pssh() in hls.js is assuming just the pssh data.

There was a patch (#5699) that went in that perhaps belonged in getDecryptData (parse the PlayReady Object "Challenge" out to pass to mp4pssh), but instead went into eme-controller in unpackPlayReadyKeyMessage:

private unpackPlayReadyKeyMessage(
  xhr: XMLHttpRequest,
  licenseChallenge: Uint8Array,
): Uint8Array {
  // On Edge, the raw license message is UTF-16-encoded XML. We need
  // to unpack the Challenge element (base64-encoded string containing the
  // actual license request) and any HttpHeader elements (sent as request
  // headers).
  // For PlayReady CDMs, we need to dig the Challenge out of the XML.
  const xmlString = String.fromCharCode.apply(
    null,
    new Uint16Array(licenseChallenge.buffer),
  );
  if (!xmlString.includes('PlayReadyKeyMessage')) {
    // This does not appear to be a wrapped message as on Edge. Some
    // clients do not need this unwrapping, so we will assume this is one of
    // them. Note that "xml" at this point probably looks like random
    // garbage, since we interpreted UTF-8 as UTF-16.
    xhr.setRequestHeader('Content-Type', 'text/xml; charset=utf-8');
    return licenseChallenge;
  }
  const keyMessageXml = new DOMParser().parseFromString(
    xmlString,
    'application/xml',
  );
  // Set request headers.
  const headers = keyMessageXml.querySelectorAll('HttpHeader');
  if (headers.length > 0) {
    let header: Element;
    for (let i = 0, len = headers.length; i < len; i++) {
      header = headers[i];
      const name = header.querySelector('name')?.textContent;
      const value = header.querySelector('value')?.textContent;
      if (name && value) {
        xhr.setRequestHeader(name, value);
      }
    }
  }
  const challengeElement = keyMessageXml.querySelector('Challenge');

The "Challenge" (the pssh payload) must be parsed from the PlayReady Object to create a valid PSSH that will then be supplied as initData. It looks like (#5699) worked around this by detecting the xml content and parsing the challenge out just before sending the data to the license server. The correct place to do this would be in getDecryptData where the KEY tag data is parsed to create the Widevine/PlayReady PSSH with mp4pssh.

@cjpillsbury
Collaborator

We went ahead and updated our #EXT-X-KEY URI value to conform with the expectations in hls.js for PlayReady, but just to level set on the issue, I'm going to describe everything we had in our prior setup:

  1. Spec-compliant PSSH boxes (per ISO/IEC 23001-7) in our init segments for both Widevine and PlayReady:
[Image: Screenshot 2024-07-24 at 8 05 17 AM]

As expected, the PlayReady (identified via its SystemId field) PSSH contains a PlayReady Object (PRO) as its Data field, with an enclosed, UTF-16 little endian PlayReady Header v4.3.0.0 (PRH) as its only record value.
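
For readers unfamiliar with the PRO layout, here is a minimal parsing sketch. It assumes the commonly documented structure: a 32-bit little-endian total length, a 16-bit little-endian record count, then records of 16-bit type, 16-bit length and value, where record type 1 carries the UTF-16LE PlayReady Header XML. `parsePlayReadyObject`, `decodeUtf16le` and `playReadyHeaderXml` are hypothetical names, not hls.js functions.

```typescript
interface PlayReadyRecord {
  type: number; // 1 = PlayReady Header (PRH)
  value: Uint8Array;
}

function parsePlayReadyObject(pro: Uint8Array): PlayReadyRecord[] {
  const view = new DataView(pro.buffer, pro.byteOffset, pro.byteLength);
  if (pro.byteLength < 6 || view.getUint32(0, true) !== pro.byteLength) {
    throw new RangeError('Not a PlayReady Object: length field mismatch');
  }
  const count = view.getUint16(4, true);
  const records: PlayReadyRecord[] = [];
  let offset = 6;
  for (let i = 0; i < count; i++) {
    if (offset + 4 > pro.byteLength) {
      throw new RangeError('Truncated record header');
    }
    const type = view.getUint16(offset, true);
    const length = view.getUint16(offset + 2, true);
    offset += 4;
    if (offset + length > pro.byteLength) {
      throw new RangeError('Truncated record value');
    }
    records.push({ type, value: pro.subarray(offset, offset + length) });
    offset += length;
  }
  return records;
}

// Manual UTF-16LE decode so the sketch does not depend on TextDecoder.
function decodeUtf16le(bytes: Uint8Array): string {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let out = '';
  for (let i = 0; i + 1 < bytes.byteLength; i += 2) {
    out += String.fromCharCode(view.getUint16(i, true));
  }
  return out;
}

// Returns the PRH XML string, or null when no type 1 record is present.
function playReadyHeaderXml(pro: Uint8Array): string | null {
  const header = parsePlayReadyObject(pro).find((r) => r.type === 1);
  return header ? decodeUtf16le(header.value) : null;
}
```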

Widevine also conforms in relevant ways, but it did not have any issues, so I'll not dive in on that one.

  2. Base64-encoded versions of those exact PSSH boxes described above for the #EXT-X-KEY URI values for both Widevine and PlayReady:

(Real world example no longer available bc we updated our servers. Below is the updated implementation that is consistent with hls.js's assumptions)

# NOTE: THE WIDEVINE B64 VALUE IN THE URI IS AND ALWAYS HAS REPRESENTED THE ENTIRE PSSH IN OUR IMPL AND
# WE NEVER HAD PLAYBACK ISSUES WITH WIDEVINE IN HLS.JS
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="data:text/plain;base64,AAAAknBzc2gAAAAA7e+LqXnWSs6jyCfc1R0h7QAAAHISEJzpiZ1pGCYl7bOsZcDYXhciWGV5SmhjM05sZEVsa0lqb2lPVGc0TXpBME9EWXdNamczTURBNE56YzJJaXdpZG1GeWFXRnVkRWxrSWpvaU9UZzRNekEwT0RZd05ERXpNall6T0Rnd0luMD1I88aJmwY=",KEYID=0x9ce9899d69182625edb3ac65c0d85e17,KEYFORMAT="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed",KEYFORMATVERSION="1"

# NOTE: THE PLAYREADY B64 VALUE IN THE URI IS *NO LONGER* THE ENTIRE PSSH AND IS INSTEAD THE PSSH'S 
# DATA FIELD/THE PRH, BECAUSE THAT IS ASSUMED BY HLS.JS'S IMPL FOR PROCESSING ITS VALUE
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="data:text/plain;charset=UTF-16;base64,vgEAAAEAAQC0ATwAVwBSAE0ASABFAEEARABFAFIAIAB4AG0AbABuAHMAPQAiAGgAdAB0AHAAOgAvAC8AcwBjAGgAZQBtAGEAcwAuAG0AaQBjAHIAbwBzAG8AZgB0AC4AYwBvAG0ALwBEAFIATQAvADIAMAAwADcALwAwADMALwBQAGwAYQB5AFIAZQBhAGQAeQBIAGUAYQBkAGUAcgAiACAAdgBlAHIAcwBpAG8AbgA9ACIANAAuADMALgAwAC4AMAAiAD4APABEAEEAVABBAD4APABQAFIATwBUAEUAQwBUAEkATgBGAE8APgA8AEsASQBEAFMAPgA8AEsASQBEACAAQQBMAEcASQBEAD0AIgBBAEUAUwBDAEIAQwAiACAAVgBBAEwAVQBFAD0AIgBuAFkAbgBwAG4AQgBoAHAASgBTAGIAdABzADYAeABsAHcATgBoAGUARgB3AD0APQAiAD4APAAvAEsASQBEAD4APAAvAEsASQBEAFMAPgA8AC8AUABSAE8AVABFAEMAVABJAE4ARgBPAD4APAAvAEQAQQBUAEEAPgA8AC8AVwBSAE0ASABFAEEARABFAFIAPgA=",KEYFORMAT="com.microsoft.playready",KEYFORMATVERSION="1"

  3. Some comments in relation to the situation
  • What the #EXT-X-KEY's URI value is "supposed to be/represent" for PlayReady and Widevine is not formally defined in any official specification afaik.
  • The inconsistent behaviors/expectations between the Widevine and PlayReady URI values feels "off" insofar as it is "baked in" to hls.js
  • My rough and ready proposed changes are a backwards compatible way of saying "the URI value may be a B64 encoding of either the PSSH's Data field or the entire PSSH."
  • Making it easier in hls.js to sidestep assumptions on what/how the URI value represents information for DRM might be a good idea in general, in large part bc of the first bullet, above
  • As much as possible, making sure that hls.js doesn't error/fail if "no one has done anything wrong" (aka has implemented everything in a spec-compliant way) wrt DRM-relevant data/data structures feels like a value add. In this particular case, since our PSSH boxes were valid and confirmed to work E2E in a "simplified / hack environment" directly working with MSE + EME, the fact that hls.js errored without recovery based on the "out of spec" URI value was unfortunate.

Since we've changed our implementation server-side (and since we also identified a workaround client side via config), none of this is urgent for me/us. I'm just noting a few places that resulted in some pain on our side that could plausibly be improved by unwinding some in-code assumptions that should arguably be loosened (or at least conditionalized, per my quick/hack example code change ☝️).

@robwalch
Collaborator

What the #EXT-X-KEY's URI value is "supposed to be/represent" for PlayReady and Widevine is not formally defined in any official specification afaik.

It should be a PlayReady Object. The fact that getDecryptData passes the entire decoded data URI to mp4pssh as pssh data (making an incorrect assumption as you've pointed out) appears to be a bug in hls.js.

The proposed solution for the bug in hls.js is to extract the pssh data (the 'Challenge' element) from the PRO. This can be achieved by moving the code from unpackPlayReadyKeyMessage to getDecryptData, where it actually belongs.

I suggest using a generateRequest callback to extract the pssh before a license request is made, as in muxinc/elements#957, as a workaround until a fix is released.

How does that sound? Would you be willing to contribute a fix for this, or would you prefer I make a PR? If so, would you help test the fix? We would be careful to maintain compatibility with either type of data in the KEY tag based on whether or not the base64 decoded data includes PRO XML or the expected system ID bytes.
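
One possible shape for that compatibility check, sketched under assumptions: classify the base64-decoded KEY tag bytes as a full PSSH box carrying the PlayReady system ID, a bare PlayReady Object, or neither. `classifyPlayReadyKeyData` and `KeyUriPayloadKind` are hypothetical names, and the PRO test here only compares the little-endian length field and record count; it does not inspect the XML.

```typescript
type KeyUriPayloadKind = 'pssh-box' | 'playready-object' | 'unknown';

// PlayReady system ID 9a04f079-9840-4286-ab92-e65be0885f95
const PLAYREADY_SYSTEM_ID = [
  0x9a, 0x04, 0xf0, 0x79, 0x98, 0x40, 0x42, 0x86,
  0xab, 0x92, 0xe6, 0x5b, 0xe0, 0x88, 0x5f, 0x95,
];

function classifyPlayReadyKeyData(data: Uint8Array): KeyUriPayloadKind {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  // Full PSSH box: big-endian size, 'pssh' fourcc, system ID at offset 12.
  if (
    data.byteLength >= 32 &&
    view.getUint32(0) === data.byteLength &&
    view.getUint32(4) === 0x70737368 &&
    PLAYREADY_SYSTEM_ID.every((b, i) => data[12 + i] === b)
  ) {
    return 'pssh-box';
  }
  // PRO: little-endian length equal to the payload size, one or more
  // records, and room for at least one record header (6 + 4 bytes).
  if (
    data.byteLength >= 10 &&
    view.getUint32(0, true) === data.byteLength &&
    view.getUint16(4, true) > 0
  ) {
    return 'playready-object';
  }
  return 'unknown';
}
```

A caller could then use the bytes as-is for `'pssh-box'`, wrap them in a PSSH for `'playready-object'`, and fall back to current behavior (or log a warning) for `'unknown'`.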

@cjpillsbury
Collaborator

cjpillsbury commented Jul 24, 2024

I'm 💯 cool with working on a PR here, but I want to make sure there's alignment first.

It should be a PlayReady Object

I'm arguing that hls.js probably shouldn't assume (or at least should reduce the number of assumptions) that the URI value is anything in particular because it's not part of any specification. I don't think the current code is a bad assumption; I think it's an incomplete one. In other words, accounting for the fact that the value may be a b64 encoded PRO isn't bad, but also accounting for the fact that it may be a b64 encoded PSSH would be good. I've explicitly seen #EXT-X-KEYs with URIs generated from 3rd party DRM providers that exactly conform to your current implementation. The thing is, this isn't part of any actual specification (afaik), so swapping one assumption for another doesn't feel ideal to me either.

@robwalch
Collaborator

I think we're in alignment. What I am saying is HLS.js should expect a PRO but account for when it is not.

@robwalch
Collaborator

There are a couple of other points we should align on:

  1. The issue is in getDecryptData extracting (or identifying) pssh data in PlayReady KEY tags only
  2. mp4pssh does not require changes as the problem you identified is not related to embedding multiple PSSH boxes in KEY tags (this should have been filed as a new issue)

@cjpillsbury
Collaborator

Aligned on both. I'll open another issue. Sorry for misappropriating/coopting your issue, @jrivany !

@robwalch
Collaborator

robwalch commented Jul 24, 2024

No worries - It could be the same root cause with a similar conclusion. (Although I think there is a point to be made with this issue about how to handle and document initData from "encrypted" events vs KEY tags.)

If you need this as a patch please make changes against patch/v1.5.x. I'm happy to cut a patch and merge the fix into dev as a follow up.

@robwalch
Collaborator

Related, with workarounds suggested to filter session generation based on playlist or "encrypted" event keys: #6636

robwalch added the DRM label Aug 21, 2024
robwalch changed the title from 'EME Support multiple PSSH boxes in a single initData payload' to 'PlayReady EME "encrypted" PSSH initData payload ignored' Aug 21, 2024
@robwalch
Collaborator

Renamed this issue for comments above: #6005 (comment)

PSSH parsing of initData was part of the problem, but that parsing was only for Widevine so that hls.js could initialize a session on clear segments with pssh payloads prior to requesting segments with KEY URIs in the playlist (see shaka-packager "clear-lead" example). #6640 will fix PSSH parsing with multi-key-system assets, but will continue to ignore PlayReady keys in the media.


3 participants