RE: [EME] reuse of session

Thanks for your comments.
Please find replies inline.

From: David Dorwin [mailto:ddorwin@google.com]
Sent: Friday, April 25, 2014 4:58 AM
To: Maruyama, Shinya
Cc: public-html-media@w3.org
Subject: Re: [EME] reuse of session



On Wed, Apr 23, 2014 at 10:14 PM, Maruyama, Shinya <Shinya.Maruyama@jp.sony.com> wrote:
Regarding the reuse of a session by promise, would it be better for the CDM to determine whether to create a new media key session?

Where possible, it is better to keep logic in the user agent. This helps ensure more consistent behavior, enables reuse and OSS implementation, and is more consistent with other specs.

I understand that perspective.

According to step 6-1 of the createSession algorithm, the user agent determines this by checking the list of active sessions' Initialization Data.
Because the user agent is not aware of the DRM-specific part of initData (except for container-independent initData, Bug 25269 <https://www.w3.org/Bugs/Public/show_bug.cgi?id=25269>), it can compare the initData as a whole, but it cannot look into the content of the initData to see whether the requested keyIds are already available in active sessions.

The current text is intentionally comparing the raw initData and not the contents. In the case of CENC, this ensures the same number of sessions will be created regardless of which PSSH box (or key system) is used.
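To make the raw comparison concrete, here is a minimal sketch of what matching on the bytes of initData (rather than on its contents) could look like. The names `ActiveSession`, `sameBytes` and `findByRawInitData` are hypothetical and are not taken from the EME spec; this is an illustration under those assumptions, not the spec's algorithm.

```typescript
// Hypothetical sketch: none of these names come from the EME spec.
type ActiveSession = { id: number; initData: Uint8Array };

// Byte-for-byte equality; the contents (e.g. pssh boxes, KIDs) are never parsed.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Returns the active session whose raw initData matches, or undefined if none does.
function findByRawInitData(
  sessions: ActiveSession[],
  initData: Uint8Array
): ActiveSession | undefined {
  return sessions.find((s) => sameBytes(s.initData, initData));
}
```

With this kind of check, two initData blobs that reference the same keys but differ in any byte are treated as different.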

The current text is not intended to prevent all duplicate sessions - it's intended to prevent potentially hundreds of sessions from being created for the same initData during adaptive streaming, etc. and reduce the overall amount of work the application must do. Checking whether the CDM already has the appropriate keys would not address all such scenarios and could lead to inconsistent behavior. Specifically, there will often be two needkey events when playback starts as the audio and video streams are parsed. At that point, the CDM does not have any keys, and it can't know whether the first session that gets created will return a key for the second initData.

I understand the intention of the current text.
From my viewpoint, though, I believe the use case I described below is a likely one.

This means an application ends up creating unnecessary media key sessions, for instance as follows:

1. Prefetch license(s) through a media key session initiated with container-independent initData that includes a superset of the keyIds (e.g. the KIDs for the audio and video tracks) for a subsequent media presentation (at MPD fetch time).
2. When each Initialization Data contains a pssh, an unnecessary media key session is created, because the cenc-formatted initData is not identical to the initData used in step 1; i.e. the container-independent pssh, the pssh for audio and the pssh for video are all distinct.
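The two steps above can be modeled with a small sketch. It assumes a simplified user agent that creates a new session only when no active session has byte-identical initData; the function `countSessions` and the byte values standing in for the three initData blobs are made up for illustration.

```typescript
// Hypothetical model: a new session is created unless an active session
// already holds byte-identical initData. The contents are never inspected.
function countSessions(requests: Uint8Array[]): number {
  const active: Uint8Array[] = [];
  for (const initData of requests) {
    const duplicate = active.some(
      (a) => a.length === initData.length && a.every((b, i) => b === initData[i])
    );
    if (!duplicate) active.push(initData);
  }
  return active.length;
}

// Placeholder bytes only; real initData would be much larger.
const manifestInitData = new Uint8Array([0x01, 0xaa, 0xbb]); // container-independent, KIDs for audio and video
const audioPssh = new Uint8Array([0x02, 0xaa]); // pssh from the audio stream
const videoPssh = new Uint8Array([0x02, 0xbb]); // pssh from the video stream

// The repeated videoPssh is deduplicated, but the three distinct blobs each get a session.
const sessionCount = countSessions([manifestInitData, audioPssh, videoPssh, videoPssh]);
```

In this model `sessionCount` is 3, even though the first session may already hold the keys for both tracks.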

Do I understand correctly that #1 uses initData from a manifest while #2 uses data from needkey events? If so, what is the use case? I expect that such an application would be rare and would need to either ensure the initData is consistent or handle session creation differently.

Yes, #1 uses a manifest, e.g. an MPD for DASH.
The MPD may carry sufficient information to trigger a license acquisition that includes the keys necessary to decrypt all the subsequent media streams: audio, video, subtitles, etc. (unless key rotation happens).
Prefetching the license allows it to be acquired and processed before (or at the same time as) the Initialization/Media Segment becomes available. I think this kind of advance license request is a topic many people are concerned about.


Key rotation may be another use case. Reusing a media key session by just updating the available keys within the session might provide some benefit, but it is not possible because the pssh that requires the new key(s) is most probably distinct from the other active initData.

initData cannot be added to an existing session, so key rotation using the same session would need to be handled in some other way. Thus, step 6-1 does not affect this scenario.

OK. We probably need to discuss key rotation comprehensively instead of looking at the current text piece by piece.

Received on Friday, 25 April 2014 02:11:36 UTC