This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 24873 - Current isTypeSupported() definition does not provide sufficient information to applications
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Jerry Smith
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on: 24951
Blocks: 24673
Reported: 2014-03-01 01:48 UTC by David Dorwin
Modified: 2014-08-12 15:52 UTC

Description David Dorwin 2014-03-01 01:48:17 UTC
*** Summary ***
I propose changing the isTypeSupported() signature from:
  static boolean isTypeSupported(DOMString keySystem, optional DOMString contentType);
to:
  static IsTypeSupported isTypeSupported(DOMString keySystem, optional DOMString contentType, optional DOMString capability);

Note the change in the return value and the addition of an optional |capability| parameter.
(There is ongoing discussion of the format of |contentType|. This proposal may change as appropriate to accommodate the specification of Initialization Data format and codecs.)

This addresses at least three use cases:
1) Providing a clear indication of the capabilities without any assumptions or implied rules.
2) Applications can immediately provide the best supported streams without waiting for the license exchange.
3) Platforms that cannot provide an immediate definitive response can reply “maybe”, acknowledging the combination but telling the application to check the license request for a definitive answer on availability.

Without such changes, applications will likely need to rely on user agent string checks or provide a degraded experience.


*** Introduction ***
The purpose of isTypeSupported() is for applications to be able to select a media stream and key system combination that the client supports. The spec currently says, "The isTypeSupported(keySystem, contentType) method returns whether keySystem is supported with the specified container and codec contentType(s)" but does not define "supported".

In practice, selecting the right combination involves more than just checking the container or codecs. The current definition does not allow applications to accurately determine whether many common scenarios are supported on the wide range of platform/CDM implementations. Responses from the current isTypeSupported() method may be ambiguous and represent many possible capabilities. We should address the ambiguity to avoid compatibility issues (and the likelihood of applications using user agent string checks to work around them).

Content providers/applications will have varying robustness requirements that may be different for audio and video. Likewise, implementations will have varying robustness for each content type, ranging from simply supporting decryption of blocks in the specified container to secure pipelines, yet there is currently no way for applications to tell the difference.


*** Discussion ***
The one thing that all CDMs must explicitly support is the container [protection scheme] (because they must be able to handle the corresponding Initialization Data). When |contentType| is a supported container without codecs, it is clear that isTypeSupported() should return true iff |keySystem| supports parsing Initialization Data for the specified container.

In the case where a secure pipeline is required for audio and video, an application would ask MediaKeys.isTypeSupported("com.example.somesystem", "video/foo; codecs='baraudio,bazvideo'"). (Two separate queries could also be used.) However, there is no guarantee that a true response indicates the availability of a secure pipeline.

Likewise, it’s not clear what query to issue to get an affirmative response from the largest set of clients when, for example, a secure pipeline is not required, there are lesser requirements on the audio pipeline, or the audio is not encrypted. In other words, how can an application know whether a 'true' response indicates that a secure pipeline is available for both audio and video, a secure pipeline is available for video but not audio, or the UA simply knows how to decrypt blocks in the foo container and has baraudio and bazvideo decoders? Without this information, an application doesn’t actually know whether its requirements are supported by the client.


*** Proposal ***
The only unambiguous solution - one that would actually avoid user agent checks - seems to be explicitly querying the capabilities/robustness levels:
  static IsTypeSupported isTypeSupported(DOMString keySystem, optional DOMString contentType, optional DOMString capability);
where IsTypeSupported is an enum with the same set of values as CanPlayTypeEnum[1][2] and |capability| is a |keySystem|-specific string.

If |capability| is missing, the response would simply indicate whether the user agent is at all capable of using |keySystem| with |contentType|. While it doesn’t seem practical to standardize |capability| strings, it seems reasonable for applications to know the capabilities they require. Also, this part of the application already has key system-specific logic.

The synchronous and querying nature of isTypeSupported() means that some implementations may not be able to provide definitive responses or may only be able to guarantee a minimum capability. Therefore, it makes sense to provide a response of “maybe” as an indication that there is no definitive answer and the application should confirm the answer with the license server. An application can then look for a “probably” and try the “maybe” combination if there are no “probably” responses and/or once the capability is determined to be available (i.e. as indicated in the license request). Responses should also follow the rules regarding “maybe”/“probably” that canPlayType() specifies for codecs.
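
To illustrate the selection flow described above, here is a minimal TypeScript sketch. It is not part of the proposal text: SupportQuery is a hypothetical stand-in for the proposed three-argument isTypeSupported(), the capability strings are invented, and the user agent is mocked so that the example runs on its own.

```typescript
// Hypothetical sketch: none of these names come from the EME spec.
// The three values mirror CanPlayTypeEnum: "", "maybe", "probably".
type SupportLevel = "" | "maybe" | "probably";

// Stand-in for the proposed isTypeSupported(keySystem, contentType, capability).
type SupportQuery = (
  keySystem: string,
  contentType: string,
  capability: string
) => SupportLevel;

interface StreamOption {
  label: string;
  contentType: string;
  capability: string; // key-system-specific string, invented for this sketch
}

interface Selection {
  option: StreamOption;
  // True when the answer was only "maybe" and must be confirmed
  // through the license exchange.
  needsLicenseConfirmation: boolean;
}

// Prefer the first "probably"; fall back to the first "maybe" if there is none.
function selectStream(
  query: SupportQuery,
  keySystem: string,
  options: StreamOption[]
): Selection | null {
  let firstMaybe: StreamOption | null = null;
  for (const option of options) {
    const level = query(keySystem, option.contentType, option.capability);
    if (level === "probably") {
      return { option, needsLicenseConfirmation: false };
    }
    if (level === "maybe" && firstMaybe === null) {
      firstMaybe = option;
    }
  }
  return firstMaybe === null
    ? null
    : { option: firstMaybe, needsLicenseConfirmation: true };
}

// Mock user agent: unsure about the secure pipeline, sure about software decryption.
const mockQuery: SupportQuery = (_keySystem, _contentType, capability) => {
  if (capability === "secure-pipeline") return "maybe";
  if (capability === "software-decrypt") return "probably";
  return "";
};

const chosen = selectStream(mockQuery, "com.example.somesystem", [
  { label: "HD", contentType: 'video/foo; codecs="bazvideo"', capability: "secure-pipeline" },
  { label: "SD", contentType: 'video/foo; codecs="bazvideo"', capability: "software-decrypt" },
]);
console.log(chosen?.option.label, chosen?.needsLicenseConfirmation);
```

With this mock, the SD option is selected because it is the only “probably”. An application could still try the HD “maybe” option later, once the license request indicates the capability is available.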

** Different audio and video requirements **
When different capabilities are required for audio and video, the application would need to make two separate calls, one for each content type.

Another option would be to have separate |audioCapability| and |videoCapability| parameters.

Below are some drawbacks of separate calls. Separate parameters address all of them.
* Applications often:
  - Check audio and video support together.
  - Pass the full MIME type, including the codec(s). (Note: This may be obsoleted if we change |contentType|.)
* It is possible that querying audio and video separately could lead to false positives if the types and capabilities are supported by separate pipelines (and the UA can only use one at a time per media element). This mainly seems likely in the case of clear audio and encrypted video.
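
As a rough illustration of the separate-calls approach, the TypeScript sketch below combines one audio query and one video query into a single answer. The names (TrackQuery, queryAudioAndVideo) and the capability strings are hypothetical, and the user agent is mocked. Taking the weaker of the two answers is an assumption of this sketch, not something the proposal specifies.

```typescript
// Hypothetical sketch; the names below are not from the EME spec.
type SupportLevel = "" | "maybe" | "probably";

interface TrackQuery {
  contentType: string;
  capability: string; // key-system-specific; the values used here are invented
}

// Stand-in for the proposed isTypeSupported(keySystem, contentType, capability).
type SupportQuery = (
  keySystem: string,
  contentType: string,
  capability: string
) => SupportLevel;

const RANK: Record<SupportLevel, number> = { "": 0, maybe: 1, probably: 2 };

// The combined answer can be no stronger than the weaker individual answer.
function weakerOf(a: SupportLevel, b: SupportLevel): SupportLevel {
  return RANK[a] <= RANK[b] ? a : b;
}

// Two separate calls, one per content type, as described above.
function queryAudioAndVideo(
  query: SupportQuery,
  keySystem: string,
  audio: TrackQuery,
  video: TrackQuery
): SupportLevel {
  const audioLevel = query(keySystem, audio.contentType, audio.capability);
  const videoLevel = query(keySystem, video.contentType, video.capability);
  return weakerOf(audioLevel, videoLevel);
}

// Mock user agent: confident about audio, unsure about video.
const mockQuery: SupportQuery = (_keySystem, contentType, _capability) =>
  contentType.startsWith("audio/") ? "probably" : "maybe";

const combined = queryAudioAndVideo(
  mockQuery,
  "com.example.somesystem",
  { contentType: 'audio/foo; codecs="baraudio"', capability: "decrypt-only" },
  { contentType: 'video/foo; codecs="bazvideo"', capability: "secure-pipeline" }
);
console.log(combined); // "maybe"
```

Note that this combination cannot detect the false-positive case in the last bullet: if the two answers come from separate pipelines that cannot be used together, both calls may succeed even though the pair is unsupported. Only a single call carrying both capabilities lets the user agent evaluate them jointly.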


[1] http://www.w3.org/TR/2013/WD-html51-20130528/embedded-content-0.html#canplaytypeenum
[2] Unlike CanPlayTypeEnum, IsTypeSupported does not contain “Enum” per https://www.w3.org/Bugs/Public/show_bug.cgi?id=24190.
Comment 1 David Dorwin 2014-03-04 01:57:36 UTC
Capability strings may have other uses as well. See #1 in https://www.w3.org/Bugs/Public/show_bug.cgi?id=24025#c10.
Comment 2 David Dorwin 2014-03-04 17:06:40 UTC
For reference, bug 16611 was related to capability detection (using key system names). The decision was to not overload key system strings with capability detection and consider if we need a separate capability detection system.
Comment 3 David Dorwin 2014-03-05 23:48:53 UTC
If the proposal in bug 24951 is adopted, we may need to consider how to specify the Initialization Data format too.
Comment 4 Joe Steele 2014-03-17 21:05:13 UTC
I think some specific examples for _capabilities_ would help here.

Are you thinking of hardware related things like: 
  display:internal; maxResolution=1080P; audio=special_7.1

Or are you thinking of DRM/robustness related things like: 
  hardwareRootOfTrust=true; nonUserAccessibleBus=true; memberOfDomain:bar

Or would both of these examples fit?
Comment 5 Joe Steele 2014-03-17 21:06:50 UTC
BTW -- I think it is a good idea. Just trying to see what I would use it for and how standardized we can make it.
Comment 6 David Dorwin 2014-03-17 21:35:52 UTC
(In reply to Joe Steele from comment #4)
> I think some specific examples for _capabilities_ would help here.
> 
> Are you thinking of hardware related things like: 
>   display:internal; maxResolution=1080P; audio=special_7.1
> 
> Or are you thinking of DRM/robustness related things like: 
>   hardwareRootOfTrust=true; nonUserAccessibleBus=true; memberOfDomain:bar
> 
> Or would both of these examples fit?

I was thinking of the latter, though probably a set of functionality would be given a simple label. (Domains would not be included.)

The former (except maybe "display") seems more of a general media capability detection. If/when something like that is defined, it or isTypeSupported() could be extended to support EME or such detection, respectively.
Comment 7 Joe Steele 2014-03-17 22:38:23 UTC
Would it be fair to say that the primary motivator for this value is to drive the application's UI based on feedback from the CDM then? Presumably to avoid a round-trip to the license server to find out what the CDM could have told you already? (namely that this client does not support what you want)
Comment 8 David Dorwin 2014-03-17 23:42:34 UTC
(In reply to Joe Steele from comment #7)
> Would it be fair to say that the primary motivator for this value is to
> drive the application's UI based on feedback from the CDM then? Presumably to
> avoid a round-trip to the license server to find out what the CDM could have
> told you already? (namely that this client does not support what you want)

It is to drive some aspect of the application (i.e. UI or streams provided) based on the UA's knowledge of the CDM. Yes, this is to avoid round trips (and even the delay for asynchronously querying the CDM directly).
Comment 9 David Dorwin 2014-03-18 15:31:23 UTC
Waiting for bug 24951 to be addressed and the proposal updated.
Comment 10 David Dorwin 2014-04-08 19:24:40 UTC
With the introduction of initialization data type (bug 24951), we need to somehow include this in the isTypeSupported() call.

I don't think we can just rely on the MIME type (e.g. "video/mp4") because initialization data type allows multiple such types to be supported by a single container. There is also the prospect of container-independent types, such as proposed in bug 25269.

That seems to leave three options, all of which include an |initDataType| parameter:
1) static IsTypeSupported isTypeSupported(DOMString keySystem, *optional DOMString initDataType,* optional DOMString contentType, optional DOMString capability);
2) static IsTypeSupported isTypeSupported(DOMString keySystem, *optional DOMString initDataType, optional DOMString codecs,* optional DOMString capability);
3) static IsTypeSupported isTypeSupported(DOMString keySystem, *optional DOMString initDataType, optional DOMString typeParameters,* optional DOMString capability);

#1 is unchanged from the original proposal except for the addition of the |initDataType| parameter. The MIME type in |contentType| is used similarly to canPlayType() (does the user agent support this container?). |initDataType| is simply used to check whether |keySystem| can process initialization data in that format. In general, |initDataType| would be one supported by the MIME type container (or be container-independent). Codecs can continue to be specified in |contentType|.

#2 replaces the MIME type-containing |contentType| with a |codecs| string that only contains the value of the (optional) Codecs parameter from |contentType|.

#3 is a hybrid solution that replaces the |contentType| parameter with a string containing only the RFC 6381 parameters (Codecs, Profiles). It would not contain the type and subtype (container).
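
To make the difference between the three options concrete, the TypeScript sketch below splits a full MIME type string into the pieces each option would pass. The helper splitContentType and the sample MIME string are hypothetical illustrations, not part of any proposal, and the parsing is deliberately simplified: it assumes no semicolons appear inside quoted parameter values.

```typescript
// Hypothetical helper, for illustration only; simplified MIME parsing.
interface SplitContentType {
  container: string;      // type/subtype, e.g. "video/mp4"
  typeParameters: string; // all parameters after the container (option #3)
  codecs: string | null;  // value of the codecs parameter only (option #2)
}

function splitContentType(contentType: string): SplitContentType {
  const parts = contentType.split(";");
  const container = (parts[0] ?? "").trim().toLowerCase();
  const params = parts
    .slice(1)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  let codecs: string | null = null;
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) continue;
    const name = param.slice(0, eq).trim().toLowerCase();
    if (name === "codecs") {
      // Strip one pair of surrounding double quotes, if present.
      codecs = param.slice(eq + 1).trim().replace(/^"(.*)"$/, "$1");
    }
  }
  return { container, typeParameters: params.join("; "), codecs };
}

const full = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';
const split = splitContentType(full);

// What each option would receive in place of |contentType|:
console.log("option 1 contentType:   ", full);
console.log("option 2 codecs:        ", split.codecs);
console.log("option 3 typeParameters:", split.typeParameters);
```

Under option #1 the container ("video/mp4" here) is still visible to the user agent, while options #2 and #3 drop it and rely on |initDataType| alone.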


I lean towards #1 because:
* |contentType| is easily defined (RFCs 6838 and 6381)
* Applications are accustomed to passing the full MIME type, so retaining |contentType| may be simplest for applications.

#2 has some advantages:
* Allows us to eliminate "content type" completely.
* Avoids redundant information and the potential for conflict between |initDataType| and the MIME type.
But also disadvantages:
* It moves away from the standard MIME type supported by canPlayType() and MSE's isTypeSupported() and already defined by RFCs.
* Other parameters, such as Profiles, are not supported. (I'm not aware of existing support for these in browsers, but it does prevent future extension.)

#3 avoids the ambiguity, supports other parameters, and still somewhat relies on an RFC, but it is non-standard.
Comment 11 David Dorwin 2014-04-18 17:16:46 UTC
Actually, the container is not always redundant. If the initDataType is container-independent, such as proposed in bug 25269, then the container portion of contentType is relevant because the user agent (not the CDM) must support it.
Comment 12 David Dorwin 2014-04-25 00:45:45 UTC
I implemented option #1 above in https://dvcs.w3.org/hg/html-media/rev/5af732159d2e.
Comment 13 David Dorwin 2014-04-25 00:52:19 UTC
Jerry was looking into options for feature detection. I'm reopening this bug and assigning it to him to track that work.

In addition, we may want to rename isTypeSupported() since it is checking much more than type support and is very different from MSE's method of the same name.
Comment 14 Jerry Smith 2014-06-16 19:10:22 UTC
We would prefer to replace the DOMString capability with a dictionary that exposes specific capabilities with defined responses. The capabilities would need to confirm DRM-specific needs (Hardware or Software DRM, Hardware or Software Media Pipeline, and Link Protection). There is also a set of more general media capabilities. These may fit here as a convenience, though they might better belong on the MediaElement directly. This could include capabilities like display resolution or a multi-channel audio configuration.

The upside of using a dictionary is it would standardize the capabilities.  A downside is that it might confine future capabilities we don’t anticipate now, and also might include specific technologies in the allowed responses (e.g. HDCP 2.0).  
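
As a rough sketch of what a dictionary-based check could look like, the TypeScript below models the three DRM-specific capabilities named above. Every name and value here is hypothetical, since no dictionary was defined in this bug. The sketch also assumes that a hardware implementation satisfies a software requirement and that link protection is compared by exact string match.

```typescript
// Hypothetical shapes; no such dictionary is defined in this bug.
type Implementation = "software" | "hardware";

// What a client might report.
interface DrmCapabilities {
  drm: Implementation;
  mediaPipeline: Implementation;
  linkProtection: string | null; // e.g. "HDCP 2.0"; null when none is available
}

// What an application might require; omitted members mean "no requirement".
interface DrmRequirements {
  drm?: Implementation;
  mediaPipeline?: Implementation;
  linkProtection?: string;
}

// Assumption of this sketch: hardware also satisfies a software requirement.
function meets(required: Implementation | undefined, actual: Implementation): boolean {
  return required === undefined || required === actual || actual === "hardware";
}

function satisfies(required: DrmRequirements, reported: DrmCapabilities): boolean {
  const linkOk =
    required.linkProtection === undefined ||
    required.linkProtection === reported.linkProtection;
  return (
    meets(required.drm, reported.drm) &&
    meets(required.mediaPipeline, reported.mediaPipeline) &&
    linkOk
  );
}

const client: DrmCapabilities = {
  drm: "hardware",
  mediaPipeline: "software",
  linkProtection: "HDCP 2.0",
};

console.log(satisfies({ drm: "software" }, client));
console.log(satisfies({ mediaPipeline: "hardware" }, client));
```

The exact-match comparison on linkProtection shows the downside noted above: once specific technologies appear as allowed values, the dictionary has to be revised whenever a new one needs to be recognized.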

An alternative we may want to discuss is the concept of validating DRM license constraints in some way that doesn’t download content, but submits the license for pre-approval.  This might be done with a small amount of protected content included with the app, and has the advantage of being open ended, and definitive in that the license is processed by the CDM and passed or failed.
Comment 15 Jerry Smith 2014-06-26 20:00:21 UTC
Per the WG discussion on 6/17, I am resolving this bug as completed, and moving the discussion of system capabilities to a new bug:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26207
Comment 16 Jerry Smith 2014-08-12 15:50:10 UTC
Changing resolution to works for me.
Comment 17 Jerry Smith 2014-08-12 15:52:29 UTC
Reverting to resolved fixed. "Works for me" applies to the bug that was opened after this one was resolved fixed:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26207