This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Encrypted media should achieve the same level of accessibility as other media. This is partially a quality-of-implementation concern, but there may be ways the spec could normatively encourage it. Specific examples include ensuring the correct interaction with high-contrast mode and captioning systems.
(In reply to Sergey Konstantinov from comment #0) > ... but there may > be ways the spec could normatively encourage it. Specific examples include > ensuring the correct interaction with high-contrast mode and captioning > systems. Could you please give specific references to where you think the EME spec should be changed and if possible provide a concrete proposal for your recommended changes? /paulc HTML WG co-chair
Our intent is as follows:
- a number of technologies that make video accessible exist; some of them work, or could potentially work, on the Web;
- we are very concerned that EME (especially coupled with hardware decryption) automatically makes most of these technologies inapplicable;
- so we'd like to encourage the HTML WG to evaluate this question: find out what Web video accessibility technologies exist and how to make them work with EME.
For example, to be sure that the user will be able to increase the subtitle font size or redirect subtitles to a system text-to-speech service, the EME spec should require the CDM to have an API to transfer subtitles back to the user agent after decryption, and/or encourage developing an encrypted media transfer format with a flat unencrypted subtitle stream alongside the encrypted video stream.
(In reply to Sergey Konstantinov from comment #2) On behalf of the media accessibility sub-team of the HTML5-a11y Task Force: > > - a number of technologies which make video accessible exists; some of them > work or could potentially work in the Web; > - we are very concerned that EME (especially coupled with hardware > decryption) automatically makes most of these technologies inapplicable; The HTML5-a11yTF, and the accessible media sub-team, have been monitoring potential accessibility issues around the EME deployment. Specifically, this was discussed at the F-2-F meeting of the HTML WG on 2013/04/23 (http://www.w3.org/2013/04/23-html-wg-minutes.html#item11). Since EME is encrypting/decrypting the media stream content, a potential problem could arise when accessibility support content (captions, described audio, transcripts, etc.) is encrypted by a third party using a CDM different than the source video content. At the 2013 meeting however, this edge-case seemed unlikely, as content owners today are not requesting this level of encryption support: the media asset is encrypted but any out-of-band support materials are traditionally not. (Of course any encrypted in-band support materials would be decrypted at the same time as the media.) While this edge-case still exists today (i.e. can a browser decrypt two streams, reliant on two different CDMs, simultaneously and in sync?) it was felt that this rare scenario could/should be best addressed at the authoring guidance level (i.e. don't do it). > - so we'd like to encourage HTML WG to evaluate this question: find out what > Web video accessibility technologies exist and how to make them work with > EME. 
The accessible media sub-team has already created the Media Accessibility User Requirements (http://www.w3.org/WAI/PF/media-a11y-reqs/) which outlines what we believe is a complete list of requirements, both from a content creation perspective, as well as a playback/UI perspective, and when we looked at where EME might have an impact, we could not see any other specific areas of concern. We are however always open to more feedback and examples. > > For example, to be sure that user will be able to increase subtitles font > size or redirect them to a system text-to-speech service, EME spec should > require CDM to have an API to transfer subtitles back to user agent after > decryption While I will ask an engineer more involved with the technical aspects of CDMs to weigh in if I am incorrect, it is our understanding that the encryption is done at the media-wrapper layer, and once the wrapper is "unlocked" all of the piece-parts (including any in-band support content) would be exposed to the existing APIs in exactly the same fashion as unencrypted content. In the case of out-of-band content, again we have a low expectation of seeing encrypted support content (while still leaving open the possibility that it could happen), but once again, once the stream is unlocked, the content that comes forth should operate and interact with the UA in exactly the same way as non-encrypted content. There is no indication at this time that encrypted captions (for example) would result in decrypted VTT or TTML files that would lock-down text scaling, as the scalability of the text is controlled by the UA/UI, and not the encryption/decryption of the media stream. Likewise for high-contrast mode or TTS; it is our understanding that because EME only encrypts the content, and has no direct impact on the user-agent stack, once the content is unlocked all of the piece-parts are rendered to the DOM, where assistive technology then picks up the thread. 
> ...and/or encourage developing encrypted media transfer format with > flat unencrypted subtitles stream alongside encrypted video stream. We agree with this suggestion, and PFWG will ensure that this is taken back to the WCAG WG as authoring guidance that should be formalized (Action on JF). Based upon the feedback from representatives of Netflix, Comcast and others at the 2013 meeting however, this already appears to be the case today: content owners are not requiring encrypted support content at this time. Finally, it is worth noting that PFWG intends to continue to monitor this topic, and there are plans underway to begin testing using Assistive Technology and demo content encrypted using EME (This response closes PFWG Action-285)
> > While I will ask an engineer more involved with the technical aspects of > CDMs to weigh in if I am incorrect, it is our understanding that the > encryption is done at the media-wrapper layer, and once the wrapper is > "unlocked" all of the piece-parts (including any in-band support content) > would be exposed to the existing APIs in exactly the same fashion as > unencrypted content. s/wrapper/transport ...it is our understanding that the encryption is done at the media-transport layer, and once the transport is "unlocked" all of the piece-parts (including any inband support content) would be exposed to the existing APIs in exactly the same fashion as unencrypted content.
Thanks, John. I've added some clarification and context inline. (In reply to John Foliot from comment #3 with corrections from comment #4) > (In reply to Sergey Konstantinov from comment #2) > > On behalf of the media accessibility sub-team of the HTML5-a11y Task Force: > > > > > - a number of technologies which make video accessible exists; some of them > > work or could potentially work in the Web; > > - we are very concerned that EME (especially coupled with hardware > > decryption) automatically makes most of these technologies inapplicable; > > The HTML5-a11yTF, and the accessible media sub-team, have been monitoring > potential accessibility issues around the EME deployment. Specifically, this > was discussed at the F-2-F meeting of the HTML WG on 2013/04/23 > (http://www.w3.org/2013/04/23-html-wg-minutes.html#item11). > > Since EME is encrypting/decrypting the media stream content, a potential > problem could arise when accessibility support content (captions, described > audio, transcripts, etc.) is encrypted by a third party using a CDM > different than the source video content. At the 2013 meeting however, this > edge-case seemed unlikely, as content owners today are not requesting this > level of encryption support: the media asset is encrypted but any > out-of-band support materials are traditionally not. (Of course any > encrypted in-band support materials would be decrypted at the same time as > the media.) While this edge-case still exists today (i.e. can a browser > decrypt two streams, reliant on two different CDMs, simultaneously and in > sync?) it was felt that this rare scenario could/should be best addressed at > the authoring guidance level (i.e. don't do it). EME does not provide mechanisms for decrypting out-of-band support materials. Only in-band content passes into the media stack and CDM. In-band support materials _could_ be passed through the CDM. Ideally, these would then be returned to the user agent to be presented as usual. 
I think the concern is platform-based CDMs where this might not be possible. Also, some implementations might directly render captions, etc. along with the video. Perhaps we should normatively require that any in-band support materials (when supported) be returned to the user agent and/or discourage encrypting them. EME does not support multiple CDMs being used with the same media element. With the push for interoperability and DRM-independent content (i.e. bug 27093), this should not be necessary. <snip> > > For example, to be sure that user will be able to increase subtitles font > > size or redirect them to a system text-to-speech service, EME spec should > > require CDM to have an API to transfer subtitles back to user agent after > > decryption > > While I will ask an engineer more involved with the technical aspects of > CDMs to weigh in if I am incorrect, > ...it is our understanding that the encryption is done at the > media-transport layer, and once the transport is "unlocked" all of the > piece-parts (including any inband support content) would be exposed to the > existing APIs in exactly the same fashion as unencrypted content. > In the case of out-of-band content, again we have a low > expectation of seeing encrypted support content (while still leaving open > the possibility that it could happen), but once again, once the stream is > unlocked, the content that comes forth should operate and interact with the > UA in exactly the same way as non-encrypted content. EME assumes encryption is on blocks within the media container. That means the media data can be read (i.e. to generate "encrypted" events) and the various tracks can be processed independently. This is different, for example, than traditional MPEG2-TS encryption. As mentioned above, whether in-band encrypted support content is made available to the UA depends on the DRM implementation. 
> There is no indication at this time that encrypted captions (for example) > would result in decrypted VTT or TTML files that would lock-down text > scaling, as the scalability of the text is controlled by the UA/UI, and not > the encryption/decryption of the media stream. Likewise for high-contrast > mode or TTS; it is our understanding that because EME only encrypts the > content, and has no direct impact on the user-agent stack, once the content > is unlocked all of the piece-parts are rendered to the DOM, where assistive > technology then picks up the thread. I'm not aware of any current EME implementations that support decrypting in-band support content. (That's probably a good thing as it implicitly discourages encryption of such content.) If an implementation did, it should return it to the UA as it would unencrypted timed text tracks. The concern for high-contrast mode relates to implementations that do not rely on or allow the user agent to do the rendering. For example, protected audio/video pipelines may render the content directly, in which case the user agent stack is not involved and unable to apply a high-contrast filter. See https://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/encrypted-media.html#media-element-restictions for related restrictions issues. > > ...and/or encourage developing encrypted media transfer format with > > flat unencrypted subtitles stream alongside encrypted video stream. > > We agree with this suggestion, and PFWG will ensure that this is taken back > to the WCAG WG as authoring guidance that should be formalized (Action on > JF). Based upon the feedback from representatives of Netflix, Comcast and > others at the 2013 meeting however, this already appears to be the case > today: content owners are not requiring encrypted support content at this > time. 
> > Finally, it is worth noting that PFWG intends to continue to monitor this > topic, and there are plans underway to begin testing using Assistive > Technology and demo content encrypted using EME What are they intending to test? Since this is an implementation issue, you would need to test all implementations. Note that content is not "encrypted using EME". Content is encrypted per a container-specific common encryption specification. Such content can be used with EME as well as other platforms. > (This response closes PFWG Action-285)
(In reply to John Foliot from comment #3) > While I will ask an engineer more involved with the technical aspects of > CDMs to weigh in if I am incorrect, it is our understanding that the > encryption is done at the media-wrapper layer, and once the wrapper is > "unlocked" all of the piece-parts (including any in-band support content) > would be exposed to the existing APIs in exactly the same fashion as > unencrypted content. This is not a correct assumption in general. EME has been designed to allow arrangements where the DRM subsystem performs more steps than merely decryption in order to hide things from the browser engine. Note that I'm not claiming that anyone is actually planning to a) let caption tracks [that are actually separate tracks as opposed to being U.S. TV captioning-compatible data inside an MPEG-2 video track] be encrypted and b) not let the decrypted caption track flow into the browser engine. (In reply to David Dorwin from comment #5) > EME assumes encryption is on blocks within the media container. While this may be the assumption and what everyone implements in practice, is there actually something in the spec that prohibits the encryption from being a wrapper around the media container? (Not that it's relevant for this bug--just pointing out how many things EME doesn't actually specify, AFAICT.)
(In reply to Henri Sivonen from comment #6) > (In reply to John Foliot from comment #3) > > While I will ask an engineer more involved with the technical aspects of > > CDMs to weigh in if I am incorrect, it is our understanding that the > > encryption is done at the media-wrapper layer, and once the wrapper is > > "unlocked" all of the piece-parts (including any in-band support content) > > would be exposed to the existing APIs in exactly the same fashion as > > unencrypted content. > > This is not a correct assumption in general. EME has been designed to allow > arrangements where the DRM subsystem performs more steps than merely > decryption in order to hide things from the browser engine. > > Note that I'm not claiming that anyone is actually planning to a) let > caption tracks [that are actually separate tracks as opposed to being U.S. > TV captioning-compatible data inside an MPEG-2 video track] be encrypted and > b) not let the decrypted caption track flow into the browser engine. Would either of you like to propose text to address these concerns? Maybe we should file a separate sub-bug specifically for text track issues. > (In reply to David Dorwin from comment #5) > > EME assumes encryption is on blocks within the media container. > > While this may be the assumption and what every implements in practice, is > there actually something in the spec that prohibits the encryption being a > wrapper around the media container? (Not that it's relevant for this > bug--just pointing out how many things EME doesn't actually specify, AFAICT.) Good point. What restrictions do we need to place on the media? Maybe open a separate bug for this since it's not specifically an accessibility issue. Also, if you find other underspecification issues, please file bugs for those too. A number of assumptions were not clearly documented, and we've been trying to address those.
(In reply to David Dorwin from comment #7) > > > > Note that I'm not claiming that anyone is actually planning to a) let > > caption tracks [that are actually separate tracks as opposed to being U.S. > > TV captioning-compatible data inside an MPEG-2 video track] be encrypted and > > b) not let the decrypted caption track flow into the browser engine. > > Would either of you like to propose text to address these concerns? Maybe we > should file a separate sub-bug specifically for text track issues. It appears that advisory text against doing this will likely suffice - there does not appear to be any actual implementation of this today. Following my conversation with David Dorwin at TPAC, I will return to the accessibility Task Force, and we will supply draft author advisory text that can be added to the specification. David, if you want to file a bug, I can respond to that, or I can respond via this bug - your call. > > > (In reply to David Dorwin from comment #5) > > > EME assumes encryption is on blocks within the media container. > > > > While this may be the assumption and what every implements in practice, is > > there actually something in the spec that prohibits the encryption being a > > wrapper around the media container? (Not that it's relevant for this > > bug--just pointing out how many things EME doesn't actually specify, AFAICT.) > > Good point. What restrictions do we need to place on the media? Maybe open a > separate bug for this since it's not specifically an accessibility issue. > Also, if you find other underspecification issues, please file bugs for > those too. A number of assumptions were not clearly documented, and we've > been trying to address those. I believe this is out of scope for accessibility concerns. I will leave it to others to file a bug and respond.
(In reply to David Dorwin from comment #7) > Good point. What restrictions do we need to place on the media? As long as the spec allows, e.g., non-CENC encryption inside MP4, I think it's not useful to ban the theoretical possibility of instead putting the encryption outside MP4. In other words, pretending that restricting encryption to inside the container would in itself be an interop win, when there remain multiple ways to do encryption inside the container, seems illusory.
(In reply to Henri Sivonen from comment #9) > (In reply to David Dorwin from comment #7) > > Good point. What restrictions do we need to place on the media? > > As long as the spec e.g. allows non-CENC encryption inside MP4, I think it's > not useful to band the theoretical possibility of instead putting the > encryption outside MP4. In other words, pretending that restricting > encryption to inside the container would in itself be an interop win when > there remain multiple ways to do encryption inside the container seems > illusory. If I understand correctly, you are saying we should NOT restrict encryption to inside the container because the spec theoretically allows MP4 protection schemes other than CENC. (The same applies to the potential to support any other container.) Is that right? This is not about increasing interoperability - it is about stating assumptions about the media that the spec, as written, supports. The assumption that the container is not encrypted is currently fundamental to some of the spec algorithms. The most obvious case is initData extraction, which would not be possible if the container were encrypted (ignoring some type of nested containers). The processing of the media data, including the Encrypted Block Encountered algorithm, also makes this assumption.
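As an illustration of why initData extraction depends on a readable container, here is a minimal Python sketch. It is not taken from the spec or from any implementation. It walks the top-level boxes of an ISO BMFF (MP4) buffer and collects the `pssh` boxes found directly inside `moov`, which is roughly what a user agent needs to locate before it can hand initData to the application for this container type. The names `iter_boxes`, `find_pssh_boxes` and `make_box` are hypothetical, and real files may carry `pssh` boxes elsewhere (for example in `moof`), which this sketch ignores.

```python
import struct
from typing import Iterator, List, Optional, Tuple

# (box type, box start offset, payload start offset, box end offset)
BoxInfo = Tuple[str, int, int, int]


def iter_boxes(data: bytes, start: int = 0, end: Optional[int] = None) -> Iterator[BoxInfo]:
    """Yield every well-formed ISO BMFF box header found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                return
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            # Malformed, or the bytes are not a readable (unencrypted) container.
            return
        yield raw_type.decode("latin-1"), pos, pos + header, pos + size
        pos += size


def find_pssh_boxes(data: bytes) -> List[bytes]:
    """Return the raw bytes of each 'pssh' box that is a direct child of 'moov'."""
    found: List[bytes] = []
    for box_type, _start, payload_start, box_end in iter_boxes(data):
        if box_type != "moov":
            continue
        for child_type, child_start, _payload, child_end in iter_boxes(
            data, payload_start, box_end
        ):
            if child_type == "pssh":
                found.append(data[child_start:child_end])
    return found


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (helper for synthetic test data only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload
```

With a synthetic buffer built by `make_box`, `find_pssh_boxes` returns the embedded `pssh` box. If the same bytes are scrambled as a whole (standing in for encryption wrapped around the container), the box headers no longer parse and nothing is found, which is the situation the comment above describes.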
(In reply to David Dorwin from comment #10) > (In reply to Henri Sivonen from comment #9) > > (In reply to David Dorwin from comment #7) > > > Good point. What restrictions do we need to place on the media? > > > > As long as the spec e.g. allows non-CENC encryption inside MP4, I think it's > > not useful to band the theoretical possibility of instead putting the > > encryption outside MP4. In other words, pretending that restricting > > encryption to inside the container would in itself be an interop win when > > there remain multiple ways to do encryption inside the container seems > > illusory. > > If I understand correctly, you are saying we should NOT restrict encryption > to inside the container because the spec theoretically allows MP4 protection > schemes other than CENC. (The same applies to the potential to support any > other container.) Is that right? Yes, and by looking at WebKit source, this doesn't seem to be in the "theoretically" category. > This is not about increasing interoperability - it is about stating > assumptions about the media that the spec, as written, supports. The > assumption that the container is not encrypted is currently fundamental to > some of the spec algorithms. The most obvious case is initData extraction, > which would not be possible if the container is encrypted (ignoring some > type of nested containers). The processing of the media data, including the > Encrypted Block Encountered algorithm, also make this assumption. OK. In that case, it's worthwhile to state the assumption.
https://github.com/w3c/encrypted-media/commit/22f3514edfd78f31f78947c5a5cab3c2922e50e6 addresses some of the issues raised in this bug, including explicitly stating the assumption about the container being unencrypted and not encrypting support content.
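The guidance against encrypting support content could be checked at packaging time. The following Python sketch is only an assumption about how such a check might look: the `Track` record, the `SUPPORT_KINDS` set and the function `encrypted_support_tracks` are hypothetical names invented for this example, not part of EME or of any packaging tool.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for accessibility support tracks (modelled on text track kinds).
SUPPORT_KINDS = frozenset({"captions", "subtitles", "descriptions", "chapters"})


@dataclass(frozen=True)
class Track:
    track_id: int
    kind: str        # e.g. "video", "audio", "captions"
    encrypted: bool  # True if the track's samples are encrypted


def encrypted_support_tracks(tracks: List[Track]) -> List[Track]:
    """Return the support tracks that are encrypted, i.e. that go against the guidance."""
    return [t for t in tracks if t.kind in SUPPORT_KINDS and t.encrypted]
```

For a presentation with encrypted video and audio, clear captions and an encrypted subtitle track, only the subtitle track would be reported.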
John, please review the changes and let us know if there are any remaining accessibility concerns or issues to be addressed.
(In reply to David Dorwin from comment #13) > John, please review the changes and let us know if there are any remaining > accessibility concerns or issues to be addressed. This was brought to the HTML-a11y TF, and a review was done by both myself and Leonie Watson, with Leonie's response here: http://lists.w3.org/Archives/Public/public-html-a11y/2015Mar/0076.html At this time this appears to address the issues we raised regarding unencrypted in-band support.