Re: [Cloud Browser] Architecture updates/questions

> Such a function set could look like:
> Composition; execution; UI rendering; video rendering; encoding/decoding
> (hardware, software); scrambling/descrambling (hardware, software);
> networking; session mgmt.; user input processing; transcoding.

Hi Alexandra,

I'm struggling to understand where DOM representation and access fit into
the mix.

For example, you note UI rendering (but is that just the visual UI, or does
that encompass exposing the DOM structure as well?) and user input
processing (but does that take into account alternate input modes?
Alternate input devices?).

Browsers that are tied to hardware solutions also act as a conduit to the
Accessibility APIs of the various operating systems (see:
https://www.w3.org/TR/html-aam-1.0/), so that the abstraction of both input
and output can be communicated to Assistive Technologies as required. (This
is why ARIA is so important: it allows custom widgets to be made accessible
to AT by providing a means to express the role, state, and properties of
any widget on the page in an abstract way. For example, a volume control
slider set to 8 would expose a role of "slider", an accessible name of
"volume control", and a current value of 8. See here:
https://www.w3.org/WAI/GL/wiki/Using_WAI-ARIA_state_and_property_attributes_to_expose_the_state_of_a_user_interface_component
)
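
In markup, that slider might look something like this (a minimal sketch;
the label and value range are invented for illustration):

    <div role="slider"
         aria-label="volume control"
         aria-valuemin="0"
         aria-valuemax="10"
         aria-valuenow="8"
         tabindex="0"></div>

The role and accessible name tell AT what the widget is, and aria-valuenow
carries its current state, all independent of how the widget happens to be
painted on screen.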

I will suggest that one thing you will require is a method to connect
alternative input/output devices to some part of the client's hardware
stack, so that (for example) audio can be converted to braille output, or a
person with mobility issues can interact with the content using
non-traditional devices (http://www.cetfund.org/files/IMG_0439_JPG.jpg).
Oftentimes that will also involve the need for additional non-browser
software (a screen reader, for example), and it is unclear where that would
live in the ecosystem.

*****************

Also, I read this in the description example at [12]:

The Graphics Library renders and writes the data into the Buffer. However,
in the case of the Cloud Browser, the data is not sent to the display. The
data is encoded using a required video codec and sent as a video stream to
the Cloud Browser client over IP.

From an accessibility perspective, I think this is going to be your biggest
issue with a "cloud" browser, as it is significantly more than just sending
data as a video codec stream to a screen.
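
To make the gap concrete: alongside the encoded video, the client would
need some channel carrying the structure behind the pixels. The sketch
below is purely hypothetical (neither the channel, the endpoint, nor the
message format exists in any spec today); it only illustrates the kind of
information that would have to flow:

    // Hypothetical client-side sketch: receive accessibility-tree
    // updates pushed by the Cloud Browser server next to the video.
    const a11yChannel = new WebSocket('wss://cb-server.example/a11y');

    a11yChannel.onmessage = (event) => {
      // e.g. {"role": "slider", "name": "volume control", "valuenow": 8}
      const node = JSON.parse(event.data);
      // Bridge the node into the platform Accessibility API on the
      // client device (exposeToPlatformA11y is an invented helper).
      exposeToPlatformA11y(node);
    };

Without something of this shape, a screen reader on the client has nothing
but opaque video frames to work with.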

I have mentioned previously that this sounds very similar to the early-days
problems with the <canvas> element in HTML5, where again we had a method of
generating moving pixels on a screen, but no "bones" or structure behind
them for Assistive Technology to interact with. The "solution" was the
creation of a Shadow DOM (https://www.w3.org/TR/shadow-dom/), and I strongly
recommend this group review that document for a clearer explanation of the
problem and the proposed W3C recommendation to address it. I'm not sure if
this is 100% applicable (or achievable in the envisioned ecosystem), but I
will also suggest that this is critical for the cloud browser initiative
moving forward.
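
For reference, the pattern that emerged for <canvas> puts real DOM inside
the element as fallback content (a minimal sketch; the widget is invented
for illustration):

    <canvas id="volume" width="300" height="40">
      <!-- Fallback sub-DOM: not painted while canvas is supported,
           but exposed to Assistive Technology as real structure. -->
      <label for="volume-level">Volume</label>
      <input id="volume-level" type="range"
             min="0" max="10" value="8">
    </canvas>

The pixels the user sees are drawn by script; the structure AT interacts
with lives in the element's subtree. A Cloud Browser would need an
analogous structural backing for whatever it paints into its video stream.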

JF

On Fri, May 6, 2016 at 9:06 AM, Meerveld, Colin <C.Meerveld@activevideo.com>
wrote:

> Hi Alexandra,
>
> Thank you for this update, although I am a bit confused. It seems that you
> suggest that there are 4 independent architectures. I believe they are
> tied together. For example, the Cloud Player approach is a result of the
> Single Stream Approach. They are different perspectives on the same
> architecture (the first from the player and the latter from the transport).
> In addition, I am not sure that we have to define a flexible architecture.
> Instead of flexible, I would rather see a solution where an implementer
> could use a subset and/or constraints. For example, an RTE which only
> supports a single stream, or an RTE which only allows DRM-protected
> double-stream media.
>
> I do like an explanatory description per approach, but mainly as an
> example rather than the only solution. In your example you could argue
> whether it needs to be IP, or whether HTTP/RFB is needed.
>
> regards,
>
> Colin Meerveld
>
>
>
>
>
> On 25 Apr 2016, at 20:32, Alexandra.Mikityuk@telekom.de wrote:
>
> Dear all,
>
> Working on the action points [1] to [5] with regard to updates of the
> Cloud Browser architecture, I have a couple of questions for the group.
>
> 1.      On the one hand, it would be sufficient to define a flexible
> architecture, where the functions are not hard-coded but are rather
> assigned on the fly depending on the approach.
> However, as we have already defined 4 approaches and agreed that they
> would be enough for the moment to cover the Use Cases we have [6],
> my first question would be: would it make sense to define a simple
> function set and describe, for each of the approaches, which functions
> are supported by the server and which by the client?
> Such a function set could look like:
> Composition; execution; UI rendering; video rendering; encoding/decoding
> (hardware, software); scrambling/descrambling (hardware, software);
> networking; session mgmt.; user input processing; transcoding.
>
> By defining these functions for each approach, it would be clearer which
> of them build the Runtime Environment of the Cloud Browser Client.
> Some of them would be done only in hardware in the case of, e.g., the
> Single Stream Cloud Player.
>
> 2.      The second question is with regard to the description of these
> approaches. Would it make sense to describe each with the following
> sections:
>
> ·        Technical description
> ·        Considered devices
> ·        Operation scope (for which cases it is used)
>
> To make it clearer, I have made an exemplary description of the
> Single Stream Cloud Player approach, which you can find at the end of
> this e-mail in [12].
>
> 3.      Would it be better to include the Media Player in the Runtime
> Environment? This way it would be clearer that the Media Player is a
> part of the Runtime Environment of the client.
> I have also provided clearer architecture pictures, inspired by the
> Initial Concept architecture made by Colin. Please see [7] to [10]. Do
> they provide a better description, or are they too technical, so that it
> would be better to stay with the current ones available in [11]?
>
>
> [1] https://www.w3.org/2011/webtv/track/actions/215
> [2] https://www.w3.org/2011/webtv/track/actions/220
> [3] https://www.w3.org/2011/webtv/track/actions/221
> [4] https://www.w3.org/2011/webtv/track/actions/225
> [5] https://www.w3.org/2011/webtv/track/actions/227
> [6] https://www.w3.org/2011/webtv/wiki/Main_Page/Cloud_Browser_TF/UseCases
> [7] https://www.w3.org/2011/webtv/wiki/File:Ss-cp.png
> [8] https://www.w3.org/2011/webtv/wiki/File:Ss-lp.png
> [9] https://www.w3.org/2011/webtv/wiki/File:Ds-cp.png
> [10] https://www.w3.org/2011/webtv/wiki/File:Ds-lp.png
> [11]
> https://www.w3.org/2011/webtv/wiki/Main_Page/Cloud_Browser_TF/Architecture
>
> [12] Description example:
>
> *Technical Description:*
> After the input data is downloaded, the CSS and HTML are parsed and
> interpreted by the Cloud Browser.
> The JavaScript is processed and executed by the JavaScript engine of the
> browser.
> After the DOM and the Render Trees are built, the painting commands are
> sent to the Graphics Library over the Graphics Context.
> The video is downloaded from the Media Sources by the browser and is also
> sent to the Graphics Library of the CB server.
> The Graphics Library renders and writes the data into the Buffer.
> However, in the case of the Cloud Browser, the data is not sent to the
> display.
> The data is encoded using a required video codec and sent as a video
> stream to the Cloud Browser client over IP.
> For this delivery, either the HTTP or RFB application-layer protocol
> could be used, depending on the implementation.
> The CB client receives the input stream, which is directly decoded by the
> Decoder.
> The stream might also first be processed by the Descrambler, if the
> content is encrypted by the CB with an encryption scheme supported by the
> CB client.
>
> *Considered devices*: legacy devices, and low-end and low-power devices
> with hardware functions only.
>
> *Operation Scope:*
> In this approach, the CB takes over all CB client software functionality.
> The CB adapts the output data to the available hardware of the CB client,
> e.g. secure processor, video decoding, network.
> Therefore, the client functions are limited to the available hardware.
> The reduced Runtime Environment of the CB client is limited to supporting
> these legacy hardware capabilities, if required.
> It could contain specific video functions that cannot be reduced, such as
> quality probes for quality measurements.
> This approach could be used, e.g., to simplify client-side decryption:
> the source decryption is terminated in the Cloud, and the stream is
> re-encrypted before being sent down to the client.
> This scenario implies decryption capabilities in the Cloud Environment.
> The client doesn't need to support the source encryption.
>
>
>
> Kind regards / Best regards
> Alexandra Mikityuk
>
>
>
>
> *DEUTSCHE TELEKOM AG*
> T-Labs (Research & Innovation)
> Alexandra Mikityuk
> Winterfeldtstr. 21, 10781 Berlin
> +4930835358151 (Tel.)
> +4930835358408 (Fax)
> E-Mail: alexandra.mikityuk@telekom.de
> www.telekom.com
>
>
>
>


-- 
John Foliot
Principal Accessibility Consultant
Deque Systems Inc.
john.foliot@deque.com

Advancing the mission of digital accessibility and inclusion

Received on Friday, 6 May 2016 16:45:44 UTC