[whatwg] <video> and acceleration

[Speaking with my Apple hat on.]

We agree that upgrading the hardware is not an acceptable answer to  
this question. Many web-enabled devices have much less CPU power than  
the average laptop or desktop (think mobile devices), but many have  
quite capable GPUs. On desktop machines too, hardware acceleration of  
video playback is essential for optimum performance, and for  
acceptable CPU usage with large videos in non-trivial content (e.g.  
content with CSS effects on the video, or lots of expensive-to-render  
content under or over the video).

Ignoring rotations, overlays etc. in order to get reasonable  
performance is an option, but should be considered a bug in the  
implementation.

You could also imagine an attribute on the video element akin to the
"wmode" attribute for plug-ins, which allows the page author to opt in
to faster rendering, with the expectation that the visual rendering  
will suffer. However, we feel strongly that the video spec should not  
be encumbered by an attribute of this kind.

Taking the video full-screen is an approach that makes a lot of sense  
for mobile devices. It's unfortunate that the spec shies away from the  
full-screen issue.

In an ideal world, hardware acceleration of video playback would "just  
work", and the spec would have to say no more about it. We believe  
it's possible to do hardware acceleration of video (and other  
animating web content) and preserve rendering in many cases. WebKit  
actually has some experimental code for this.

I'm not sure how much an HTML spec will be able to say about triggers
that may cause the video rendering to fall off the hardware-accelerated
path, since many of those triggers will be CSS-related, and
implementation details will differ between browsers. From our
experience in WebKit, those triggers may include:
* clipping (via overflow on the video or an ancestor)
* overlapping elements
* blending (opacity)
* being affected by transforms (via SVG, or CSS transforms on video or  
an ancestor)
* CSS masks or reflections
* CSS box decorations (border, background)
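
As a rough illustration only (this is not WebKit's code), the list above
could be read as a conservative check like the Python sketch below. The
`VideoStyle` class, its field names, and both functions are hypothetical,
and a real engine may well keep some of these cases on the accelerated
path.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoStyle:
    """Hypothetical summary of the styling that affects one video element."""
    clipped_by_overflow: bool = False      # overflow on the video or an ancestor
    overlapped: bool = False               # other elements over or under the video
    opacity: float = 1.0                   # anything below 1.0 needs blending
    transformed: bool = False              # SVG or CSS transform on video or ancestor
    has_mask_or_reflection: bool = False   # CSS masks or reflections
    has_box_decorations: bool = False      # border or background on the video


def fallback_triggers(style: VideoStyle) -> List[str]:
    """Return the listed triggers that apply, in the order of the list above."""
    reasons = []
    if style.clipped_by_overflow:
        reasons.append("clipping")
    if style.overlapped:
        reasons.append("overlapping elements")
    if style.opacity < 1.0:
        reasons.append("blending")
    if style.transformed:
        reasons.append("transforms")
    if style.has_mask_or_reflection:
        reasons.append("masks or reflections")
    if style.has_box_decorations:
        reasons.append("box decorations")
    return reasons


def can_stay_accelerated(style: VideoStyle) -> bool:
    """Assumption: any single trigger forces the software path."""
    return not fallback_triggers(style)
```

A page author has no direct way to query any of this, which is part of
why a general note in the spec seems more realistic than a precise list.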

If the spec does say something about performance of <video>, I think  
it should be no more than a note that performance may differ across  
browsers, and can be affected in various ways that may be non-obvious  
to the page author, related to the layout and styling of the video and  
other elements on the page.

Simon

On Apr 28, 2009, at 6:07 PM, Ian Fette wrote:

> "Upgrade the hardware" is not an acceptable answer. Video acceleration
> is meant to offload work from the CPU (especially on constrained
> devices, e.g. mobile). You want to be able to do compositing on the
> video card, so that you don't have to read the video out of the
> video card's memory, transfer it over the bus to the CPU, do some
> transforms/overlays/..., and then send it back to the video card for
> display. Doing that absolutely kills framerate.
>
> As we (browsers) implement <video> I think a lot of us are starting  
> with software rendering. Certainly we want to be able to do hardware  
> acceleration at some point. Perhaps some things we will still be  
> able to do in hardware, e.g. overlays of HTML or certain transforms  
> (if the video device supports saying "take this, translate it, and  
> composite" and the rendering engine only needs geometry data.) Other  
> things we might not be able to do in hardware (e.g. if you have  
> "transparent" flash video on top, and Flash wants to know what  
> pixels are underneath it, then we would have to read that data off  
> of the video card, send it to CPU, ...)
>
> I think what would be helpful is for browsers that are implementing
> <video> with hardware acceleration to publish information on what
> would make them fall back to software rendering. If it turns out
> that list is roughly similar across implementations, perhaps it
> could be added as a note in the spec that doing certain things may
> have performance implications. We're probably not ready to do that
> yet given that we don't have enough implementation experience, but
> that would be my suggestion for how to move forward.
>
> -Ian
>
> On Tue, Apr 28, 2009 at 5:59 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Sat, 28 Mar 2009, Benjamin M. Schwartz wrote:
> >
> > The <video> tag has great potential to be useful on low-powered
> > computers and computing devices, where current internet video
> > streaming solutions (such as Adobe's Flash) are too computationally
> > expensive. My personal experience is with OLPC XO-1, on which Flash
> > (and Gnash) are terribly slow for any purpose, but Theora+Vorbis
> > playback is quite smooth at reasonable resolutions and bitrates.
> >
> > The <video> standard allows arbitrary manipulations of the video
> > stream within the HTML renderer.  To permit this, the initial
> > implementations (such as the one in Firefox 3.5) will perform all
> > video decoding operations on the CPU, including the tremendously
> > expensive YUV->RGB conversion and scaling.  This is viable only for
> > moderate resolutions and extremely fast processors.
> >
> > Recognizing this, the Firefox developers expect that the decoding
> > process will eventually be accelerated.  However, an accelerated
> > implementation of the <video> spec inevitably requires a 3D GPU, in
> > order to permit transparent video, blended overlays, and arbitrary
> > rotations.
> >
> > Pure software playback of video looks like a slideshow on the XO, or
> > any device with similar CPU power, achieving 1 or 2 fps.  However,
> > these devices typically have a 2D graphics chip that provides "video
> > overlay" acceleration: 1-bit alpha, YUV->RGB, and simple scaling, all
> > in special-purpose hardware. Using the overlay (via XVideo on Linux)
> > allows smooth, full-speed playback.
> >
> > THE QUESTION:
> > What is the recommended way to handle the <video> tag on such
> > hardware?
>
> Upgrade the hardware.
>
>
> > There are two obvious solutions:
> > 0. Implement the spec, and just let it be really slow.
> > 1. Attempt to approximate the correct behavior, given the
> > limitations of the hardware.  Make the video appear where it's
> > supposed to appear, and use the 1-bit alpha (dithered?) to blend
> > static items over it.  Ignore transparency of the video.  Ignore
> > rotations, etc.
> > 2. Ignore the HTML context.  Show the video "in manners more
> > suitable to the user (e.g. full-screen or in an independent
> > resizable window)".
> >
> > Which is preferable?  Is it worth specifying a preferred behavior?
>
> From HTML's point of view, all are acceptable. From the user's point
> of view, 1 and 2 are preferable, probably at the user's option.
>
> I don't know what else to tell you. :-)
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>

Received on Wednesday, 29 April 2009 12:11:20 UTC