Bug 15605 - Define how exactly to apply perspective (w-parameter)
Status: REOPENED
Product: CSS
Classification: Unclassified
Component: Transforms
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assigned To: Dean Jackson
Depends on: 15958
Reported: 2012-01-18 15:46 UTC by Aryeh Gregor
Modified: 2012-10-18 22:14 UTC (History)
Description Aryeh Gregor 2012-01-18 15:46:09 UTC
The 2D transform specs reference SVG, which is pretty clear, but it doesn't handle 3D transforms.  Thus nothing defines how exactly you compute or apply the current transformation matrix.  In particular, there's no explanation at all of how a value other than 1 in the fourth coordinate works -- I asked in #whatwg and got linked to a Wikipedia article that gives the general idea, but there should really be details in the spec.

I suggest the few paragraphs about transformation matrices be copied out of the SVG spec and adapted to 3D.  Something like

"""
Mathematically, all transformations can be represented as 4x4 transformation matrices:

 [m11 m12 m13 m14]
 [m21 m22 m23 m24]
 [m31 m32 m33 m34]
 [m41 m42 m43 m44]

This is also expressed as a sixteen-element vector in column-major order: [m11 m21 m31 m41 m12 m22 m32 m42 m13 m23 m33 m43 m14 m24 m34 m44].

Transformations map coordinates and lengths from a new coordinate system into a previous coordinate system as follows.  First, temporary values x, y, z, and w are computed by matrix multiplication:

 [x]   [m11 m12 m13 m14]   [x_newCoordSys]
 [y] = [m21 m22 m23 m24] . [y_newCoordSys]
 [z]   [m31 m32 m33 m34]   [z_newCoordSys]
 [w]   [m41 m42 m43 m44]   [1            ]

Then the old coordinates are obtained by dividing all the coordinates by w (this allows perspective transformations):

 [x_prevCoordSys]   [x/w]
 [y_prevCoordSys] = [y/w]
 [z_prevCoordSys]   [z/w]
 [1             ]   [1  ]

If w is zero, ???
"""

This text is probably not very good, since my background isn't in graphics.  But we do need something here.  And it should say what to do if w is zero, because I actually don't know what's supposed to happen if part of a box gets mapped there -- do those parts become invisible?  Nobody's going to understand the spec details without knowing basic linear algebra, but it would be nice if knowledge of 3D graphics weren't an absolute prerequisite.  Or at least link to some Wikipedia articles as informative references.
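
A minimal sketch of the two steps in the proposed text above (multiply by the 4x4 matrix, then divide by w), written in Python. The helper names `perspective`, `apply_matrix`, and `perspective_divide` are made up for illustration. The matrix is stored as a list of rows, so `m[3][2]` is the entry the proposed text calls m43. Returning None for w = 0 is only a placeholder, since the text leaves that case open.

```python
def perspective(d):
    """Matrix for perspective(d): identity except for -1/d in row 4, column 3."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, -1.0 / d, 1.0],
    ]


def apply_matrix(m, point):
    """Multiply the 4x4 matrix m by the column vector (x, y, z, 1)."""
    x, y, z = point
    v = (x, y, z, 1.0)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def perspective_divide(h):
    """Map (x, y, z, w) to (x/w, y/w, z/w); None when w is zero."""
    x, y, z, w = h
    if w == 0:
        return None  # placeholder: the proposed text does not define this case
    return (x / w, y / w, z / w)


# A point 50px toward the viewer, seen with perspective(100px).
h = apply_matrix(perspective(100.0), (10.0, 20.0, 50.0))
p = perspective_divide(h)
```

Here the point gets w = 0.5, so the division doubles its coordinates.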
Comment 1 Aryeh Gregor 2012-01-24 16:45:33 UTC
It should also say exactly where the perspective and perspective-origin properties fit into it.
Comment 2 Aryeh Gregor 2012-01-26 19:40:36 UTC
Behavior of 'perspective' is quite different for Gecko and WebKit.  Consider this simple test-case:

data:text/html,<!doctype html>
<div style="height:100px;width:100px;
perspective:100px">
<div>
<div style="height:100px;background:blue;
transform:rotateX(45deg)">
</div></div></div>

The 'perspective' property has no effect here in Firefox 12.0a1, but does in Chrome 17 dev.  It takes effect in Firefox too if you remove the intervening div, or add style="perspective:100px" or style="perspective:inherit" to it.  In Chrome 17 dev, it does work as-is.  Adding style="perspective:100px" or style="perspective:inherit" or style="perspective:none" (!) to the intervening div has no effect.  Adding style="perspective:200px" or something does have an effect.

*But*, in WebKit, the values of getBoundingClientRect() match Gecko.  So go figure.

It seems like Gecko uses the perspective property of the parent element only, while WebKit uses the perspective property of the nearest ancestor whose perspective is not "none".  Both behaviors are weird, IMO, but Gecko's more closely matches the spec, and matches WebKit's getBoundingClientRect().


So I think the transformation matrix for an element needs to be the product of the following matrices in order:

1) translate(-top left corner of parent's border box)

2) transformation matrix of parent

3) translate(top left corner of parent's border box)

4) translate(-top left corner of element's border box)

5) translate('transform-origin' of element)

6) 'transform' of element

7) translate(-'transform-origin' of element)

8) translate(top left corner of element's border box)

Then the actual transform applied to the element needs to be produced by the product of these matrices:

1) translate(-top left corner of parent's border box)

2) translate('perspective-origin' of parent)

3) perspective('perspective' of parent)

4) translate(-'perspective-origin' of parent)

5) translate(top left corner of parent's border box)

6) transformation matrix of element

(This is relative to an axis-aligned coordinate system with the origin anywhere you like, e.g., at the top left of the viewport.)

This neglects the effect of transform-style, which I haven't yet tested and which isn't defined in any way that's comprehensible to me.
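
A rough sketch of what "the product of the following matrices in order" means in practice, in Python. It only illustrates steps 5 to 7 of the first list (translate to the origin, apply the transform, translate back). The helpers `matmul`, `translate`, `scale`, and `compose` are hypothetical names, and `scale(2)` with a transform-origin of (50, 50) is an assumed stand-in for an element's 'transform' and 'transform-origin'.

```python
from functools import reduce


def matmul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def translate(tx, ty, tz=0.0):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def scale(s):
    """Uniform scale in x and y (an assumed example transform)."""
    return [[s, 0.0, 0.0, 0.0],
            [0.0, s, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def compose(matrices):
    """Multiply the matrices left to right, in the order listed."""
    return reduce(matmul, matrices)


def apply(m, x, y):
    """Apply m to the 2D point (x, y), extended to (x, y, 0, 1)."""
    v = (x, y, 0.0, 1.0)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


# Steps 5-7: translate(origin) * transform * translate(-origin).
m = compose([translate(50.0, 50.0), scale(2.0), translate(-50.0, -50.0)])
fixed = apply(m, 50.0, 50.0)   # the transform-origin itself
corner = apply(m, 0.0, 0.0)    # a corner 50px up and left of the origin
```

The point at the transform-origin stays where it is, while the corner moves twice as far from it, which is the purpose of sandwiching the transform between the two translations.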
Comment 3 Simon Fraser 2012-01-31 21:34:11 UTC
The new spec wording should address some of this.
Comment 4 Aryeh Gregor 2012-02-01 16:53:09 UTC
Thanks!  The new wording certainly looks better.  I'll inspect it closely over the coming days, particularly the parts about the 3D rendering model.
Comment 5 Dirk Schulze 2012-02-12 05:35:09 UTC
I hope that it will become clearer with the new math section I am working on.
Comment 6 Aryeh Gregor 2012-02-13 21:42:50 UTC
The new spec looks much clearer -- thanks to both of you!  But this is still true:

(In reply to comment #0)
> In particular, there's no explanation at
> all of how a value other than 1 in the fourth coordinate works -- I asked in
> #whatwg and got linked to a Wikipedia article that gives the general idea, but
> there should really be details in the spec.

Test-case that demonstrates this ambiguity:

data:text/html,<!doctype html>
<div style="perspective:35px;width:100px">
<div style="background:blue;height:50px;
transform:rotatey(-45deg)">

Firefox nightly and Chrome 19 dev render nothing.  IE10 Developer Preview renders the box, with extreme perspective applied, as one might expect.  The spec doesn't make it clear what to do here.  The issue is that some of the points in the box have their fourth coordinate equal to 0, so it should really stretch out to infinity somehow.
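
To make the claim about the fourth coordinate concrete, here is a small Python sketch that transforms the left and right edges of the box in this test-case. It assumes the default 'perspective-origin' and 'transform-origin' (both at the center of the 100px-wide box, so coordinates are taken relative to that center) and the usual perspective and rotateY matrices. The helper names are invented for the sketch.

```python
import math


def perspective(d):
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, -1.0 / d, 1.0]]


def rotate_y(deg):
    c = math.cos(math.radians(deg))
    s = math.sin(math.radians(deg))
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transform(m, a, b):
    """Apply m to (a, b, 0, 1) and return (x, y, z, w)."""
    v = (a, b, 0.0, 1.0)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


# perspective:35px on the parent, rotatey(-45deg) on the child.
m = matmul(perspective(35.0), rotate_y(-45.0))
left = transform(m, -50.0, 0.0)    # middle of the left edge
right = transform(m, 50.0, 0.0)    # middle of the right edge

# Distance from the center at which w crosses zero.
x_zero = 35.0 / math.sin(math.radians(45.0))
```

Under these assumptions the left edge has w of about 2.01 and the right edge about -0.01, with w reaching zero roughly 49.5px right of the center, so only a thin strip at the right lies at or behind the viewer.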
Comment 7 Aryeh Gregor 2012-02-13 21:46:56 UTC
Also, how about:

data:text/html,<!doctype html>
<div style="perspective:35px;width:100px">
<div style="background:blue;height:50px;border-left:10px solid yellow;
transform:translateZ(40px)">

Mathematically, according to current spec text, the box is made larger and flipped (like scale(-1)).  This is what Firefox does.  But IE and Chrome both cause points to vanish if their fourth coordinate is negative.  This seems more desirable.
Comment 8 Aryeh Gregor 2012-02-13 21:52:04 UTC
I filed a Gecko bug on comment 6 and comment 7: https://bugzilla.mozilla.org/show_bug.cgi?id=726766

I suggest that

1) Any points with fourth coordinate negative are not drawn.

2) If any point has fourth coordinate zero, draw a ray toward infinity in that direction.  Where to start the ray, and how big to consider the box, I'm not sure offhand.  This needs to be done carefully to avoid discontinuities at zero.
Comment 9 Aryeh Gregor 2012-02-22 21:20:28 UTC
Here's some proposed spec text.  The normative part isn't too long, but I added detailed examples that take up a lot of space.  I can provide pictures for the examples if desired.  Are people okay with this?  It seems to match what IE10 does; Gecko and WebKit just refuse to render things or do odd stuff with them when w < 0.

"""
The perspective matrix and transformation matrix are both 4x4 matrices, while the objects to be transformed are two-dimensional boxes.  To transform each corner (a, b) of a box, the matrix must first be applied to (a, b, 0, 1), which will result in a four-dimensional point (x, y, z, w).  This is transformed back to a three-dimensional point (x', y', z') as follows:

* If w > 0, (x', y', z') = (x/w, y/w, z/w).
* If w = 0, (x', y', z') = (x*n, y*n, z*n).  n is an implementation-dependent value that should be chosen so that x' or y' is much larger than the viewport size, if possible.  For example, (5px, 22px, 0px, 0) might become (5000px, 22000px, 0px), with n = 1000, but this value of n would be too small for (0.1px, 0.05px, 0px, 0).  This specification does not define the value of n exactly.  Conceptually, (x', y', z') is infinitely far in the direction (x, y, z).

If w < 0 for all four corners of the transformed box, the box is not rendered.  If w < 0 for one to three corners of the transformed box, it must be replaced by a polygon that has any parts with w < 0 cut out.  This will in general be a polygon with three to five vertices, of which exactly two will have w = 0 and the rest w > 0.  These vertices are then transformed to three-dimensional points using the rules just stated.  Conceptually, a point with w < 0 is "behind" the viewer, so should not be visible.

--------
  Example:

  <style>
  .transformed {
    height: 100px;
    width: 100px;
    background: lime;
    transform: perspective(50px) translateZ(100px);
  }
  </style>

  All of the box's corners have Z-coordinates greater than the perspective.  This means that the box is behind the viewer and will not display.  Mathematically, the point (x, y) first becomes (x, y, 0, 1), then is translated to (x, y, 100, 1), and then applying the perspective results in (x, y, 100, -1).  The w-coordinate is negative, so it does not display.  An implementation that doesn't handle the w < 0 case separately might incorrectly display this point as (-x, -y, -100), dividing by -1 and mirroring the box.
--------

--------
  Example:

  <style>
  .transformed {
    height: 100px;
    width: 100px;
    background: radial-gradient(yellow, blue);
    transform: perspective(50px) translateZ(50px);
  }
  </style>

  Here, the box is translated upward so that it sits at the same place the viewer is looking from.  This is like bringing the box closer and closer to one's eye until it fills the entire field of vision.  Since the default transform-origin is at the center of the box, which is yellow, the screen will be filled with yellow.

  Mathematically, the point (x, y) first becomes (x, y, 0, 1), then is translated to (x, y, 50, 1), then becomes (x, y, 50, 0) after applying perspective.  Relative to the transform-origin at the center, the upper-left corner was (-50, -50), so it becomes (-50, -50, 50, 0).  This is transformed to something very far to the upper left, such as (-5000, -5000, 5000).  Likewise the other corners are sent very far away.  The radial gradient is stretched over the whole box, now enormous, so the part that's visible without scrolling should be the color of the middle pixel: yellow.  However, since the box is not actually infinite, the user can still scroll to the edges to see the blue parts.

  (TODO: Maybe we should specify that the whole thing is yellow here somehow?  Doesn't seem worth it.)
--------

--------
  Example: 

  <style>
  .transformed {
    height: 50px;
    width: 50px;
    background: lime;
    border: 25px solid blue;
    transform-origin: left;
    transform: perspective(50px) rotateY(-45deg);
  }
  </style>

  The box will be rotated toward the viewer, with the left edge staying fixed while the right edge swings closer.  The right edge will be at about z = 70.7px, which is closer than the perspective of 50px.  Therefore, the rightmost edge will vanish ("behind" the viewer), and the visible part will stretch out infinitely far to the right.

  Mathematically, the top right vertex of the box was originally (100, -50), relative to the transform-origin.  It is first expanded to (100, -50, 0, 1).  After applying the transform specified, this will get mapped to about (70.71, -50, 70.71, -0.4142).  This has w = -0.4142 < 0, so we need to slice away the part of the box with w < 0.  This results in the new top-right vertex being (50, -50, 50, 0).  This is then mapped to some faraway point in the same direction, such as (5000, -5000, 5000), which is up and to the right from the transform-origin.  Something similar is done to the lower right corner, which gets mapped far down and to the right.  The resulting box stretches far past the edge of the screen.

  Again, the rendered box is still finite, so the user can scroll to see the whole thing if he or she chooses.  However, the right part has been chopped off.  No matter how far the user scrolls, the rightmost 30px or so of the original box will not be visible.  The blue border was only 25px wide, so it will be visible on the left, top, and bottom, but not the right.

  The same basic procedure would apply if one or three vertices had w < 0.  However, in that case the result of truncating the w < 0 part would be a triangle or pentagon instead of a quadrilateral.
--------
"""
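
The "cut out the parts with w < 0" step in the proposed text can be sketched as polygon clipping against the plane w = 0 in homogeneous coordinates. The Python below uses a single-plane variant of Sutherland-Hodgman clipping, which is my choice of technique and not something the proposal names. `clip_w_nonnegative` and `to_3d` are hypothetical helpers, and n = 100 is an arbitrary value for the implementation-dependent factor. It reproduces the numbers from the third example.

```python
import math


def perspective_rotate_y(d, deg):
    """Rows of perspective(d) * rotateY(deg), written out directly."""
    c = math.cos(math.radians(deg))
    s = math.sin(math.radians(deg))
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [s / d, 0.0, -c / d, 1.0]]


def transform(m, a, b):
    v = (a, b, 0.0, 1.0)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def clip_w_nonnegative(poly):
    """Keep the part of a polygon of (x, y, z, w) vertices with w >= 0."""
    out = []
    for i, cur in enumerate(poly):
        nxt = poly[(i + 1) % len(poly)]
        if cur[3] >= 0:
            out.append(cur)
        # Edge strictly crosses w = 0: add the crossing point.
        if cur[3] * nxt[3] < 0:
            t = cur[3] / (cur[3] - nxt[3])
            x, y, z = (c + t * (n - c) for c, n in zip(cur[:3], nxt[:3]))
            out.append((x, y, z, 0.0))
    return out


def to_3d(h, n=100.0):
    """w > 0: divide by w.  w = 0: push far away in direction (x, y, z)."""
    x, y, z, w = h
    if w > 0:
        return (x / w, y / w, z / w)
    return (x * n, y * n, z * n)


# Border box corners relative to transform-origin: left.
m = perspective_rotate_y(50.0, -45.0)
corners = [(0.0, -50.0), (100.0, -50.0), (100.0, 50.0), (0.0, 50.0)]
clipped = clip_w_nonnegative([transform(m, a, b) for a, b in corners])
far_top_right = to_3d(clipped[1])
```

The clipped polygon has four vertices, two of them with w = 0. The new top-right vertex comes out at about (50, -50, 50, 0) and is sent to about (5000, -5000, 5000) with n = 100, matching the example.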
Comment 10 Simon Fraser 2012-02-22 21:31:52 UTC
I don't feel competent to review this text, and speccing what IE does but not what Gecko or WebKit do makes me a bit uneasy.

Could you attach some sample content (with prefixes) that matches the examples?
Comment 11 Aryeh Gregor 2012-02-23 18:46:55 UTC
Sure.  I'll start with example 3 because it's the most compelling.  I suggest using <http://software.hixie.ch/utilities/js/live-dom-viewer/> to view.

data:text/html,<!doctype html>
<style>
div {
  height: 50px;
  width: 50px;
  background: lime;
  border: 25px solid blue;
  -webkit-transform-origin: left;
  -webkit-transform: perspective(50px) rotateY(-20deg);
}
</style>
<div></div>

This shows the box significantly distorted.  Increase the magnitude of the angle gradually: -25deg, -28deg, -29deg, -29.9deg, -29.99deg.  As you get close to -30deg, it gets stretched out more and more.  Notice that when you go from -28deg to -29deg, or -29deg to -29.9deg, the center part of the box doesn't get too much more distorted, but the right edge stretches out dramatically -- look at how big the scrollbar gets.  If you set it to -29.9deg, the right border is stretched out to enormous width.  This is true in IE, Gecko, and Chrome.  (When it gets very large, Gecko seems to not consistently display the blue border, and IE seems to display it somewhat faded, but that's a separate issue.)

Now set the rotation to -30.1deg.  In Gecko and Chrome, it vanishes, which is discontinuous.  In IE, however, it just stretches out still more, except that the right side starts to get cut off.  You can keep increasing the angle and it will still display in IE: -35deg, -40deg, -45deg, etc.  As you increase the angle, the box will just get gradually more and more stretched out, particularly the left parts.  At -40deg, the rightmost edge is still blue.  At -45deg, the blue part gets cut off and the rightmost part is green.  As you increase the angle to -50deg, -60deg, -70deg, -80deg, the green part takes over more and more.  At -89deg it's almost a green half-plane with a thin blue border on the left.  Finally, it disappears at -90deg, because at that point it's actually perpendicular to the viewer.

So Gecko/WebKit have an arbitrary discontinuity here at -30deg.  IE's behavior is continuous throughout up to -90deg.  (All browsers display nothing at -30deg exactly; this is just a bug, IMO.)
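
The -30deg threshold and the -40deg/-45deg observations follow from simple trigonometry, sketched here in Python. The assumptions are the ones in the markup above: a border box 100px wide (50px content plus two 25px borders), transform-origin on the left edge, and perspective(50px). A point x px from the left edge ends up with w = 1 - x*sin(angle)/50. The function names are made up for the sketch.

```python
import math


def critical_angle_deg(width, d):
    """Rotation magnitude at which the far edge reaches w = 0."""
    return math.degrees(math.asin(d / width))


def width_with_positive_w(width, d, angle_deg):
    """How much of the box, measured from the fixed left edge, has w > 0."""
    s = math.sin(math.radians(abs(angle_deg)))
    if s == 0:
        return width
    return min(width, d / s)


threshold = critical_angle_deg(100.0, 50.0)
at_40 = width_with_positive_w(100.0, 50.0, -40.0)
at_45 = width_with_positive_w(100.0, 50.0, -45.0)
```

The threshold comes out at 30 degrees. At -40deg about 77.8px of the box has positive w, so a sliver of the right border (which starts at 75px) survives. At -45deg only about 70.7px does, so the cut falls inside the lime area.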


For example 1, use this:

data:text/html,<!doctype html>
<style>
div {
  height: 90px;
  width: 90px;
  border-style: solid;
  border-width: 0 10px 10px 0;
  background: lime;
  -webkit-transform: perspective(50px) translateZ(0px);
}
</style>
<div></div>

This just displays as a lime square with a black border on the right and bottom.  Change the 0px translation to 25px, and it doubles in size.  37.5px will quadruple it, and so on.  As you get closer to 50px, the size increases asymptotically.  At exactly 50px, it should theoretically be infinite in size, although in actual browsers it disappears (IMO this is a bug).

But now increase the translation further, to 75px or 100px.  In both IE and Chrome it stays vanished.  This is what it's supposed to do: perspective(50px) means you're looking at the scene from a distance of 50px, so if anything is higher than 50px, it should be invisible.  However, in Gecko it gets flipped.  translateZ(100px) looks the same as scale(-1).  Mathematically this is because you get coordinates of the form (x, y, z, -1), and it just divides by the w value, yielding (-x, -y, -z, 1).  But this doesn't make physical sense.  Thus the spec says it has to disappear, like IE and Chrome.
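
The size changes described above can be checked with a few lines of Python. This is only a sketch: `w_after_perspective` and `magnification` are hypothetical names, and returning None for w <= 0 reflects the proposed "not drawn" behavior and is not what every browser does.

```python
def w_after_perspective(z, d):
    """Fourth coordinate of a point at height z under perspective(d)."""
    return 1.0 - z / d


def magnification(z, d):
    """Apparent scale factor, or None when the point is at or behind the viewer."""
    w = w_after_perspective(z, d)
    if w <= 0:
        return None
    return 1.0 / w


double = magnification(25.0, 50.0)
quadruple = magnification(37.5, 50.0)
behind = magnification(100.0, 50.0)

# Dividing by w without checking its sign gives the mirrored result.
naive_scale = 1.0 / w_after_perspective(100.0, 50.0)
```

With translateZ(25px) the factor is 2, with 37.5px it is 4, and with 100px the unchecked division yields -1, which is the scale(-1) flip seen in Gecko.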


The second example is really kind of a corner case.  Behavior there doesn't matter so much.  The behavior I spec for the second example doesn't match any browser -- they all make it vanish.  But this isn't consistent.  It's probably the same issue as why -30deg in the first markup I give in this post causes the box to vanish in IE, while -29.99 and -30.01 do not.  I think that's clearly a bug, so browsers are all wrong here.
Comment 13 Aryeh Gregor 2012-02-23 19:08:33 UTC
(In reply to comment #12)
> Not true for Safari on Mac. Don't expect Chrome and Safari to have equivalent
> behavior here.

Yeah, I misspoke.  I don't have access to a Mac, so I can't easily test on Safari.  How does Safari behave -- like IE10, with no discontinuity?  If so, that's more argument for the spec change.
Comment 14 Dirk Schulze 2012-05-25 16:55:43 UTC
The FX TF would like to see some wording in the spec. The general feeling was that it is better to have a slightly wrong definition at the beginning than no wording at all. The resolution was to edit the spec to add whichever proposal makes the most sense to the editor. Aryeh, can you edit the spec with your proposal, please?
Comment 15 Aryeh Gregor 2012-05-28 10:11:12 UTC
Done: http://dvcs.w3.org/hg/csswg/rev/4154be20d399

I added an issue to the spec noting that this is a first pass and implementer feedback is welcome.  My background is in pure math, not 3D graphics, so while what I wrote makes sense to me, it's very possible someone whose linear algebra experience is more practical would see problems with it.  It's a start, though!

(If people would prefer to remove the issue and make it a note or comment or something, go ahead.  I think it's a good idea to let readers know somehow that this section could use more review.)
Comment 16 Dirk Schulze 2012-05-30 04:06:15 UTC
(In reply to comment #15)
> Done: http://dvcs.w3.org/hg/csswg/rev/4154be20d399
> 
> I added an issue to the spec noting that this is a first pass and implementer
> feedback is welcome.  My background is in pure math, not 3D graphics, so while
> what I wrote makes sense to me, it's very possible someone whose linear algebra
> experience is more practical would see problems with it.  It's a start, though!
> 
> (If people would prefer to remove the issue and make it a note or comment or
> something, go ahead.  I think it's a good idea to let readers know somehow that
> this section could use more review.)

You did not add the matrix multiplication, did you? Maybe I missed it.


[x]   [m11 m12 m13 m14]   [x_newCoordSys]
[y] = [m21 m22 m23 m24] . [y_newCoordSys]
[z]   [m31 m32 m33 m34]   [z_newCoordSys]
[w]   [m41 m42 m43 m44]   [1            ]


Also, there are two comments in the paragraphs about perspective saying that we should add more math here. Is that resolved? Can these comments be removed?
Comment 17 Dirk Schulze 2012-05-30 04:11:46 UTC
Also, can you change the TODO comment into a real paragraph with the ToDo class, please? I am not sure what you want to specify as yellow, or what you mean by this comment.

Regarding my previous comment: I found a nice blog post about transforming a point by the CTM:
http://dev.opera.com/articles/view/understanding-the-css-transforms-matrix/

Maybe we can adapt some parts?

Do you plan to transform the formulas into mathml, or images?
Comment 18 Dirk Schulze 2012-05-30 04:29:34 UTC
More comments, you write

""
The accumulated 3D transformation matrix is a 4×4 matrix, while the objects to be transformed are two-dimensional boxes. To transform each corner (a, b) of a box, the matrix must first be applied to (a, b, 0, 1), which will result in a four-dimensional point (x, y, z, w). This is transformed back to a three-dimensional point (x′, y′, z′) as follows:
""

I am not sure what you mean here. In HTML an element can have "rounded corners". A path in SVG is 2D, but definitely not a box. Can you clarify this more, please? Maybe it is enough to just speak about two-dimensional points of two-dimensional shapes, and to say that every point needs to be extended to a 4-entry vector and transformed afterwards. I would also add the matrix from your first comment here. If you want, I can help you create PNGs. In general you can create MathML directly and embed it. See discussions on public-fx.

""
x′ or y′ is much larger than the viewport size, if possible
""
Can we define what much larger is? Or at least the dimension?

""
For example, (5px, 22px, 0px, 0) might become (5000px, 22000px, 0px), with n = 1000, but this value of n would be too small for (0.1px, 0.05px, 0px, 0)
""

Can you put that into a new paragraph with class="example"? I don't think it should be in paragraph of a normative section.


""
the transformed box, the box is not rendered
""
Maybe we can just speak about the 'object', like we do in other places.

""
If w < 0 for one to three corners of the transformed box
""
The description should be independent of a certain shape.

""
All of the box's corners have z-coordinates greater than the perspective.
""
If you add a "<div class="transformed"></div>" after the style sheet, you can speak about the box. Ditto for all other examples.

This section is great progress! Thanks.
Comment 19 Aryeh Gregor 2012-05-31 11:30:17 UTC
(In reply to comment #16)
> You did not add the matrix multiplication, did you? Maybe I missed it.
> 
> 
> [x]   [m11 m12 m13 m14]   [x_newCoordSys]
> [y] = [m21 m22 m23 m24] . [y_newCoordSys]
> [z]   [m31 m32 m33 m34]   [z_newCoordSys]
> [w]   [m41 m42 m43 m44]   [1            ]

I just said "the matrix must first be applied to (a, b, 0, 1), which will result in a four-dimensional point (x, y, z, w)."  I'm assuming the reader knows how to apply a matrix, and if they don't, writing it out probably won't help.  Or are you concerned about premultiplication vs. postmultiplication?

> Also, there are two comments in the paragraphs that are talking about
> perspective that we should add more math here. Is that solved? Can these
> comments get removed?

I think those parts are probably clear enough, and it would be okay to remove the comments.

(In reply to comment #17)
> Also, can you change the TODO comment into a real paragraph with ToDo class
> please? I am not sure what you want to specify yellow, or what you mean with
> this comment.

It doesn't really matter -- the comment can just be removed.

> To my previous comment. I found a nice blog post about transforming a point by
> the CTM:
> http://dev.opera.com/articles/view/understanding-the-css-transforms-matrix/
> 
> Maybe we can adapt some parts?

I'd be okay with adding it as an informative reference in the intro, for people who don't have linear algebra background.

> Do you plan to transform the formulas into mathml, or images?

The formulas I added are extremely simple, and I think regular HTML is appropriate.

(In reply to comment #18)
> More comments, you write
> 
> ""
> The accumulated 3D transformation matrix is a 4×4 matrix, while the objects to
> be transformed are two-dimensional boxes. To transform each corner (a, b) of a
> box, the matrix must first be applied to (a, b, 0, 1), which will result in a
> four-dimensional point (x, y, z, w). This is transformed back to a
> three-dimensional point (x′, y′, z′) as follows:
> ""
> 
> I am not sure what you mean here. In HTML an element can have "rounded
> corners". A path in SVG is 2D, but definitely not a box. Can you clarify this
> more please? Maybe it is enough to just speak about two-dimensional points of
> two-dimensional shapes. And say that every point needs to get extended to a 4
> entry vector and transformed afterwards. And I would add the matrix from your
> first comment here. If you want, I can help you to create PNG's. In general you
> can create MathML directly and embed it. See discussions on public-fx.

In HTML, elements' border boxes are what are transformed.  You're right that the spec is vague about what to do with anything other than simple boxes; and you're also right that it doesn't apply to SVG at all.  So this is an issue, and I'm not sure how to deal with it.

The problem is that if you include perspective, some curves will be mapped to infinity.  E.g., see the final example in comment 9 -- as you get close to the right edge of the box there, the points get mapped arbitrarily far to the right.  The transform of the box is therefore infinitely wide, in theory.  But we can't support this, because the element needs to have finite width.  I tried to fudge it by saying the edges should just be mapped really far away, but this doesn't work in cases other than simple boxes, you're right.

So I agree this is a problem and I'm not sure how to fix it.  IIRC, existing browsers other than IE seem to just get it entirely wrong -- like not rendering the box at all, or flipping it.

One way we could solve this is just by saying that any box/shape/etc. that has any point that's mapped at or behind the viewer (w <= 0) is not rendered at all.  This is how WebKit behaves right now, I think, and how Gecko behaves in some cases.  Maybe that makes more sense -- at least for now.  So that would mean that all three of my examples just wouldn't render anything.  It would sure be simpler to spec and understand than what I wrote.

That still leaves open the question of what exactly counts as "has any point with w <= 0".  What if the box has a box-shadow outside the border that has w = 0 at one point, but the border box itself doesn't?

This is where someone with 3D graphics experience would probably know better than me.  I know how the math works in theory, but don't know what the best way is to translate it to real life.  :(

> ""
> x′ or y′ is much larger than the viewport size, if possible
> ""
> Can we define what much larger is? Or at least the dimension?
> 
> ""
> For example, (5px, 22px, 0px, 0) might become (5000px, 22000px, 0px), with n =
> 1000, but this value of n would be too small for (0.1px, 0.05px, 0px, 0)
> ""
> 
> Can you put that into a new paragraph with class="example"? I don't think it
> should be in paragraph of a normative section.
> 
> 
> ""
> the transformed box, the box is not rendered
> ""
> Maybe we can just speak about the 'object' like we do on other places.
> 
> ""
> If w < 0 for one to three corners of the transformed box
> ""
> The description should be independent of a certain shape.
> 
> ""
> All of the box's corners have z-coordinates greater than the perspective.
> ""
> If you add a "<div class="transformed"></div>" after the style sheet, you can
> speak about the box. Ditto for all other examples.
> 
> This section is a great progress! Thanks.

See above -- I'm starting to think we should just scrap this section.  Instead, something more like this might be appropriate:

"""
The perspective matrix and transformation matrix are both 4x4 matrices, while
the objects to be transformed are two-dimensional.  To transform each
point (a, b) in a shape, the matrix must first be applied to (a, b, 0, 1), which
will result in a four-dimensional point (x, y, z, w).  If w > 0, this is then mapped to the three-dimensional point (x/w, y/w, z/w), which can be projected onto the viewing surface.  If w <= 0, however, the point does not correspond to anything, and the entire transformed object must not be rendered.

--------
  Example:

  <style>
  .transformed {
    height: 100px;
    width: 100px;
    background: lime;
    transform: perspective(50px) translateZ(100px);
  }
  </style>
  <div class="transformed"></div>

  All of the box's corners have Z-coordinates greater than the perspective. 
This means that the box is behind the viewer and will not display. 
Mathematically, the point (x, y) first becomes (x, y, 0, 1), then is translated
to (x, y, 100, 1), and then applying the perspective results in (x, y, 100,
-1).  The w-coordinate is negative, so it does not display.  An implementation
that doesn't handle the w < 0 case separately might incorrectly display this
point as (-x, -y, -100), dividing by -1 and mirroring the box.

  The same would be true with a z-translation of 50px instead of 100px.  In that case, the w-coordinate would be 0, so the box still would not display.  However, reducing the z-translation to slightly less than 50px (such as 49.9px) would make the box appear again, greatly enlarged, since then w would be positive.
--------

--------
  Example: 

  <style>
  .transformed {
    height: 50px;
    width: 50px;
    background: lime;
    border: 25px solid blue;
    transform-origin: left;
    transform: perspective(50px) rotateY(-45deg);
  }
  </style>
  <div class="transformed"></div>

  The box here is rotated toward the viewer, with the left edge staying fixed
while the right edge swings closer.  The right edge is at about z =
70.7px, which is closer than the perspective of 50px.  Since part of the box is behind the viewer, the entire box is not rendered.  Reducing the magnitude of the rotation to less than 30deg will make the box reappear.

  Mathematically, the top right vertex of the box was originally (100, -50),
relative to the transform-origin.  It is first expanded to (100, -50, 0, 1). 
After applying the transform specified, this will get mapped to about (70.71,
-50, 70.71, -0.4142).  This has w = -0.4142 < 0, so the entire box is not rendered.
--------
"""

This is still vague on what counts as being "in the shape".  The background of the <html> element is theoretically infinite, so does applying any 3D rotation with perspective to <html> make the whole page vanish if the background isn't transparent?  I don't know.  I think this is better than what I just committed, though.
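
A sketch of the simpler "all or nothing" rule from the replacement text, in Python. For a flat polygon, w is an affine function of (a, b), so its minimum over the polygon is reached at a vertex and checking the corners is enough. That shortcut is my observation, not part of the proposed text, and it says nothing about rounded corners, shadows, or infinite backgrounds. All helper names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def perspective(d):
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, -1.0 / d, 1.0]]


def translate_z(tz):
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def rotate_y(deg):
    c = math.cos(math.radians(deg))
    s = math.sin(math.radians(deg))
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def w_at(m, a, b):
    """Fourth coordinate of m applied to (a, b, 0, 1)."""
    return m[3][0] * a + m[3][1] * b + m[3][3]


def is_rendered(m, corners):
    """Proposed rule: render only if every point has w > 0."""
    return all(w_at(m, a, b) > 0 for a, b in corners)


# First example: 100x100 box, coordinates relative to its center.
square = [(-50.0, -50.0), (50.0, -50.0), (50.0, 50.0), (-50.0, 50.0)]
behind = is_rendered(matmul(perspective(50.0), translate_z(100.0)), square)
just_in_front = is_rendered(matmul(perspective(50.0), translate_z(49.9)), square)

# Second example: 100x100 border box, coordinates relative to its left edge.
box = [(0.0, -50.0), (100.0, -50.0), (100.0, 50.0), (0.0, 50.0)]
rotated_45 = is_rendered(matmul(perspective(50.0), rotate_y(-45.0)), box)
rotated_20 = is_rendered(matmul(perspective(50.0), rotate_y(-20.0)), box)
```

Under this rule the translateZ(100px) box and the rotateY(-45deg) box are dropped entirely, while translateZ(49.9px) and rotateY(-20deg) still render.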
Comment 20 Dirk Schulze 2012-06-01 17:38:52 UTC
(In reply to comment #19)
> (In reply to comment #16)
> > You did not add the matrix multiplication, did you? Maybe I missed it.
> > 
> > 
> > [x]   [m11 m12 m13 m14]   [x_newCoordSys]
> > [y] = [m21 m22 m23 m24] . [y_newCoordSys]
> > [z]   [m31 m32 m33 m34]   [z_newCoordSys]
> > [w]   [m41 m42 m43 m44]   [1            ]
> 
> I just said "the matrix must first be applied to (a, b, 0, 1), which will
> result in a four-dimensional point (x, y, z, w)."  I'm assuming the reader
> knows how to apply a matrix, and if they don't, writing it out probably won't
> help.  Or are you concerned about premultiplication vs. post multiplication?
Yes. You might believe it is stupid, but even the words "post-multiplied" are not obvious to everyone :). I don't suggest that you change the text here, but rather that you add a graphic for demonstration. And your formula looks correct to me.

> 
> > Also, there are two comments in the paragraphs that are talking about
> > perspective that we should add more math here. Is that solved? Can these
> > comments get removed?
> 
> I think those parts are probably clear enough, and it would be okay to remove
> the comments.
Great!

> 
> (In reply to comment #17)
> > Also, can you change the TODO comment into a real paragraph with ToDo class
> > please? I am not sure what you want to specify yellow, or what you mean with
> > this comment.
> 
> It doesn't really matter -- the comment can just be removed.
Sounds good.

> 
> > To my previous comment. I found a nice blog post about transforming a point by
> > the CTM:
> > http://dev.opera.com/articles/view/understanding-the-css-transforms-matrix/
> > 
> > Maybe we can adapt some parts?
> 
> I'd be okay with adding it as an informative reference in the intro, for people
> who don't have linear algebra background.
I don't think that we need to do that.

> 
> > Do you plan to transform the formulas into mathml, or images?
> 
> The formulas I added are extremely simple, and I think regular HTML is
> appropriate.
I agree.

> 
> (In reply to comment #18)
> > More comments, you write
> > 
> > ""
> > The accumulated 3D transformation matrix is a 4×4 matrix, while the objects to
> > be transformed are two-dimensional boxes. To transform each corner (a, b) of a
> > box, the matrix must first be applied to (a, b, 0, 1), which will result in a
> > four-dimensional point (x, y, z, w). This is transformed back to a
> > three-dimensional point (x′, y′, z′) as follows:
> > ""
> > 
> > I am not sure what you mean here. In HTML an element can have "rounded
> > corners". A path in SVG is 2D, but definitely not a box. Can you clarify this
> > more please? Maybe it is enough to just speak about two-dimensional points of
> > two-dimensional shapes. And say that every point needs to get extended to a 4
> > entry vector and transformed afterwards. And I would add the matrix from your
> > first comment here. If you want, I can help you to create PNG's. In general you
> > can create MathML directly and embed it. See discussions on public-fx.
> 
> In HTML, elements' border boxes are what are transformed.  You're right that
> the spec is vague about what to do with anything other than simple boxes; and
> you're also right that it doesn't apply to SVG at all.  So this is an issue,
> and I'm not sure how to deal with it.
> 
> The problem is that if you include perspective, some curves will be mapped to
> infinity.  E.g., see the final example in comment 9 -- as you get close to the
> right edge of the box there, the points get mapped arbitrarily far to the
> right.  The transform of the box is therefore infinitely wide, in theory.  But
> we can't support this, because the element needs to have finite width.  I tried
> to fudge it by saying the edges should just be mapped really far away, but this
> doesn't work in cases other than simple boxes, you're right.
> 
> So I agree this is a problem and I'm not sure how to fix it.  IIRC, existing
> browsers other than IE seem to just get it entirely wrong -- like not rendering
> the box at all, or flipping it.
> 
> One way we could solve this is just by saying that any box/shape/etc. that has
> any point that's mapped at or behind the viewer (w <= 0) is not rendered at
> all.  This is how WebKit behaves right now, I think, and how Gecko behaves in
> some cases.  Maybe that makes more sense -- at least for now.  So that would
> mean that all three of my examples just wouldn't render anything.  It would
> sure be simpler to spec and understand than what I wrote.
I would be fine with doing it that way. We could say that a future version of the spec may provide some rules to allow it.

> 
> That still leaves open the question of what exactly counts as "has any point
> with w <= 0".  What if the box has a box-shadow outside the border that has w =
> 0 at one point, but the border box itself doesn't?
> 
> This is where someone with 3D graphics experience would probably know better
> than me.  I know how the math works in theory, but don't know what the best way
> is to translate it to real life.  :(
Maybe this is just a problem with your 'bounding box' definition, which does not include all visual data. It may be easier to define an "area affected by painting" and apply the rules you have to that area. This could be used for SVG as well (or any spec that might use CSS Transforms later).

> 
> > ""
> > x′ or y′ is much larger than the viewport size, if possible
> > ""
> > Can we define what much larger is? Or at least the dimension?
> > 
> > ""
> > For example, (5px, 22px, 0px, 0) might become (5000px, 22000px, 0px), with n =
> > 1000, but this value of n would be too small for (0.1px, 0.05px, 0px, 0)
> > ""
> > 
> > Can you put that into a new paragraph with class="example"? I don't think it
> > should be in paragraph of a normative section.
> > 
> > 
> > ""
> > the transformed box, the box is not rendered
> > ""
> > Maybe we can just speak about the 'object' like we do on other places.
> > 
> > ""
> > If w < 0 for one to three corners of the transformed box
> > ""
> > The description should be independent of a certain shape.
> > 
> > ""
> > All of the box's corners have z-coordinates greater than the perspective.
> > ""
> > If you add a "<div class="transformed"></div>" after the style sheet, you can
> > speak about the box. Ditto for all other examples.
> > 
> > This section is a great progress! Thanks.
> 
> See above -- I'm starting to think we should just scrap this section.  Instead,
> something more like this might be appropriate:
> 
> """
> The perspective matrix and transformation matrix are both 4x4 matrices, while
> the objects to be transformed are two-dimensional.  To transform each
> point (a, b) in a shape, the matrix must first be applied to (a, b, 0, 1),
> which
> will result in a four-dimensional point (x, y, z, w).  If w > 0, this is then
> mapped to the three-dimensional point (x/w, y/w, z/w), which can be projected
> onto the viewing surface.  If w <= 0, however, the point does not correspond to
> anything, and the entire transformed object must not be rendered.
Is your "viewing surface" the same as the area that I tried to explain in the comment above? If yes, I fully agree.

> 
> --------
>   Example:
> 
>   <style>
>   .transformed {
>     height: 100px;
>     width: 100px;
>     background: lime;
>     transform: perspective(50px) translateZ(100px);
>   }
>   </style>
>   <div class="transformed"></div>
> 
>   All of the box's corners have Z-coordinates greater than the perspective.
Just a note: can you say "div box", or be more explicit about which box you mean, please?

> This means that the box is behind the viewer and will not display. 
> Mathematically, the point (x, y) first becomes (x, y, 0, 1), then is translated
> to (x, y, 100, 1), and then applying the perspective results in (x, y, 100,
> -1).  The w-coordinate is negative, so it does not display.  An implementation
> that doesn't handle the w < 0 case separately might incorrectly display this
> point as (-x, -y, -100), dividing by -1 and mirroring the box.
> 
>   The same would be true with a z-translation of 50px instead of 100px.  In
> that case, the w-coordinate would be 0, so the box still would not display. 
> However, reducing the z-translation to slightly less than 50px (such as 49.9px)
> would make the box appear again, greatly enlarged, since then w would be
> positive.
> --------
> 
> --------
>   Example: 
> 
>   <style>
>   .transformed {
>     height: 50px;
>     width: 50px;
>     background: lime;
>     border: 25px solid blue;
>     transform-origin: left;
>     transform: perspective(50px) rotateY(-45deg);
>   }
>   </style>
>   <div class="transformed"></div>
> 
>   The box here is rotated toward the viewer, with the left edge staying fixed
> while the right edge swings closer.  The right edge is at about z =
> 70.7px, which is closer than the perspective of 50px.  Since part of the box is
> behind the viewer, the entire box is not rendered.  Reducing the rotation to
> about 30deg will make the box reappear.
> 
>   Mathematically, the top right vertex of the box was originally (100, -50),
> relative to the transform-origin.  It is first expanded to (100, -50, 0, 1). 
> After applying the transform specified, this will get mapped to about (70.71,
> -50, 70.71, -0.4142).  This has w = -0.4142 < 0, so the entire box is not
> rendered.
> --------
> """
> 
> This is still vague on what counts as being "in the shape".  The background of
> the <html> element is theoretically infinite, so does applying any 3D rotation
> with perspective to <html> make the whole page vanish if the background isn't
> transparent?  I don't know.  I think this is better than what I just committed,
> though.
I also wouldn't use "shape"; that was my fault. Just "object" is better. But yes, you are right. That is why I mentioned the area that is affected by painting (before the transformation).

I think the changes would make the text even better.
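
As a sanity check on the numbers in the two proposed examples, here is a rough Python sketch of the "w <= 0 means do not render" rule. All function names are hypothetical, and it only tests the four corners of the border box. That is enough for a flat rectangle, because w is an affine function of (a, b) and so takes its minimum at a corner; it does not address shadows, rounded corners, or SVG shapes.

```python
import math

# Illustrative sketch only: hypothetical helpers, column-vector convention.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def perspective(d):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1 / d, 1]]

def translate_z(tz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, tz], [0, 0, 0, 1]]

def rotate_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def project(m, a, b):
    """Map the 2D point (a, b) to (x/w, y/w, z/w), or None if w <= 0."""
    x, y, z, w = apply(m, [a, b, 0, 1])
    if w <= 0:
        return None
    return (x / w, y / w, z / w)

def is_rendered(m, corners):
    """Proposed rule: the object is dropped if any point has w <= 0."""
    return all(project(m, a, b) is not None for a, b in corners)

# 100px x 100px box, coordinates relative to the transform-origin.
centered = [(-50, -50), (50, -50), (50, 50), (-50, 50)]   # default origin
left_origin = [(0, -50), (100, -50), (100, 50), (0, 50)]  # transform-origin: left

# First example: perspective(50px) translateZ(100px), then 49.9px.
print(is_rendered(matmul(perspective(50), translate_z(100)), centered))   # False
print(is_rendered(matmul(perspective(50), translate_z(49.9)), centered))  # True

# Second example: perspective(50px) rotateY(-45deg), top right corner.
top_right = apply(matmul(perspective(50), rotate_y(-45)), [100, -50, 0, 1])
print(top_right)  # approximately (70.71, -50, 70.71, -0.4142)
print(is_rendered(matmul(perspective(50), rotate_y(-45)), left_origin))   # False
```

With this convention the values quoted in both examples come out as stated.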
Comment 21 Aryeh Gregor 2012-06-04 07:39:08 UTC
(In reply to comment #20)
> Yes. You might think it is stupid, but even the words "post-multiplied" are not
> obvious to everyone :). I am not suggesting that you change the text here, only
> that you add a graphic for demonstration. And your formula looks correct to me.

If you have a graphic to add, feel free.  I'm not the most graphically talented.

> > One way we could solve this is just by saying that any box/shape/etc. that has
> > any point that's mapped at or behind the viewer (w <= 0) is not rendered at
> > all.  This is how WebKit behaves right now, I think, and how Gecko behaves in
> > some cases.  Maybe that makes more sense -- at least for now.  So that would
> > mean that all three of my examples just wouldn't render anything.  It would
> > sure be simpler to spec and understand than what I wrote.
> I would be fine with doing it that way. We could say that a future version of
> the spec may provide some rules to allow it.

Right.

> Maybe this is just a problem with your 'bounding box' definition, which does
> not include all visual data. It may be easier to define an "area affected by
> painting" and apply the rules you have to that area. This could be used for SVG
> as well (or any spec that might use CSS Transforms later).

The problem is, sometimes it's not clear what's "affected".  If you have a box-shadow with blur, the shadow fades to transparent with a Gaussian blur.  At what point is it no longer "affected"?  It will be effectively transparent a few pixels away, but at what point do we say it's no longer "affected"?  There's no one point at which it becomes transparent -- the opacity gets lower and lower until it's eventually rounded down to zero as an implementation detail.
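
To put rough numbers on that, here is a small Python sketch. It assumes the shadow edge is a step blurred by a Gaussian whose standard deviation is half the blur radius, and that an implementation stores alpha in 8 bits; both are assumptions for illustration, not a description of any particular browser.

```python
import math

# Illustrative sketch only: function names and the 8-bit alpha model are assumptions.

def shadow_alpha(distance_px, blur_radius_px):
    """Alpha of a blurred straight shadow edge, distance_px outside the edge."""
    sigma = blur_radius_px / 2
    return 0.5 * math.erfc(distance_px / (sigma * math.sqrt(2)))

def quantized(alpha, levels=255):
    """Alpha as it might be stored in an 8-bit channel."""
    return round(alpha * levels)

for distance in (0, 5, 10, 15, 20, 40):
    alpha = shadow_alpha(distance, 10)
    print(distance, alpha, quantized(alpha))
```

In this model the exact alpha is still positive at every distance listed, but the stored value drops to zero somewhere around three standard deviations out. Where that happens depends on the bit depth, which is why it is hard to treat as a spec-level boundary.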

Also, what happens to the root element?  Its background is painted on the page's canvas, and any transformations on it affect the page's canvas too.  The page's canvas is defined to be infinite <http://www.w3.org/TR/CSS21/intro.html#the-canvas>, so does that mean that perspective(1000000px) rotateY(0.001deg) on the root element should make the whole page disappear?  If not, what do we use?  The region that's reachable using scrollbars, perhaps?  But part of that might get moved to the visible region of the page by the transform, so how does it get rendered?
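
For scale, here is a quick Python sketch of how far out the first w <= 0 point would be. It takes the rotation to be about the Y axis (rotateY), which is an assumption, since a plain 2D rotate() leaves z at 0 and w at 1 everywhere. The function names are hypothetical.

```python
import math

# Illustrative sketch only: assumes perspective(d) followed by rotateY(angle),
# applied to points (x, y, 0, 1) on an unbounded canvas.

def w_at(x_px, perspective_px, angle_deg):
    """w-coordinate of the canvas point with horizontal offset x_px."""
    z = -math.sin(math.radians(angle_deg)) * x_px  # z after rotateY
    return 1 - z / perspective_px                  # w after perspective

def vanishing_distance(perspective_px, angle_deg):
    """|x| at which w reaches 0, i.e. where the canvas meets the viewer plane."""
    return perspective_px / abs(math.sin(math.radians(angle_deg)))

distance = vanishing_distance(1_000_000, 0.001)
print(distance)                                # on the order of 5.7e10 px
print(w_at(-2 * distance, 1_000_000, 0.001))   # negative: behind the viewer
```

So the offending points are tens of billions of pixels away, far beyond anything reachable by scrolling, yet on an infinite canvas they still exist.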

I don't know if there are comparable issues for any SVG features, but at least for CSS boxes, I suggest we leave this vague for now and punt it to a future version.

> > """
> > The perspective matrix and transformation matrix are both 4x4 matrices, while
> > the objects to be transformed are two-dimensional.  To transform each
> > point (a, b) in a shape, the matrix must first be applied to (a, b, 0, 1),
> > which
> > will result in a four-dimensional point (x, y, z, w).  If w > 0, this is then
> > mapped to the three-dimensional point (x/w, y/w, z/w), which can be projected
> > onto the viewing surface.  If w <= 0, however, the point does not correspond to
> > anything, and the entire transformed object must not be rendered.
> Is your "viewing surface" the same as the area that I tried to explain in the
> comment above? If yes, I fully agree.

By "viewing surface" I mean the screen.  That's probably not the correct term.  I think I mean "the canvas", but I'm no expert in CSS terminology.

> >   All of the box's corners have Z-coordinates greater than the perspective.
> Just a note: can you say "div box", or be more explicit about which box you
> mean, please?

Sure.

> I think the changes would make the text even better.

I can do that.  Any objections?
Comment 22 Simon Fraser 2012-10-18 22:14:32 UTC
We need to review this text to see if it matches WebKit. Dirk is going to run some tests in the different UAs.