Copyright ©2000 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XSL is a language for expressing stylesheets. It consists of two parts:
a language for transforming XML documents, and
an XML vocabulary for specifying formatting semantics.
An XSL stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary.
This is a W3C Working Draft for review by W3C members and other interested parties. This adds additional functionality to what was described in the previous draft. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. The XSL Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at http://www.w3.org/TR/.
This document has been produced as part of the W3C Style Activity by the XSL Working Group (members only).
Comments may be sent to xsl-editors@w3.org. Public discussion of XSL takes place on the XSL-List mailing list.
The following section is still to be developed. It will contain pictures and an explanation of the XSL "Processing Model".
TBD
TBD
Issue (imaging-model-font-indices):
The plan is to reference SVG Section 4: Rendering Model. Does SVG permit glyph indices into a particular font?
TBD
TBD
TBD
What this section will contain:
How XSLT fits in
This is the normative reference to XSLT
Issue (xslt-reference):
This serves as a temporary place-holder.
The Tree Construction is described in "XSL Transformations (XSLT)".
The provisions in "XSL Transformations" form an integral part of this recommendation and are considered normative.
The XSL namespace has the URI http://www.w3.org/1999/XSL/Format.
NOTE: The 1999 in the URI indicates the year in which the URI was allocated by the W3C. It does not indicate the version of XSL being used.
XSL processors must use the XML namespaces mechanism [W3C XML Names] to recognize elements and attributes from this namespace. Elements from the XSL namespace are recognized only in the stylesheet, not in the source document. Implementors must not extend the XSL namespace with additional elements or attributes. Instead, any extension must be in a separate namespace.
This specification uses the prefix fo: for referring
to elements in the XSL namespace. However, XSL stylesheets are free
to use any prefix, provided that there is a namespace declaration that
binds the prefix to the URI of the XSL namespace.
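The prefix-independence described above can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree parser, which reports each element by its expanded-name; the helper is_fo_element and the three sample documents are hypothetical and are not part of this specification.

```python
import xml.etree.ElementTree as ET

# The namespace URI given in this section.
XSL_FO_NS = "http://www.w3.org/1999/XSL/Format"

def is_fo_element(element, local_name):
    """True if the element's expanded-name is {XSL namespace}local_name.

    ElementTree spells an expanded-name as "{uri}local", so the prefix
    used in the source text plays no part in the comparison.
    """
    return element.tag == f"{{{XSL_FO_NS}}}{local_name}"

# Hypothetical documents: the first two differ only in the prefix they bind.
doc_fo = ET.fromstring(f'<fo:root xmlns:fo="{XSL_FO_NS}"><fo:block/></fo:root>')
doc_x = ET.fromstring(f'<x:root xmlns:x="{XSL_FO_NS}"><x:block/></x:root>')
# Same prefix as the first, but bound to a different (made-up) namespace.
doc_other = ET.fromstring('<fo:root xmlns:fo="urn:example:not-xsl"/>')
```

Here doc_fo and doc_x are both recognized as containing XSL elements, while doc_other is not, even though it uses the fo: prefix.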
An element from the XSL namespace may have any attribute not from the XSL namespace, provided that the expanded-name of the attribute has a non-null namespace URI. The presence of such attributes must not change the behavior of XSL elements and functions defined in this document. Thus, an XSL processor is always free to ignore such attributes, and must ignore such attributes without giving an error if it does not recognize the namespace URI. Such attributes can provide, for example, unique identifiers, optimization hints, or documentation.
It is an error for an element from the XSL namespace to have attributes with expanded-names that have null namespace URIs (i.e., attributes with unprefixed names) other than attributes defined for the element in this document.
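As a rough illustration of the two attribute rules above, the following sketch separates the attributes of an XSL element into those that are errors and those a processor may ignore. The function check_attributes, the set of defined attribute names passed to it, and the urn:example:editor namespace are all assumptions made for the example; attributes that are themselves in the XSL namespace are not covered by this sketch.

```python
import xml.etree.ElementTree as ET

XSL_FO_NS = "http://www.w3.org/1999/XSL/Format"

def check_attributes(element, defined_names):
    """Split an element's attributes into (errors, ignorable).

    defined_names is the (assumed) set of unprefixed attribute names
    defined for this element.
    """
    errors, ignorable = [], []
    for name in element.attrib:
        if name.startswith("{"):
            # Namespaced attribute: ElementTree spells it "{uri}local".
            uri = name[1:].split("}", 1)[0]
            if uri != XSL_FO_NS:
                ignorable.append(name)   # may be ignored without error
        elif name not in defined_names:
            errors.append(name)          # null namespace URI and not defined
    return errors, ignorable

# Hypothetical element: one defined attribute, one foreign-namespace
# attribute, and one misspelled unprefixed attribute.
block = ET.fromstring(
    f'<fo:block xmlns:fo="{XSL_FO_NS}" xmlns:ed="urn:example:editor" '
    'font-size="12pt" ed:note="draft" colour="red"/>'
)
errors, ignorable = check_attributes(block, {"font-size"})
```

With this input, the misspelled colour attribute is reported as an error and ed:note is classified as ignorable.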
NOTE: The conventions used for the names of XSL elements, attributes, and functions are as follows: names are all lower-case, hyphens are used to separate words, dots are used to separate names for the components of complex datatypes, and abbreviations are used only if they already appear in the syntax of a related language such as XML or HTML.
The aim of this section is to describe the general process of formatting, enough to read the area model and the formatting-object descriptions and properties and to understand the process of refinement.
Formatting is the process of turning the result of an XSL transformation into a tangible form for the reader or listener. This process comprises several steps, some of which depend on others in a non-sequential way. Our model for formatting will be the construction of an area tree, which is an ordered tree containing geometric information for the placement of every glyph, shape, and image in the document, together with information embodying spacing constraints and other rendering information; this information is referred to under the rubric of traits, which are to areas what properties are to formatting objects and attributes are to XML nodes. Section 4 (see [4 Area Model]) will describe the area tree and define the default placement-constraints on stacked areas. However, this is an abstract model which need not be actually implemented in this way in a formatter, so long as the resulting tangible form obeys the implied constraints.
Formatting objects are elements in the formatting-object tree, whose names are from the XSL namespace; a formatting object belongs to a class of formatting objects identified by its element name. The formatting behavior of each class of formatting objects is described in terms of what areas are created by a formatting object of that class, how the traits of the areas are established, and how the areas are structured hierarchically with respect to areas created by other formatting objects. Section 6 (see [6 Formatting Objects]) and Section 7 (see [7 Formatting Properties]) describe formatting objects and their properties.
Some formatting objects are block-level and others are inline-level. This refers to the types of areas which they generate, which in turn refer to their default placement method. Inline-areas (for example, glyph-areas) are collected into lines and the direction in which they are stacked is the inline-progression-direction. Lines are a type of block-area and these are stacked in a direction perpendicular to the inline-progression-direction, called the block-progression-direction. See Section 4 for detailed descriptions of these area types and directions.
In Western writing systems, the block-progression-direction is "top-to-bottom" and the inline-progression-direction is "left-to-right". This specification treats other writing systems as well and introduces the terms "block" and "inline" instead of using absolute indicators like "vertical" and "horizontal". Similarly this specification tries to give relatively-specified directions ("before" and "after" in the block-progression-direction, "start" and "end" in the inline-progression-direction) where appropriate, either in addition to or in place of absolutely-specified directions such as "top", "bottom", "left", and "right". These are interpreted according to the value of the writing-mode property.
Central to this model of formatting is refinement. This is a computational process which finalizes the specification of properties based on the attribute values in the XML result tree. Though the XML result tree and the formatting-object tree have very similar structure, it is helpful to think of them as separate conceptual entities. Refinement involves
propagating the various inherited values of properties (both implicitly and those with an attribute value of "inherit"),
evaluating expressions in property value specifications into actual values, which are then used to determine the value of the properties,
converting relative numerics to absolute numerics,
constructing some composite properties from more than one attribute,
converting text nodes to sequences of fo:character formatting objects,
creating implied fo:bidi-override formatting objects to support mixed writing directions.
Some of these operations (particularly evaluating expressions) depend on knowledge of the area tree. Thus refinement is not necessarily a straightforward, sequential procedure, but may involve look-ahead, back-tracking, or control-splicing with other processes in the formatter. Refinement is described more fully in Section 5 (see [5 Property Refinement / Resolution]).
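Two of the refinement steps listed above, propagating inherited values and converting relative numerics to absolute ones, can be sketched as follows. This is only an illustration under stated assumptions: the result tree is modeled as nested dictionaries, only font-size and color are treated as inherited, the initial values are invented, and a percentage on font-size is taken relative to the parent's refined font-size. None of the names below come from this specification.

```python
# Assumptions for the sketch: which properties inherit, and their initial values.
INHERITED = ("font-size", "color")
INITIAL = {"font-size": 12.0, "color": "black"}

def to_points(spec, reference):
    """Convert a font-size specification to an absolute value in points."""
    if spec.endswith("%"):
        return reference * float(spec[:-1]) / 100.0   # relative numeric
    if spec.endswith("pt"):
        return float(spec[:-2])                       # already absolute
    raise ValueError(f"unsupported length in this sketch: {spec!r}")

def refine(node, parent_props=None):
    """Return a refined copy of node: every property has an actual value."""
    if parent_props is None:
        parent_props = INITIAL
    # Implicit inheritance: start from the parent's refined values.
    props = {name: parent_props[name] for name in INHERITED}
    for name, spec in node.get("attrs", {}).items():
        if spec == "inherit":
            if name in parent_props:                  # explicit inheritance
                props[name] = parent_props[name]
        elif name == "font-size":
            props[name] = to_points(spec, parent_props["font-size"])
        else:
            props[name] = spec
    children = [refine(child, props) for child in node.get("children", [])]
    return {"name": node["name"], "props": props, "children": children}

# Hypothetical result tree.
tree = {"name": "fo:root", "attrs": {"font-size": "10pt"}, "children": [
    {"name": "fo:block", "attrs": {"font-size": "150%", "color": "inherit"},
     "children": [{"name": "fo:inline"}]}]}
refined = refine(tree)
```

In this sketch the fo:block resolves 150% against the 10pt of its parent, and the fo:inline child receives the resulting absolute value by implicit inheritance. A real formatter may need area-tree information for some expressions, which this sketch does not model.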
To summarize, formatting proceeds by constructing an area tree (containing areas and their traits) which satisfies constraints based on information contained in the XML result tree (containing element nodes and their attributes). Conceptually, there is an intermediate step of refinement, constructing a formatting-object tree (containing formatting objects and their properties); this step may proceed in an interleaved fashion during the construction of the area tree.
This subsection contains a conceptual description of how formatting could work. This conceptual procedure does not mandate any particular algorithms or data structures as long as the result obeys the implied constraints.
The procedure works by processing formatting objects. Each object, while being processed, may initiate processing in other objects. While the objects are hierarchically structured, the processing is not; processing of a given object is rather like a co-routine which may pass control to other processes, but pick up again later where it left off. The procedure starts by initiating the processing of the fo:root formatting object.
Unless otherwise specified, processing a formatting object creates areas and returns them to its parent to be placed in the area tree. Like a co-routine, it resumes control later and initiates formatting of its own children (if any), or some subset of them. The formatting object supplies parameters to its children based on the traits of areas already in the area tree, possibly including areas generated by the formatting object or its ancestors. It then disposes of the areas returned by its formatting-object children. It might simply return such an area to its parent (and will always do this if it does not generate areas itself), or alternatively it might arrange the area in the area tree according to the semantics of the formatting object; this may involve changing its geometric position. It terminates processing when all its children have terminated processing (if initiated) and it is finished generating areas.
Some formatting objects do not themselves generate areas; instead, these formatting objects simply return the areas returned to them by their children. Alternatively, a formatting object may continue to generate (and return) areas based on information discovered while formatting its own children; for example, the fo:page-sequence formatting object will continue generating pages as long as it contains a flow with unprocessed descendants.
Areas received by an fo:root formatting object are pages, and are simply placed as children of the area tree root in the order in which they are returned, with no geometrical implications.
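The co-routine style of processing described in this subsection can be loosely sketched with Python generators, where each formatting object yields areas to its parent and resumes when asked for more. Everything here is an assumption made for illustration: the function names, the dictionary representation of areas, and the blocks_per_page parameter standing in for a real page-filling constraint. As the text notes, no particular algorithm is mandated.

```python
def process_block(content):
    """A block-level object: generates one block-area and returns it."""
    yield {"type": "block-area", "content": content}

def process_wrapper(children):
    """An object that generates no areas itself: it simply returns the
    areas returned to it by its children."""
    for child in children:
        yield from process_block(child)

def process_page_sequence(blocks, blocks_per_page=2):
    """Keeps generating pages as long as its flow has unprocessed content."""
    page = []
    for area in process_wrapper(blocks):
        page.append(area)
        if len(page) == blocks_per_page:
            yield {"type": "page", "children": page}
            page = []
    if page:
        yield {"type": "page", "children": page}

def process_root(page_sequences):
    """Places the pages it receives as children of the area tree root,
    in the order in which they are returned."""
    area_tree_root = []
    for blocks in page_sequences:
        area_tree_root.extend(process_page_sequence(blocks))
    return area_tree_root

# Two hypothetical page sequences with three and one blocks respectively.
pages = process_root([["a", "b", "c"], ["d"]])
```

With a capacity of two blocks per page, the first sequence produces two pages and the second produces one, and the pages appear under the root in the order returned.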
As a general rule, the order of the area tree parallels the order of the formatting-object tree. That is, if one formatting object precedes another in the depth-first traversal of the formatting-object tree, with neither containing the other, then all the areas generated by the first will precede all the areas generated by the second in the depth-first traversal of the area tree, unless otherwise specified. Typical exceptions to this rule would be things like inline floats, block floats, and footnotes.
At the end of the procedure, the areas and their traits have been constructed, and they are required to satisfy constraints described in the definitions of their associated formatting objects, and in the area model section. In particular, size and position of the areas will be subject to the placement and spacing constraints described in the area model, unless the formatting-object definition indicates otherwise.
The formatting-object definitions, property descriptions, and area model are not algorithms. Thus, the formatting-object semantics do not specify how the line-breaking algorithm must work in collecting characters into words, positioning words within lines, shifting lines within a container, etc. Rather this specification assumes that the formatter has done these things and describes the constraints which the result is supposed to satisfy.
In XSL, one creates a tree of formatting objects that serve as inputs or specifications to a formatter. The formatter generates a hierarchical arrangement of areas which comprise the formatted result. This section defines the general model of areas and how they interact. The purpose is to present an abstract framework which is used in describing the semantics of formatting objects. It should be seen as describing a series of constraints for conforming implementations, and not as prescribing particular algorithms.
The formatter generates an ordered tree, the area tree, which describes a geometric structuring of the output medium. The terms child, sibling, parent, descendant, and ancestor refer to this tree structure. The tree has a root node.
Each area tree node other than the root is called an area and is associated to a rectangular portion of the output medium. Areas are not formatting objects; rather, a formatting object generates zero or more rectangular areas, and normally each area is generated by a unique object in the formatting object tree.
NOTE: The only exceptions are when several leaf nodes of the formatting object tree are combined to generate a single area, for example when several characters in sequence generate a single ligature glyph. In all such cases, relevant properties such as font-family and font-size are the same for all the generating formatting objects.
An area has a content-rectangle, the portion in which its child areas are assigned, and optional padding and border. The diagram shows how these portions are related to one another. The outer bound of the border is called the border-rectangle, and the outer bound of the padding is called the padding-rectangle.
Each area has a set of traits, a mapping of names to values, in the way elements have attributes and formatting objects have properties. Individual traits are used either for rendering the area or for defining constraints on the result of formatting, or both. Traits used strictly for formatting purposes or for defining constraints may be called formatting traits, and traits used for rendering may be called rendering traits. For the complete list of the type of traits see [C Property Index].
The semantics of each type of formatting object that generates areas are given in terms of which areas it generates and their place in the area-tree hierarchy. This may be further modified by interactions between the various types of formatting objects. The properties of the formatting object determine what areas are generated and how the formatting object's content is distributed among them. (For example, a word that is not to be hyphenated may not have its glyphs distributed into areas on two separate line-areas.)
The traits of an area are either:
1. "directly-derived" -- The values of directly-derived traits are the computed value of a property of the same name on the generating formatting object, or
2. "indirectly-derived" -- The values of indirectly-derived traits are the result of a computation involving the computed values of one or more properties on the generating formatting object, other traits on this area or other interacting areas (ancestors, parent, siblings, and/or children) and/or one or more values constructed by the formatter. The calculation formula may depend on the type of the formatting object.
This description assumes that refined values have been computed for all properties of formatting objects in the result tree, i.e., all relative and corresponding values have been computed and the inheritable values have been propagated as described in [5 Property Refinement / Resolution]. This allows the process of inheritance to be described once and avoids a need to repeat information on computing values in this description.
There are two types of areas: block-areas and inline-areas. These differ according to how they are typically stacked by the formatter. An area can have child areas of one type or the other as determined by the generating formatting object, but an area's children must all be of one type. One should note that although block-areas and inline-areas are typically stacked, some areas can be explicitly positioned.
A line-area is a special kind of block-area whose children are all inline-areas. A glyph-area is a special kind of inline-area which has no child areas, and has a single glyph image as content.
Typical examples would be: a paragraph rendered by using an fo:block formatting object, which generates block-areas, and a character rendered by using an fo:character formatting object, which generates an inline-area (in fact, a glyph-area).
Associated with any area are two directions, which are derived from the generating formatting object's "writing-mode" and "reference-orientation" properties: the block-progression-direction is the direction for stacking block-area descendants of the area, and the inline-progression-direction is the direction for stacking inline-area descendants of the area. Another trait, the shift-direction, is present on inline-areas and refers to the direction in which the baseline shifts are applied. Also the glyph-orientation defines the orientation of glyph-images in the rendered result.
An area has a Boolean trait is-indent-reference, which determines
whether or not it establishes a coordinate system for
specifying indents. An area for which this trait is true
is called a reference-area. A reference-area may be either a
block-area or an inline-area.
A set of traits describes the position, height, and width of the area. Other traits specify:
the amount of space outside the border-rectangle: space-before, space-after, space-start, and space-end (though some of these may be required to be zero on certain classes of area);
the thickness of each of the four sides of the padding: padding-before, padding-after, padding-start, and padding-end;
the style, thickness, and color of each of the four sides of the border: border-before, etc.; and
the background rendering of the area: background-color, etc.
As described above, the content-rectangle is the rectangle bounding the inside of the padding and is used to describe the constraints on the positions of descendant areas. It is possible that marks from glyph contents or descendant areas may appear outside the content-rectangle.
Related to this is the allocation-rectangle of an area, which is used to describe the constraints on the position of the area within its parent area. For an inline-area this extends to the content-rectangle in the block-progression-direction and to the border-rectangle in the inline-progression-direction.
Allocation- and content-rectangles of an inline-area
For a block-area, it extends to the border-rectangle in the block-progression-direction and outside the border-rectangle in the inline-progression-direction by an amount equal to the space-end, and in the opposite direction by an amount equal to the space-start. The traits actual-height and actual-width of an area apply to the content-rectangle.
NOTE: The inclusion of space outside the border-rectangle of a block-area in the inline-progression-direction does not affect placement constraints, and is intended to promote compatibility with the CSS box model.
Allocation- and content-rectangles of a block-area
The edges of a rectangle are designated as follows:
the before-edge is the edge occurring first in the block-progression-direction and perpendicular to it;
the after-edge is the edge opposite the before-edge;
the start-edge is the edge occurring first in the inline-progression-direction and perpendicular to it; and
the end-edge is the edge opposite the start-edge.
The following diagram shows the correspondence between the various edge names for a mixed writing-mode example:
For purposes of this definition, the content-rectangle of an area uses the inline-progression-direction and block-progression-direction of that area; but the border-rectangle, padding-rectangle, and allocation-rectangle use the directions of its parent area. Thus the edges designated for the content-rectangle may not correspond with the same-named edges on the padding-, border-, and allocation-rectangles. This is important in the case of nested block-areas with different writing-modes.
Each inline-area has a designated position-point on the start-edge of its allocation-rectangle; for a glyph-area, this is a point on the leading edge of the glyph on its nominal baseline. The descent of an inline-area is defined to be how far its allocation-rectangle extends in the block-progression-direction from the position-point, and the ascent is defined to be how far the allocation-rectangle extends in the opposite direction. Together these add up to the actual-height of the inline-area.
In the area tree, the set of areas with a given parent is ordered. The terms initial, final, preceding, and following refer to this ordering.
This extends to a strict partial ordering among all descendant areas of a given area by saying (recursively) that A precedes B (and B follows A) when
A precedes B under the same parent, or
A precedes the parent of B, or
A's parent precedes B.
(This is only a partial ordering, since it does not define an area as following or preceding its descendant areas.)
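The recursive definition of "precedes" can be transcribed almost directly. The sketch below assumes a hypothetical Area class with parent and children links and is written for clarity rather than efficiency.

```python
class Area:
    """Minimal area-tree node for this sketch (hypothetical class)."""
    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def precedes(a, b):
    """True if area a precedes area b in the strict partial ordering."""
    if a is b:
        return False
    # A precedes B under the same parent.
    if a.parent is not None and a.parent is b.parent:
        siblings = a.parent.children
        return siblings.index(a) < siblings.index(b)
    # A precedes the parent of B.
    if b.parent is not None and precedes(a, b.parent):
        return True
    # A's parent precedes B.
    if a.parent is not None and precedes(a.parent, b):
        return True
    return False

# Hypothetical tree: P has children A (with A1, A2) and B (with B1).
a1, a2, b1 = Area("A1"), Area("A2"), Area("B1")
a, b = Area("A", [a1, a2]), Area("B", [b1])
p = Area("P", [a, b])
```

In this tree A1 precedes B1 because A precedes B, while neither P nor A1 precedes the other, which is what makes the ordering partial.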
If C follows A and precedes B, C is said to be between A and B.
Typically areas follow one another in sequence when stacked, but some areas may be marked as not following the main sequence (e.g., floats and absolutely positioned areas). The out-of-sequence trait is a Boolean value which is true if the area is to be treated in this way; a sequenced-area is an area for which this trait is false.
Issue (out-of-line-or-sequence):
Inadvertent inconsistency between Area Model that uses term "out-of-sequence" and Introduction to Formatting Objects that uses the term "out-of-line" to be consistent with the categorization of formatting objects in the April Draft. Another discrepancy is between "sequenced-area" and "normal area". These need to be harmonized.
If A and B are sequenced block-areas, A consecutively precedes B (or B consecutively follows A) if A precedes B and there is no sequenced-area between A and B, and further: (1) all ancestors of A which precede B have zero border-after and padding-after, and (2) all ancestors of B which follow A have zero border-before and padding-before.
Similarly, if A and B are inline-areas, A consecutively precedes B (or B consecutively follows A) if A precedes B and there is no area between A and B, and further: (1) all ancestors of A which precede B have zero border-end and padding-end, and (2) all ancestors of B which follow A have zero border-start and padding-start.
NOTE: The intention of the definition is to identify areas at any level of the tree which have only space between them.
Example. In this diagram each node represents a block-area. Assume that all padding and border widths are zero. Then A consecutively precedes B, A consecutively precedes C, C consecutively precedes D, B consecutively precedes E, and D consecutively precedes E; these are the only pairs of consecutively preceding elements in the diagram. If B had non-zero padding-after, then D would not consecutively precede E (though B would still consecutively precede E).
Recursively define an area A to be leading in another area B if
1. A is the initial child of B, or
2. A is the initial child of an area C, where C is leading in B and C has zero border-before and padding-before.
Define an area A to begin an area B if A is leading in B and all of A's ancestors which are descendants of B have zero space-before.
Recursively define an area A to be trailing in another area B if
1. A is the final child of B, or
2. A is the final child of an area C, where C is trailing in B and C has zero border-after and padding-after.
Define an area A to end an area B if A is trailing in B and all of A's ancestors which are descendants of B have zero space-after.
NOTE: It is possible for several areas to be leading or trailing, e.g., if the first child area is a block-area that has nested block-areas.
Example. In this diagram each node represents a block-area. Assume that all areas have zero border and padding. Then A, B, and C are all leading areas in P. If B had non-zero border-before, then only A and B would be leading in P.
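The definitions of "leading" and "begin" can be sketched as below. Since the diagram is not reproduced here, the sketch assumes that P has initial child A, A has initial child B, and B has initial child C; that tree shape, the Area class, and its field names are assumptions made for illustration.

```python
class Area:
    """Minimal block-area for this sketch (hypothetical class and fields)."""
    def __init__(self, name, children=(), border_before=0.0,
                 padding_before=0.0, space_before=0.0):
        self.name = name
        self.border_before = border_before
        self.padding_before = padding_before
        self.space_before = space_before
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def is_leading(a, b):
    """True if area a is leading in area b."""
    c = a.parent
    if c is None or c.children[0] is not a:
        return False                      # a is not an initial child
    if c is b:
        return True                       # rule 1
    # Rule 2: c must be leading in b with zero border-before and padding-before.
    return (is_leading(c, b)
            and c.border_before == 0 and c.padding_before == 0)

def begins(a, b):
    """True if a is leading in b and every ancestor of a that is a
    descendant of b has zero space-before."""
    if not is_leading(a, b):
        return False
    ancestor = a.parent
    while ancestor is not b:
        if ancestor.space_before != 0:
            return False
        ancestor = ancestor.parent
    return True

def chain(b_border_before=0.0):
    """Build the assumed tree P > A > B > C and return its four areas."""
    c = Area("C")
    b = Area("B", [c], border_before=b_border_before)
    a = Area("A", [b])
    p = Area("P", [a])
    return p, a, b, c
```

Building the chain with all borders zero makes A, B, and C leading in P; giving B a non-zero border-before leaves only A and B leading, as in the example.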
A space-specifier is a compound datatype consisting of a minimum, optimum, and maximum, conditionality, and precedence.
Minimum, optimum, and maximum are lengths and can be used to define a constraint on a distance, namely that the distance should preferably be the optimum, and in any case no less than the minimum nor more than the maximum. Any of these values may be negative, which can (for example) cause areas to overlap, but in any case the minimum should be less than or equal to the optimum value, and the optimum less than or equal to the maximum value.
Conditionality is a Boolean value which controls whether a
space-specifier has effect at
the beginning or end of a reference-area; a conditional
space-specifier is one for which this value is true.
Precedence has a value which is either an integer or the special
token force. A forcing space-specifier
is one for which this value is force.
Space-specifiers occurring in sequence may interact with each other. The constraint imposed by a sequence of space-specifiers is computed by calculating for each space-specifier its associated resolved space-specifier in accordance with their conditionality and precedence, as shown below in the space-resolution rules.
The constraint imposed on a distance by a sequence of resolved space-specifiers is additive; that is, the distance is constrained to be no less than the sum of the resolved minimum values and no larger than the sum of the resolved maximum values.
Recursively define that a space-specifier S' consecutively follows another space-specifier S (and S consecutively precedes S') if
S' is the space-before of a block-area, and S is the space-after of the area's preceding sibling, or
S' is the space-after of a block-area B, where B has zero padding-after and border-after and is not a reference-area, and S is the space-after of a block-area A which is not a line-area, and A is the final child of B, or
S' is the space-before of a block-area B, where B has zero padding-before and border-before and is not a reference-area, and S is the space-before of a block-area A which is not a line-area, and A is the initial child of B, or
S' is the space-start of an inline-area, and S is the space-end of the area's preceding sibling, or
S' is the space-end of an inline-area I, where I has zero padding-end and border-end values, and is not a reference-area, and S is the space-end of the final child of I, or
S' is the space-start of an inline-area I, where I has zero padding-start and border-start values and is not a reference-area, and S is the space-start of the initial child of I, or
S' consecutively follows a space-specifier S'', S'' consecutively follows S, and S'' has zero values for minimum, optimum, and maximum.
Space-resolution rules. To compute the resolved space-specifier of a given space-specifier S, consider the maximal sequence of space-specifiers containing S in which each is consecutively followed by the next. The resolved space-specifier of each of these space-specifiers is a non-conditional space-specifier computed in terms of this sequence.
1. If any of the space-specifiers is conditional, and is the space-before of an area which begins a reference-area, then it is suppressed, which means that its resolved space-specifier is zero. Further, conditional space-specifiers which consecutively follow a space-specifier suppressed in this way are also suppressed.
If a conditional space-specifier is the space-after of an area which ends a reference-area, then it is suppressed together with any other conditional space-specifiers which consecutively precede it.
2. If any of the remaining space-specifiers is forcing, all non-forcing space-specifiers are suppressed, and the value of each of the forcing space-specifiers is taken as its resolved value.
3. Alternatively if all of the remaining space-specifiers are non-forcing, then the resolved space-specifier is defined in terms of those space-specifiers whose precedence is highest, and among these those whose optimum value is the greatest. All other space-specifiers are suppressed. If there is only one of these then its value is taken as its resolved value.
Otherwise the resolved space-specifier of the last space-specifier in the sequence is derived from these spaces by taking their common optimum value as its optimum, the greatest of their minimum values as its minimum, and the least of their maximum values as its maximum, and all other space-specifiers are suppressed.
Example. Suppose the sequence of space values occurring at the beginning of a reference-area is: first, a space with value 10 points (that is, minimum, optimum, and maximum all equal to 10 points) and conditionality true; second, a space with value 4 points and conditionality false; and third, a space with value 5 points and conditionality true; all three spaces having precedence zero. Then the first (10 point) space is suppressed under rule 1, and the second (4 point) space is suppressed under rule 3. The resolved value of the third space is a non-conditional 5 points, even though it originally came from a conditional space.
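The space-resolution rules and the additive constraint can be sketched as follows, using the example above as input. The sketch makes several simplifying assumptions that are not part of this specification: the maximal sequence of consecutive space-specifiers is supplied ready-made as a list; the flags at_reference_start and at_reference_end stand in for the "begins" and "ends" tests of rule 1; and when several specifiers tie under rule 3, the merged value is attached to the last of the tied specifiers. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class SpaceSpecifier:
    minimum: float
    optimum: float
    maximum: float
    conditional: bool = False
    precedence: Union[int, str] = 0      # an integer or the token "force"

ZERO = (0.0, 0.0, 0.0)                   # resolved value of a suppressed space

def resolve(sequence, at_reference_start=False, at_reference_end=False):
    """Return one resolved (minimum, optimum, maximum) per specifier."""
    n = len(sequence)
    alive = [True] * n
    # Rule 1: suppress the run of conditional spaces at a reference-area edge.
    if at_reference_start:
        for i in range(n):
            if not sequence[i].conditional:
                break
            alive[i] = False
    if at_reference_end:
        for i in reversed(range(n)):
            if not sequence[i].conditional:
                break
            alive[i] = False
    resolved = [ZERO] * n
    remaining = [i for i in range(n) if alive[i]]
    if not remaining:
        return resolved
    # Rule 2: forcing spaces suppress all non-forcing ones.
    forcing = [i for i in remaining if sequence[i].precedence == "force"]
    if forcing:
        for i in forcing:
            s = sequence[i]
            resolved[i] = (s.minimum, s.optimum, s.maximum)
        return resolved
    # Rule 3: highest precedence, then greatest optimum.
    top = max(sequence[i].precedence for i in remaining)
    best = [i for i in remaining if sequence[i].precedence == top]
    optimum = max(sequence[i].optimum for i in best)
    best = [i for i in best if sequence[i].optimum == optimum]
    resolved[best[-1]] = (max(sequence[i].minimum for i in best),
                          optimum,
                          min(sequence[i].maximum for i in best))
    return resolved

def distance_bounds(resolved):
    """Additive constraint: (sum of minima, sum of maxima)."""
    return (sum(r[0] for r in resolved), sum(r[2] for r in resolved))

# The example from the text: 10pt conditional, 4pt non-conditional,
# 5pt conditional, all with precedence zero, at the start of a reference-area.
example = [SpaceSpecifier(10, 10, 10, conditional=True),
           SpaceSpecifier(4, 4, 4, conditional=False),
           SpaceSpecifier(5, 5, 5, conditional=True)]
result = resolve(example, at_reference_start=True)
```

For the example sequence, only the third space survives with a resolved value of 5 points, so the distance is constrained to exactly 5 points.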
Note that the padding of a block-area does not affect the resolved value of any space-specifier (except that by definition, the presence of padding at the before- or after-edge causes space-specifiers around it to be non-consecutive.)
The border or padding at the before-edge of a block-area may be specified as conditional. If so, then it is set to zero if its space-before is zero or conditional, and either its space-before consecutively follows a conditional space suppressed under rule 1 or the area begins a reference-area. In this case, the border or padding is taken to be zero for purposes of the definition of when a space-specifier consecutively follows another. Similarly the border or padding at the after-edge of a block-area may be specified as conditional. If so, then it is set to zero if its space-after is zero or conditional, and either its space-after consecutively precedes a conditional space suppressed under rule 1 or the area ends a reference-area.
Block-areas have several traits which typically affect the placement of their
children. The line-height is used in line placement calculations.
So is its nominal-glyph-height, which is the size (in the
block-progression-direction) of a glyph-area in the default font of the
generating formatting object;
this is the sum of its nominal-ascent and nominal-descent.
These three "nominal" traits depend only on
the default font and not on which glyphs (or fonts) actually occur among
descendants of the block-area.
The line-stacking-strategy trait controls what kind of allocation
is used for descendant line-areas and has an enumerated value
(either font-height, max-height,
or line-height). This is all rigorously described below.
All block-areas have these traits,
but they only have meaning for areas which have stacked block-area children.
The space-before and space-after determine the distance between the block-area and surrounding block-areas.
A block-area which is not a line-area typically has its size in the inline-progression-direction determined by its start-indent and end-indent and by the size of its nearest ancestor reference-area. A block-area which is not a line-area typically varies in the block-progression-direction to accommodate its descendants. Alternatively the generating formatting object may specify a height for the block-area.
Block-area children of an area are typically stacked in the block-progression-direction within their parent area, and this is the default method of positioning such block-areas. However, formatting objects are free to specify other methods of positioning child areas of areas which they generate.
If P is an area with block-area children, a
block-area descendant B of P is
stackable in P if the
out-of-sequence trait of B is
false and either
1. B is a child area of P, or
2. B is a child area of a block-area A, where A is a stackable descendant of P, A is not a reference-area, and A has the same block-progression-direction and inline-progression-direction as P.
Example. In the diagram, P is a
block-area, B1 is an
embedded block-area,
L1,...,L5
are line-areas, H is a block-area
which is an inline float (with
out-of-sequence trait
equal to true) and
B2 is a block-area
child of H, T is a reference-area
which is a block-area generated by an fo:table formatting
object, and
C11,...,C22
are block-areas which represent
cells of the table. In this case L1,
B1, L2,
L3, L4,
L5, and T are all stackable
in P. However,
H, B2, and
C11,
..., C22 are not stackable in
P, and
thus the constraints described below do not involve them.
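The definition of stackable can be read as a recursive predicate over the area tree. The following Python sketch is illustrative only: the Area class, its field names, and the tree shape used for the example are assumptions (the diagram is not reproduced here, and the table cells are attached directly to T for brevity), not part of this specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Area:
    """Hypothetical, minimal stand-in for a block-area."""
    name: str
    out_of_sequence: bool = False
    is_reference_area: bool = False
    # (block-progression-direction, inline-progression-direction)
    directions: tuple = ("tb", "lr")
    children: List["Area"] = field(default_factory=list)
    parent: Optional["Area"] = None

    def add(self, *kids: "Area") -> "Area":
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def stackable(b: Area, p: Area) -> bool:
    """True if b is stackable in p under conditions 1 and 2 above."""
    if b.out_of_sequence or b.parent is None:
        return False
    if b.parent is p:                       # condition 1
        return True
    a = b.parent                            # condition 2
    return (stackable(a, p)
            and not a.is_reference_area
            and a.directions == p.directions)


def all_areas(a: Area):
    yield a
    for child in a.children:
        yield from all_areas(child)


# Assumed tree shape for the example above.
P = Area("P")
B1 = Area("B1").add(Area("L2"), Area("L3"))
H = Area("H", out_of_sequence=True).add(Area("B2"))
T = Area("T", is_reference_area=True).add(Area("C11"), Area("C22"))
P.add(Area("L1"), B1, H, Area("L4"), Area("L5"), T)

stackable_names = sorted(a.name for a in all_areas(P) if stackable(a, P))
```

Under these assumptions H fails on its out-of-sequence trait, B2 fails because its parent H is not stackable, and the cells fail because T is a reference-area.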
For a parent area P whose children are block-areas, P is defined to be properly stacked if all of the following conditions hold:
Either P is a reference-area R, or P has an ancestor reference-area and P's closest ancestor reference-area R has the same block-progression-direction and inline-progression-direction as P.
For each stackable block-area which is a descendant of P, the following hold:
the before-edge and after-edge of its allocation-rectangle are parallel to the before-edge and after-edge of the content-rectangle of P,
the start-edge of its allocation-rectangle is parallel to the start-edge of the content-rectangle of R, and offset from it by a distance equal to the block-area's start-indent plus its start-intrusion-adjustment, minus its border-start, padding-start, and space-start values, and
the end-edge of its allocation-rectangle is parallel to the end-edge of the content-rectangle of R, and offset from it by a distance equal to the block-area's end-indent plus its end-intrusion-adjustment, minus its border-end, padding-end, and space-end values.
NOTE:The start-intrusion-adjustment and end-intrusion-adjustment are traits used to deal with intrusions from floats. The notion of indent is intended to apply to the content-rectangle, but the constraint is written in terms of the allocation-rectangle, because as noted earlier the edges of the content-rectangle may not correspond to like-named edges of the allocation-rectangle.
For each pair of stackable block-area descendants B and B' of P, if B precedes B' consecutively, then the distance from the after-edge of the allocation-rectangle of B to the before-edge of the allocation-rectangle of B' is consistent with the constraint imposed by the resolved values of the space-after traits of B and of any of its ancestors which precede B', and of the space-before traits of B' and of any of its ancestors which follow B.
NOTE:In both this clause and the next, several block-areas may have placement constraints relative to another area or to the content-rectangle of P, and together these determine the size of all block-areas, which may affect things like which background color has effect at a given point.
Example. In the diagram, if area
A
has a space-after value of 3 points, B a
space-before
of 1 point, and C a space-before of 2 points, all
with
precedence of force, and with zero border and padding,
then the constraints will place B's
allocation-rectangle
4 points below that of A, and C's
allocation-rectangle
6 points below that
of A. Thus the 4-point gap receives the
background color
from P, and the 2-point gap before C
receives the background color from B.
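The arithmetic of this example can be written out as follows. This is a sketch that assumes, as the example implies, that C is a child of B and that spaces with precedence force simply add; the variable names are hypothetical.

```python
# All values in points; precedence is "force" everywhere and border and
# padding are zero, so (by assumption) the resolved spaces are summed.
space_after_A = 3.0
space_before_B = 1.0
space_before_C = 2.0

# Distance from A's allocation-rectangle to B's allocation-rectangle.
offset_B = space_after_A + space_before_B

# Distance from A's allocation-rectangle to C's allocation-rectangle.
offset_C = offset_B + space_before_C

# The strip that lies inside B but before C; it shows B's background.
gap_inside_B = offset_C - offset_B
```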
The distance from the before-edge of the content-rectangle of P to the before-edge of the allocation-rectangle of any leading block-area descendant B of P is consistent with the constraint imposed by the resolved values of the space-before traits of B and of all the ancestors of B which are descendants of P.
Similarly, the distance from the after-edge of the allocation-rectangle of any trailing block-area descendant B' of P to the after-edge of the content-rectangle of P is consistent with the constraint imposed by the resolved values of the space-after traits of B' and of all the ancestors of B' which are descendants of P.
A line-area is a special type of block-area, and is generated by the same formatting object which generated its parent. Line-areas do not have borders and padding, i.e., border-before-width, padding-before-width, etc. are all zero.
The allocation-rectangle of a line is determined by the value of the
line-stacking-strategy trait: if the
value is font-height, the allocation-rectangle is
the nominal-requested-line-rectangle, defined below; if the value is
max-height, the allocation-rectangle is the
maximum-line-rectangle, defined below; and if
the value is
line-height, the allocation-rectangle is the
per-inline-height-rectangle, defined below.
The nominal-requested-line-rectangle for a line-area is the rectangle bounded in the inline-progression-direction by the content-rectangle of the parent block-area, as modified by typographic properties such as indents, and in the block-progression-direction by the nominal-glyph-height of the parent block-area. It has the same height for each line-area child of a block-area.
The maximum-line-rectangle for a line-area has the same length as the nominal-requested-line-rectangle in the inline-progression-direction. In the block-progression-direction, it is bounded by the maximum ascent and the maximum descent for the inline-area descendants stacked within the line-area, as raised and lowered by the shift-amount trait perpendicular to the inline-progression-direction, but may not be less than the nominal-glyph-height. Its height may vary depending on the descendants of the line-area.
Nominal and Maximum Line Rectangles
The per-inline-height-rectangle has the same length as the nominal-requested-line-rectangle in the inline-progression-direction. For each inline-area the half-leading is defined to be half the difference of its line-height minus its actual-height. The expanded-ascent of an inline-area is its ascent plus half-leading, and the expanded-descent is its descent plus half-leading. As in the definition of the maximum-line-rectangle, this is raised or lowered according to mandated adjustments perpendicular to the baseline. The per-inline-height-rectangle extends in the line-progression-direction from the maximum expanded-ascent to the maximum expanded-descent over all the inline-area descendants stacked within the line-area. Its height may vary depending on the descendants of the line-area.
NOTE:Using the nominal-requested-line-rectangle allows equal baseline-to-baseline spacing. Using the maximum-line-rectangle allows constant space between line-areas. Using the per-inline-height-rectangle and zero space-before and space-after allows CSS-style line box stacking.
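The three rectangle heights can be compared with a small Python sketch. It is illustrative only and rests on assumptions not stated in this draft: the Inline class and its field names are hypothetical, the actual-height of an inline-area is taken to be its ascent plus its descent, and a positive shift is taken to raise the area above the baseline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Inline:
    """Hypothetical inline-area; all values in points."""
    ascent: float
    descent: float
    line_height: float
    shift: float = 0.0  # assumed: positive raises the area


def nominal_height(nominal_ascent: float, nominal_descent: float) -> float:
    return nominal_ascent + nominal_descent


def maximum_height(inlines: List[Inline],
                   nominal_ascent: float, nominal_descent: float) -> float:
    up = max((i.ascent + i.shift for i in inlines), default=0.0)
    down = max((i.descent - i.shift for i in inlines), default=0.0)
    # May not be less than the nominal-glyph-height.
    return max(up + down, nominal_height(nominal_ascent, nominal_descent))


def per_inline_height(inlines: List[Inline]) -> float:
    def half_leading(i: Inline) -> float:
        return (i.line_height - (i.ascent + i.descent)) / 2

    up = max((i.ascent + half_leading(i) + i.shift for i in inlines),
             default=0.0)
    down = max((i.descent + half_leading(i) - i.shift for i in inlines),
               default=0.0)
    return up + down


def allocation_height(strategy: str, inlines: List[Inline],
                      nominal_ascent: float, nominal_descent: float) -> float:
    if strategy == "font-height":
        return nominal_height(nominal_ascent, nominal_descent)
    if strategy == "max-height":
        return maximum_height(inlines, nominal_ascent, nominal_descent)
    if strategy == "line-height":
        return per_inline_height(inlines)
    raise ValueError(f"unknown line-stacking-strategy: {strategy}")


line = [Inline(ascent=8, descent=2, line_height=12),
        Inline(ascent=12, descent=3, line_height=18)]
heights = {s: allocation_height(s, line, 8, 2)
           for s in ("font-height", "max-height", "line-height")}
```

For this line the font-height strategy ignores the larger inline-area, max-height grows to fit its glyph extent, and line-height grows further to include its half-leading.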
Inline-areas are stacked within a line-area relative to a baseline-start-point which is a point on the start-edge of its content-rectangle, separated from the before-edge of the nominal-requested-line-rectangle by a distance equal to the nominal-ascent.
An inline-area has its own line-height trait, which may be
different from the line-height of its containing block-area. This may affect the
placement of its ancestor line-area when the line-stacking-strategy
is line-height.
An inline-area has a font-ascent and a font-descent trait for the
font associated with the generating formatting object.
An inline-area may or may not have child areas, and if so it may or may not be a reference-area. The content-rectangle for an inline-area without children has a specified size in both dimensions. An inline-area with children has a content-rectangle which is the minimum rectangle (with sides parallel to those of the content-rectangle of its parent area) which includes the allocation-rectangles of all of its children, and which extends in the block-progression-direction by at least the font-descent from its position-point, and in the opposite direction by at least the font-ascent from its position-point.
Examples of inline-areas with children might include portions of inline mathematical expressions or areas arising from mixed writing systems (left-to-right within right-to-left, for example).
Inline-area children of an area are typically stacked in the inline-progression-direction within their parent area, and this is the default method of positioning such inline-areas.
If P is an area with inline-area children, an inline-area
descendant I of P is
stackable in P if the
out-of-sequence trait of I is false
and either
1. I is a child area of P, or
2. I is a child area of an inline-area J, where J is a stackable descendant of P, J is not a reference-area, and J has the same block-progression-direction and inline-progression-direction as P.
A stackable inline-area is leading if there is no inline-area descendant of P which precedes it, and is trailing if there is no inline-area descendant of P which follows it.
Inline-areas are stacked relative to a baseline, defined as follows:
1. If P is a line-area, the baseline of P is defined to be the line through the baseline-start-point which is parallel to the inline-progression-direction;
2. If P is an inline-area, the baseline of P is defined to be the line through the position-point of P which is parallel to the inline-progression-direction.
For a parent area P whose children are inline-areas, P is defined to be properly stacked if all of the following conditions hold:
1. For each stackable inline-area descendant I of P, the before-edge and after-edge of the allocation-rectangle of I are parallel to the before-edge and after-edge of the content-rectangle of P, and the start-edge and end-edge of the content-rectangle of I are parallel to those of P.
2. For each pair of stackable inline-area descendants I and I' of P, if I precedes I' consecutively, then the distance from the end-edge of the allocation-rectangle of I to the start-edge of the allocation-rectangle of I' is consistent with the constraint imposed by the resolved values of the space-after traits of I and of any of its ancestors which precede I', and of the space-before traits of I' and of any of its ancestors which follow I.
Issue (mixed-writing-direction-line-area-2):
This item (2.) has not yet been updated to account for mixed writing directions within a single line-area.
3. The distance from the start-edge of the content-rectangle of P to the start-edge of the allocation-rectangle of any leading inline-area descendant I of P is consistent with the constraint imposed by the resolved values of the space-start traits of I and of all the ancestors of I which are descendants of P.
Similarly, the distance from the end-edge of the allocation-rectangle of any trailing inline-area descendant I' of P to the end-edge of the content-rectangle of P is consistent with the constraint imposed by the resolved values of the space-end traits of I' and of all the ancestors of I' which are descendants of P.
4. For any stackable inline-area descendant I of P, the distance in the shift-direction from the baseline of P to the position-point of I equals the value of the baseline-shift trait of I. The baseline-shift trait is calculated from the baseline-shift property plus an amount computed to compensate for mixed writing systems with different nominal glyph baselines.
Issue (mixed-writing-direction-line-area-4):
This item (4.) has not yet been updated to reflect the notion of different baselines for mixed writing systems within a single line-area.
The most common inline-area is a glyph-area, which contains the representation for a character in a particular font.
A glyph-area has font-family, font-size, and font-weight traits, which apply to its character data.
The position-point and escapement-point of a glyph-area are assigned according to the writing-system in use (e.g., the glyph baseline in European languages), and are used to control placement of inline-area descendants of a line-area. The formatter may generate inline-areas with different inline-progression-directions from their parent to accommodate correct inline-area stacking in the case of mixed-language formatting.
A glyph-area has no children, and its ascent and descent should depend only on the font-name, font-size, and font-weight properties of its generating formatting object, not on the individual glyph rendered.
Issue (Mapping-CSS-Box):
Include mapping to CSS box so that CSS properties can be read and interpreted for XSL.
Issue (refinement1):
This section is incomplete and may be inconsistent with other parts of this working draft.
The semantics of formatting is divided into a set of simple steps to simplify the explanation and to highlight what interactions (or, perhaps more importantly, lack of interactions) exist between formatting objects and the properties on them. Although they are described as steps, this is solely for the convenience of exposition and does not imply they must be implemented as separate steps in any conforming implementation. A conforming implementation must only achieve the same effect. Some of the relevant steps are, for each formatting object in the result tree, determining (computed) values for all the properties applicable to that formatting object and then using the formatting model to describe how the values of these properties constrain the distribution of the content of these formatting objects into areas and the resolution of the spacing adjustments among the various areas.
The step of determining the computed values of the relevant properties is described below. The process of distribution and space resolution is described and/or constrained by the descriptions of the formatting properties and the formatting objects which use them.
For every property that is applicable to a given formatting object, it is necessary to determine the value of the property. Three variants of the property value are distinguished: the specified value, the computed value, and the actual value. The "specified value" is one that is placed on the formatting object during the tree-construction process. A specified value may not be in a form that is directly usable; for example, it may be a percentage that must be converted into an absolute value. A value resulting from such a conversion is called the "computed value". Finally, the computed value may not be realizable on the output media and may need to be adjusted prior to use in rendering. For example, a line width may be adjusted to become an integral number of output medium pixels. This adjusted value is the "actual value."
The specified value of a property is determined using the following mechanisms (in order of precedence):
If the tree-construction process placed the property on the formatting object, use the value of that property as the specified value. This is called "explicit specification".
Otherwise, if the property is inheritable, use the value of that property from the parent formatting object, generally the computed value (see below).
Otherwise use the property's initial value, if it has one. The initial value of each property is indicated in the property's definition. If there is no initial value, that property is not specified on the formatting object. (A property will only be "not specified" if there are corresponding properties that can provide equivalent information.)
Since it has no parent, the root of the result tree cannot use values from its parent formatting object; in this case, the initial value is used if necessary.
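The order of precedence above can be sketched in Python. The FO class, the PROPERTIES table and its two entries are hypothetical and serve only to illustrate the lookup order; computed() is a placeholder that does not resolve relative values.

```python
from typing import Dict, Optional, Tuple

# Hypothetical property table: name -> (inheritable, initial value or None).
PROPERTIES: Dict[str, Tuple[bool, Optional[str]]] = {
    "color": (True, "black"),
    "padding-before": (False, "0pt"),
}


class FO:
    """Hypothetical formatting object holding explicitly specified values."""

    def __init__(self, explicit: Dict[str, str],
                 parent: Optional["FO"] = None) -> None:
        self.explicit = explicit
        self.parent = parent

    def specified(self, name: str) -> Optional[str]:
        inheritable, initial = PROPERTIES[name]
        if name in self.explicit:                    # explicit specification
            return self.explicit[name]
        if inheritable and self.parent is not None:  # inheritance
            return self.parent.computed(name)
        return initial                               # initial value, may be None

    def computed(self, name: str) -> Optional[str]:
        # Placeholder: a real formatter would resolve relative values here.
        return self.specified(name)


root = FO({"color": "red", "padding-before": "2pt"})
child = FO({}, parent=root)
```

Here child.specified("color") is taken from the parent because the property is inheritable, while child.specified("padding-before") falls back to the initial value because it is not.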
Specified values may be absolute (i.e., they are not specified relative to another value, as in "red" or "2mm") or relative (i.e., they are specified relative to another value, as in "auto", "2em", and "12%"). For most absolute values, no computation is needed to find the computed value. Relative values, on the other hand, must be transformed into computed values: percentages must be multiplied by a reference value (each property defines which value that is), values with relative units (em, ex, px) must be made absolute by multiplying with the appropriate font or pixel size, "auto" values must be computed by the formulas given with each property, certain property values ("smaller", "bolder") must be replaced according to their definitions.
Some properties have more than one way in which the property value can be specified. The simplest example of such properties are those which can be specified either in terms of a direction relative to the writing-mode (e.g., padding-before) or a direction in terms of the absolute geometric orientation of the viewport (e.g., padding-top). These two properties are called the relative property and the absolute property, respectively. Collectively, they are called "corresponding properties."
Specifying a value for one property determines both a computed value for the specified property and a computed value for the corresponding property. Which relative property corresponds to which absolute property depends on the writing-mode. For example, if the "writing-mode" at the top level of a document is "lr-tb", then "padding-start" corresponds to "padding-left", but if the "writing-mode" is "rl-tb", then "padding-start" corresponds to "padding-right". The exact specification of how to compute the values of corresponding properties is given in the section on Computing the values of Corresponding Properties, below.
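The dependence on writing-mode can be pictured as a lookup table. The Python sketch below is an illustration under stated assumptions: it considers only the top level of a document (no reference-orientation), handles only property names that end in the direction, and the SIDES table and absolute_name function are hypothetical names. The "tb-rl" row reflects the usual reading of that writing-mode and should be treated as an assumption.

```python
# Relative direction -> absolute side, per writing-mode.
SIDES = {
    "lr-tb": {"before": "top", "after": "bottom",
              "start": "left", "end": "right"},
    "rl-tb": {"before": "top", "after": "bottom",
              "start": "right", "end": "left"},
    "tb-rl": {"before": "right", "after": "left",
              "start": "top", "end": "bottom"},
}


def absolute_name(relative: str, writing_mode: str) -> str:
    """Map e.g. 'padding-start' to 'padding-left' for a writing-mode."""
    prefix, _, direction = relative.rpartition("-")
    return f"{prefix}-{SIDES[writing_mode][direction]}"
```

For instance, absolute_name("padding-start", "lr-tb") yields "padding-left" and absolute_name("padding-start", "rl-tb") yields "padding-right", matching the example in the text.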
In most cases, elements inherit computed values. However, there are some properties whose specified value may be inherited (e.g., the number value for the "line-height" property). In the cases where child elements do not inherit the computed value, this is described in the property definition.
A computed value is in principle ready to be used, but a user agent may not be able to make use of the value in a given environment. For example, a user agent may only be able to render borders with integer pixel widths and may, therefore, have to adjust the computed width to an integral number of media pixels. The actual value is the computed value after any such adjustments have been applied.
Issue (refinement2):
This section is incomplete and may be inconsistent with other parts of this working draft.
Some of the properties applicable to formatting objects are "inheritable." Such properties are so identified in the property description. The inheritable properties can be placed on any formatting object. The inheritable properties are propagated down the result tree from a parent to each child. (These properties are given their initial value at the root of the result tree.) For a given inheritable property, if that property is present on a child, then that value of the property is used for that child (and its descendants until explicitly re-set in a lower descendant); otherwise, the specified value of that property on the child is the computed value of that property on the parent formatting object. Hence there is always a specified value defined for every inheritable property for every formatting object.
Issue (refinement3):
This section is incomplete and may be inconsistent with other parts of this working draft.
Where there are corresponding properties, such as "padding-left" and "padding-start", a computed value is determined for all the corresponding properties. How the computed values are determined for a given formatting object is dependent on which of the corresponding properties are specified.
The simplest class of corresponding properties are those for which there are only two variants in the correspondence, an absolute property and a relative property, and the property names differ only in the choice of absolute or relative designation; for example, "border-left-color" and "border-start-color".
Issue (correspondance):
Need to add description of the correspondence because it involves both writing-mode and reference-orientation of all enclosing reference-areas.
For this class, the computed values of the corresponding properties are determined as follows. If the corresponding absolute variant of the property is specified on the formatting object, its computed value is used to set the computed value of the corresponding relative property. If the corresponding absolute property is not explicitly specified, then the computed value of the absolute property is set to the computed value of the relative property of the same name.
Note that if neither the absolute nor the relative property is explicitly specified, then the rules for determining the specified value will use either inheritance, if that is defined for the property, or the initial value. The initial value must be the same for all possible corresponding properties. If both an absolute and a corresponding relative property are explicitly specified, then the above rule gives precedence to the absolute property, and the specified value of the corresponding relative property is ignored in determining the computed value of the corresponding properties.
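The precedence rule for this class of corresponding properties can be sketched as follows. The function name resolve_pair and the use of a single fallback argument for the inherited or initial value are assumptions made for illustration.

```python
def resolve_pair(explicit: dict, absolute: str, relative: str, fallback):
    """Return the computed value shared by a corresponding pair.

    `explicit` holds the explicitly specified values on the formatting
    object; `fallback` stands for the inherited or initial value, which
    is the same for both names.
    """
    if absolute in explicit:        # the absolute property takes precedence
        value = explicit[absolute]
    elif relative in explicit:      # otherwise the relative one is used
        value = explicit[relative]
    else:                           # neither was explicitly specified
        value = fallback
    return {absolute: value, relative: value}


both = resolve_pair({"border-left-color": "red",
                     "border-start-color": "blue"},
                    "border-left-color", "border-start-color", "black")
only_relative = resolve_pair({"border-start-color": "blue"},
                             "border-left-color", "border-start-color",
                             "black")
neither = resolve_pair({}, "border-left-color", "border-start-color",
                       "black")
```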
The (corresponding) properties that use the above rule to determine their computed value are:
Border-after-color
Border-before-color
Border-end-color
Border-start-color
Border-after-style
Border-before-style
Border-end-style
Border-start-style
Border-after-width
Border-before-width
Border-end-width
Border-start-width
Padding-after
Padding-before
Padding-end
Padding-start
The "space-before", "space-after", "space-start", and "space-end" properties are handled very similarly to the properties immediately above, but the corresponding absolute properties are in the set: "margin-top", "margin-bottom", "margin-left", and "margin-right". Again the correspondence between the relative property and the corresponding absolute property is determined by the "writing-mode" property of the formatting object. For example, at the top level of a document, if the "writing-mode" is "lr-tb", then "space-before" corresponds to "margin-top", but if the "writing-mode" is "tb-rl", then "space-before" corresponds to "margin-right".
There are two more properties, "end-indent" and "start-indent", for which the computed value may be determined by the computed value of the absolute margin properties. For these traits, the calculation of the value of the trait when the corresponding absolute property is present depends on three computed values: the computed value of the corresponding absolute property, the computed value of the corresponding "padding" property, and the computed value of the corresponding "border-width" property.
Here the term "corresponding" has been broadened to mean that if "margin-left" is the corresponding absolute property to "start-indent", then "padding-left" (and "padding-start") and "border-left-width" (and "border-start-width") are the "corresponding" "padding" and "border-width" properties.
The formulae for calculating the computed value of the x-indent properties are as follows (where "margin-corresponding" is a place-holder for the corresponding absolute "margin" property):
End-indent = margin-corresponding + padding-end + border-end-width
Start-indent = margin-corresponding + padding-start + border-start-width
If an absolute "margin" property is not explicitly specified, these equations determine a computed value for the corresponding "margin" property given values for the three traits corresponding-indent, padding-corresponding and border-corresponding width.
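Both directions of the formulae can be written as a pair of small functions. This is a sketch with hypothetical function names; all arguments are computed lengths in the same unit.

```python
def indent_from_margin(margin: float, padding: float,
                       border_width: float) -> float:
    """x-indent = margin-corresponding + padding-x + border-x-width."""
    return margin + padding + border_width


def margin_from_indent(indent: float, padding: float,
                       border_width: float) -> float:
    """Inverse form, used when the absolute margin is not specified."""
    return indent - padding - border_width


# A margin of 10pt with 2pt padding and a 1pt border ...
start_indent = indent_from_margin(10.0, 2.0, 1.0)
# ... and the margin recovered from that indent.
margin = margin_from_indent(start_indent, 2.0, 1.0)
```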
All property value specifications in attributes within an XSL stylesheet can be expressions. These expressions represent the value of the property specified. The expression is first evaluated and then the resultant value is used to determine the value of the property.
Properties are evaluated against a property-specific context. This context provides:
A list of allowed resultant types for a property value.
Conversions from resultant expression value types to an allowed type for the property.
The current font-size value.
Conversions from relative numerics by type to absolute numerics within additive expressions.
NOTE: It is not necessary that a conversion is provided for all types. If no conversion is specified, it is an error.
When a type instance (e.g., a string, a keyword, a numeric, etc.) is recognized in the expression it is evaluated against the property context. This provides the ability for specific values to be converted with the property context's specific algorithms or conversions for use in the evaluation of the expression as a whole.
For example, the "auto" enumeration token for certain properties is a calculated value. Such a token would be converted into a specific type instance via an algorithm specified in the property definition. In such a case the resulting value might be an absolute length specifying the width of some aspect of the formatting object.
In addition, this allows certain types like relative numerics to be resolved into absolute numerics prior to mathematical operations.
All property contexts allow conversions as specified in [5.4.12 Expression Value Conversions].
When a set of properties is being evaluated for a specific formatting object element in the formatting object element tree there is a specific order in which properties must be evaluated. Essentially, the font-size property must be evaluated first before all other properties. Once the font-size property has been evaluated, all other properties may be evaluated in any order.
When the "font-size" property is evaluated, the current font-size for use in evaluation is the font-size of the formatting object element's parent. Once the "font-size" property has been evaluated, that value is used as the current font-size for all property contexts of all properties value expressions being further evaluated.
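The evaluation order can be sketched as below. The value syntax (plain numbers in points, or strings ending in "em") is a deliberate simplification, and the function name evaluate_properties is hypothetical.

```python
def evaluate_properties(specified: dict, parent_font_size: float) -> dict:
    """Evaluate font-size first, then every other property."""

    def resolve(value, current_font_size: float) -> float:
        # Simplified value syntax: "<number>em" or a plain number of points.
        if isinstance(value, str) and value.endswith("em"):
            return float(value[:-2]) * current_font_size
        return float(value)

    # While font-size itself is evaluated, the current font-size is the
    # parent's font-size.
    font_size = resolve(specified.get("font-size", parent_font_size),
                        parent_font_size)
    computed = {"font-size": font_size}

    # Every other property sees the freshly evaluated font-size; the order
    # among these properties does not matter.
    for name, value in specified.items():
        if name != "font-size":
            computed[name] = resolve(value, font_size)
    return computed


result = evaluate_properties(
    {"padding-before": "0.5em", "font-size": "2em"}, parent_font_size=10.0)
```

With a parent font-size of 10 points, "2em" on font-size evaluates to 20 points, and "0.5em" on padding-before is then measured against that new value, giving 10 points.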
[1]  Expr          ::=  AdditiveExpr
[2]  PrimaryExpr   ::=  '(' Expr ')'
                      | Numeric
                      | Literal
                      | Color
                      | Keyword
                      | EnumerationToken
                      | FunctionCall
[3]  FunctionCall  ::=  FunctionName '(' ( Argument ( ',' Argument)*)? ')'
[4]  Argument      ::=  Expr
A numeric represents all the types of numbers in an XSL expression. Some of these numbers are absolute values. Others are relative to some other set of values. All of these values use a floating-point number to represent the number-part of their definition.
A floating-point number can have any double-precision 64-bit format IEEE 754 value [IEEE 754]. These include a special "Not-a-Number" (NaN) value, positive and negative infinity, and positive and negative zero. See Section 4.2.3 of [JLS] for a summary of the key rules of the IEEE 754 standard.
[5]  Numeric          ::=  AbsoluteNumeric
                         | RelativeNumeric
[6]  AbsoluteNumeric  ::=  AbsoluteLength
[7]  AbsoluteLength   ::=  Number AbsoluteUnitName?
[8]  RelativeNumeric  ::=  Percent
                         | RelativeLength
[9]  Percent          ::=  Number '%'
[10] RelativeLength   ::=  Number RelativeUnitName
The following operators may be used with numerics:
+
Performs addition.
-
Performs subtraction or negation.
*
Performs multiplication.
div
Performs floating-point division according to IEEE 754.
mod
Returns the remainder from a truncating division.
NOTE: Since XML allows - in names, the - operator (when not used as a UnaryExpr negation) typically needs to be preceded by whitespace. For example, the expression 10pt - 2pt means subtract 2 points from 10 points. The expression 10pt-2pt means a length value of 10 with a unit of "pt-2pt".
NOTE: The following are examples of the mod operator:
5 mod 2 returns 1
5 mod -2 returns 1
-5 mod 2 returns -1
-5 mod -2 returns -1
NOTE: The mod operator is the same as the % operator in Java and ECMAScript and is not the same as the IEEE remainder operation, which returns the remainder from a rounding division.
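The difference between the three kinds of remainder can be checked in Python, whose own % operator uses floored division and therefore does not match mod. In this sketch math.fmod stands in for the truncating remainder and math.remainder for the IEEE operation; the function name xsl_mod is hypothetical.

```python
import math


def xsl_mod(a: float, b: float) -> float:
    """Remainder from a truncating division; takes the sign of `a`."""
    return math.fmod(a, b)


examples = [(5, 2), (5, -2), (-5, 2), (-5, -2)]
truncating = [xsl_mod(a, b) for a, b in examples]

# Python's % is a floored remainder and takes the sign of the divisor.
floored = [a % b for a, b in examples]

# The IEEE remainder rounds the quotient to the nearest integer instead.
ieee_5_3 = math.remainder(5, 3)    # quotient 1.67 rounds to 2, so 5 - 6
trunc_5_3 = xsl_mod(5, 3)          # quotient truncates to 1, so 5 - 3
```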
[11] AdditiveExpr        ::=  MultiplicativeExpr
                            | AdditiveExpr '+' MultiplicativeExpr
                            | AdditiveExpr '-' MultiplicativeExpr
[12] MultiplicativeExpr  ::=  UnaryExpr
                            | MultiplicativeExpr MultiplyOperator UnaryExpr
                            | MultiplicativeExpr 'div' UnaryExpr
                            | MultiplicativeExpr 'mod' UnaryExpr
[13] UnaryExpr           ::=  PrimaryExpr
                            | '-' UnaryExpr
NOTE: The effect of this grammar is that the order of precedence is (lowest precedence first):
+, -
*, div, mod
and the operators are all left associative. For example, 2*3 + 4 div 5 is equivalent to (2*3) + (4 div 5).
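The precedence and associativity that follow from productions [11] to [13] can be demonstrated with a small recursive-descent evaluator. This Python sketch is a simplification under stated assumptions: it handles unitless numbers and parentheses only, its tokenizer is hypothetical and does not implement the tokenization rules of this draft, and division by zero raises a Python exception rather than producing an IEEE 754 infinity or NaN.

```python
import math
import re

TOKEN = re.compile(r"\s*(\d+\.?\d*|\.\d+|div|mod|[-+*()])")


def tokenize(text: str) -> list:
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text: str) -> float:
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def additive():                      # production [11]
        value = multiplicative()
        while peek() in ("+", "-"):      # the loop gives left associativity
            if take() == "+":
                value += multiplicative()
            else:
                value -= multiplicative()
        return value

    def multiplicative():                # production [12]
        value = unary()
        while peek() in ("*", "div", "mod"):
            op = take()
            rhs = unary()
            if op == "*":
                value *= rhs
            elif op == "div":
                value /= rhs             # raises on zero, unlike IEEE 754
            else:
                value = math.fmod(value, rhs)
        return value

    def unary():                         # production [13]
        if peek() == "-":
            take()
            return -unary()
        return primary()

    def primary():                       # production [2], numbers only
        token = take()
        if token == "(":
            value = additive()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        return float(token)

    result = additive()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result
```

Calling evaluate("2*3 + 4 div 5") and evaluate("(2*3) + (4 div 5)") gives the same value, and evaluate("10 - 2 - 3") groups as (10 - 2) - 3.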
If a non-numeric value is used in an AdditiveExpr and there is no property context conversion from that type into an absolute numeric value, the expression is invalid and considered an error.
An absolute numeric is an absolute length which is a pair consisting of a Number and a UnitName raised to a power. When an absolute length is written without a unit, the unit power is assumed to be zero. Hence, all floating point numbers are a length with a power of zero.
Each unit name has associated with it an internal ratio to some common internal unit of measure (e.g., a meter). When a value is written in a property expression, it is first converted to the internal unit of measure and then mathematical operations are performed.
In addition, only the mod, addition, and subtraction operators require that the numerics on either side of the operation be absolute numerics of the same unit power. For other operations, the unit powers may be different and the result should be mathematically consistent as with the handling of powers in algebra.
A property definition may constrain an absolute length to a particular power. For example, when specifying font-size, the value is expected to be of power "one". That is, it is expected to have a single powered unit specified (e.g., 10pt).
When the final value of a property is calculated, the resulting power of the absolute numeric must be either zero or one. If any other power results, the value is an error.
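The handling of unit powers can be sketched with a small value class. Everything here is an assumption made for illustration: the Length class, the choice of the point as the internal unit of measure, and the conversion ratios in TO_POINTS (the customary 72 points to the inch).

```python
from dataclasses import dataclass

# Assumed ratios to a hypothetical internal unit, the point.
TO_POINTS = {"pt": 1.0, "in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}


@dataclass(frozen=True)
class Length:
    value: float    # magnitude in internal units
    power: int = 0  # 0 for plain numbers, 1 for ordinary lengths

    @staticmethod
    def parse(number: float, unit: str = "") -> "Length":
        if not unit:
            return Length(float(number), 0)
        return Length(number * TO_POINTS[unit], 1)

    def __add__(self, other: "Length") -> "Length":
        if self.power != other.power:
            raise ValueError("addition requires equal unit powers")
        return Length(self.value + other.value, self.power)

    def __sub__(self, other: "Length") -> "Length":
        if self.power != other.power:
            raise ValueError("subtraction requires equal unit powers")
        return Length(self.value - other.value, self.power)

    def __mul__(self, other: "Length") -> "Length":
        # Powers add under multiplication, as in algebra.
        return Length(self.value * other.value, self.power + other.power)

    def __truediv__(self, other: "Length") -> "Length":
        # Powers subtract under division.
        return Length(self.value / other.value, self.power - other.power)


def final_value(x: Length) -> Length:
    """Check the constraint on the final value of a property."""
    if x.power not in (0, 1):
        raise ValueError("final unit power must be zero or one")
    return x


total = Length.parse(1, "in") + Length.parse(36, "pt")
scaled = Length.parse(2) * Length.parse(10, "pt")
ratio = Length.parse(10, "pt") / Length.parse(2, "pt")
area = Length.parse(10, "pt") * Length.parse(10, "pt")
```

In this sketch total and scaled are ordinary lengths of power one, ratio is a plain number of power zero, and area has power two, so final_value(area) raises an error.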
Relative lengths are values that are calculated relative to some other set of values. When written as part of an expression, they are either converted via the property context into an absolute numeric or passed verbatim as the property value.
It is an error if the property context has no available conversion for the relative numeric and a conversion is required for expression evaluation (e.g., within an add operation).
Percentages are values that are counted in 1/100 units. That is, 10%
as a percentage value is 0.10 as a floating point number.
When converting to an absolute numeric, the percentage is defined in the
property definition as being a percentage of some known
property value.
For example, a value of "110%" on a "font-size" property would be evaluated to mean 1.1 times the current font size. Such a definition of the allowed conversion for percentages is specified on the property definition. If no conversion is specified, the resulting value is a percentage.
A relative length is a unit-based value that is measured against the
current value of the font-size property.
There is only one relative unit of measure, the "em". The definition of "1em" is equal to the current font size. For example, a value of "1.25em" is 1.25 times the current font size.
When an em measurement is used in an expression, it is converted according to the font-size value of the current property's context. The result of the expression is an absolute length. See [7.6.3 "font-size"]
Strings are represented either as literals or as enumeration tokens. All property contexts allow conversion from enumeration tokens to strings. See [5.4.12 Expression Value Conversions].
A color is a set of values used to identify a particular color from a color space. Currently, only RGB colors are supported by this draft.
Issue (color-space):
Change this section when/if "color-profile" and "rendering-intent" properties are incorporated into XSL.
RGB colors are directly represented in the expression language using a hexadecimal notation. They can also be accessed through the system-color function or through conversion from an EnumerationToken via the property context.
Keywords are special tokens in the grammar that provide access to calculated values or other property values. The allowed keywords are defined in the following subsections.
The property takes the same computed value as the property for the formatting object's parent object.
When processing an expression, whitespace (ExprWhitespace) may be allowed before or after any expression token even though it is not explicitly defined as such in the grammar. In some cases, whitespace is necessary to make tokens in the grammar lexically distinct. Essentially, whitespace should be treated as if it does not exist after tokenization of the expression has occurred.
The following special tokenization rules must be applied in the order specified to disambiguate the grammar: