Copyright ©2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document presents a set of CSS text formatting properties. In addition to what was already existing in CSS 2 [CSS2], many new properties are addressing basic requirements in international context (mostly East Asian and Bidirectional). However, their usage is not limited to those instances.
This document is a working draft of the CSS working group which is part of the Style activity. It contains a proposal for features to be included in CSS level 3.
This document has been produced as a combined effort of the W3C Internationalization Activity, and the Style Activity. It also includes extensive contribution made by members of the XSL Working Group (members only). Finally, some of the proposal surfaced first in the Scalable Vector Graphics (SVG) 1.0 Specification [SVG1.0]. The text has been duplicated in this document to reflect which properties and specification should be eventually referenced in CSS itself.
The previous title of this draft was "International Layout."
Feedback is very much welcomed. Comments can be sent directly to the editor, but the mailing list www-style@w3.org (see instructions) is also open and is preferred for discussion of this and other drafts in the Style area.
This working draft may be updated, replaced or rendered obsolete by other W3C documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". Its publication does not imply endorsement by the W3C membership or the CSS & FP Working Group (members only).
To find the latest version of this working draft, please follow the "Latest version" link above, or visit the list of W3C Technical Reports.
This CSS3 module depends on the following other CSS3 modules:
It has non-normative (informative) references to the following other CSS3 modules:
In both CSS1 and CSS2, text formatting has been limited to simple effects like for example: text decoration, text alignment and character spacing. However, International typography contains types of formatting that could not be achieved without using special workarounds or graphics.
Along with already existing text related properties, this document presents a number of new CSS properties to represent such formatting. For example, the features this proposal covers include two of the most important features for East Asian typography: vertical text and layout grid.
There is a number of illustrations in this document for which the following legend is used:
- wide-cell glyph (e.g. Han)
which is the n-th character in the text run,
- narrow-cell glyph (e.g. Roman)
which is the n-th glyph in the text run,
- connected glyph (e.g. Arabic)
which is the n-th glyph in the text run.
Many typographical properties in East Asian typography depends on the fact that a character is typically rendered as either a wide or narrow character. All characters described by the Unicode Standard can be categorized by a width property. This is covered by a Unicode Technical report (TR#11) available from the Unicode Web site.
The orientation which the above symbols assume in the diagrams corresponds to the orientation that the glyphs they represent are intended to assume when rendered in the UA. Spacing between these characters in the diagrams is usually symbolic, unless intentionally changed to make a point.
This section describes the text layout features supported by CSS, which includes support for various international writing directions, such as left-to-right (e.g., Roman scripts), right-to-left (e.g., Hebrew or Arabic), bidirectional (e.g., mixing Roman with Arabic) and vertical (e.g., Asian scripts).
The 'writing-mode' property determines an inline progression and a line to line progression, also called block progression. For example, Roman scripts are typically written left to right and top to bottom. The glyph orientation determines the orientation of the rendered visual shape of characters relative to the primary text advance direction.
Within a line, the adjustment to the current text position is based on the current glyph orientation relative to the text advance direction, the metrics of the glyph just rendered, kerning tables in the font and the current values of various attributes and properties, such as the spacing properties.
Bi-directionality introduces another level of complexity in text layout, as in many combinations of 'writing-mode' and glyph orientation values the proper directionality of text will be determined by an algorithm. The Unicode standard ([UNICODE], section 3.12) defines such an algorithm consisting of an implicit part based on character properties, as well as explicit controls for embeddings and overrides. It is also possible to override the inherent directionality of the content characters by using of combination of the 'writing-mode' and 'unicode-bidi' properties.
CSS3 relies on this algorithm to achieve proper text bidirectional rendering. However reordering of characters only occurs for specific values of the glyph orientation properties. See their description for the exact conditions.
CSS2 specified the 'direction' property which is a subset of the 'writing-mode' property as it only determines an inline progression. The 'direction' property may still be used when no line to line progression change is desired.
The HTML 4.0 specification ([HTML40], section 8.2) defines bi-directionality behavior for HTML elements. Conforming HTML user agents may therefore ignore the 'direction' and 'unicode-bidi' properties in author and user style sheets. The style sheet rules that would achieve the bidi behavior specified in [HTML40] are given in the sample style sheet. The HTML 4.0 specification also contains more information on bidirectionality issues. Note that HTML 4.0 does not cover the more general case described by the 'writing-mode' property.
Finally, this module uses extensively the 'before', 'after', 'start' and 'end' notation to specify the four edge of a box relative to its text advance direction, independently of its absolute positioning in terms of 'top', 'bottom', 'left' and 'right' (corresponding respectively to the 'before', 'after', 'start' and 'end' positions in a typical Western text layout). This notation is also used extensively in [XSL] for the same purpose.
The 'writing-mode' property specifies whether the primary text advance direction shall be left-to-right, right-to-left, or top-to-bottom. (Note that even when the primary text advance direction if left-to-right or right-to-left, some or all of the content within a given element might advance in the opposite direction because of the Unicode [UNICODE] bidirectional algorithm or because of explicit text advance overrides due to this property or 'direction' and 'unicode-bidi'. This property also changes the 'direction' property for the element. For more on bidirectional text, see the section about Embedding and override.
| Value: | lr-tb | rl-tb | tb-rl | tb-lr | bt-rl | bt-lr | lr | rl | tb | inherit |
| Initial: | lr-tb |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
The combination of primary text advance direction and line progression direction set by the writing-mode property is also referred as a flow orientation. In such contexts, the values: lr-tb, lr, rl-tb and rl correspond to horizontal flow orientations, and the others (tb-rl, tb, tb-lr, bt-rl, bt-lr) correspond to vertical flow orientations.
For horizontal flow orientations, the top and bottom margins can be collapsed. For vertical flow orientations, the left and right margin can be collapsed. See Collapsing margins in the CSS3 Box module [forthcoming] for the details of collapsing margins.
This property also specifies the direction of table column layout, the direction of the overflow oriented in the same way as the primary text advance direction (e.g. for writing-mode: lr-tb a block element will overflow horizontally on the right) , the initial alignment of text and the position of an incomplete last line in a block in case of 'text-align: justify'.
For the 'writing-mode' property to have any effect on inline-level elements, one or both of the following conditions must be met:
An inline-level element that has a different writing-mode value than its parent becomes an inline-block element.
Editor's note: The 'width' and 'height' property descriptions in the CSS3 Box module need to be updated to describe the algorithm for vertical flow orientations in more details. For example, in vertical flow orientations, it is expected that the height will be no more than the minimum of the parent layout height (minus margin and border) and the element optimum height. The element optimum height is typically determined as being 10 ideographic characters 'advance width' long. This mechanism is required to avoid 'infinitely' long vertical lines or single line vertical flow (would look like rtl horizontal Japanese). In particular section 7.2 of the box module should discuss the case of 'auto' for vertical flow element contained in an horizontal flow element.
Here is a diagram of a horizontal flow (writing-mode: lr-tb):
Here is a diagram for a vertical flow used in East Asia (writing-mode: tb-lr) :
And finally, here is a diagram for another flow used for Uyghur and Mongolian (writing-mode: tb-lr):
In East Asian documents, it is often preferred to display certain Latin-based strings, such as numerals in a year, always in a horizontal layout flow regardless of the flow mode of the line of text these strings appear in, as in:
This effect is known as "Tate chu yoko". In order to achieve it, the Latin string should be enclosed within a SPAN element with an horizontal flow orientation, as in:
<span STYLE="writing-mode: lr-tb">1996</span>
This is an application of changing the flow of an inline element as described earlier. Line breaking is normally disabled for such runs of text. This can be accomplished using the CSS 'white-space: nowrap' setting.
| Value: | ltr | rtl | inherit |
| Initial: | ltr |
| Applies to: | all elements, but see prose |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
Values for this property have the following meanings:
This property specifies the line direction component of the text advance direction and the direction of embeddings and overrides (see 'unicode-bidi') for the Unicode bidirectional algorithm. The values 'ltr' and 'rtl' have to be interpreted 'relatively' to the line direction. In addition, it specifies the direction of table column layout, the direction of the overflow oriented in the same way as the primary text advance direction (e.g. for writing-mode: lr-tb a block element will overflow horizontally on the right) , the initial alignment of text and the position of an incomplete last line in a block in case of 'text-align: justify'. For the 'direction' property to have any effect on inline-level elements, the 'unicode-bidi' property's value must be 'embed' or 'bidi-override' and the glyph orientation of the characters within the element must be 'auto' or 90/-90 degree in vertical layout or 0/180 degree in horizontal layout.
The usage of the 'direction' property for block-level elements is discouraged in CSS3 as the 'writing-mode' property supersedes it.
Note. The 'writing-mode' and 'direction' properties, when specified for table column elements, are not inherited by cells in the column since columns don't exist in the document tree. Thus, CSS cannot easily capture the "dir" attribute inheritance rules described in [HTML40], section 11.3.2.1.
Note. The 'writing-mode' and 'direction' properties interact with each other. As such, 'writing-mode' resets the 'direction' value. Similarly, modifying 'direction' after 'writing-mode' changes effectively the 'writing-mode' value to the opposite inline progression. For example, 'direction:rtl' applied to an element with 'writing-mode:lr-tb' effectively makes 'writing-mode:rl-tb'. This is one of the main reason why the mixed usage of these two properties is discouraged or at least they should be used with great caution.
Note. These properties do not affect the positioning of background images.
In some cases, it is required to alter the orientation of a sequence of characters relative to the primary text advance direction. The requirement is particularly applicable to vertical layouts of East Asian documents, where sometimes half-width Roman text is to be displayed horizontally and other times vertically.
Two properties control the glyph orientation relative to the primary text advance direction. 'glyph-orientation-vertical' controls glyph orientation when the primary text advance direction is vertical. 'glyph-orientation-horizontal' controls glyph orientation when the primary text advance direction is horizontal. It is necessary to distinguish between vertical and horizontal for the following reasons:
| Value: | <angle> | auto | inherit |
| Initial: | auto |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
Note. A value of auto will generally produce the expected results in common uses of mixing Japanese with European characters; however, the exact algorithms are based on complex interactions between many factors, including font design, and thus different algorithms might be employed in different processing environments. For precise control, specify explicit <angle> values.
This property specifies the orientation of glyphs relative to the inline progression determined by the 'writing-mode' property. This property is applied only to text written in a vertical writing-mode. Conforming user agents may do the following in increasing levels of supports:
The value of this property affects both the alignment and height of the glyph area generated for the affected glyphs. If a glyph is oriented so that the normal orientation of the glyph is parallel to the dominant-baseline, then the vertical alignment-point of the rotated glyph is aligned with the alignment-baseline appropriate to that glyph. The baseline to which the rotated glyph is aligned is the vertical baseline identified by the "alignment-baseline" for the script to which the glyph belongs. The height of the glyph area is determined from the height font characteristic for the glyph.
The horizontal alignment points, baselines and heights (computed as glyph advance width) are used if the normal orientation of the glyph is perpendicular to the dominant-baseline.
The diagrams below illustrate different uses of 'glyph-orientation-vertical'. The diagram on the left shows the result of the mixing of full-width ideographic characters with half-width Roman characters when 'glyph-orientation-vertical' for the Roman characters is either auto or "90deg". The diagram on the right show the result of mixing full-width ideographic characters with half-width Roman characters when Roman characters are specified to have a 'glyph-orientation-vertical' of "0deg".
The bidi algorithm and the 'glyph-orientation-vertical' property have the following interaction:
| Value: | <angle> | inherit |
| Initial: | 0deg |
| Applies to: | all inline-level elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
This property specifies the orientation of glyphs relative to the inline progression determined by the 'writing-mode' property. This property is applied only to text written in a horizontal writing-mode. Conforming user agents may do the following in increasing levels of supports:
The value of this property affects both the alignment and width of the glyph area generated for the affected glyphs. If a glyph is oriented so that the normal orientation of the glyph is parallel to the dominant-baseline, then the vertical alignment-point of the rotated glyph is aligned with the alignment-baseline appropriate to that glyph. The baseline to which the rotated glyph is aligned is the horizontal baseline identified by the "alignment-baseline" for the script to which the glyph belongs. The width of the glyph area is determined from the vertical width font characteristic for the glyph.
The horizontal alignment points, baselines and widths are used if the normal orientation of the glyph is perpendicular to the dominant-baseline.
| Value: | normal | embed | bidi-override | inherit |
| Initial: | normal |
| Applies to: | all elements, but see prose |
| Inherited: | no |
| Percentages: | N/A |
| Media: | visual |
This property allows further control of the Unicode bidirectional algorithm by allowing new embedding level or direction override. Values for this property have the following meanings:
The final order of characters in each block-level element is the same as if the bidi control codes had been added as described above, mark-up had been stripped, and the resulting character sequence had been passed to an implementation of the Unicode bidirectional algorithm for plain text that produced the same line-breaks as the styled text. In this process, non-textual entities such as images are treated as neutral characters, unless their 'unicode-bidi' property has a value other than 'normal', in which case they are treated as strong characters in the 'direction' specified for the element.
Note. In order to be able to flow inline boxes in a uniform direction (either entirely left-to-right or entirely right-to-left), more inline boxes (including anonymous inline boxes) may have to be created, and some inline boxes may have to be split up and reordered before flowing.
Because the Unicode algorithm has a limit of 61 levels of embedding, care should be taken not to use 'unicode-bidi' with a value other than 'normal' unless appropriate. In particular, a value of 'inherit' should be used with extreme caution. However, for elements that are, in general, intended to be displayed as blocks, a setting of 'unicode-bidi: embed' is preferred to keep the element together in case display is changed to inline (see example below).
The following example shows an XML document with bidirectional text. It illustrates an important design principle: DTD designers should take bidi into account both in the language proper (elements and attributes) and in any accompanying style sheets. The style sheets should be designed so that bidi rules are separate from other style rules. The bidi rules should not be overridden by other style sheets so that the document language's or DTD's bidi behavior is preserved.
Example(s):
In this example, lowercase letters stand for inherently left-to-right characters and uppercase letters represent inherently right-to-left characters:
<HEBREW> <PAR>HEBREW1 HEBREW2 english3 HEBREW4 HEBREW5</PAR> <PAR>HEBREW6 <EMPH>HEBREW7</EMPH> HEBREW8</PAR> </HEBREW> <ENGLISH> <PAR>english9 english10 english11 HEBREW12 HEBREW13</PAR> <PAR>english14 english15 english16</PAR> <PAR>english17 <HE-QUO>HEBREW18 english19 HEBREW20</HE-QUO></PAR> </ENGLISH>
Since this is XML, the style sheet is responsible for setting the writing direction. This is the style sheet:
/* Rules for bidi */
HEBREW, HE-QUO {direction: rtl; unicode-bidi: embed}
ENGLISH {direction: ltr; unicode-bidi: embed}
/* Rules for presentation */
HEBREW, ENGLISH, PAR {display: block}
EMPH {font-weight: bold}
The HEBREW element is a block with a right-to-left base direction, the ENGLISH element is a block with a left-to-right base direction. The PARs are blocks that inherit the base direction from their parents. Thus, the first two PARs are read starting at the top right, the final three are read starting at the top left. Please note that HEBREW and ENGLISH are chosen as element names for explicitness only; in general, element names should convey structure without reference to language.
The EMPH element is inline-level, and since its value for 'unicode-bidi' is 'normal' (the initial value), it has no effect on the ordering of the text. The HE-QUO element, on the other hand, creates an embedding.
The formatting of this text might look like this if the line length is long:
5WERBEH 4WERBEH english3 2WERBEH 1WERBEH
8WERBEH 7WERBEH 6WERBEH
english9 english10 english11 13WERBEH 12WERBEH
english14 english15 english16
english17 20WERBEH english19 18WERBEH
Note that the HE-QUO embedding causes HEBREW18 to be to the right of english19.
If lines have to be broken, it might be more like this:
2WERBEH 1WERBEH
-EH 4WERBEH english3
5WERB
-EH 7WERBEH 6WERBEH
8WERB
english9 english10 en-
glish11 12WERBEH
13WERBEH
english14 english15
english16
english17 18WERBEH
20WERBEH english19
Because HEBREW18 must be read before english19, it is on the line above english19. Just breaking the long line from the earlier formatting would not have worked. Note also that the first syllable from english19 might have fit on the previous line, but hyphenation of left-to-right words in a right-to-left context, and vice versa, is usually suppressed to avoid having to display a hyphen in the middle of a line.
In text layout, many of the behaviors are related to a character classification based on scripts. For example, line breaking or text justification behaviors depend on the 'dominant' script of the textual content of an element. This can be heuristically determined by finding the first character that has an unambiguous script identifier in an element. It can also be explicitly specified by using the 'script' property.
| Value: | auto | none | <script> | inherit |
| Initial: | auto |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
Values have the following meanings:
| Value: | start | end | left | right | center | justify | <string> | inherit |
| Initial: | start |
| Applies to: | block-level elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
This property describes how inline content of a block is aligned. Values have the following meanings:
A block of text is a stack of line boxes. In the case of 'start', 'end', 'left', 'right' and 'center', this property specifies how the inline boxes within each line box align with respect to the line box's start and end sides; alignment is not with respect to the viewport. In the case of 'justify', the UA may stretch the inline boxes in addition to adjusting their positions. (See also 'letter-spacing' and 'word-spacing'.)
Example(s):
In this example, note that since 'text-align' is inherited, all block-level elements inside the DIV element with 'class=center' will have their inline content centered.
DIV.center { text-align: center }
| Value: | auto | inter-word | inter-ideograph | distribute | newspaper | inter-cluster | kashida | inherit |
| Initial: | auto |
| Applies to: | block-level elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
This property selects the type of justify alignment. It affects the text layout only if 'text-align' is set to 'justify'. That way, UA's that do not support this property will still render the text as fully justified, which most of the time is at least partially correct. Typically the text-justify property does not affect the last line, unless the last line itself is justified. Most of the text-justify values affects writing systems in very specific ways. These writing systems (or group of) are:
The text-justification behavior of textual components is guided by the script classification of the characters. The 'script' property allows to modify the behavior of these components.
Depending on the text-justify value, spacing may be altered between words or letters.
The possible values for the text-justify property are:
The diagram below illustrates this mode, by showing how the characters are laid out in the last two lines of an element:
Figure 3.2.1: Mixed glyph layout in the last two lines in an inter-word justified element
For example a viewer could render an 'inter-word' justified paragraph in the following way:
Figure 3.2.2: Inter-word justification applied to mixed text
The diagram below illustrates this mode:
Figure 3.2.3: Mixed character layout in the last two lines of a newspaper justified element
The diagram below illustrates this mode:
Figure 3.2.4: Mixed glyph layout in the last two lines in an inter-ideograph justified element
Below is an example of how this mode would work:
Figure 3.2.5: Inter-ideograph justification applied to mixed text
The diagram below illustrates this mode:
Figure 3.2.6: Mixed character layout in the last two lines of a distribute justified element
For example a viewer could render a 'distribute' justified paragraph in the following way:
Figure 3.2.7: Distribute justification applied to mixed text
The following table describes the expansion/compression strategy for the combination of each script groups and the text-justify property value for each relevant text-justify property value:
| text-justify property value | |||||||
|---|---|---|---|---|---|---|---|
| Script groups | auto* | inter-word | newspaper | inter-ideograph | distribute | inter-cluster | kashida |
| Latin | word-spacing only* | word-spacing only | prioritization between word-spacing and letter-spacing | word-spacing only | word-spacing only | word-spacing only | word-spacing only |
| CJK | no extra spacing* | no extra spacing | letter-spacing | letter-spacing | letter-spacing | no extra spacing | no extra spacing |
| Devanagari* | word-spacing* | word-spacing | word-spacing | word-spacing | word-spacing | word-spacing | word-spacing |
| Arabic | kashida and word-spacing* | word-spacing | kashida and word-spacing | word-spacing | kashida and word-spacing | word-spacing | kashida and word-spacing |
| SE Asian clusters | inter-cluster spacing* | inter-cluster spacing | inter-cluster spacing | no extra spacing | inter-cluster spacing | inter-cluster spacing | no extra spacing |
Figure 3.2.8: Interaction between text-justify values and script groups
*The values shown for the auto column are only a
recommendation. The UAs might implement a different strategy.
*The Devanagari entry represents as well other scripts and writing systems used in India that use baseline connectors like Bengali and Gurmukhi.
| Value: | auto | start | end | center | justify | size | inherit |
| Initial: | auto |
| Applies to: | block-level elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
This property describes how the last line of the inline content of a block is aligned. This also applies to the only line of a block if it contains a single line, the line preceding a BR element and to last lines of anonymous blocks. Typically the last line is aligned like the other lines of the block element, this is set by the 'text-align' property. However, in some situations like when the 'text-align' property is set to 'justify', the last line may be aligned differently.
Values have the following meanings:
The following example shows the usage of the alignment properties in a case where all lines are justified in a distributed justification. This is commonly found in East Asian typography:
P.distributealllines
{ text-align: justify; text-justify: distribute; text-align-last: justify }
| Value: | <font-size> | inherit |
| Initial: | 0 |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | element's computed 'font-size' |
| Media: | visual |
If 'text-align-last' is 'size', the fonts of the last line of an element are not allowed to become smaller than the smaller of 'font-size' and 'min-font-size'.
| Value: | <font-size> | auto | inherit |
| Initial: | auto |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | element's computed 'font-size' |
| Media: | visual |
If 'text-align-last' is 'size', the fonts of the last line of an element are not allowed to become larger than the larger of 'font-size' and 'max-font-size'. 'auto' means that there is no limit.
| Value: | none | punctuation | punct-and-kana | inherit |
| Initial: | punctuation |
| Applies to: | block-level elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
This sets the individual font blank space compression permissions for the text justification algorithm, when 'text-justify' is anything other than 'inter-word'. This special type of space compression occurs on the font level, i.e. the blank space within the character area itself may be reduced without affecting the appearance of the glyph. This applies to wide-cell glyphs only. Possible values:
Figure 3.4.1: Glyph layout with no compression
Figure 3.4.2: Glyph layout with punctuation compression
Figure 3.4.3: Character layout with punctuation and Kana compression
| Value: | <percentage> | inherit |
| Initial: | 0% |
| Applies to: | block-level elements |
| Inherited: | yes |
| Percentages: | as described |
| Media: | visual |
Kashida is a typographic effect used in Arabic writing systems that allows character elongation at some carefully chosen points in Arabic. Each elongation can be accomplished using a number of kashida glyphs, a single graphic or character elongation on each side of the kashida point. (The UA may use either mechanism based on font or system capability). The text-kashida-space property expresses the ratio of the kashida expansion size to the white space expansion size, 0% means no kashida expansion, 100% means kashida expansion only . This property can be used with any justification style where kashida expansion is used (currently text-justify: auto, kashida, distribute and newspaper).
In the diagram below showing two identical paragraphs of Arabic text, the blue line in the second line (not justified) shows the length that is used for kashida and divided among the elongation opportunities in the first line (justified), as indicated by the red underlines:
Figure 3.5: Kashida applied to Arabic text
In that example no expansion occurs between the word themselves, indicating that the text-kashida-space property is set to 100%.
The glyphs of a given script are positioned so that a particular point on each glyph, the alignment-point, is aligned with the alignment-points of the other glyphs in that script. The glyphs of different scripts are typically aligned at different points on the glyph. For example, Western glyphs are aligned on the bottoms of the capital letters, certain Indic glyphs (including glyphs from the Devanagari, Gurmukhi and Bengali scripts) are aligned at the top of a horizontal stroke near the top of the glyphs and East Asian glyphs are aligned either at the bottom or center of the EM box of the glyph. Within a script and within a line of text having a single font-size, the sequence of alignment-points defines, in the inline-progression-direction, a geometric line called a baseline. Western and most other alphabetic and syllabic glyphs are aligned to an "alphabetic" baseline, the above Indic glyphs are aligned to a "hanging" baseline and the East Asian glyphs are aligned to an "ideographic" baseline.
This figure shows the vertical position of the alignment-point for alphabetic and many syllabic scripts, illustrated by a Roman "A"; for certain Indic scripts, illustrated by a Gurmukhi syllable "ji"; and for ideographic scripts, illustrated by the ideographic glyph meaning "country". The thin black rectangle around the ideographic glyph illustrates the EM box for that glyph and shows the typical positioning of the "black marks" of the glyph within the EM box.
A baseline-table specifies the position of one or more baselines in the design space coordinate system. The function of the baseline table is to facilitate the alignment of different scripts with respect to each other when they are mixed on the same text line. Because the desired relative alignments may depend on which script is dominant in a line (or block), there may be a different baseline table for each script. In addition, different alignment positions are needed for horizontal and vertical writing modes. Therefore, the font may have a set of baseline tables: typically, one or more for horizontal writing-modes and zero or more for vertical writing-modes.
Examples of horizontal and vertical baseline positions. The thin lined box in each example is the "EM box". For the Latin glyphs, only the EM box of the first glyph is shown. Example 1 shows typical Latin text written horizontally. This text is positioned relative to the alphabetic baseline, shown in blue. Example 2 shows a typical ideographic glyph positioned on the horizontal ideographic baseline. Note that the EM Box is positioned differently for these two cases. Examples 3 and 4 show the same set of baselines used in vertical writing. The Latin text, example 3, is shown with a glyph-orientation of 90 degrees which is typical for proportionally space Latin glyphs in vertical writing. Even though the ideographic glyph in Example 4 is positioned on the vertical ideographic baseline, because it is centered in the EM box, all glyphs with the same EM Box are centered, vertically, with respect to one another.
The font tables for a font include font characteristics for the individual glyphs in the font. CSS assumes that the font tables include, for each glyph in the font, one width value, one alignment-baseline and one alignment-point for the horizontal writing-modes. If vertical writing-modes are supported, then each glyph must have another width value, alignment-baseline and alignment-point for the vertical writing-modes. (Even though it is specified as a width, for vertical writing-modes the width is used in the vertical direction.)
The script to which a glyph belongs determines an alignment-baseline to which the glyph is to be aligned. The position of this baseline in the design space coordinate system determines the default block progression direction position of the alignment-point. The inline progression direction position of the alignment-point is on the start-edge of the glyph.
This figure shows glyphs from three different scripts, each with its EM box and within the EM box, the baseline table applicable to that glyph. The alignment-point of each glyph is shown by an "X" on the start edge of the EM box and by making alignment-baseline blue. The baseline-table of the parent element of the characters that mapped to these glyphs is shown as a set of dashed lines.
The baseline alignment properties control the alignment of child element with respect to their parent. The positions of these baselines are illustrated in the following figure:
This figure shows samples of Gurmukhi (a hanging Indic script), Latin and ideographic scripts together with most of the baselines defined below. The thin line around the ideographic glyphs symbolizes the EM box in which these glyphs are centered. In this figure, the position of the "text-before-edge" and "text-after-edge" baselines is computed assuming that the "alphabetic" baseline is the dominant-baseline. The "central" baseline has been omitted from the figure, but it lies halfway between the "text-before-edge" and "text-after-edge" baselines, just about where the "math" baseline is shown.
The baseline-identifiers below are used in this specification. Some of these are determined by baseline-tables contained in a font as described in the section describing the baseline information provided by fonts. Others are computed from other font characteristics as described below.
This identifies the baseline used by most alphabetic and syllabic scripts. These include, but are not limited to, many Western, Southern Indic, Southeast Asian (non-ideographic) scripts.
This identifies the baseline used by ideographic scripts. For historical reasons, this baseline is at the bottom of the ideographic EM box and not in the center of the ideographic EM box. See the "central" baseline. The ideographic scripts include Chinese, Japanese, Korean, and Vietnamese Chu Nom.
This identifies the baseline used by certain Indic scripts. These scripts include Devanagari, Gurmukhi and Bengali.
This identifies the baseline used by mathematical symbols.
This identifies a computed baseline that is at the center of the EM box. This baseline lies halfway between the text-before-edge and text-after-edge baselines.
Note. For ideographic fonts, this baseline is often used to align the glyphs; it is an alternative to the ideographic baseline.
This identifies a baseline that is offset from the alphabetic baseline in the shift-direction by 1/2 the value of the x-height font characteristic. The position of this baseline may be obtained from the font data or, for fonts that have a font characteristic for "x-height", it may be computed using 1/2 the "x-height". Lacking either of these pieces of information, the position of this baseline may be approximated by the "central" baseline.
This identifies the before-edge of the EM box. The position of this baseline may be specified in the baseline-table or it may be calculated.
Note. The position of this baseline is normally around or at the top of the ascenders, but it may not encompass all accents that can appear above a glyph. For these fonts the value of the "ascent" font characteristic is used. For ideographic fonts, the position of this baseline is normally 1 EM in the shift-direction from the "ideographic" baseline. However, some ideographic fonts have a reduced width in the inline-progression-direction to allow tighter setting. When such a font, designed only for vertical writing-modes, is used in a horizontal writing-mode, the "text-before-edge" baseline may be less than 1 EM from the text-after-edge.
This identifies the after-edge of the EM box. The position of this baseline may be specified in the baseline-table or it may be calculated.
Note. For fonts with descenders, the position of this baseline is normally around or at the bottom of the descenders. For these fonts the value of the "descent" font characteristic is used. For ideographic fonts, the position of this baseline is normally at the "ideographic" baseline.
There are, in addition, two computed baselines that are only defined for line boxes. For each line box, there is a dominant-baseline, a baseline-table and a baseline-table font-size which are those of the nearest ancestor element that completely contains the whole line. The "before-edge" and "after-edge" baselines are defined as follows.
The offset of the "before-edge" baseline of the line from the dominant-baseline of the line is determined by ignoring all inline boxes whose alignment-baseline is either "before-edge" or "after-edge". For the "before-edge", extents are measured from the dominant-baseline in the direction toward the top (relative) of the box. The "before-edge" baseline offset is set to the maximum extent of the "before-edges" of the allocation-rectangles of the remaining areas. If all the inline-areas in a line-area are aligned either to the "before-edge" or to the "after-edge", then use the offset of the "text-before-edge" baseline of the line as the offset of the "before-edge" baseline of the line.
The offset of the "after-edge" baseline of the line from the
dominant-baseline of the line is determined by ignoring all inline boxes
whose alignment-baseline is after-edge. For the "after-edge",
extents are measured from the dominant-baseline in the direction toward the
bottom (relative) of the reference-area. The "after-edge" baseline offset is
set to the negative of the maximum of (1) the maximum extent of the
"after-edges" of the allocation-rectangles of the remaining areas and (2) the
maximum height of the allocation-rectangles of the areas that are ignored
minus the offset of the "before-edge" baseline of the line.
Note. If all the inline-areas in a line-area are aligned to the "after-edge" then the specification for the "before-edge" will set the "before-edge" baseline to coincide with the "text-before-baseline" of the line. Then, case (2) above will determine an offset to the "bottom-edge" baseline that will align the "before-edge" of the area with the greatest height to its allocation-rectangle to "before-edge" baseline.
Note. The above specifications for "before-edge" and "after-edge" have the following three properties: (1) the allocation-rectangles of all the areas are below the "before-edge", (2) the allocation-rectangles of all the areas are above the "after-edge", and (3) the distance between the "before-edge" and the "after-edge" cannot be decreased without violating (1) or (2). The specified placement of the "before-edge" and "after-edge" is not the only way that (1)-(3) can be satisfied, but it is the only way they can be satisfied with the smallest possible offset to the "before-edge".
Examples showing "before-edge" and "after-edge" alignment:
The rectangles with lines or arrows are images with an intrinsic size as shown. The rectangles with no arrows represent images that receive the default, dominant baseline, alignment. The alignment of the other rectangles is at the furthest point from the arrow head (which is in the middle when there are two arrowheads). Examples 1 and 2 show the "before-edge" alignment is determined by the tallest non-"before-edge" aligned objects: in example 1 this is the default aligned, arrowhead free rectangular image and in example 2 this is the double headed arrow rectangle. Examples 3 and 4 show defaulting to the "text-before-edge" when all the boxes have either "before-edge" or "after-edge" alignment. In example 3, the images with "before-edge" alignment has a taller member than do the "after-edge" aligned images. In example 4, the tallest image is in the "after-edge" aligned set. Example 5 is a repetition of example 2 with largest image being an "after-edge" aligned image.
There are also four baselines that are defined only for horizontal writing-modes.
This baseline is the same as the "before-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.
This baseline is the same as the "text-before-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.
This baseline is the same as the "after-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.
This baseline is the same as the "text-after-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.
The alignment of an element with respect to its parent is determined by three things: the scaled-baseline-table of the parent and the alignment-baseline and alignment-point of the element being aligned. Prior to alignment, the scaled-baseline-table of the parent may be shifted. The property specifications below provide the information necessary to align the parent and child elements.
There are four properties that control alignment of elements to the above set of baselines: 'dominant-baseline', 'alignment-baseline', 'baseline-shift' and 'alignment-adjust'. These properties are all independent and are designed so that typically only the specification of one of the properties is needed to achieve a particular alignment goal.
The primary baseline alignment property is the 'dominant-baseline' property. This property has a compound value with three components. The dominant-baseline-identifier component is the default 'alignment-baseline' to be used when aligning two inline areas. The baseline-table component specifies the positions of the baselines in the font design space coordinates. The baseline-table acts something like a musical staff; it defines particular points along the block-progression-direction to which glyphs and inline elements can be aligned. The baseline-table 'font-size' component provides a scaling factor for the baseline-table.
Because the value of the 'font-family' property is a list of fonts, to insure a consistent choice of baseline-table we define the nominal font in a font list as the first font in the list for which a glyph data is available. This is the first that could contain a glyph for each character encountered. (For this definition, glyph data is assumed to be present if a font substitution is made or if the font is synthesized.) This definition insures a content independent determination of the font and baseline table that is to be used.
For convenience, the specification will sometimes refer to the baseline identified by the dominant-baseline-identifier component of the "dominant-baseline" property as the "dominant baseline" (in an abuse of terminology).
The model also assumes that each glyph has a 'alignment-baseline' value which specifies the baseline with which the glyph is to be aligned. (The 'alignment-baseline' is called the "Baseline Tag" in the OpenType baseline-table description.) The initial value of the 'alignment-baseline' property uses the baseline identifier associated with the given glyph. Alternate values for 'alignment-baseline' can be useful for glyphs such as a "*" which are ambiguous with respect to script membership.
The model assumes that the font from which the glyph is drawn also has a baseline table, the font baseline-table. This baseline table has offsets in units-per-em from the (0,0) point to each of the baselines the font knows about. In particular, it has the offset from the glyph's (0,0) point to the baseline identified by the 'alignment-baseline'.
The offset values in the baseline-table are in "design units" which means fractional units of the EM. CSS calls these "units-per-em". Thus, the current 'font-size' is used to determine the actual offset from the dominant baseline to the alternate baselines.
The glyph is aligned so that its baseline identified by its 'alignment-baseline' is aligned with the baseline with the same name from the dominant baseline-table.
The offset from the dominant baseline of the parent to the baseline identified by the 'alignment-baseline' is computed using the dominant baseline-table and dominant baseline-table font-size. The font baseline-table and font-size applicable to the glyph are used to compute the offset from the identified baseline to the (0,0) point of the glyph. This second offset is subtracted from the first offset to get the position of the (0,0) point in the shift direction. Both offsets are computed by multiplying the baseline value from the baseline-table times the appropriate font-size value.
If the 'alignment-baseline' identifies the dominant baseline, then the first offset is zero and the glyph is aligned with the dominant baseline; otherwise, the glyph is aligned with the chosen alternate baseline.
The third baseline alignment property is the 'baseline-shift' property. Like the properties other than the "dominant-baseline" property, this property does not change the baseline-table or the baseline-table font-size. It does shift the whole baseline table of the parent element so that when an inner inline element is aligned to one of the parents baselines, the position of the inner inline element is shifted.
The fourth alignment property is the 'alignment-adjust' property. This property is primarily used for elements, such as some graphics, that do not belong to a particular script and do not have a predefined alignment point. The "alignment-adjust" property allows the author to assign where, on the start-edge of the object, the alignment point for that element lies.
In addition to the following definition of these properties, an informative appendix: B provides usage examples of these properties.
| Value: | auto | use-script | no-change | reset-size| ideographic | alphabetic | hanging | mathematical | inherit |
| Initial: | auto |
| Applies to: | inline-level elements |
| Inherited: | no |
| Percentages: | N/A |
| Media: | visual |
The 'dominant-baseline' property is used to determine or re-determine a scaled-baseline-table. A scaled-baseline-table is a compound value with three components:
Some values of the property re-determine all three values; other only reestablish the baseline-table font-size. Values for the property have the following meaning:
If there is no baseline-table in the nominal font or if the baseline-table lacks an entry for the desired baseline, then the user agent may use heuristics to determine the position of the desired baseline.
| Value: | baseline | auto-script | before-edge | text-before-edge | after-edge |
text-after-edge | central | middle | ideographic | alphabetic | hanging | mathematical | inherit |
| Initial: | baseline |
| Applies to: | inline-level elements |
| Inherited: | no |
| Percentages: | N/A |
| Media: | visual |
This property specifies how an inline-level element is aligned with respect to its parent. That is, to which of the parent's baselines the alignment point of this element is aligned. Unlike the 'dominant-baseline' property the 'alignment-baseline' property has no effect on its children dominant-baselines.
Note: The 'alignment-adjust' property specifies how the alignment point is determined and defaults to the baseline with the same name as the computed value of the alignment-baseline property.
Except for 'auto-script', all baseline values refer to the respective baseline-identifier components of the dominant-baseline of the parent, and glyphs within the element are aligned similarly to the element itself. The description for 'auto-script' covers these points specifically. The property values have the following meanings:
The values: before-edge, text-before-edge, after-edge and text-after-edge all works relatively to the writing-mode property values. For example 'before-edge' means 'top' in an horizontal writing mode and 'right' in a vertical writing mode.
Note. The reason why 'baseline' is the initial value instead of 'auto-script' (called 'auto' in the similar XSL property) has to do with the fact that most fonts today are designed with an alignment point located at the 'alphabetical' level, even for glyphs belonging to non Latin scripts. User agents have to deal with that constraint, and therefore they use the 'baseline' value as initial.
| Value: | auto | baseline | before-edge | text-before-edge | middle | central | after-edge | text-after-edge | ideographic | alphabetic | hanging | mathematical | <percentage> | <length> | inherit |
| Initial: | auto |
| Applies to: | inline-level elements |
| Inherited: | no |
| Percentages: | refers to the 'line-height' of the element |
| Media: | visual |
The 'alignment-adjust' property allows more precise alignment of elements, such as graphics, that do not have a baseline-table or lack the desired baseline in their baseline-table. With the 'alignment-adjust' property, the position of the baseline identified by the 'alignment-baseline' can be explicitly determined. It also determines precisely the alignment point for each glyph within a textual element. The user agent should use heuristics to determine the position of a non existing baseline for a given element.
Values for the property have the following meaning:
| Value: | baseline | sub | super | <percentage> | <length> | inherit |
| Initial: | baseline |
| Applies to: | inline-level elements |
| Inherited: | no |
| Percentages: | refers to the 'line-height' of the parent element |
| Media: | visual |
The 'baseline-shift' property allows repositioning of the dominant-baseline relative to the dominant-baseline. The shifted object might be a sub- or superscript. Within the shifted element, the whole baseline table is offset; not just a single baseline. For sub- and superscript, the amount of offset is determined from the nominal font of the parent.
Values for the property have the following meaning:
Note. Although it may seem that 'baseline-shift' and
'alignment-adjust' properties are doing
the same thing, there are important differences. For 'alignment-adjust'
the percentage values refer to the 'line-height' of the element being
aligned. For 'baseline-shift the percentage values refer to the 'line-height'
of the parent element. Similarly, it is the 'sub' and 'super' offsets of the
parent that are used to align the shifted baseline rather than the 'sub' and
'super' offsets of the element being positioned. To ensure a consistent sub-
or superscript position, it makes more sense to use the parent as the
reference rather than the subscript element which may have a changed
"line-height" due to "font-size" changes in the sub- or superscript
element.
Using the "alignment-adjust" property is more suitable for positioning
elements, such as graphics, that have no internal textual structure. Using
the "baseline-shift" property is intended for sub- and superscripts where the
positioned element may itself be textual. The baseline-shift provides a way
to define a specific baseline offset other than the named offsets that are
defined relative to the dominant-baseline. In addition, having
"baseline-shift" makes it easier for tool to generate the relevant
properties; many formatting programs already have a notion of baseline shift.
| Value: | auto | auto-script | baseline | sub | super | top | text-top | central | middle | bottom | text-bottom | <percentage> | <length> | inherit |
| Initial: | auto |
| Applies to: | inline-level and 'table-cell' elements |
| Inherited: | no |
| Percentages: | refers to the 'line-height' of the element itself |
| Media: | visual |
This property affects the vertical positioning inside a line box of the boxes generated by an inline-level element. The following values only have meaning with respect to a parent inline-level element, or to a parent block-level element, if that element generates anonymous inline boxes; they have no effect if no such parent exists.
Note. Values of this property have slightly different meanings in the context of tables. Please consult the section on table height algorithms for details.
The remaining values refer to the line box in which the generated box appears:
The 'vertical-align' is not a shorthand for the baseline alignment properties as setting them has no effect on the vertical-align property. But setting the vertical-align property can be seen as a macro of these alignment properties as it will set them as follows:
| vertical-align value | alignment-baseline | alignment-adjust | baseline-shift | dominant-baseline |
|---|---|---|---|---|
| auto | baseline | auto | baseline | auto |
| baseline | baseline | auto | baseline | alphabetic (if no parent or
different flow from parent) no-change (otherwise) |
| sub | baseline | auto | sub | auto |
| super | baseline | auto | super | auto |
| top | before-edge | auto | baseline | auto |
| text-top | text-before-edge | auto | baseline | auto |
| middle | middle | auto | baseline | auto |
| bottom | after-edge | auto | baseline | auto |
| text-bottom | text-after-edge | auto | baseline | auto |
| <percentage> | baseline | <percentage> | baseline | auto |
| <length> | baseline | <length> | baseline | auto |
Editor's note: There are the following differences with the XSL definition:
The initial value of alignment-baseline is baseline instead of 'auto-script' (value called 'auto' in XSL). This reflects user agent current practice.
'vertical-align: baseline' maps to dominant-baseline:'alphabetic' or 'no-change' (instead of auto)
'vertical-align: auto' (proposed as new initial value) maps to 'dominant-baseline:auto'.
Editor's note: It is tempting to make vertical-align a shorthand property of the four other alignment properties, there are however several issues:
Vertical-align is a simple enumerated property, changing it to a shorthand would create issues from a DOM point of view.
The names of the values of the individual properties are not designed to be used in a single shorthand notation, unless a strict sequence is enforced, like having dominant-baseline first, followed by alignment-base, alignment-adjust and baseline-shift. The usage could be cumbersome.
| Value: | <length> | <percentage> | inherit |
| Initial: | 0 |
| Applies to: | block-level elements |
| Inherited: | yes |
| Percentages: | refers to width of containing block |
| Media: | visual |
This property specifies the indentation of the first line of text in a block. More precisely, it specifies the indentation of the first box that flows into the block's first line box. The box is indented with respect to the starting edge of the line box. User agents should render this indentation as blank space.
Values have the following meanings:
The value of 'text-indent' may be negative, but there may be implementation-specific limits.
Example(s):
The following example causes a '3em' text indent.
P { text-indent: 3em }
In documents written in Latin-based languages, where runs of characters make up words and words are separated by spaces or hyphens, line breaking is relatively simple. In the most general case, (assuming no hyphenation dictionary is available to the UA), a line break can occur only at whitespace characters or hyphens, including U+ 00AD SOFT HYPHEN.
In ideographic typography, however, where what appears as a single glyph can represent an entire word and no spaces nor any other word separating characters are needed, a line breaking opportunity is not as obvious as a space. It can occur after or before many other characters. Certain line breaking restrictions still apply, but they are not as strict as they are in Latin typography.
(As a side note, Thai is another interesting example with its own special line breaking rules. Since Thai words are made up of runs of characters, it resembles Latin in that respect. But the lack of spaces as word delimiters, or in fact any consistent word delimiters, makes it similar to CJK. Thai, like Latin in the absence of a hyphenating dictionary, never breaks inside of words. In fact, a knowledge of the vocabulary is necessary to be able to correctly break a line of Thai text.). Finally, the Unicode character: U+200B ZERO WIDTH SPACE can be inserted in such scripts to specify an explicit line breaking opportunity.
A number of levels of line-breaking "strictness" can be used in Japanese typography. These levels add or remove line breaking restric