W3C

CSS3 module: text

W3C Working Draft 17 May 2001

This version:
http://www.w3.org/TR/2001/WD-css3-text-20010517
Latest version:
http://www.w3.org/TR/css3-text
Previous version:
http://www.w3.org/TR/1999/WD-i18n-format-19990910/
Editors:
Michel Suignard (Microsoft)
Chris Lilley (W3C)
Authors:
Original CSS2 authors
Bert Bos (W3C)
Tantek Çelik (Microsoft)
Marcin Sawicki (former editor)
Michel Suignard (Microsoft)
Chris Wilson (Microsoft)
Steve Zilles (Adobe)

Abstract

This document presents a set of CSS text formatting properties. In addition to the properties that already exist in CSS 2 [CSS2], many new properties address basic requirements of international text (mostly East Asian and bidirectional). However, their use is not limited to those contexts.

Status of This Document

This document is a working draft of the CSS working group which is part of the Style activity. It contains a proposal for features to be included in CSS level 3.

This document has been produced as a combined effort of the W3C Internationalization Activity and the Style Activity. It also includes extensive contributions made by members of the XSL Working Group (members only). Finally, some of the proposals first surfaced in the Scalable Vector Graphics (SVG) 1.0 Specification [SVG1.0]. The text has been duplicated in this document to reflect which properties and specifications should eventually be referenced in CSS itself.

The previous title of this draft was "International Layout."

Feedback is very much welcomed. Comments can be sent directly to the editor, but the mailing list www-style@w3.org (see instructions) is also open and is preferred for discussion of this and other drafts in the Style area.

This working draft may be updated, replaced or rendered obsolete by other W3C documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". Its publication does not imply endorsement by the W3C membership or the CSS & FP Working Group (members only).

To find the latest version of this working draft, please follow the "Latest version" link above, or visit the list of W3C Technical Reports.


1. Dependencies on other modules

This CSS3 module depends on the following other CSS3 modules:

It has non-normative (informative) references to the following other CSS3 modules:

2. Introduction

In both CSS1 and CSS2, text formatting has been limited to simple effects such as text decoration, text alignment and character spacing. However, international typography contains types of formatting that could not be achieved without using special workarounds or graphics.

Along with already existing text related properties, this document presents a number of new CSS properties to represent such formatting. For example, the features this proposal covers include two of the most important features for East Asian typography: vertical text and layout grid.

There are a number of illustrations in this document that use the following legend:

Symbolic wide-cell glyph representation - wide-cell glyph (e.g. Han) which is the n-th character in the text run,
Symbolic narrow-cell glyph representation - narrow-cell glyph (e.g. Roman) which is the n-th glyph in the text run,
Symbolic connected glyph representation - connected glyph (e.g. Arabic) which is the n-th glyph in the text run.

Many typographical properties in East Asian typography depend on the fact that a character is typically rendered as either a wide or a narrow character. All characters described by the Unicode Standard can be categorized by a width property. This is covered by a Unicode Technical Report (TR#11) available from the Unicode Web site.

The orientation which the above symbols assume in the diagrams corresponds to the orientation that the glyphs they represent are intended to assume when rendered in the UA. Spacing between these characters in the diagrams is usually symbolic, unless intentionally changed to make a point.


3. Text layout

3.1. Text layout introduction

This section describes the text layout features supported by CSS, which includes support for various international writing directions, such as left-to-right (e.g., Roman scripts), right-to-left (e.g., Hebrew or Arabic), bidirectional (e.g., mixing Roman with Arabic) and vertical (e.g., Asian scripts).

The 'writing-mode' property determines an inline progression and a line to line progression, also called block progression. For example, Roman scripts are typically written left to right and top to bottom. The glyph orientation determines the orientation of the rendered visual shape of characters relative to the primary text advance direction.

Within a line, the adjustment to the current text position is based on the current glyph orientation relative to the text advance direction, the metrics of the glyph just rendered, kerning tables in the font and the current values of various attributes and properties, such as the spacing properties.

Bi-directionality introduces another level of complexity in text layout, as in many combinations of 'writing-mode' and glyph orientation values the proper directionality of text will be determined by an algorithm. The Unicode standard ([UNICODE], section 3.12) defines such an algorithm consisting of an implicit part based on character properties, as well as explicit controls for embeddings and overrides. It is also possible to override the inherent directionality of the content characters by using a combination of the 'writing-mode' and 'unicode-bidi' properties.

CSS3 relies on this algorithm to achieve proper bidirectional text rendering. However, reordering of characters only occurs for specific values of the glyph orientation properties. See their descriptions for the exact conditions.

CSS2 specified the 'direction' property which is a subset of the 'writing-mode' property as it only determines an inline progression. The 'direction' property may still be used when no line to line progression change is desired.

The HTML 4.0 specification ([HTML40], section 8.2) defines bi-directionality behavior for HTML elements. Conforming HTML user agents may therefore ignore the 'direction' and 'unicode-bidi' properties in author and user style sheets. The style sheet rules that would achieve the bidi behavior specified in [HTML40] are given in the sample style sheet. The HTML 4.0 specification also contains more information on bidirectionality issues. Note that HTML 4.0 does not cover the more general case described by the 'writing-mode' property.
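
As a rough sketch of what such rules look like, the following selectors map the HTML "dir" attribute and the BDO element onto the two properties. They follow the pattern used in the CSS2 sample style sheet and are shown for illustration only; the sample style sheet itself remains the reference.

```css
/* An explicit BDO element requests a directional override. */
BDO[DIR="ltr"]  { direction: ltr; unicode-bidi: bidi-override }
BDO[DIR="rtl"]  { direction: rtl; unicode-bidi: bidi-override }

/* Any other element carrying a DIR attribute opens an embedding. */
*[DIR="ltr"]    { direction: ltr; unicode-bidi: embed }
*[DIR="rtl"]    { direction: rtl; unicode-bidi: embed }
```

The BDO selectors are more specific than the universal ones, so a BDO element ends up with 'bidi-override' rather than 'embed'.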

Finally, this module makes extensive use of the 'before', 'after', 'start' and 'end' notation to specify the four edges of a box relative to its text advance direction, independently of its absolute positioning in terms of 'top', 'bottom', 'left' and 'right' (corresponding respectively to the 'before', 'after', 'start' and 'end' positions in a typical Western text layout). This notation is also used extensively in [XSL] for the same purpose.

3.2. Setting the primary text advance direction: the 'writing-mode' and 'direction' properties

The 'writing-mode' property specifies whether the primary text advance direction shall be left-to-right, right-to-left, or top-to-bottom. (Note that even when the primary text advance direction is left-to-right or right-to-left, some or all of the content within a given element might advance in the opposite direction because of the Unicode [UNICODE] bidirectional algorithm or because of explicit text advance overrides due to this property or 'direction' and 'unicode-bidi'.) This property also changes the 'direction' property for the element. For more on bidirectional text, see the section about Embedding and override.

'writing-mode'
Value: lr-tb | rl-tb | tb-rl | tb-lr | bt-rl | bt-lr | lr | rl | tb | inherit
Initial: lr-tb
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
lr-tb | lr
Sets the primary text advance direction to left-to-right, and the line progression direction to top-to-bottom as is common in most Roman-based documents. For most characters, the current text position is advanced from left to right after each glyph is rendered. The 'direction' property is set to 'ltr'.
rl-tb | rl
Sets the primary text advance direction to right-to-left, and the line progression direction to top-to-bottom as is common in Arabic or Hebrew scripts. The direction property is set to 'rtl'.
tb-rl | tb
Sets the primary text advance direction to top-to-bottom, and the line progression direction to right-to-left as is common in Asian scripts. The baseline alignment may be different in this context. Typically, the dominant baseline runs through the center of the upright glyphs. The direction property is set to 'ltr'.
tb-lr
Sets the primary text advance direction to top-to-bottom, and the line progression direction to left-to-right. Though hardly as frequent as horizontal, this type of vertical layout also occurs in Latin based documents, particularly in table column or row labels. The baseline alignment may be different in this context. Typically, the dominant baseline runs through the center of the upright glyphs. The direction property is set to 'ltr'.
bt-rl
Sets the primary text advance direction to bottom-to-top, and the line progression direction to right-to-left. This value only exists to cover the case of the direction property value 'rtl' applied to an element where the current writing-mode property value is 'tb-rl' or 'tb'. The direction property is set to 'rtl'.
bt-lr
Sets the primary text advance direction to bottom-to-top, and the line progression direction to left-to-right. This value only exists to cover the case of the direction property value 'rtl' applied to an element where the current writing-mode property value is 'tb-lr'. The direction property is set to 'rtl'.

The combination of primary text advance direction and line progression direction set by the writing-mode property is also referred to as a flow orientation. In such contexts, the values lr-tb, lr, rl-tb and rl correspond to horizontal flow orientations, and the others (tb-rl, tb, tb-lr, bt-rl, bt-lr) correspond to vertical flow orientations.
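
The value definitions above could be put to use as in the following sketch. The class names are hypothetical and chosen only to label the typical script for each flow orientation.

```css
/* Horizontal flow orientations */
P.latin     { writing-mode: lr-tb }  /* initial value; 'direction' becomes ltr */
P.hebrew    { writing-mode: rl-tb }  /* 'direction' becomes rtl */

/* Vertical flow orientations */
P.japanese  { writing-mode: tb-rl }  /* columns progress right to left */
P.mongolian { writing-mode: tb-lr }  /* columns progress left to right */
```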

For horizontal flow orientations, the top and bottom margins can be collapsed. For vertical flow orientations, the left and right margins can be collapsed. See Collapsing margins in the CSS3 Box module [forthcoming] for the details of collapsing margins.

This property also specifies the direction of table column layout, the direction of overflow, which is oriented in the same way as the primary text advance direction (e.g. for writing-mode: lr-tb a block element will overflow horizontally on the right), the initial alignment of text and the position of an incomplete last line in a block in case of 'text-align: justify'.

For the 'writing-mode' property to have any effect on inline-level elements, one or both of the following conditions must be met:

An inline-level element that has a different writing-mode value than its parent becomes an inline-block element.

Editor's note: The 'width' and 'height' property descriptions in the CSS3 Box module need to be updated to describe the algorithm for vertical flow orientations in more detail. For example, in vertical flow orientations, it is expected that the height will be no more than the minimum of the parent layout height (minus margin and border) and the element optimum height. The element optimum height is typically determined as being 10 ideographic characters 'advance width' long. This mechanism is required to avoid 'infinitely' long vertical lines or single line vertical flow (which would look like rtl horizontal Japanese). In particular, section 7.2 of the box module should discuss the case of 'auto' for a vertical flow element contained in a horizontal flow element.

Here is a diagram of a horizontal flow (writing-mode: lr-tb):

Figure showing how a block with 'writing-mode: lr-tb' is laid out.

Here is a diagram for a vertical flow used in East Asia (writing-mode: tb-rl):

Figure showing how a block with 'writing-mode: tb-rl' is laid out.

And finally, here is a diagram for another flow used for Uyghur and Mongolian (writing-mode: tb-lr):

Figure showing how a block with 'writing-mode: tb-lr' is laid out.

In East Asian documents, it is often preferred to display certain Latin-based strings, such as numerals in a year, always in a horizontal layout flow regardless of the flow mode of the line of text these strings appear in, as in:

Layout of Tate Naka Yoko, showing a group of glyphs appearing horizontally in a vertical column of text.
Example of Tate Naka Yoko, showing the year 1996 appearing horizontally in a column of vertical text.

Figure 2.2.1: Horizontal in vertical (a.k.a "Tate-chu-yoko")

This effect is known as "Tate chu yoko". In order to achieve it, the Latin string should be enclosed within a SPAN element with a horizontal flow orientation, as in:

<span STYLE="writing-mode: lr-tb">1996</span>

This is an application of changing the flow of an inline element as described earlier. Line breaking is normally disabled for such runs of text. This can be accomplished using the CSS 'white-space: nowrap' setting.
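
The same effect could also be expressed with a style sheet rule instead of an inline STYLE attribute. The sketch below assumes a hypothetical class name "tcy" on the SPAN and a hypothetical class "vertical" on the enclosing paragraph.

```css
/* The surrounding paragraph is set vertically. */
P.vertical { writing-mode: tb-rl }

/* The enclosed run is set horizontally and kept on one line. */
SPAN.tcy {
  writing-mode: lr-tb;
  white-space: nowrap
}
```

Because the SPAN has a different writing-mode value than its parent, it becomes an inline-block element as described earlier.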

'direction'
Value: ltr | rtl | inherit
Initial: ltr
Applies to: all elements, but see prose
Inherited: yes
Percentages: N/A
Media: visual

Values for this property have the following meanings:

ltr
Left-to-right direction.
rtl
Right-to-left direction.

This property specifies the line direction component of the text advance direction and the direction of embeddings and overrides (see 'unicode-bidi') for the Unicode bidirectional algorithm. The values 'ltr' and 'rtl' have to be interpreted relative to the line direction. In addition, it specifies the direction of table column layout, the direction of overflow, which is oriented in the same way as the primary text advance direction (e.g. for writing-mode: lr-tb a block element will overflow horizontally on the right), the initial alignment of text and the position of an incomplete last line in a block in case of 'text-align: justify'. For the 'direction' property to have any effect on inline-level elements, the 'unicode-bidi' property's value must be 'embed' or 'bidi-override' and the glyph orientation of the characters within the element must be 'auto' or 90/-90 degrees in vertical layout or 0/180 degrees in horizontal layout.

The usage of the 'direction' property for block-level elements is discouraged in CSS3 as the 'writing-mode' property supersedes it.

Note. The 'writing-mode' and 'direction' properties, when specified for table column elements, are not inherited by cells in the column since columns don't exist in the document tree. Thus, CSS cannot easily capture the "dir" attribute inheritance rules described in [HTML40], section 11.3.2.1.

Note. The 'writing-mode' and 'direction' properties interact with each other. As such, 'writing-mode' resets the 'direction' value. Similarly, modifying 'direction' after 'writing-mode' effectively changes the 'writing-mode' value to the one with the opposite inline progression. For example, 'direction:rtl' applied to an element with 'writing-mode:lr-tb' effectively makes it 'writing-mode:rl-tb'. This is one of the main reasons why the mixed usage of these two properties is discouraged, or at least why they should be used together with great caution.
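
The interaction described in this note can be illustrated with two rules that are intended to give the same result. The class names are hypothetical.

```css
/* Discouraged: 'direction' is changed after 'writing-mode', which per the
   note above effectively turns the flow orientation into rl-tb. */
DIV.mixed  { writing-mode: lr-tb; direction: rtl }

/* Preferred: state the intended flow orientation directly. */
DIV.direct { writing-mode: rl-tb }
```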

Note. These properties do not affect the positioning of background images.

3.3. Glyph orientation within a text run: the 'glyph-orientation-vertical' and 'glyph-orientation-horizontal' properties

In some cases, it is required to alter the orientation of a sequence of characters relative to the primary text advance direction. The requirement is particularly applicable to vertical layouts of East Asian documents, where sometimes half-width Roman text is to be displayed horizontally and other times vertically.

Two properties control the glyph orientation relative to the primary text advance direction. 'glyph-orientation-vertical' controls glyph orientation when the primary text advance direction is vertical. 'glyph-orientation-horizontal' controls glyph orientation when the primary text advance direction is horizontal. It is necessary to distinguish between vertical and horizontal for the following reasons:

'glyph-orientation-vertical'
Value: <angle> | auto | inherit
Initial: auto
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual
<angle>
The User Agent may round the value of the angle to the values of glyph rotation supported by the user agent. Conforming user agents may only support the following values: 0deg, 90deg, 180deg and 270deg.
A value of "0deg" indicates that all glyphs are oriented with the bottom of the glyphs toward the primary text advance direction, resulting in glyphs which are stacked vertically on top of each other. A value of "90deg" indicates a rotation of 90 degrees clockwise from the "0deg" orientation. For characters which have this property set to 90 or 270 degrees, reordering is first applied according to the Unicode Bidi algorithm and then the resulting glyphs are rotated according to the <angle> value.
auto
The glyph orientation relative to the primary text advance direction is determined automatically based on the Unicode character number of the rendered character.

Full-width ideographic and full-width Roman glyphs (excluding ideographic punctuation) are oriented as if an <angle> of "0deg" had been specified (i.e., glyphs are oriented with the bottom of the glyphs toward the primary text advance direction, resulting in glyphs which are stacked vertically on top of each other).

Ideographic punctuation and other ideographic characters having alternate horizontal and vertical glyph shapes shall use the vertical shape of the glyph.

Text which is not full-width will be set as if an <angle> of "90deg" had been specified; thus, half-width Roman text will be rotated 90 degrees clockwise versus full-width ideographic and full-width Roman text.

Hebrew and Arabic text are also rotated 90 degrees clockwise. The visual order of this text is determined by the bidirectional algorithm applied prior to the rotation.

Note. A value of auto will generally produce the expected results in common uses of mixing Japanese with European characters; however, the exact algorithms are based on complex interactions between many factors, including font design, and thus different algorithms might be employed in different processing environments. For precise control, specify explicit <angle> values.

This property specifies the orientation of glyphs relative to the inline progression determined by the 'writing-mode' property. This property is applied only to text written in a vertical writing-mode. Conforming user agents may do the following, in increasing levels of support:

  1. support only the 90deg value,
  2. support the 0deg, 90deg, 180deg and 270deg values.

The value of this property affects both the alignment and height of the glyph area generated for the affected glyphs. If a glyph is oriented so that the normal orientation of the glyph is parallel to the dominant-baseline, then the vertical alignment-point of the rotated glyph is aligned with the alignment-baseline appropriate to that glyph. The baseline to which the rotated glyph is aligned is the vertical baseline identified by the "alignment-baseline" for the script to which the glyph belongs. The height of the glyph area is determined from the height font characteristic for the glyph.

The horizontal alignment points, baselines and heights (computed as glyph advance width) are used if the normal orientation of the glyph is perpendicular to the dominant-baseline.
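
As a minimal sketch of how an author might use this property, the rules below set a vertical paragraph with automatic orientation and force one run of half-width Roman text to stand upright. The class names are hypothetical, and a user agent that supports only some angle values may round the requested angle.

```css
/* Vertical paragraph: full-width glyphs upright, other text rotated 90deg. */
P.vertical {
  writing-mode: tb-rl;
  glyph-orientation-vertical: auto
}

/* Stack the glyphs of this run upright instead of rotating them. */
P.vertical SPAN.upright { glyph-orientation-vertical: 0deg }
```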

The diagrams below illustrate different uses of 'glyph-orientation-vertical'. The diagram on the left shows the result of mixing full-width ideographic characters with half-width Roman characters when 'glyph-orientation-vertical' for the Roman characters is either auto or "90deg". The diagram on the right shows the result of mixing full-width ideographic characters with half-width Roman characters when Roman characters are specified to have a 'glyph-orientation-vertical' of "0deg".

Layout of mixed glyphs in vertical-ideographic mode. Wide-cell glyphs are upright, non-wide-cell glyphs are rotated by 90 degrees.
Example of mixed Japanese and English in vertical-ideographic layout. Japanese glyphs are upright, English rotated.
Layout of mixed glyphs in vertical mode. All glyphs are upright.
Example of mixed Japanese and English in vertical layout. All glyphs are upright.

The bidi algorithm and the 'glyph-orientation-vertical' property have the following interaction:

  1. The bidi algorithm is applied separately to each contiguous text range having the same glyph-orientation-vertical value. In other words, a change in the property value resets the bidi algorithm.
  2. When the glyph-orientation-vertical value is 270 degrees, all mirroring symbols after all due bidi processing are mirrored (that is, their glyph is inverted along the 'horizontal' axis) before being rotated 270 degrees clockwise. This is done to achieve the desired rendering result, which is to have the mirroring characters pointing 'inward' toward the text they are enclosing.

'glyph-orientation-horizontal'
Value: <angle> | inherit
Initial: 0deg
Applies to: all inline-level elements
Inherited: yes
Percentages: N/A
Media: visual
<angle>
The user agent may round the value of the angle to the values of glyph rotation supported by the user agent. A value of "0deg" indicates that all glyphs are oriented with the right edge of the glyphs toward the primary text advance direction, resulting in glyphs which are positioned side by side. A value of "90deg" indicates an orientation of 90 degrees clockwise from the "0deg" orientation. For characters which have this property set to 0 or 180 degrees, reordering is first applied according to the Unicode Bidi algorithm and then the resulting glyphs are rotated clockwise according to the <angle> value.

This property specifies the orientation of glyphs relative to the inline progression determined by the 'writing-mode' property. This property is applied only to text written in a horizontal writing-mode. Conforming user agents may do the following, in increasing levels of support:

  1. support only the 0deg value,
  2. support the 0deg, 90deg, 180deg and 270deg values.

The value of this property affects both the alignment and width of the glyph area generated for the affected glyphs. If a glyph is oriented so that the normal orientation of the glyph is parallel to the dominant-baseline, then the vertical alignment-point of the rotated glyph is aligned with the alignment-baseline appropriate to that glyph. The baseline to which the rotated glyph is aligned is the horizontal baseline identified by the "alignment-baseline" for the script to which the glyph belongs. The width of the glyph area is determined from the vertical width font characteristic for the glyph.

The horizontal alignment points, baselines and widths are used if the normal orientation of the glyph is perpendicular to the dominant-baseline.
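
A short sketch of the horizontal counterpart follows. The class name is hypothetical, and because a conforming user agent may support only the 0deg value, the rotation requested here may have no visible effect.

```css
/* Horizontal paragraph with the initial glyph orientation. */
P.horizontal { writing-mode: lr-tb; glyph-orientation-horizontal: 0deg }

/* Rotate the glyphs of this inline run 90 degrees clockwise. */
P.horizontal SPAN.sideways { glyph-orientation-horizontal: 90deg }
```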

3.4. Embedding and override: the 'unicode-bidi' property

'unicode-bidi'
Value: normal | embed | bidi-override | inherit
Initial: normal
Applies to: all elements, but see prose
Inherited: no
Percentages: N/A
Media: visual

This property allows further control of the Unicode bidirectional algorithm by allowing new embedding level or direction override. Values for this property have the following meanings:

normal
The element does not open an additional level of embedding with respect to the bidirectional algorithm. For inline-level elements, implicit reordering works across element boundaries.
embed
If the element is inline-level, this value opens an additional level of embedding with respect to the bidirectional algorithm. The direction of this embedding level is given by the 'direction' property. Inside the element, reordering is done implicitly. This corresponds to adding an LRE (U+202A; for 'direction: ltr') or an RLE (U+202B; for 'direction: rtl') at the start of the element and a PDF (U+202C) at the end of the element.
bidi-override
If the element is inline-level or a block-level element that contains only inline-level elements, this creates an override. This means that inside the element, reordering is strictly in sequence according to the 'direction' property; the implicit part of the bidirectional algorithm is ignored. This corresponds to adding an LRO (U+202D; for 'direction: ltr') or an RLO (U+202E; for 'direction: rtl') at the start of the element and a PDF (U+202C) at the end of the element.

The final order of characters in each block-level element is the same as if the bidi control codes had been added as described above, mark-up had been stripped, and the resulting character sequence had been passed to an implementation of the Unicode bidirectional algorithm for plain text that produced the same line-breaks as the styled text. In this process, non-textual entities such as images are treated as neutral characters, unless their 'unicode-bidi' property has a value other than 'normal', in which case they are treated as strong characters in the 'direction' specified for the element.
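
To make the difference between the two non-normal values concrete, here is a small sketch with hypothetical class names. The first rule opens an embedding and leaves implicit reordering active inside it; the second forces every character to be laid out strictly left to right.

```css
/* Equivalent to wrapping the content in RLE ... PDF. */
SPAN.quote  { direction: rtl; unicode-bidi: embed }

/* Equivalent to wrapping the content in LRO ... PDF; the implicit
   part of the bidirectional algorithm is ignored inside. */
SPAN.partno { direction: ltr; unicode-bidi: bidi-override }
```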

Note. In order to be able to flow inline boxes in a uniform direction (either entirely left-to-right or entirely right-to-left), more inline boxes (including anonymous inline boxes) may have to be created, and some inline boxes may have to be split up and reordered before flowing.

Because the Unicode algorithm has a limit of 61 levels of embedding, care should be taken not to use 'unicode-bidi' with a value other than 'normal' unless appropriate. In particular, a value of 'inherit' should be used with extreme caution. However, for elements that are, in general, intended to be displayed as blocks, a setting of 'unicode-bidi: embed' is preferred to keep the element together in case display is changed to inline (see example below).

The following example shows an XML document with bidirectional text. It illustrates an important design principle: DTD designers should take bidi into account both in the language proper (elements and attributes) and in any accompanying style sheets. The style sheets should be designed so that bidi rules are separate from other style rules. The bidi rules should not be overridden by other style sheets so that the document language's or DTD's bidi behavior is preserved.

Example(s):

In this example, lowercase letters stand for inherently left-to-right characters and uppercase letters represent inherently right-to-left characters:

<HEBREW>
  <PAR>HEBREW1 HEBREW2 english3 HEBREW4 HEBREW5</PAR>
  <PAR>HEBREW6 <EMPH>HEBREW7</EMPH> HEBREW8</PAR>
</HEBREW>
<ENGLISH>
  <PAR>english9 english10 english11 HEBREW12 HEBREW13</PAR>
  <PAR>english14 english15 english16</PAR>
  <PAR>english17 <HE-QUO>HEBREW18 english19 HEBREW20</HE-QUO></PAR>
</ENGLISH>

Since this is XML, the style sheet is responsible for setting the writing direction. This is the style sheet:

/* Rules for bidi */
HEBREW, HE-QUO  {direction: rtl; unicode-bidi: embed}
ENGLISH         {direction: ltr; unicode-bidi: embed}

/* Rules for presentation */
HEBREW, ENGLISH, PAR  {display: block}
EMPH                  {font-weight: bold}

The HEBREW element is a block with a right-to-left base direction, the ENGLISH element is a block with a left-to-right base direction. The PARs are blocks that inherit the base direction from their parents. Thus, the first two PARs are read starting at the top right, the final three are read starting at the top left. Please note that HEBREW and ENGLISH are chosen as element names for explicitness only; in general, element names should convey structure without reference to language.

The EMPH element is inline-level, and since its value for 'unicode-bidi' is 'normal' (the initial value), it has no effect on the ordering of the text. The HE-QUO element, on the other hand, creates an embedding.

The formatting of this text might look like this if the line length is long:

               5WERBEH 4WERBEH english3 2WERBEH 1WERBEH

                                8WERBEH 7WERBEH 6WERBEH

english9 english10 english11 13WERBEH 12WERBEH

english14 english15 english16

english17 20WERBEH english19 18WERBEH

Note that the HE-QUO embedding causes HEBREW18 to be to the right of english19.

If lines have to be broken, it might be more like this:

       2WERBEH 1WERBEH
  -EH 4WERBEH english3
                 5WERB

   -EH 7WERBEH 6WERBEH
                 8WERB

english9 english10 en-
glish11 12WERBEH
13WERBEH

english14 english15
english16

english17 18WERBEH
20WERBEH english19

Because HEBREW18 must be read before english19, it is on the line above english19. Just breaking the long line from the earlier formatting would not have worked. Note also that the first syllable from english19 might have fit on the previous line, but hyphenation of left-to-right words in a right-to-left context, and vice versa, is usually suppressed to avoid having to display a hyphen in the middle of a line.

3.5. Script character classification: the 'script' property

In text layout, many of the behaviors are related to a character classification based on scripts. For example, line breaking or text justification behaviors depend on the 'dominant' script of the textual content of an element. This can be heuristically determined by finding the first character that has an unambiguous script identifier in an element. It can also be explicitly specified by using the 'script' property.

'script'
Value: auto | none | <script> | inherit
Initial: auto
Applies to: all elements
Inherited: yes
Percentages: N/A
Media: visual

Values have the following meanings:

auto
Use the first character descendant[, after any reordering due to character direction and bi-directionality,] which has an unambiguous script identifier to determine the dominant script of the element's content. This determines the computed script value. Each textual component of the element may, however, follow the typography-related behaviors dictated by its own script identifier. In the absence of any textual components with a clear script identifier (or of any textual content at all), the computed value is 'Latin'.
none
Indicates the script is unknown or is not significant to the proper formatting of this element.
<script>
A script definition in conformance with [ISO15924]. All textual components of the element must behave in typography related behaviors as dictated by this script value, not the inherent script value of these textual components.
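
The following sketch shows how the three kinds of value might be written. The class names are hypothetical. The exact spelling of a <script> value is not given above, so the last rule assumes that an ISO 15924 four-letter code is written as a bare identifier; treat that spelling as an assumption rather than as defined syntax.

```css
/* Let the UA pick the dominant script from the content (initial value). */
P.mixed   { script: auto }

/* The script is unknown or irrelevant to formatting. */
P.codes   { script: none }

/* Assumption: an ISO 15924 code used as the <script> value. */
P.arabic  { script: Arab }
```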

4. Text alignment and justification

4.1. Text alignment: the 'text-align' property

'text-align'
Value: start | end | left | right | center | justify | <string> | inherit
Initial: start
Applies to: block-level elements
Inherited: yes
Percentages: N/A
Media: visual

This property describes how inline content of a block is aligned. Values have the following meanings:

start
The text is aligned on the start of the inline progression as determined by the current text advance direction.
end
The text is aligned on the end of the inline progression as determined by the current text advance direction.
left, right
In horizontal inline progression, the text is aligned on the left or right respectively. In vertical inline progression, the alignment is UA dependent. The values 'start' and 'end' should be used instead. The usage of 'left' and 'right' is deprecated in CSS3.
center
The text is center aligned.
justify
The text is justified. The justification algorithm can be further refined by using the 'text-justify' property. Conforming user agents may interpret the value 'justify' as 'start'.
<string>
Specifies a string on which cells in a table column will align (see the section on horizontal alignment in a column for details and an example). This value applies only to table cells. If set on other elements, it will be treated as 'start'.

A block of text is a stack of line boxes. In the case of 'start', 'end', 'left', 'right' and 'center', this property specifies how the inline boxes within each line box align with respect to the line box's start and end sides; alignment is not with respect to the viewport. In the case of 'justify', the UA may stretch the inline boxes in addition to adjusting their positions. (See also 'letter-spacing' and 'word-spacing'.)

Example(s):

In this example, note that since 'text-align' is inherited, all block-level elements inside the DIV element with 'class=center' will have their inline content centered.

DIV.center { text-align: center }
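
As a further sketch, the following rules use the 'start' and 'end' values in place of the deprecated 'left' and 'right', and a <string> value on table cells. The class names 'abstract' and 'signature' are hypothetical, and the rules assume a user agent that implements the values described above.

```css
/* Align lines with the start of the inline progression,
   whatever the current text advance direction is */
P.abstract { text-align: start }

/* Align lines with the end of the inline progression */
P.signature { text-align: end }

/* Align the cells of a table column on the "." character;
   on elements other than table cells this is treated as 'start' */
TD { text-align: "." }
```

In horizontal text, 'start' gives left-aligned lines when the text advances left to right and right-aligned lines when it advances right to left, so one rule serves both directions.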

4.2. Justification: the 'text-justify' property

'text-justify'
Value: auto | inter-word | inter-ideograph | distribute | newspaper | inter-cluster | kashida | inherit
Initial: auto
Applies to: block-level elements
Inherited: yes
Percentages: N/A
Media: visual

This property selects the type of justify alignment. It affects the text layout only if 'text-align' is set to 'justify'. That way, UAs that do not support this property will still render the text as fully justified, which most of the time is at least partially correct. Typically the 'text-justify' property does not affect the last line, unless the last line itself is justified. Most of the 'text-justify' values affect writing systems in very specific ways. These writing systems (or groups of writing systems) are the script groups listed in Figure 3.2.8: Latin, CJK, Devanagari, Arabic and South-East Asian clusters.

The text-justification behavior of textual components is guided by the script classification of the characters. The 'script' property allows the behavior of these components to be modified.

Depending on the text-justify value, spacing may be altered between words or letters.

The possible values for the text-justify property are:

auto
The UA determines the justification algorithm to follow, based on a balance between performance and adequate presentation quality. Inter-word expansion is typically used for all scripts that use space as a word delimiter. However, if the 'text-kashida-space' property has a non-zero value, it is recommended to use kashida elongation for Arabic text.
inter-word
Selects the simplest and fastest full justification behavior, which spreads the text evenly across the line by increasing the width of the space between words only. The concept of a word is script dependent; the exact algorithm is determined by the User Agent. At a minimum, justification is expected to occur at each white space boundary. No expansion or compression occurs within the words, i.e. no additional letter spacing is created. No kashida effect takes place.

The diagram below illustrates this mode, by showing how the characters are laid out in the last two lines of an element:

Diagram showing glyph distribution in inter-word justification

Figure 3.2.1: Mixed glyph layout in the last two lines in an inter-word justified element

For example a viewer could render an 'inter-word' justified paragraph in the following way:

Example of inter-word justification applied to mixed Japanese and English text

Figure 3.2.2: Inter-word justification applied to mixed text

newspaper
Selects the justification behavior in which both inter-word and inter-letter spacing can be expanded or reduced to spread the text across the whole line. Also, text distribution on any given line may depend on the layout or the contents of the previous or the following several lines. This is a significantly slower and more sophisticated type of full justification, preferred in newspapers and magazines because it is especially useful for narrow columns. Typically, compression is tried first. If that is unsuccessful, expansion occurs: inter-word spaces are expanded up to a threshold, and finally inter-letter expansion is performed. This is applied to all script groups except Devanagari and other South Asian writing systems using baseline connectors. The threshold value may be related to the column width (in number of characters). The exact layout algorithm is determined by the User Agent.

The diagram below illustrates this mode:

Diagram showing character distribution in newspaper justification

Figure 3.2.3: Mixed character layout in the last two lines of a newspaper justified element

inter-ideograph
In this mode, letter-spacing modification only occurs for the CJK group. Other script groups only use inter-word expansion. No kashida effect takes place. This is the preferred justification in the context of the Japanese writing system, but not in the context of Latin or Korean.

The diagram below illustrates this mode:

Diagram showing glyph distribution in inter-ideograph justification

Figure 3.2.4: Mixed glyph layout in the last two lines in an inter-ideograph justified element

Below is an example of how this mode would work:

Example of inter-ideograph justification applied to mixed Japanese and English text

Figure 3.2.5: Inter-ideograph justification applied to mixed text

distribute
Like 'newspaper', this value allows letter spacing modification for most script groups (except Devanagari and other scripts that use baseline connectors), but unlike 'newspaper' it does not prioritize between word spacing and letter spacing, i.e. the space character gets the same letter spacing modification as other characters. Consequently, there is no variation between narrow and wide columns. This value is best used in an East Asian context.

The diagram below illustrates this mode:

Diagram showing character distribution in distribute justification

Figure 3.2.6: Mixed character layout in the last two lines of a distribute justified element

For example a viewer could render a 'distribute' justified paragraph in the following way:

Example of distribute justification applied to mixed Japanese and English text

Figure 3.2.7: Distribute justification applied to mixed text

inter-cluster
Plays the same role as 'inter-ideograph' but for South-East Asian scripts. That is, letter spacing only occurs for clusters belonging to those scripts. A cluster is defined as a group of characters formatted as a single unit.
kashida
Plays the same role as inter-ideograph but for Arabic through the Kashida effect. That is, no letter spacing occurs for other scripts.

The following table describes the expansion/compression strategy for each combination of script group and 'text-justify' property value:

text-justify property value (columns) by script group (rows):

Script groups | auto* | inter-word | newspaper | inter-ideograph | distribute | inter-cluster | kashida
Latin | word-spacing only* | word-spacing only | prioritization between word-spacing and letter-spacing | word-spacing only | word-spacing only | word-spacing only | word-spacing only
CJK | no extra spacing* | no extra spacing | letter-spacing | letter-spacing | letter-spacing | no extra spacing | no extra spacing
Devanagari* | word-spacing* | word-spacing | word-spacing | word-spacing | word-spacing | word-spacing | word-spacing
Arabic | kashida and word-spacing* | word-spacing | kashida and word-spacing | word-spacing | kashida and word-spacing | word-spacing | kashida and word-spacing
SE Asian clusters | inter-cluster spacing* | inter-cluster spacing | inter-cluster spacing | no extra spacing | inter-cluster spacing | inter-cluster spacing | no extra spacing

Figure 3.2.8: Interaction between text-justify values and script groups

*The values shown for the auto column are only a recommendation. The UAs might implement a different strategy.

*The Devanagari entry represents as well other scripts and writing systems used in India that use baseline connectors like Bengali and Gurmukhi.
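
The interaction described in the table can be sketched with a few rules. The class names are hypothetical, and the rules assume a user agent that implements the 'text-justify' values as drafted here.

```css
/* Japanese body text: extra spacing between ideographs,
   inter-word expansion only for embedded Latin text */
P.japanese { text-align: justify; text-justify: inter-ideograph }

/* Narrow Latin columns: the UA may adjust both word and letter spacing */
DIV.column { text-align: justify; text-justify: newspaper }

/* South-East Asian text: spacing is added between clusters */
P.thai { text-align: justify; text-justify: inter-cluster }

/* 'text-justify' has no effect here, because 'text-align' is not 'justify' */
P.ragged { text-align: start; text-justify: distribute }
```

A user agent that does not support 'text-justify' would still justify the first three paragraphs, using whatever algorithm it applies to 'text-align: justify'.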

4.3. Last line alignment: the 'text-align-last' property

'text-align-last'
Value: auto | start | end | center | justify | size | inherit
Initial: auto
Applies to: block-level elements
Inherited: yes
Percentages: N/A
Media: visual

This property describes how the last line of the inline content of a block is aligned. It also applies to the only line of a block that contains a single line, to the line preceding a BR element and to the last lines of anonymous blocks. Typically the last line is aligned like the other lines of the block element, as set by the 'text-align' property. However, in some situations, such as when the 'text-align' property is set to 'justify', the last line may be aligned differently.

Values have the following meanings:

auto
The last line will be aligned like the other lines, that is, as determined by the value of the 'text-align' property. However, if the 'text-align' property is set to the value 'justify', the last line will be aligned to the start of the inline progression.
start, end and center
The last line is aligned to the start, aligned to the end, or centered, respectively.
justify
The last line will be justified like the other lines, using the justification type set by the 'text-justify' property. Note however that if there is no expansion opportunity in the last line, the line might not appear justified.
size
The line content is scaled to fit on the line. All the fonts on the line must be scaled by the same factor, and that factor must be as small as possible (i.e. fit as much on a line as possible). Typically this value is used for single-line elements.

The following example shows the usage of the alignment properties in a case where all lines are justified in a distributed justification. This is commonly found in East Asian typography:

P.distributealllines
 { text-align: justify; text-justify: distribute; text-align-last: justify }
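
Two further sketches of last-line alignment follow. The class names are hypothetical, and the rules assume a user agent that implements 'text-align-last' as drafted here.

```css
/* Justified paragraph whose last line is centered */
P.centerlast { text-align: justify; text-align-last: center }

/* Justified paragraph with the default last-line behavior:
   with 'auto', the last line is aligned to the start of the inline progression */
P.plain { text-align: justify; text-align-last: auto }
```

Because the property also applies to a line preceding a BR element, each forced line break inside P.centerlast would end in a centered line as well.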

4.4. Minimum and maximum font size: the 'min-font-size' and 'max-font-size' properties

'min-font-size'
Value: <font-size> | inherit
Initial: 0
Applies to: all elements
Inherited: yes
Percentages: element's computed 'font-size'
Media: visual

If 'text-align-last' is 'size', the fonts of the last line of an element are not allowed to become smaller than the smaller of 'font-size' and 'min-font-size'.

'max-font-size'
Value: <font-size> | auto | inherit
Initial: auto
Applies to: all elements
Inherited: yes
Percentages: element's computed 'font-size'
Media: visual

If 'text-align-last' is 'size', the fonts of the last line of an element are not allowed to become larger than the larger of 'font-size' and 'max-font-size'. 'auto' means that there is no limit.
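
The following sketch shows how the two limits bound the scaling performed by 'text-align-last: size'. The class name and the sizes are hypothetical, and the rule assumes a user agent that implements these properties as drafted here.

```css
/* Single-line heading scaled to fit its line.
   Lower bound: the smaller of 'font-size' (24pt) and 'min-font-size' (18pt), i.e. 18pt.
   Upper bound: the larger of 'font-size' (24pt) and 'max-font-size' (36pt), i.e. 36pt. */
H1.banner {
  font-size: 24pt;
  text-align-last: size;
  min-font-size: 18pt;
  max-font-size: 36pt
}
```

With the initial values ('min-font-size: 0' and 'max-font-size: auto'), the scaling would be unbounded in both directions.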

4.5. Additional compression: the 'text-justify-trim' property

'text-justify-trim'
Value: none | punctuation | punct-and-kana | inherit
Initial: punctuation
Applies to: block-level elements
Inherited: yes
Percentages: N/A
Media: visual

This property sets which glyphs the text justification algorithm is permitted to compress by removing blank space, when 'text-justify' is anything other than 'inter-word'. This special type of space compression occurs at the font level, i.e. the blank space within the character area itself may be reduced without affecting the appearance of the glyph. This applies to wide-cell glyphs only. Possible values:

none
No wide-cell font space compression is allowed.

Diagram of glyph layout with no compression

Figure 3.4.1: Glyph layout with no compression

punctuation
Space can be taken away only from wide-cell punctuation glyphs.

Diagram of glyph layout with punctuation compression

Figure 3.4.2: Glyph layout with punctuation compression

punct-and-kana
Space compression is allowed on wide-cell punctuation and wide-cell Kana glyphs.

Diagram of character layout with punct-and-kana compression

Figure 3.4.3: Character layout with punctuation and Kana compression
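
The three values can be sketched as follows. The class names are hypothetical, and the rules assume a user agent that implements 'text-justify-trim' as drafted here.

```css
/* Allow compression of wide-cell punctuation and Kana glyphs */
P.tight {
  text-align: justify;
  text-justify: inter-ideograph;
  text-justify-trim: punct-and-kana
}

/* Forbid any wide-cell compression; only expansion remains available */
P.loose {
  text-align: justify;
  text-justify: distribute;
  text-justify-trim: none
}

/* No effect: the property is ignored when 'text-justify' is 'inter-word' */
P.words {
  text-align: justify;
  text-justify: inter-word;
  text-justify-trim: punct-and-kana
}
```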

4.6. Kashida effect: the 'text-kashida-space' property

'text-kashida-space'
Value: <percentage> | inherit
Initial: 0%
Applies to: block-level elements
Inherited: yes
Percentages: as described
Media: visual

Kashida is a typographic effect used in Arabic writing systems that allows character elongation at some carefully chosen points. Each elongation can be accomplished using a number of kashida glyphs, a single graphic or character elongation on each side of the kashida point. (The UA may use either mechanism based on font or system capability.) The 'text-kashida-space' property expresses the ratio of the kashida expansion size to the white space expansion size: 0% means no kashida expansion, 100% means kashida expansion only. This property can be used with any justification style where kashida expansion is used (currently the 'text-justify' values 'auto', 'kashida', 'distribute' and 'newspaper').

In the diagram below showing two identical paragraphs of Arabic text, the blue line in the second line (not justified) shows the length that is used for kashida and divided among the elongation opportunities in the first line (justified), as indicated by the red underlines:

Example of kashida applied to Arabic text

Figure 3.5: Kashida applied to Arabic text

In that example no expansion occurs between the words themselves, indicating that the 'text-kashida-space' property is set to 100%.
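
A minimal sketch of rules that request kashida justification follows. The class names are hypothetical, and the rules assume a user agent that implements 'text-kashida-space' as drafted here.

```css
/* Kashida expansion only, with no extra white space between words */
P.arabic {
  text-align: justify;
  text-justify: kashida;
  text-kashida-space: 100%
}

/* A mix of kashida elongation and white space expansion;
   the exact split follows the ratio described above */
P.arabic-mixed {
  text-align: justify;
  text-justify: newspaper;
  text-kashida-space: 50%
}

/* Initial value: no kashida expansion at all */
P.arabic-plain {
  text-align: justify;
  text-kashida-space: 0%
}
```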


5. Baseline alignment

5.1. Baseline information provided by fonts

The glyphs of a given script are positioned so that a particular point on each glyph, the alignment-point, is aligned with the alignment-points of the other glyphs in that script. The glyphs of different scripts are typically aligned at different points on the glyph. For example, Western glyphs are aligned on the bottoms of the capital letters, certain Indic glyphs (including glyphs from the Devanagari, Gurmukhi and Bengali scripts) are aligned at the top of a horizontal stroke near the top of the glyphs and East Asian glyphs are aligned either at the bottom or center of the EM box of the glyph. Within a script and within a line of text having a single font-size, the sequence of alignment-points defines, in the inline-progression-direction, a geometric line called a baseline. Western and most other alphabetic and syllabic glyphs are aligned to an "alphabetic" baseline, the above Indic glyphs are aligned to a "hanging" baseline and the East Asian glyphs are aligned to an "ideographic" baseline.

three alignment points

This figure shows the vertical position of the alignment-point for alphabetic and many syllabic scripts, illustrated by a Roman "A"; for certain Indic scripts, illustrated by a Gurmukhi syllable "ji"; and for ideographic scripts, illustrated by the ideographic glyph meaning "country". The thin black rectangle around the ideographic glyph illustrates the EM box for that glyph and shows the typical positioning of the "black marks" of the glyph within the EM box.

A baseline-table specifies the position of one or more baselines in the design space coordinate system. The function of the baseline table is to facilitate the alignment of different scripts with respect to each other when they are mixed on the same text line. Because the desired relative alignments may depend on which script is dominant in a line (or block), there may be a different baseline table for each script. In addition, different alignment positions are needed for horizontal and vertical writing modes. Therefore, the font may have a set of baseline tables: typically, one or more for horizontal writing-modes and zero or more for vertical writing-modes.

different baseline positions

Examples of horizontal and vertical baseline positions. The thin lined box in each example is the "EM box". For the Latin glyphs, only the EM box of the first glyph is shown. Example 1 shows typical Latin text written horizontally. This text is positioned relative to the alphabetic baseline, shown in blue. Example 2 shows a typical ideographic glyph positioned on the horizontal ideographic baseline. Note that the EM box is positioned differently for these two cases. Examples 3 and 4 show the same set of baselines used in vertical writing. The Latin text, example 3, is shown with a glyph-orientation of 90 degrees, which is typical for proportionally spaced Latin glyphs in vertical writing. Even though the ideographic glyph in Example 4 is positioned on the vertical ideographic baseline, because it is centered in the EM box, all glyphs with the same EM box are centered, vertically, with respect to one another.

The font tables for a font include font characteristics for the individual glyphs in the font. CSS assumes that the font tables include, for each glyph in the font, one width value, one alignment-baseline and one alignment-point for the horizontal writing-modes. If vertical writing-modes are supported, then each glyph must have another width value, alignment-baseline and alignment-point for the vertical writing-modes. (Even though it is specified as a width, for vertical writing-modes the width is used in the vertical direction.)

The script to which a glyph belongs determines an alignment-baseline to which the glyph is to be aligned. The position of this baseline in the design space coordinate system determines the default block progression direction position of the alignment-point. The inline progression direction position of the alignment-point is on the start-edge of the glyph.

alignment in em box

This figure shows glyphs from three different scripts, each with its EM box and, within the EM box, the baseline table applicable to that glyph. The alignment-point of each glyph is shown by an "X" on the start edge of the EM box and by making the alignment-baseline blue. The baseline-table of the parent element of the characters that mapped to these glyphs is shown as a set of dashed lines.

5.2. Baseline identifiers

The baseline alignment properties control the alignment of child elements with respect to their parent. The positions of these baselines are illustrated in the following figure:

different baselines

This figure shows samples of Gurmukhi (a hanging Indic script), Latin and ideographic scripts together with most of the baselines defined below. The thin line around the ideographic glyphs symbolizes the EM box in which these glyphs are centered. In this figure, the position of the "text-before-edge" and "text-after-edge" baselines is computed assuming that the "alphabetic" baseline is the dominant-baseline. The "central" baseline has been omitted from the figure, but it lies halfway between the "text-before-edge" and "text-after-edge" baselines, just about where the "math" baseline is shown.

The baseline-identifiers below are used in this specification. Some of these are determined by baseline-tables contained in a font as described in the section describing the baseline information provided by fonts. Others are computed from other font characteristics as described below.

alphabetic

This identifies the baseline used by most alphabetic and syllabic scripts. These include, but are not limited to, many Western, Southern Indic and Southeast Asian (non-ideographic) scripts.

ideographic

This identifies the baseline used by ideographic scripts. For historical reasons, this baseline is at the bottom of the ideographic EM box and not in the center of the ideographic EM box. See the "central" baseline. The ideographic scripts include Chinese, Japanese, Korean, and Vietnamese Chu Nom.

hanging

This identifies the baseline used by certain Indic scripts. These scripts include Devanagari, Gurmukhi and Bengali.

mathematical

This identifies the baseline used by mathematical symbols.

central

This identifies a computed baseline that is at the center of the EM box. This baseline lies halfway between the text-before-edge and text-after-edge baselines.

Note. For ideographic fonts, this baseline is often used to align the glyphs; it is an alternative to the ideographic baseline.

middle

This identifies a baseline that is offset from the alphabetic baseline in the shift-direction by 1/2 the value of the x-height font characteristic. The position of this baseline may be obtained from the font data or, for fonts that have a font characteristic for "x-height", it may be computed using 1/2 the "x-height". Lacking either of these pieces of information, the position of this baseline may be approximated by the "central" baseline.

text-before-edge

This identifies the before-edge of the EM box. The position of this baseline may be specified in the baseline-table or it may be calculated.

Note. The position of this baseline is normally around or at the top of the ascenders, but it may not encompass all accents that can appear above a glyph. For these fonts the value of the "ascent" font characteristic is used. For ideographic fonts, the position of this baseline is normally 1 EM in the shift-direction from the "ideographic" baseline. However, some ideographic fonts have a reduced width in the inline-progression-direction to allow tighter setting. When such a font, designed only for vertical writing-modes, is used in a horizontal writing-mode, the "text-before-edge" baseline may be less than 1 EM from the text-after-edge.

text-after-edge

This identifies the after-edge of the EM box. The position of this baseline may be specified in the baseline-table or it may be calculated.

Note. For fonts with descenders, the position of this baseline is normally around or at the bottom of the descenders. For these fonts the value of the "descent" font characteristic is used. For ideographic fonts, the position of this baseline is normally at the "ideographic" baseline.

There are, in addition, two computed baselines that are only defined for line boxes. For each line box, there is a dominant-baseline, a baseline-table and a baseline-table font-size which are those of the nearest ancestor element that completely contains the whole line. The "before-edge" and "after-edge" baselines are defined as follows.

before-edge

The offset of the "before-edge" baseline of the line from the dominant-baseline of the line is determined by ignoring all inline boxes whose alignment-baseline is either "before-edge" or "after-edge". For the "before-edge", extents are measured from the dominant-baseline in the direction toward the top (relative) of the box. The "before-edge" baseline offset is set to the maximum extent of the "before-edges" of the allocation-rectangles of the remaining areas. If all the inline-areas in a line-area are aligned either to the "before-edge" or to the "after-edge", then use the offset of the "text-before-edge" baseline of the line as the offset of the "before-edge" baseline of the line.

after-edge

The offset of the "after-edge" baseline of the line from the dominant-baseline of the line is determined by ignoring all inline boxes whose alignment-baseline is after-edge. For the "after-edge", extents are measured from the dominant-baseline in the direction toward the bottom (relative) of the reference-area. The "after-edge" baseline offset is set to the negative of the maximum of (1) the maximum extent of the "after-edges" of the allocation-rectangles of the remaining areas and (2) the maximum height of the allocation-rectangles of the areas that are ignored minus the offset of the "before-edge" baseline of the line.

Note. If all the inline-areas in a line-area are aligned to the "after-edge", then the specification for the "before-edge" will set the "before-edge" baseline to coincide with the "text-before-edge" baseline of the line. Then, case (2) above will determine an offset to the "after-edge" baseline that will align the "before-edge" of the area with the greatest height to its allocation-rectangle with the "before-edge" baseline.

Note. The above specifications for "before-edge" and "after-edge" have the following three properties: (1) the allocation-rectangles of all the areas are below the "before-edge", (2) the allocation-rectangles of all the areas are above the "after-edge", and (3) the distance between the "before-edge" and the "after-edge" cannot be decreased without violating (1) or (2). The specified placement of the "before-edge" and "after-edge" is not the only way that (1)-(3) can be satisfied, but it is the only way they can be satisfied with the smallest possible offset to the "before-edge".

Examples showing "before-edge" and "after-edge" alignment:

aligning text and images

The rectangles with lines or arrows are images with an intrinsic size as shown. The rectangles with no arrows represent images that receive the default, dominant baseline, alignment. The alignment of the other rectangles is at the furthest point from the arrowhead (which is in the middle when there are two arrowheads). Examples 1 and 2 show that the "before-edge" alignment is determined by the tallest non-"before-edge" aligned objects: in example 1 this is the default aligned, arrowhead-free rectangular image and in example 2 this is the double-headed arrow rectangle. Examples 3 and 4 show defaulting to the "text-before-edge" when all the boxes have either "before-edge" or "after-edge" alignment. In example 3, the set of images with "before-edge" alignment has a taller member than the set of "after-edge" aligned images. In example 4, the tallest image is in the "after-edge" aligned set. Example 5 is a repetition of example 2 with the largest image being an "after-edge" aligned image.

There are also four baselines that are defined only for horizontal writing-modes.

top

This baseline is the same as the "before-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.

text-top

This baseline is the same as the "text-before-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.

bottom

This baseline is the same as the "after-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.

text-bottom

This baseline is the same as the "text-after-edge" baseline in a horizontal writing-mode and is undefined in a vertical writing mode.

5.3. Overview of the baseline alignment process

The alignment of an element with respect to its parent is determined by three things: the scaled-baseline-table of the parent and the alignment-baseline and alignment-point of the element being aligned. Prior to alignment, the scaled-baseline-table of the parent may be shifted. The property specifications below provide the information necessary to align the parent and child elements.

There are four properties that control alignment of elements to the above set of baselines: 'dominant-baseline', 'alignment-baseline', 'baseline-shift' and 'alignment-adjust'. These properties are all independent and are designed so that typically only the specification of one of the properties is needed to achieve a particular alignment goal.

The primary baseline alignment property is the 'dominant-baseline' property. This property has a compound value with three components. The dominant-baseline-identifier component is the default 'alignment-baseline' to be used when aligning two inline areas. The baseline-table component specifies the positions of the baselines in the font design space coordinates. The baseline-table acts something like a musical staff; it defines particular points along the block-progression-direction to which glyphs and inline elements can be aligned. The baseline-table 'font-size' component provides a scaling factor for the baseline-table.

Because the value of the 'font-family' property is a list of fonts, to ensure a consistent choice of baseline-table we define the nominal font in a font list as the first font in the list for which glyph data is available. This is the first font that could contain a glyph for each character encountered. (For this definition, glyph data is assumed to be present if a font substitution is made or if the font is synthesized.) This definition ensures a content-independent determination of the font and baseline table that is to be used.

For convenience, the specification will sometimes refer to the baseline identified by the dominant-baseline-identifier component of the "dominant-baseline" property as the "dominant baseline" (in an abuse of terminology).

The model also assumes that each glyph has an 'alignment-baseline' value which specifies the baseline with which the glyph is to be aligned. (The 'alignment-baseline' is called the "Baseline Tag" in the OpenType baseline-table description.) The initial value of the 'alignment-baseline' property uses the baseline identifier associated with the given glyph. Alternate values for 'alignment-baseline' can be useful for glyphs such as a "*" which are ambiguous with respect to script membership.

The model assumes that the font from which the glyph is drawn also has a baseline table, the font baseline-table. This baseline table has offsets in units-per-em from the (0,0) point to each of the baselines the font knows about. In particular, it has the offset from the glyph's (0,0) point to the baseline identified by the 'alignment-baseline'.

The offset values in the baseline-table are in "design units" which means fractional units of the EM. CSS calls these "units-per-em". Thus, the current 'font-size' is used to determine the actual offset from the dominant baseline to the alternate baselines.

The glyph is aligned so that its baseline identified by its 'alignment-baseline' is aligned with the baseline with the same name from the dominant baseline-table.

The offset from the dominant baseline of the parent to the baseline identified by the 'alignment-baseline' is computed using the dominant baseline-table and dominant baseline-table font-size. The font baseline-table and font-size applicable to the glyph are used to compute the offset from the identified baseline to the (0,0) point of the glyph. This second offset is subtracted from the first offset to get the position of the (0,0) point in the shift direction. Both offsets are computed by multiplying the baseline value from the baseline-table times the appropriate font-size value.

If the 'alignment-baseline' identifies the dominant baseline, then the first offset is zero and the glyph is aligned with the dominant baseline; otherwise, the glyph is aligned with the chosen alternate baseline.
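
The offset computation described above can be written out as a small worked example. The symbols and the numeric values are assumptions introduced for illustration; in particular, the dominant baseline-table entries are assumed to be measured from the dominant baseline, and the font baseline-table entries from the glyph's (0,0) point.

```latex
% Assumed symbols (not defined in this draft):
%   T_D(b): entry for baseline b in the dominant baseline-table, in units-per-em,
%           measured from the dominant baseline
%   s_D:    dominant baseline-table font-size
%   T_F(b): entry for baseline b in the font baseline-table, in units-per-em,
%           measured from the glyph's (0,0) point
%   s_F:    font-size applicable to the glyph
\begin{align*}
o_1 &= T_D(b)\, s_D && \text{(dominant baseline to baseline } b\text{)}\\
o_2 &= T_F(b)\, s_F && \text{(glyph (0,0) point to baseline } b\text{)}\\
p   &= o_1 - o_2    && \text{(position of the (0,0) point in the shift direction)}
\end{align*}
% Hypothetical values: b = hanging, T_D(b) = 0.8, s_D = 20px, T_F(b) = 0.75, s_F = 10px
\begin{align*}
o_1 &= 0.8 \times 20\,\text{px} = 16\,\text{px}\\
o_2 &= 0.75 \times 10\,\text{px} = 7.5\,\text{px}\\
p   &= 16\,\text{px} - 7.5\,\text{px} = 8.5\,\text{px}
\end{align*}
```

Under these assumptions, when b is the dominant baseline itself, T_D(b) = 0 and the first offset vanishes, which matches the case where the glyph is aligned directly with the dominant baseline.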

The third baseline alignment property is the 'baseline-shift' property. Like the properties other than the 'dominant-baseline' property, this property does not change the baseline-table or the baseline-table font-size. It does shift the whole baseline table of the parent element so that when an inner inline element is aligned to one of the parent's baselines, the position of the inner inline element is shifted.

The fourth alignment property is the 'alignment-adjust' property. This property is primarily used for elements, such as some graphics, that do not belong to a particular script and do not have a predefined alignment point. The "alignment-adjust" property allows the author to assign where, on the start-edge of the object, the alignment point for that element lies.

In addition to the following definitions of these properties, informative Appendix B provides usage examples of these properties.

5.4. Dominant baseline: the 'dominant-baseline' property

'dominant-baseline'
Value: auto | use-script | no-change | reset-size | ideographic | alphabetic | hanging | mathematical | inherit
Initial: auto
Applies to: inline-level elements
Inherited: no
Percentages: N/A
Media: visual

The 'dominant-baseline' property is used to determine or re-determine a scaled-baseline-table. A scaled-baseline-table is a compound value with three components: a baseline-identifier for the dominant baseline, a baseline-table and a baseline-table font-size.

Some values of the property re-determine all three components; others only re-establish the baseline-table font-size. Values for the property have the following meanings:

auto
If this property occurs on a block or inline-block element, then the value of the baseline-identifier depends on the value of the 'script' property. If the value of the 'script' property is 'auto', the baseline-identifier is 'alphabetic' for horizontal 'writing-mode' values and 'central' for vertical 'writing-mode' values. If the value of the 'script' property is other than 'auto', the baseline-identifier value is based on that script. The 'writing-mode' value, whether horizontal or vertical, is used to select the baseline-table that corresponds to that baseline-identifier. The baseline-table font-size component is set to the value of the 'font-size' property on this element.
Otherwise, if this property occurs on an inline-level element, then the baseline-identifier and the baseline-table components remain the same as those of the parent element. The baseline-table font-size also remains the same as the parent's, unless the computed 'baseline-shift' value actually shifts the baseline; in that case the baseline-table font-size is set to the value of the 'font-size' property on this element. If there is no parent element, the dominant-baseline components are set as for block elements.
use-script
The dominant baseline-identifier is set using the computed value of the 'script' property. The 'writing-mode' value, whether horizontal or vertical, is used to select the baseline-table that corresponds to that baseline-identifier. The baseline-table font-size component is set to the value of the 'font-size' property on this element.
no-change
The dominant baseline-identifier, the baseline-table and the baseline-table font-size remain the same as that of the parent.
reset-size
The dominant baseline-identifier and the baseline table remain the same, but the baseline-table font-size is changed to the value of the 'font-size' property on this element. This re-scales the baseline table for the current 'font-size'.
ideographic
The dominant baseline-identifier is set to the 'ideographic' baseline using the baseline-table and baseline-table font-size of the parent, the baseline table is changed to correspond to the 'ideographic' baseline, and the baseline-table font-size is changed to the value of the 'font-size' property on this element.
hanging
The dominant baseline-identifier is set to the 'hanging' baseline using the baseline-table and baseline-table font-size of the parent, the baseline table is changed to correspond to the 'hanging' baseline, and the baseline-table font-size is changed to the value of the 'font-size' property on this element.
alphabetic
The dominant baseline-identifier is set to the 'alphabetic' baseline using the baseline-table and baseline-table font-size of the parent, the baseline table is changed to correspond to the 'alphabetic' baseline, and the baseline-table font-size is changed to the value of the 'font-size' property on this element. (The 'alphabetic' baseline is the standard baseline for Roman scripts.)
mathematical
The dominant baseline-identifier is set to the 'mathematical' baseline using the baseline-table and baseline-table font-size of the parent, the baseline table is changed to correspond to the 'mathematical' baseline, and the baseline-table font-size is changed to the value of the 'font-size' property on this element.

If there is no baseline-table in the nominal font or if the baseline-table lacks an entry for the desired baseline, then the user agent may use heuristics to determine the position of the desired baseline.
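
Example(s):

The following rules are a minimal sketch of how two of these values might be used, assuming the property syntax defined in this section; the class names are hypothetical. The first rule sets an 'ideographic' dominant baseline on an inline element. The second keeps the parent's baseline-identifier and baseline-table but re-scales the table to the element's own, larger 'font-size'.

  SPAN.cjk { dominant-baseline: ideographic }
  SPAN.large { font-size: 150%; dominant-baseline: reset-size }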

5.5. Aligning the alignment point of an element: the 'alignment-baseline' property

'alignment-baseline'
Value: baseline | auto-script | before-edge | text-before-edge | after-edge | text-after-edge | central | middle | ideographic | alphabetic | hanging | mathematical | inherit
Initial: baseline
Applies to: inline-level elements
Inherited: no
Percentages: N/A
Media: visual

This property specifies how an inline-level element is aligned with respect to its parent, that is, to which of the parent's baselines the alignment point of this element is aligned. Unlike the 'dominant-baseline' property, the 'alignment-baseline' property has no effect on the dominant baselines of the element's children.

Note: The 'alignment-adjust' property specifies how the alignment point is determined and defaults to the baseline with the same name as the computed value of the alignment-baseline property.

Except for 'auto-script', all baseline values refer to the respective baseline-identifier components of the dominant-baseline of the parent, and glyphs within the element are aligned similarly to the element itself. The description for 'auto-script' covers these points specifically. The property values have the following meanings:

baseline
The alignment-point of the element being aligned is aligned with the dominant baseline of the parent.
auto-script
If the element's 'script' property value is 'auto', the alignment point of each glyph is aligned with the baseline-identifier of the script to which the glyph belongs. If the element's 'script' property value is other than 'auto', the alignment point of each glyph is aligned with the baseline-identifier specified by the 'script' property. The position of that baseline-identifier is determined from the dominant-baseline components of the parent element. The alignment point of the element itself is aligned as for the 'baseline' value.
before-edge
The alignment point of the box is aligned with the 'before-edge' baseline of the parent.
text-before-edge
The alignment-point of the element being aligned is aligned with the 'text-before-edge' baseline of the parent.
after-edge
The alignment point of the box is aligned with the 'after-edge' baseline of the parent.
text-after-edge
The alignment-point of the element being aligned is aligned with the 'text-after-edge' baseline of the parent.
central
The alignment point of the box is aligned with the 'central' baseline of the parent.
middle
The alignment point of the box is aligned with the 'middle' baseline of the parent.
ideographic
The alignment-point of the element being aligned is aligned with the 'ideographic' baseline of the parent.
alphabetic
The alignment-point of the element being aligned is aligned with the 'alphabetic' baseline of the parent.
hanging
The alignment-point of the element being aligned is aligned with the 'hanging' baseline of the parent.
mathematical
The alignment-point of the element being aligned is aligned with the 'mathematical' baseline of the parent.

The values 'before-edge', 'text-before-edge', 'after-edge' and 'text-after-edge' all work relative to the 'writing-mode' property value. For example, 'before-edge' means 'top' in a horizontal writing mode and 'right' in a vertical writing mode.

Note. The initial value is 'baseline' rather than 'auto-script' (called 'auto' in the similar XSL property) because most fonts today are designed with an alignment point located at the 'alphabetic' level, even for glyphs belonging to non-Latin scripts. User agents have to deal with that constraint, and therefore they use the 'baseline' value as initial.
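
Example(s):

The following rules are a minimal sketch, assuming the values defined in this section; the class names are hypothetical. The first rule aligns an inline image on the parent's 'middle' baseline, and the second aligns an inline element on the parent's 'hanging' baseline.

  IMG.icon { alignment-baseline: middle }
  SPAN.initial { alignment-baseline: hanging }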

5.6. Setting the alignment point: the 'alignment-adjust' property

'alignment-adjust'
Value: auto | baseline | before-edge | text-before-edge | middle | central | after-edge | text-after-edge | ideographic | alphabetic | hanging | mathematical | <percentage> | <length> | inherit
Initial: auto
Applies to: inline-level elements
Inherited: no
Percentages: refers to the 'line-height' of the element
Media: visual

The 'alignment-adjust' property allows more precise alignment of elements, such as graphics, that do not have a baseline-table or lack the desired baseline in their baseline-table. With the 'alignment-adjust' property, the position of the baseline identified by the 'alignment-baseline' property can be explicitly determined. It also determines precisely the alignment point for each glyph within a textual element. The user agent should use heuristics to determine the position of a baseline that does not exist for a given element.

Values for the property have the following meaning:

auto
For each glyph corresponding to textual information within the element, the alignment-point is the intersection of the start-edge of the glyph box and the block-progression-direction position of the alignment point from the font. Padding, border or margin do not affect that alignment point. The alignment point of the inline-level element itself is at the intersection of the start-edge of the first inline box and the baseline identified by the 'alignment-baseline' property, if this baseline exists in the baseline-table for the element's dominant-baseline. If the inline element is an inline block, the alignment point occurs on the last line of the inline block element. If that specific baseline does not exist, the user agent may use heuristics to determine where that missing baseline would be. For other inline box content, such as images, the user agent will use heuristics to determine the position of the alignment point. For example, when the resulting baseline is 'alphabetic' or 'ideographic', it is expected that the alignment point will be at the intersection of the start-edge and the after-edge of the inline box, including its respective margin. If the resulting baseline is 'hanging', the intersection of the start-edge and the before-edge of the inline box, including its respective margin, should be used instead.
baseline
The alignment point is at the intersection of the start-edge of the element and the dominant-baseline of the element.
before-edge
The alignment point is at the intersection of the start-edge of the element and the 'before-edge' baseline of the element.
text-before-edge
The alignment point is at the intersection of the start-edge of the element and the 'text-before-edge' baseline of the element.
central
The alignment point is at the intersection of the start-edge of the element and the 'central' baseline of the element.
middle
The alignment point is at the intersection of the start-edge of the element and the 'middle' baseline of the element.
after-edge
The alignment point is at the intersection of the start-edge of the element and the 'after-edge' baseline of the element.
text-after-edge
The alignment point is at the intersection of the start-edge of the element and the 'text-after-edge' baseline of the element.
ideographic
The alignment point is at the intersection of the start-edge of the element and the 'ideographic' baseline of the element.
alphabetic
The alignment point is at the intersection of the start-edge of the element and the 'alphabetic' baseline of the element.
hanging
The alignment point is at the intersection of the start-edge of the element and the 'hanging' baseline of the element.
mathematical
The alignment point is at the intersection of the start-edge of the element and the 'mathematical' baseline of the element.
<percentage>
The computed value of the property is this percentage multiplied by the computed 'line-height' of the element (this includes the margin for replaced elements). The alignment point is on the start-edge of the inline box. Its position along the start-edge relative to the intersection of the dominant-baseline and the start-edge is offset by the computed value. The offset is opposite to the shift-direction (positive value) or in the shift-direction (negative value). A value of '0%' makes the dominant-baseline the alignment point.
<length>
The alignment-point is on the start-edge of the inline box. Its position along the start-edge relative to the intersection of the dominant-baseline and the start-edge is offset by the <length> value. The offset is opposite to the shift-direction (positive value) or in the shift-direction (negative value). A value of '0cm' makes the dominant-baseline the alignment point.
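
Example(s):

The following rule is a minimal sketch of a percentage value, assuming the syntax defined in this section; the class name and the numbers are hypothetical. If the computed 'line-height' of the element is 20px, the computed value of 'alignment-adjust' would be 25% of 20px, that is 5px, so the alignment point would sit 5px from the intersection of the dominant-baseline and the start-edge, opposite to the shift-direction.

  IMG.seal { alignment-baseline: alphabetic; alignment-adjust: 25% }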

5.7. Repositioning the dominant baseline: the 'baseline-shift' property

'baseline-shift'
Value: baseline | sub | super | <percentage> | <length> | inherit
Initial: baseline
Applies to: inline-level elements
Inherited: no
Percentages: refers to the 'line-height' of the parent element
Media: visual

The 'baseline-shift' property allows repositioning of the dominant-baseline relative to the dominant-baseline of the parent. The shifted object might be a sub- or superscript. Within the shifted element, the whole baseline table is offset, not just a single baseline. For sub- and superscripts, the amount of offset is determined from the nominal font of the parent.

Values for the property have the following meaning:

baseline
There is no baseline shift; the dominant baseline remains in its original position.
sub
The dominant baseline is shifted to the default position for subscripts. The offset for this position is determined by the font data for the parent nominal font as adjusted by the dominant baseline-table font-size of the parent element. If there is no applicable font data, the user agent may use heuristics to determine the offset.
super
The dominant baseline is shifted to the default position for superscripts. The offset for this position is determined by the font data for the parent nominal font as adjusted by the dominant baseline-table font-size of the parent element. If there is no applicable font data, the user agent may use heuristics to determine the offset.
<percentage>
The computed value of the property is this percentage multiplied by the computed 'line-height' of the parent element. The dominant-baseline is shifted in the shift-direction (positive value) or opposite to the shift-direction (negative value) of the parent area by the computed value. A value of '0%' is equivalent to 'baseline'.
<length>
The dominant-baseline is shifted in the shift-direction (positive value) or opposite to the shift-direction (negative value) of the parent area by the <length> value. A value of '0cm' is equivalent to 'baseline'.

Note. Although the 'baseline-shift' and 'alignment-adjust' properties may seem to do the same thing, there are important differences. For 'alignment-adjust', percentage values refer to the 'line-height' of the element being aligned. For 'baseline-shift', percentage values refer to the 'line-height' of the parent element. Similarly, it is the 'sub' and 'super' offsets of the parent that are used to align the shifted baseline rather than the 'sub' and 'super' offsets of the element being positioned. To ensure a consistent sub- or superscript position, it makes more sense to use the parent as the reference rather than the subscript element, which may have a changed 'line-height' due to 'font-size' changes in the sub- or superscript element.
The 'alignment-adjust' property is more suitable for positioning elements, such as graphics, that have no internal textual structure. The 'baseline-shift' property is intended for sub- and superscripts, where the positioned element may itself be textual. The 'baseline-shift' property provides a way to define a specific baseline offset other than the named offsets that are defined relative to the dominant-baseline. In addition, having 'baseline-shift' makes it easier for tools to generate the relevant properties; many formatting programs already have a notion of baseline shift.
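
Example(s):

The following rules are a minimal sketch, assuming the syntax defined in this section; the class names and the numbers are hypothetical. The first rule places an element at the parent font's superscript position and reduces its font size. The second uses a percentage: if the computed 'line-height' of the parent is 16px, the dominant-baseline would be shifted by 25% of 16px, that is 4px, in the shift-direction, regardless of the element's own 'line-height'.

  SPAN.exponent { baseline-shift: super; font-size: 70% }
  SPAN.raised { baseline-shift: 25% }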

5.8. Vertical alignment: the 'vertical-align' property, relation with the baseline alignment properties

'vertical-align'
Value: auto | auto-script | baseline | sub | super | top | text-top | central | middle | bottom | text-bottom | <percentage> | <length> | inherit
Initial: auto
Applies to: inline-level and 'table-cell' elements
Inherited: no
Percentages: refers to the 'line-height' of the element itself
Media: visual

This property affects the vertical positioning inside a line box of the boxes generated by an inline-level element. The following values only have meaning with respect to a parent inline-level element, or to a parent block-level element, if that element generates anonymous inline boxes; they have no effect if no such parent exists.

Note. Values of this property have slightly different meanings in the context of tables. Please consult the section on table height algorithms for details.

auto
Align the dominant baseline of the parent box with the equivalent, or heuristically reconstructed, baseline of the element. If the box doesn't have a baseline, align the bottom of the box, including its margin with the parent's dominant baseline. If the inline box has several lines (inline block element), use the last line baseline. If there is no parent or if there is a change of flow orientation between this element and its parent, the dominant baseline is set to 'alphabetic' for horizontal flow and 'central' for vertical flow.
use-script
Align the script preferred baseline of the element (determined by first visible character in the element after reordering) with the equivalent baseline of the parent element.
baseline
Align the 'alphabetic' baseline of the box with the 'alphabetic' baseline of the parent box. If the box doesn't have an 'alphabetic' baseline, align the bottom of the box, including its margin for replaced elements, with the parent's 'alphabetic' baseline. The dominant baseline is set to 'alphabetic' if there is no parent or if there is a flow orientation change between this element and its parent, otherwise it is set to 'no-change'.
central
Align the vertical midpoint of the box with the central baseline of the parent box.
middle
Align the vertical midpoint of the box with the baseline of the parent box plus half the x-height of the parent.
sub
Lower the baseline of the box to the proper position for subscripts of the parent's box. (This value has no effect on the font size of the element's text.)
super
Raise the baseline of the box to the proper position for superscripts of the parent's box. (This value has no effect on the font size of the element's text.)
text-top
Align the top of the box with the before-edge of the parent element's font.
text-bottom
Align the bottom of the box with the after-edge of the parent element's font.
<percentage>
Raise (positive value) or lower (negative value) the box by this distance (a percentage of the computed 'line-height' of the element). The value '0%' means the same as 'baseline'.
<length>
Raise (positive value) or lower (negative value) the box by this distance. The value '0cm' means the same as 'baseline'.

The remaining values refer to the line box in which the generated box appears:

top
Align the top of the box with the before-edge of the line box.
bottom
Align the bottom of the box with the after-edge of the line box.

The 'vertical-align' property is not a shorthand for the baseline alignment properties, because setting them has no effect on the 'vertical-align' property. However, setting the 'vertical-align' property can be seen as a macro for these alignment properties, as it sets them as follows:

vertical-align value  alignment-baseline  alignment-adjust  baseline-shift  dominant-baseline
--------------------  ------------------  ----------------  --------------  -----------------
auto                  baseline            auto              baseline        auto
baseline              baseline            auto              baseline        alphabetic (if no parent or different flow from parent); no-change (otherwise)
sub                   baseline            auto              sub             auto
super                 baseline            auto              super           auto
top                   before-edge         auto              baseline        auto
text-top              text-before-edge    auto              baseline        auto
middle                middle              auto              baseline        auto
bottom                after-edge          auto              baseline        auto
text-bottom           text-after-edge     auto              baseline        auto
<percentage>          baseline            <percentage>      baseline        auto
<length>              baseline            <length>          baseline        auto

Editor's note: There are the following differences with the XSL definition:

  1. The initial value of alignment-baseline is baseline instead of 'auto-script' (value called 'auto' in XSL). This reflects user agent current practice.

  2. 'vertical-align: baseline' maps to dominant-baseline:'alphabetic' or 'no-change' (instead of auto)

  3. 'vertical-align: auto' (proposed as new initial value) maps to 'dominant-baseline:auto'.

Editor's note: It is tempting to make vertical-align a shorthand property of the four other alignment properties. There are, however, several issues:

  1. Vertical-align is a simple enumerated property, changing it to a shorthand would create issues from a DOM point of view.

  2. The names of the values of the individual properties are not designed to be used in a single shorthand notation, unless a strict sequence is enforced, such as having dominant-baseline first, followed by alignment-baseline, alignment-adjust and baseline-shift. The usage could be cumbersome.
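
Example(s):

As an illustration of the mapping table in this section, the two rules below would be expected to position an element identically. This is a sketch based on the 'sub' row of that table; the class names are hypothetical.

  SPAN.a { vertical-align: sub }
  SPAN.b { alignment-baseline: baseline; alignment-adjust: auto; baseline-shift: sub; dominant-baseline: auto }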


6. Indentation: the 'text-indent' property

'text-indent'
Value: <length> | <percentage> | inherit
Initial: 0
Applies to: block-level elements
Inherited: yes
Percentages: refers to width of containing block
Media: visual

This property specifies the indentation of the first line of text in a block. More precisely, it specifies the indentation of the first box that flows into the block's first line box. The box is indented with respect to the starting edge of the line box. User agents should render this indentation as blank space.

Values have the following meanings:

<length>
The indentation is a fixed length.
<percentage>
The indentation is a percentage of the containing block width.

The value of 'text-indent' may be negative, but there may be implementation-specific limits.

Example(s):

The following example causes a '3em' text indent.

  P { text-indent: 3em }
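
The following rules sketch the percentage and negative cases described above; the class names and the numbers are hypothetical. With a containing block 400px wide, the first rule would indent the first line by 10% of 400px, that is 40px. The second rule combines a negative indent with an equal left margin, which is one common way to obtain a hanging first line, subject to any implementation-specific limits on negative values.

  P.lead { text-indent: 10% }
  P.hanging { text-indent: -2em; margin-left: 2em }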

7. Line breaking

7.1. Types of line breaking

In documents written in Latin-based languages, where runs of characters make up words and words are separated by spaces or hyphens, line breaking is relatively simple. In the most general case (assuming no hyphenation dictionary is available to the UA), a line break can occur only at whitespace characters or hyphens, including U+00AD SOFT HYPHEN.

In ideographic typography, however, where what appears as a single glyph can represent an entire word and neither spaces nor any other word-separating characters are needed, a line breaking opportunity is not as obvious as a space. It can occur after or before many other characters. Certain line breaking restrictions still apply, but they are not as strict as they are in Latin typography.

(As a side note, Thai is another interesting example with its own special line breaking rules. Since Thai words are made up of runs of characters, it resembles Latin in that respect. But the lack of spaces as word delimiters, or in fact any consistent word delimiters, makes it similar to CJK. Thai, like Latin in the absence of a hyphenating dictionary, never breaks inside of words. In fact, a knowledge of the vocabulary is necessary to be able to correctly break a line of Thai text.) Finally, the Unicode character U+200B ZERO WIDTH SPACE can be inserted in such scripts to specify an explicit line breaking opportunity.

A number of levels of line-breaking "strictness" can be used in Japanese typography. These levels add or remove line breaking restric