W3C

WD-i18n-format-19990127


International Layout in CSS

World Wide Web Consortium Working Draft 27-January-1999

This version:
http://www.w3.org/TR/1999/WD-i18n-format-19990127
Latest version:
http://www.w3.org/TR/WD-i18n-format
Also available for local browsing as a Zipped archive
Editor:
Marcin Sawicki (Microsoft)
Additional contributors:
Michel Suignard (Microsoft)
Takao Suzuki (Microsoft)
Chris Wilson (Microsoft)

Copyright © 1999 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

Status of This Document

This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. The Internationalization Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at http://www.w3.org/TR.

This document has been produced as part of the W3C Internationalization Activity and is related to the Style Activity. Since this proposal predates the efforts of the XSL and CSS&FP groups to create a common formatting model, it focuses on CSS [CSS2] only. It is, however, the intention of all the groups involved that the model presented in this document and the model being developed by the XSL group converge. The end result of this convergence is expected to form part of the common formatting model, which will be expressed in both the CSS [CSS2] and XSL [XSL] syntaxes. Please send comments and questions regarding this document to i18n-editor@w3.org. Comments in languages other than English, in particular Japanese, are also welcome.


Abstract

The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications. CSS is a style sheet language that can be applied to HTML to control the style of a document: which fonts and colors to use, how much white space to insert, etc. The following specification extends CSS to support East Asian and bidirectional text formatting. Familiarity with both CSS 2 [CSS2] and HTML 4.0 [HTML4] is assumed.



1.  Introduction

International typography contains types of formatting that are not yet exposed in HTML and are thus impossible to achieve on the web without special workarounds or graphics.

This document introduces a number of new CSS properties to represent such formatting. The features covered include two of the most important for East Asian typography: vertical layout flow and layout grid.

Since the various typographical effects and algorithms described here often apply differently to different character sets, a naming convention indicates the character set to which a given property or value applies or is especially relevant: a property or value name containing the suffix '-ideographic' generally applies to all fullwidth characters (Kanji, Katakana, Hiragana, fullwidth Roman) as well as halfwidth Katakana and halfwidth Hiragana, unless otherwise specified.

A number of illustrations in this document use the following legend:

Symbolic fullwidth character representation - fullwidth character (e.g. Han) which is the n-th character in the text run
Symbolic halfwidth character representation - halfwidth non-cursive character (e.g. Roman) which is the n-th character in the text run
Symbolic halfwidth cursive character representation - cursive (or RTL) character (e.g. Arabic) which is the n-th character in the text run

The orientation which the above symbols assume in the diagrams corresponds to the orientation that the glyphs they represent are intended to assume when rendered in the UA. Spacing between these characters in the diagrams is usually symbolic, unless intentionally changed to make a point.


2.  Layout Flow

2.1  Types of layout flow

Most Latin-based documents use a simple horizontal left-to-right text layout flow in which the next line always appears below the previous one. The example below shows three lines of mixed text in the regular horizontal layout flow mode available to web authors today:

Example of mixed Japanese and English in horizontal layout

Figure 2.1.1: Mixed text in horizontal layout

Unfortunately, HTML and CSS today support only the layout scenario shown above. There are several others, however, which are especially important in East Asian documents. They are discussed in the following sections.

2.2  'layout-flow'

Value: horizontal | vertical | vertical-ideographic | horizontal-ideographic
Initial: horizontal
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property sets the layout flow for the element. It is valid on all elements. Possible values:

2.3  Horizontal text in vertical layout

In East Asian documents, it is often preferred to display certain Latin-based strings, such as numerals in a year, always in a horizontal layout flow regardless of the flow mode of the line of text these strings appear in, as in:

Layout of Tate Naka Yoko

Example of Tate Naka Yoko

Figure 2.3.1: Horizontal in vertical (a.k.a. "Tate naka yoko")

This effect is known as "Tate naka yoko". In order to achieve it, the Latin string should be enclosed within a SPAN element with a 'layout-flow: horizontal' setting in CSS, as in:

<SPAN style="layout-flow: horizontal">1996</SPAN>

Also, line breaking is normally disabled for such runs of text. This can be accomplished using the CSS "white-space: nowrap" setting [CSS2].
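
As a minimal sketch, the inline style above and the 'white-space' setting could be combined in style sheet rules. The class names 'vertical' and 'tcy' are hypothetical, and 'layout-flow' is the property proposed in this draft, so this is an illustration rather than a tested style sheet:

```css
/* Hypothetical class for a block of vertical East Asian text */
DIV.vertical { layout-flow: vertical-ideographic }

/* Hypothetical class for a horizontal run inside the vertical text:
   lay it out horizontally and keep it on one line */
SPAN.tcy { layout-flow: horizontal; white-space: nowrap }
```

A year inside such a block would then be marked up as <SPAN class="tcy">1996</SPAN>.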

2.4  Relationship with Bidirectionality

The different layout flows discussed in the preceding sections determine the text flow independently of the inherent directionality of the content characters. This means that, unless special formatting is applied to them, Hebrew and Arabic characters will be read from right to left in a horizontal layout flow and bottom to top in the 'vertical-ideographic' layout flow.

The dir attribute will affect the base direction of the element that it is applied to, but will not affect the line-to-line flow. For example, an element with a 'vertical' (not 'vertical-ideographic') layout flow and an RTL direction will flow top to bottom and left to right. Mongolian is also worth mentioning: although related to right-to-left writing systems, it would use the vertical layout flow mode as its normal rendering mode. Blocks of Latin text in a Mongolian context would have 'vertical-ideographic' applied to them directly.

So far, it has been assumed that ideographs have an inherent LTR directionality, and they are treated as such by the Unicode bidirectional algorithm. However, it may be desirable to show ideographs flowing in an arbitrary direction, as this is found frequently in Asian writing systems. This may be achieved by mixing the layout flow modes, the dir attribute and the BDO element. For example, encapsulating ideographs with a <BDO dir="rtl"> in a horizontal flow will make them flow right to left and top to bottom. A case could also be made for creating other layout flow modes to capture additional writing system usage.
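
The BDO override can also be approximated from a style sheet. The sketch below assumes the CSS2 'direction' and 'unicode-bidi' properties [CSS2] in place of the BDO element; the class name 'rtl-ideographs' is hypothetical:

```css
/* Style sheet counterpart of <BDO dir="rtl">: override the inherent
   LTR directionality of the ideographs so they flow right to left */
SPAN.rtl-ideographs {
  layout-flow: horizontal;     /* proposed in this draft */
  direction: rtl;              /* CSS2 */
  unicode-bidi: bidi-override  /* CSS2 */
}
```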


3.  Document grid

3.1  What is document grid?

It is very common for the characters in documents written in East Asian languages, such as Chinese or Japanese, to be laid out on the page according to a specified one- or two-dimensional grid. The concept of grid can also be used in other, non-ideographic contexts such as Braille or monospaced layout.

The diagram below represents a fragment of horizontal text on a page with mixed fullwidth and halfwidth characters that a Japanese user intended to be laid out on a grid which resulted in 9 characters per line (gray grid lines shown for clarity):

Example of strict (genko) grid applied to mixed Japanese and English in horizontal layout

Figure 3.1.1: 'Genko' grid applied to mixed text

The grid affects not only the placement of the characters; it can also modify several other layout-related behaviors, such as indent size, margins or paragraph alignment.

One can distinguish between three types of grid: a strict one, used mostly in Chinese, but also occasionally in Japanese (a.k.a. "genko"), a loose one, frequently used in Japanese and sometimes in Korean, as well as a fixed one, potentially useful for non-ideographic text, such as Braille or mono-spaced layout in general.

The grid type entails a set of layout rules that determine how much flexibility the UA is allowed to have when laying out a line of text.

Different grids can be defined for different parts of the document.

The grid can be selectively disabled in either dimension on fragments of text.

Line grid can be disabled for individual paragraphs. If line grid is disabled for a paragraph, the lines of the paragraph are laid out just as if no line grid were specified. The characters in a paragraph with line grid disabled still follow the character grid, if one is specified.

The CSS model described in this section exposes the necessary grid parameters the author needs to control.
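
As an illustration of selectively disabling one grid dimension, the sketch below uses the 'layout-grid' properties defined in the following sections. The class names and the grid sizes are hypothetical:

```css
/* Section with both the line grid and the character grid enabled */
DIV.body {
  layout-grid-mode: both;
  layout-grid-type: strict;
  layout-grid-line: 20pt;
  layout-grid-char: 20pt
}

/* Paragraph that opts out of the line grid but still follows
   the inherited character grid */
P.caption { layout-grid-mode: char }
```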

3.2  'layout-grid-type'

Value: loose | strict | fixed
Initial: loose
Applies to: block-level elements
Inherited: yes
Percentage values: N/A

Specifies the type of grid to use. Each grid type entails a different set of rules for rendering contents when a grid is enabled and specified. Possible values:

3.3  'layout-grid-line'

Value: none | auto | <length> | <percentage>
Initial: none
Applies to: block-level elements
Inherited: yes
Percentage values: relative to element height

This property sets the line grid value for an element. If the text layout flow of the element is horizontal, this property can be thought of as the "vertical" grid size or grid height. In other words, it always determines the line spacing increment, regardless of the layout flow mode. Its effect is visually somewhat similar to the effect of applying a 'line-height' value to an element. The following table shows the mapping between each of the 'layout-flow' values and the meaning of the 'layout-grid-line' property:

'layout-flow'             Meaning of 'layout-grid-line'
horizontal                vertical height of grid space
vertical                  horizontal width of grid space
vertical-ideographic      horizontal width of grid space
horizontal-ideographic    vertical height of grid space

Figure 3.3.1: Mapping between 'layout-flow' and the interpretations of 'layout-grid-line'

Note that in order for this property to have an effect, 'layout-grid-mode' must be set to 'line' or 'both'.

When this property is set to anything other than 'none', a line of text is vertically centered within the grid row and baseline-aligned by default. If the line contains a character or an object that is taller than the grid space, then the whole line is centered within the smallest number of grid rows necessary for its tallest object to fit in. This is illustrated below, where a represents the numerical 'layout-grid-line' value:

Layout of contents within grid showing the contents vertically centered within their grid rows

Figure 3.3.2: Layout of contents within line grid, where a represents the layout-grid-line value

Possible values:

The following style rule:

DIV.section1 { layout-grid-line: .5in }

would cause each line of text in a horizontally laid out (including '-ideographic') section of a document to be rendered within 0.5 inch of vertical space. It is also equivalent to having a 'line-height' of 0.5in, as shown below:

Example of a layout-grid-line setting applied to mixed Japanese and English text in horizontal layout

Figure 3.3.3: Enlarged line grid applied to mixed text in horizontal layout

If the section's layout flow is vertical (including '-ideographic'), then 0.5in is the width of each column of vertical text. This time, the 0.5in value applies to the 'width' of each cell:

Example of a layout-grid-line setting applied to mixed Japanese and English text in vertical-ideographic layout

Figure 3.3.4: Enlarged line grid applied to mixed text in vertical-ideographic layout

An author who prefers a specific number of lines (20, for example) to appear in an element would use a percentage value:

DIV.section1 { layout-grid-line: 5% }

3.4  'layout-grid-char'

Value: none | auto | <length> | <percentage>
Initial: none
Applies to: block-level elements
Inherited: yes
Percentage values: relative to element width

This property affects the dimension perpendicular to that controlled by 'layout-grid-line'. It controls the character (or "horizontal", if in horizontal layout) grid size for an element if the 'layout-grid-type' property is set to 'strict' or 'fixed'. However, if 'layout-grid-type' is 'loose', then this property sets the size of the increment added to each fullwidth character and, indirectly, of that added to each halfwidth character, as per the description in the specification of 'layout-grid-type'. Its effect in a 'loose' grid is somewhat similar to the effect of the 'letter-spacing' property.

Note that in order for this property to have an effect, 'layout-grid-mode' must be set to 'char' or 'both'.

Possible values:

The following style rule:

DIV.section1 { layout-grid-char: .5in }

would cause each character in a horizontally laid out part of a document to be rendered within 0.5 inch of horizontal space:

Example of a layout-grid-char setting applied to mixed Japanese and English text in horizontal layout

Figure 3.4.1: Enlarged character grid applied to mixed text in horizontal layout

If the section's layout flow is vertical, then 0.5in becomes the vertical distance between consecutive characters in a column:

Example of a layout-grid-char setting applied to mixed Japanese and English text in vertical-ideographic layout

Figure 3.4.2: Enlarged character grid applied to mixed text in vertical-ideographic layout

An author who prefers a specific number of characters (5, for example) to appear in a line would set the character grid to a percentage value:

DIV.section1 { layout-grid-char: 20% }
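
The percentage arithmetic used in sections 3.3 and 3.4 generalizes: for N lines or N characters, the value is 100% divided by N. A sketch, with a hypothetical class name and arbitrary counts:

```css
/* 25 lines per element:   100% / 25 = 4%
   40 characters per line: 100% / 40 = 2.5% */
DIV.page {
  layout-grid-mode: both;
  layout-grid-line: 4%;
  layout-grid-char: 2.5%
}
```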

3.5  'layout-grid-mode'

Value: none | line | char | both
Initial: both
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property selectively enables or disables the two dimensions of the grid. Possible values:

3.6  'layout-grid'

Value: none | [<mode> || <type> || [<line> [<char>]? ] ]
Initial: not defined for shorthand properties
Applies to: all elements
Inherited: yes
Percentage values: allowed on <char> and <line>

The 'layout-grid' property is a shorthand property for setting 'layout-grid-mode', 'layout-grid-type', 'layout-grid-line' and 'layout-grid-char' at the same time in the style sheet. Using the value 'none' on the shorthand property sets the 'layout-grid-mode' to 'none'. Using the value "none none" sets both the 'layout-grid-mode' and 'layout-grid-line' to 'none', and using the value "none none none" sets the previous properties as well as 'layout-grid-char' to 'none'.

The first numerical, percentage or 'auto' value specified sets 'layout-grid-line'. If a second numerical, percentage or 'auto' value is present, it sets 'layout-grid-char'. For example:

DIV.section1 { layout-grid: both strict .5in 20% }

The declaration above sets 'layout-grid-type' to 'strict', 'layout-grid-mode' to 'both', 'layout-grid-line' to 0.5in and 'layout-grid-char' to 20% of the parent width.
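
Assuming the shorthand expands as described above, the rule is expected to be equivalent to setting the four individual properties:

```css
/* Expected longhand equivalent of
   DIV.section1 { layout-grid: both strict .5in 20% } */
DIV.section1 {
  layout-grid-mode: both;
  layout-grid-type: strict;
  layout-grid-line: .5in;
  layout-grid-char: 20%
}
```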

Notes:

3.7  The 'gd' length unit

The existence of a grid in an element makes it possible and very useful to express various measurements in that element in terms of grid units. Grid units are used very frequently in East Asian typography, especially for the left, right, top and bottom element margins.

Therefore, a new length unit, gd, is necessary to enable the author to specify the various measurements in terms of the grid.

For example, consider the following style:

P { layout-grid: strict both 20pt 15pt; margin: 1gd 3gd 1gd 2gd }

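This way, all P elements would effectively acquire a 20pt top margin, a 45pt right margin, a 20pt bottom margin and a 30pt left margin (in horizontal layout flow, the top and bottom margins are measured against the 20pt line grid and the left and right margins against the 15pt character grid).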

If no grid is specified, the gd unit should be treated the same as the em unit.
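
A sketch of the unit in use, with a hypothetical class name. The line and character grid sizes are deliberately set to the same value so that 1gd corresponds to 20pt in either dimension:

```css
/* 20pt line grid and 20pt character grid */
DIV.article { layout-grid: strict both 20pt 20pt }

/* 2gd = 2 x 20pt = 40pt of first-line indent;
   1gd = 20pt of space above and below each paragraph */
DIV.article P { text-indent: 2gd; margin: 1gd 0 }
```

In an element without a grid, the same rule would fall back to em units, giving a 2em indent and 1em vertical margins.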


4.  Line breaking

4.1  Types of line breaking

In documents written in Latin-based languages, where runs of characters make up words and words are separated by spaces or hyphens, line breaking is relatively simple. In the most general case (assuming no hyphenation dictionary is available to the UA), a line break can occur only at whitespace characters or hyphens.

In ideographic typography, however, where what appears as a single glyph can represent an entire word and neither spaces nor any other word-separating characters are needed, a line breaking opportunity is not as obvious as a space. It can occur after or before many other characters. Certain line breaking restrictions still apply, but they are not as strict as they are in Latin typography.

(As a side note, Thai is another interesting example with its own special line breaking rules. Since Thai words are made up of runs of characters, it resembles Latin in that respect. But the lack of spaces as word delimiters, or in fact any consistent word delimiters, makes it similar to CJK. Thai, like Latin in the absence of a hyphenating dictionary, never breaks inside of words. In fact, a knowledge of the vocabulary is necessary to be able to correctly break a line of Thai text.)

A number of levels of line-breaking "strictness" can be used in Japanese typography. These levels add or remove line breaking restrictions. The model presented in this specification distinguishes between the two most commonly used line breaking levels for Japanese text, using the 'line-break' property.

In ideographic typography, it is also possible, though not always preferred, to allow line breaks to occur inside of quoted Latin and Hangul (Korean) words without following the line breaking rules of those particular scripts. The model proposed in this document gives the author control over that behavior through the 'word-break' property.

4.2  'line-break'

Value: normal | strict
Initial: normal
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property selects the line breaking level for CJK text. This functionality is especially useful to Japanese authors. There are two possibilities for this behavior:

In Japanese, a set of line breaking restrictions is referred to as "Kinsoku". JIS X 4051 [JIS] is a popular source of reference for this behavior using the strict set of rules. This architecture classifies characters into line breaking behavior classes. Those classes are then looked up in a two-dimensional behavior table in which each row-column position gives the action to be taken when characters of those two classes occur as a pair. For example, when a character of the opening class is followed by a character of the closing class, the corresponding table entry indicates no line breaking opportunity.

Note that both values, 'normal' and 'strict', imply that a set of line-breaking restrictions is in use. In fact, there appears to be no valid line breaking mode in CJK in which line breaks can appear just anywhere among ideographs.
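
A minimal sketch of selecting the line breaking level from a style sheet. Targeting Japanese content with the CSS2 ':lang()' selector is an assumption made for illustration:

```css
/* Default level for all text (this is also the initial value) */
BODY { line-break: normal }

/* Select the 'strict' level for Japanese paragraphs */
P:lang(ja) { line-break: strict }
```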

4.3  'word-break'

Value: normal | break-all | keep-all
Initial: normal
Applies to: block-level elements
Inherited: yes
Percentage values: N/A

This property controls line-breaking behavior inside of words. This functionality is especially useful to Korean authors. Possible values:

P.anywordbreaks { word-break: break-all }

5.  Justification behaviors

5.1  'text-justify'

Value: auto | inter-word | inter-ideograph | distribute | distribute-all-lines | newspaper
Initial: auto
Applies to: block-level elements
Inherited: yes
Percentage values: N/A

This property selects the type of justify alignment. It affects the text layout only if 'text-align' is set to 'justify'. That way, UAs that do not support this property will still render the text as fully justified, which most of the time is at least partially correct.
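
Because 'text-justify' takes effect only together with 'text-align: justify', the two would normally be set in the same rule. A sketch, with a hypothetical class name:

```css
/* A UA without 'text-justify' support ignores the second declaration
   and still justifies the text */
P.justified { text-align: justify; text-justify: inter-ideograph }
```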

The possible values are:

5.2  'text-justify-trim'

Value: none | punctuation | punct-and-kana
Initial: punctuation
Applies to: block-level elements
Inherited: yes
Percentage values: N/A

This property sets the individual font blank space compression permissions for the text justification algorithm when 'text-justify' is anything other than 'inter-word'. This special type of space compression occurs on the font level, i.e. the blank space within the character area itself may be reduced without affecting the appearance of the glyph. It applies to fullwidth characters only. Possible values:

5.3  'text-kashida'

Value: <percentage>
Initial: 0%
Applies to: block-level elements
Inherited: yes
Percentage values: as described

This property determines the minimum percentage of the text area width to be used for distribution among the "elongation opportunities" in Arabic text, when one of the justification modes is selected. Each elongation can be accomplished using a number of kashida characters or a single graphic, if the UA is capable of creating such a graphic. (The font itself determines the exact appearance of the kashida.)

The UA is free to determine whether spaces inside of Latin text should be treated as elongation opportunities as well (and elongated using blank space) or not.

In the diagram below showing two identical paragraphs of Arabic text, the blue line in the second line (not justified) shows the length that is allocated for kashida and divided among the elongation opportunities in the first line (justified), as indicated by the red underlines:

Example of kashida applied to Arabic text

Figure 5.3.1: Kashida applied to Arabic text
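
A sketch of reserving part of the line width for kashida, with a hypothetical class name and an arbitrary percentage:

```css
/* At least 20% of the text area width is distributed among the
   elongation opportunities of justified Arabic text */
P.arabic { text-align: justify; text-kashida: 20% }
```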


6.  Miscellaneous text formatting

6.1  'punctuation-wrap'

Value: simple | hanging
Initial: simple
Applies to: block-level elements
Inherited: yes
Percentage values: N/A

This property determines whether a punctuation mark, if one is present, can be placed in the margin area at the end of a full line of text, or not. This is a common setting in East Asian typography. Possible values:

6.2  'punctuation-trim'

Value: none | leading
Initial: none
Applies to: block-level elements
Inherited: yes
Percentage values: N/A

This property determines whether or not a fullwidth punctuation mark character should be trimmed if it appears at the beginning of a line, so that its "ink" lines up with the first character in the line above and below. In some scenarios, it may be preferable for the author not to allow leading punctuation marks to be trimmed, for example when it is more important that the characters tend to line up vertically. In other scenarios such an effect is desirable, for example when it is more important for the author that as much text as possible fits on a single line. Possible values:

6.3  'text-combine'

Value: none | letters | lines
Initial: none
Applies to: all elements
Inherited: no
Percentage values: N/A

This property controls the creation of composite characters (a.k.a. "kumimoji") or lines (a.k.a. "warichu").

Possible values:

6.4  'font-emphasize-style'

Value: none | accent | dot | circle | disc
Initial: none
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property sets the style for the emphasis formatting applied to text. East Asian documents use the following symbols on top of each character to emphasize a run of text: an 'accent' symbol, a 'dot', a hollow 'circle', or a solid 'disc'.

For example:

Example of emphasis in Japanese appearing above the text

Figure 6.4.1: Accent emphasis (shown in blue for clarity) applied to Japanese text

Note that, unlike 'text-decoration', this property can affect the line height. Furthermore, the emphasis style should be distinguished from 'text-decoration', which is another method of emphasizing text content.

6.5  'font-emphasize-position'

Value: above | below
Initial: above
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property sets the position of the emphasis symbols. They can appear either 'above' or 'below' the emphasized run of text. 'Above' and 'below' should be understood as relative to the line baseline. In a vertical layout flow, the symbols would appear respectively on the right or on the left.

In Japanese for example, the preferred position is 'above' when in horizontal layout:

Example of emphasis in Japanese appearing above the text

Figure 6.5.1: Emphasis (shown in blue for clarity) applied above a fragment of Japanese text

In Chinese used in the PRC, on the other hand, the preferred position is 'below' when in horizontal layout:

Example of emphasis in Chinese appearing below the text

Figure 6.5.2: Emphasis (shown in blue for clarity) applied below a fragment of Chinese text

6.6  'font-emphasize'

Value: <style> || <position>
Initial: not defined for shorthand properties
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property is shorthand for 'font-emphasize-style' and 'font-emphasize-position'.
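
A sketch of the shorthand in use. The ':lang()' selectors come from CSS2; the choice of 'accent' and 'dot' as styles is arbitrary, while the positions follow the preferences described in section 6.5:

```css
/* Japanese: emphasis marks above the text */
EM:lang(ja) { font-emphasize: accent above }

/* Chinese as used in the PRC: emphasis marks below the text */
EM:lang(zh-CN) { font-emphasize: dot below }
```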

6.7  'text-autospace'

Value: none | [ideograph-numeric || ideograph-alpha || ideograph-space || ideograph-parenthesis]
Initial: none
Applies to: all elements
Inherited: yes
Percentage values: N/A

When a run of non-ideographic or numeric characters appears inside of ideographic text, a certain amount of space is often preferred on both sides of the non-ideographic text to separate it from the surrounding ideographic characters. This property controls the creation of that space when rendering the text. That added width does not correspond to the insertion of additional space characters, but instead to the width increment of existing characters.

(A commonly used algorithm for determining this behavior is specified in JIS X 4051 [JIS].)

Possible values:

The following markup:

<SPAN style="text-autospace:none">[ideographs]1997[ideographs]</SPAN>

would appear as:

Diagram of character layout without autospace

Example of Japanese text mixed with a number without autospace

Figure 6.7.1: Mixed character layout when autospace is disabled

while:

<SPAN style="text-autospace:ideograph-numeric">[ideographs]1997[ideographs]</SPAN>

would appear more like:

Diagram of character layout with autospace

Example of Japanese text mixed with a number with autospace

Figure 6.7.2: Mixed character layout when autospace is enabled

6.8  'text-fit'

Value: auto | <length>
Initial: auto
Applies to: inline elements
Inherited: no
Percentage values: relative to line width

This property controls the amount of space a run of text is to fill or fit into. If the specified amount is greater than that required by the text, the characters are evenly distributed across that space. If the specified amount is less than that required by the text, the glyphs are scaled horizontally so as to make the text fit within the specified space.

The value 'auto' indicates that no special fill/fit behavior is to take place.

SPAN.fitinseven { text-fit: 7em }

would cause a word to be rendered in the space of 7 'm' characters by adding inter-letter spacing.


7.  Input Filtering

This is a placeholder.


8.  Ruby

8.1  What is ruby?

"Ruby" is the commonly used name for a run of text that appears in the immediate vicinity of another run of text, referred to as the "base", and serves as an annotation or a pronunciation guide associated with that run of text. Ruby, as used in Japanese, is described in JIS X-4051 [JIS]. The ruby structure and the HTML markup to represent it is described in the Ruby specification [RUBY]. This section describes the CSS properties relevant to ruby.

Example of ruby applied on top of a Japanese expression

Figure 8.1.1: Labeled example of ruby used in Japanese

The following is the box representation of the ruby element from the Ruby specification, that is also used in the sections below to help illustrate the effects of the CSS properties:

Diagram showing the three boxes in the ruby box model

Figure 8.1.2: Ruby box model

8.2  'ruby-position'

Value: above | inline
Initial: above
Applies to: ruby element
Inherited: yes
Percentage values: N/A

This property is used on the ruby [RUBY] element to control the position of the ruby text with respect to its base. Possible values:

8.3  'ruby-align'

Value: auto | left | center | right | distribute-letter | distribute-space | line-edge
Initial: auto
Applies to: all elements
Inherited: yes
Percentage values: N/A

This property can be used on any element to control the text alignment of the ruby text and ruby base contents relative to each other. It applies to all the ruby elements within the element. The alignment is applied to the ruby child element whose content is shorter: either the rb or the rt [RUBY]. Possible values:

8.4  'ruby-overhang'

Value: auto | none
Initial: auto
Applies to: ruby element
Inherited: yes
Percentage values: N/A

This property determines whether ruby text is allowed to partially overhang any adjacent text in addition to its own base, when the ruby text is wider than the ruby base. Possible values:

8.5  New 'display' values

Value: ruby-text | ruby-base | ...

These two new values are added to the existing 'display' property to represent the rt and rb [RUBY] elements respectively. That way any element (e.g. SPAN) could be made to behave as ruby via CSS.
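
A sketch of how generic elements might be given ruby behavior through these values. The class names are hypothetical:

```css
/* SPAN elements standing in for the rb and rt elements */
SPAN.base  { display: ruby-base }
SPAN.annot { display: ruby-text }
```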

9.  Glossary

"Bopomofo"
37 characters and 4 tone markings used as phonetics in Chinese, especially standard Mandarin.
"Hangul"
Subset of the Korean writing system consisting of phonetic characters. Also see Hanja.
"Hanja"
Subset of the Korean writing system that utilizes ideographic characters borrowed or adapted from the Chinese writing system. Also see Kanji.
"Hiragana"
Subset of the Japanese writing system consisting of phonetic characters used to represent Japanese words. Also see Katakana.
Ideogram, Ideograph
Character in the Chinese (or East Asian in general) writing system that represents a thing or an idea but not a particular word or phrase for it.
"Kana"
Syllabic subset of the Japanese system of writing that can be used exclusively for writing foreign words or in combination with kanji.
"Kanji"
Subset of the Japanese writing system that utilizes ideographic characters borrowed or adapted from Chinese writing. Also see Hanja.
"Kashida"
Arabic elongation character.
"Katakana"
Subset of the Japanese writing system consisting of phonetic characters used mainly to represent foreign (e.g. Roman) words. Also see Hiragana.
"Kinsoku"
Japanese term for a set (or sets) of line breaking restrictions.
"Kumimoji"
Composite character consisting of up to 5 characters that are reduced in size and combined to fit within the space of a single character.
Logograph, Logogram
Character in the Chinese (or East Asian in general) writing system that represents an entire word.
Ruby
A run of text that appears in the vicinity of another run of text and serves as an annotation or a pronunciation guide for that text.
"Tate naka yoko"
Run of horizontal text inside of a column of vertical text; frequently used in East Asian documents for displaying certain numbers, such as years.
"Warichu"
A run of text of reduced font size that appears inside of a line of text as two lines of equal height and length whose combined height is equal to the height of the line they appear in.

Acknowledgements

This specification would not have been possible without the help from:

Ayman Aldahleh, Bert Bos, Martin Dürst, Laurie Anna Edlund, Ben Errez, Yaniv Feinberg, Arye Gittelman, Richard Ishida, Koji Ishii, Masayasu Ishikawa, Michael Jochimsen, Eric LeVine, Chris Pratley, Rahul Sonnad, Frank Tang, Chris Thrasher.


References

[CSS2]
Cascading Stylesheets, level 2 (CSS2) Specification, W3C Recommendation
Bert Bos, Håkon Wium Lie, Chris Lilley and Ian Jacobs, 12 May 1998
Available at: http://www.w3.org/TR/REC-CSS2
[HTML4]
HTML 4.0 Specification, W3C Recommendation
Dave Raggett, Arnaud Le Hors and Ian Jacobs, 18 December 1997, revised 24 April 1998
Available at: http://www.w3.org/TR/REC-html40
[JIS]
Line composition rules for Japanese documents
JIS X 4051-1995, Japanese Standards Association, 1995 (in Japanese)
[RUBY]
Ruby, W3C Working Draft
Marcin Sawicki, 21 December 1998
Available at: http://www.w3.org/TR/WD-ruby
[XSL]
Extensible Stylesheet Language (XSL), W3C Working Draft
James Clark, Stephen Deach, 16 December 1998
Available at: http://www.w3.org/TR/WD-xsl