
Test results: White space and ideographic text

This page summarises results for a series of tests that seek to establish how user agents support the display of white space associated with passages of text in an ideographic script. Ideographic text doesn't usually include spaces, although spaces may be introduced at boundaries with Latin text or digits (at least, until the CSS autospace property becomes more widely available).

There are two test pages: one with the CSS white-space: normal declaration, and one without it.

Expected results

The assumptions we make here about expected behaviour are based on the white-space processing model described in CSS 2.1. This specification is not explicit about the expected detailed behaviour for specific script types, so some assumptions have been borrowed from ongoing work on the CSS3 Text Module, which will specify this more clearly.

The CSS 2.1 spec says that white space should be removed from around linebreaks when the white-space property is set to normal. It then says that a remaining linebreak can be transformed into a space, a zero-width space, or removed, according to UA-specific algorithms. Although transforming the linebreak into a space in the middle of ideographic text does not, strictly speaking, contravene the specification, the intent of the removal option is to allow broken lines of ideographic text to be reconstituted as one would expect, i.e. with no intervening space.

As will be seen from the results below, major browsers, such as Internet Explorer and Firefox, do behave in this way when there is only a linebreak between runs of ideographic text.

Given the observed behaviour mentioned in the previous paragraph, and the recommendation of the white-space processing model to remove white space around a linebreak, it is reasonable to assume that any consecutive white space at the end of one line of ideographic text, or at the beginning of the next, will be removed.

We also assume that ideographic characters are detected, and that all ideographic characters are treated in the same way as those chosen for this test.
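The expected behaviour described above can be sketched in a few lines of Python. This is only an illustration of the assumptions made on this page, not the algorithm of CSS 2.1 or of any user agent. The function names are hypothetical, and the sketch assumes that "ideographic" can be approximated by the Unicode East Asian Width classes W (wide) and F (fullwidth), which also covers the full-width punctuation and digits used in the tests.

```python
import re
import unicodedata


def is_wide(ch: str) -> bool:
    # Assumption: East Asian Wide/Fullwidth characters stand in for
    # "ideographic" characters, full-width punctuation and full-width digits.
    return unicodedata.east_asian_width(ch) in ("W", "F")


def collapse_expected(text: str) -> str:
    # Step 1: remove spaces and tabs around each linebreak, and merge
    # consecutive linebreaks (with any spaces between them) into one.
    text = re.sub(r"[ \t]*\n[ \t\n]*", "\n", text)

    # Step 2: transform each remaining linebreak. It is removed when both
    # neighbouring characters are wide; otherwise it becomes a space.
    out = []
    for i, ch in enumerate(text):
        if ch != "\n":
            out.append(ch)
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before and after and is_wide(before) and is_wide(after):
            continue  # linebreak removed: no gap in the ideographic run
        out.append(" ")

    # Step 3: collapse any remaining run of spaces or tabs to one space.
    return re.sub(r"[ \t]+", " ", "".join(out))
```

Under these assumptions, a linebreak between two ideographic characters disappears whether or not it is surrounded by spaces, while a linebreak after a non full-width comma or between two Latin words becomes a single space.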

Results

Windows-based user agents were tested on Windows XP.

The results for both test pages were identical (i.e. with or without the CSS white-space: normal), so only one set of result tables is shown.

For each test on each user agent, Y indicates that the result was the same as that expected. Otherwise any deviation from the expected result is described.

| Test | Spaces in ideographic run | Linebreak in ideographic run | Multiple spaces at end of line | Multiple spaces at beginning of line | Multiple spaces at beginning and end of line | Multiple linebreaks embedded in ideographic text, with additional spaces between linebreaks |
|---|---|---|---|---|---|---|
| Expected behaviour | single line, one space after 行 | single line, no spaces after 行 | single line, no spaces after 行 | single line, no spaces after 行 | single line, no spaces after 行 | single line, no spaces after 行 |
| IE 6.0 | Y | Y | one space after 行 | one space after 行 | one space after 行 | one space after 行 |
| Firefox 1.0PR | Y | Y | one space after 行 | one space after 行 | one space after 行 | one space after 行 |
| Mozilla 1.7.2 | Y | Y | one space after 行 | one space after 行 | one space after 行 | one space after 行 |
| Navigator 7.1 | Y | Y | one space after 行 | one space after 行 | one space after 行 | one space after 行 |
| Opera 7.54 | Y | one space after 行 | one space after 行 | one space after 行 | one space after 行 | one space after 行 |

| Test | Full-width punctuation at end of line | Non full-width punctuation at end of line | Full-width digits at end of line | Non full-width digits at end of line |
|---|---|---|---|---|
| Expected behaviour | single line, no spaces after the comma | single line, one space after the comma | single line, no spaces after last digit | single line, one space after last digit |
| IE 6.0 | Y | Y | Y | Y |
| Firefox 1.0PR | Y | Y | Y | Y |
| Mozilla 1.7.2 | Y | Y | Y | Y |
| Navigator 7.1 | Y | Y | Y | Y |
| Opera 7.54 | one space after 行 | Y | one space after 行 | Y |

| Test | Consecutive space between Latin text and spaces between Latin and ideographic | Linebreak (and no spaces) between Latin text embedded in ideographic | Space before Latin text at end of previous line, and a space after Latin text |
|---|---|---|---|
| Expected behaviour | 3 spaces: one before 'Latin', one before 'text' and one after 'text' | single line, a space between 'Latin' and 'text' | single line, one space before and after 'Latin' |
| IE 6.0 | Y | Y | Y |
| Firefox 1.0PR | Y | Y | Y |
| Mozilla 1.7.2 | Y | Y | Y |
| Navigator 7.1 | Y | Y | Y |
| Opera 7.54 | Y | Y | Y |

Summary

Opera doesn't appear to apply different behaviour to ideographic text and text in a script that uses spaces between words.

Internet Explorer and the Gecko-based browsers perform as expected except where there are spaces alongside linebreaks, in which case a space is introduced into the run of ideographic text.

The implication is that authors of Chinese or Japanese text must avoid editing environments that automatically reformat their text, since leading indentation that uses spaces will introduce unexpected gaps into their text.

In most of the user agents tested, it is possible to use explicit linebreaks within a block of text and still get appropriate results, but not with Opera. If you want your text to look right in Opera, you need to author in an environment that soft wraps your text, and add no explicit linebreaks within a block.
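As a rough aid for authors, the following Python sketch scans source text for linebreaks that fall between two ideographic characters and reports which of the two situations summarised above applies. The function name and the labels "opera-only" and "all-tested" are hypothetical. The sketch assumes that the Unicode East Asian Width classes W and F approximate "ideographic", and that tabs behave like spaces, which was not covered by the tests.

```python
import re
import unicodedata


def _is_wide(ch: str) -> bool:
    # Assumption: Wide/Fullwidth characters approximate "ideographic" text.
    return unicodedata.east_asian_width(ch) in ("W", "F")


def find_risky_breaks(source: str) -> list:
    """Return (line_number, label) for each linebreak between wide characters.

    "all-tested": spaces or tabs sit alongside the linebreak, the case in
                  which every user agent tested introduced a gap.
    "opera-only": the run contains linebreaks only; in the tests, a single
                  bare linebreak produced a gap in Opera 7.54 only.
    """
    findings = []
    # Each match is a run of white space containing at least one linebreak.
    for m in re.finditer(r"[ \t]*\n[ \t\n]*", source):
        if m.start() == 0 or m.end() >= len(source):
            continue  # the run is at the very start or end of the text
        before, after = source[m.start() - 1], source[m.end()]
        if not (_is_wide(before) and _is_wide(after)):
            continue  # a space is expected here, so nothing to report
        run = m.group()
        label = "all-tested" if (" " in run or "\t" in run) else "opera-only"
        # Line numbers are 1-based and refer to the line that ends at the break.
        line_number = source.count("\n", 0, m.start()) + 1
        findings.append((line_number, label))
    return findings
```

Running this over the text content of a page before publishing gives a list of places where the line should be joined by hand, or where indentation should be removed.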

Author: Richard Ishida (W3C).


Content created 14 October, 2004. Last update 2004-10-14 13:08 GMT