Re: [CSS21] Spaces, non-breaking spaces, ideographic spaces and the word-spacing property

About http://wiki.csswg.org/spec/css2.1#issue-84

As already stated in the partial resolution, the principle 
of 'word-spacing' (the issue text says 'white-space' but I think that's 
a typo) is that it affects the width of the empty space between "words" 
on a line iff the size of that white space is by nature flexible.

There are several ways to create inline white space. I think we can make 
an explicit list of cases that are affected or not affected:

Not affected by 'word-spacing' are inline spaces that consist solely of 
one or more of the following:

  - 'margin'
  - 'padding'
  - 'border-spacing'
  - the space created by a Unicode character between U+2000 and U+200A,
    inclusive (i.e., EN QUAD, EM QUAD, etc.)
  - the space created by SPACE or TAB (in the source or in 'content')
    when 'white-space' is 'pre' or 'pre-wrap'
  - space created by OGHAM SPACE MARK in the source or in 'content'
    (note that in some fonts it's actually a line, not a space)
  - space created by MEDIUM MATHEMATICAL SPACE in the source or
    in 'content'
  - space created by MONGOLIAN VOWEL SEPARATOR in the source or
    in 'content'
  - NARROW NO-BREAK SPACE in the source or in 'content'
  - ZERO-WIDTH SPACE in the source or in 'content'

(ZERO-WIDTH SPACE, as the name indicates, generates no space, but it may 
be useful to include it here anyway, because it has "space" in the name 
and because when one reads the Unicode spec a bit too quickly it 
appears to say (chapters 11.1 and 16.2) that this space can become 
visible because of justification. On closer reading, that refers only 
to justification by means of letter spacing.)

Affected by 'word-spacing' are all inline spaces that are created by one 
or more of the following:

  - SPACE, TAB, CR or LF in the source or in 'content' that
    are collapsed to a single space because of the setting
    of 'white-space' (i.e., 'normal', 'nowrap' or 'pre-line')
  - NO-BREAK SPACE in the source or in 'content'
  - IDEOGRAPHIC SPACE in the source or in 'content'

'Word-spacing' affects each such inline space only once, even if that 
space is (partially) generated by several of the above.
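
To make the two lists and the "only once" rule concrete, here is a rough 
sketch in Python. It is only an illustration of the proposal, not spec 
text: the names classify() and count_adjustments() are made up, and 
treating a run of adjacent affected characters as one inline space is my 
reading of the rule above. Two more assumptions: a line feed under 
'pre-line' is treated as a line break rather than a space (CSS 2.1 keeps 
it), and white space at the start or end of a line is not modelled.

```python
# Values of 'white-space' under which SPACE/TAB/CR/LF collapse.
COLLAPSING = {"normal", "nowrap", "pre-line"}

# Spaces whose width is fixed by nature (not affected by 'word-spacing').
FIXED = {chr(cp) for cp in range(0x2000, 0x200B)} | {  # U+2000..U+200A
    "\u1680",  # OGHAM SPACE MARK
    "\u205F",  # MEDIUM MATHEMATICAL SPACE
    "\u180E",  # MONGOLIAN VOWEL SEPARATOR
    "\u202F",  # NARROW NO-BREAK SPACE
    "\u200B",  # ZERO WIDTH SPACE
}

# Spaces that are affected whatever the value of 'white-space'.
STRETCHABLE = {"\u00A0", "\u3000"}  # NO-BREAK SPACE, IDEOGRAPHIC SPACE

SOURCE_WHITE_SPACE = {" ", "\t", "\r", "\n"}


def classify(ch, white_space="normal"):
    """Return 'affected', 'fixed' or 'other' for a single character."""
    if ch in STRETCHABLE:
        return "affected"
    if ch in FIXED:
        return "fixed"
    if ch in SOURCE_WHITE_SPACE:
        if white_space not in COLLAPSING:
            # 'pre' / 'pre-wrap': SPACE and TAB are preserved as-is;
            # CR and LF are not inline spaces at all.
            return "fixed" if ch in {" ", "\t"} else "other"
        if ch == "\n" and white_space == "pre-line":
            # Assumption: a preserved line feed is a line break.
            return "other"
        return "affected"
    return "other"


def count_adjustments(text, white_space="normal"):
    """Count how many times 'word-spacing' would be applied to text.

    Each maximal run of affected characters counts once (assumed reading
    of the "only once" rule).
    """
    count = 0
    in_run = False
    for ch in text:
        affected = classify(ch, white_space) == "affected"
        if affected and not in_run:
            count += 1
        in_run = affected
    return count
```

For example, count_adjustments("a  \u00A0 b c") counts the run of two 
spaces, a NO-BREAK SPACE and another space as a single inline space, and 
the space before "c" as a second one.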

Not 100% sure about IDEOGRAPHIC SPACE U+3000. Unicode says in one place 
(TR #14) that it can be compressed or expanded, just like SPACE; and in 
another (chapter 6.2) that people use it because it has the same width 
as an ideograph. I'm assuming the latter remark refers to cases where 
text is displayed as-is, without any formatting.

I don't know how to handle LINE SEPARATOR and PARAGRAPH SEPARATOR. But 
they seem not so important, as they are rare and don't occur in XML.

I have a small doubt about NARROW NO-BREAK SPACE. Its purpose is to put 
half a space between a word and the following punctuation ("espace fine 
insécable," fallen into disuse in most languages, but not in French). I 
think it looks better if that space is kept at a fixed width, but I've 
seen hints that some people consider the narrow space as stretchable as 
the normal space, just narrower.



Bert

PS. The way NO-BREAK SPACE has come to be used in HTML (and subsequently 
defined in CSS) seems not quite correct. It should probably have been 
collapsible, a bit like 'white-space: nowrap' in CSS...

-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

Received on Wednesday, 11 February 2009 17:02:35 UTC