RE: [css3-text] line-break questions/comments

Hi Glenn, thank you for looking into this and for the wonderful feedback.

> (1) "known to be Chinese or Japanese" is not defined in a manner
> sufficient to obtain testability or interoperability at any level; some
> default algorithm should be defined, e.g., "use the 'lang' attribute ..."
> or "use the default language of the font if any" or "if there are any
> hiragana or katakana character, then treat as Japanese; if any
> hangul character, treat as Korean, otherwise ...", etc

This refers to the content language[1]. When the document does not specify one, the spec says "it is possible for the content language of an element to be unknown", so this portion does not apply. This part of the spec is informative (the behavior is recommended, not required), so a UA may rely on other methods, such as automatic language detection, when the content language is unknown.

I guess we should change "language" to "content language", with a link to the terminology.

> (2) line-break support is optional but recommended for CJK markets;
> however, it is unclear whether its rules are intended to be applied in
> the absence of "known to be Chinese or Japanese"; e.g., if in a UA
> that supports line-break, the default algorithm for "known to be
> Chinese or Japanese" returns false (e.g., if the entire text is
> "A‥‥B"), then does the rule forbidding a break
> between ‥ characters still apply when line-break:strict?

Yes. Code points that may introduce unexpected behavior are listed under "if the language is known to be ...". The rules outside that condition are either beneficial or harmless to apply regardless of script.

> (3) speaking of "breaks between some inseparable characters: ‥ U+2025,
> … U+2026" what exactly does "between" mean here? does it mean
> between only the following four pairs or something else?
>
> ‥‥
> ‥…
> …‥
> ……

Correct. This refers to the IN (Inseparable Characters)[2] class in UAX#14.
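
To make "between" concrete, here is a minimal sketch of the check, assuming only the two code points quoted in the question. The name break_forbidden_between is hypothetical, and the set is deliberately not the full IN class from UAX#14.

```python
# Hypothetical helper. The set holds only the two code points quoted in the
# question (U+2025 and U+2026), not the full UAX#14 IN class.
INSEPARABLE = {"\u2025", "\u2026"}


def break_forbidden_between(before: str, after: str) -> bool:
    """Return True when both neighbouring characters are inseparable."""
    return before in INSEPARABLE and after in INSEPARABLE


# Enumerate every ordered pair the rule covers under this assumption.
pairs = [(a, b) for a in sorted(INSEPARABLE) for b in sorted(INSEPARABLE)]
```

Under that assumption, pairs holds exactly the four combinations listed in the question, and a pair such as ("A", "\u2025") is not covered by this particular rule.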

> (4) is it permissible for 'auto' behavior to differ from all of
> normal|strict|loose? e.g., map to 'foo' (where foo is defined internally by UA)?

I didn't think about this, but as far as the spec says, I think yes. From an author's perspective, I think yes too; authors should set the property explicitly if they want specific behavior, possibly along with the lang attribute.

> (5) regarding "breaks before postfixes", what if there is nothing prior
> to the postfix or nothing prior within the same element? e.g., if we have
>
> <span style="line-break:strict">
>  <span>X</span><span>%</span>
> </span>
>
> then is a break permitted before the "[don't] break before postfix" '%'?

The line break rules should apply across element boundaries, so the rule should apply in this case too. I know some implementations are broken in this regard, though. As far as I recall from discussing this with fantasai last time, 5.1. Line Breaking Details[3] says "a replaced element or other atomic inline is equivalent to that of the Object Replacement Character (U+FFFC)", so if one of the adjacent elements is an inline-block, this will not apply.
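
One way to picture this is a rough sketch, not how any UA is actually implemented: flatten the inline content into a single string before looking for break opportunities, substituting U+FFFC for each atomic inline. The (kind, text) tuple model and the name flatten are assumptions made up for illustration.

```python
OBJECT_REPLACEMENT = "\uFFFC"


def flatten(inlines):
    """Join inline items into the string the break rules would see.

    `inlines` is a hypothetical model: a list of (kind, text) tuples where
    kind is "text" for ordinary inline content and "atomic" for a replaced
    element or other atomic inline such as an inline-block.
    """
    return "".join(
        OBJECT_REPLACEMENT if kind == "atomic" else text
        for kind, text in inlines
    )


# Two plain spans: the rules see "X%" and the postfix rule still applies.
plain = flatten([("text", "X"), ("text", "%")])

# The second span is an inline-block: the rules see "X" followed by U+FFFC.
with_atomic = flatten([("text", "X"), ("atomic", "%")])
```

In the first case the element boundary disappears, so '%' is still preceded by 'X' as far as the rules are concerned. In the second case the rules never see '%' at all.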

> (6) same question as (5) for "breaks after prefixes", substituting after for before?
>
> <span style="line-break:strict">
>  <span>$</span><span>X</span>
> </span>
> then is a break permitted after the "[don't] break after prefix" '$'?

Same as (5). There are use cases like this:
  <p><ruby>base<rt>r</rt></ruby>.</p>
We don't want to break before the period.

> (7) what is behavior when different line-break modes apply to adjacent text? e.g.
>
> <span style="line-break:loose">$</span><span style="line-break:strict">%</span>
>
> <span style="line-break:strict">$</span><span style="line-break:loose">%</span>

That is a really good question. I thought I had discussed this with fantasai and defined it, but it looks like it was a dream...

Taking the stricter one works well for me. If it works for everyone, I'll add this to the spec.
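
As a sketch of what "take the stricter one" could mean at a boundary between two characters, assuming the ordering loose < normal < strict: 'auto' is left out because its behavior is UA-defined, and the name stricter is hypothetical. This is a proposal, not current spec text.

```python
# Assumed ordering of the keyword values, from least to most strict.
# 'auto' is omitted on purpose because its behavior is UA-defined.
STRICTNESS = {"loose": 0, "normal": 1, "strict": 2}


def stricter(left: str, right: str) -> str:
    """Return the stricter of the two line-break values meeting at a boundary."""
    return left if STRICTNESS[left] >= STRICTNESS[right] else right


# Both examples from question (7) resolve to the same mode.
case_a = stricter("loose", "strict")
case_b = stricter("strict", "loose")
```

With this rule, both orderings in the question are treated as strict at the boundary between '$' and '%'.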

[1] http://dev.w3.org/csswg/css3-text/#content-language

[2] http://unicode.org/reports/tr14/#IN

[3] http://dev.w3.org/csswg/css3-text/#line-break-details


Regards,
Koji

Received on Sunday, 26 August 2012 04:36:42 UTC