I18N-ISSUE-316

13 messages.

[css-text] I18N-ISSUE-316: Line breaking defaults
"Phillips, Addison"   Fri, 24 Jan 2014 18:22:32 +0000

Received on Friday, 24 January 2014 18:27:35 UTC

Sent to: www-style@w3.org, www-style@w3.org
Copied to: www-international@w3.org.

State: OPEN WG Comment
Product: CSS3-text
Raised by: Richard Ishida
Opened on: 2013-12-11

Description:

5. Line Breaking and Word Boundaries
http://www.w3.org/TR/2013/WD-css-text-3-20131010/#line-breaking

"CSS does not fully define where soft wrap opportunities occur, however some controls are provided to distinguish common variations"

I think that, for the sake of interoperability, the CSS spec should require the use of UAX14 as a default for line breaking behaviour. It should also state that the rules in UAX14 may need tailoring for certain scripts, and that the properties specified in this section assist the user in controlling line breaking behaviour.

Text in the spec such as the definition of word-break: normal, which says "Words break according to their usual rules", would then provide a little more guidance to the implementor.

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Koji Ishii   Sun, 20 Apr 2014 16:20:29 +0000

Received on Sunday, 20 April 2014 16:21:05 UTC

Sent to: addison@lab126.com
Copied to: www-style@w3.org, www-style@w3.org, www-international@w3.org.

Thank you for the feedback, this issue was also pointed out by DPUB IG[1] and was fixed.

[1] http://lists.w3.org/Archives/Public/www-style/2014Apr/0262.html

/koji

On Jan 25, 2014, at 3:22 AM, Phillips, Addison <addison@lab126.com> wrote:

> State:
> OPEN WG Comment
> Product:
> CSS3-text
> Raised by:
> Richard Ishida
> Opened on:
> 2013-12-11
> Description:
> 5. Line Breaking and Word Boundaries
> http://www.w3.org/TR/2013/WD-css-text-3-20131010/#line-breaking
>
> "CSS does not fully define where soft wrap opportunities occur, however some controls are provided to distinguish common variations"
>
> I think that, for the sake of interoperability, the CSS spec should require the use of UAX14 as a default for line breaking behaviour. It should also state that the rules in UAX14 may need tailoring for certain scripts, and that the properties specified in this section assist the user in controlling line breaking behaviour.
>
> Text in the spec such as the definition of word-break: normal, which says "Words break according to their usual rules", would then provide a little more guidance to the implementor.

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Koji Ishii   Thu, 8 May 2014 09:03:37 +0000

Received on Thursday, 8 May 2014 09:04:13 UTC

Sent to: addison@lab126.com
Copied to: www-style@w3.org, www-style@w3.org, www-international@w3.org.

By reading the comment again, I think the fix for DPUB IG does not cover all the comments in this thread, so I’ll keep this issue open.

/koji

On Apr 20, 2014, at 9:20, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:

> Thank you for the feedback, this issue was also pointed out by DPUB IG[1] and was fixed.
>
> [1] http://lists.w3.org/Archives/Public/www-style/2014Apr/0262.html
>
> /koji
>
> On Jan 25, 2014, at 3:22 AM, Phillips, Addison <addison@lab126.com> wrote:
>
>> State:
>> OPEN WG Comment
>> Product:
>> CSS3-text
>> Raised by:
>> Richard Ishida
>> Opened on:
>> 2013-12-11
>> Description:
>> 5. Line Breaking and Word Boundaries
>> http://www.w3.org/TR/2013/WD-css-text-3-20131010/#line-breaking
>>
>> "CSS does not fully define where soft wrap opportunities occur, however some controls are provided to distinguish common variations"
>>
>> I think that, for the sake of interoperability, the CSS spec should require the use of UAX14 as a default for line breaking behaviour. It should also state that the rules in UAX14 may need tailoring for certain scripts, and that the properties specified in this section assist the user in controlling line breaking behaviour.
>>
>> Text in the spec such as the definition of word-break: normal, which says "Words break according to their usual rules", would then provide a little more guidance to the implementor.

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Koji Ishii   Sat, 10 May 2014 07:32:39 +0000

Received on Saturday, 10 May 2014 07:33:18 UTC

Sent to: addison@lab126.com
Copied to: www-style@w3.org, www-style@w3.org, www-international@w3.org.

My first response was actually the response for I18N-ISSUE-314, to which fantasai also responded. Re-replying for the original issue:

> I think that, for the sake of interoperability, the CSS spec should require the use of UAX14 as a default for line breaking behaviour. It should also state that the rules in UAX14 may need tailoring for certain scripts, and that the properties specified in this section assist the user in controlling line breaking behaviour.

That makes sense, though, we can’t do that today for web-compatibility. We’ll keep considering this in future.

/koji

On May 8, 2014, at 2:03, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:

> By reading the comment again, I think the fix for DPUB IG does not cover all the comments in this thread, so I’ll keep this issue open.
>
> /koji
>
> On Apr 20, 2014, at 9:20, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
>
>> Thank you for the feedback, this issue was also pointed out by DPUB IG[1] and was fixed.
>>
>> [1] http://lists.w3.org/Archives/Public/www-style/2014Apr/0262.html
>>
>> /koji
>>
>> On Jan 25, 2014, at 3:22 AM, Phillips, Addison <addison@lab126.com> wrote:
>>
>>> State:
>>> OPEN WG Comment
>>> Product:
>>> CSS3-text
>>> Raised by:
>>> Richard Ishida
>>> Opened on:
>>> 2013-12-11
>>> Description:
>>> 5. Line Breaking and Word Boundaries
>>> http://www.w3.org/TR/2013/WD-css-text-3-20131010/#line-breaking
>>>
>>> "CSS does not fully define where soft wrap opportunities occur, however some controls are provided to distinguish common variations"
>>>
>>> I think that, for the sake of interoperability, the CSS spec should require the use of UAX14 as a default for line breaking behaviour. It should also state that the rules in UAX14 may need tailoring for certain scripts, and that the properties specified in this section assist the user in controlling line breaking behaviour.
>>>
>>> Text in the spec such as the definition of word-break: normal, which says "Words break according to their usual rules", would then provide a little more guidance to the implementor.

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Richard Ishida   Fri, 23 May 2014 15:24:58 +0100

Received on Friday, 23 May 2014 14:25:36 UTC

Sent to: kojiishi@gluesoft.co.jp, addison@lab126.com
Copied to: www-style@w3.org, www-style@w3.org, www-international@w3.org.

On 10/05/2014 08:32, Koji Ishii wrote:
> That makes sense, though, we can’t do that today for web-compatibility. We’ll keep considering this in future.

Could you give some more details?

Thanks,
RI

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Koji Ishii   Sun, 25 May 2014 05:28:25 +0000

Received on Sunday, 25 May 2014 05:29:03 UTC

Sent to: ishida@w3.org
Copied to: addison@lab126.com, www-style@w3.org, www-style@w3.org, www-international@w3.org.

On May 23, 2014, at 11:24 PM, Richard Ishida <ishida@w3.org> wrote:

> On 10/05/2014 08:32, Koji Ishii wrote:
>> That makes sense, though, we can’t do that today for web-compatibility. We’ll keep considering this in future.
>
> Could you give some more details?

Sorry I was too terse.

What implementers want to know is what to fix, rather than being asked to scrap and rebuild their existing line breaking code. There are too many documents on the web that rely on existing behavior; I think we have come too far for scrap-and-rebuild to give better results than fixing individual issues.

I’m very happy to hear feedback where existing implementations do differently from UAX#14, so that we could examine each issue and decide whether or how to fix them.

One example was from Kenny last year[1], where IE, Chrome, Safari, and Firefox do not honor the UAX#14 line breaking behavior between &nbsp; and replaced elements. We fixed this by changing the rule priorities of LB20 and LB11/LB12, since it was considered that the benefits of following UAX#14 for this specific issue are lower than the negative impact of changing this behavior for existing documents. Note that this is not a WG resolution; the editors thought this is the right thing to do and there have been no objections so far (or no attention yet ;), so please re-raise if our judgment does not seem to be right. But I hope this example makes sense to you.

As we get more specific feedback and the spec level increases, we could add more to the Line Breaking Details[2] if we had real specific issues and people agree that the benefits win over the breaking changes for that issue.

At the point that implementers think that their existing code has incorporated all the changes needed to be conformant to UAX#14, we could remove all such details and just say “follow UAX#14”, but until then, we would like to know what to fix against existing implementations.

[1] http://lists.w3.org/Archives/Public/www-style/2014May/0069.html
[2] http://dev.w3.org/csswg/css-text/#line-break-details

/koji

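The effect of swapping rule priorities, as described in the message above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the two rule lists, and the "no break" fallback are hypothetical, and the two rules are simplified stand-ins for LB12 (no break after glue such as U+00A0) and LB20 (break before and after a contingent-break object such as a replaced element). UAX#14 applies its rules in numbered order, so LB12 normally wins over LB20.

```python
# Simplified line-break classes (names follow UAX#14):
#   GL = glue (e.g. U+00A0 NO-BREAK SPACE)
#   CB = contingent break (stands in for a replaced element)
#   AL = ordinary letter

def lb12_glue(before, after):
    """LB12-style rule: never break after a glue character."""
    return False if before == "GL" else None

def lb20_contingent(before, after):
    """LB20-style rule: allow a break before and after a replaced element."""
    return True if "CB" in (before, after) else None

def can_break(before, after, rules):
    """Apply rules in priority order; the first rule with an opinion wins."""
    for rule in rules:
        verdict = rule(before, after)
        if verdict is not None:
            return verdict
    return False  # hypothetical fallback for this sketch only

# Hypothetical orderings: UAX#14 numbering versus the web-compatible priority.
UAX14_ORDER = [lb12_glue, lb20_contingent]
WEB_COMPAT_ORDER = [lb20_contingent, lb12_glue]

print(can_break("GL", "CB", UAX14_ORDER))       # False: glue suppresses the break
print(can_break("GL", "CB", WEB_COMPAT_ORDER))  # True: the replaced element wins
```

The same pair of classes gives opposite answers depending only on which rule is consulted first, which is why a change of priority is enough to match existing browser behavior without touching either rule.
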
Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Richard Ishida   Fri, 25 Jul 2014 19:22:34 +0100

Received on Friday, 25 July 2014 18:23:10 UTC

Sent to: kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-style@w3.org, www-international@w3.org.

On 25/05/2014 06:28, Koji Ishii wrote:
> I’m very happy to hear feedback where existing implementations do differently from UAX#14, so that we could examine each issue and decide whether or how to fix them.

That information is available as follows:

For general characters:

Line break, BA: Break after characters
http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba#ba_space
(good support on the whole, but some categories not or half-heartedly supported by Firefox and IE - seems like just a question of adding them to a list somewhere)

SP, ZW: Non-tailorable spaces
http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#sp
(all supported)

GL: Non-breaking ("Glue")
http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#gl
(all supported, except for 3 Tibetan chars in FF and IE)

WJ: Word joiner
http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#wj
(all supported)

For CJK in the default case:

OP: Opening punctuation, CL: Closing punctuation & NS: Non-starters
http://www.w3.org/International/tests/repository/css3-text/line-break-opclns/results-opclns
(good support for Chrome, Safari & Opera - significant gaps but also a fair amount of support from FF and IE - again, maybe just need a list updating?)

NS: Non-starters, small kana
http://www.w3.org/International/tests/repository/css3-text/line-break-opclns/results-opclns#kana
(full support by FF and IE, but no support for Chrome, Safari & Opera)

This last, small category appears to be the only one where systematic differences appear for the different browsers*. My guess is that, for the other characters, we are rather looking at a lack of items in a list.

Hope that helps,
RI

* There are a set of tests for the line-break property for Japanese
http://www.w3.org/International/tests/repository/css3-text/line-break-jazh/results-ja
and Chinese
http://www.w3.org/International/tests/repository/css3-text/line-break-jazh/results-zh
which appear to bear out this difference in philosophy.

PS: All the above tests have been copied to the CSS Test Suite.

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
fantasai   Fri, 25 Jul 2014 19:52:33 +0100

Received on Friday, 25 July 2014 18:53:09 UTC

Sent to: ishida@w3.org, kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-style@w3.org, www-international@w3.org.

On 07/25/2014 07:22 PM, Richard Ishida wrote:
> On 25/05/2014 06:28, Koji Ishii wrote:
>> I’m very happy to hear feedback where existing implementations do differently from UAX#14, so that we could examine each issue and decide whether or how to fix them.
>
> That information is available as follows:
>
> For general characters:
>
> Line break, BA: Break after characters
> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba#ba_space
> (good support on the whole, but some categories not or half-heartedly supported by Firefox and IE - seems like just a question of adding them to a list somewhere)
> [...]
> Hope that helps,

Very nice summary, yes. :)

One of the main problems is actually the handling of various punctuation like slashes. A lot of these breaks need some amount of prioritization in order to work correctly. See, for example, this bug:

https://bugzilla.mozilla.org/show_bug.cgi?id=389710

We do normatively require the behavior defined for the following categories:
BK, CR, LF, CM, NL, SG, WJ, ZW, GL, CJ

I think I'd be OK to include the restrictions for opening and closing punctuation... however, since there are very real problems with simply adopting the UAX14 pairs table, I don't want to normatively require its implementation.

As Koji says, adopting UAX14 wholesale would require a very detailed review of UAX14, its compatibility with dumb line-breaking algorithms like a pairs table without prioritization, and Web-compatibility. And that is not a task we'd like to tackle right now.

~fantasai

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Richard Ishida   Wed, 08 Oct 2014 20:07:16 +0100

Received on Wednesday, 8 October 2014 19:07:56 UTC

Sent to: fantasai.lists@inkedblade.net, kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-style@w3.org, www-international@w3.org.

On 25/07/2014 19:52, fantasai wrote:
> On 07/25/2014 07:22 PM, Richard Ishida wrote:
>> On 25/05/2014 06:28, Koji Ishii wrote:
>>> I’m very happy to hear feedback where existing implementations do differently from UAX#14, so that we could examine each issue and decide whether or how to fix them.
>>
>> That information is available as follows:
>>
>> For general characters:
>>
>> Line break, BA: Break after characters
>> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba#ba_space
>> (good support on the whole, but some categories not or half-heartedly supported by Firefox and IE - seems like just a question of adding them to a list somewhere)
>> [...]
>> Hope that helps,
>
> Very nice summary, yes. :)
>
> One of the main problems is actually the handling of various punctuation like slashes. A lot of these breaks need some amount of prioritization in order to work correctly. See, for example, this bug:
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=389710
>
> We do normatively require the behavior defined for the following categories:
> BK, CR, LF, CM, NL, SG, WJ, ZW, GL, CJ
>
> I think I'd be OK to include the restrictions for opening and closing punctuation... however, since there are very real problems with simply adopting the UAX14 pairs table, I don't want to normatively require its implementation.
>
> As Koji says, adopting UAX14 wholesale would require a very detailed review of UAX14, its compatibility with dumb line-breaking algorithms like a pairs table without prioritization, and Web-compatibility. And that is not a task we'd like to tackle right now.

But what we're asking for is that the spec recommend that UAX14 be used as the default, ie. in lieu of any other considerations - we're not asking that browsers conform to it rigidly. (We are also suggesting, remember, that there be clear indication that tailoring is needed for certain characters in certain scripts.)

Falling back to UAX14 as a default would at least (a) prompt implementers to consider improving conformance to UAX14 for cases that are currently just ignored and not actually controversial, (b) provide some kind of consistency and predictability (ie. interop) going forward for characters that are not (yet) problematic, and possibly (c) prompt people to request special behaviour for particular characters in particular scripts - at least they'd be starting from a common base.

For example, as I point out in my summary [1], there is currently a lack of interop in a subset of the 140-odd characters just at [2] and [3] that is unlikely to be controversial but which will be currently affecting support for content in a number of languages. It would be good to ensure that Firefox and IE do the same as Chrome, Safari and Opera for these cases.

At least while we wait for people to tell us that something different is needed for a given character we would see a consistent handling of that character.

ri

[1] http://lists.w3.org/Archives/Public/www-international/2014JulSep/0073.html
[2] http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba
[3] http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#gl

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Richard Ishida   Wed, 22 Oct 2014 17:51:02 +0100

Received on Wednesday, 22 October 2014 16:51:36 UTC

Sent to: fantasai.lists@inkedblade.net, kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-style@w3.org, www-international@w3.org.

The i18n WG discussed this at http://www.w3.org/2014/10/16-i18n-minutes.html#item06 and concluded that there should be normative wording to say that UAX14 SHOULD be followed except in those specific cases where issues arise (we don't think there are many besides the kana characters, and mostly it's a question of encouraging those who haven't implemented UAX14 for a given set of characters to catch up with those browsers that do). See the test results below for details.

ri

On 08/10/2014 20:07, Richard Ishida wrote:
> On 25/07/2014 19:52, fantasai wrote:
>> On 07/25/2014 07:22 PM, Richard Ishida wrote:
>>> On 25/05/2014 06:28, Koji Ishii wrote:
>>>> I’m very happy to hear feedback where existing implementations do differently from UAX#14, so that we could examine each issue and decide whether or how to fix them.
>>>
>>> That information is available as follows:
>>>
>>> For general characters:
>>>
>>> Line break, BA: Break after characters
>>> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba#ba_space
>>> (good support on the whole, but some categories not or half-heartedly supported by Firefox and IE - seems like just a question of adding them to a list somewhere)
>>> [...]
>>> Hope that helps,
>>
>> Very nice summary, yes. :)
>>
>> One of the main problems is actually the handling of various punctuation like slashes. A lot of these breaks need some amount of prioritization in order to work correctly. See, for example, this bug:
>>
>> https://bugzilla.mozilla.org/show_bug.cgi?id=389710
>>
>> We do normatively require the behavior defined for the following categories:
>> BK, CR, LF, CM, NL, SG, WJ, ZW, GL, CJ
>>
>> I think I'd be OK to include the restrictions for opening and closing punctuation... however, since there are very real problems with simply adopting the UAX14 pairs table, I don't want to normatively require its implementation.
>>
>> As Koji says, adopting UAX14 wholesale would require a very detailed review of UAX14, its compatibility with dumb line-breaking algorithms like a pairs table without prioritization, and Web-compatibility. And that is not a task we'd like to tackle right now.
>
> But what we're asking for is that the spec recommend that UAX14 be used as the default, ie. in lieu of any other considerations - we're not asking that browsers conform to it rigidly. (We are also suggesting, remember, that there be clear indication that tailoring is needed for certain characters in certain scripts.)
>
> Falling back to UAX14 as a default would at least (a) prompt implementers to consider improving conformance to UAX14 for cases that are currently just ignored and not actually controversial, (b) provide some kind of consistency and predictability (ie. interop) going forward for characters that are not (yet) problematic, and possibly (c) prompt people to request special behaviour for particular characters in particular scripts - at least they'd be starting from a common base.
>
> For example, as I point out in my summary [1], there is currently a lack of interop in a subset of the 140-odd characters just at [2] and [3] that is unlikely to be controversial but which will be currently affecting support for content in a number of languages. It would be good to ensure that Firefox and IE do the same as Chrome, Safari and Opera for these cases.
>
> At least while we wait for people to tell us that something different is needed for a given character we would see a consistent handling of that character.
>
> ri
>
> [1] http://lists.w3.org/Archives/Public/www-international/2014JulSep/0073.html
> [2] http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba
> [3] http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#gl

Re: [css-text] I18N-ISSUE-316: Line breaking defaults
fantasai   Wed, 22 Oct 2014 18:12:55 -0400

Received on Wednesday, 22 October 2014 22:13:30 UTC

Sent to: ishida@w3.org, kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-style@w3.org, www-international@w3.org.

On 10/22/2014 12:51 PM, Richard Ishida wrote:
> The i18n WG discussed this at http://www.w3.org/2014/10/16-i18n-minutes.html#item06 and concluded that there should be normative wording to say that UAX14 SHOULD be followed except in those specific cases where issues arise (we don't think there are many besides the kana characters, and mostly it's a question of encouraging those who haven't implemented UAX14 for a given set of characters to catch up those browsers that do). See the test results below for details.

It's not an issue of kana characters, those are actually normatively covered in the section on 'line-break'.

It's also not an issue of the non-tailorable sets you are citing, since those are already normatively required also.

If you're asking about the BA category, in order to safely make a normative requirement, I need it split into two sets:
  - characters after which a break is always permissible and recommended, such as the visible word separators
  - characters after which a break is sometimes a good idea but not always, such as hyphens and slashes

I will not issue a normative recommendation to honor BA behavior of the second category. This will result in bad line-breaking when implementations try to comply without performing a thoughtful survey of each individual case and what contextual information the line break may need to consider. Please note that this is not a theoretical concern: we have already run into this exact problem.

~fantasai

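The two-set split requested in the message above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the function name, the two character sets (which are not taken from the UAX14 data files), and the sample string. U+0020 is class SP rather than BA and is included only so that ordinary text wraps; "-" and "/" stand in for the hyphens and slashes mentioned in the message.

```python
# Hypothetical split of break-after characters into the two sets described above.
ALWAYS_BREAK_AFTER = {" ", "\u1361"}  # U+0020, and U+1361 ETHIOPIC WORDSPACE (a visible word separator)
CONTEXT_DEPENDENT = {"-", "/"}         # stand-ins for hyphens and slashes

def break_opportunities(text, honor_context_dependent):
    """Return each index i where a line may break between text[i-1] and text[i]."""
    allowed = set(ALWAYS_BREAK_AFTER)
    if honor_context_dependent:
        allowed |= CONTEXT_DEPENDENT
    return [i for i in range(1, len(text)) if text[i - 1] in allowed]

sample = "see e-mail :-) -x"

# Honoring only the first set breaks at the spaces.
print(break_opportunities(sample, honor_context_dependent=False))  # [4, 11, 15]

# Honoring both sets also breaks inside "e-mail", ":-)" and "-x".
print(break_opportunities(sample, honor_context_dependent=True))   # [4, 6, 11, 13, 15, 16]
```

The extra indices in the second result (6, 13 and 16) are the kind of break that a blanket requirement for the second set would introduce without any contextual check.
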
Re: [css-text] I18N-ISSUE-316: Line breaking defaults
Asmus Freytag   Wed, 22 Oct 2014 15:49:43 -0700

Received on Wednesday, 22 October 2014 22:50:05 UTC

Sent to: fantasai.lists@inkedblade.net, ishida@w3.org, kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-international@w3.org.

On 10/22/2014 3:12 PM, fantasai wrote:
> On 10/22/2014 12:51 PM, Richard Ishida wrote:
>> The i18n WG discussed this at http://www.w3.org/2014/10/16-i18n-minutes.html#item06 and concluded that there should be normative wording to say that UAX14 SHOULD be followed except in those specific cases where issues arise (we don't think there are many besides the kana characters, and mostly it's a question of encouraging those who haven't implemented UAX14 for a given set of characters to catch up those browsers that do). See the test results below for details.
>
> It's not an issue of kana characters, those are actually normatively covered in the section on 'line-break'.
>
> It's also not an issue of the non-tailorable sets you are citing, since those are already normatively required also.
>
> If you're asking about the BA category, in order to safely make a normative requirement, I need it split into two sets:

BA Category 1
> - characters after which a break is always permissible and recommended, such as the visible word separators

BA Category 2
> - characters after which a break is sometimes a good idea but not always, such as hyphens and slashes

Are there any other members of Category 2?

Is the issue "generic" to all kinds of hyphens and slashes, or is it "specific" to special strings like dates, path names or identifiers?

If it's the latter, then the proper approach would be to focus on the fact that you may want either

a) normative default breaking of path names and similar identifiers to be different from normative default line breaking of ordinary text (but then you'd have to specify that in detail)

b) normatively reserve the ability for UAs to do "better" for those kinds of strings (and leave it up to the UAs to recognize and handle them).

> I will not issue a normative recommendation to honor BA behavior of the second category. This will result in bad line-breaking when implementations try to comply without performing a thoughtful survey of each individual case and what contextual information the line break may need to consider. Please note that this is not a theoretical concern: we have already run into this exact problem.

I suspect that the issue is more about substrings that represent some special context, rather than the generic occurrence of these in running text.

Because of that, I suggest the proper approach would be to specify that UAs should be allowed (encouraged?) to recognize patterns like date strings and to apply specific line breaking logic to them (treat them as embedded objects with their own rules, in other words).

If I understood the discussion to this point correctly, the use of UAX#14 was intended as a common default, not as a limit to what UAs could do to provide more sophisticated line breaking.

A./

> ~fantasai

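The suggestion above, recognizing patterns such as date strings and treating them as unbreakable embedded objects, could look roughly like the following Python sketch. It is a hedged illustration only: the regular expression, the function name, and the simplistic "break after space, hyphen or slash" default are all hypothetical and do not come from UAX#14 or from any browser implementation.

```python
import re

# Hypothetical pattern for numeric dates such as 2014-10-22 or 22/10/2014.
DATE_RE = re.compile(r"\b\d{1,4}[-/]\d{1,2}[-/]\d{1,4}\b")

def break_opportunities(text):
    """Break after spaces, hyphens and slashes, except inside a recognized date."""
    protected = set()
    for match in DATE_RE.finditer(text):
        # Positions strictly inside the match must not become break points.
        protected.update(range(match.start() + 1, match.end()))
    return [
        i for i in range(1, len(text))
        if text[i - 1] in " -/" and i not in protected
    ]

# The date stays in one piece, while "and/or" keeps its default break.
print(break_opportunities("due 2014-10-22 and/or later"))  # [4, 15, 19, 22]
```

The default rule is applied everywhere and the recognizer only removes opportunities, which matches the reading of UAX#14 as a common default that UAs may refine rather than as a limit.
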
Re: [css-text] I18N-ISSUE-316: Line breaking defaults
fantasai   Thu, 23 Oct 2014 01:17:51 -0400

Received on Thursday, 23 October 2014 05:18:21 UTC

Sent to: asmusf@ix.netcom.com, ishida@w3.org, kojiishi@gluesoft.co.jp
Copied to: addison@lab126.com, www-style@w3.org, www-international@w3.org.

On 10/22/2014 06:49 PM, Asmus Freytag wrote:
> On 10/22/2014 3:12 PM, fantasai wrote:
>>
>> If you're asking about the BA category, in order to safely make a normative requirement, I need it split into two sets:
>
> BA Category 1
>> - characters after which a break is always permissible and recommended, such as the visible word separators
>
> BA Category 2
>> - characters after which a break is sometimes a good idea but not always, such as hyphens and slashes
>
> Are there any other members of Category 2?

I am unsure and don't have the time to solve this particular problem within the next 2 weeks. If you or the i18nwg would like to go through the entire list and annotate it over the next couple weeks, then perhaps we could ask the CSSWG to reconsider this issue.

Personally I don't see why we are so concerned. UAX14 is already referenced normatively for all the non-tailorable categories and informatively for all the rest. I am sure that any implementer would be happy to accept bugs filed against their implementation for specific cases where it is clearly better than the line-breaking behavior they have now.

I am not in favor of normatively requiring all of UAX14 because I don't want anyone to go filing bugs against implementers where they violate UAX14's tailorable rules and say "you should follow these rules because they're required [unless you can justify otherwise]". If we're filing line-breaking bugs, I want them to be argued on correctness for the particular characters that are not compliant. I want UAX14 to be used as a source of information, not as a source of rules, and for that an informative reference is the right approach.

UAX14 line breaking is great *iff* you have a more sophisticated algorithm that is not simply a pairs table, that has some level of prioritization-by-distance or perhaps some other kind of heuristics. It is not, in its current state, suitable for compliance by a pairwise implementation.

> Is the issue "generic" to all kinds of hyphens and slashes, or is it "specific" to special strings like dates, path names or identifiers?

It's fairly broad. E-mail, for example, shouldn't be broken at the hyphen. Neither should :-) nor -x. And of course, as you mention, neither should dates.

>> I will not issue a normative recommendation to honor BA behavior of the second category. This will result in bad line-breaking when implementations try to comply without performing a thoughtful survey of each individual case and what contextual information the line break may need to consider. Please note that this is not a theoretical concern: we have already run into this exact problem.
>
> I suspect that the issue is more about substrings that represent some special context, rather than the generic occurrence of these in running text.

It was both. When unsure, it is safer to not break than to break.

Knowing that the UAX14 pairs table is insufficient for acceptable line breaking, and that UAs attempting to "improve" their implementation by following it will regress, I cannot in good conscience require it as a baseline. I believe, based on past experience of doing exactly that, that this approach will result in problems for our implementers.

I stand by my answer in
http://lists.w3.org/Archives/Public/www-style/2014Jul/0500.html
and I think the existing references to UAX14 are sufficient given the current situation.

Which doesn't mean we can't work on creating a safer pairs table that is suitable for dumb line-breaking implementations applied to Web content, and require that in the future. But as Koji and I keep re-iterating, that is a significantly larger project than is in-scope for us right now.

~fantasai