[css-text] text-transform:capitalize and Unicode digraphs

7 messages.

[css-text] text-transform:capitalize and Unicode digraphs
Jonathan Kew   Sun, 15 Mar 2015 18:50:56 +0000

www-style > March 2015 > 0000.html

Received on Sunday, 15 March 2015 18:51:25 UTC

Sent to: www-style@w3.org.

Unicode includes a few digraph characters such as "dz" and "lj" that have uppercase (DZ, LJ) and titlecase (Dz, Lj) equivalents. How should these be handled by text-transform:capitalize when they occur in word-initial position?

It's clear that the lowercase digraphs (dz) will be transformed according to their titlecase mapping (Dz), and that titlecase digraphs will be unchanged. But what should be done when the text contains an uppercase digraph such as DZ?

By a strict reading of the current CSS Text draft[1]:

# 'capitalize'
#     Puts the first typographic letter unit of each word in titlecase; other characters are unaffected.

together with the Unicode standard, which gives Dz as the titlecase mapping for DZ, it appears that a word-initial uppercase digraph should be converted to its titlecase (mixed) form. This is the behavior I see in WebKit and Blink with an example like:

data:text/html;charset=utf-8,<div style="text-transform:capitalize">DZa Dza dza

which renders all three "words" identically: "Dza Dza Dza". Gecko, in contrast, does NOT apply the titlecase mapping if the first letter is already uppercase, and so the example renders as "DZa Dza Dza".

Although the spec/WebKit/Blink behavior looks "better" for this (artificial) example, I would argue that Gecko's behavior is preferable. While the "DZa" result here does look poor, it makes little sense for an author to enter text in this form in the first place. In contrast, consider what happens if text that is originally entered as all-uppercase is subject to text-transform:capitalize:

data:text/html;charset=utf-8,<div style="text-transform:capitalize">LJUBLJANA

Here, WebKit and Blink will render the word as "LjUBLJANA", while Gecko gives the (better) result "LJUBLJANA". IMO, this example -- where the entire word is uppercase -- seems more important than the case where an uppercase digraph has been used to begin an otherwise-lowercase word.

So I'd like to propose a minor change to the definition, something like:

# 'capitalize'
#     Puts the first typographic letter unit of each word in titlecase, unless it is already uppercase, in which case it is unchanged. Other characters are unaffected.

An alternative, perhaps even better, would be to make it contextual:

#     Puts the first typographic letter unit of each word in titlecase, unless it is already uppercase and is followed by another uppercase letter, in which case it is unchanged. Other characters are unaffected.

However, given that text-transform:capitalize is likely to remain a rather crude instrument -- it doesn't "know" about language-specific stop lists of small words that should not be capitalized, for example -- I don't think the additional implementation cost of making it context-dependent is worthwhile.

Feedback/comments welcomed....

JK

[1] http://dev.w3.org/csswg/css-text-3/#propdef-text-transform
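
The three candidate behaviors discussed above (strict draft reading, "unchanged if already uppercase", and the contextual variant) can be sketched in Python. This is only an illustration, not any browser's implementation. It assumes that a word is a space-separated run, that the "first typographic letter unit" is the first code point, and that the digraphs are the precomposed characters U+01F1..U+01F3 (DZ, Dz, dz) and U+01C7..U+01C9 (LJ, Lj, lj). The function names are hypothetical.

```python
# Hypothetical helpers illustrating three readings of text-transform:capitalize.
# Assumptions: words are separated by single spaces; the first "typographic
# letter unit" is approximated by the first code point of each word.

def _map_words(text, transform):
    # Apply `transform` to every non-empty space-separated word.
    return " ".join(transform(w) if w else w for w in text.split(" "))

def capitalize_strict(text):
    # Strict draft reading: always apply the Unicode titlecase mapping to the
    # first character. str.title() on one character applies that mapping.
    return _map_words(text, lambda w: w[0].title() + w[1:])

def capitalize_keep_upper(text):
    # Proposed wording: leave the first character alone if it is already
    # uppercase; otherwise titlecase it.
    return _map_words(
        text, lambda w: w if w[0].isupper() else w[0].title() + w[1:]
    )

def capitalize_contextual(text):
    # Contextual alternative: keep an uppercase first character only when the
    # next character is also uppercase.
    def transform(w):
        if w[0].isupper() and len(w) > 1 and w[1].isupper():
            return w
        return w[0].title() + w[1:]
    return _map_words(text, transform)

DZ_UPPER, DZ_TITLE, DZ_LOWER = "\u01f1", "\u01f2", "\u01f3"
LJ_UPPER, LJ_TITLE = "\u01c7", "\u01c8"

sample = f"{DZ_UPPER}a {DZ_TITLE}a {DZ_LOWER}a"
city = f"{LJ_UPPER}UB{LJ_UPPER}ANA"  # all-caps LJUBLJANA using LJ digraphs

for fn in (capitalize_strict, capitalize_keep_upper, capitalize_contextual):
    print(fn.__name__, fn(sample), fn(city))
```

Under these assumptions, the strict variant maps all three sample words to the titlecase digraph and turns the first digraph of the all-caps city name into its titlecase form, while the other two variants leave the all-caps word untouched. The contextual variant differs from the simpler proposal only for an uppercase digraph followed by a lowercase letter.
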
Re: [css-text] text-transform:capitalize and Unicode digraphs
Patrick Dark   Sun, 15 Mar 2015 16:45:19 -0500

Received on Sunday, 15 March 2015 21:45:45 UTC

Sent to: jfkthame@gmail.com
Copied to: www-style@w3.org, www-style@w3.org.

On 3/15/2015 1:50 PM, Jonathan Kew wrote:
> Although the spec/WebKit/Blink behavior looks "better" for this (artificial) example, I would argue that Gecko's behavior is preferable. While the "DZa" result here does look poor, it makes little sense for an author to enter text in this form in the first place. In contrast, consider what happens if text that is originally entered as all-uppercase is subject to text-transform:capitalize:
>
> data:text/html;charset=utf-8,<div style="text-transform:capitalize">LJUBLJANA
>
> Here, WebKit and Blink will render the word as "LjUBLJANA", while Gecko gives the (better) result "LJUBLJANA".

This example seems contrived. In the improved case (LJUBLJANA), you can get the same result by not using the property at all.

On 3/15/2015 1:50 PM, Jonathan Kew wrote:
> However, given that text-transform:capitalize is likely to remain a rather crude instrument -- it doesn't "know" about language-specific stop lists of small words that should not be capitalized, for example -- I don't think the additional implementation cost of making it context-dependent is worthwhile.
>
> Feedback/comments welcomed....

I can't think of a single case where text-transform: capitalize makes sense. Given that, I think the capitalize value should be altogether removed from the CSS3 Text module spec, which would make this issue moot.

Re: [css-text] text-transform:capitalize and Unicode digraphs
Jonathan Kew   Sun, 15 Mar 2015 22:18:51 +0000

Received on Sunday, 15 March 2015 22:19:21 UTC

Sent to: www-style.at.w3.org@patrick.dark.name
Copied to: www-style@w3.org, www-style@w3.org.

On 15/3/15 21:45, Patrick Dark wrote:
> On 3/15/2015 1:50 PM, Jonathan Kew wrote:
>> Although the spec/WebKit/Blink behavior looks "better" for this (artificial) example, I would argue that Gecko's behavior is preferable. While the "DZa" result here does look poor, it makes little sense for an author to enter text in this form in the first place. In contrast, consider what happens if text that is originally entered as all-uppercase is subject to text-transform:capitalize:
>>
>> data:text/html;charset=utf-8,<div style="text-transform:capitalize">LJUBLJANA
>>
>> Here, WebKit and Blink will render the word as "LjUBLJANA", while Gecko gives the (better) result "LJUBLJANA".
>
> This example seems contrived. In the improved case (LJUBLJANA), you can get the same result by not using the property at all.

Yes, clearly it wouldn't make much sense to write such an example directly. But I think it's reasonable to suppose that sites might be applying text-transform:capitalize to elements such as headlines that are being pulled from external data sources, and that some of that external data -- not under the control of the designer writing the CSS for the aggregating site -- might at times be provided in all-caps.

JK

Re: [css-text] text-transform:capitalize and Unicode digraphs
Patrick Dark   Sun, 15 Mar 2015 21:17:15 -0500

Received on Monday, 16 March 2015 02:17:42 UTC

Sent to: jfkthame@gmail.com
Copied to: www-style@w3.org, www-style@w3.org.

On 3/15/2015 5:18 PM, Jonathan Kew wrote:
> But I think it's reasonable to suppose that sites might be applying text-transform:capitalize to elements such as headlines that are being pulled from external data sources, and that some of that external data -- not under the control of the designer writing the CSS for the aggregating site -- might at times be provided in all-caps.

That seems unlikely; if the vast majority of headlines one is aggregating uses conventional title case, then text-transform: capitalize is going to make most imported content look worse by capitalizing things like articles, conjunctions, prepositions, and proper nouns like "amiibo", "document.URL", or "iPhone" which, conventionally, begin with a lowercase letter. If an aggregator is willing to mangle their imported text like that, then I don't see why they'd be particularly concerned about an all-caps headline being restyled with a title case digraph.

The above use-case seems especially unlikely because it requires three unlikely scenarios to occur at once: (A) an author applies text-transform: capitalize to all of their imported headlines; (B) the author is importing content with malformed, all-caps headlines; and (C) some of those all-caps headlines contain digraphs.

Re: [css-text] text-transform:capitalize and Unicode digraphs
Brad Kemper   Mon, 16 Mar 2015 09:13:34 -0700

Received on Monday, 16 March 2015 16:14:03 UTC

Sent to: www-style.at.w3.org@patrick.dark.name
Copied to: jfkthame@gmail.com, www-style@w3.org, www-style@w3.org.

> On Mar 15, 2015, at 7:17 PM, Patrick Dark <www-style.at.w3.org@patrick.dark.name> wrote:
>
>> On 3/15/2015 5:18 PM, Jonathan Kew wrote:
>> But I think it's reasonable to suppose that sites might be applying text-transform:capitalize to elements such as headlines that are being pulled from external data sources, and that some of that external data -- not under the control of the designer writing the CSS for the aggregating site -- might at times be provided in all-caps.
>
> That seems unlikely;

It happens all the time.

> if the vast majority of headlines one is aggregating uses conventional title case,

Depends on the source of your headlines. If the source is the first several words of a comment someone left, for instance, it might be in all caps, all lowercase, or anything in between. In such cases, text-transform:capitalize is a good way to stylistically normalize the case into something that looks like a title.

> then text-transform: capitalize is going to make most imported content look worse by capitalizing things like articles, conjunctions, prepositions, and proper nouns like "amiibo", "document.URL", or "iPhone" which, conventionally, begin with a lowercase letter.

If your source is so pristine that everyone writing the headlines is consistently following the style guide for not capitalizing articles, conjunctions, prepositions, etc. then you don't need text-transform:capitalize. But others do, even if it is a more simplistic algorithm.

> If an aggregator is willing to mangle their imported text like that, then I don't see why they'd be particularly concerned about an all-caps headline being restyled with a title case digraph.

Because we don't live in a perfect world, and "better" or "good enough" is often better than "didn't even try to improve" something that starts off with a lot of inconsistencies.

> The above use-case seems especially unlikely because it requires three unlikely scenarios to occur at once: (A) an author applies text-transform: capitalize to all of their imported headlines;

Not at all unlikely, in many situations.

> (B) the author is importing content with malformed, all-caps headlines;

Not at all unlikely.

> and (C) some of those all-caps headlines contain digraphs.

Re: [css-text] text-transform:capitalize and Unicode digraphs
Patrick Dark   Mon, 16 Mar 2015 20:10:20 -0500

Received on Tuesday, 17 March 2015 01:10:45 UTC

Sent to: brad.kemper@gmail.com
Copied to: jfkthame@gmail.com, www-style@w3.org, www-style@w3.org.

On 3/16/2015 11:13 AM, Brad Kemper wrote:
> On Mar 15, 2015, at 7:17 PM, Patrick Dark <www-style.at.w3.org@patrick.dark.name> wrote:
>>> On 3/15/2015 5:18 PM, Jonathan Kew wrote:
>>> But I think it's reasonable to suppose that sites might be applying text-transform:capitalize to elements such as headlines that are being pulled from external data sources, and that some of that external data -- not under the control of the designer writing the CSS for the aggregating site -- might at times be provided in all-caps.
>> That seems unlikely;
> It happens all the time.

Can you provide a few examples?

On 3/16/2015 11:13 AM, Brad Kemper wrote:
> Depends on the source of your headlines. If the source is the first several words of a comment someone left, for instance, it might be in all caps, all lowercase, or anything in between. In such cases, text-transform:capitalize is a good way to stylistically normalize the case into something that looks like a title.

Can you provide a few examples? In particular, I'm curious to see usage involving "the first several words of a comment someone left".

On 3/16/2015 11:13 AM, Brad Kemper wrote:
>> The above use-case seems especially unlikely because it requires three unlikely scenarios to occur at once: (A) an author applies text-transform: capitalize to all of their imported headlines;
> Not at all unlikely, in many situations.

Can you provide a few examples?

On 3/16/2015 11:13 AM, Brad Kemper wrote:
>> (B) the author is importing content with malformed, all-caps headlines;
> Not at all unlikely.

Can you provide a few examples?

Re: [css-text] text-transform:capitalize and Unicode digraphs
fantasai   Mon, 5 Mar 2018 15:53:58 +0900

Received on Monday, 5 March 2018 06:54:35 UTC

Sent to: jfkthame@gmail.com, www-style@w3.org.

On 03/16/2015 03:50 AM, Jonathan Kew wrote:
> Unicode includes a few digraph characters such as "dz" and "lj" that have uppercase (DZ, LJ) and titlecase (Dz, Lj) equivalents. How should these be handled by text-transform:capitalize when they occur in word-initial position?
>
> It's clear that the lowercase digraphs (dz) will be transformed according to their titlecase mapping (Dz), and that titlecase digraphs will be unchanged. But what should be done when the text contains an uppercase digraph such as DZ?
> ...
> So I'd like to propose a minor change to the definition, something like:
>
> # 'capitalize'
> #     Puts the first typographic letter unit of each word in titlecase, unless it is already uppercase, in which case it is unchanged. Other characters are unaffected.
>
> An alternative, perhaps even better, would be to make it contextual:
>
> #     Puts the first typographic letter unit of each word in titlecase, unless it is already uppercase and is followed by another uppercase letter, in which case it is unchanged. Other characters are unaffected.
>
> However, given that text-transform:capitalize is likely to remain a rather crude instrument -- it doesn't "know" about language-specific stop lists of small words that should not be capitalized, for example -- I don't think the additional implementation cost of making it context-dependent is worthwhile.

Hi Jonathan,

The CSSWG accepted your proposed changes in
  https://lists.w3.org/Archives/Public/www-style/2016Oct/0068.html
and the changes were committed in
  https://hg.csswg.org/drafts/rev/11e8aa074031

Please let me know if that resolves the issue, or if further edits are needed.

Thanks (and sorry for the belated response)!

~fantasai