ISSUE-407: Clarification of initial letter example

State:
CLOSED
Product:
ilreq
Raised by:
Richard Ishida
Opened on:
2015-01-16
Description:
5.1 First Letter
http://www.w3.org/TR/2014/WD-ilreq-20141216/#first-letter

"Note how the vowel sign appears to the left of the first character, not the third. There are three grapheme clusters here. The first includes the SA+VIRAMA,THA+I and T+II. We see that the styling is done on the basis of the syllable, not the first character. A syllable includes a base consonant and any combination of the following characters in the text stream:"

This text is misleading when paired with figure 4 when it talks about 3 graphemes and there are 3 red circles. It also doesn't show first letter styling, as the text says, which is confusing. There is also an error in the romanization.

How about the following wording, based around the example at https://www.flickr.com/photos/ishida/16084553630/
I also suggest renaming the section to Initial Letter Styling, to match the CSS Inline spec
---------
Indic script behavior in initial letter styling is based on syllables, rather than individual letter forms.

Figure 4 shows an example of a drop initial in Hindi. In the first word of the paragraph, स्कूल ('skūl'), the sequence of characters stored in memory is as follows:

स ‎U+0938 DEVANAGARI LETTER SA
् ‎U+094D DEVANAGARI SIGN VIRAMA
क ‎U+0915 DEVANAGARI LETTER KA
ू ‎U+0942 DEVANAGARI VOWEL SIGN UU
ल ‎U+0932 DEVANAGARI LETTER LA
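
The stored sequence listed above can be checked with a short script. This is only an illustrative sketch using Python's standard unicodedata module; the helper name describe is an assumption made for this example and is not part of the proposed wording.

```python
import unicodedata

# The word from the example, written with escapes so the stored
# code point order is explicit: SA, VIRAMA, KA, VOWEL SIGN UU, LA.
word = "\u0938\u094D\u0915\u0942\u0932"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for line in describe(word):
    print(line)
```

The printed order is the memory order, which is why the vowel sign UU is stored after KA even though the whole cluster is rendered as a single unit.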

There are two syllables in this word: SA+VIRAMA+KA+UU and LA. Note, however, that there are three Unicode grapheme clusters here: SA+VIRAMA, KA+UU and LA.

Styling is done on the basis of the whole orthographic syllable, not the first character, nor even the first grapheme.

A syllable includes a base consonant and any combination of the following characters in the text stream:
- sequences of consonants preceded by virama (i.e. conjuncts)
- vowel signs
- visarga, anusvara or candrabindu
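
The syllable rule listed above can be sketched in code. This is a deliberately simplified illustration for Devanagari only, not the detailed definition referred to in the note below: the function names and the character ranges are assumptions chosen for this example, and nukta, ZWJ/ZWNJ and other scripts are ignored.

```python
VIRAMA = "\u094D"
# Candrabindu, anusvara and visarga may close a syllable.
FINAL_SIGNS = {"\u0901", "\u0902", "\u0903"}


def is_consonant(ch):
    # Devanagari consonants KA..HA; a simplification for this sketch.
    return "\u0915" <= ch <= "\u0939"


def is_vowel_sign(ch):
    # Dependent vowel signs AA..AU; a simplification for this sketch.
    return "\u093E" <= ch <= "\u094C"


def syllables(text):
    """Split text into orthographic syllables (hypothetical helper)."""
    result = []
    i, n = 0, len(text)
    while i < n:
        start = i
        i += 1  # always consume the base character
        if is_consonant(text[start]):
            # Absorb conjuncts: each VIRAMA followed by another consonant.
            while i + 1 < n and text[i] == VIRAMA and is_consonant(text[i + 1]):
                i += 2
            # Absorb a virama that is not followed by a consonant.
            if i < n and text[i] == VIRAMA:
                i += 1
        # Absorb vowel signs, then visarga, anusvara or candrabindu.
        while i < n and is_vowel_sign(text[i]):
            i += 1
        while i < n and text[i] in FINAL_SIGNS:
            i += 1
        result.append(text[start:i])
    return result


word = "\u0938\u094D\u0915\u0942\u0932"  # SA, VIRAMA, KA, VOWEL SIGN UU, LA
first_syllable = syllables(word)[0]
```

For the example word, first_syllable covers SA+VIRAMA+KA+UU, which is the unit that initial letter styling would select, and the remaining syllable is LA.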


NOTE: The detailed definition of Indic syllables is given in section 2.

Here are some further examples of initial letter styling based on the Indic syllable definition.

...
---------

An alternative would be to take the above text and put it at the bottom of section 3 Text Segmentation, as an illustration of the point made in the last paragraph ("text segmentation should be done as Indic syllable"). This is useful because it clearly distinguishes between grapheme cluster and syllabic units, and could be referred to from other sections, too, such as the section on vertical text.

And then simply say, at the start of section 5.1, that selection of initial letters uses the orthographic syllable as the unit, as illustrated in section 2, and then give some examples. The majority of section 5.1 could then focus on more specific requirements, such as which styles of highlighting are common and what the alignment points are.
Related Action Items:
No related actions
Related emails:
  1. Weekly github digest (I18n repos) (from sysbot+gh@w3.org on 2017-04-25)
  2. Daily github digest (I18n repos) (from sysbot+gh@w3.org on 2017-04-25)
  3. Daily github digest (w3c/ilreq) (from sysbot+gh@w3.org on 2017-04-25)
  4. [minutes] Internationalization telecon 2015-01-22 (from ishida@w3.org on 2015-01-22)
  5. I18N-ISSUE-407: Clarification of initial letter example [ilreq] (from sysbot+tracker@w3.org on 2015-01-16)

Related notes:

No additional notes.

Changelog:

Created issue 'Clarification of initial letter example' nickname owned by Richard Ishida on product ilreq, description as given under Description above, non-public

Richard Ishida, 16 Jan 2015, 18:17:14

title changed to 'Clarification of initial letter example ⓟ'

Richard Ishida, 16 Jan 2015, 18:18:44

title changed to 'Clarification of initial letter example'

Richard Ishida, 26 Jan 2015, 14:16:26

Status changed to 'closed'

Richard Ishida, 16 Sep 2015, 07:09:24

