This document contains examples in another language/script.
This page gathers information about the use of the :first-letter pseudo-element proposed for CSS3. It includes a summary of comments made by numerous people.
We are looking for comments and discussion on this topic. To comment:
The latest working draft of CSS3 Selectors proposes the ::first-letter pseudo-element.
The ::first-letter pseudo-element represents the first letter of the first line of a block, if it is not preceded by any other content (such as images or inline tables) on its line. It allows that first letter to be styled individually, without markup. It may be used for "initial caps" and "drop caps", which are common typographical effects in text in Latin script.
We commented to the CSS Working Group that they need to define 'letter' more carefully, and proposed that they specify that 'letter' equates to 'default grapheme cluster', as described in the Unicode Standard Annex #29.
(A rough and ready explanation of this is that base characters and any following combining characters are styled together. So
0065: e LATIN SMALL LETTER E + 0301: ́ COMBINING ACUTE ACCENT
would be handled as a single letter.)
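The rough explanation above can be sketched in Python. This is only an approximation under stated assumptions: it treats any character whose Unicode general category starts with "M" as a combining character, and it is not the full default grapheme cluster algorithm of Unicode Standard Annex #29. The function name first_letter_rough is hypothetical.

```python
import unicodedata


def first_letter_rough(text: str) -> str:
    """Return the first character plus any combining marks that follow it.

    Rough approximation only: a "combining character" is assumed to be any
    character whose general category is a mark (Mn, Mc or Me).
    """
    if not text:
        return ""
    end = 1
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return text[:end]


# U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT
word = "e\u0301cole"
letter = first_letter_rough(word)
print([f"{ord(ch):04X}" for ch in letter])
```

With this input the returned string holds both code points, so the base letter and its accent would be styled together.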
We also suggested that implementors should then be encouraged to provide tailored algorithms on a per-language basis to cope with anomalies, particularly such as may occur in non-Latin scripts.
Here are some initial questions for which we are seeking answers:
[1] Are there scripts that would never use this approach?
[2] We mention 'initial caps' and 'drop caps' above. What other types of styling would be commonly applied in other scripts if this feature were available?
[3] What script features would cause difficulties, e.g. syllabic groupings (see the Indic script example below), ligatures, or cursive text (e.g. Arabic, Urdu, etc.), and how would the script normally deal with them?
Indic script behavior relates to syllables, rather than individual letter forms. In the Hindi word स्थिति ('sthiti') the sequence of characters in the first syllable is as follows in memory:
0938: स DEVANAGARI LETTER SA
094D: ् DEVANAGARI SIGN VIRAMA
0925: थ DEVANAGARI LETTER THA
093F: ि DEVANAGARI VOWEL SIGN I
The displayed text, however, is
Note how the vowel sign appears to the left of the first character, not the third.
There are two default grapheme clusters here. The first includes the SA+VIRAMA+THA+I. (The second is the last two characters, TA+I.)
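To inspect the stored order of the characters yourself, a short Python sketch such as the following can help. It only lists code points and their Unicode names in memory order and makes no attempt at segmentation. The helper name describe is hypothetical.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """List each character of text, in memory order, as 'XXXX: UNICODE NAME'."""
    return [f"{ord(ch):04X}: {unicodedata.name(ch)}" for ch in text]


# The Hindi word 'sthiti', written with escapes so the stored order is explicit.
sthiti = "\u0938\u094D\u0925\u093F\u0924\u093F"
for line in describe(sthiti):
    print(line)
```

The vowel sign I (U+093F) is listed after THA even though it is displayed to the left of the consonant group.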
From the feedback we have received, it appears that first-letter styling will be needed for Indic scripts. We have examples in the mail archive of such styling in Devanagari, Bengali, and Malayalam, and we have reports that it is needed for other scripts, such as Telugu, Tamil, and Kannada.
We see that the styling is done on the basis of the syllable, not the first character. A syllable includes a base consonant and any combination of the following characters in the text stream:
These combinations are NOT equivalent to default grapheme clusters, as defined by Unicode. The default grapheme cluster is only a part of an Indic syllable cluster. This means that, as it stands, user agent developers will need to implement special algorithms to support first-letter styling in Indic text. Such algorithms will need to detect automatically that such rules are applicable to the text being styled. It will be interesting to ascertain whether the rules vary by script only or by language. If the latter, then it is important to mark up the language of the text correctly.
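As an illustration of what such a special algorithm might look like, here is a minimal Python sketch for Devanagari only. Everything in it is an assumption made for illustration: the consonant ranges (U+0915 to U+0939 and U+0958 to U+095F), the use of the virama U+094D to join consonants, the treatment of every Mn or Mc character as part of the syllable, and the function name first_syllable. It is not taken from any specification or user agent, and other scripts or languages may need different rules.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA


def is_consonant(ch: str) -> bool:
    """Assumed Devanagari consonant ranges (illustrative, not exhaustive)."""
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_mark(ch: str) -> bool:
    """Vowel signs, nukta, anusvara and similar: non-spacing or spacing marks."""
    return unicodedata.category(ch) in ("Mn", "Mc")


def first_syllable(text: str) -> str:
    """Return the leading syllable: consonants joined by virama, plus marks."""
    if not text:
        return ""
    i = 1
    starts_with_consonant = is_consonant(text[0])
    while i < len(text):
        ch = text[i]
        joins_next = (
            starts_with_consonant
            and ch == VIRAMA
            and i + 1 < len(text)
            and is_consonant(text[i + 1])
        )
        if joins_next:
            i += 2  # consume the virama and the consonant it joins
        elif is_mark(ch):
            i += 1  # consume a dependent vowel sign or other mark
        else:
            break
    return text[:i]


sthiti = "\u0938\u094D\u0925\u093F\u0924\u093F"
print([f"{ord(ch):04X}" for ch in first_syllable(sthiti)])
```

For the word 'sthiti' this sketch selects the four stored characters SA, VIRAMA, THA and VOWEL SIGN I as the unit to style, whereas stopping at the first base character and its combining marks would select only SA and VIRAMA.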
Note that the order in which these characters are displayed may be different from the order in memory. Note also that there is no one-to-one mapping between the codes and the glyphs used. There are often ligatures, vowel signs that appear on both sides of the consonant base, etc. The styling is applied to all glyphs used to represent the syllable as a whole.
The examples show a predominance of styling similar to what would be called 'drop letter' in English. Where a character is enlarged in a script that has a headstroke, the headstroke of the large text and that of the regular text are typically at approximately the same level, but they commonly do not join.
The following is an example of a drop letter in Hindi.
In some cases there is additional coloring applied to a drop letter. In other cases, the coloring is the distinctive styling.
We also had some examples of increased font size without the drop letter characteristics. This example is in the Malayalam script.
Since Arabic and Mongolian letters in a word are normally joined, has first letter styling been used at all in these scripts?
Do languages using these scripts do first letter styling?
Do other scripts need first letter styling? In particular, do they have any special requirements?
Content first published 14 July, 2006. Last substantive update 2006-07-14 11:05 GMT. This version 2006-07-14 11:05 GMT
For the history of document changes, search for uf-firstletter in the i18n blog.
Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.