This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 13502 - Text run starting with composing character should be valid
Status: VERIFIED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords: a11y
Depends on:
Blocks:
 
Reported: 2011-08-01 16:02 UTC by Shai Berger
Modified: 2013-04-21 06:37 UTC
CC List: 9 users

See Also:


Attachments
Effects when a text node begins with a combining character (2.01 KB, text/html;charset=UTF-8)
2011-09-28 02:01 UTC, Leif Halvard Silli

Description Shai Berger 2011-08-01 16:02:11 UTC
This is a continuation of bug #12400, which I have filed against the W3C validator. According to the validator, the sequence

<h2 class="ddd"><span>&#x05de;</span>&#x0592;</h2>

is invalid, because "Text run starts with a composing character". In this sequence, 05de is the Hebrew Letter Mem, but 0592 is the composing character "Hebrew Accent Segol" (three dots displayed on top of the letter).
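
A minimal sketch of the kind of check that could produce this message. The validator's actual rule is not shown in this thread, so the function name and the use of the Unicode general category "Mark" (Mn, Mc, Me) are assumptions made for illustration only.

```python
import unicodedata

def starts_with_combining_mark(text_run: str) -> bool:
    """Hypothetical check: True if the first code point is a Unicode Mark (Mn, Mc, Me)."""
    return bool(text_run) and unicodedata.category(text_run[0]).startswith("M")

# The markup above yields two text runs: one inside the <span>, one after it.
runs = ["\u05de", "\u0592"]
for run in runs:
    first = run[0]
    print(f"U+{ord(first):04X}", unicodedata.name(first), starts_with_combining_mark(run))
```

Under this approximation only the second run, the one that follows the closing span tag, would be flagged.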

I remember finding that in the spec before, but now I can't. In fact, a Google search limited to the dev.w3.org site finds no references to "text run" that relate to HTML, and no references to "composing character" at all.

As discussed in #12400, Chrome, Firefox and Opera have no issue with this, and display the text as intended -- with different styles for the letter and the accent. Internet Explorer 9 does not. An attachment to said bug,
http://www.w3.org/Bugs/Public/attachment.cgi?id=973, is an HTML file exemplifying and explaining the issue.

So -- does the current html5 spec allow text runs beginning with composing characters?
Comment 1 Aryeh Gregor 2011-08-02 22:21:01 UTC
I'm going to bet that font support won't reliably permit styling a letter separately from its diacritics, in general.  Things like color, maybe, but I'd be very surprised if you could get bold/italics/font-face/etc. to work reliably.  So I don't know how much sense it makes to allow this.  Something should specify how it's rendered if authors do it, though.
Comment 2 Michael[tm] Smith 2011-08-04 05:05:54 UTC
mass-moved component to LC1
Comment 3 Ian 'Hixie' Hickson 2011-08-22 22:42:56 UTC
Henri, what do you want the spec to say here? Should we have a section similar to "Requirements relating to bidirectional-algorithm formatting characters" that requires authors to not have lone combining characters?
Comment 4 Henri Sivonen 2011-09-26 07:13:22 UTC
(In reply to comment #3)
> Henri, what do you want the spec to say here?

The validator's behavior is based on a draft of charmod. The basic assumption is that styling different parts of a grapheme cluster differently is not supported and if someone wants to discuss a combining character in isolation, they should combine it with U+0020.
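
As a rough illustration of that assumption, the sketch below groups code points into "base plus following marks" units. It is a deliberate simplification and not the full Unicode grapheme cluster algorithm (UAX #29); the function name is hypothetical.

```python
import unicodedata

def simple_clusters(text: str) -> list[str]:
    """Attach each combining mark to the preceding code point (simplified, not UAX #29)."""
    clusters: list[str] = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if clusters and is_mark:
            clusters[-1] += ch      # mark joins the previous cluster
        else:
            clusters.append(ch)     # a base character, or a mark with nothing before it
    return clusters

print(simple_clusters("acce\u0301nt"))  # the accent stays with the "e"
print(simple_clusters(" \u0301"))       # a mark shown in isolation, carried by U+0020
print(simple_clusters("\u0301nt"))      # a run that starts with a lone mark
```

In the last case the mark has no base inside its own run, which is the situation the validator message is about.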

If people who work on the text shaping subsystems of browsers actually want to support different parts of a grapheme cluster differently, the basic assumption needs revisiting. I don't work on text shaping. You should ask people who do.

If people who work on the relevant parts of rendering engines want to treat this as a supported feature, the validator should get out of the way.

Does different styling for a part of grapheme cluster make sense for any other properties than color (and opacity, which is analogous to tweaking the alpha channel)? What use cases are there for coloring different parts of the grapheme cluster differently?

(Given that, according to the reporter, IE9 doesn't support what's attempted in the test case, it's not totally crazy for the validator to whine about this.)
Comment 5 Shai Berger 2011-09-26 07:52:24 UTC
(In reply to comment #4)
> 
> The validator's behavior is based on a draft of charmod.

For those of us who are not "in" the process, can you elaborate on what that is and where we can find it?

> 
> Does different styling for a part of grapheme cluster make sense for any other
> properties than color (and opacity, which is analogous to tweaking the alpha
> channel)? What use cases are there for coloring different parts of the grapheme
> cluster differently?
> 

The general use case for what I'm asking is trying to emphasize one part of the grapheme cluster. This can make some sense even when discussing French accents in an educational setting, but it makes a lot of sense in writing systems where vowels show up as combining characters -- I'm aware of Hebrew, Arabic and Thai, but there may be more.

Given this rationale, it would probably make sense to use, besides (generalized) color, the font-weight property. I can't come up with anything else that makes sense in general.

Thanks,
Shai.
Comment 6 Henri Sivonen 2011-09-26 09:17:48 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > 
> > The validator's behavior is based on a draft of charmod.
> 
> For those of us who are not "in" the process, can you elaborate on what that is
> and where we can find it?

Oops. I meant charmod-norm:
http://www.w3.org/TR/charmod-norm/
Comment 7 Shai Berger 2011-09-26 10:28:26 UTC
(In reply to comment #6)
> 
> Oops. I meant charmod-norm:
> http://www.w3.org/TR/charmod-norm/

Ironically, that document suggests (http://www.w3.org/TR/charmod-norm/#sec-Restrictions) SVG fonts as an alternative method to achieve the effects which are the subject of this bug. There's only one major browser which doesn't support SVG...
Comment 8 Aryeh Gregor 2011-09-26 22:03:50 UTC
(In reply to comment #4)
> If people who work on the relevant parts of rendering engines want to treat
> this as a supported feature, the validator should get out of the way.

Comment #0 says it already works in Chrome, Firefox, and Opera.  (I didn't test it myself.)

> What use cases are there for coloring different parts of the grapheme
> cluster differently?

http://en.wikipedia.org/wiki/File:Example_of_biblical_Hebrew_trope.svg

The image highlights the diacritical marks to distinguish them from the main letters, and gives them different colors to distinguish vowels from cantillation marks.  I've also seen the vowel mark sheva bolded independent of the letter it's under to signify that it's a sheva na instead of a sheva nach, a distinction that matters for pronunciation but which traditional Hebrew orthography doesn't make.

More theoretically, I could definitely imagine that it would be useful occasionally to emphasize a specific vowel mark in a Hebrew word.  Vowelized Hebrew can sometimes have three or four marks per letter, especially Biblical Hebrew.  If you're contrasting two words that differ only in diacritics, you need to actively draw the reader's attention to the difference if you want them to spot it.  I haven't personally seen this done, though.

Suffice it to say, there are definitely use-cases in Hebrew to be able to color or bold diacritics separately from the letter they're on.  It's not needed for normal typography or anything, though, more of a "nice to have" thing.
Comment 9 Ian 'Hixie' Hickson 2011-09-26 22:14:18 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: see diff below
Rationale: I've explicitly made the spec disallow isolated combining characters. If the use case is just colouring accents, then IMHO CSS should support that directly. It doesn't make any sense to style different parts of a combined character differently, since per Unicode, there is only one grapheme cluster involved; indeed, there might only be one glyph from the font being rendered, e.g. if the combination corresponds to a precomposed character, or if a ligature exists for that combination.

(The question of how such things should render is a CSS one.)
Comment 10 contributor 2011-09-26 22:14:25 UTC
Checked in as WHATWG revision r6590.
Check-in comment: Explicitly disallow combining characters at the start of text nodes.
http://html5.org/tools/web-apps-tracker?from=6589&to=6590
Comment 11 Aryeh Gregor 2011-09-27 00:24:54 UTC
Test case:

data:text/html,<!doctype html>
<span style=font-size:7em>
<span style=color:blue>&%23x05de;</span>&%23x0592;
&%23x05de;&%23x0592;
</span>

In both Firefox 8.0a2 and Chrome 15 dev on Ubuntu 11.04, this displays two identical grapheme clusters.  The base glyph in the first (right-hand) cluster is blue while the associated diacritic is black, but the display is otherwise unaffected, exactly as desired.  Opera 11.50 displays the diacritic in the first cluster as a box, refusing to combine it with the different-colored character.
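
The test case above writes "#" as "%23" because a raw "#" in a URL would start a fragment. A small sketch of producing such a data: URL from an HTML string follows; the helper name and the choice of characters left unescaped are assumptions that simply mirror the hand-written form in this comment.

```python
from urllib.parse import quote

def html_to_data_uri(markup: str) -> str:
    """Hypothetical helper: percent-encode '#' (and other unsafe characters) for a data: URL."""
    # Characters listed in `safe` are left as typed; '#' is not listed, so it becomes %23.
    return "data:text/html," + quote(markup, safe="<>=!/;:& ")

markup = "<!doctype html><span style=color:blue>&#x05de;</span>&#x0592;"
print(html_to_data_uri(markup))
```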

This demonstrates that two major browsers already behave as desired in the cases we're interested in.  It's useful functionality, and there's no reason for the spec to make it invalid.  It might be that there are some cases where styling the diacritic differently from the base character makes no sense, but in some cases it does -- don't throw out the baby with the bathwater.  If you can identify specific markup that definitely doesn't make sense, make that specific markup invalid.

What does "If the use case is just colouring accents, then IMHO CSS should support that directly" mean?  I gave two real-world use-cases in comment 8, and both of them require being able to style some diacritics on a letter differently than others.  A CSS property like diacritic-color or whatever would not serve the use-cases.  It has to be possible to identify individual diacritics to style, and the only way to do that is to put tags in the markup.
Comment 12 Shai Berger 2011-09-27 07:22:01 UTC
(In reply to comment #11)
> Test case:
> 
> data:text/html,<!doctype html>
> <span style=font-size:7em>
> <span style=color:blue>&%23x05de;</span>&%23x0592;
> &%23x05de;&%23x0592;
> </span>
> 
> [...] Opera 11.50 displays the diacritic in the
> first cluster as a box, refusing to combine it with the different-colored
> character.
> 
> This demonstrates that two major browsers already behave as desired in the
> cases we're interested in.

This is quite odd. While I see the same results for your test case (unsurprisingly, I'm also running Ubuntu 11.04 and Opera 11.51), Opera does render diacritics with different color and font-weight in my example document (http://www.w3.org/Bugs/Public/attachment.cgi?id=973 which I have already linked above). I have tried to play a little with the data: test to make it more like my code, with no luck.  However, as my test document shows, the desired behavior is actually supported in all major browsers except IE.

> It's useful functionality, and there's no reason
> for the spec to make it invalid.  It might be that there are some cases where
> styling the diacritic differently from the base character makes no sense, but
> in some cases it does -- don't throw out the baby with the bathwater.  If you
> can identify specific markup that definitely doesn't make sense, make that
> specific markup invalid.
> 

I agree, of course.

> What does "If the use case is just colouring accents, then IMHO CSS should
> support that directly" mean?  I gave two real-world use-cases in comment 8, and
> both of them require being able to style some diacritics on a letter
> differently than others. A CSS property like diacritic-color or whatever would
> not serve the use-cases.  It has to be possible to identify individual
> diacritics to style, and the only way to do that is to put tags in the markup.

I agree. As an example, in the Hebrew word &#1502;&#1489;&#1513;&#1500; ("mevashel", cooking) proper voweling puts three different "diacritics" on the third letter (&#1513;&#1473;&#1468;&#1461;  a diacritic dot on the top right marking the letter as "shin" and not "sin", a point in the middle that is like doubling a consonant in English, and the pair of dots at the bottom which are the vowel e). This is a common everyday word, not some contrived biblical example.

As a side point, the phrasing of the correction in the patch still allows for the "cheating" method I used to pacify the validator: make the text node begin with an RLM followed immediately by a combining character. All major browsers (except IE) then combine the combining character with the last character of the preceding text node, where that is possible.
Comment 13 Shai Berger 2011-09-27 07:34:39 UTC
(In reply to comment #12)
> 
> [...] &#1502;&#1489;&#1513;&#1500; [...] &#1513;&#1473;&#1468;&#1461; 

Of course, those looked like letters when I edited the comment... my apologies.

data:text/html,<!doctype html>&%231502;&%231489;&%231513;&%231500;

data:text/html,<!doctype html>&%231513;&%231473;&%231468;&%231461;
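
For readers who want to see which code points these references stand for without opening a browser, a short sketch using only the Python standard library follows. It decodes the second payload above (first the "%23" escapes, then the numeric character references) and prints each code point with its Unicode category and name.

```python
import html
import unicodedata
from urllib.parse import unquote

payload = "&%231513;&%231473;&%231468;&%231461;"  # copied from the data: URL above
text = html.unescape(unquote(payload))            # "%23" -> "#", then "&#NNNN;" -> character

for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```

The first code point is the base letter; the three that follow are combining marks attached to it.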
Comment 14 Leif Halvard Silli 2011-09-28 00:44:58 UTC
The Editor is correct, for the following reasons:

(1) Semantics must win over styling - as the editor has stated.
(2) When we take the semantics into account, browsers - contrary to what Aryeh is claiming - do not support combining characters that begin a text node very well at all.

Example:
  
While UAs will present/render 

    <b>accént</b>

 as a single word, most of them, including VoiceOver, Internet Explorer, Firefox, Opera (Webkit is the exception) will present

    <b>acc<span>e</span>&#x301;nt</b>

as two words.
Comment 15 Leif Halvard Silli 2011-09-28 02:01:02 UTC
Created attachment 1032
Effects when a text node begins with a combining character

For convenience, a data URI  of the attachment (doesn't work in IE):
  http://tinyurl.com/combining-char-in-text-node-st

The attachment file tests the CSS effects as well as the semantic effect of beginning a text node with a combining character.

The test shows

*  That it *is* possible - even in Firefox and IE - to *visually* get the effect that Aryeh and Shai are after. However, in order to make it work, one must apply display:inline-block on the 'base character', which in turn causes the word to be treated as 2 or 3 words instead of as a single word. (This affects word breaking and other things.)

* That the same effect that is seen in Firefox and IE (due to the application of span{display:inline-block;}), can also be seen in Opera.

* That for Webkit, the test appears to be visually successful. However, if you test it in VoiceOver, you hear much the same thing as you can see in Firefox, IE and Opera: the word is split up.

* That very similar conceptual problems occur if one tries to add the acute accent via CSS generated content.

PS: I should say that I have tried exactly the same thing that Aryeh and Shai describe in a Russian text where I wanted to add the acute to show word stress. As it was important to me that users could search and find words without having to type the accent, I ended up with some kind of :hover effect. (Today, Webkit excels in this regard - if you search for 'accent', then you will also find 'accént' and 'acce&#x0301;nt'.)

PPS: It can actually be a bad idea to merely place a <span> in the middle of a word even without any styling: acc<b>e</b>nt. Reason: this appears to have the effect of making the word unfindable in IE (at least IE8). In other words, if you search for 'accent' with IE's Find-in-window feature, you won't find the word.
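
The accent-insensitive search described in the PS can be approximated as follows. This is only a sketch of the general idea (decompose, then drop combining marks); it makes no claim about how Webkit or any other browser implements find-in-page, and the function names are hypothetical.

```python
import unicodedata

def strip_marks(s: str) -> str:
    """Decompose to NFD and drop every code point in a Mark category."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.category(ch).startswith("M"))

def find_ignoring_marks(needle: str, haystack: str) -> bool:
    """True if needle occurs in haystack once marks are ignored on both sides."""
    return strip_marks(needle) in strip_marks(haystack)

print(find_ignoring_marks("accent", "an acc\u00e9nt here"))   # precomposed e-acute
print(find_ignoring_marks("accent", "an acce\u0301nt here"))  # e followed by U+0301
```

Note that the same folding applied to pointed Hebrew would also drop the vowel points, which may or may not be what a searcher wants.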
Comment 16 Aryeh Gregor 2011-09-28 18:56:48 UTC
All that sounds like browsers' word-breaking/find/etc. being buggy.  It doesn't justify making markup like this invalid.
Comment 17 Leif Halvard Silli 2011-09-29 01:50:33 UTC
(In reply to comment #16)
> All that sounds like browsers' word-breaking/find/etc. being buggy.  It doesn't
> justify making markup like this invalid.

Well, you used as an argument that two browsers support the suggested behaviour. Since that is not true - since in every user agent tested there is at least one situation where a composed character that has been "split up" with mark-up is interpreted as separate characters - one cannot use that as an argument either.

But I agree that there is nothing wrong with mark-up around graphemes, per se, if it did not have these side effects. (The mere fact that such mark-up could affect normalization does not sound convincing to me.) My main issue is that browsers have a long road ahead of them when it comes to properly implementing word-break/find/etc.

Btw, my objection to this being legal is similar to my scepticism towards the <wbr> tag: the <wbr> tag, too, has the effect of making words be treated as separate words, without the author being properly aware of the effect.
Comment 18 Shai Berger 2011-09-29 11:43:30 UTC
There is a point that was evoked for me by Leif's earlier message: a significant distinction between the sets of combining characters under discussion. I've sort of mentioned it in passing before, but I think it's a point that should be made more central.

Some of the characters I wish to emphasize separately from their base are indeed diacritics; such is the case for, e.g., 05C1 "Hebrew Point Shin Dot". Anyone who can object to "acce<b>&#x0301;</b>nt" should also object to the equivalent with Shin Dot.

However, characters in the range 05B0--05BC (inclusive) are not diacritics in any sense but visual; they are our vowels. True, we tend to avoid using them in writing, and we have partial replacements for some of them in some contexts, but still: These are the vowels. The vowel 'e', in particular, has no replacement in any context in Hebrew; the only way to write it down is a combining character.

The change introduced by the editor makes the Hebrew equivalent of "acc<b>e</b>nt" invalid. This seems to agree with Leif's PPS comment, and yet, I don't think the correct way to promote such a change (even if it is desired) is by enforcing it first on specific languages.

(as I noted before, the situation for Arabic and Thai is similar to the one in Hebrew: Vowels are combining characters; I cannot say much about the frequency of use of vowels and their possible replacements in those languages).
Comment 19 Leif Halvard Silli 2011-09-29 21:25:27 UTC
(In reply to comment #18)

> Anyone who can object to "acce<b>&#x0301;</b>nt" should also object to the
> equivalent with Shin Dot.
> 
> However, characters in the range 05B0--05BC (inclusive) are not diacritics in
> any sense but visual; they are our vowels.

How is that an argument? There is no such thing as "right to have styled vowels" ... ;-)

Besides, even if disallowed in HTML, you can get all you need via CSS. I even (re)discovered that IE and Firefox do not need that display:inline-block hack. All they need is that the "base" character and the combining character differ with regard to their respective font-weight values. Also, when using CSS, Find-in-Page tends to work a little better in IE and Firefox than otherwise. For Opera, I was unable to style the accent differently from the base character - but at least I was able to hold its hand: http://tinyurl.com/6yk2m9b

<rant>Each writing script has its advantages and disadvantages. For instance, Hebrew text runs are shorter than Latin runs, since there are no vowels there (and even if you have vowels, the text length doesn't increase).  As a user of the Latin script, where I must write vowels, I feel discriminated against - for instance on Twitter!  It is even
Comment 20 Shai Berger 2011-10-01 21:22:17 UTC
(In reply to comment #19)
> (In reply to comment #18)
> 
> > Anyone who can object to "acce<b>&#x0301;</b>nt" should also object to the
> > equivalent with Shin Dot.
> > 
> > However, characters in the range 05B0--05BC (inclusive) are not diacritics in
> > any sense but visual; they are our vowels.
> 
> How is that an argument? There is no such thing as "right to have styled
> vowels" ... ;-)
> 

There is in Latin scripts... 

> Besides, even if disallowed in HTML, you can get all you need via CSS. [...]
> For Opera, I was unable to style the accent differently from the base character -
> but at least I was able to hold its hand: http://tinyurl.com/6yk2m9b
> 

1) This example relies on moving the combining character to a css "content" text run (which, then, starts with a combining character). It turns semantics into presentation, and assumes that an invalid HTML text run will still be a valid CSS text run.

2) This example doesn't work in Chromium (I mean the actual code, not just the redirect). It can probably be fixed to work there too, but I fear the specter of browser-specific code.

3) Since the graphic capability is, as you say, present in all browsers (I didn't check IE myself); and since nobody is seriously contemplating to forbid the marking of single letters in a word via markup; why, then, is it so important to forbid it for symbols which are combining characters?

I actually found an answer for this question in the charmod-norm draft (http://www.w3.org/TR/charmod-norm, linked earlier by Henri). It is required there that fully-normalized text does not include text-runs which begin with a combining character, because when such text-runs are concatenated (appended) to another text-run, normalization may change the characters involved or their order. As an example, "acce"+"&#x301;nt" should normalize into "acc
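
A minimal sketch of the concatenation concern, using Python's standard unicodedata module. It takes the two runs named in the comment, "acce" and U+0301 followed by "nt", and shows that each run is unchanged by NFC on its own while their concatenation is not. The second part illustrates reordering with a pair of marks chosen here purely for illustration (U+0301 and U+0323); that pair is not taken from the charmod-norm draft.

```python
import unicodedata

def is_nfc(s: str) -> bool:
    """True if s is unchanged by Normalization Form C."""
    return unicodedata.normalize("NFC", s) == s

left = "acce"        # first text run
right = "\u0301nt"   # second run, starts with COMBINING ACUTE ACCENT
joined = left + right

print(is_nfc(left), is_nfc(right), is_nfc(joined))
composed = unicodedata.normalize("NFC", joined)
print([f"U+{ord(c):04X}" for c in composed])

# Reordering: a mark appended later can move in front of an earlier mark,
# because canonical ordering sorts adjacent marks by combining class.
appended = "a\u0301" + "\u0323"   # acute (class 230), then dot below (class 220)
reordered = unicodedata.normalize("NFD", appended)
print([f"U+{ord(c):04X}" for c in reordered])
```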
Comment 21 Leif Halvard Silli 2011-10-02 04:12:38 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)

> > > Anyone who can object to "acce<b>&#x0301;</b>nt" should also object to the
> > > equivalent with Shin Dot.
> > > 
> > > However, characters in the range 05B0--05BC (inclusive) are not diacritics in
> > > any sense but visual; they are our vowels.
> > 
> > How is that an argument? There is no such thing as "right to have styled
> > vowels" ... ;-)
> 
> There is in Latin scripts... 

Out of curiosity, would you also like to be able to put emphasis on the vowels, like this: &#x5d3;<strong style="color:red;" class='kamatz'>&#x5b8;</strong>&#x5bc;&#x5d2; ?

> > Besides, even if disallowed in HTML, you can get all you need via CSS. [...]
> > For Opera, I was unable to style the accent differently from the base character -
> > but at least I was able to hold its hand: http://tinyurl.com/6yk2m9b
> > 
> 
> 1) This example relies on moving the combining character to a css "content"
> text run (which, then, starts with a combining character). It turns semantics
> into presentation, and assumes that an invalid HTML text run will still be a
> valid CSS text run.

It is nothing new that CSS can be used both to enhance and to clutter up the user's experience of the underlying mark-up.

It is also not - in theory - *necessary* to let the CSS content begin with a combining character. You might instead replace the entire content of the element - base letter and diacritics. And, in fact, that is probably what you should do. Then you ought to avoid the problem.

Actually, for Webkit, you don't need CSS generated content at all - you can instead rely on :first-letter. (In reality a CSS bug, of course.)  Well, at least I was able to do so in this demo - which also contains a colored Hebrew vowel, colored in Firefox, Webkit/Chrome and Opera: 

http://tinyurl.com/6xw4rcm 

(In IE I could not get it to work properly, so I instead made sure that it did not work at all.)

That said: You have a point. Because, when one adds the diacritic via CSS, then browsers must either:

 a) ignore the CSS from a 'semantic' point of view - that is: not 
     disturb the reader with the CSS content, but treat it as 
     decoration only. Since the combining letters are just "colorizing" 
     of the base letters, this works fine. (Not?)
     Often a) is perceived as the way CSS should work.
 b) combine text in mark-up and text in CSS into a meaningful/-less whole
 c) replace entire content with new content - which must then (of course) be read as normal text

Options b) and c) are "on your side" in the sense that - really, contrary to a common perception (see a)) - there is not supposed to be any *functional* difference between adding these - or other - characters via CSS or via mark-up. There is a difference in principle, though: It is possible to disable/ignore the CSS, and then things will fall back to "normal". It is in line with the 'progressive enhancement' philosophy to enhance stuff with CSS, while keeping the unstyled mark-up functional in and by itself - without any styling.

> 2) This example doesn't work in Chromium (I mean the actual code, not just the
> redirect). It can probably be fixed to work there too, but I fear the specter
> of browser-specific code.

I have not tested in Chromium - neither in the browser nor in the OS. But I have tested in Chrome - the browser, and it did work then. If it doesn't work (perfectly), then that might be a font issue, I guess - as fonts are something that I think varies between platforms.

> 3) Since the graphic capability is, as you say, present in all browsers (I
> didn't check IE myself); and since nobody is seriously contemplating to forbid
> the marking of single letters in a word via markup; why, then, is it so
> important to forbid it for symbols which are combining characters?

Because we then ensure that it is possible to fall back to something that works. If you add mark-up around combining characters, then it breaks from the start - at least that is the situation today. But if you only add mark-up around 'logical characters', then, if the styling layer creates problems, one can fall back to the unstyled layer.

> I actually found an answer for this question in the charmod-norm draft
> (http://www.w3.org/TR/charmod-norm [ snip ]
> "acceB"+"Ant" may normalize into "acceABnt".

> [snip] [But: ] ("When data
> transfer on the Web remained mostly unidirectional (from server to browser),
> and where the main purpose was to render documents, the use of Unicode without
> specifying additional details was sufficient". This still describes HTML, as
> far as I am aware).

What that document says in the next sentences is true, though: It is not as unidirectional as you say, anymore.

Frankly, "out of the box", I am not very able to evaluate what that document says - I can only use common sense. And fact is that it matters to fragment URIs whether it points to an @id value that is normalized or not: if it points to id="t
Comment 22 Ian 'Hixie' Hickson 2011-10-02 07:18:16 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: see diff given below
Rationale: I spoke with Mark Davis, who informed me that I was wrong. Unicode does intend to allow combining characters to be styled differently. So I've reverted the earlier change.
Comment 23 contributor 2011-10-02 07:20:28 UTC
Checked in as WHATWG revision r6611.
Check-in comment: Allow combining characters wherever, per Mark Davis.
http://html5.org/tools/web-apps-tracker?from=6610&to=6611
Comment 24 Leif Halvard Silli 2011-10-02 15:09:59 UTC
(In reply to comment #22)

> Rationale: I spoke with Mark Davis, who informed me that I was wrong. Unicode
> does intend to allow combining characters to be styled differently.

OK. This seems to be correct. For documentation, here are some quotes from Unicode 6.0:

]]
5.11 Editing and Selection
   [ snip ]
Nonlinear Boundaries. Use of nonlinear boundaries divides any stacked element into parts. For example, picking a point halfway across a lam + meem ligature can represent the division between the characters. One can either allow highlighting with multiple rectangles or use another method such as coloring the individual characters.
   [ snip ]
In most editing systems, the code point is the smallest addressable item, so the selection and assignment of properties (such as font, color, letterspacing, and so on) cannot be done on any finer basis than the code point. Thus the accent on an 
Comment 25 Shai Berger 2011-10-07 07:36:07 UTC
Hi,

I waited a little for possible feedback from some other interested parties; it is apparently not forthcoming.

Thank you all for the enlightening discussion, and for the resolution.