<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>12400</bug_id>
          
          <creation_ts>2011-03-30 08:00:46 +0000</creation_ts>
          <short_desc>Inconsistent treatment of combining characters beginning text run</short_desc>
          <delta_ts>2015-08-23 06:58:45 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML Checker</product>
          <component>General</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>REOPENED</bug_status>
          <resolution></resolution>
          
          <see_also>https://www.w3.org/Bugs/Public/show_bug.cgi?id=13502</see_also>
          <bug_file_loc>http://webkeys.platonix.co.il/layouts/show/system/si1452/</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Shai Berger">shai</reporter>
          <assigned_to name="Michael[tm] Smith">mike</assigned_to>
          <cc>hsivonen</cc>
          
          <qa_contact name="qa-dev tracking">www-validator-cvs</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>47048</commentid>
    <comment_count>0</comment_count>
      <attachid>973</attachid>
    <who name="Shai Berger">shai</who>
    <bug_when>2011-03-30 08:00:46 +0000</bug_when>
    <thetext>Created attachment 973
explanation and exemplification of problem

Hi,

I need to present a base character with a combining character,
while emphasizing the combining character and demoting the base
character to &quot;background&quot; status. In other words, I want to present
a combination with different styles for the combined parts.

According to the validator, a text run should not begin with a combining
character. This makes it impossible for me to get what I want, as
:first-letter applies a style to the combination as a whole (obviously).
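
For illustration only, the kind of check involved can be approximated in a few lines of Python. This is a minimal sketch, not the validator's actual code: starts_with_combining is a hypothetical helper, and treating "combining" as "non-zero canonical combining class" is an assumption that only approximates the validator's real rule. The two code points are the ones used in attachment 973.

```python
import unicodedata


def starts_with_combining(run):
    """Return True if the text run begins with a combining character.

    Assumption: "combining" means a non-zero canonical combining class,
    which only approximates whatever rule the validator really applies.
    """
    return bool(run) and unicodedata.combining(run[0]) != 0


base = "\u05de"  # HEBREW LETTER MEM, the base character
mark = "\u0592"  # HEBREW ACCENT SEGOL, the combining character

print(starts_with_combining(base + mark))  # base and mark in the same run
print(starts_with_combining(mark))         # mark alone, as after a closing span tag
```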

I was able to &quot;work around&quot; the validation issue by inserting a
zero-width character as first in the run, between my base character
and the combining one. Since this is clearly cheating, I think
the validator should have caught me.

I&apos;ve tested a page with the workaround (the URL for this bug) on Chrome 10,
Firefox 3.6 and Internet Explorer 9. Chrome and Firefox render it the way
I want, IE was confused and did not render the combining character at all.

Actually, I&apos;m not sure if this is a problem in the validator or the HTML5 spec,
but I thought filing it here was a good way to push it towards the right people.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>86435</commentid>
    <comment_count>1</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-04-21 01:32:44 +0000</bug_when>
    <thetext>(In reply to comment #0)
&gt; Actually, I&apos;m not sure if this is a problem in the validator or the HTML5
&gt; spec,
&gt; but I thought filing it here was a good way to push it towards the right
&gt; people.

I believe that the validator is conforming here to the behavior required by the HTML5 spec -- or maybe by other specs that the HTML5 spec normatively references. I think the error message for this case comes from the HTML parser code that the validator uses, not from the validator code itself.

One way you can check that is, do &quot;View source&quot; on your test file in Firefox, which uses the same HTML parser code as the validator. Firefox&apos;s View-source feature will mark in red any parsing problems it finds. So if you see that it&apos;s flagging the same problem that the validator is reporting, then you know it&apos;s due to that HTML parser code.

And that HTML parser code attempts to conform to the HTML5 spec. So if you&apos;d like to see a spec change around this, please consider posting the details to either public-html@w3.org or whatwg@whatwg.org</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>86449</commentid>
    <comment_count>2</comment_count>
    <who name="Shai Berger">shai</who>
    <bug_when>2013-04-21 06:37:24 +0000</bug_when>
    <thetext>Since this bug was filed, several things changed. One of them is the page URL (updated). Another is that the issue was discussed in #13502 (added as &quot;see also&quot;), where it was concluded that both the use case (a combined character with separate styling for the separate parts) and the implementation (a text run starting with a combining character) are valid; further, they have always been valid.

According to comment 6 there (https://www.w3.org/Bugs/Public/show_bug.cgi?id=13502#c6), the validator&apos;s behavior is intentional and implements charmod-norm; according to the later discussion, charmod-norm does not apply to HTML. No change to the visible HTML spec was needed to fix this (charmod-norm was never referenced in the first place), but a comment to this effect was added to the document source.

So -- in the first case, where text runs do begin with combining characters, the validator&apos;s behavior is not conforming to the HTML5 spec or any normative reference.

In the second case, where a combining character follows a zero-width character, I&apos;d expect a warning not because it violates a spec -- but because it makes no sense.
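
A check along those lines could be sketched as follows. This is a hypothetical Python illustration, not validator code: first_visible_is_combining is an invented name, treating Unicode category Cf (format characters such as U+200F RIGHT-TO-LEFT MARK, the one used in attachment 973) as the "zero-width" characters to skip is an assumption, and so is using a non-zero canonical combining class as the test for "combining".

```python
import unicodedata


def first_visible_is_combining(run):
    """Return True if the run starts with a combining character,
    ignoring any leading format (category Cf) characters.

    Assumptions: Cf stands in for "zero-width", and a non-zero
    canonical combining class stands in for "combining".
    """
    for ch in run:
        if unicodedata.category(ch) == "Cf":
            continue  # skip zero-width format characters such as RLM
        return unicodedata.combining(ch) != 0
    return False  # empty run, or nothing but format characters


rlm = "\u200f"   # RIGHT-TO-LEFT MARK, the zero-width character in the workaround
mark = "\u0592"  # HEBREW ACCENT SEGOL, a combining character

print(first_visible_is_combining(mark))        # run starts with the mark
print(first_visible_is_combining(rlm + mark))  # workaround: RLM, then the mark
```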

Following the resolution of #13502, I changed the referenced web application to produce text runs that begin with combining characters, and now it gets all these warnings from the validator. Viewing the source of the page does not show the relevant characters in red -- they are all in orange (as entity references); this strengthens the claim that the warning comes from the validator and not the HTML parser.

Thanks,
Shai.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>89911</commentid>
    <comment_count>3</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-06-26 18:11:28 +0000</bug_when>
    <thetext>(In reply to comment #2)
&gt; discussed in bug 13502, where it was concluded that both the use case (a
&gt; combined character with separate styling for the separate parts) and the
&gt; implementation (a text run starting with a combining character) are valid;
&gt; further, they have always been valid.
&gt; 
&gt; According to comment 6 there (https://www.w3.org/Bugs/Public/show_bug.cgi?id=13502#c6), the
&gt; validator&apos;s behavior is intentional and implements charmod-norm; according
&gt; to the later discussion, charmod-norm does not apply to HTML. No change to
&gt; the visible HTML spec was needed to fix this (charmod-norm was never
&gt; referenced in the first place), but a comment to this effect was added
&gt; to the document source.
&gt; 
&gt; So -- in the first case, where text runs do begin with combining characters,
&gt; the validator&apos;s behavior is not conforming to the HTML5 spec or any
&gt; normative reference.

CCing Henri Sivonen, who&apos;s way more familiar with this than me...</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>89959</commentid>
    <comment_count>4</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2013-06-27 14:28:26 +0000</bug_when>
    <thetext>Do all browsers support styling combining characters separately from the base character?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>89960</commentid>
    <comment_count>5</comment_count>
    <who name="Shai Berger">shai</who>
    <bug_when>2013-06-27 14:43:37 +0000</bug_when>
    <thetext>It is currently supported badly by Firefox and Chrome (used Chromium) on Linux; I suspect it is not supported by IE, and I don&apos;t know about others (but they all use WebKit now, don&apos;t they?...)

By &quot;supported badly&quot; I mean that applying separate styling to combining characters does render a separately styled character, but it is usually moved a little off the place it should be (less than a whole character width, but some).

I wasn&apos;t able to test this much on IE, as I&apos;m not a Windows user. In the few tests I was able to run, IE failed to display the combining characters; I can&apos;t be sure whether this was a problem with the feature or with a font lacking the combining characters I used (Hebrew diacritics).

Also, I&apos;m not sure the others behave the same in this respect on all platforms.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92146</commentid>
    <comment_count>6</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2013-08-16 09:46:41 +0000</bug_when>
    <thetext>(In reply to comment #5)
&gt; It is currently supported badly by Firefox and Chrome (used Chromium) on
&gt; Linux; I suspect it is not supported by IE

Sounds like this isn&apos;t interoperably supported by major browsers then! I suggest WONTFIX in the validator.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92147</commentid>
    <comment_count>7</comment_count>
    <who name="Shai Berger">shai</who>
    <bug_when>2013-08-16 10:00:30 +0000</bug_when>
    <thetext>(In reply to comment #6)
&gt; (In reply to comment #5)
&gt; &gt; It is currently supported badly by Firefox and Chrome (used Chromium) on
&gt; &gt; Linux; I suspect it is not supported by IE
&gt; 
&gt; Sounds like this isn&apos;t interoperably supported by major browsers then! I
&gt; suggest WONTFIX in the validator.

I was under the impression that the validator is supposed to express the W3C standards, not the current state of implementations.

Was I wrong?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92149</commentid>
    <comment_count>8</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2013-08-16 10:16:17 +0000</bug_when>
    <thetext>The standards should be realistic about interop.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92150</commentid>
    <comment_count>9</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-08-16 10:21:44 +0000</bug_when>
    <thetext>resolving wontfix per comment 6</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92154</commentid>
    <comment_count>10</comment_count>
    <who name="Shai Berger">shai</who>
    <bug_when>2013-08-16 10:53:21 +0000</bug_when>
    <thetext>(In reply to comment #8)
&gt; The standards should be realistic about interop.

Then you should change the standards, overriding the conclusion reached in ticket 13502 (and the intentions of Unicode, as detailed there).

To clarify: The problem is that the validator rejects something that the standards allow, on the ground that the browsers don&apos;t implement it well; effectively undermining the standards committees.

I find this decision unreasonable.

(I looked for some guidelines on whether it is OK for me to reopen a bug when I disagree with the decision, and couldn&apos;t find any; please accept my apology if this is a faux pas).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92179</commentid>
    <comment_count>11</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-08-16 17:23:34 +0000</bug_when>
    <thetext>(In reply to comment #10)
&gt; (In reply to comment #8)
&gt; &gt; The standards should be realistic about interop.
&gt; 
&gt; Then you should change the standards, overriding the conclusion reached in
&gt; ticket 13502 (and the intentions of UNICODE, as detailed there).

If you want to have a change made to a particular spec, this bug is not the place to do it.

&gt; To clarify: The problem is that the validator rejects something that the
&gt; standards allow,

The validator doesn&apos;t &quot;reject&quot; it -- it emits a warning, not an error. A warning is appropriate here, given comment 5.

&gt; on the ground that the browsers don&apos;t implement it well;

The fact that browsers don&apos;t implement it well is the reason the validator is emitting a warning. The warning is there to let users know that something they might think will work correctly actually might not work as they expect.

&gt; effectively undermining the standards committees.
&gt; 
&gt; I find this decision unreasonable.
&gt; 
&gt; (I looked for some guidelines on whether it is OK for me to reopen a bug
&gt; when I disagree with the decision, and couldn&apos;t find any; please accept my
&gt; apology if this is a faux pas).

The reason I moved the bug to &quot;resolved&quot; is that in comments here you&apos;ve already heard from the maintainers of the validator, neither of whom is planning to take any action on the bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92220</commentid>
    <comment_count>12</comment_count>
    <who name="Shai Berger">shai</who>
    <bug_when>2013-08-17 06:55:35 +0000</bug_when>
    <thetext>(In reply to comment #11)
&gt; (In reply to comment #10)
&gt; &gt; (In reply to comment #8)
&gt; &gt; &gt; The standards should be realistic about interop.
&gt; &gt; 
&gt; &gt; Then you should change the standards, overriding the conclusion reached in
&gt; &gt; ticket 13502 (and the intentions of UNICODE, as detailed there).
&gt; 
&gt; If you want to have a change made to a particular spec, this bug is not the
&gt; place to do it.
&gt; 

It was Henri who suggested there was something wrong with the spec. I would like to see it implemented as it stands.

&gt; &gt; To clarify: The problem is that the validator rejects something that the
&gt; &gt; standards allow,
&gt; 
&gt; The validator doesn&apos;t &quot;reject&quot; it -- it emits a warning, not an error. A
&gt; warning is appropriate here, given comment 5.
&gt; 
&gt; &gt; on the ground that the browsers don&apos;t implement it well;
&gt; 
&gt; The fact that browsers don&apos;t implement it well is the reason the validator
&gt; is emitting a warning. The warning is there to let users know that something
&gt; they might be think will work correctly actually might not work as they
&gt; expect.
&gt; 

I see. In that case, perhaps the warning message can be made clearer.

&gt; 
&gt; The reason I moved the bug to &quot;resolved&quot; is that in comments here you&apos;ve
&gt; already heard from the maintainers of the validator, neither of whom is
&gt; planning to take any action on the bug.

Will you accept a patch to this effect (make it clearer that the warning is about current implementation in browsers)?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>92222</commentid>
    <comment_count>13</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-08-17 13:03:53 +0000</bug_when>
    <thetext>(In reply to comment #12)
&gt; &gt; If you want to have a change made to a particular spec, this bug is not the
&gt; &gt; place to do it.
&gt; 
&gt; It was Henri who suggested there was something wrong with the spec. I would
&gt; like to see it implemented as it stands.

So then I guess what you should probably do is file bugs in the Mozilla, Chrome, and WebKit bug trackers at least.

&gt; &gt; The fact that browsers don&apos;t implement it well is the reason the validator
&gt; &gt; is emitting a warning. The warning is there to let users know that something
&gt; &gt; they might think will work correctly actually might not work as they
&gt; &gt; expect.
&gt; 
&gt; I see. In that case, perhaps the warning message can be made clearer.

Yeah, maybe so.

&gt; &gt; The reason I moved the bug to &quot;resolved&quot; is that in comments here you&apos;ve
&gt; &gt; already heard from the maintainers of the validator, neither of whom is
&gt; &gt; planning to take any action on the bug.
&gt; 
&gt; Will you accept a patch to this effect (make it clearer that the warning is
&gt; about current implementation in browsers)?

We&apos;ll review any patch submitted and if it&apos;s an improvement, it&apos;ll likely end up getting checked in. But if you have some suggestion for improved wording, you can just post it here and we can see if Henri&apos;s agreeable to it and go from there.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>93352</commentid>
    <comment_count>14</comment_count>
    <who name="Shai Berger">shai</who>
    <bug_when>2013-09-13 14:15:26 +0000</bug_when>
    <thetext>Chromium bug: https://code.google.com/p/chromium/issues/detail?id=290906
Firefox bug: https://bugzilla.mozilla.org/show_bug.cgi?id=916102

Thanks for the suggestions.

For the improved text: Instead of the current &quot;Text run starts with a composing character.&quot;, how about:

&quot;Separate styling for composing characters is not supported well by current browsers. If no separate styling is intended, avoid starting text runs with composing characters.&quot;

(If &quot;standard but not well supported&quot; is what warnings mean in general, then perhaps this text belongs not in this particular message but in a more general place in the validator output.)

Thanks again,

Shai.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>973</attachid>
            <date>2011-03-30 08:00:46 +0000</date>
            <delta_ts>2011-03-30 08:00:46 +0000</delta_ts>
            <desc>explanation and exemplification of problem</desc>
            <filename>test-first.html</filename>
            <type>text/html</type>
            <size>1593</size>
            <attacher name="Shai Berger">shai</attacher>
            
              <data encoding="base64">PERPQ1RZUEUgaHRtbD4KPGh0bWw+CjxoZWFkPgo8c3R5bGUgdHlwZT0idGV4dC9jc3MiPgogLmRk
ZCB7IGZvbnQtd2VpZ2h0OiBib2xkOyBjb2xvcjogcmVkOyB9CiAuZGRkIHNwYW4geyBmb250LXdl
aWdodDogbm9ybWFsOyBjb2xvcjogYmxhY2s7IG9wYWNpdHk6IDAuMjU7IH0KIC5kZGQgeyBmb250
LWZhbWlseTogZXpyYSBzaWwsIHZlcmRhbmE7IH0KIC5lZWUgeyBmb250LXdlaWdodDogYm9sZDsg
Y29sb3I6IHJlZDsgfQogLmVlZTpmaXJzdC1sZXR0ZXIgeyBmb250LXdlaWdodDogbm9ybWFsOyBj
b2xvcjogYmxhY2s7IG9wYWNpdHk6IDAuMjU7IH0KIC5lZWUgeyBmb250LWZhbWlseTogZXpyYSBz
aWwsIHZlcmRhbmE7IH0KPC9zdHlsZT4KPC9oZWFkPgo8Ym9keT4KPHA+SGksCgo8cD5JIG5lZWQg
dG8gcHJlc2VudCBhIGJhc2UgY2hhcmFjdGVyIHdpdGggYSBjb21iaW5pbmcgY2hhcmFjdGVyLAp3
aGlsZSBlbXBoYXNpemluZyB0aGUgY29tYmluaW5nIGNoYXJhY3RlciBhbmQgZGVtb3RpbmcgdGhl
IGJhc2UKY2hhcmFjdGVyIHRvICJiYWNrZ3JvdW5kIiBzdGF0dXMuIEluIG90aGVyIHdvcmRzLCBJ
IHdhbnQgdG8gcHJlc2VudAphIGNvbWJpbmF0aW9uIHdpdGggZGlmZmVyZW50IHN0eWxlcyBmb3Ig
dGhlIGNvbWJpbmVkIHBhcnRzLgoKPHA+QWNjb3JkaW5nIHRvIHRoZSB2YWxpZGF0b3IsIGEgdGV4
dCBydW4gc2hvdWxkIG5vdCBiZWdpbiB3aXRoIGEgY29tYmluaW5nCmNoYXJhY3Rlci4gVGhpcyBt
YWtlcyBpdCBpbXBvc3NpYmxlIGZvciBtZSB0byBnZXQgd2hhdCBJIHdhbnQsIGFzCjpmaXJzdC1s
ZXR0ZXIgYXBwbGllcyBhIHN0eWxlIHRvIHRoZSBjb21iaW5hdGlvbiBhcyBhIHdob2xlIChvYnZp
b3VzbHkpLgoKPHA+SSB3YXMgYWJsZSB0byAid29yayBhcm91bmQiIHRoZSB2YWxpZGF0aW9uIGlz
c3VlIGJ5IGluc2VydGluZyBhCnplcm8td2lkdGggY2hhcmFjdGVyIGFzIGZpcnN0IGluIHRoZSBy
dW4sIGJldHdlZW4gbXkgYmFzZSBjaGFyYWN0ZXIKYW5kIHRoZSBjb21iaW5pbmcgb25lLiBTaW5j
ZSB0aGlzIGlzIGNsZWFybHkgY2hlYXRpbmcsIEkgdGhpbmsKdGhlIHZhbGlkYXRvciBzaG91bGQg
aGF2ZSBjYXVnaHQgbWUuCgo8cD5JJ3ZlIHRlc3RlZCBhIHBhZ2Ugd2l0aCB0aGUgd29ya2Fyb3Vu
ZCBvbiBDaHJvbWUgMTAsIEZpcmVmb3ggMy42IGFuZApJbnRlcm5ldCBFeHBsb3JlciA5LiBDaHJv
bWUgYW5kIEZpcmVmb3ggcmVuZGVyIGl0IHRoZSB3YXkgSSB3YW50LCBJRQp3YXMgY29uZnVzZWQg
YW5kIGRpZCBub3QgcmVuZGVyIHRoZSBjb21iaW5pbmcgY2hhcmFjdGVyIGF0IGFsbC4KCjxoMT4g
VGhpcyBwYXNzZXMgdmFsaWRhdGlvbjwvaDE+CjxoMiBjbGFzcz0iZGRkIj48c3Bhbj4mI3gwNWRl
Ozwvc3Bhbj4mcmxtOyYjeDA1OTI7PC9oMj4KCjxoMT4gVGhpcyBlcXVpdmFsZW50IGZhaWxzIHZh
bGlkYXRpb248L2gxPgo8aDIgY2xhc3M9ImRkZCI+PHNwYW4+JiN4MDVkZTs8L3NwYW4+JiN4MDU5
Mjs8L2gyPgoKPGgxPiA6Zmlyc3QtbGV0dGVyIGRvZXMgbm90IGRvIHdoYXQgSSB3YW50PC9oMT4K
PGgyIGNsYXNzPSJlZWUiPiYjeDA1ZGU7JiN4MDU5Mjs8L2gyPgo8L2JvZHk+CjwvaHRtbD4K
</data>

          </attachment>
      

    </bug>

</bugzilla>