This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 10838 - Make <u> conforming.
Summary: Make <u> conforming.
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC All
Importance: P3 major
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://www.w3.org/TR/html5/obsolete.h...
Whiteboard:
Keywords: WGDecision
Duplicates: 11518
Depends on:
Blocks:
 
Reported: 2010-09-29 23:06 UTC by KangHao Lu
Modified: 2011-07-30 14:30 UTC
CC List: 14 users

Description KangHao Lu 2010-09-29 23:06:24 UTC
This was discussed in a long thread[1] three years ago. There was no consensus and no response from the editor, but since some people like Maciej think we can make <u> at least conforming[2], I would like to raise this bug to catch the editor's attention.

The proper name mark in Chinese [3] is a use case that was briefly mentioned in the thread (see also an example picture [4]). The purpose of this mark is to highlight *every* proper noun in a text to help the reader separate words, because written Chinese does not use spaces between words. Although it is less popular than it used to be, the proper name mark is still used in some textbooks in Taiwan (especially children's Chinese textbooks) and Hong Kong. Because the purpose and frequency of the proper name mark in a piece of text are similar to those of code highlighting, I personally don't think it's appropriate to use <mark> for this purpose, since the spec says "This (<mark>) is separate from syntax highlighting, for which span is more appropriate." But then <span class="pn"> is probably too long. This is not yet a very prevalent use case on the Web because proper name marks are often used in combination with vertical text, and vertical text is not yet popular on the Web.

While I do strongly agree that underlined text which is not a hyperlink confuses users while browsing, there are HTML user agents for which browsing the Web isn't the main purpose, and whose users therefore might not be confused. E-book readers are one such example.

[1] http://lists.w3.org/Archives/Public/public-html/2007Dec/thread#msg268 
[2] http://lists.w3.org/Archives/Public/public-html/2007Dec/0307
[3] http://en.wikipedia.org/wiki/Proper_name_mark
[4] http://www.go8.com.tw/files/4168%200004.jpg
Comment 1 Ian 'Hixie' Hickson 2010-09-30 02:39:51 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale:

If we introduced <u> with the semantic "proper name mark", it would just be used incorrectly by everyone. So doing that would be bad. If this is a use case that we should address, then we're better off introducing a new element for it.

I can't find another reason to have <u> (I've thought about this quite a bit). It really does seem to be purely presentational; far more than even <small>, <b>, <i>, and <s>.
Comment 2 KangHao Lu 2010-09-30 06:52:31 UTC
The following are two other use cases of <u> from the wiki entry "Underline"[1]. I am not familiar with these use cases, but I think <em>, <mark> and <b> are not suitable here.

- Underlines are sometimes used as a diacritic, to indicate that a letter has a different pronunciation to its non-underlined form.
- single underline used on manuscripts to indicate the italic typeface to be used

The use of <i> for the second use case is arguably incorrect because the typical typographic presentation in that context is not italicized.

While these are probably corner cases, it might be worth providing a tag for cases like these, where an underline is the typical typographic presentation.

For the record, I do agree that we shouldn't add extra semantics to <u>, but based on the current spec I don't agree that <u> is more presentational than <b> and <i>. They are all "a span of text offset from the normal prose/presentational mode, whose typical typographic presentation is xxx" to me, and explanation based on examples is confusing and somehow inconsistent. <i> for ship names won't be pronounced in an alternative voice (and ship names won't be italicized in Chinese, just as proper nouns won't be underlined in English), and I personally think these should all be made obsolete but conforming for consistency. They are all last resorts anyway.

Any pointer to your extended thinking about underlining would be appreciated.

[1] http://en.wikipedia.org/wiki/Underline
Comment 3 Ian 'Hixie' Hickson 2010-09-30 07:13:13 UTC
The diacritic is &#x0332;.
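
For illustration, a minimal sketch of this character-level approach, assuming the diacritic is applied by placing U+0332 (COMBINING LOW LINE) immediately after each letter it modifies. The sample words are hypothetical:

```html
<!-- U+0332 COMBINING LOW LINE follows the base letter it underlines -->
<p>One letter marked: t&#x0332;a</p>

<!-- Underlining a whole word this way needs one combining mark per letter -->
<p>Whole word marked: w&#x0332;o&#x0332;r&#x0332;d&#x0332;</p>
```

No markup or CSS is involved here; the combining mark is part of the text itself.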

I don't really see why we would want to use <u> to indicate that the text should be italicised... if you want to indicate that the text should be italicised, use the appropriate markup for the reason it's italicised, and then use CSS to italicise it.

No pointer to my thoughts, they're just thoughts, sorry! When I say I've been thinking about this a long time I mean it literally.

Status: Rejected
Change Description: no spec change
Rationale:

I maintain that there is a difference between <u>'s use cases and <b>/<i>/<s>/<small>/etc. The others aren't actually defined in terms of their rendering, and yet have very solid and common use cases (<b> for keywords, <i> for alternate voices or terms, <s> for irrelevant or inaccurate text, <small> for legalese, etc) that actually match how they are commonly used in pages where people aren't just abusing HTML as a presentational language.
Comment 4 Henri Sivonen 2010-09-30 08:28:58 UTC
(In reply to comment #1)
> If we introduced <u> with the semantic "proper name mark", it would just be
> used incorrectly by everyone. So doing that would be bad. 

What concrete badness would ensue?

It seems to me that it's backwards to let theoretical purity (the position that presentational markup is wrong as a matter of principle, without citing how people would concretely suffer) override a pragmatic i18n/author concern (that it's easier to underline Chinese characters using <u> than using combining diacritical marks).

Reopening.
Comment 5 fantasai 2010-09-30 08:34:28 UTC
(In reply to comment #2)
> - Underlines are sometimes used as a diacritic, to indicate that a letter
>   has a different pronunciation to its non-underlined form.
> - single underline used on manuscripts to indicate the italic typeface to
>   be used

As Ian mentioned, the first case should be handled with a proper diacritic, not
with markup. In the second case, the markup used should be determined based on
the structure of the text, not its styling in the manuscript. CSS can then be
used to assign an underline in place of the default styling for the markup
(whatever that markup happens to be). HTML markup is not a presentation
language: asking it to represent manuscript styling is inappropriate.

As for the original use case, the correct markup would be <i>. From the spec:
  # The i element represents a span of text ... offset from the normal prose.
Stylistically offsetting a proper name is an appropriate use of this markup.

With regard to styling, if necessary it can be subclassed, but since italics
are not generally used in Chinese,
  i:lang(zh) { font-style: normal; text-decoration: underline; }
should be adequate.
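
A minimal sketch of how this suggestion might look in a page, assuming the content declares its language with the lang attribute. The sample sentences are hypothetical, and the rule is written here with a closing brace:

```html
<style>
  /* In Chinese content, show <i> with an underline instead of italics */
  i:lang(zh) { font-style: normal; text-decoration: underline; }
</style>

<!-- Matches the rule: the proper names are underlined, not italicised -->
<p lang="zh"><i>孫中山</i>生於<i>廣東</i>。</p>

<!-- Does not match the rule: <i> keeps its default italic rendering -->
<p lang="en">The <i>Titanic</i> sank in 1912.</p>
```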

I recommend closing this issue.
Comment 6 Ian 'Hixie' Hickson 2010-09-30 09:05:39 UTC
The concrete badness is that if we have an element that is purely for presentational purposes, people will be locked into that rendering for all the purposes for which they have used it. This contrasts with semantic markup, where you can restyle a category of content using a style sheet. For example, you can restyle all the content that is intended to be in a different voice to be in a different font, rather than just italics. Or you can style keywords in a different colour as well as being bold.
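
As a rough illustration of the restyling described above, the following sketch assumes a page that uses <i> for an alternate voice and <b> for keywords. The font and colour choices and the sample sentences are arbitrary:

```html
<style>
  /* Alternate voice: a different font family instead of italics */
  i { font-style: normal; font-family: cursive; }

  /* Keywords: keep the default bold and add a colour */
  b { color: maroon; }
</style>

<p>She paused. <i>This cannot be right</i>, she thought.</p>
<p>Press the <b>reset</b> button to start again.</p>
```

One style sheet change restyles every occurrence of each category, which is the benefit being claimed for semantic markup.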

In general there is also the value of educating authors about using the right semantic tools: as we push people away from <font> and <u>, they get closer to using the much more semantic elements like <cite> and <aside>. This further increases the authoring benefits for those authors and their readers, especially those readers using non-visual UAs, whose tools can then apply more appropriate rendering than just guessing at how to express (in this case) underlines in their medium.

When we added <b>, <i>, <small>, and, most recently, <s>, it was not that we were adding presentational elements and that we were justifying it by doublethinking a semantic meaning for them. HTML really does define these elements now in semantic terms; that they have existing presentations is a backwards-compatibility boon; that the elements are often already used for the purposes for which we defined them makes them easier to teach. But that doesn't make them any less semantic. These definitions are sometimes referred to disparagingly as "semantic fig leafs", but I think that viewing them that way misses the point of why these elements exist in the language. They each have real use cases.


Status: Rejected
Change Description: no spec change
Rationale: see above.
Comment 7 Henri Sivonen 2010-11-17 14:10:59 UTC
Escalated as http://www.w3.org/html/wg/tracker/issues/144 as a follow-up to the discussion at TPAC.
Comment 8 Maciej Stachowiak 2010-12-09 22:05:08 UTC
*** Bug 11518 has been marked as a duplicate of this bug. ***
Comment 9 Ambrose Li 2010-12-27 23:23:46 UTC
The point is that U is not presentational in the Chinese language. It is not manuscript styling, but a full-fledged punctuation mark ("first-class citizen" if you prefer that wording). It is in the same class as the comma, period, colon, dashes, hyphens, and quotation marks.

The English punctuation marks are also derived from manuscript styling. Are we going to deprecate and eventually obsolete them too, and replace them with new HTML elements that describe sentence structure? I find this argument very unconvincing.
Comment 10 Ian 'Hixie' Hickson 2010-12-29 09:04:47 UTC
This bug has been escalated already, so it shouldn't be open.
Comment 11 Aryeh Gregor 2010-12-29 20:06:59 UTC
To clarify for those not familiar with the HTMLWG Decision Policy: the TrackerIssue keyword means that the bug will be escalated at some point to a request for Change Proposals.  Anyone can then submit arguments for why it should or should not be conforming.  If there's at least one argument submitted on each side, the HTMLWG co-chairs will decide the issue and may overrule the editor (at least with respect to the W3C version of the spec).

The chairs have issued a request for Change Proposals, with a due date of January 26: <http://lists.w3.org/Archives/Public/public-html/2010Dec/0126.html>  If you'd like to get it changed, you can submit a Change Proposal to public-html or bring the chairs' attention to it in some other way.  It should include all the information given in the escalation procedure: <http://dev.w3.org/html5/decision-policy/decision-policy.html#escalation>

If no Change Proposals are submitted, the issue will be closed without prejudice, so it can still be reopened later if someone wants to submit a Change Proposal at that time.
Comment 12 Laura Carlson 2010-12-29 21:04:36 UTC
Hi Aryeh,

(In reply to comment #11)
> To clarify for those not familiar with the HTMLWG Decision Policy: the
> TrackerIssue keyword means that the bug will be escalated at some point to a
> request for Change Proposals.  Anyone can then submit arguments for why it
> should or should not be conforming.  If there's at least one argument submitted
> on each side, the HTMLWG co-chairs will decide the issue and may overrule the
> editor (at least with respect to the W3C version of the spec).

I'm not sure that anyone can submit a Change Proposal. Unless there are extenuating circumstances, it seems that the HTML Chairs want non-members to join the HTML Working Group to be eligible to submit Change Proposals.

Check HTMLWG Decision Policy Bug 10524 - Please clarify procedure and recourse for non-working group members when they are unsatisfied with a bug resolution comment 21: 
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10524#c21
Comment 13 Aryeh Gregor 2010-12-30 20:52:03 UTC
I stand corrected.  However, if a non-member writes a Change Proposal that stands a chance of getting accepted, it's almost certain that some member will be willing to submit it for them, so it's not a big difference in practice.
Comment 14 Laura Carlson 2010-12-30 21:37:52 UTC
(In reply to comment #13)
> I stand corrected.  However, if a non-member writes a Change Proposal that
> stands a chance of getting accepted, it's almost certain that some member will
> be willing to submit it for them, so it's not a big difference in practice.

That would seem to be a workaround, all right.
Comment 15 Ambrose Li 2011-01-04 00:19:38 UTC
(In reply to comment #5)
> As for the original use case, the correct markup would be <i>. From the spec:
>   # The i element represents a span of text ... offset from the normal prose.
> Stylistically offsetting a proper name is an appropriate use of this markup.
> 
> With regards to styling, if necessary it can be subclassed, but since italics
> are not used in Chinese generally,
>   i:lang(zh) { font-style: normal; text-decoration: underline; }
> should be adequate.

I wish to respond specifically to this proposal. This is totally unworkable. If there is non-CJK text inside the proper name (English letters in foreign names, for example), the result would be a mix of roman and italic characters, some underlined and some not. This will be a complete mess and is orthographically wrong.
Comment 16 Aryeh Gregor 2011-01-05 23:18:18 UTC
(In reply to comment #15)
> >   i:lang(zh) { font-style: normal; text-decoration: underline; }
> 
> I wish to address specifically to this proposal. This is totally unworkable. If
> there are non-CJK text inside the proper name (English letters in foreign
> names, for example), the result would be a mix of roman and italic characters,
> some underlined and some not.

That isn't what that rule does.  It will either underline the whole <i> or italicize the whole <i>, depending on whether the language of the <i> element itself is zh or not.
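
A small sketch of that behaviour, assuming the surrounding paragraph is marked lang="zh". The names are hypothetical sample text:

```html
<style>
  i:lang(zh) { font-style: normal; text-decoration: underline; }
</style>

<!-- The <i> inherits lang="zh" from the paragraph, so the rule matches the
     whole element: the Latin and the Chinese characters are all underlined
     and none are italicised. -->
<p lang="zh">他叫<i>John 王</i>。</p>

<!-- Here the <i> itself is marked as English, so the rule does not match
     and the whole element gets the default italic rendering. -->
<p lang="zh">他叫<i lang="en">John Wang</i>。</p>
```

The :lang() selector tests the declared language of the element, not the script of the individual characters inside it.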
Comment 17 Sam Ruby 2011-04-08 18:37:06 UTC
Working Group Decision: http://lists.w3.org/Archives/Public/public-html/2011Apr/0212.html
Comment 18 contributor 2011-04-13 21:59:01 UTC
Checked in as WHATWG revision r6002.
Check-in comment: apply wg decision
http://html5.org/tools/web-apps-tracker?from=6001&to=6002
Comment 19 Sam Ruby 2011-04-13 22:10:17 UTC
(In reply to comment #18)
> Checked in as WHATWG revision r6002.
> Check-in comment: apply wg decision
> http://html5.org/tools/web-apps-tracker?from=6001&to=6002

This change seems unrelated to this bug report.
Comment 20 contributor 2011-04-13 23:50:28 UTC
Checked in as WHATWG revision r6004.
Check-in comment: Add <u> to HTML and WebVTT.
http://html5.org/tools/web-apps-tracker?from=6003&to=6004
Comment 21 Ian 'Hixie' Hickson 2011-04-13 23:52:52 UTC
Disregard comment 18, that one is for the microdata ack issue.