This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 8003 - line breaks in attributes (especially title)
Summary: line breaks in attributes (especially title)
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Michael[tm] Smith
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords: NE, TrackerIssue
Depends on:
Blocks:
 
Reported: 2009-10-22 11:35 UTC by David Carlisle
Modified: 2010-10-04 13:57 UTC

Description David Carlisle 2009-10-22 11:35:22 UTC
It's quite useful to be able to put line breaks in title attributes to make multi-line tooltips, although support for this is currently sporadic (not in Firefox as far as I can see, and only in IE8 in full standards mode; it works in Opera and Safari).

However, the spec says:

http://dev.w3.org/html5/spec/Overview.html#the-title-attribute

Caution is advised with respect to the use of newlines in title attributes.

For instance, the following snippet actually defines an abbreviation's expansion with a line break in it:

<p>My logs show that there was some interest in <abbr title="Hypertext
Transport Protocol">HTTP</abbr> today.</p>


This seems to be an unfortunate change from HTML4 and from XML parsing rules (thus including the XML serialisation of HTML5). In HTML4's SGML parsing, or XML parsing, literal newlines would be normalised to space characters, explicitly to avoid the problem about which the above quote is cautioning. In order to get a newline into the attribute value you need to use a character reference such as
<abbr title="Hypertext &#10; Transport Protocol">HTTP</abbr>
Newlines (often added by text-wrapping editors) in attribute value literals don't cause a newline in the attribute value.

It would help future parallel authoring of HTML and X(HT)ML if the attribute value white space normalisation happened in both serialisations.
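
The difference between the two serialisations can be sketched with Python's standard library. This is only an illustration under stated assumptions: `xml.etree.ElementTree` stands in for an XML parser, `html.parser` stands in for a text/html tokenizer (no browser is involved), and `TitleCollector`, `html_title` and `xml_title` are hypothetical helper names.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# The same abbreviation, once with a literal line break in the
# attribute value and once with a numeric character reference.
LITERAL = '<abbr title="Hypertext\nTransport Protocol">HTTP</abbr>'
CHAR_REF = '<abbr title="Hypertext&#10;Transport Protocol">HTTP</abbr>'


class TitleCollector(HTMLParser):
    """Hypothetical helper: records every title attribute value it sees."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "title":
                self.titles.append(value)


def html_title(markup):
    """Return the first title value as seen by html.parser."""
    parser = TitleCollector()
    parser.feed(markup)
    parser.close()
    return parser.titles[0]


def xml_title(markup):
    """Return the root element's title value as seen by an XML parser."""
    return ET.fromstring(markup).get("title")
```

In this sketch, `xml_title(LITERAL)` has a space where the line break was, while `html_title(LITERAL)` still contains the line feed; with `CHAR_REF`, both keep the line feed produced by `&#10;`.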

David
Comment 1 Ian 'Hixie' Hickson 2009-10-23 06:15:34 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: The current behaviour is a direct result of requests from authors to allow them to have raw line breaks in attributes, and indeed specifically in the title="" attribute. I think that's more important than consistency between the HTML and XML serialisations.
Comment 2 David Carlisle 2009-10-23 08:43:00 UTC
It may be that one or two (or one or two hundred?) authors have requested this change, but the existing rules are based on decades of experience of the negative effects of having a system where white space wrapping is generally harmless but line breaks are preserved in a few contexts. If this change (which is incompatible with HTML4, HTML5-XHTML and any other XML or SGML usage) is made, then bad line breaks will inevitably end up in tooltips as files are reflowed in editors.
It is better not to add the bad behaviour than add it and then add text to the specification warning of the dangers. Especially as the existing text does not even say that it is not compatible with one of the supported serialisations or with legacy behaviour.

suggested tracker title

   line breaks in attributes

suggested text

White space normalisation should be applied to attribute values, consistent with HTML4 and XML. To insert a line break into an attribute value, a character reference such as &#10; should be used.
Comment 3 Anne 2009-10-23 11:39:51 UTC
All implementations already parse attributes in this way. It is just the display of tooltips that is not the same everywhere. I don't want different parsing rules for some attributes and it would probably be a compatibility issue too at this point.
Comment 4 Ian 'Hixie' Hickson 2009-10-23 12:44:04 UTC
Mike, can you file the Tracker issue for David?
Comment 5 David Carlisle 2009-10-23 13:06:34 UTC
(In reply to comment #3)
> All implementations already parse attributes in this way.

It may be the case that currently popular browsers do that, but it surely can't be the case that _no_ HTML implementation parses HTML4 attributes as specified by HTML4? Certainly any SGML-based validators won't do that, and can you really be sure that no existing HTML editors, for example, assume that attributes are parsed by HTML4 rules, which would normalise white space?

And given that major browsers currently are very inconsistent in using the result of the title attribute parse, compatibility with existing bugs doesn't seem a particularly strong argument for introducing a new attribute parsing rule at this point, incompatible with xml and html4, especially when the user experience of the new behaviour is likely to be sufficiently bad that it's thought necessary to add a warning about it to the spec.

David


Comment 6 Simon Pieters 2009-10-25 21:57:36 UTC
Web compat requires line breaks in <input type=hidden value="..."> to not be normalized; I'm sure there are other attribute values for which Web compat calls for no normalization (in text/html).


> compatibility with existing bugs

We need compatibility with existing *pages*.


> doesn't
> seem a particularly strong argument for introducing a new attribute parsing
> rule at this point, incompatible with xml and html4

Browsers have parsed attributes like this all the time, AFAIK.
Comment 7 Michael[tm] Smith 2009-10-26 13:45:14 UTC
escalated to Tracker:
http://www.w3.org/html/wg/tracker/issues/87
Comment 8 David Carlisle 2009-10-26 22:36:49 UTC
(In reply to comment #6)

> Browsers have parsed attributes like this all the time, AFAIK.
> 

Depressing. It makes one wonder what's the point of having a specification if
it is just going to be ignored.




HTML since at least HTML2 has always been quite explicit that attribute values should be normalized. This follows from the SGML declaration, but to save people reading that, the specs have highlighted this feature:

For example

HTML 2
http://tools.ietf.org/html/rfc1866#section-3.2

 A useful technique for computing an attribute value literal for a
   given string is to replace each quote and white space character by an
   entity reference or numeric character reference as follows:



HTML 3.2 references
http://www.w3.org/TR/WD-html-lex/#API
which says

Section 7.9.3 of SGML says that an attribute value literal is interpreted as an attribute value by:

    * Removing the quotes
    * Replacing character and entity references
    * Deleting character 10 (ASCII LF)
    * Replacing character 9 and 13 (ASCII HT and CR) with character 32 (SPACE) 



HTML4
http://www.w3.org/TR/html4/types.html

CDATA is a sequence of characters from the document character set and may include character entities. User agents should interpret attribute values as follows:

    * Replace character entities with characters,
    * Ignore line feeds,
    * Replace each carriage return or tab with a single space.
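
The HTML4 rules quoted above can be sketched as a small function. This is an illustration only: the name `html4_cdata_value` is hypothetical, `html.unescape` (which follows HTML5 reference rules) stands in for SGML reference replacement, and the order of the steps is an assumption. White space handling is applied to the literal before references are replaced, so that a line feed written as `&#10;` survives, in line with the character-reference approach described earlier in this bug.

```python
import html


def html4_cdata_value(literal):
    """Sketch of the HTML4 CDATA attribute rules (hypothetical helper)."""
    # Ignore line feeds that appear literally in the source.
    value = literal.replace("\n", "")
    # Replace each carriage return or tab with a single space.
    value = value.replace("\r", " ").replace("\t", " ")
    # Replace character references last (assumed order), so that
    # characters produced by references such as &#10; are kept.
    return html.unescape(value)
```

With a CRLF line ending in the literal, the line feed is dropped and the carriage return becomes the single space, so a wrapped `title` value ends up on one line.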

Comment 9 Ian 'Hixie' Hickson 2009-10-27 00:41:52 UTC
> Depressing. It makes one wonder what's the point of having a specification if
> it is just going to be ignored.

There's no point having a specification if it's just going to be ignored, IMHO; that's why HTML5 only specifies things that implementors are willing to implement.
Comment 10 Maciej Stachowiak 2010-01-28 04:00:17 UTC
Removing TrackerRequest since this has already been escalated.