This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11526 - newlines in attribute values
Summary: newlines in attribute values
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Eliot Graff
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-10 15:02 UTC by David Carlisle
Modified: 2011-08-04 05:07 UTC
CC List: 6 users

See Also:


Attachments

Description David Carlisle 2010-12-10 15:02:13 UTC
The SVG path d attribute in section 12 and in the sample document is spread over several lines. These newlines are normalised to spaces in an XML DOM but not in an HTML DOM, as stated in section 7.

Rather than remove the newlines, a more pragmatic approach might be to add some words similar to those discussing the //<![CDATA usage in script elements, where there is an acknowledgement that the DOMs generated will be different but that it doesn't matter in practice, as the subsequent processing (by SVG and JavaScript engines respectively) will ignore the white space differences.

Also the abstract probably should not say

"parses into identical document trees"

but rather say 

"That parses into compatible DOM trees"


where you may have to define "compatible" informally, or maybe not, depending on how many places you get different parse trees while conforming to these rules.
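
A minimal sketch of the difference described above, assuming Python's standard library parsers as stand-ins: `xml.etree.ElementTree` for an XML parser and `html.parser` for an HTML parser (it is not a full HTML5 parser, but it likewise leaves attribute whitespace untouched). The sample markup and the `AttrCollector` helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: a path whose d attribute is wrapped onto two lines.
MARKUP = '<path d="M153 334\nC153 334 151 334"/>'


class AttrCollector(HTMLParser):
    """Records the attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs.update(attrs)


# XML route: attribute-value normalization replaces the newline with a space.
xml_d = ET.fromstring(MARKUP).get("d")

# HTML-style route: the newline is kept as written.
collector = AttrCollector()
collector.feed(MARKUP)
collector.close()
html_d = collector.attrs["d"]

print(repr(xml_d))
print(repr(html_d))
print(xml_d == html_d)                  # the raw attribute values differ
print(xml_d.split() == html_d.split())  # same tokens once whitespace is ignored
```

The two attribute values differ only in the kind of whitespace, which is the sense in which the resulting trees are "compatible" rather than "identical".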
Comment 1 Henri Sivonen 2010-12-10 16:33:37 UTC
While I agree that in this case the DOM difference does no harm, relaxing the goal from "identical" to "compatible" is a definitional slippery slope.
Comment 2 David Carlisle 2010-12-10 18:08:16 UTC
(In reply to comment #1)
> While I agree that in this case the DOM difference does no harm, relaxing the
> goal from "identical" to "compatible" is a definitional slippery slope.


Agreed, although the spec has already stood on that slope with the CDATA in script discussion. Perhaps that should be removed as well.
Comment 3 David Carlisle 2010-12-13 12:32:21 UTC
(In reply to comment #1)
> While I agree that in this case the DOM difference does no harm, relaxing the
> goal from "identical" to "compatible" is a definitional slippery slope.


One stated goal of polyglot documents, though, is to allow the XML toolchain to be used to generate documents served as text/html. Most SVG tools are going to wrap long SVG path attributes, as XML normalisation and/or SVG white space rules mean that it is safe for them to do so.

Thus, not allowing newlines here just to obtain an identical DOM seems to be optimising for a non-use case (the number of times when you are going to want to serve the same thing with two different MIME types and have identical white space in attributes must be vanishingly small) while preventing one of the main aims of the specification (allowing you to generate XHTML+SVG+MathML documents served as text/html in a safe way).

I think that I'd say don't put newlines in the title attribute, or in attributes holding a URI, but allow them elsewhere, noting in the introduction that white space differences in attributes are ignored except as noted in the body of the specification.
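
A rough sketch of why wrapping path data is harmless to an SVG consumer: once the d attribute is split into tokens, a newline and a space are indistinguishable. The `TOKEN` pattern and `path_tokens` helper are hypothetical and cover only command letters and numbers, not the full SVG path grammar.

```python
import re

# Simplified token pattern: a path command letter or a number.
# This is not the full SVG path grammar, only enough for the illustration.
TOKEN = re.compile(
    r"[MmZzLlHhVvCcSsQqTtAa]|[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
)


def path_tokens(d):
    """Return the command letters and numbers in a path data string."""
    return TOKEN.findall(d)


# The same path data, once wrapped by a tool and once on a single line.
wrapped = "M153 334 C153\n334 151 334 151 334 C151 339 153 344 156 344"
single_line = wrapped.replace("\n", " ")

# Both spellings produce the same token sequence.
print(path_tokens(wrapped) == path_tokens(single_line))
```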
Comment 4 Eliot Graff 2011-03-01 23:29:36 UTC
The Editor's Draft of 1 March contains the following changes:

***Removed the newlines from the example document (both within the spec and the standalone version at http://dev.w3.org/html5/html-xhtml-author-guide/SamplePage.html)

***Added the following sentence to Section 7, Attributes:
]]
Polyglot markup does not use newline characters within an attribute. 
[[

***Added the following note to Section 7, Attributes:
]]
Because of attribute-value normalization in XML [XML10], polyglot markup does not use newline characters within an attribute. Practically speaking, for source code with newlines within attributes, DOMs generated via XML and HTML will be different; however, whitespace differences have no behavioral impact on the page unless explicitly examined by JavaScript, rendering the differences of small consequence. Note that newlines are overtly not allowed in the title attribute or in any attribute containing a URI. 
[[

I believe that this satisfies the requests in this bug, while maintaining the integrity of polyglot's definition, so I am resolving it as fixed.

Thank you both for working through this with me.

Eliot
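
One possible way to check a document against the new sentence in Section 7 is sketched below, assuming Python's `html.parser` as the tokenizer. The `NewlineAttrFinder` class and `attributes_with_newlines` function are hypothetical names used for illustration only; they are not part of the guide or of any validator.

```python
from html.parser import HTMLParser


class NewlineAttrFinder(HTMLParser):
    """Collects (tag, attribute) pairs whose value contains a newline."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # value is None for attributes written without a value.
            if value is not None and ("\n" in value or "\r" in value):
                self.found.append((tag, name))


def attributes_with_newlines(markup):
    """Return the (tag, attribute) pairs that break the no-newline rule."""
    finder = NewlineAttrFinder()
    finder.feed(markup)
    finder.close()
    return finder.found


# Hypothetical input: only the title attribute contains a newline.
sample = '<p title="first line\nsecond line" class="note">text</p>'
print(attributes_with_newlines(sample))
```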
Comment 5 Leif Halvard Silli 2011-05-26 00:00:36 UTC
(In reply to comment #0)
> The svg path d attribute in section 12, and the sample document is spread over
> several lines. These newlines are normalised to space in an XML DOM but not in
> an HTML as stated in section 7.

The Editor's solution is probably OK. 

However, the exceedingly long line causes the page to scroll sideways, creating a very strange result (with lots of white, empty space on the right side of the document).

As discussed off-list, please add a span element and some CSS to the code inside the <pre> element to make it behave more reasonably. Suggested markup plus CSS:


<pre>
    [ snipping lots of code ]
 &lt;!-- Note that the following attribute contains no newlines. -->
 &lt;path  transform="translate(60, -175)" <span 
 style="background:yellow;white-space:normal;">d="M153 334 C153
334 151 334 151 334 C151 339 153 344 156 344 C164 344 171 339 171 334
C171 322 164 314 156 314 C142 314 131 322 131 334 C131 350 142 364 156
364 C175 364 191 350 191 334 C191 311 175 294 156 294 C131 294 111 311
111 334 C111 361 131 384 156 384 C186 384 211 361 211 334 C211 300 186
274 156 274"</span>  style="fill:white;stroke:red;stroke-width:2"/>

[ snipping lots of code ]

</pre>

The above code will, due to the span element with white-space:normal, cause that text to be visually broken over several lines. This is prettier. We don't need the current, draconian way to demonstrate that the line is long ...

I am not re-opening this bug because it is, after all, only a display issue.
Comment 6 Michael[tm] Smith 2011-08-04 05:07:06 UTC
mass-move component to LC1
Comment 7 Michael[tm] Smith 2011-08-04 05:07:28 UTC
mass-move component to LC1