DOM Core currently says that implementations must not support normalize(). Why? It seems reasonably useful to me, and it's supported by all browsers. Is there something wrong with it? Is there evidence that sites don't depend on it? I've used it before.
What is the use case?
I don't have a non-marginal use-case. The case where it came up for me is when I was writing some tests and wanted to check whether the browser and spec output matched, but didn't want to fail just because of some extra empty text nodes or such. That isn't common, though. However, browsers already support it and it's simple, so the path of least resistance is just to spec it. It isn't useless or harmful enough to be worth the effort of removing, since there are undoubtedly some sites that use it (even if pointlessly) and will break if it starts throwing an exception. It could be re-specced as a no-op, but that seems like a lot more effort and confusion than just speccing it to do what it already does.

More generally, existing features don't need use-cases, IMO. New features need strong use-cases, but existing features that all browsers support should be specced unless there's a strong reason we want to drop them.
Thinking about it, the basic use-case for my tests was actually that I wanted to check whether a DOM was serializable. In other words, I wanted to verify that

    // session 1
    sendOverNetwork(div.innerHTML);

    // session 2, maybe later
    div.innerHTML = retrieveFromNetwork();

would actually behave as intended. So as a sanity check, I did something like

    var clone = div.cloneNode(false);
    clone.innerHTML = div.innerHTML;
    assert(clone.isEqualNode(div));

But this was foiled by things like empty text nodes, which I didn't care about at all. So instead I changed it to something like

    var clone1 = div.cloneNode(true);
    clone1.normalize();
    var clone2 = clone1.cloneNode(false);
    clone2.innerHTML = clone1.innerHTML;
    assert(clone1.isEqualNode(clone2));

which removes the false negatives.

This would be useful for anyone who's relying on HTML serialization, so that they can track down bugs caused by failure to round-trip. HTML serialization and parsing are so complicated that there's no realistic way to ensure that they'll work right without trying it and comparing DOMs.

The false negatives aren't a theoretical issue, either. All kinds of common DOM operations leave empty or adjacent text nodes. execCommand() does it all over the place (probably inconsistently between browsers at the moment).
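As a minimal illustration of how adjacent Text nodes arise in practice (this example isn't from the original comment; splitText() is just one simple way to produce them), and of what normalize() does with them:

    var div = document.createElement("div");
    div.textContent = "hello world";
    div.firstChild.splitText(5);          // leaves two adjacent Text nodes
    console.log(div.childNodes.length);   // 2
    div.normalize();                      // merges them back into one
    console.log(div.childNodes.length);   // 1
    console.log(div.firstChild.data);     // "hello world"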
normalize():

* Go through all descendants of the context object.
* For each Text node, take the data of its contiguous Text nodes (excluding itself), concatenate it and append the result to its own data, then remove its contiguous Text nodes, and then find the next Text node.
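A rough JavaScript sketch of the merging step described above (the function name is hypothetical; this only covers the contiguous-Text merging the proposed text describes, and a real implementation lives inside the DOM engine):

    // Recursively merge runs of contiguous Text nodes under `node`,
    // roughly following the proposed normalize() steps above.
    function normalizeSketch(node) {
      var child = node.firstChild;
      while (child) {
        if (child.nodeType === Node.TEXT_NODE) {
          // Append the data of following contiguous Text nodes to this one,
          // removing them as we go.
          while (child.nextSibling &&
                 child.nextSibling.nodeType === Node.TEXT_NODE) {
            child.data += child.nextSibling.data;
            node.removeChild(child.nextSibling);
          }
        } else {
          // Not a Text node: recurse so all descendants are covered.
          normalizeSketch(child);
        }
        child = child.nextSibling;
      }
    }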
https://bitbucket.org/ms2ger/dom-core/changeset/7fe7d6fd0cf3