24208 – Please handle the case of XML attributes with namespaces but no prefix

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED FIXED

Alias:	None

Product:	WebAppsWG
Classification:	Unclassified
Component:	DOM Parsing and Serialization
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Travis Leithead [MSFT]
QA Contact:	public-webapps-bugzilla

Duplicates (1):	24210

Reported:	2014-01-06 00:49 UTC by Victor Costan
Modified:	2014-10-13 23:46 UTC
CC List:	2 users

Description Victor Costan 2014-01-06 00:49:24 UTC

The DOM Core API setAttributeNS allows the creation of attributes that have a non-null namespace URI, but no prefix.

A minimal example follows below.
var doc = (new DOMParser()).parseFromString("<test/>", "text/xml", null);
doc.documentElement.setAttributeNS("http://www.example.com/", "attr", "value");

The XML serialization algorithm specified below does not seem to handle this case.
https://dvcs.w3.org/hg/innerhtml/raw-file/tip/index.html#dfn-concept-serialize-xml

(searching for "XML serialization of the attributes" shows the relevant paragraph)

Firefox and Internet Explorer handle this case by generating prefixes for the attributes.

Blink and WebKit do not yet handle this case, but I'm working on it. http://crbug.com/248044
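
To make the gap concrete, here is a rough sketch in plain JavaScript. It uses a hypothetical plain-object model rather than the real DOM, and the function names and the "ns1" prefix are invented for illustration (they are not what any browser emits). It contrasts a serializer that ignores the namespace with one that generates a prefix, roughly as Firefox and Internet Explorer are described as doing above.

```javascript
// Hypothetical model of the attribute created by the setAttributeNS call above.
const attr = {
  namespaceURI: "http://www.example.com/",
  prefix: null,
  localName: "attr",
  value: "value",
};

// Minimal escaping for double-quoted attribute values.
function escapeAttr(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Naive serialization: the namespace is silently dropped, because an
// unprefixed attribute name in XML is in no namespace.
function serializeNaive(a) {
  return ` ${a.localName}="${escapeAttr(a.value)}"`;
}

// Prefix-generating serialization: declare a made-up prefix for the
// namespace, then use that prefix on the attribute name.
function serializeWithGeneratedPrefix(a, generatedPrefix) {
  return (
    ` xmlns:${generatedPrefix}="${escapeAttr(a.namespaceURI)}"` +
    ` ${generatedPrefix}:${a.localName}="${escapeAttr(a.value)}"`
  );
}

const naiveOutput = "<test" + serializeNaive(attr) + "/>";
const fixedOutput = "<test" + serializeWithGeneratedPrefix(attr, "ns1") + "/>";
```

The naive output round-trips to an attribute with no namespace, which is the information loss this bug is about.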

Comment 1 Travis Leithead [MSFT] 2014-02-27 19:05:10 UTC

var doc = (new DOMParser()).parseFromString("<test xmlns:a1='other'><child/></test>", "text/xml", null);
doc.documentElement.firstChild.setAttributeNS("http://www.example.com/", "attr", "value");
doc.documentElement.firstChild.setAttributeNS("http://www.another.example.com/", "attr", "value");

Firefox carefully manages the auto-generation of namespace numbering, so I'll need to spec that.

var doc = (new DOMParser()).parseFromString("<test><child/></test>", "text/xml", null);
doc.documentElement.firstChild.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:test", "http://www.example.com/");
doc.documentElement.firstChild.setAttributeNS("http://www.example.com/", "attr", "value");

This case falls down in IE--it doesn't seem to pick up the dynamically added "test" prefix and perform the association. This case needs to be handled as well.
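
One way to picture the handling this case needs is a two-pass walk over the element's own attributes: first collect the prefixes declared through xmlns:* attributes, then reuse them when serializing. The sketch below is only an illustration over a hypothetical plain-object model (not the DOM, and not the spec text); it ignores declarations on ancestors and assumes attribute values need no escaping.

```javascript
const XMLNS_NS = "http://www.w3.org/2000/xmlns/";

// Hypothetical model of <child> after the two setAttributeNS calls above.
const childAttrs = [
  { namespaceURI: XMLNS_NS, prefix: "xmlns", localName: "test",
    value: "http://www.example.com/" },
  { namespaceURI: "http://www.example.com/", prefix: null, localName: "attr",
    value: "value" },
];

// First pass: record every prefix this element declares via xmlns:* attributes.
function collectDeclaredPrefixes(attrs) {
  const prefixByNamespace = new Map();
  for (const a of attrs) {
    if (a.namespaceURI === XMLNS_NS && a.prefix === "xmlns") {
      prefixByNamespace.set(a.value, a.localName);
    }
  }
  return prefixByNamespace;
}

// Second pass: serialize, reusing a declared prefix where one exists.
function serializeAttrs(attrs) {
  const declared = collectDeclaredPrefixes(attrs);
  let out = "";
  for (const a of attrs) {
    let name = a.localName;
    if (a.namespaceURI === XMLNS_NS) {
      // Either a prefixed declaration (xmlns:foo) or the default one (xmlns).
      name = a.prefix === "xmlns" ? `xmlns:${a.localName}` : "xmlns";
    } else if (a.namespaceURI !== null) {
      const prefix = a.prefix || declared.get(a.namespaceURI);
      if (!prefix) {
        throw new Error("no prefix in scope; one would have to be generated");
      }
      name = `${prefix}:${a.localName}`;
    }
    out += ` ${name}="${a.value}"`;
  }
  return out;
}

const childMarkup = "<child" + serializeAttrs(childAttrs) + "/>";
```

Here the unprefixed attribute picks up the "test" prefix that was added dynamically on the same element, so no generated prefix is needed.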

Comment 2 Travis Leithead [MSFT] 2014-02-27 19:08:24 UTC

I'm on the fence about what auto-generation naming system to use.

Firefox is more concise with "a" + index forming the generated prefix.

IE is more descriptive with "NS" + index, which is somewhat self-describing.

I like IE's generated prefix, but not the uppercase characters. If I go that route then no browser will align with the spec, though it would be easy for either of them to make the namespace generation string change, I imagine.

If I go with the Firefox generated string, then they form the reference implementation.

Any preferences?

Comment 3 Victor Costan 2014-02-27 19:55:29 UTC

Thank you very much for looking into this, Travis!

I have a slight preference for lowercase "ns" as well, and that's what my work-in-progress patch uses for Blink. I'm willing to implement what you end up spec'ing though.

FWIW, in my implementation, I'm not using sequential indices. Instead, the indices are based on an internal hash function. If the specification mandates sequential indices, I hope that it will do so in a way that does not require a quadratic-time algorithm. For bonus points, try not adding any memory overhead for DOMs that set their own prefixes for attributes with namespaces.
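
For readers unfamiliar with the idea, a hash-derived index could look roughly like the sketch below. The hash used by the Blink patch is not given in this thread, so a 32-bit FNV-1a-style hash over UTF-16 code units stands in for it here; both function names are hypothetical. The appeal is that no counter and no per-document table is needed, at the cost of possible collisions, which this sketch does not resolve.

```javascript
// 32-bit FNV-1a-style hash over the UTF-16 code units of the string.
// This is a stand-in: the actual hash in the Blink patch is not specified here.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash >>> 0;
}

// Derives a prefix from the namespace URI alone, so the same URI always
// maps to the same prefix without any shared state.
function hashedPrefix(namespaceURI) {
  return "ns" + fnv1a32(namespaceURI);
}

const prefixOnce = hashedPrefix("http://www.example.com/");
const prefixTwice = hashedPrefix("http://www.example.com/");
```

Because the prefix depends only on the URI, serialization needs no extra bookkeeping for documents that already set their own prefixes.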

Comment 4 Travis Leithead [MSFT] 2014-02-27 23:53:12 UTC

(In reply to Victor Costan from comment #3)
> I have a slight preference for lowercase "ns" as well, and that's what my
> work-in-progress patch uses for Blink. I'm willing to implement what you end
> up spec'ing though.

Cool, I'll go with "ns" then as a prefix...

> FWIW, in my implementation, I'm not using sequential indices. Instead, the
> indices are based on an internal hash function. If the specification
> mandates sequential indices, I hope that it will do so in a way that does
> not require a quadratic-time algorithm. For bonus points, try not adding any
> memory overhead for DOMs that set their own prefixes for attributes with
> namespaces.

So, your indexes will be random? How many digits will you use to represent the result of the hash? 

Given a local incrementing counter approach, two unique serializations of different parts of a DOM are at risk of defining the same prefixes for different namespaces. I'm not sure that's a problem unless you end up wanting to combine these serialized strings together (in which case you'd need to de-dup the prefixes).

Another potential issue is that prefixes will be assigned based on the order they are encountered in an element's attributes collection (and this attribute order may be different based on different UAs).

Perhaps it would be safer (and more resilient) to have the spec define an implementation-specific algorithm that generates a unique number for a given namespace (the hashing protocol you described), and that the number<->namespace mapping must be consistent for this namespaceURI wherever it is encountered (and not already associated with a prefix). That way you solve the first-mentioned problem, and don't imply an order in the second problem. An implementation-specific algorithm could use an incrementing number, or a 10-digit hash--the spec requirement might only describe the "ns" prefix part.
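
A loose sketch of that idea, with hypothetical names throughout: the registry fixes only the "ns" part and the consistency of the number-to-namespace mapping, while the number itself comes from a pluggable, implementation-specific function (a counter here, but a hash would fit the same slot).

```javascript
// Creates a lookup that hands out one stable prefix per namespace URI.
// numberFor is the implementation-specific part: a counter, a hash, etc.
function createPrefixRegistry(numberFor) {
  const prefixByNamespace = new Map();
  const namespaceByPrefix = new Map();
  return function prefixFor(namespaceURI) {
    let prefix = prefixByNamespace.get(namespaceURI);
    if (prefix === undefined) {
      prefix = "ns" + numberFor(namespaceURI);
      // A hash-based numberFor could collide; refuse rather than alias
      // two namespaces to one prefix.
      if (namespaceByPrefix.has(prefix)) {
        throw new Error("generated prefix collision: " + prefix);
      }
      prefixByNamespace.set(namespaceURI, prefix);
      namespaceByPrefix.set(prefix, namespaceURI);
    }
    return prefix;
  };
}

// One possible numberFor: an incrementing counter that ignores its argument.
function makeCounter() {
  let next = 1;
  return () => next++;
}

const prefixFor = createPrefixRegistry(makeCounter());
const first = prefixFor("http://www.example.com/");
const second = prefixFor("http://www.another.example.com/");
const again = prefixFor("http://www.example.com/");
```

With a registry shared across serializations, the same namespace keeps the same prefix wherever it is encountered, and the result no longer depends on attribute order beyond the first encounter.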

On the other hand all that might be overkill for this rather edge-case scenario.

Comment 5 Victor Costan 2014-03-17 16:02:26 UTC

I'm terribly sorry for replying to this so late! My inbox ate your reply, and I just happened to check on this bug today.

The hash function that I'm relying on right now is consistent within a browser process, but may change across platforms and browser versions.

I like your approach for having a resilient spec, and have one minor suggestion. It might be better to allow the hash code to be up to 20 digits in length, to allow UAs to use 64-bit hashes.
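
The digit counts can be checked directly: the largest unsigned 32-bit value has 10 decimal digits and the largest unsigned 64-bit value has 20. The snippet below verifies this with BigInt and, purely as an assumed example of a 64-bit hash (the thread names none), adds an FNV-1a-style hash over UTF-16 code units.

```javascript
// Largest values representable by unsigned 32-bit and 64-bit hashes.
const maxUint32 = 2n ** 32n - 1n;
const maxUint64 = 2n ** 64n - 1n;

// Number of decimal digits needed in the worst case.
const digits32 = maxUint32.toString().length;
const digits64 = maxUint64.toString().length;

// 64-bit FNV-1a-style hash, used here only as an assumed example.
function fnv1a64(text) {
  const mask = 0xffffffffffffffffn; // keep the result within 64 bits
  let hash = 0xcbf29ce484222325n;   // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= BigInt(text.charCodeAt(i));
    hash = (hash * 0x100000001b3n) & mask; // multiply by the FNV prime
  }
  return hash;
}

const widePrefix = "ns" + fnv1a64("http://www.example.com/").toString();
```

So a limit of 10 digits would rule out 64-bit hashes, while 20 digits accommodates them.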

I agree that this might be overkill. You can always spec that serialization should throw an exception in this case, and I'd be happy to implement just that :) I don't think this should break too much of the Web, because right now Chrome and Safari basically generate invalid XML in this case.

Comment 6 Travis Leithead [MSFT] 2014-03-23 00:11:27 UTC

OK, took a stab at this. Had to do some significant refactoring of the algorithm, so make sure you review it all in context.

Unfortunately, I couldn't avoid a multi-pass through an element's attribute list--hopefully you'll understand why when reading the steps. An implementation can probably optimize this though :-).

Finally, I went with the easiest-to-spec approach of a single incrementing integer (it only has scope within the execution of the algorithm, so multiple independent serializations may generate overlapping "generated prefixes"). I think this is OK, because Firefox and IE do not enforce any kind of global consistency here and the web is just fine (so I think this is already a supreme edge case).
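
The consequence of scoping the counter to a single run can be illustrated with a simplified sketch. This is not the spec algorithm: it uses a hypothetical plain-object element model, assumes every attribute is namespaced with no prefix, assumes values need no escaping, and assumes the index starts at 1.

```javascript
// Serializes one hypothetical element ({ name, attrs }) and returns markup.
// The counter lives inside the call, so every serialization starts over.
function serializeElement(element) {
  let generatedIndex = 1; // assumed starting value
  const prefixByNamespace = new Map();
  let declarations = "";
  let attributes = "";
  for (const a of element.attrs) {
    let prefix = prefixByNamespace.get(a.namespaceURI);
    if (prefix === undefined) {
      // First time this namespace is seen in this run: generate and declare.
      prefix = "ns" + generatedIndex++;
      prefixByNamespace.set(a.namespaceURI, prefix);
      declarations += ` xmlns:${prefix}="${a.namespaceURI}"`;
    }
    attributes += ` ${prefix}:${a.localName}="${a.value}"`;
  }
  return `<${element.name}${declarations}${attributes}/>`;
}

// Two independent serializations of unrelated fragments.
const fragmentA = serializeElement({
  name: "a",
  attrs: [{ namespaceURI: "http://www.example.com/", localName: "attr", value: "1" }],
});
const fragmentB = serializeElement({
  name: "b",
  attrs: [{ namespaceURI: "http://www.another.example.com/", localName: "attr", value: "2" }],
});
```

Both fragments end up using "ns1" for different namespaces. Each is still well-formed on its own, because the declaration travels with the element; the overlap only matters if the strings are later merged under shared declarations.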

Comment 7 Travis Leithead [MSFT] 2014-03-28 21:15:20 UTC

Note: the update was done across three commits:
  https://dvcs.w3.org/hg/innerhtml/rev/fb9edcfb8f5f
  https://dvcs.w3.org/hg/innerhtml/rev/f9b5a818ef99
  https://dvcs.w3.org/hg/innerhtml/rev/fa768c710fba

Comment 8 Travis Leithead [MSFT] 2014-10-13 23:46:29 UTC

*** Bug 24210 has been marked as a duplicate of this bug. ***