[whatwg] id and xml:id

UAs handle whitespace in the id attribute inconsistently (see below),
old specs imply or require whitespace trimming, and ids containing
whitespace cannot be referenced from whitespace-separated lists of ids.
I therefore suggest adding the following language concerning document
conformance:

The value of the id attribute must be a string that consists of one
or more characters matching the following production:
[#x21-#xD7FF]|[#xE000-#xFFFD]|[#x10000-#x10FFFF]
(any XML 1.0 character excluding whitespace).
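
As a rough illustration of what a conformance checker could do with this
production, here is a minimal Python sketch. The function name
is_conforming_id is made up for this example; only the character ranges
come from the proposed text.

```python
import re

# Character class copied from the proposed production:
# [#x21-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ID_VALUE = re.compile(r"[\u0021-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]+")


def is_conforming_id(value):
    """Return True if value is one or more characters from the production.

    Hypothetical helper for illustration; not taken from any spec or tool.
    """
    return _ID_VALUE.fullmatch(value) is not None
```

With this check, is_conforming_id("a") is true, while " c ", "a b" and the
empty string are rejected. The "a b" case is the one that matters for
whitespace-separated lists of ids: "a b".split() yields two tokens, so no
single token can ever name that element.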

Also, I suggest requiring that an element must not have both id and
xml:id, and that xml:id must not occur in the HTML serialization.
(Again, this is from the document conformance point of view; I am not
disputing requirements on browsers.)
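
For the XML serialization, the "not both id and xml:id" rule could be
checked along these lines. This is only a sketch using Python's standard
xml.etree.ElementTree; the function name elements_with_both_ids is an
assumption of this example, and the HTML-serialization half of the rule
is not covered here.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def elements_with_both_ids(xml_text):
    """Return the tag names of elements that carry both id and xml:id.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    return [
        el.tag
        for el in root.iter()
        if "id" in el.attrib and XML_ID in el.attrib
    ]
```

For example, given <div><p id="a" xml:id="a"/><span id="b"/></div>, only
the p element is reported.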

Rationale:
HTML doesn't have namespace processing of colonified names and the  
xml:id spec is not designed for HTML. Allowing xml:id in HTML feels  
intuitively wrong (perhaps even a bit evil :-).

If an element had both an id attribute and an xml:id attribute with  
different values, the document would not be HTML-serializable, which  
would be bad. (Obviously, even with only one kind of ID attribute on  
an element, in round tripping from XHTML to HTML to XHTML, the  
information about whether the original attribute was id or xml:id is  
lost just like the information about whether a table had a tbody is  
lost.)

If an element were allowed to have an id attribute and an xml:id
attribute with the same value, the following constraint from the xml:id
spec would be violated even for conforming docs:
"An xml:id processor should assure that the following constraint holds:
     * The values of all attributes of type “ID” (which includes all
xml:id attributes) within a document are unique."
( http://www.w3.org/TR/xml-id/ )
Assuming, of course, that the XHTML5 id can still be considered an ID  
in the XML sense.
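
To make the uniqueness problem concrete, here is a small Python sketch
that counts ID values across both attributes. It assumes, as above, that
a plain id is treated as an ID in the XML sense. The name
duplicate_id_values is invented for this example, and the sketch does not
perform the value normalization an xml:id processor would.

```python
from collections import Counter
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def duplicate_id_values(xml_text):
    """Return ID values that occur more than once, counting id and xml:id.

    Assumption for illustration: both attributes are of type ID.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for el in root.iter():
        for name in ("id", XML_ID):
            if name in el.attrib:
                counts[el.attrib[name]] += 1
    return sorted(value for value, n in counts.items() if n > 1)
```

Under that assumption, a single element written as <p id="x" xml:id="x"/>
already yields the duplicate "x", which is the violation described above.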

Finally, as the ultimate ID nitpicking, the spec should state that it  
is naughty of authors to turn attributes other than id and xml:id  
into IDs via the DTD. (Well, using a DTD at all is naughty. :-)

- -

Test case: http://hsivonen.iki.fi/test/wa10/adhoc/id.html
The script tries every id with a whitespaceless value to see if  
whitespace is trimmed before ID assignment.
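
The logic of such a check can be modelled roughly as follows. This is not
the script at the URL above, just a Python sketch of the idea. The
function name run_case and the two assignment policies are assumptions of
this example, and Python's str.strip() stands in for whatever trimming a
UA might apply.

```python
def run_case(raw_id, assign):
    """Model one test case.

    raw_id is the attribute value as written; assign is a hypothetical
    policy mapping that value to the ID the UA actually registers. The
    case passes when the registered ID equals the whitespace-less value.
    """
    registered = assign(raw_id)
    return "PASS" if registered == raw_id.strip() else "FAIL"


def no_trimming(value):
    # Policy A (assumed): register the value exactly as written.
    return value


def trimming(value):
    # Policy B (assumed): strip leading and trailing whitespace first.
    return value.strip()
```

In this model, run_case(" c ", no_trimming) gives "FAIL" and
run_case(" c ", trimming) gives "PASS", while a value such as "a" passes
under either policy.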

Firefox:

id='a' PASS
id='2' PASS
id='<' PASS
id=',' PASS
id='ä' PASS
id=' c ' FAIL
id='\nd\n' PASS
id='\t\te\t\t' PASS
id='
f
' PASS

Opera (weekly build 3312; note that Opera recently changed its  
behavior to match the others with id=' c '):

id='a' PASS
id='2' PASS
id='<' PASS
id=',' PASS
id='ä' PASS
id=' c ' FAIL
id='\nd\n' PASS
id='\t\te\t\t' PASS
id='
f
' FAIL

Safari and IE 6:

id='a' PASS
id='2' PASS
id='<' PASS
id=',' PASS
id='ä' PASS
id=' c ' FAIL
id='\nd\n' FAIL
id='\t\te\t\t' FAIL
id='
f
' FAIL

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/

Received on Sunday, 2 April 2006 03:58:46 UTC