Re: DOM XPath needs adjusting to work with HTML 5

On Apr 7, 2009, at 02:17, Jonas Sicking wrote:

> On Mon, Apr 6, 2009 at 4:51 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
>> A successor of DOM Level 3 XPath (and, I presume, any spec specifying a JS
>> API for initiating a transform on a DOM tree that might be created by
>> parsing from text/html) needs to specify that during XPath evaluation an
>> implementation must consider a name expression to match in the following
>> case in addition to the cases where it is already specified to match:
>>  * The name expression has no namespace.
>> AND
>>  * The name expression has local name l.
>> AND
>>  * The expression is being tested against an element node.
>> AND
>>  * The element node has local name l.
>
> this is a case insensitive comparison, right?

The comparison is case-sensitive in Gecko and WebKit.  Gecko  
lowercases local names in the expression when compiling the  
expression. WebKit doesn't. Opera seems to compile the original case  
and the lower case into the expression and evaluates to true if either  
matches.

Thus, only expressions written in lower case interoperate.

In the SVG-in-text/html world, the Gecko approach won't work for  
camelCase names. The WebKit approach is simpler and violates the XPath  
specs less than the Opera approach, so if WebKit can get away with its  
current behavior considering existing content, I think the WebKit  
behavior is preferable to the Opera behavior, even though something  
very similar to the Opera behavior has been contemplated for Selectors  
on public-html.
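
To make the difference concrete, here is a rough Python model of the three behaviors as I described them above. It is only a sketch of the name comparison, not engine code: the function names are made up for illustration, and it assumes the local name in the tree is compared as stored (lower case for HTML elements, camelCase for names such as foreignObject).

```python
from typing import Callable, Dict


def gecko_matches(expr_name: str, local_name: str) -> bool:
    # Gecko as described above: lowercase the name when compiling the
    # expression, then compare case-sensitively.
    return expr_name.lower() == local_name


def webkit_matches(expr_name: str, local_name: str) -> bool:
    # WebKit as described above: plain case-sensitive comparison.
    return expr_name == local_name


def opera_matches(expr_name: str, local_name: str) -> bool:
    # Opera as described above: keep both the original and the lowercased
    # name, and match if either one equals the local name.
    return local_name in (expr_name, expr_name.lower())


# Hypothetical registry, used only to run the three models side by side.
ENGINES: Dict[str, Callable[[str, str], bool]] = {
    "gecko": gecko_matches,
    "webkit": webkit_matches,
    "opera": opera_matches,
}


def compare(expr_name: str, local_name: str) -> Dict[str, bool]:
    """Return each modeled engine's verdict for one name test."""
    return {name: fn(expr_name, local_name) for name, fn in ENGINES.items()}


def interoperates(expr_name: str, local_name: str) -> bool:
    """True when all three modeled engines agree."""
    return len(set(compare(expr_name, local_name).values())) == 1


if __name__ == "__main__":
    print(compare("div", "div"))
    print(compare("DIV", "div"))
    print(compare("foreignObject", "foreignObject"))
```

In this model, "div" against a "div" element matches everywhere, "DIV" matches only in the Gecko and Opera models, and "foreignObject" fails only in the Gecko model because the expression name is lowercased before comparison.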

>> AND
>>  * The element node has namespace http://www.w3.org/1999/xhtml
>> AND
>>  * The owner document of the element node is an "HTML document" as defined
>> in HTML 5:
>> http://www.whatwg.org/specs/web-apps/current-work/#html-documents
>
> Otherwise it sounds fine to me. We already do something very similar
> in firefox where we do case insensitive comparisons for HTML elements.

A slight modification could be better aligned with XPath 2.0: letting  
no namespace in expressions match only the http://www.w3.org/1999/xhtml  
namespace in the tree (i.e. not letting it match no-namespace nodes as well).  
See
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6777#c23

However, Alexey Proskuryakov considered the current WebKit approach  
preferable on IRC.
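
The two candidate rules can be sketched side by side in Python. This is an illustration under assumptions, not spec text: ElementNode and both function names are hypothetical, None stands for "no namespace", and the second function assumes the XPath 2.0-style variant would apply only to nodes in HTML documents.

```python
from dataclasses import dataclass
from typing import Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"


@dataclass(frozen=True)
class ElementNode:
    """Hypothetical stand-in for a DOM element; None means no namespace."""
    namespace: Optional[str]
    local_name: str
    in_html_document: bool


def matches_proposed(expr_ns: Optional[str], expr_local: str,
                     node: ElementNode) -> bool:
    # Existing rule: the expanded names are equal.
    if expr_ns == node.namespace and expr_local == node.local_name:
        return True
    # Additional case from the proposal: a no-namespace name test also
    # matches an XHTML-namespace element whose owner document is an
    # HTML document. The local name comparison stays case-sensitive.
    return (
        expr_ns is None
        and node.namespace == XHTML_NS
        and node.in_html_document
        and expr_local == node.local_name
    )


def matches_xpath2_aligned(expr_ns: Optional[str], expr_local: str,
                           node: ElementNode) -> bool:
    # Variant (assumed scope: HTML documents only): a no-namespace name
    # test matches XHTML-namespace elements and nothing else, so it no
    # longer matches no-namespace nodes.
    if expr_ns is None and node.in_html_document:
        return node.namespace == XHTML_NS and expr_local == node.local_name
    return expr_ns == node.namespace and expr_local == node.local_name


if __name__ == "__main__":
    html_div = ElementNode(XHTML_NS, "div", True)
    no_ns_div = ElementNode(None, "div", True)
    print(matches_proposed(None, "div", html_div))
    print(matches_proposed(None, "div", no_ns_div))
    print(matches_xpath2_aligned(None, "div", no_ns_div))
```

The only place the two sketches disagree is a no-namespace element inside an HTML document: the first rule still matches it, the second does not.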

Anyway, this raises the question of whether it should be possible to  
bind a prefix to no namespace. Gecko allows this but WebKit and Opera  
don't.

> Additionally, you need to define if the name() and local-name()
> functions should return upper or lower case strings. IMHO they should
> follow what .name and .localName returns.


Both return lower-case strings in all of Gecko, WebKit and Opera, so  
I think it's best not to change this.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Tuesday, 7 April 2009 15:50:47 UTC