Re: ID Characters (was: Re: 3.4. Global attributes)

On Jul 31, 2007, at 5:00 PM, Jim Jewett wrote:

>
> In http://lists.w3.org/Archives/Public/public-html/2007Jul/1261.html,
>
> [ valid ID characters are ]
>>>> Any character minus space characters.
>
> [so you can have an id of "1" or "$^&"]
>
> Sander Tekelenburg asked
>>> How do existing (pre-HTML5) UAs handle this?
>
> Anne van Kesteren wrote:
>> Anyway, they handle it fine. In CSS you might
>> have to escape certain characters because the IDENT
>> production does not always allow them to occur literally.
>
> That is an important enough limit that it should probably be included
> in the good authoring advice, even if not in the actual grammar.
> Perhaps something like:
>
> Authors wishing to write robust applications are advised to use a more
> restricted set of IDs.  While "1" and "$^&" are technically valid
> identifiers, they will trigger bugs in some tools.  Therefore, authors
> SHOULD stick to ID characters from the ASCII digits [0-9] and one case
> of ASCII letters (either [a-z] or [A-Z]), and SHOULD ensure that the
> first character of each ID is a letter rather than a digit.
>
> This probably applies to the name attribute as well.
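
For illustration, the quoted advice could be checked mechanically. This is
only a minimal sketch in Python, assuming the rule is read literally: a
leading ASCII letter, then ASCII letters of that same case or digits. The
name is_conservative_id is hypothetical and not part of any spec or tool.

```python
import re

# Jim's suggested authoring subset, read literally (an assumption):
# either all-lowercase or all-uppercase ASCII letters plus digits,
# and the first character must be a letter.
_CONSERVATIVE_ID = re.compile(r"[a-z][a-z0-9]*|[A-Z][A-Z0-9]*")


def is_conservative_id(value: str) -> bool:
    """Return True if value fits the restricted ID subset quoted above."""
    return _CONSERVATIVE_ID.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["1", "$^&", "nav2", "NAV2", "Nav2"]:
        print(repr(candidate), is_conservative_id(candidate))
```

Both "1" and "$^&" fail this check, as does a mixed-case ID such as "Nav2".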

I am a bit concerned about XML compatibility. Allowing IDs that are more
permissive than XML's makes conversion to XML (or manipulation or
embedding within XML) more difficult. I don't see that we gain much by
permitting authors to use these extra start characters.

However, I don't think we should restrict IDs to ASCII either
(perhaps you meant Unicode letters and digits, etc.). Following the
same rules as the XML Name production would make a lot of sense
here [1][2].
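
To make the XML rule concrete, here is a rough Python sketch of a Name
check. It assumes the NameStartChar and NameChar ranges as they appear in
XML 1.0 (Fifth Edition); earlier editions define Name through longer
Letter, Digit, CombiningChar and Extender tables, so results can differ
for some characters. The name is_xml_name is hypothetical.

```python
import re

# NameStartChar ranges, assumed from XML 1.0 (Fifth Edition).
_NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF"
    "\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)

# NameChar adds "-", ".", ASCII digits, U+00B7 and two combining ranges.
_NAME_CHAR = _NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

_XML_NAME = re.compile("[" + _NAME_START + "][" + _NAME_CHAR + "]*")


def is_xml_name(value: str) -> bool:
    """Return True if value matches the XML Name production (see above)."""
    return _XML_NAME.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["1", "$^&", "header", "_x-1.y", "\u00e91"]:
        print(repr(candidate), is_xml_name(candidate))
```

Under these assumptions "1" and "$^&" are rejected, while a non-ASCII ID
that starts with a letter such as U+00E9 is accepted.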

Take care,
Rob

[1]: <http://www.w3.org/TR/xml/#sec-common-syn>
[2]: <http://www.w3.org/TR/xml/#sec-entexpand>

Received on Tuesday, 31 July 2007 22:47:24 UTC