19030 – [blocked on sandbox="" implementations] ID scoping for content aggregators (<iframe doc="">)

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.


Status:	RESOLVED LATER

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	All All

Importance:	P4 enhancement
Target Milestone:	---
Assignee:	This bug has no owner yet - up for the taking
QA Contact:	HTML WG Bugzilla archive list

URL:	http://esw.w3.org/topic/HTML/IdAndTypeID
Whiteboard:
Keywords:	NoReply

Depends on:
Blocks:

Reported:	2012-09-25 21:56 UTC by contributor
Modified:	2012-09-25 22:27 UTC
CC List:	3 users

See Also:

Attachments

Description contributor 2012-09-25 21:56:26 UTC

This was cloned from bug 5772 as part of operation LATER convergence.
Originally filed: 2008-06-19 10:32:00 +0000
Original reporter: Rob Burns <rob@robburns.com>

================================================================================
 #0   Rob Burns                                       2008-06-19 10:32:23 +0000 
--------------------------------------------------------------------------------
* The goals of maintaining document-wide uniqueness for IDs and also achieving ID persistence are at odds with one another (for example consider pasting content, and other aggregation of content)
 * Authors may want to uniquely identify an element without desiring the strictness of the ID data type
 * The xml:id attribute already provides an attribute taking a value of type ID and only one such attribute is permitted in the XML serialization of HTML5 (though it could potentially be used in both serializations)
 * Authors want consistency between the text/html serialization and the XML serialization of HTML5
 * Authors and authoring tools already produce documents without carefully checking that ID values, or the values of type ID attributes on each element, are unique within the document (even within XHTML and XML documents), so HTML5 should clearly define interoperable norms for matching such errant IDs, including matching for:
    - the CSS id/ID selector.
    - DOM id related methods (getElementById).
 * Authors want a way to aggregate content such as articles from different sites or articles from the same site in a way that does not cause ID collisions nor id attribute collisions.
 * Authors want to mix arbitrary vocabularies in compound documents that make use of xml:id as a cross-vocabulary/cross-namespace ID-valued attribute.
 * While including ids in hand-coded HTML can be cumbersome and therefore authors are likely to include them only for some direct and immediate need, authoring tools can easily add auto-generated id values to elements or elements of a particular type. However to ensure document-wide uniqueness, such id values may end up being difficult to read and type accurately, difficult to distinguish, and long.
 * The usage of xml:id is still in its infancy and now is the time to shape its usage and treatment by common UAs.
 * Unique id attribute value violations are commonplace and so clear interoperable processing guidance needs to be provided when id attribute value collisions do occur

(see http://esw.w3.org/topic/HTML/IdAndTypeID for evolving solution proposals)

Also related to improved fragment identifiers (http://www.w3.org/Bugs/Public/show_bug.cgi?id=5744)

[authoring issue, editing implementation issue, added implementation]
================================================================================
 #1   Ian 'Hixie' Hickson                             2008-06-19 22:00:27 +0000 
--------------------------------------------------------------------------------
So to confirm, you are just requesting that the restriction on ID attributes that they have a document-unique value be removed? Or is there more to the request?
================================================================================
 #2   Rob Burns                                       2008-06-19 22:40:37 +0000 
--------------------------------------------------------------------------------
No, I don't think changing the meaning of the ID data type would be wise (since it's used in so many other recommendations and would only create confusion for those working with various recommendations). Using xml:id for ID values in both serializations would provide authors with a familiar pattern (xml for strict fatal error handling with @id for another more permissive data type and perhaps some error recovery). Therefore this would mean introducing a new data type for the @id attribute, although it is a data type that would be a generalization of, yet compatible with, the ID data type. Likewise it would be compatible with existing content usage of the id attribute.

Finally, at the very least, this bug suggests the need for interoperable error handling for document conformance errors with the id attribute (and xml:id perhaps too).
================================================================================
 #3   Ian 'Hixie' Hickson                             2008-06-19 22:54:32 +0000 
--------------------------------------------------------------------------------
Ok I'm very confused. What exactly are you proposing? Or are you just highlighting a problem? I don't understand your comments so far in terms of what you want changed.
================================================================================
 #4   Lachlan Hunt                                    2008-06-20 14:28:57 +0000 
--------------------------------------------------------------------------------
Rob, we cannot relax the uniqueness requirement for id attributes because it defeats the entire purpose for which they were designed.  Although we do need to define how to handle duplicate ids, making duplicate ids conforming is not a sensible thing to do.

The getElementById() method is designed for returning only a single element.  This makes it significantly harder to work with those elements in the DOM.  Additionally, fragment identifiers in URIs can only point to one element in a document, which makes duplicate ids rather useless from an end user perspective.

(In reply to comment #0)
> * The goals of maintaining document-wide uniqueness for IDs and also achieving
> ID persistence are at odds with one another (for example consider pasting
> content, and other aggregation of content)

When it comes to content aggregation, why is id persistence considered a goal?

>  * Authors may want to uniquely identify an element without desiring the
> strictness of the ID data type

That does not make sense. How can you uniquely identify something without enforcing the uniqueness requirement?  Additionally, why is the ID data type relevant in the context of HTML 5?  HTML 5 is not defined in terms of a schema language or DTD, and in particular, the definition of the id attribute makes no reference to being of type ID.

>  * The xml:id attribute already provides an attribute taking a value of type ID
> and only one such attribute is permitted in the XML serialization of HTML5
> (though it could potentially be used in both serializations)
>  * Authors want consistency between the text/html serialization and the XML
> serialization of HTML5

Those two points seem contradictory. Since xml:id cannot be used in the HTML serialisation, but id can be used in both, it seems the best way to achieve consistency is to continue using the id attribute as a unique identifier in both.  We should not add xml:id to the HTML serialisation because it's not backwards compatible, and it doesn't even have wide support in browsers for XHTML.  xml:id is only ever useful for cases where generic XML processing is used and ID matching is required, but where there is no knowledge of HTML semantics within the tool.  Note that such cases are rare.

>  * Authors and authoring tools already produce documents without carefully
> checking the uniqueness of ID values (even within XHTML and XML documents)
> within the document of the uniqueness of type ID attributes on each element, so
> HTML5 should clearly define interoperable norms for matching such errant IDs,
> including matching for:
>     - the CSS id/ID selector.
>     - DOM id related methods (getElementById).

Authoring tools that produce such documents are non-conforming.  However, I agree that error handling needs to be defined.  As far as the ID selector is concerned, Selectors already defines this adequately (#foo matches all elements with an ID of foo, regardless of uniqueness).  It is the responsibility of the DOM specification to define how getElementById deals with duplicate IDs, though it is a well known issue that it currently doesn't.  However, we currently lack the necessary resources to rewrite the DOM spec and fix all of its issues, but this is an issue for the WebApps WG, not the HTML WG.
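
The asymmetry described here can be illustrated with a small Python sketch. It uses the standard library's html.parser rather than a real DOM, and the helper names (IdCollector, select_by_id, get_element_by_id) are hypothetical. The selector-style lookup returns every element carrying the id; the getElementById-style lookup has to pick one, and returning the first match in document order is an assumption made for illustration, since the comment above notes that the DOM specification did not define this case.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Record (tag, id) pairs in document order for elements that have an id."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.found.append((tag, value))
                break  # only the first id attribute on an element counts here


def select_by_id(markup, wanted):
    """Mimic the ID selector (#wanted): every element whose id equals `wanted`."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return [tag for tag, value in collector.found if value == wanted]


def get_element_by_id(markup, wanted):
    """ASSUMPTION: return the first match in document order, or None."""
    matches = select_by_id(markup, wanted)
    return matches[0] if matches else None


doc = '<div id="foo"></div><p id="bar"></p><span id="foo"></span>'
print(select_by_id(doc, "foo"))       # both elements with the duplicated id
print(get_element_by_id(doc, "foo"))  # only one of them
```

With a duplicated id, a stylesheet rule still applies to both elements, while a script that asks for the element by id silently gets just one of them.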

>  * Authors want a way to aggregate content such as articles from different
> sites or articles from the same site in a way that does not cause ID collisions
> nor id attribute collisions.

This is a valid use case, but I believe the appropriate solution lies with the authoring tools and CMSs that perform the aggregation, to check for id attributes and either strip or modify duplicates.
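
As a rough sketch of the check an aggregating tool would need before it could strip or modify anything, the following Python finds ids that collide once several fragments are combined. The names IdLister, ids_in and colliding_ids are hypothetical, and the sketch covers detection only; what the tool then does with a collision is left open.

```python
from collections import Counter
from html.parser import HTMLParser


class IdLister(HTMLParser):
    """Collect the id value of every element in a fragment."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)
                break  # ignore any repeated id attribute on the same element


def ids_in(fragment):
    """Return the ids used in one markup fragment, in document order."""
    lister = IdLister()
    lister.feed(fragment)
    lister.close()
    return lister.ids


def colliding_ids(fragments):
    """Return, sorted, the ids that occur more than once across all fragments."""
    counts = Counter()
    for fragment in fragments:
        counts.update(ids_in(fragment))
    return sorted(value for value, n in counts.items() if n > 1)


fragments = [
    '<article id="post"><h1 id="title">A</h1></article>',
    '<article id="post"><h1 id="headline">B</h1></article>',
]
print(colliding_ids(fragments))
```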

>  * Authors want to mix arbitrary vocabularies in compound documents that make
> use of xml:id as a cross-vocabulary/cross-namespace ID-valued attribute.

I'm not sure how that is relevant to your proposal.  I'm also not sure what the problem is solved by that solution, nor what evidence you have to support your claim that authors want it.  Since (X)HTML, SVG and MathML all have their own id attributes, why would xml:id be at all useful for such documents?

>  * While including ids in hand-coded HTML can be cumbersome and therefore
> authors are likely to include them only for some direct and immediate need,
> authoring tools can easily add auto-generated id values to elements or elements
> of a particular type. However to ensure document-wide uniqueness, such id
> values may end up being difficult to read and type accurately, difficult to
> distinguish, and long.

I'm not sure how your proposal addresses this issue, and it's not clear whether or not this is a real problem to solve.

>  * The usage of xml:id is still in its infancy and now is the time to shape its
> usage and treatment by common UAs.

Again, I'm not sure how that is relevant or what you mean by it.

>  * Unique id attribute value violations are commonplace and so clear
> interoperable processing guidance needs to be provided when id attribute value
> collisions do occur

Agreed, and aside from the aforementioned getElementById issue, which is outside the scope of HTML5, that is already the case.
================================================================================
 #5   Rob Burns                                       2008-06-20 20:26:18 +0000 
--------------------------------------------------------------------------------
(In reply to comment #4)
> Rob, we cannot relax the uniqueness requirement for id attributes because it
> defeats the entire purpose for which they were designed.  

Nothing I'm proposing defeats the purpose for which id attributes were designed. Quite the contrary. What I'm proposing enhances what id attributes were designed to do.

> Although we do need
> to define how to handle duplicate ids, making duplicate ids conforming is not a
> sensible thing to do.

What I'm suggesting is changing the rules for what a duplicate id means and I'm not suggesting we make duplicate ids conforming.
 
> The getElementById() method is designed for returning only a single element. 
> This makes it significantly harder to work with those elements in the DOM. 
> Additionally, fragment identifiers in URIs can only point to one element in a
> document, which makes duplicate ids rather useless from an end user
> perspective.

Not if we specify well-defined behavior.

> 
> (In reply to comment #0)
> > * The goals of maintaining document-wide uniqueness for IDs and also achieving
> > ID persistence are at odds with one another (for example consider pasting
> > content, and other aggregation of content)
> 
> When it comes to content aggregation, why is id persistence considered a goal?

Id persistence is a goal for all of the reasons you already mention and mention below as well: without persistence there's no way for getElementById or CSS ID selectors to work.

> >  * Authors may want to uniquely identify an element without desiring the
> > strictness of the ID data type
> 
> That does not make sense. How can you uniquely identify something without
> enforcing the uniqueness requirement?  Additionally, why is the ID data type
> relevant in the context of HTML 5?  HTML 5 is not defined in terms of a schema
> language or DTD, and in particular, the definition of the id attribute makes no
> reference to being of type ID.

You get too hung up on DTDs. The id attribute takes a value of a particular type. It's not a date, it's not a stylesheet, it's not a length. The fact that the draft does not yet address what data type it is doesn't somehow mean that the attribute takes a value of no particular data type.

> 
> >  * The xml:id attribute already provides an attribute taking a value of type ID
> > and only one such attribute is permitted in the XML serialization of HTML5
> > (though it could potentially be used in both serializations)
> >  * Authors want consistency between the text/html serialization and the XML
> > serialization of HTML5
> 
> Those two points seem contradictory. Since xml:id cannot be used in the HTML
> serialisation, but id can be used in both, it seems the best way to achieve
> consistency is to continue using the id attribute as a unique identifier in
> both.

It is completely false that xml:id cannot be used in the text/html serialization. We haven't specified what it means, but there is nothing about the text/html serialization that would prohibit us from doing so. Therefore it is not at all contradictory to set the following objectives:
  * make XML serialized HTML work well with other XML vocabularies by using xml:id
  * make the two serializations of HTML consistent by adopting xml:id ID values for both
  * find something useful to do with @id that is both compatible with existing content and fulfills a need authors have for uniquely identifying fragments without requiring document wide unique values.

>  We should not add xml:id to the HTML serialisation because it's not
> backwards compatible, and it doesn't even have wide support in browsers for
> XHTML.  xml:id is only ever useful for cases where generic XML processing is
> used and ID matching is required, but where there is no knowledge of HTML
> semantics within the tool.  Note that such cases are rare.

Well, for XML serialized HTML documents, and compound documents especially, it would be quite useful to make HTML fit in with other vocabularies. The backwards compatible part is a red herring here since what I'm proposing is certainly compatible with existing content and of course it is not compatible with existing UAs because it is a proposed feature enhancement.

> >  * Authors and authoring tools already produce documents without carefully
> > checking the uniqueness of ID values (even within XHTML and XML documents)
> > within the document of the uniqueness of type ID attributes on each element, so
> > HTML5 should clearly define interoperable norms for matching such errant IDs,
> > including matching for:
> >     - the CSS id/ID selector.
> >     - DOM id related methods (getElementById).
> 
> Authoring tools that produce such documents are non-conforming.  However, I
> agree that error handling needs to be defined.  As far as the ID selector is
> concerned, Selectors already defines this adequately (#foo matches all elements
> with an ID of foo, regardless of uniqueness).  It is the responsibility of the
> DOM specification to define how getElementById deals with duplicate IDs, though
> it is a well known issue that it currently doesn't.  However, we currently lack
> the necessary resources to rewrite the DOM spec and fix all of its issues, but
> this is an issue for the WebApps WG, not the HTML WG.

HTML5 could certainly specify how getElementById handles duplicate IDs for HTML. Presumably text/html would follow a more gentle error handling than XML might follow for the same errors.

> 
> >  * Authors want a way to aggregate content such as articles from different
> > sites or articles from the same site in a way that does not cause ID collisions
> > nor id attribute collisions.
> 
> This is a valid use case, but I believe the appropriate solution lies with the
> authoring tools and CMSs that perform the aggregation, to check for id
> attributes and either strip or modify duplicates.

Stripping or modifying id attribute values is not a solution to the problem. To me it is a much more radical solution to say "who cares whether the ids stay the same and remain persistent, just change them or remove them". The authoring tool would have to check all embedded and linked scripts and stylesheets to update the id values there. And depending on the nature of the authoring tool, there may be other scripts and stylesheets that are not currently attached to the document and so stripping and modifying IDs breaks the document.
 
> >  * Authors want to mix arbitrary vocabularies in compound documents that make
> > use of xml:id as a cross-vocabulary/cross-namespace ID-valued attribute.
> 
> I'm not sure how that is relevant to your proposal.  I'm also not sure what the
> problem is solved by that solution, nor what evidence you have to support your
> claim that authors want it.  Since (X)HTML, SVG and MathML all have their own
> id attributes, why would xml:id be at all useful for such documents?

Because one can use the same attribute from the same vocabulary (and namespace in this case) to get a document-wide unique handle on elements. Perhaps SVG, MathML and others could then migrate their id attribute in the same way I propose HTML5 migrate its id attribute.
  
> >  * The usage of xml:id is still in its infancy and now is the time to shape its
> > usage and treatment by common UAs.
> 
> Again, I'm not sure how that is relevant or what you mean by it.
>
It just means we have an opportunity to help shape how xml:id is handled by implementations and authored by authors. The usage of these attributes is not so set in stone as other features of HTML or XML.
 
> >  * Unique id attribute value violations are commonplace and so clear
> > interoperable processing guidance needs to be provided when id attribute value
> > collisions do occur
> 
> Agreed, and aside from the aforementioned getElementById issue, which is
> outside the scope of HTML5, that is already the case.

I do not think error handling for getElementById is outside the scope of the HTML5 recommendation. Also, we have an opportunity to specify that error handling in a way that makes the id attribute more useful than it already is.
================================================================================
 #6   Ian 'Hixie' Hickson                             2008-06-20 21:39:52 +0000 
--------------------------------------------------------------------------------
I really don't understand what the request here is.
================================================================================
 #7   Rob Burns                                       2008-06-21 17:35:56 +0000 
--------------------------------------------------------------------------------
(In reply to comment #6)
Perhaps you could say more about what specifically you don't understand. One solution to this might be to change id attribute uniqueness to be only among the same siblings and define UA processing requirements for handling ID and id attribute references (much of a possible solution is suggested in the link for this bug: http://esw.w3.org/topic/HTML/IdAndTypeID)
================================================================================
 #8   Ian 'Hixie' Hickson                             2008-06-21 22:39:23 +0000 
--------------------------------------------------------------------------------
I asked if you wanted that and you said no. I think Lachlan's comments are pretty reasonable, and argue pretty convincingly that anything that would result in making duplicate IDs allowed would be bad (certainly as an author I'd hate it if the validator didn't tell me about duplicate IDs in the spec).

I don't understand what problem it is you want to solve. To be honest I can't be more specific because there are hardly two sentences you've written in this bug so far that I can understand.
================================================================================
 #9   Rob Burns                                       2008-06-22 08:37:55 +0000 
--------------------------------------------------------------------------------
(In reply to comment #8 and comment #1)
> So to confirm, you are just requesting that the restriction on ID attributes
> that they have a document-unique value be removed? Or is there more to the
> request?

> I asked if you wanted that and you said no. I think Lachlan's comments are
> pretty reasonable, and argue pretty convincingly that anything that would
> result in making duplicate IDs allowed would be bad (certainly as an author I'd
> hate it if the validator didn't tell me about duplicate IDs in the spec).

I got thrown by your use of all caps ("ID") which by convention usually refers to the data type rather than an attribute name. What I'm suggesting is we should maintain the ID data type as is, but use that with an xml:id attribute. For the id attribute we would perhaps relax the allowable characters (if we thought that necessary), but mostly restrict uniqueness to siblings of the same parent. This means that as an author you could still make use of document-wide unique IDs or even document-wide unique id attributes if you chose to do so. However for authors who felt id persistence was more important than you or Lachlan apparently feel it is, they could make use of more persistent ids unique only among siblings. As you know Henri only added duplicate id checking to his validator (the one you use) after I raised this issue. Duplicate id checking is quite rare (Henri may be the first to do so in a mainstream conformance checker).

> I don't understand what problem it is you want to solve. To be honest I can't
> be more specific because there are hardly two sentences you've written in this
> bug so far that I can understand.

Well, I don't know how to respond to that. Perhaps you could point to the parts you think you do understand. Frankly, I would expect someone who has been involved with specs as long as you have to follow this thread better. I'm prepared to take the blame for that, but you need to give me some indication of where I need to further explain things.

As for Lachlan's response, it failed to make a cogent counter-argument to any of the issues raised in this bug.
================================================================================
 #10  Ian 'Hixie' Hickson                             2008-06-22 09:06:31 +0000 
--------------------------------------------------------------------------------
HTML5 doesn't use formal data types. There's no ID data type in HTML5. xml:id is out of scope for HTML5 (except insofar as defining behaviour in case of conflicts), it's an orthogonal XML specification.

The only characters that aren't allowed now are the space characters, and we can't allow those since that would make it impossible to refer to those IDs from space-separated lists (e.g. in headers="" attributes).
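
The constraint can be seen in a short Python sketch of how a space-separated reference list is tokenized. The function name split_on_spaces is hypothetical, and the set of space characters used (space, tab, line feed, form feed, carriage return) is stated here as an assumption about what "the space characters" covers.

```python
import re


def split_on_spaces(value):
    """Split a space-separated attribute value (e.g. headers="") into tokens."""
    return [token for token in re.split("[ \t\n\f\r]+", value) if token]


# Two references to two different ids.
print(split_on_spaces("name price"))

# If an id could contain a space, a reference to the single id "unit price"
# would tokenize exactly like references to two ids, "unit" and "price".
print(split_on_spaces("unit price"))
```

Nothing in the tokenized result distinguishes one id containing a space from two separate ids, which is why such ids could not be referred to from list-valued attributes.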

Duplicate ID checking has been supported by all DTD-based validators since the dawn of SGML, as far as I'm aware. It is anything but new.

As to Lachlan's comments not being cogent, I understood his comments. I didn't understand yours. So I couldn't say if they were cogent or not. However, I do agree with his comments.


Could you describe, in two lines, what the problem is that you want to solve?
================================================================================
 #11  Lachlan Hunt                                    2008-06-22 11:48:28 +0000 
--------------------------------------------------------------------------------
(In reply to comment #5)
> (In reply to comment #4)
> > Rob, we cannot relax the uniqueness requirement for id attributes because it
> > defeats the entire purpose for which they were designed.  
> 
> Nothing I'm proposing defeats the purpose for which id attributes were
> designed. Quite the contrary. What I'm proposing enhances what id attributes
> were designed to do.

The purpose of the id attribute was to be a document-wide unique identifier.  As I understood your proposal, you want to relax the uniqueness requirement to some extent.  I fail to see how that doesn't defeat the purpose.

> What I'm suggesting is changing the rules for what a duplicate id means and I'm
> not suggesting we make duplicate ids conforming.

Now I'm really confused.  I'm really not sure what you're proposing now, since that seems to contradict other statements you've made.

>> The getElementById() method is designed for returning only a single element. 
>> This makes it significantly harder to work with those elements in the DOM. 
>> Additionally, fragment identifiers in URIs can only point to one element in a
>> document, which makes duplicate ids rather useless from an end user
>> perspective.
> 
> Not if we specify well-defined behavior.

Specifying well-defined behaviour for gEBId, which needs to be done anyway, doesn't address the problem I raised of it still only returning a single element.  This is a problem for your proposal because if a script depends on a particular ID, and then it is duplicated elsewhere in the document, it could potentially cause it to match the wrong element.

>> When it comes to content aggregation, why is id persistence considered a goal?
> 
> Id persistence is a goal for all of the reasons you already mention and mention
> below as well: without persistence there's no way for getElementById or CSS ID
> selectors to work.

But if we look at the common aggregation use case, which is typically done for syndicating blogs (e.g. PlanetHTML5), then the aggregated content usually doesn't come with all associated stylesheets and scripts.  The styles typically come from the host document.

> The fact that the draft does not yet address what data type it is doesn't
> somehow mean that the attribute takes a value of no particular data type.

The draft adequately defines the allowed values.  It just doesn't define it in terms of a specific data type.  The nearest concept that HTML5 has to data types is the microsyntax section, which mostly deals with values that have special parsing requirements.

> It is completely false that xml:id cannot be used in the text/html
> serialization.

Please refer to the recent aria discussion about colons in attribute names and namespaces to understand why xml:id won't work well in text/html.

>>  We should not add xml:id to the HTML serialisation...
>
> Well for XML serialized HTML documents, and compound documents especially, it
> would be quite useful to make HTML fit in with other vocabularies.

Please explain why xml:id is better than simply using id, and how it makes it "fit in with other vocabularies".

> HTML5 could certainly specify how getElementById handles duplicate IDs for
> HTML. Presumably text/html would follow a more gentle error handling than XML
> might follow for the same errors.

No, because this spec doesn't define getElementById.  It is a separate specification that needs to define its own behaviour.

>>>  * Authors want to mix arbitrary vocabularies in compound documents that make
>>> use of xml:id as a cross-vocabulary/cross-namespace ID-valued attribute.
>> 
>> I'm not sure how that is relevant to your proposal.  I'm also not sure what
>> the problem is solved by that solution, nor what evidence you have to
>> support your claim that authors want it.  Since (X)HTML, SVG and MathML all
>> have their own id attributes, why would xml:id be at all useful for such
>> documents?
> 
> Because one can use the same attribute from the same vocabulary (and namespace
> in this case) to get a document-wide unique handle on elements. Perhaps SVG,
> MathML and others could then migrate their id attribute in the same way I
> propose HTML5 migrate its id attribute.

From an authoring perspective, why does the namespace help this in any way whatsoever?  For HTML, SVG and MathML, authors can already use the id attribute on all elements, use ID selectors and getElementById in the same way for each language, and things aren't unnecessarily complicated by namespaces.
================================================================================
 #12  Rob Burns                                       2008-06-25 12:33:37 +0000 
--------------------------------------------------------------------------------
(In reply to comment #11)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > Rob, we cannot relax the uniqueness requirement for id attributes because it
> > > defeats the entire purpose for which they were designed.  
> > 
> > Nothing I'm proposing defeats the purpose for which id attributes were
> > designed. Quite the contrary. What I'm proposing enhances what id attributes
> > were designed to do.
> 
> The purpose of the id attribute was to be a document-wide unique identifier. 
> As I understood your proposal, you want to relax the uniqueness requirement to
> some extent.  I fail to see how that doesn't defeat the purpose.

No, I want to strengthen uniqueness by changing the scope of the uniqueness (and also strengthen persistence). In contrast, you're the one advocating for matching multiple duplicate IDs for CSS and DOM APIs. In my proposal the likelihood of non-unique id attributes declines substantially.

> 
> > What I'm suggesting is changing the rules for what a duplicate id means and I'm
> > not suggesting we make duplicate ids conforming.
> 
> Now I'm really confused.  I'm really not sure what you're proposing now, since
> that seems to contradict other statements you've made.

Let me repeat this in capsule form:
 * authors use xml:id with an ID data type (or a value consistent with an ID data type if you don't like the use of the term "data type")
 * authors use id attributes with an IDENT data type unique among all its siblings (of course in legacy content and future content, authors can continue to take steps to ensure the id attributes are unique document-wide if that's their preferred authoring practice)
 * duplicate xml:id=ID would cause fatal errors (as much as I know you have a knee jerk reaction against that, it has its place and it reinforces the pattern already familiar to authors where xml implies fatal error)
 * duplicate id=IDENT would also be an error, but would be much less likely to occur

That's still not two lines as Hixie requested, so how about:

 * facilitating easier authoring of unique and persistent identifiers that can be scoped from scripts and stylesheets upon aggregation
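
To show what the sibling-scoped rule above would and would not flag, here is a minimal Python sketch that compares it with document-wide uniqueness. It assumes well-formed XML input so that xml.etree.ElementTree can build the tree, and the function names are hypothetical. It only checks uniqueness; how references such as fragment identifiers or getElementById would resolve under sibling scoping is not specified in this thread and is not modelled.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def document_wide_duplicates(root):
    """ids used by more than one element anywhere in the document."""
    counts = Counter(el.get("id") for el in root.iter() if el.get("id"))
    return sorted(value for value, n in counts.items() if n > 1)


def sibling_duplicates(root):
    """ids used by more than one child of the same parent element."""
    clashes = set()
    for parent in root.iter():
        counts = Counter(child.get("id") for child in parent if child.get("id"))
        clashes.update(value for value, n in counts.items() if n > 1)
    return sorted(clashes)


# Two aggregated articles that each use id="title" for their heading.
doc = ET.fromstring(
    "<body>"
    '<article id="a1"><h1 id="title">One</h1></article>'
    '<article id="a2"><h1 id="title">Two</h1></article>'
    "</body>"
)
print(document_wide_duplicates(doc))  # "title" collides document-wide
print(sibling_duplicates(doc))        # no two siblings share an id
```

Under the document-wide rule the aggregated page has a duplicate, whereas under the sibling-scoped rule it does not, because the two headings have different parents.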

> >> The getElementById() method is designed for returning only a single element. 
> >> This makes it significantly harder to work with those elements in the DOM. 
> >> Additionally, fragment identifiers in URIs can only point to one element in a
> >> document, which makes duplicate ids rather useless from an end user
> >> perspective.
> > 
> > Not if we specify well-defined behavior.
> 
> Specifying well-defined behaviour for gEBId, which needs to be done anyway,
> doesn't address the problem I raised of it still only returning a single
> element.  This is a problem for your proposal because if a script depends on a
> particular ID, and then it is duplicated elsewhere in the document, it could
> potentially cause it to match the wrong element.

getElementById should return a single element. That's the result of requiring uniqueness. If you say you're for uniqueness but you want a DOM method to return multiple duplicates you're undermining the uniqueness with confusing error handling.
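
To make the "wrong element" concern concrete, here is a small Python sketch of a single-result lookup over a document containing a duplicate id. It assumes a first-in-document-order rule, which is only one possible well-defined behavior and not something this thread settles; the helper name and the markup are hypothetical.

```python
import xml.etree.ElementTree as ET

def get_element_by_id(root, value):
    """Return the first element in document order whose id equals value, else None."""
    for el in root.iter():  # iter() walks the tree in document order
        if el.get("id") == value:
            return el
    return None

# The host page's script expects id="summary" to be the host's own paragraph,
# but pasted content earlier in the document carries the same id.
host = ET.fromstring(
    "<body>"
    '<div class="aggregated"><p id="summary">pasted content</p></div>'
    '<p id="summary">host content</p>'
    "</body>"
)

hit = get_element_by_id(host, "summary")
print(hit.text)  # pasted content
```

The lookup still returns exactly one element, as uniqueness requires, but which element it returns depends on where the aggregated content happened to land.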

> 
> >> When it comes to content aggregation, why is id persistence considered a goal?
> > 
> > Id persistence is a goal for all of the reasons you already mention and mention
> > below as well: without persistence there's no way for getElementById or CSS ID
> > selectors to work.
> 
> But if we look at the common aggregation use case, which is typically done for
> syndicating blogs (e.g. PlanetHTML5), then the aggregated content usually
> doesn't come with all associated stylesheets and scripts.  The styles typically
> come from the host document.

HTML5 proposes making styles scoped. How will a scoped stylesheet with id selectors work? You are also way too focused on traditional web browsers. Every time a user pastes content into an editor, they are aggregating content. Persistence should at least be easily achievable for the authors that want to pursue that approach. It doesn't mean you can't change your id values every day if you want; every time you get a whim to do so, go right ahead.
 
> > The fact that the draft does not yet address what data type it is doesn't
> > somehow mean that the attribute takes a value of no particular data type.
> 
> The draft adequately defines the allowed values.  It just doesn't define it in
> terms of a specific data type.  The nearest concept that HTML5 has to data
> types is the microsyntax section, which mostly deals with values that have
> special parsing requirements.

Well, perhaps id attribute values should have a little more definition too. For example, do we want to permit <form id='#'>?

> 
> > It is completely false that xml:id cannot be used in the text/html
> > serialization.
> 
> Please refer to the recent aria discussion about colons in attribute names and
> namespaces to understand why xml:id won't work well in text/html.

I'd rather not. The participants in that discussion simply added misunderstanding upon misunderstanding. What I would suggest is that xml:id be added in the xml namespace (even in the text/html serialization). However, in any event, the namespace use of a colon was designed to be completely compatible with non-namespaced processors, so the attribute name remains xml:id whether processed by a namespace-aware implementation or not.

> 
> >>  We should not add xml:id to the HTML serialisation...
> >
> > Well for XML serialized HTML documents, and compound documents especially, it
> > would be quite useful to make HTML fit in with other vocabularies.
> 
> Please explain why xml:id is better than simply using id, and how it makes it
> "fit in with other vocabularies".

Because the other vocabularies are XML, and xml:id is an attribute encouraged for use with these vocabularies. Also, using id wouldn't address this proposal, since one attribute cannot take a value that is both document-wide unique and not document-wide unique.
 
> > HTML5 could certainly specify how getElementById handles duplicate IDs for
> > HTML. Presumably text/html would follow a more gentle error handling than XML
> > might follow for the same errors.
> 
> No, because this spec doesn't define getElementsById.  It is a separate
> specification that needs to define its own behaviour.

However, HTML5 has to define what the ID data type is for it to work with that method. The method getElementById (not getElementsById) is not asking for the element that has an attribute called id with the value passed as an argument. The method asks for the element with an ID value (whatever attribute carries those values) that matches the argument. That could be <element lachy='value'>. It is up to the document specification to determine where an element ID comes from.

So for an HTML5 document with two elements both sharing the same value of the id attribute it is certainly within our scope (or even required) that we specify which one of those values is an ID and which one is not (or that they both are if that's what you're saying you prefer).
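
A minimal Python sketch of that distinction: the lookup below does not hard-code an attribute called id. Instead, the caller (standing in for the document specification) says which attributes carry ID values. The helper name, its id_attributes parameter, and the sample markup are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def get_element_by_id(root, value, id_attributes):
    """Return the first element carrying `value` in any attribute that the
    document language declares to be ID-typed, else None."""
    for el in root.iter():
        for name in id_attributes:
            if el.get(name) == value:
                return el
    return None

doc = ET.fromstring('<root><a id="x"/><b xml:id="x"/><c lachy="y"/></root>')

# The same method, with the document language deciding what counts as an ID.
print(get_element_by_id(doc, "x", ("id",)).tag)     # a
print(get_element_by_id(doc, "x", (XML_ID,)).tag)   # b
print(get_element_by_id(doc, "y", ("lachy",)).tag)  # c
print(get_element_by_id(doc, "y", ("id",)))         # None
```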

> 
> >>>  * Authors want to mix arbitrary vocabularies in compound documents that make
> >>> use of xml:id as a cross-vocabulary/cross-namespace ID-valued attribute.
> >> 
> >> I'm not sure how that is relevant to your proposal.  I'm also not sure what
> >> the problem is solved by that solution, nor what evidence you have to
> >> support your claim that authors want it.  Since (X)HTML, SVG and MathML all
> >> have their own id attributes, why would xml:id be at all useful for such
> >> documents?
> > 
> > Because one can use the same attribute from the same vocabulary (and namespace
> > in this case) to get a document-wide unique handle on elements. Perhaps SVG,
> > MathML and others could then migrate their id attribute in the same way I
> > propose HTML5 migrate its id attribute.
> 
> From an authoring perspective, why does the namespace help this in any way
> whatsoever?  For HTML, SVG and MathML, authors can already use the id attribute
> on all elements, use ID selectors and getElementById in the same way for each
> language, and things aren't unnecessarily complicated by namespaces.

Again, you're missing what this bug is about. The id attribute cannot be two things at once. xml:id makes sense to provide a more strict attribute for ID values. I'd be fine if you want to call the new attribute lachy. However, as an aside, I still don't understand this irrational distaste for namespaces.
================================================================================
 #13  Ian 'Hixie' Hickson                             2008-06-25 19:48:31 +0000 
--------------------------------------------------------------------------------
Ah! ID scoping! Man I wish you'd said it in just two lines from the start.

ID scoping for aggregating content will probably be addressed using the doc="" idea on iframes.
================================================================================
 #14  Ian 'Hixie' Hickson                             2009-08-08 01:22:23 +0000 
--------------------------------------------------------------------------------
Marking LATER since we're still waiting for sandbox implementations.
================================================================================
 #15  Maciej Stachowiak                               2010-03-14 13:14:21 +0000 
--------------------------------------------------------------------------------
This bug predates the HTML Working Group Decision Policy.

If you are satisfied with the resolution of this bug, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

This bug is now being moved to VERIFIED. Please respond within two weeks. If this bug is not closed, reopened or escalated within two weeks, it may be marked as NoReply and will no longer be considered a pending comment.
================================================================================