W3C

Results of Questionnaire ISSUE-76 - Microdata - Straw Poll for Objections

The results of this questionnaire are available to anybody.

This questionnaire was open from 2009-12-10 to 2009-12-17.

28 answers have been received.

1. Objections to the Change Proposal to Split Microdata

We have a Change Proposal to Separate Microdata from the HTML5 Specification. If you have strong objections to adopting this Change Proposal, please state your objections below.

Keep in mind, you must actually state an objection, not merely cite someone else. If you feel that your objection has already been adequately addressed by someone else, then it is not necessary to repeat it.

Details

Responder | Objections to the Change Proposal to Split Microdata | Rationale
Jirka Kosek
Leonard Rosenthol
Roy Fielding
Jonas Sicking The microformat community has shown that there is a demand for embedding machine-readable metadata into HTML documents. However, it has also shown that the current extension mechanisms available in HTML 4 are lacking (clamping on the author namespace in classnames, lack of uniform parsing model).

It would be great if HTML 5 had an explicit mechanism to embed such metadata. In order to make this mechanism the best it can be, it should be developed together with the rest of HTML 5. If we separate the microdata spec into a separate specification, we run the risk that one spec develops faster than the other, thus possibly putting us in a situation where changes to one spec require changes to the other, but where this is not possible because the other spec has progressed too far.

Put another way, if problems are found with the microdata mechanism during review or implementation, then that should hold up the HTML 5 specification. This is in order to ensure that HTML 5 is released with a good mechanism for embedding metadata.
Ben Adida
Joe Williams
Matthew May
Mark Birbeck
Tim van Oostrom A techspec such as Microdata, which is mostly inspired by existing technologies (rdf/xmdp/rdfa/microformats) but overcomes the flaws of those same technologies and does not have obvious usability issues, should be inside the HTML5 spec.
As the web is advancing and the W3C has a focus on "linked data", the HTML5 spec, a language on which billions of web applications and web documents will be built, shouldn't do without a flexible, simple, and unambiguous semantic annotation syntax like Microdata.
Attention on the main spec will be greater, so splitting it out will put less focus on machine/human readable semantic data in future web applications, and thus leave less chance of success for a "linked data" web.

Rob Ennals Here are some arguments in favor of keeping Microdata in HTML5.

These are all my own personal opinions, rather than an official Intel position.


== Being HTML-specific is good ==

One of the strengths of Microdata is that it has been designed specifically to allow semantic information to be added to HTML documents, rather than to arbitrary XML documents.

There is a tradeoff between making something generic enough that it can be used across a wide range of document types, and focussed enough that it can work really well for one particular document type. The number of authors writing HTML documents is significantly larger than the number of authors writing other kinds of XML-like documents, and so I believe it is worth having a metadata format that is specific to HTML.

If Microdata were spun into a separate spec then there would be pressure to make it also work for other kinds of XML document. I think that that would almost inevitably lead to a spec that worked less well for HTML, which would be a great shame.


== Being part of the standard will attract more attention ==

While it is doubtless true that most authors don't read specs directly, a reasonable number do, and a lot of people read books by people who have read the spec and want to cover "everything in the spec".

If Microdata is left out of the HTML5 spec then it will almost certainly have less attention paid to it and be less widely adopted.

Even if it is decided that Microdata must be developed in a separate spec, I believe that the informative section describing use of Microdata in HTML should remain in the HTML5 spec.


== Competition between RDFa and Microdata isn't a big deal ==

One of the most frequent complaints about Microdata is that it is a rival to the existing RDFa standard. I do not believe that this is a good reason to remove Microdata from HTML5.

If RDFa were widely adopted on the web, and there were a risk of having competing standards, then that might be an argument against Microdata. However, RDFa has not been widely adopted, and if Microdata is part of the HTML5 spec then it will be quite clear that there is only one preferred standard for semantic markup in HTML documents.

If the need for some people to write parsers for both RDFa and Microdata were a significant technical barrier, then that might be an argument against Microdata. However, the likely state of affairs is that Microdata will only be used for HTML and RDFa will only be used for other XML formats. If you already need to support RDFa and HTML then the burden of writing parsers for RDFa and HTML dwarfs the cost of writing a Microdata parser for HTML.


== Modularity is not always good ==

Sometimes it makes sense to pull things out into separate specs, but sometimes it doesn't.

The advantage of having separate specs is that it removes the need for multiple specs X and Y that both need a feature F to waste time re-inventing the same feature, and it allows a vendor to implement feature F once and apply it to specs X and Y. If a spec is complex to describe and implement and is used in essentially the same way by multiple other specs, then it makes a lot of sense for it to be a separate spec. For example it would make no sense for JPEG to be part of HTML5 since JPEG is hard to implement, its interface to HTML is very simple, and it is used in essentially the same way in HTML5 as it is everywhere else.

The disadvantage of making feature F a separate spec is that it encourages the spec to cater less well to the specific needs of spec X, it encourages it to include confusing features required by Y that X doesn't need, it introduces a potentially complex boundary interface between the two specs, and it introduces an extra layer of indirection that can make it hard for a user of spec X to know where different things are defined. For example, it would make little sense for <form> to be pulled into a separate spec.

There is of course the middle ground of a spec F that is split into a separate document merely for the purposes of making editing more convenient, but which is explicitly included as part of another spec X and is not expected to be generalized to provide for the needs of another spec Y. This approach can be an appropriate substitute for having F be part of spec X, provided that it is explicit in the charter for spec F that its primary purpose is to provide a feature to be used by spec X, and that it should avoid generalizing for other domains unless there is no cost for spec X.

Graham Klyne
Larry Masinter This isn't an objection, but it is a comment and request for refinement. The change proposal says:

"The change details of this proposal would require removing all language discussing Microdata from the HTML5 specification. "

and that doing so took 8-10 hours. However, I haven't found a document which purports to be the HTML 5 specification without any mention of microdata, and I'd like the opportunity to verify the extent to which Microdata and all of the things that were added in order to support Microdata (data types and vocabularies that are not otherwise used) are actually removed from the specification.

While I support removing Microdata and all of the microdata vocabularies from HTML5 (see submitted "Objections to Change Proposal to KEEP Microdata") I'd like to make sure that the job is done completely.
Jace Voracek
Leif Halvard Silli
Sam Johnston
Philip Jägenstedt Microdata, if it should exist at all, is as much a part of HTML as <time> or class="". It adds HTML attributes, adds methods/properties to HTMLDocument/HTMLElement, adds HTMLPropertiesCollection, etc... The proposal to split it into a separate spec is clearly not technically motivated, but rather intended to "level the playing field" with RDFa. If RDFa is taken out of consideration, the arguments given make very little sense.

Any new part of HTML5 may fail in the marketplace. That is not a good reason to publish every feature of HTML5 as a separate spec in order to make it easier to remove in the future. A failed feature must still be maintained by those UAs which have implemented it; which spec it is in makes no difference whatsoever. If we think that Microdata will fail we should either improve it or drop it completely if it is hopelessly broken (clearly not the case).

Independent evolution from HTML5 is no more relevant for Microdata than it is e.g. for <video>. In fact we know that <video> will be changing before, during and after HTML5 goes to REC, but still no one is arguing for splitting it into a separate specification. Microdata is arguably much closer to completion than <video>, unless major issues come up during implementation.

It is true that some of Microdata's attributes, DOM APIs and processing rules could be reused in other languages, but not without changing those languages (because, like in RDFa, the attributes aren't prefixed). The proper solution would be to split out some core Microdata concepts into another spec, referencing that from HTML5 and defining the HTML-specific parts in HTML5. That includes at least the behavior of itemValue (which depends on <a>, <img>, <time>, textContent, etc), HTMLPropertiesCollection and the processing rules for converting an HTML document to JSON, etc.
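As an editorial illustration of the point that a property's value depends on the element carrying it, here is a minimal Python sketch. It is not the specification's algorithm and not part of the response: the `VALUE_ATTR` table, the `ItemPropCollector` class and the `extract_props` helper are hypothetical names, the attribute choices (`href` for <a>, `src` for <img>, `datetime` for <time>, text content otherwise) are assumptions based on the elements named above, and nested properties and item grouping are deliberately ignored.

```python
from html.parser import HTMLParser

# Assumed mapping: which attribute carries the value for a given element.
# Elements not listed here fall back to their text content.
VALUE_ATTR = {"a": "href", "img": "src", "time": "datetime"}
VOID = {"img"}  # elements that never have an end tag


class ItemPropCollector(HTMLParser):
    """Collects (name, value) pairs for elements with an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.props = []    # collected (name, value) pairs, in document order
        self._open = None  # (tag, name, text chunks) while gathering text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        attr = VALUE_ATTR.get(tag)
        if attr is not None and attrs.get(attr) is not None:
            # Value comes from an element-specific attribute.
            self.props.append((name, attrs[attr]))
        elif tag not in VOID:
            # Otherwise gather text until the matching end tag.
            self._open = (tag, name, [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[0]:
            text = "".join(self._open[2]).strip()
            self.props.append((self._open[1], text))
            self._open = None


def extract_props(html):
    """Return a dict of itemprop name to value for a flat HTML fragment."""
    parser = ItemPropCollector()
    parser.feed(html)
    parser.close()
    return dict(parser.props)
```

Calling `extract_props` on a fragment such as `<a itemprop="url" href="http://example.org/">home</a>` takes the value from `href` rather than from the link text, which is the kind of HTML-specific behaviour a language-neutral core spec could not define on its own.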

The point about maturity seems no more relevant to Microdata than any other part of HTML5. For example, <time> is (AFAIK) still not implemented anywhere and will likely change as more feedback continues to come in. Still, no one is arguing for splitting <time> into a separate spec. If one is genuinely concerned about the maturity of Microdata it would be more appropriate to review the spec and leave feedback to fix anything that is broken.

I am not in a position to make any promises of implementation, but it seems very likely that if Microdata isn't killed for political reasons, UAs will implement it, as doing so is fairly easy and provides real-world benefits such as being able to import mail contacts directly from web pages.

Some comments to other responders:

@Jirka Kosek: to produce an RDF triple with any URI as the predicate using Microdata, simply use itemprop, e.g. itemprop="http://xmlns.com/foaf/0.1/name"

@Roy Fielding: Microdata and HTML are not orthogonal, see above. What specific design problems does Microdata have? In which ways is it not extensible that any other technology could be? It seems to be exactly on par with e.g. RDFa in this regard.

@Larry Masinter: Any technology for embedding metadata in HTML has to be part of HTML, for obvious reasons. RDFa also extends HTML and should replace Microdata in HTML5 if it were actually good enough.

@Mark Birbeck: Is there any specific quality problem in the Microdata spec? I have read it and implemented it, and it was quite clear and unambiguous to me. I'm sure feedback to improve the spec is very welcome.

@Sam Johnston: HTML5 does not include any Microdata vocabularies, those are already in a separate spec.
Anne van Kesteren I do not think it makes sense to split out Microdata, but not split out the media elements, form controls, etc. Microdata just as the other new features enables a specific set of use cases and is fully integrated with the HTML language, including new attributes and a DOM API.

It addresses a specific issue with HTML4 that the microformats community came across and by tying it closely to HTML it provides an easier-to-use syntax than RDFa. Reuse in other specifications without close integration would be a mistake I think as you would lose the benefits that Microdata gives.
Kai Scheppe
Julian Reschke
Karl Dubost
Laura Carlson
Henri Sivonen The Change Proposal cites "philosophically divergent communities within a larger community" as a reason to separate Microdata from the HTML5 spec. Also, the Change Proposal refers to RDFa many times positioning RDFa and Microdata in competition with each other but on different sides of the philosophical divergence.

There is precedent that goes exactly in the other direction as far as splitting and merging goes: The HTML5 section on forms in HTML5 addresses the same sphere of use cases as XForms, and the difference represents a philosophical divide within the W3C community. In the case of the forms section, it was previously a separate spec (Web Forms 2.0) and it was merged into HTML5. It seems odd to take the opposite course of action in the case of Microdata.

In the case of XForms and HTML5 forms, it seems to have been a good idea to let the divergent communities proceed in parallel instead of getting entangled in managing how the other community's specs are split or not split across documents.

(This comment doesn't make a technical argument, because I'm responding to an argument that itself isn't technical.)
Martin Kliehm
Michael Köller
Krzysztof Maczyński
Shelley Powers
Ian Hickson If the final decision is that a change should be made to the spec, I'd like to request that the chairs clearly state the complete criteria by which the section was determined to belong in its own document, so that the criteria can be consistently applied across other features that are proposed in the future.
Michael[tm] Smith

2. Objections to the Change Proposal to Keep Microdata

We have a Change Proposal to Keep Microdata in the HTML5 Specification. If you have strong objections to adopting this Change Proposal, please state your objections below.

Keep in mind, you must actually state an objection, not merely cite someone else. If you feel that your objection has already been adequately addressed by someone else, then it is not necessary to repeat it.

Details

Responder | Objections to the Change Proposal to Keep Microdata | Rationale
Jirka Kosek Microdata is a completely new proposal which is not backed by implementation in any reasonably widespread user agent. It might be that microdata is the best solution to the problem, but this should be proven by successful implementations. Thus I think it is better to split microdata out and see which technology (microformats, microdata, RDFa) gains the most uptake during the coming years.

Moreover, I think that microdata should be as compatible as possible with the Semantic Web and RDF. Although I have many personal reservations about this technology, it has been quite successfully promoted and used in the past, so W3C should try to be consistent there. And although the current microdata proposal contains an algorithm for mapping to RDF, I don't think it is complete, or that microdata allows expressing arbitrary RDF (more specifically, I haven't digested how to represent any chosen URI as a predicate).
Leonard Rosenthol I object to the inclusion of microdata in the HTML5 specification due to the fact that there are competing metadata standards that should also be viable in HTML5. By separating this out from the main specification it will enable authors to be able to choose the format(s) for the metadata without being locked in.

In addition, since metadata grammars update frequently, that specification can be updated without impact to the main HTML5 document.
Roy Fielding It boggles my mind why the chairs deem it appropriate to have a change proposal that effectively says "don't change anything". *shrug*

"All good specs which integrate with HTML5 should, ideally, be a part of HTML5." That is contrary to the principle of orthogonality, which states that orthogonal technologies must be defined in separate specifications in order to evolve independently.

"A spec that is designed within HTML5 and one designed outside of it are qualitatively different (see Conway's Law)." Conway's Law (http://www.melconway.com/law/index.html) is about designing systems, not writing specifications, and is an argument for removing Microdata because the HTML5 working group has neither the expertise nor the communication bandwidth to sufficiently evaluate and deploy a mark-up language for embedded database records.

"Microdata is not a uniquely independent technology bolted onto HTML5." Microdata has no deployed history or implementation outside the confines of the HTML5 editing group. We don't even have any concrete expressions of interest in implementing it.

"Removing sections from the HTML5 spec until they are 'mature' is not part of the development model of the HTML5 language and has never been." Which is one of the reasons that the HTML5 spec is so incredibly bloated that nobody has managed to read it in its entirety. Maybe learning from that mistake would be good.

"Microdata does not appear to be in an extreme level of flux". It's stiff, bereft of life, and if you hadn't nailed it to the perch it would be pushing up the daisies.

"The purpose of the W3C is to advance the web, not to remain neutral in technological conflicts." Removing microdata doesn't prevent the W3C from advancing the Web. It doesn't even prevent microdata from advancing on its own.

"we should as a working group give the most support to the technology we most believe should succeed in the marketplace." Some other working group can do that, if anyone actually wants Microdata to succeed. The notion that namespace prefixes are confusing to authors is a myth.

"Removing it from HTML5 would provide no benefit to authors or implementors ...". The obvious benefit would be that authors wouldn't have to read through that dreck in order to understand HTML, and implementors wouldn't have to implement it to be compliant.

"Microdata, as written, would not be reusable by other technologies even if located in a separate spec." That's because Microdata is a poor design that presumes a world that never changes after HTML5 is published. It is not extensible.

"If Microdata were to be split from the HTML spec, it is possible that control of it would move to a separate working group, which would move part of HTML's development out of the hands of the working group chartered to develop HTML." Good! Microdata is not HTML.
Jonas Sicking
Ben Adida Microdata was inserted into the HTML5 draft without consensus, by editor decree, even though an alternative and much more widely adopted specification (RDFa), would have been the natural point of collaboration.

Once microdata was forced into the spec, there was no chance to debate its merits independently of the rest of the specification.
Thus, this is a specification forced onto HTML5 users, not a specification that has gotten fair debate with pre-existing technology.

One could add RDFa into the HTML5 spec, but then why not a third and fourth approach? Instead, the right solution, at this point in time, is to split out the microdata proposal so that true consensus around metadata specs can be achieved.

Keeping microdata in the spec, because of the way the spec is evaluated as a whole, would mean that consensus has *not* been properly applied to this feature.
Joe Williams Let's have the standard just deal with recommended HTML. Already the spec is stretched in one direction by including many obsolete or never standardized HTML elements/constructions and then Microdata, along with other brand spanking new stuff, stretches it in the opposite direction by including new technology and elements (almost) never before implemented and tested.

> All good specs which integrate with HTML5 should, ideally, be a part of HTML5

I think that today this statement cannot be taken as a serious goal. Modern web multiple media can link between specs even better than we are doing now. The fact that the spec does not now run in all major browsers should be seen as something that needs fixing by fixing the spec rather than given as an offhand GBW-like warning.
Matthew May If the HTML WG should understand anything after having this issue on the front burner as long as it has, it's that we have two constituencies with valid use cases for their chosen metadata implementation, as well as limits to their own expressivity. What this proves is that HTML5 should not force a winner upon the market.

I am not opposed to microdata per se, but as the debate has shown, TIMTOWTDI. HTML should be a rich enough language to accommodate a number of syntaxes, and the spec should focus on that kind of flexibility, rather than assuming no one will come up with anything better, simpler or more useful than microdata. If that's true, then that's what people will use. But if not, HTML will be stuck with another obscure add-on.

We reached the end of polite discussion on the core of this issue long ago. The way to move forward with everyone's wits and business plans intact is to split microdata out.
Mark Birbeck The big problem for Microdata is that the proposal was politically motivated, and this has impacted on everything about this spec:

Quality: It's difficult to write a good spec when the underlying goal is simply to see off someone else's work. Microdata is a case-study in this.

Adoption: Ultimately people get excited about technology because they can see it helping them to get things done. Saying "use this because it isn't RDFa" doesn't inspire anyone.

Even when arguing for its survival, the change proposal has no technical justifications, just more bureaucratic maneuvering.

The WHATWG presented itself as a new organisation, sticking two fingers up to the apparent corporatism of the W3C. Yet at the first opportunity its members have attempted to use the very bureaucratic techniques they decry, to push their agenda -- first pushing Microdata into the spec, without any support, and now trying to keep it there, because apparently 'it won't make any difference anyway'.

Having said this, I don't think anyone comes out of this whole thing very well; if the W3C wants to stake a claim to *leading* the web to its full potential, then it should get on and lead, and stop allowing this kind of confusion to waste everyone's time. (If they'd rather hold everyone's jackets whilst others take the web forward then that's fine...just tell us, next time.)
Tim van Oostrom
Rob Ennals
Graham Klyne As a developer of experimental browser and non-browser applications using metadata and machine-processable annotations, I am concerned that introducing an HTML-specific mechanism for encoding metadata will harm interoperability, for reasons that others have noted. Tools already exist for handling RDFa and RDF/XML within browser-based applications. I think the effort expended on yet another metadata encoding would be better spent on sorting out and deploying a robust security model to allow existing Javascript tools to be used more widely.

In some of the background material around this debate, the point is made that the decision to include metadata encoding within the HTML specification is based on technical merit, not a popularity contest. This is a fair position, but I would point out that:

(a) the open standards community have shown pretty convincingly that technical merit is most effectively (if not always perfectly) determined by open debate of the technical issues. I haven't come across any compelling argument that a metadata encoding framework hard-wired into HTML is technically superior to a separate specification that can evolve independently of core HTML. The history of Internet and Web standards indicates that those which concentrate on doing just one thing well tend to be technically superior.

(b) standards do not exist in glorious isolation - they are part of a social fabric that includes standards developers, programmers, users, corporate interests and more. As such, the breadth of consensus is an important technical factor in the choice of a technology. (E.g. as an advocate of RDF and related technologies, I see many problems with it, many things that could be improved, but they are insignificant compared with the advantages conferred by its breadth of consensus and usage, and its evolvability to accommodate new ideas.)

Technical merit, existing practice and future possibilities therefore indicate that encoding for machine-consumable data within web pages should be specified and allowed to evolve separately from encoding for human presentation and interaction (which I take to be the main purpose of HTML, including semantic markup). By separating these aspects of web-page markup specification, we lose nothing and have a vast amount to gain.

Further, I have read the various arguments listed at http://esw.w3.org/topic/HTML/ChangeProposals/KeepMicrodata, and I find that in each case they are either technically irrelevant (e.g. "All good specs which integrate with HTML5 should, ideally, be a part of HTML5" is a non sequitur), something with which I actively disagree (e.g. "we should as a working group give the most support to the technology we most believe _should_ succeed in the marketplace"), or compelling reason to drop the Microdata specification altogether (e.g. "One designed originally as part of the larger spec tends has a larger "surface area" alongside the rest of the spec, rather than limiting its interaction to a small number of channels").

I also find that many of the arguments advanced at http://esw.w3.org/topic/HTML/ChangeProposals/KeepMicrodata take a rather broader view of the working group charter than I feel is justified (e.g. "If one technology under the W3C's purview is better than a competing technology, it is our responsibility to actively decide in favor of it"). To my mind, this suggests an attitude that strikes at the heart of web architecture, and at the characteristics that have made it so successful to date.

I also strongly disagree with the conclusion of the "impact" section of http://esw.w3.org/topic/HTML/ChangeProposals/KeepMicrodata. It may be that "spin-off" specs get less attention, but so also do large, bloated specifications. At least, when the parts are separated, the core parts can stand independently of the extras, and receive better review by virtue of being less to review.
Larry Masinter It is circular reasoning to argue that Microdata should be part of HTML because Microdata is part of HTML. If Microdata winds up *not* to be part of HTML, then moving its development out isn't moving "part of HTML's development" out.

I don't think microdata should be part of HTML; even if I liked it (which I don't), it shouldn't be part of HTML any more than HTTP or URIs or JPEG or GIF or PNG or JavaScript or SVG or MathML or CSS should be "part of" HTML.

HTML is "part of" the Web (or, if you like, the Web Hypertext Application platform), as are many or most of those other things. Not the other way around.

The W3C HTML Working Group charter does not call for the W3C HTML working group to develop HTTP or URIs or JPEG or GIF or PNG or JavaScript or SVG or MathML or RDFa. It *does* call for extensibility mechanisms, and explicitly mentions RDFa but merely as an *example* of what an extensibility mechanism should allow.

The W3C Process http://www.w3.org/2005/10/Process/ section 6.2.2, 6.2.3 : The W3C Advisory Committee reviews working group charters, the Director announces the charter (with a Call For Participation), and the Advisory Committee reviews the charter. The W3C Advisory Committee has the final authority on a charter, because Advisory Committee representatives MAY appeal creation or substantive modification of a Working Group charter.

A Working Group itself has no authority to modify its charter.

The HTML Working Group charter:

http://www.w3.org/2007/03/HTML-WG-charter.html does not mention Microdata. It does mention RDFa, but not for inclusion in HTML 5 at all.

It says:

The HTML WG is encouraged to provide a mechanism to permit independently developed vocabularies such as Internationalization Tag Set (ITS), Ruby, and RDFa to be mixed into HTML documents. Whether this occurs through the extensibility mechanism of XML, whether it is also allowed in the classic HTML serialization, and whether it uses the DTD and Schema modularization techniques, is for the HTML WG to determine.

It does not say the working group is allowed to actually add additional vocabularies, only to develop an extensibility mechanism.

The charter also notes that there is a SINGLE specification deliverable (although splitting into multiple documents is presumably consistent with a single deliverable), which is "A language evolved from HTML4".

The word "evolve" cannot possibly mean "add completely new facilities", especially because adding RDFa itself was clearly made out of scope: Reviewing the (member only) W3C Advisory Committee discussion at the time http://lists.w3.org/Archives/Member/w3c-ac-forum/2007JanMar/, there was explicit feedback that RDFa was *not* part of the charter, so there should not be any confusion about that, especially since Maciej was quite active in the charter discussions.

Minor note: One of the arguments against "splitting" Microdata given was "... no one is arguing for splitting it [<video>] into a separate specification." But in fact, there were such arguments made in the past (e.g., http://lists.w3.org/Archives/Public/public-html/2009Jul/0139.html). The fact that during the one month call for proposals, no one was willing to step up to writing a "change proposal" following the current HTML WG "Decision Process" is not evidence that there would not be sentiment in favor of a change proposal to split <video> into a separate spec as well, for many of the same reasons.
Jace Voracek My objection to this Change Proposal on Issue 76 has been stated and expressed accurately, therefore it is not necessary for me to readdress it as mentioned in Table One.
Leif Halvard Silli One of the goals of HTML 5 is to integrate exactly SVG and MathML into text/HTML. And yet, the change proposal to keep microdata states as reason for keeping it inside the HTML 5 specification the following:

]]
Reusing it in, for example, SVG would not be possible, as SVG lacks a <time> element, [ ... ] Generalizing Microdata so that it could be used in technologies such as SVG would make it much less useful for HTML
[[

The current HTML 5 draft has some examples of microdata applied to the <img> element. For example this code fragment:

<img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">

Now, if that image was an SVG image, then it would be impossible to use microdata for tagging it.

OBJECTIONS 1: Microdata embraces HTML, but not HTML 5 documents

1. To favour as a meta data tagging technology something that can only be used inside the XHTML namespace sections of HTML 5 documents means that HTML 5 doesn't fully embrace either SVG or MathML.
2. It also means that authors, if they are interested in tagging SVG or MathML sections, will still have to learn an additional meta data tagging system. This in itself would be a complication for authors, hurting the argument that Microdata is simpler for authors to use than, for example, RDFa.
3. Meta data tagging, as important as it is, is probably not felt to be so important by authors that they will actually learn, and practise, both systems. Authors would instead either err and place microdata tags inside SVG and MathML, or they would not tag their SVG sections at all, or the two systems would make them feel that the whole thing is so overwhelming that they don't tag at all.

Hence, the fact that

]] Microdata, as written, would not be reusable by other technologies even if located in a separate spec. [[

is not a reason to market microdata with the HTML 5 stamp, when HTML 5 itself is about embracing SVG and MathML in text/html. Before eventually promoting any meta data tagging system as especially fitted for use with HTML 5 documents (and keeping microdata as part of the HTML 5 spec is meant to be such a compatibility stamp), we should ensure that it can be used in all parts of an HTML 5 document. In a split out specification, microdata may have the chance of reaching that kind of maturity.

OBJECTIONS 2: Microdata may itself be affected by Conway's law, even though it is part of the spec.

The fact that microdata has developed to not interface 100% with _HTML 5 documents_ could be an effect of Conway's law, as it seems to reflect the strong HTML focus within the communication channel where its development primarily took place. This shows that keeping microdata inside the HTML 5 specification does not guarantee any automatic compatibility with HTML 5 documents. RDFa+HTML, which has primarily been developed in other channels - with much input from many critical voices within the HTMLwg - has the prospect of being - or becoming - 100% compatible with all the namespaces that HTML 5 documents permit, with RDFa 1.1 promising to become even simpler to use. This seems to hint that a separate document and active communication with the critical voices is a better guarantee that microdata will interface well with HTML 5 documents than the focus on keeping microdata inside the HTML 5 specification.

OBJECTIONS 3: It is not true that <time> is useless unless microdata is part of the spec.

This is apropos of Jonas Sicking's remark in his objection to splitting out microdata, where he states:

]]if problems are found with the microdata mechanism during review or implementation, then that should hold up the HTML 5 specification[[

Microdata's use of HTML 5 specific features, such as <time>, is not an argument for keeping microdata inside the HTML 5 specification because it is the meta data tagging system's task to be compatible with the mark-up language and not the other way around. That such an adaptation takes place can be better secured if the meta data tagging system is developed in a separate spec.

Take for example the <time> element:

1. The <time> element was planned at least as early as 2006 (http://www.whatwg.org/specs/web-apps/2006-01-01/), and by 2007 it was part of the draft.
2. <time> looks partly developed as a response to MicroFORMAT's trouble with finding an HTML 4 feature that was usable for marking up time without disturbing screen readers and the like. (<abbr title="date-time"></abbr> did not work well and seemed like a hack)
3. Microdata was designed only in 2009, and it was decided to take advantage of the <time> element in the design.

Thus both previous experience (e.g. microFORMATs), and experience with the specification of HTML 5, show that mark-up comes first, and that meta data tagging comes second. HTML elements may be designed with meta data tagging needs in mind. But it is the meta data systems that eventually adapt to and make use of the mark-up language and not the other way around. That SVG and MathML are not included in microdata is an example of this lack of adaptation.

Publication of the HTML 5 spec - and thereby the use of <time> - should not suffer the danger of being delayed, just because microdata isn't ready yet, as that would be a very bad signal to send about the benefits of the rest of HTML 5.
Sam Johnston Risk mitigation is my primary motivation, in that Microdata may well fail and/or prove to be the inferior technology. In that case the spec is laden with unused and confusing baggage which will inevitably lead to interoperability problems. It also adds non-trivial complexity which will increase the cost of implementation and may delay/deter implementors.

Maintaining separate specifications allows healthy competition between alternatives which will initially manifest itself in the form of clients being able to speak multiple standards. If/when there is a clear winner, support for the losing alternatives will likely taper off with no harm done.

I am specifically concerned about the inclusion of certain vocabularies over others given the potential scope is unlimited. For example, I am working on the Open Cloud Computing Interface (OCCI) which initially describes compute, storage and network resources - why should we cover people and their relationships but not the machines themselves (or any other models for that matter)? Who decides what's in and what's out and what happens when we need to make changes to these vocabularies? How do we deal with extensibility? These are questions we need not be answering, particularly for a specification that is primarily concerned with formatting rather than semantics.
Philip Jägenstedt
Anne van Kesteren
Kai Scheppe The HTML 5 specification is already large enough. Anything that could be split out, should be split out.
Microdata can be pursued on its own and does not have to be folded into this specification.
Taking it out of the spec won't hurt it, but keeping it in will make HTML5 that much more indigestible.
Julian Reschke 1) Thanks to all who already participated in the poll; it appears that every point has already been made somewhere.

2) I'd like to point out that this is essentially a poll about a process/editorial question, so, by definition, those arguments that really apply to the poll (*) fall into that category as well. ((*) as opposed to what the "better" technology is)

That being said here are two comments:

i) "All good specs which integrate with HTML5 should, ideally, be a part of HTML5..."

This contradicts somehow what we (the WG) have communicated before, in that extensions to HTML5 are possible and allowed, and thus the extensibility model is sufficient. Now stating "...but 'good' ones should be part of HTML5..." implies that those which are maintained separately are by definition "not good", until they become subject of the HTML5 author's change control.

It appears we have disagreement on this, so the Working Group really should try to make a decision on this specific question. Separately from this discussion.

ii) Specification Size and Review

We are trying to get the spec to LC. To do that, it needs review. The bigger the document, the less likely it is that the *essential* parts get the required attention. Taking out both non-essential and controversial parts absolutely is the right approach to do this.
Karl Dubost Putting aside the lack of implementations, there are now many mechanisms for declaring *explicitly*, in an HTML document, information about the content of this document:

* usual semantic html (ex: blockquote, cite, etc.)
* meta element in the head
* class attributes (used by microformats community)
* rdfa
* microdata
* data- prefix (which is guaranteed to not be private to the page.)

It's becoming far too many. It shows a design issue where we put tape on top of tape on top of tape.

Laura Carlson I'm supportive of some type of metadata facility *for* HTML as it can be used for accessibility purposes unrelated to ARIA and can make it possible for applications to carry out automated functions. And microdata's usability/simplicity would be a plus for authors.

But with that said, I don't know if microdata needs to be integrated *into* the HTML5 specification for reasons (out of scope, RDFa conflicts, modularity) already stated by others. Beyond the rationale others previously stated is the following:

On June 9, 2008, in his email "How to add features to HTML5" [1] in response to my inquiries on process [2] [3] [4] [5], Ian stated *his* nine step procedure for adding new features to the spec. Ian concluded by saying that the default state for a feature request is for it to be rejected and the default state for a section of the spec was for it to be eventually dropped unless the feature is widely implemented and so important that browser vendors "are actually ready to commit money and risk interop issues over it".

From all I have gathered, it has not been verified that the microdata feature:

1. Complied with all the steps in Ian's own process to add a feature.
2. Is indeed widely implemented.
3. Is so important that user agent vendors "are actually ready to commit money and risk interop issues over it".

[1] http://lists.w3.org/Archives/Public/public-html/2008Jun/0140.html
[2] http://lists.w3.org/Archives/Public/www-archive/2008Jun/0036.html
[3] http://lists.w3.org/Archives/Public/www-archive/2008Jun/0038.html
[4] http://lists.w3.org/Archives/Public/www-archive/2008Jun/0040.html
[5] http://lists.w3.org/Archives/Public/www-archive/2008Jun/0042.html
Henri Sivonen
Martin Kliehm "we should as a working group give the most support to the technology we most believe should succeed in the marketplace"

I believe this technology is RDFa. The proposal to keep Microdata within the HTML5 spec cites quality issues, but that does not convince me that another competitive technology to the well established RDFa should be created at all. Offering one proven technology will confuse authors less. And even as a separate specification I believe Microdata could be integrated into SVG - if a time element is necessary, include that spec in your namespaces and all will be well.
Michael Köller HTML 4.01 dates from 1999. It is a valid assumption that the currently discussed draft of HTML5 will easily stay an equally long period of time until some successive specification version may obsolete it.

So I strongly mandate that any feature within the new HTML5 spec be thoroughly discussed, proved solid and interoperable by a wide range of implementations. Any features where these properties are still highly controversial should better be given a distinct separate place to reach maturity.

Microdata currently does not meet these requirements. There is still too little experience of how it will succeed in practice. Microdata will definitely have more flexibility to evolve if it's split into a separate spec of its own.
Krzysztof Maczyński Objection 1
Rec-track work on Microdata is not an option for this WG allowed by charter.

Objection 2
The authors of Microdata claim that their design goal was to satisfy the use cases given by the community interested in this kind of embedding additional semantics. However, the community isn't satisfied with the result and generally prefers the RDFa approach (including a clear path for evolution, addressing issues by sincerely interested RDFa WG members, integration with languages other than (X)HTML).

Objection 3
(This is also the problem underlying Objection 2.) Leaders of support for Microdata have for a few years repeatedly stated their belief that the Web should not accommodate a technology for solving the Semantic Web (or, as some like it, semantic web) community's use cases. It would therefore be naive to assume that they will continue to nourish that community and evolve the spec to their liking (or indeed, that they already have so far), much less that they'd welcome on-par involvement of experts from that community or any meaningful form of dialogue.

Objection 4
Microdata is a political attempt at preventing RDFa from achieving more success. This can be seen from Objection 3, unilateral faits accomplis, as well as the Editor's employer's most likely purposeful misimplementation of RDFa (see e.g. http://iandavis.com/blog/2009/05/googles-rdfa-a-damp-squib, although it's easy to be picky at the points where the author's formulation lacks some clarity), tormenting the market (since it's not just a partial implementation - there are actually wrong triples extracted) and dumbing RDFa down to the level of Microdata, making the advantages of the latter (which it does have and they're informing RDFa 1.1) shine in comparison.

Answers to some objections to the other CP
(Those are based by some form of analogy on assumptions about things which don't belong to the issue at hand, but these assumptions aren't universally accepted, quite the opposite. This weakens their position by rendering those arguments ungrounded.)

@Anne van Kesteren: Indeed, video, audio, canvas and img are inappropriate in a markup language for hyperTEXT. object with specific bindings for top-level media types associated in a UA's stylesheet would be the correct generic approach, technically superior for all. See http://lists.w3.org/Archives/Public/public-html/2009Sep/0739.html. Also see http://www.w3.org/TR/xhtml-forms-req#reqintro and please stop trying to take us several years back.

@Henri Sivonen: Ian Hickson admits trying unsuccessfully to forestall XForms. Work on Web Forms 2.0 was done outside W3C and later brought into it by the political power of just a few companies against otherwise accepted migration to a superior technology (XForms). XForms supporters know they currently have to tolerate the development of the old forms within HTML5. As one of them, I believe (and it seems to me that many would agree) that in clean projects XForms will be used anyway with potential dynamic translation to HTML5 forms where a user agent doesn't support it. As you can see, the new features of the old forms aren't becoming popular in actual documents. We already knew better in 1999, see http://www.w3.org/TR/1999/WD-xhtml-forms-req-19990830#req.
Shelley Powers I fully support what Manu stated in his Change Proposal to split Microdata, though I don't believe that having both Microdata and RDFa as W3C specifications is actually in the best interests of the web community at large.

In addition to the sources using RDFa that Manu stated, now Best Buy has incorporated it into its store front, and has achieved positive search results because of such effort[1]. Obviously, RDFa has reached both maturity and use that Microdata can never hope to reach.

In Tab's proposal to keep Microdata in the spec, he states:

"The purpose of the W3C is to advance the web, not to remain neutral in technological conflicts. If one technology under the W3C's purview is better than a competing technology, it is our responsibility to actively decide in favor of it. To do elsewise would be dereliction of our core duty to the web. Microdata and RDFa are directly competing, as they accomplish virtually precisely the same thing; there is no good reason to use both on a page except for gratuitous proliferation of metadata embedding syntaxes."

If those who want to keep Microdata in HTML5 do so because they believe Microdata and RDFa are in competition, then the point of doing so is now moot: RDFa has achieved maturity, reach, and momentum that make any competition between it and Microdata extremely uneven, at best.

However, the existence of a supposedly competitive Microdata format in HTML5 could cause confusion, and could ultimately result in people deciding not to implement any metadata format, regardless of whether it is Microformats, RDFa, or Microdata. This would be, in my opinion, a net loss for the web.

If the Microdata supporters had kept open the possibility of both RDFa and Microdata co-existing, then it would make sense to at least provide a minimum of support for Microdata--if not directly in HTML5, in a FPWD. However, Microdata supporters have stated, vehemently stressed in fact, that there can be support for RDFa, or support for Microdata, but not both.

Since RDFa has maturity and reach, as well as a long history of support in the W3C, it is not going to be supplanted by Microdata. By the Microdata supporters' own words, then, Microdata should be dropped from the HTML5 specification, because, according to them, there can't be both support for RDFa and support for Microdata in the W3C.

In my opinion, based on the Microdata supporters' own views of the competition of the two formats, Microdata should be dropped altogether, but if the Microdata supporters wish to pursue it as a separate specification or group, that's their choice. Note, though, that this violates their own beliefs about letting the marketplace decide. According to Tab in his change proposal:

"While it is true that either RDFa or Microdata (or both) may fail in the marketplace, we should as a working group give the most support to the technology we most believe should succeed in the marketplace."

The marketplace has already decided.

[1] http://priyankmohan.blogspot.com/2009/12/online-retail-how-best-buy-is-using.html
Ian Hickson
Michael[tm] Smith The Team objects to the HTML WG proposal to include Microdata as a section in HTML 5. The two main reasons it should be published as an independent specification are:

* Microdata is not specific to HTML 5. As an independent specification it can evolve independently and be easily referenced from other markup languages.

* Microdata is not the only mechanism available for annotating HTML content with specific machine-readable labels. Rather than strongly promote a preferred mechanism by direct inclusion in the specification, the Team suggests that independence and user choice should be preferred at this time.

More details on responses

  • Jirka Kosek: last responded on 10, December 2009 at 20:59 (UTC)
  • Leonard Rosenthol: last responded on 11, December 2009 at 01:56 (UTC)
  • Roy Fielding: last responded on 11, December 2009 at 02:23 (UTC)
  • Jonas Sicking: last responded on 11, December 2009 at 07:32 (UTC)
  • Ben Adida: last responded on 11, December 2009 at 16:14 (UTC)
  • Joe Williams: last responded on 11, December 2009 at 19:20 (UTC)
  • Matthew May: last responded on 12, December 2009 at 03:30 (UTC)
  • Mark Birbeck: last responded on 14, December 2009 at 11:20 (UTC)
  • Tim van Oostrom: last responded on 14, December 2009 at 17:37 (UTC)
  • Rob Ennals: last responded on 14, December 2009 at 20:18 (UTC)
  • Graham Klyne: last responded on 15, December 2009 at 09:33 (UTC)
  • Larry Masinter: last responded on 15, December 2009 at 22:14 (UTC)
  • Jace Voracek: last responded on 16, December 2009 at 00:35 (UTC)
  • Leif Halvard Silli: last responded on 16, December 2009 at 00:36 (UTC)
  • Sam Johnston: last responded on 16, December 2009 at 00:49 (UTC)
  • Philip Jägenstedt: last responded on 16, December 2009 at 04:39 (UTC)
  • Anne van Kesteren: last responded on 16, December 2009 at 12:41 (UTC)
  • Kai Scheppe: last responded on 16, December 2009 at 13:01 (UTC)
  • Julian Reschke: last responded on 16, December 2009 at 13:18 (UTC)
  • Karl Dubost: last responded on 16, December 2009 at 15:37 (UTC)
  • Laura Carlson: last responded on 16, December 2009 at 15:49 (UTC)
  • Henri Sivonen: last responded on 17, December 2009 at 08:58 (UTC)
  • Martin Kliehm: last responded on 17, December 2009 at 09:37 (UTC)
  • Michael Köller: last responded on 17, December 2009 at 11:23 (UTC)
  • Krzysztof Maczyński: last responded on 17, December 2009 at 16:04 (UTC)
  • Shelley Powers: last responded on 17, December 2009 at 20:13 (UTC)
  • Ian Hickson: last responded on 18, December 2009 at 00:50 (UTC)
  • Michael[tm] Smith: last responded on 18, December 2009 at 04:43 (UTC)

Everybody has responded to this questionnaire.

