User:Eoconnor/ISSUE-27

From HTML WG Wiki
Revision as of 00:48, 9 December 2010 by Eoconnor (Talk | contribs)


Defer to the Microformats community for cataloging HTML rel values

Summary

The Microformats community actively maintains the de facto listing of HTML rel values on its existing-rel-values wiki page. The HTML specification should bless this as the official location for listing HTML rel values.

This change proposal addresses ISSUE-27 rel-ownership.

Rationale

Reflect reality

Whatever solution we adopt should, above all else, reflect the deployed reality of HTML rel values (see the Support Existing Content design principle). In 2009, DeWitt Clinton conducted a survey of deployed HTML rel values (A Survey of Rel Values on the Web), and while he found "a staggering 1.8M unique rel value strings in use[…] the top 11 alone were responsible for 90% of all usage." Comparing the Microformats listing with the IANA registry of link relation types, we find that the Microformats list has been much more successful at capturing the top 25 rel values in DeWitt's survey:

HTML rel value     Present in µf list   Registered with IANA
nofollow           y                    y
stylesheet         y                    y
tag                y                    n
alternate          y                    y
icon               y                    y
chapter            y                    y
forum              n                    n
shortcut           n                    n
bookmark           y                    y
archives           y                    y
category           n                    n
external           y                    n
search             y                    y
edituri            y                    n
apple-touch-icon   n                    n
help               y                    y
prev               y                    y
next               y                    y
pingback           y                    n
wlwmanifest        n                    n
contents           y                    y
contact            y                    n
service.post       n                    n
top                y                    n
me                 y                    n
Total              76%                  48%
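
The totals in the last row follow directly from the rows above. As a quick check, the following Python sketch recomputes them from a transcription of the table; the helper name `coverage` is ours and is not part of any registry tooling.

```python
# The 25 data rows of the table above:
# rel value, present in the Microformats list, registered with IANA.
TABLE = """
nofollow y y
stylesheet y y
tag y n
alternate y y
icon y y
chapter y y
forum n n
shortcut n n
bookmark y y
archives y y
category n n
external y n
search y y
edituri y n
apple-touch-icon n n
help y y
prev y y
next y y
pingback y n
wlwmanifest n n
contents y y
contact y n
service.post n n
top y n
me y n
"""

ROWS = [line.split() for line in TABLE.strip().splitlines()]


def coverage(rows, column):
    """Percentage of rows marked 'y' in a column (1 = Microformats, 2 = IANA)."""
    hits = sum(1 for row in rows if row[column] == "y")
    return 100 * hits / len(rows)


print(f"Microformats list: {coverage(ROWS, 1):.0f}%  IANA: {coverage(ROWS, 2):.0f}%")
```

That is 19 of 25 values for the Microformats list and 12 of 25 for the IANA registry.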

As stated in the W3C-hosted registry Change Proposal, "the Microformat community has the best track record for running a registry of rel values."

Diversity and openness of maintainership

HTML rel values are minted by a diverse set of markup authors: standards-aware web designers and developers, search engine engineers, browser vendors, Semantic Web tool developers and users, and many others. An effective registry of HTML rel values should engender the support and participation of all of these groups.

Several active participants of the Microformats community have been championing the principled minting of new HTML rel values for many years, as documented in this email of Tantek's to public-html. More specifically, the community of contributors to the existing-rel-values page is as diverse and informed a group as we could possibly hope to attract to the task of registering and curating a list of HTML rel values. The page has been edited by, among others:

  • DiSo project co-founder Steve Ivy;
  • Apple accessibility engineer James Craig;
  • Toby Inkster, a Semantic Web developer and advocate;
  • Niall Kennedy, a prominent Web developer and Open Web evangelist;
  • accessibility advocate and HTML WG contributor Leif Halvard Silli.

Ease of registration of pre-existing values

All things being equal, it should be as easy as possible for interested parties to register already-deployed HTML rel values. Such interested parties might not be the originators or publishers of such HTML rel values.

The Microformats listing is maintained on a wiki page, and as such is straightforward for anyone to update at any time. MediaWiki, the wiki software used by the Microformats community, is very widespread and so its wikitext syntax is more widely understood than competing wiki languages.

Several widespread HTML rel values (tag, external, pingback, and various XFN values come to mind) are absent from the IANA registry and, because of its onerous registration requirements, are unlikely ever to be registered successfully. At least one such widespread value, pingback, has already failed a registration attempt. (ref) As Anne van Kesteren said: "Writing a specification as a barrier to enter the registry is too much work. Many link relations have seen widespread adoption before a formal specification was written." (src) "For instance, for 'nofollow' it was not really clear whether a specification would at some point arrive, but everyone was using it so it should really be on the list[…]" (src) One of our chairs put it this way:

I don't think it's a good idea to leave values in common use completely undocumented in the registry. It means that a good samaritan who finds someone else's unregistered header cannot ensure that it is documented without writing a full specification that will survive formal review.

See also the Editor's experience with the IANA registry.

Must be able to register HTML-specific details

Three HTML elements (<a>, <area>, and <link>) take rel attributes, but for various reasons different HTML rel values may be used in different circumstances. Because of this, any registry of HTML rel values must be able to hold such element-specific data on a per-value basis. When the Editor tried to use the IANA registry, he encountered pushback when attempting to register HTML-specific metadata about link relations:

One thing that came out during this discussion was the possibility that link relations would be rejected if they were HTML-specific. [17] For example, something specific to HTML cache manifests would likely not be able to be used as a <link rel> value since it wouldn't work in Atom. This is apparently a risk even if the value is already well-established de facto, meaning the registry could in fact by design fail to match deployed content.

The conversation shifted to suggesting that the HTML specification should not use the registry to maintain information about link relations as they apply to HTML. Since this would apparently mean that people who wanted to use link relations with HTML would have to register their relations twice — once with IANA, and once with whatever other registry mechanism HTML has, this seemed highly suboptimal. [18]

While the existing-rel-values page currently lacks these HTML-specific fields, it will be easy enough to add them should this WG choose to use it as its repository of HTML rel value information. We have explicit buy-in from the Microformats folks on this.
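
To make the requirement concrete, here is a minimal Python sketch of what a per-value record with element-specific data could look like. The `RelEntry` type, the `allowed_on` field, and the sample entries are illustrative assumptions, not the layout of the existing-rel-values page or of any other registry; consult the HTML specification for the authoritative element restrictions.

```python
from dataclasses import dataclass

# The three HTML elements that take a rel attribute.
REL_ELEMENTS = frozenset({"a", "area", "link"})


@dataclass(frozen=True)
class RelEntry:
    """Hypothetical registry record: a rel value plus where it may be used."""
    name: str
    allowed_on: frozenset

    def permits(self, element: str) -> bool:
        return element.lower() in self.allowed_on


# Sample entries for illustration only.
REGISTRY = {
    entry.name: entry
    for entry in (
        RelEntry("stylesheet", frozenset({"link"})),
        RelEntry("nofollow", frozenset({"a", "area"})),
        RelEntry("alternate", REL_ELEMENTS),
    )
}


def is_allowed(rel_value: str, element: str) -> bool:
    """True if the value is listed and permitted on the given element."""
    entry = REGISTRY.get(rel_value.lower())
    return entry is not None and entry.permits(element)


print(is_allowed("stylesheet", "link"), is_allowed("stylesheet", "a"))
```

A flat list of names cannot answer the second question; a record per value can.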

License

The registry we adopt should be available under a license compatible with both Free and proprietary software. The existing-rel-values page of the Microformats wiki is in the public domain, and as such its contents are able to be incorporated into Free software as well as proprietary software.

Provisional registration

It is prudent for any registry of HTML rel values to allow for various states of registration. Among other reasons, and as currently suggested in the HTML specification, conformance checkers could flag provisional values differently from unregistered values.

The existing-rel-values page of the Microformats wiki currently lists HTML rel values in one of several categories, based on the quality and permanence of their specification. At the highest level are the rel values defined by format specifications. Rel values on the next level down are defined in proposals which have yet to complete the Microformats process. There are several further levels for more experimental values.

There is no allowance for provisional registration in the IANA registry scheme.
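
As a rough illustration of how a conformance checker might use such states, the Python sketch below maps each state to a different outcome. The state names, the registry contents (`example-specified`, `example-proposed`), and the severity mapping are all hypothetical; the text above says only that provisional and unregistered values could be flagged differently.

```python
from enum import Enum


class Status(Enum):
    SPECIFIED = "defined by a format specification"
    PROPOSED = "defined in a proposal still in process"
    UNREGISTERED = "not listed in the registry"


# Hypothetical registry contents, for illustration only.
REGISTRY = {
    "example-specified": Status.SPECIFIED,
    "example-proposed": Status.PROPOSED,
}

# Assumed severity per state; a real checker could choose differently.
SEVERITY = {
    Status.SPECIFIED: "ok",
    Status.PROPOSED: "warning",
    Status.UNREGISTERED: "error",
}


def check_rel_value(value: str):
    """Return the (status, severity) pair a checker might report for a value."""
    status = REGISTRY.get(value.lower(), Status.UNREGISTERED)
    return status, SEVERITY[status]


for value in ("example-specified", "example-proposed", "never-heard-of-it"):
    status, severity = check_rel_value(value)
    print(f"{value}: {severity} ({status.value})")
```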

Discoverability

It must be easy for web authors to find the list of HTML rel values for reference purposes. The Microformats wiki's various pages on rel="" perform quite well when googling:

Google search                         Result
rel values                            existing-rel-values is 1; rel-faq is 2
acceptable rel values                 existing-rel-values is 1; rel-faq is 2
what can i put in the rel attribute   rel-faq is 5
html rel                              rel-nofollow is 8
rel attribute                         rel-faq is 6

The first page of results typically contained previous HTML specifications, several w3schools markup tutorials of dubious quality, and the WHATWG Wiki RelExtensions page. (When both the Microformats and WHATWG wiki pages appeared on the same page of results, the Microformats wiki page was always higher.) The IANA registry did not appear in the first page for any of these searches.

A search for the spec-writer term of art "link relation" does return the WHATWG wiki page and the IANA link relation registry, but not the Microformats wiki.

Responses to anticipated objections

Susceptibility to spam

As you can see in its revision history, the existing-rel-values page of the Microformats wiki has never been spammed in over three years of existence. The wiki is configured to only allow logged-in users to edit pages; this has been sufficient to avoid most spam bots. That said, when spam has appeared elsewhere on the wiki, it has been removed promptly by one of the many Microformats admins.

Too easy to add entries

Some may argue that an open wiki is too open, that it will end up being filled up with junk from any random person who happens upon the page. But if we look at the page's revision history, we find that edits have been fairly conservative over the lifetime of the page.

Lack of formal process

Unlike the IANA registry, there aren't official Designated Experts for this page, nor is there a formal process for editing it. This hasn't proved to be a problem in the past but, should it become one in the future, a process may be worked out over time between the various stakeholders.

Other considerations

Unification of HTML and Atom rel values

While atom:link superficially resembles HTML's link element, Atom rel values are (currently) distinct from HTML rel values. This WG has made no decision on whether or not unification of Atom and HTML rel values is a goal we should pursue.

The IANA registry set up by RFC 5988 explicitly unifies HTML and Atom rel values. While this CP takes no stance on whether or not HTML and Atom rel values should be unified, the registry's built-in unification may be harmful to the future minting of HTML rel values by this WG or by other parties. As described by one of our chairs in an email to public-html:

At least some of the designated experts for the IANA link relation type registry seemed to indicate that all future entries in the registry should be appropriate for all contexts where they might be used (including the HTTP Link header, and Atom link relations), and so future (or current de facto standard) link relations that are HTML-specific by nature may well be rejected. Such relations might need to go in a separate HTML-specific registry.

This is problematic for at least three reasons:

  • By choosing the IANA registry, this WG and other HTML rel value minters may be constrained to minting only those rel values which make sense in an HTTP or Atom context, thus impeding their ability to register useful-yet-HTML-specific rel values.
  • §4 of RFC 5988 restricts future link relations from having a different semantics when used in conjunction with another link relation. (HTML's stylesheet alternate is grandfathered in.)

    If we adopt the IANA registry, we prevent future HTML rel value minters from minting "modifier" rel values which augment other values. While this CP takes no position on whether or not such "modifier" rel values are sound design, they do have precedent in the Web platform (stylesheet alternate, shortcut icon) so it's reasonable to suppose such things may be minted in the future.
  • The need to maintain a separate, parallel registry to contain real-world HTML rel values rejected for not being generic enough defeats the purpose of choosing the registry in the first place.

See also the 2010 F2F discussion on link relations for more.

HTML rel values are tokens

All HTML rel values, whether defined in the HTML specification, registered in whatever scheme we end up choosing, or minted by web authors and unregistered, are case-insensitive tokens.

RFC 5988 requires unregistered rel values (what it calls Extension Relation Types) to be URIs. Formats which allow Extension Relation Types to be expressed as simple tokens are required to provide a mechanism for them to be converted to URIs for comparison.

There are a couple of issues with this:

  • Because legacy HTML processors compare HTML rel values as simple tokens only, such processors would not consider simple tokens to be equivalent to their URI form.
  • It is unreasonable to expect web authors to use URIs as HTML rel values. (ref)

These issues came up during the 2008 F2F:

anne: I don't want to be able to write everything as a URI. e.g. rel="stylesheet" shouldn't have an equivalent using a full URI

[…]

gsnedders: The only problem I can see with absolute URIs is using current relations like "stylesheet" is that you can't express them as absolute URIs without breaking backwards compatibility
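
The token comparison discussed in this section can be sketched in a few lines of Python. This is a simplified model of how an HTML processor might match rel values, not code from any real implementation, and the URI `http://example.org/rel/stylesheet` is a made-up stand-in for whatever URI form a token might be mapped to.

```python
import re
import string

# ASCII-only lowercasing, since rel values are compared case-insensitively.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def rel_tokens(attr_value: str) -> set:
    """Split a rel attribute on space, tab, LF, FF and CR; lowercase each token."""
    parts = re.split(r"[ \t\n\f\r]+", attr_value)
    return {part.translate(_ASCII_LOWER) for part in parts if part}


def has_rel(attr_value: str, keyword: str) -> bool:
    """True if the keyword appears among the attribute's tokens."""
    return keyword.translate(_ASCII_LOWER) in rel_tokens(attr_value)


# Case differences do not matter to a token-based processor...
print(has_rel("Alternate StyleSheet", "stylesheet"))
# ...but a URI form (hypothetical URI) is just a different token to it.
print(has_rel("http://example.org/rel/stylesheet", "stylesheet"))
```

The second call shows the first issue listed above: a processor that only compares tokens has no way to treat the URI form as equivalent to the short one.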

Details

In § 4.12.4.19 Other link types, replace references to the WHATWG Wiki RelExtensions page with references to the Microformats wiki existing-rel-values page.

In References, add a new reference to the Microformats wiki.

In § 4.12.4.19 Other link types, change the [WHATWGWIKI] references to refer to the Microformats wiki reference added above.

A unified diff implementing these changes can be supplied by this Change Proposal's authors upon request from the Editor.

Impact

Positive Effects

The HTML5 specification would reference the list of HTML rel values already used by web developers in practice, and would benefit from the pre-existing community of contributors and administrators who keep it up-to-date and spam-free.

Negative Effects

This change proposal, if adopted, would result in HTML link relations remaining separate from Atom and HTTP link relations. This is a drawback for those whose goal is to converge Atom, HTML, and HTTP link relations into a unified system.

Conformance Classes Changes

Conformance checkers which currently source valid HTML rel values from the WHATWG RelExtensions wiki page would have to be modified to instead source such values from the Microformats wiki.

Risks

  • Rohit Khare could be hit by a bus.
    This isn't that big of a risk. As the zero-change proposal says, "we might have to update the spec at some point in the future if for whatever reason the registry moves to another URL. However, that's a risk regardless of what solution we use, since if someone sets up a competing registry that wins in the market, we'd have to update the spec to point to that registry even if the previously 'official' one still existed."

References

References are linked inline.

See also

Contributors

Other collaborators welcome!