ISSUE-114: link tag: rel: associate pages about the same organization across many sites for searches without a canonical page and despite a confusingly indistinct name

rel-canonical-organization

State: CLOSED
Product:
Raised by: Maciej Stachowiak
Opened on: 2010-04-28
Description:
Escalated from: http://www.w3.org/Bugs/Public/show_bug.cgi?id=7682
Requested by: Nick Levinson

Different websites may have pages about the same organization. Several
organizations (businesses, government agencies, institutions, ad hoc criminal
conspiracies, etc.) may have the same name and all may be written about on
multiple sites.

A link element naming the organization and providing data that is standardized
could help search engines organize their listings to reduce accidental
intermixing. It wouldn't be perfect; e.g., an organization may have listed its
name differently in different places; a website owner may erroneously enter the
wrong data; nationality may vary with a citizenship change; or its functional
headquarters and its legal headquarters may be far apart. But, in general,
listings with this element could be more successfully separated.

Writing and parsing the link element would be a bit more complex than with
other link elements, but I think this is manageable and the method I propose
has been applied elsewhere.

I propose that the rel value be "canonical-organization" and that its title
attribute be reserved for a special meaning and syntax. The title attribute's
syntax would be in the form of title="name: XYZ Greasy Spoon, Inc.;
headquarters: South Beach, Staten Island, New York, NY, US; ident-scheme: ;
ident: ;".

Each subattribute (e.g., "name") would be optional.

For the subattribute headquarters, if a subvalue is supplied, a nation would be
required. The nation would be represented by a standard code. One difficulty is
that an organizational headquarters is often inconsistently identified between
the functionally dominant one (e.g., where the CEO sits) and the legal one
(e.g., according to incorporation law), but that's probably solvable with
headquarters-main and headquarters-legal.
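
As a rough illustration of the "nation required" rule, the check could look something like the sketch below. It assumes the nation code is the last comma-separated component, which is inferred only from the example title above; the proposal does not fix the position. The function name `headquarters_nation` is hypothetical, and the test is only on the shape of the code (two or more capital letters), not on whether the code is actually assigned.

```python
import re

# Shape check only: two or more ASCII capitals. Whether the code is a real
# nation code would need a lookup table, which this sketch does not include.
_CODE = re.compile(r"[A-Z]{2,}")

def headquarters_nation(subvalue):
    """Return the trailing nation code of a headquarters subvalue, or None.

    Assumption: the nation is the last comma-separated component.
    """
    parts = [part.strip() for part in subvalue.split(",")]
    last = parts[-1]
    return last if _CODE.fullmatch(last) else None

print(headquarters_nation("South Beach, Staten Island, New York, NY, US"))
print(headquarters_nation("South Beach, Staten Island"))
```

A subvalue that ends in a state abbreviation such as "NY" would pass this shape check, so a real implementation would need the list of nation codes as well.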

For the subattribute ident-scheme, a list of schemes could be developed later,
perhaps each to be prefixed by a code for the scheme's nation and a hyphen.
Schemes could include privately-owned but widely available databases of
moderately-well-known organizations. Subvalues for ident-scheme and ident must
not be entered until a centralized list of schemes, with the style of ident
values for each scheme, has been established. After that, the scheme must be in
that list and ident's subvalue must conform to the style specified for it.

If only whitespace or a null is between the colon and the semicolon, that is
equivalent to the subattribute not appearing.

A final semicolon before the closing quote mark is optional and may be imputed.

More subattributes might be added in the future, so page authors must not
invent new ones in the meantime.

No subvalue could contain a literal colon or semicolon. If one is needed or
wanted, a character entity must represent the colon or the semicolon.
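
The syntax rules above (semicolon-separated pairs, optional final semicolon, empty subvalues treated as absent, entities for literal colons and semicolons) might be parsed roughly as follows. This is only a sketch under several assumptions: it works on the raw title text before entity decoding, since an HTML parser would otherwise turn an escaped colon or semicolon back into a separator; it silently ignores unknown subattributes and malformed pairs, which the proposal does not specify; and the names `parse_org_title` and `KNOWN` are hypothetical.

```python
import html
import re

# Subattribute names taken from the proposal text; treating anything else as
# ignorable is an assumption of this sketch.
KNOWN = {"name", "headquarters", "headquarters-main", "headquarters-legal",
         "ident-scheme", "ident"}

# An entity such as "&#59;" is one token, so its own ";" is not a separator.
_TOKEN = re.compile(r"&#?[0-9A-Za-z]+;|.", re.DOTALL)

def parse_org_title(raw_title):
    """Parse the raw (not yet entity-decoded) title text into a dict."""
    pairs, current = [], []
    for tok in _TOKEN.findall(raw_title):
        if tok == ";":
            pairs.append(current)
            current = []
        else:
            current.append(tok)
    pairs.append(current)  # a missing final semicolon is imputed

    fields = {}
    for toks in pairs:
        if toks.count(":") != 1:
            continue  # no separator, or a literal colon inside the subvalue
        split_at = toks.index(":")
        name = "".join(toks[:split_at]).strip().lower()
        value = html.unescape("".join(toks[split_at + 1:])).strip()
        if name in KNOWN and value:  # whitespace-only means "not present"
            fields[name] = value
    return fields

print(parse_org_title(
    "name: XYZ Greasy Spoon, Inc.; headquarters: South Beach, "
    "Staten Island, New York, NY, US; ident-scheme: ; ident: ;"))
```

Tokenizing entities first is the design choice that matters here: a plain split on ";" would cut every character entity in half.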

Nations would be identified by standard two-letter codes. For nations that no
longer exist and do not have two-letter codes, e.g., Roman Empire and Van Lang,
longer codes must be used, since about 200 2-letter codes are already in use
and only 676 exist, and longer codes would prevent future conflict or
exhaustion. A list of deceased nations and their longer codes would have to be
established, possibly based on a standard gazetteer.
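
The arithmetic behind the 676 figure, and the split between two-letter and longer codes, can be sketched as below. The classification is by length alone, which is an assumption; the proposal would rely on an established list, and no such list exists yet. The function name `classify_nation_code` and the sample code "ROMEMP" are made up for illustration.

```python
import string

# 26 letters in each of two positions gives the 676 possible two-letter codes.
TWO_LETTER_SPACE = len(string.ascii_uppercase) ** 2

def classify_nation_code(code):
    """Classify a code by shape only: 'two-letter', 'longer', or 'invalid'."""
    if not (code.isascii() and code.isalpha() and code.isupper()):
        return "invalid"
    if len(code) == 2:
        return "two-letter"
    if len(code) > 2:
        return "longer"  # reserved here for nations that no longer exist
    return "invalid"

print(TWO_LETTER_SPACE)
print(classify_nation_code("US"), classify_nation_code("ROMEMP"))
```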

Multiple link elements with this rel value would be permitted, and UAs should
apply all of them. That permits multiple names (e.g., corporate and d/b/a),
ident-schemes, and idents to identify the one organization more certainly.
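
A consumer that applies all such link elements on a page might collect them along these lines. This is a sketch using Python's standard `html.parser` module; the class and function names are hypothetical, and matching rel case-insensitively as a space-separated token list is an assumption carried over from how rel values are generally handled. Note that this parser hands back attribute values with entities already decoded.

```python
from html.parser import HTMLParser

class OrgLinkCollector(HTMLParser):
    """Collect the title of every link element with rel=canonical-organization."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        title = attributes.get("title")
        if "canonical-organization" in rel_tokens and title:
            self.titles.append(title)

def collect_org_titles(page):
    collector = OrgLinkCollector()
    collector.feed(page)
    collector.close()
    return collector.titles

page = (
    '<head><link rel="stylesheet" href="a.css">'
    '<link rel="canonical-organization" title="name: XYZ Greasy Spoon, Inc.;">'
    '<link rel="canonical-organization" title="name: Greasy Spoon Diner">'
    '</head>'
)
print(collect_org_titles(page))
```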

No rev value would be meaningful.

hCard and FOAF are too limiting, as discussed in Bug 7681 Comment 5, and don't
do what canonical-organization would. For example, FOAF and hCard each lack 3
of the 4 fields I proposed for the link element rel. Arguably, effort could go
into amending hCard and/or FOAF, but that seems more complicated. There appears
to be a commitment to maintaining a nearly 1:1 relationship between hCard and
vCard, and vCard self-describes as stable. FOAF is oriented to online
communities, and that orientation would have to be overcome to generalize its
use. Amending HTML5 would have equal effect and is easier: it only requires
adding a new rel value, and it puts the feature where Web designers are
likelier to see it.

Closely related is the corresponding value for humans, in Bug 7681.
Related Action Items:
No related actions
Related emails:
  1. Re: ISSUE-114 (rel-canonical-organization): Chairs Solicit Proposals (from rubys@intertwingly.net on 2010-06-03)
  2. [minutes] HTML WG 20100603 (from plh@w3.org on 2010-06-03)
  3. {agenda} HTML WG telecon 2010-06-03 (from rubys@intertwingly.net on 2010-06-02)
  4. RE: {agenda} HTML WG telecon 2010-05-27 (from adrianba@microsoft.com on 2010-05-27)
  5. Re: {agenda} HTML WG telecon 2010-05-27 (from faulkner.steve@gmail.com on 2010-05-27)
  6. Re: {agenda} HTML WG telecon 2010-05-27 (from laura.lee.carlson@gmail.com on 2010-05-27)
  7. {agenda} HTML WG telecon 2010-05-27 (from rubys@intertwingly.net on 2010-05-26)
  8. {agenda} HTML WG telecon 2010-05-20: Surveys close, Publishing new Working Drafts (from mjs@apple.com on 2010-05-19)
  9. [Bug 7682] link tag: rel: associate pages about the same organization across many sites (from bugzilla@jessica.w3.org on 2010-05-12)
  10. RE: {agenda} HTML WG telcon 2010-04-29: Action items, new issues, Task Force reports - minutes of the meeting (from adrianba@microsoft.com on 2010-04-29)
  11. Re: ISSUE-114 (rel-canonical-organization): Chairs Solicit Proposals (from mjs@apple.com on 2010-04-29)
  12. Re: ISSUE-114 (rel-canonical-organization): Chairs Solicit Proposals (from tai@g5n.co.uk on 2010-04-29)
  13. ISSUE-114 (rel-canonical-organization): Chairs Solicit Proposals (from mjs@apple.com on 2010-04-28)
  14. {agenda} HTML WG telcon 2010-04-29: Action items, new issues, Task Force reports (from mjs@apple.com on 2010-04-28)
  15. ISSUE-114 (rel-canonical-organization): link tag: rel: associate pages about the same organization across many sites for searches without a canonical page and despite a confusingly indistinct name (from sysbot+tracker@w3.org on 2010-04-28)

Related notes:

Closed without prejudice: http://lists.w3.org/Archives/Public/public-html/2010Jun/0057.html

Sam Ruby, 3 Jun 2010, 17:17:01

Changelog:

Created issue 'link tag: rel: associate pages about the same organization across many sites for searches without a canonical page and despite a confusingly indistinct name' nickname rel-canonical-organization owned by Maciej Stachowiak on product , description identical to the Description above, non-public

Maciej Stachowiak, 28 Apr 2010, 21:41:33

Issue dissociated from any product

Sam Ruby, 3 Jun 2010, 17:17:01

Status changed to 'closed'

Sam Ruby, 3 Jun 2010, 17:17:01

