This document is also available in these non-normative formats: XML.
Copyright © W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This finding addresses the questions "When should URNs or URIs
with novel URI schemes be used to
name information resources for the Web?" and "Should registries be provided for
such identifiers?". The answers given are "Rarely if ever" and "Probably not". Common arguments in favor
of such novel naming schemes are examined, and their properties compared with
those of the existing http:
URI scheme.
Three case studies are then presented, illustrating how the
http:
URI scheme can be used to achieve many of the stated
requirements for new URI schemes.
Editorial note: HST | 2006-03-14 |
Further to a request from Roy Fielding, I had a brief look at XCAP, seems to be using http: URIs now, although it introduces a new Application UID registry, and uses ietf: URNs for its namespaces. . . If anyone (including Roy) remembers what Roy was particularly concerned at here, please let me know. |
This document has been produced by the W3C Technical Architecture Group (TAG). This finding addresses TAG issue URNsAndRegistries-50.
This is the third draft of this finding, with the first section complete and adding three case studies. This finding is an editorial draft, not yet accepted by the TAG.
Additional TAG findings, both accepted and in draft state, may also be available. The TAG expects to incorporate this and other findings into [what?] that will be published according to the process of the W3C Recommendation Track.
Editorial note: HST | 2005-03-29 |
Are we ready to tell the world what will follow AWWW? |
Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).
1 Introduction
2 Examining the need for new approaches to naming information resources
2.1 Persistence
2.2 Standardized
2.3 Protocol Independence
2.4 Location Independence
2.5 Structured names
2.6 Uniform access to metadata
2.7 Flexible Authority
3 The value of http: URIs
4 Case study: Naming namespaces
4.1 Context
4.2 Identification
4.3 Persistence of Identifiers
4.4 Dereferencability
4.5 Erroneous appearance of dereferencability of identifiers
4.6 Summary
5 Case Study: XRI
5.1 Simple Document Retrieval Technical Analysis
5.1.1 Run-time Resolution
5.2 Persistent Dereferencability (location independence)
5.2.1 Run-time resolution
5.2.2 Configuration
5.3 Persistence Identifiers
5.3.1 Operational policy
5.3.2 Intent
5.4 Protocol Independence
5.5 XRI alternative design
5.6 Summary
6 Case study: new URI scheme with no protocol
In [AWWW] we find the following recommendations:
"A URI owner SHOULD NOT associate arbitrarily different URIs with the same resource."
"A specification SHOULD reuse an existing URI scheme (rather than create a new one) when it provides the desired properties of identifiers and their relation to resources."
"Agents making use of URIs SHOULD NOT attempt to infer properties of the referenced resource."
"A URI owner SHOULD provide representations of the resource it identifies."
Recently, however, a number of proposals have emerged to create new
identification mechanisms for the Web. They propose new URN (sub-)namespaces or
URI schemes and provide
registries for instances thereof, in order to allow them to be used to identify
and retrieve information resources. This
would appear to be incompatible with [AWWW]'s simple positive
recommendations. In this finding we enumerate the arguments given in favor of
these new proposals, which often turn out to be arguments against
using http:
URIs, and explain why they are mistaken and how the above
principles can be understood to point the way constructively to alternative
designs which do in fact make use of http:
URIs.
This section is structured in terms of goals or requirements for resource
identification mechanisms which have been offered as justifications for
adopting a new approach. They are drawn from a number of recent proposals
([NZL], [RFC 3688], [UBL], [XRI], [RFC 4452])
abstracting, merging and summarizing them. Although we will examine some of
these proposals in specific detail in the three case studies below, in this
section [Definition: we will use the name myRI as a
cover term for this general class of proposed alternatives to
http:
, both those proposing new URI schemes and those proposing
new URN sub-schemes.] In each case we state a requirement and examine the extent to which the existing http:
-based identifier mechanism addresses it.
NRI Goal
The relation between myRIs and the information resource they identify should persist indefinitely.
Or, more realistically, that individual myRIs should manifest syntactically whether or not they are intended to persist indefinitely.
This goal is difficult to get to grips with, as it appears to mean different things in different contexts:
At its simplest, this is just a wish for an end to 404 Not
Found
, i.e. that you should always be able to resolve a myRI.
In the Information Science community, 'persistence' is a stronger requirement, namely, that what you get when you resolve a myRI should never change.
http: fact
http:
URIs support persistence as well as
it is in-practice possible to do so.
As has been frequently observed, achieving either of the numbered types of
persistence above is not a
technology issue, it's a management issue. It's up to the owners and operators
of the mechanisms which implement myRI resolution to enforce whatever degree
of persistence they choose. It follows that there is no difference here
between myRI and http:
.
What of the more sophisticated reading, that a myRI should manifest its
minter's intentions with respect to persistence? That's just a
matter of naming conventions, and perfectly possible using http:
. We could, for example, say that all
versionable/time-varying resources on our site are named with all lower-case
letters, and all persistent/stable/non-varying resources are named with all
upper-case letters.
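As a hedged illustration of such a convention, the following Python sketch classifies an http: URI by the case of the letters in its path. The function name persistence_intent and the example URIs are assumptions introduced here; the rule is only the all-lower-case versus all-upper-case convention suggested above, and any other convention would serve equally well.

```python
from urllib.parse import urlsplit


def persistence_intent(uri: str) -> str:
    """Classify a URI under the hypothetical case convention.

    All upper-case path letters signal a persistent resource, all
    lower-case letters a time-varying one; anything else is unspecified.
    """
    path = urlsplit(uri).path
    letters = [c for c in path if c.isalpha()]
    if not letters:
        return "unspecified"
    if all(c.isupper() for c in letters):
        return "persistent"
    if all(c.islower() for c in letters):
        return "time-varying"
    return "unspecified"


# Hypothetical URIs on a site that has adopted the convention.
print(persistence_intent("http://example.org/FINDINGS/2006/URNS"))
print(persistence_intent("http://example.org/news/latest"))
```

The convention lives entirely in the naming policy of the URI owner; nothing in the http: scheme itself needs to change.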
NRI Goal
myRIs should be susceptible to standardization within administrative units
This goal appears to be directed at guaranteeing certain invariants, for example with respect to the structure of identifiers and the availability of the resources they identify. This means they should not be creatable in a distributed or unsupervised fashion.
http: fact
Again, this is largely a management issue, not a
technical one. Whatever invariants are in view can as well be enforced on
(sub-parts of) http:
-served resource collections as on those
identified via myRIs.
Nothing in a specification can stop people from uttering URIs of any kind. Domain names are as good, or as bad, at conveying ownership of a particular form of URI as URN namespaces or URI schemes.
Centralized authorities can be established for parts of domain space as easily as for areas "off the web", and enforcement mechanisms can be as effective. For example, my employers constrain the mechanisms by which web pages are accepted for serving from certain parts of their domain so as to enforce invariants both of path structure and content markup.
NRI Goal
Access to resources identified by myRIs should not be dependent on any particular protocol.
Exactly what this means is not clear. Although it is listed as a requirement in several proposals, there is little or no discussion of it, so why it should be a requirement for myRIs is not clear either.
http: fact
http:
URIs are no more protocol-dependent
than any other identification mechanism.
For pure naming, that is, if retrieval is never intended,
http:
is as good as any myRI approach, because no protocol at
all is involved. If retrieval is anticipated, then any myRI
approach must specify a mapping to one or more
protocols. All existing myRI approaches in practice specify only one such
mapping, to the HTTP
protocol. So they are in exactly the
same position as http:
-- if for some reason in the
future the HTTP
protocol becomes unavailable or inappropriate,
both myRIs and http:
will have to specify a new mapping.
True protocol independence is difficult to imagine in practice, as many
protocols depend on a tight coupling between message formats and client/server
application models. Protocols which don't allow servers any escape mechanism
are thereby pretty much ruled out as transports for retrieval from myRIs (or
http:
URIs).
It's appropriate to note here that in cases where the necessary form of client/server interaction for a particular kind of information resource, for example streaming video, cannot be provided by the protocols normally associated with existing URI schemes, new schemes may be appropriate. Detailed discussion of this point can be found in [Schemes and Protocols]. But none of the myRI proposals are for resources of this kind.
NRI Goal
myRIs should not be locations.
Practical realities and administrative changes will always defeat any attempt to guarantee that the representation of a particular resource will always be stored in exactly the same host/server/filestore/directory/file. Any naming mechanism which equates locations in that sense with names is by construction inadequate. It follows that this goal is a sensible one.
http: fact
http:
URIs are not locations.
Misunderstanding of http:
URIs as locations has a long and,
in part, justifiable history (they were, after all, originally called Uniform
Resource Locators). But it's no longer justifiable either in principle (the
RFC for URIs [RFC 3986] is quite clear on
the subject) or in practice (there's lots of software support for server-side
management of the
relationship between http:
URIs and their representations). See
for example the classic [Cool URIs] for a more detailed discussion
of these points.
NRI Goal
myRIs should provide for structuring resource identifiers with shareable tags
This requirement has only been suggested by the authors of [XRI]. It amounts to a wish to structure resource names using name/value pairs, with the names having some standardized, widely understood meaning. This requirement is related to requirements appealed to in the design of End Point References [EPRs], [TAG on EPRs].
http: fact
The query component of http:
URIs supports
non-hierarchical structured naming.
It is open to any naming authority to establish
conventions for the use of the query component of http:
URIs under
its control. Since the query component is already structured in terms of
simple name/value pairs, it is a good fit for the requirement.
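A minimal sketch of this use of the query component, using Python's standard urllib.parse module. The tag names type and version, the base URI, and the helper names are assumptions chosen for illustration; a naming authority would define its own shared tags.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def tagged_name(base: str, tags: dict) -> str:
    """Build an http: URI whose query component carries name/value tags."""
    # Sorting gives a stable spelling for the same set of tags.
    return base + "?" + urlencode(sorted(tags.items()))


def read_tags(uri: str) -> dict:
    """Recover the name/value tags from the query component of a URI."""
    query = urlsplit(uri).query
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical tags agreed by the authority for example.org.
name = tagged_name("http://example.org/docs", {"version": "1.0", "type": "invoice"})
print(name)
print(read_tags(name))
```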
NRI Goal
myRIs should provide access to metadata about a resource as well as to representations of it.
Several myRI proposals establish a constructive relation between the myRI for a resource and the myRI for metadata about that resource.
http: fact
Naming conventions or response headers can provide this already
Naming authorities can impose such constraints on the http:
URIs
under their control. Alternatively, and particularly where it is appropriate
to allow for meta-metadata, etc., the Link:
response header
[RFC 2068] may
provide equivalent functionality in a more extensible way.
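To make the header-based alternative concrete, here is a simplified sketch of reading a Link: header value. The relation name meta and the target URI are assumptions for illustration only, and the pattern handles just the simple case in which a rel parameter directly follows the target; it is not a complete parser for the header grammar.

```python
import re

# Matches <target>; rel="relation" pairs in a Link header value.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def links_by_relation(header_value: str) -> dict:
    """Map each relation name in a Link header value to its target URI."""
    return {rel: target for target, rel in LINK_PATTERN.findall(header_value)}


# Hypothetical response header pointing from a resource to its metadata.
header = '<http://example.org/docs/report.meta>; rel="meta"'
print(links_by_relation(header))
```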
NRI Goal
myRIs require different approaches to identifying namespace authorities, in some cases simpler and in others richer than that provided by hierarchical domain names administered by IANA and resolved via DNS.
http: fact
http:
URIs can encode arbitrarily complex
(or simple)
namespace authority expressions.
Complex encodings of dependent and delegated naming authority can be implemented using proxies and
redirection. In the other direction, proper management of domain names for
http:
URIs can produce names which are very little different from
the equivalent myRI (compare e.g. http://lccn.info/2002022641 to
info:lccn/2002022641
), while gaining all http:
's benefits of
scalability and installed base.
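The closeness of the two spellings can be shown with a small sketch. It assumes, purely for illustration, that each info: namespace is given a domain of the form namespace.info, as in the lccn example above; the function name info_to_http is hypothetical.

```python
def info_to_http(info_uri: str) -> str:
    """Rewrite info:<namespace>/<identifier> as an http: URI.

    Assumes (for illustration only) that a domain <namespace>.info has
    been registered for each namespace.
    """
    scheme, colon, rest = info_uri.partition(":")
    if scheme != "info" or not colon:
        raise ValueError("expected an info: URI")
    namespace, slash, identifier = rest.partition("/")
    if not (namespace and slash and identifier):
        raise ValueError("expected info:<namespace>/<identifier>")
    return f"http://{namespace}.info/{identifier}"


print(info_to_http("info:lccn/2002022641"))  # http://lccn.info/2002022641
```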
Editorial note: HST | 2006-06-06 |
HST now owns lccn.info and oclcnum.info , will sell to Stuart Weibel
for a modest consideration :-) |
http:
URIs
The http:
URI scheme implements a two-part approach to
identifying resources. It combines a universal distributed naming scheme for
owners of resources with a hierarchical syntax for distinguishing
resources which share the same owner. Widely available mechanisms (DNS and web
servers, respectively) exist to support the use of http:
URIs to
not only identify but actually retrieve representations of information resources.
Any requirement for naming resources, particularly if not only naming but
also retrieval of representations is in prospect, which admits to a similar
decomposition, that is, into a universal owner name and a hierarchical
owner-relative name, can almost certainly be satisfied by the
http:
URI scheme. http:
provides substantial benefits, in terms of installed
software base, user comprehension, scalability and, if required, security, at
very low cost.
Anyone developing an alternative approach, that is, some form of myRI,
should consider carefully whether that approach is either isomorphic to
http:
, or makes covert appeal to http:
for its
implementation. In either case, this strongly suggests that the fundamental
requirements of the new approach do in fact admit to the two-part description
given above, and therefore that http:
itself would be a viable,
and therefore a preferred, way forward.
The example in section 2.7 Flexible Authority above is illustrative of
the benefits that the ubiquity of the installed base of support for
http:
provides -- within 15 minutes of registering the lccn.info
domain, the http://lccn.info/ homepage had
been put in place and was available to anyone with a web browser and
access to the Web.
In this section we look in detail into some of the background
assumptions for the utility of myRIs for one particular purpose, namely for
naming namespaces. We will compare the use of http:
and of myRIs
for this purpose.
The XML Namespaces specification [XML Namespaces] is the context-defining specification for namespace names. It specifies that namespace names are for use in expanded names consisting of a namespace name (or no value) plus a local name. An expanded name may be compared against other expanded names; very common scenarios for such comparison are well-formedness checking and content model validation. The namespace specification, roughly speaking, says that a namespace name cannot be assumed to be dereferenceable. Any software component written on the assumption that every namespace name is dereferenceable violates the namespace specification. A namespace owner may have guaranteed that it will provide a document at the namespace name, but namespace owners are not required to do so and not all do. Generic XML software should therefore not be written to assume dereferencability of namespace names.
Any use of identifiers, from namespace names to ISBNs to invoices, requires a context. The context defines the use of the identifier and includes both social and technical context. A URI on the side of a bus will probably convey the social meaning that it can be typed into a browser. Other contexts for the use of URIs include namespace names, references to documents, and identifiers for things. It is never the case that a URI is simply "found" without a context.
First we examine the use of an http:
URI for a namespace name. We will choose the OASIS WS-RM TC's HTTP Namespace name as an example.
<myns:foo xmlns:myns="http://docs.oasis-open.org/ws-rx/wsrm/200602"/>
Compare this with the use of a urn:
URI for the namespace name. We will use the OASIS UBL rules for namespace names. The UBL rules are roughly that the namespace names for UBL Schemas holding OASIS Standard status must be of the form: urn:oasis:names:specification:ubl:schema:<subtype>:<document-id>
. For example, the first namespace name for the first major release of the Invoice document is urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0
, used as follows:
<myns:foo xmlns:myns="urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0"/>
In all XML namespace software, both approaches work correctly. The software-only interaction pattern is clearly erroneous if it assumes that a namespace name is dereferenceable, and it is unlikely that XML software written today requires this assumption be valid.
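A short sketch of the software-only pattern, using Python's standard xml.etree.ElementTree parser: both sample documents parse, and the namespace name is treated purely as a string within the expanded name. The helper name expanded_name is an assumption; no network access takes place at any point.

```python
import xml.etree.ElementTree as ET

HTTP_NS_DOC = '<myns:foo xmlns:myns="http://docs.oasis-open.org/ws-rx/wsrm/200602"/>'
URN_NS_DOC = '<myns:foo xmlns:myns="urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0"/>'


def expanded_name(xml_text: str) -> tuple:
    """Return (namespace name, local name) of the document element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells expanded names as "{namespace}local".
    namespace, _, local = root.tag[1:].partition("}")
    return namespace, local


# The namespace name is only compared, never dereferenced.
print(expanded_name(HTTP_NS_DOC))
print(expanded_name(URN_NS_DOC))
print(expanded_name(HTTP_NS_DOC) == expanded_name(URN_NS_DOC))
```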
Let us now examine the persistence of the identifiers. The
oasis
URN namespace, as used in urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0
, is assigned by the OASIS organization and registered with IANA. OASIS has the authority to change its identifying scheme, subject to IANA review. Additionally, the actual names are decided by OASIS. As with all URNs, the persistence of any particular identifier and scheme are up to the registering organization.
An http:
URI for OASIS namespace names, such as
http://docs.oasis-open.org/ws-rx/wsrm/200602
, is assigned by the
OASIS organization because it owns the oasis-open.org
domain. They do not have to register the complete URI anywhere. OASIS has the authority to change the template on its own, without any review. Additionally, the actual URIs are decided by OASIS. As with all URIs, the persistence of any particular identifier and scheme is up to the owner of the domain. It is possible for the domain to cease being owned by OASIS, through lack of maintenance or even error.
We might imagine a scenario many years down the road where OASIS no longer
exists. It would not maintain the oasis-open.org
domain name and
http:
identifiers using that domain would no longer be assigned. Alternatively, OASIS does not produce or mint any new URNs. In either case, the identifiers are not dereferenced so all the existing software works.
In URN and http:
scheme cases, the persistence of the identifier is accomplished by the organization. The ongoing existence of the organization does not affect the persistence of the identifiers.
But, if one of these identifiers appears in a document, how will a human
find out the meaning? One approach is to examine the context
surrounding the identifier, in this case the XML document and the Namespaces specification. They will look in the XML Namespace specification and see what it says about namespaces. There is no benefit to the xri:
versus http:
as the work in examining the XML document and XML
namespace specifications is the same. Alternatively, they may try to dereference the
namespace name, but it's not dereferenceable so they get no information.
It is natural for a human reading an XML document with an unknown namespace name to want to understand more about the namespace.
This is why [AWWW] recommends providing a document at a namespace name
that provides both human and machine readable information. The use of
http:
namespace names enables 3 separate scenarios:
an identifier can be created in a decentralized manner;
an identifier may be dereferenced by a person via a browser to aid understanding;
an identifier may be dereferenced by a computer and exploited for automatic processing by reason of its identifying schemas, WSDLs, policies, etc.
These are two distinct interaction patterns, without and with human involvement.
In all dereferencable identifier scenarios, an identifier must be usable to generate an authority. There may be interactions with multiple authorities to determine the "final" authority for the identifier. The final authority uses the identifier to produce a document.
In the http:
identifiers, the authority is specified immediately after the scheme. The authority system in http:
URIs is the internet's DNS and IP systems. One or more DNS authorities produces an IP destination as the final authority. That authority is then sent the remaining part of the URI for dereferencing. In the case of http://docs.oasis-open.org/ws-rx/wsrm/200602
, the HTTP interaction is
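The request half of such an interaction can be sketched as follows. The code only illustrates the decomposition described above, in which the authority part of the URI is what DNS resolves and the remainder is what is sent to the server; the helper name http_request_for is an assumption, and the text it produces is an illustration rather than a capture of a real exchange.

```python
from urllib.parse import urlsplit


def http_request_for(uri: str) -> str:
    """Form the HTTP/1.1 GET request text implied by an http: URI."""
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("expected an http: URI")
    # The authority goes into the Host header; the rest is the request target.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


print(http_request_for("http://docs.oasis-open.org/ws-rx/wsrm/200602"))
```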
A common reason given for needing myRIs for namespace
names is that an http:
identifier appears to humans as a location and
hence dereferencable. The argument that http:
URIs are "locations" is based upon
incomplete understanding of the use of URIs. A classic scenario is that a
human looks at an XML document using a word-processing application, and the
application formats the value of an xmlns
attribute as a
hyperlink, because it is an http:
URI, say http://example.org/ns/foo. But as there is no document dereferencable from that URI, when the user clicks on the link an HTTP 404 will be returned. The obvious downside is that the user has wasted some time, typically around 5-10 seconds. There is no harm beyond that in clicking and getting a 404.
Under what circumstances are identifiers viewed as "clickable", that is what are the contexts? In this document, neither of the xmlns
links have shown up as clickable. When these documents were pasted into an email, they were not converted to clickable. The http:
link was converted to clickable only when the myns
attribute was typed by hand and auto-complete was on. It was the e-mail program's "auto-complete" that saw an http:
within a pair of quotes and made it clickable. It also required that rich text or HTML formatting be selected in both creator and receiver/viewer. When viewed in plain text, the link is not clickable. The clickable link arises when a document is typed by hand, with auto-complete turned on, and then viewed with HTML formatting. Neither of these applications is treating the document as XML; rather, they are treating it as HTML. In particular, none of the applications knows anything about XML or the xmlns
attribute. Thus the context of usage has incorrectly viewed the xmlns
attribute as HTML. When this happens, that is people are reading and writing sample XML documents using HTML formatting, the worst downside is that a person may waste 5-10 seconds.
Contrasting with this is the approach of using a myRI. A myRI provides an
identifier. A human looking at an XML document with a myRI namespace name will not be confused
about whether it is dereferencable or not. No software will "auto-complete"
e.g. an xri:...
identifier into a clickable link. The 5-10 seconds of potentially wasted time
are avoided.
Namespace names are just one example of a context of use. Any use of an identifier, or any datatype for that matter, in an XML document has the same issues. A provider of an
identifier must specify how the identifier will be used in each specific sub-context of their XML language, whether it
is intended as an identifier, a location, or both. Using a myRI instead of an http:
URI does not make the software or human's job any easier.
This section has shown that http: URIs have a large benefit over urn: URIs when used as namespace names, because the namespace name can optionally be dereferencable, and the only downside is a fairly minimal amount of wasted time when a namespace name appears dereferencable but isn't.
In this section we look in some detail into some of the background
assumptions for the utility of myRIs for
persistent identifiers and location independence, and we will compare http:
with the proposed
xri:
scheme in this regard.
This section provides an overview of document retrieval for http:
versus XRI:
For comparison purposes, it shows document retrieval for a given
URI and for a given XRI. Consider the worked example from [XRI], in which a
department of a government agency published a document named
govdoc.pdf
. It assigns a URI
http://department.agency.example.org/docs/govdoc.pdf
. [XRI] observes that changing the organizational structure represented in the URI, for example to http://newdept.agency.example.org/docs/govdoc.pdf
, or the path structure, for example to http://newdept.agency.example.org/documents/govdoc.pdf
, breaks access.
[XRI] suggests that an xri:
URI for the same
resource can be designed to be
location independent, for example xri://@example.org*agency*department/docs/govdoc.pdf
[XRI] deals with delegation by using stars ("*"). Another
solution advised by [XRI] is to use identifiers that have bang ("!") symbols to indicate persistence. An example is xri://@!9990!AF8F!1C3D/!2495
. We examine these scenarios in order.
A client makes an HTTP request for the document at http://department.agency.example.org/docs/govdoc.pdf
GET /docs/govdoc.pdf HTTP/1.1
Host: department.agency.example.org
response:
200 OK
PDF Document
[XRI] specifies that "XRI resolution is a two phase process. The first phase, authority resolution, resolves to the XRI authority responsible for the resource. The second phase, local access, uses URIs and metadata from the authority to interact with the identified resource." [XRIResolution] specifies that an xri://@example.org*agency*department/docs/govdoc.pdf
is parsed to an XRI Authority of @
, which is queried for example.org
, which is queried for *agency
, which is queried for *department
. [XRISyntax] specifies that @
represents an authority of type organization and it establishes a global context for identifiers for whom the authority is controlled by an organization or a resource in an organizational context, resulting in http://example.org
as the base authority for @example.org
. It is possible to do look-ahead as well, so a query of @example.org*agency*department
might return resolution for @example.org
, @example.org*agency
, or @example.org*agency*department
. Resolution proceeds until a "/" is reached in the XRI. The "=" represents an authority of type Person and it establishes a global context for identifiers for whom the authority is controlled by an individual person - resulting in equals.example.org
as the authority to which the resolution request is sent. The XRI Authority endpoints are described using XRI Descriptors. There are other special characters, such as "!", "+", "$".
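The parsing step described above can be sketched as follows. This is a deliberately simplified reading of the two example identifiers in this section, not an implementation of the XRI syntax specification: it recognises only the "@" and "=" global contexts and the "*" and "!" sub-segment markers, and the function name split_xri is an assumption.

```python
import re


def split_xri(xri: str) -> tuple:
    """Split an xri:// identifier into (global context, sub-segments, path)."""
    prefix = "xri://"
    if not xri.startswith(prefix):
        raise ValueError("expected an xri:// identifier")
    # Authority resolution stops at the first "/".
    authority, slash, path = xri[len(prefix):].partition("/")
    if authority[:1] not in ("@", "="):
        raise ValueError("unsupported global context in this sketch")
    # Keep each "*" (reassignable) or "!" (persistent) marker with its segment.
    subsegments = re.findall(r"[*!]?[^*!]+", authority[1:])
    return authority[0], subsegments, slash + path


print(split_xri("xri://@example.org*agency*department/docs/govdoc.pdf"))
print(split_xri("xri://@!9990!AF8F!1C3D/!2495"))
```

Each sub-segment in the returned list corresponds to one authority that is queried in turn, unless look-ahead lets a single query cover several of them.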
Note:DaveO can't find where @ is resolved by local authority, how @example.org maps to http://example.org. Would @foo.ca map to http://foo.ca? There is some wording in XRI Resolution that says = examples resolves to http://equals.example.org/xri-resolve as found in the xrid:XRIDescriptor/xrid:Authority/xrid:uRI for this community, but I'm not sure what that means. Somehow the @ and = authorities have to be built in, but I don't know if it's an HTTP GET or a default XRIDescriptor or .. So, I don't know how this bootstrap problem is resolved.
GET *example.org*agency*department HTTP/1.1
Host: example.org
Accept: application/xrid+xml
response:
200 OK
<xrid:XRIDescriptor>
  <xrid:Service>
    <xrid:Type/>
    <xrid:URI>http://department.agency.example.org</xrid:URI>
  </xrid:Service>
</xrid:XRIDescriptor>
The XRIDescriptor's URI element specifies that the authority for *agency*department is department.agency.example.org. An HTTP GET request is issued:
GET /docs/govdoc.pdf HTTP/1.1
Host: department.agency.example.org
response:
200 OK
PDF Document
There is the obvious bootstrap issue in the XRI system. Any XRI client must understand the XRI descriptor format. This is effectively a replacement for DNS, that is mapping names to addresses. Note that it recurses and uses the DNS/HTTP infrastructure in this example. There are at least 2 separate HTTP GET requests to resolve the xri:
identifier into a document.
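The two phases can be sketched without any network access by letting a dictionary stand in for the descriptor service. The table DESCRIPTORS and the function resolve_xri are hypothetical names introduced here, and the sketch assumes the look-ahead case in which one query resolves the whole authority.

```python
# Hypothetical stand-in for the descriptors an XRI authority would serve.
DESCRIPTORS = {
    "@example.org*agency*department": "http://department.agency.example.org",
}


def resolve_xri(xri: str, descriptors: dict) -> list:
    """List the steps taken to turn an xri:// identifier into a retrieval."""
    prefix = "xri://"
    if not xri.startswith(prefix):
        raise ValueError("expected an xri:// identifier")
    authority, slash, path = xri[len(prefix):].partition("/")
    # Phase 1: authority resolution (the first HTTP GET in the trace above).
    base = descriptors[authority]
    # Phase 2: local access (the second HTTP GET).
    return [
        f"authority resolution: look up {authority}",
        f"local access: GET {base}{slash}{path}",
    ]


for step in resolve_xri("xri://@example.org*agency*department/docs/govdoc.pdf", DESCRIPTORS):
    print(step)
```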
Another common reason for a new identifier scheme is to come up with an identifier that is location-independent or "movable" from one location to another. The idea is that the document changes location but the identifier should still resolve to the same document. In all cases, there must be some kind of mapping of the identifier to the "new" location if a location is changed. There is a publishing step, where the "new" location is added into the registry for the identifier.
HTTP supports movement through various 3xx status codes. Virtually all Web browsers and servers will correctly utilize the 3xx HTTP Status codes.
GET /docs/govdoc.pdf HTTP/1.1
Host: department.agency.example.org
response:
301 Moved Permanently
Location: http://newdept.agency.example.org/documents/govdoc.pdf
GET /documents/govdoc.pdf HTTP/1.1
Host: newdept.agency.example.org
response:
200 OK
PDF Document
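A client's handling of such redirects can be sketched with a table standing in for the servers' 3xx responses. The table REDIRECTS, the function follow_redirects and the hop limit are assumptions for illustration; real clients apply their own limits and also distinguish between the different 3xx codes.

```python
# Hypothetical table: each key is a URI whose server answers with a
# redirect, and each value is the Location it returns.
REDIRECTS = {
    "http://department.agency.example.org/docs/govdoc.pdf":
        "http://newdept.agency.example.org/documents/govdoc.pdf",
}


def follow_redirects(url: str, redirects: dict, max_hops: int = 5) -> list:
    """Return every URI visited, ending with the one that is not redirected."""
    hops = [url]
    while url in redirects:
        if len(hops) > max_hops:
            raise RuntimeError("too many redirects")
        url = redirects[url]
        hops.append(url)
    return hops


print(follow_redirects("http://department.agency.example.org/docs/govdoc.pdf", REDIRECTS))
```

The old URI keeps working for as long as its entry remains in place, which is the dependency on the original name discussed in this section.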
XRI supports movement through modifying the XRIDescriptor's URI element
GET *example.org*agency*department HTTP/1.1
Host: example.org
Accept: application/xrid+xml
response:
200 OK
<xrid:XRIDescriptor>
  <xrid:Service>
    <xrid:Type/>
    <xrid:URI>http://newdept.agency.example.org</xrid:URI>
  </xrid:Service>
</xrid:XRIDescriptor>
The XRIDescriptor's URI element specifies that the authority for *agency*department is newdept.agency.example.org. An HTTP GET request is issued
GET /docs/govdoc.pdf HTTP/1.1
Host: newdept.agency.example.org
response:
200 OK
PDF Document
NOTE: DaveO: I can't see in the XRI Resolver specs how the docs path is changed to documents.
With http:
URIs, there is a dependency that the original URI cannot be re-used for some other purpose and that it must remain "viable", that is it can't be terminated. If department.agency.example.org
ever disappeared, all the clients would break on that document. With XRI identifiers, there is a dependency upon the @
authority and related resolvers. The "long-term" viability question then is whether XRI resolvers will "last" longer than HTTP Servers on given domain names.
There are two steps to making the document available at the new URI. Firstly, the Web server must be configured to return the 301 with the new Location (the redirect). In Apache 2.2, this is a configuration line such as Redirect permanent /service http://foo2.example.com/service
. Secondly, the document must be made available at http://newdept.agency.example.org/documents/govdoc.pdf.
XRI allows a simpler retrieval once the authority is known by removing the redirect step. The authority maps the identifier to the new document and retrieval of ns/foo
is avoided. The change process is simpler as well. The new address is an XML entry rather than a server redirect; the registry must be updated and the new ns/latest/foo
must be added to the system.
The question now is the relative difficulty of updating an HTTP server versus updating an XRI resolver. In either case there will be some kind of submission and approval process. Given the widespread deployment of HTTP and the availability of HTTP administrators, it seems likely that the configuration change (a one-line redirect in Apache) is roughly equivalent to updating an XRI descriptor element at a URI.
In both solutions, the mappings from old identifiers to new identifiers are stored. XRI calls this out as "However such an approach would eventually lead to a spaghetti code of new-to-old XRI mappings. It also has the drawback of preventing reassignment of the identifier "department" for another purpose."
The persistence of an identifier has operational and expressive characteristics.
Let us return to the persistence of the identifiers without regard to dereferencability. The xri:
scheme is managed by the XRI committee within OASIS.
OASIS has the authority to change the XRI scheme, probably at the request of the XRI committee. It is possible for the XRI committee to cease maintaining the XRI scheme, and even for some other committee or organization to take it over. Should another organization choose to create an alternative XRI scheme, there would probably be a dispute, perhaps with a dispute resolution mechanism. As the XRI specification says "As with URNs, the issue of whether a persistent sub-segment is in fact permanent (never reassigned)
is a matter of operational policy for the assigning authority. XRIs can't help with the operational issue ..."
An http:-based naming scheme for documents is managed by the organization that owns the domain in the URI. The organization does not have to register new schemes anywhere. In the case of http://www.oasis-open.org URIs, OASIS has the authority to assign and change the URIs or URI schemes on its own, without any review. It is possible for the domain to cease being owned by OASIS, through lack of maintenance or even error. Alternatively, the XRI community could use an http:-based URI, such as http://xri.net. Then the owner of the xri.net domain, presumably the XRI TC, would be responsible for managing the domain. As with all names, the domain could lapse. And as with XRI identifiers, there is the possibility of conflicts and disputes.
With all identifiers, the persistence of any particular identifier and scheme is up to the registering organization and the registration authority. http: and xri: are equivalent in the operational policies determining persistence.
Both schemes can express the intent of persistence. As the XRI specification says, "XRIs can't help with the operational issue, but XRI syntax allows the authority to express its intent". XRI uses the bang ("!") symbol to indicate that the identifier that follows is persistent, and the star ("*") symbol to indicate that the identifier that follows is re-assignable. The XRI specification suggests that "a much better solution would be to assign the resource "govdoc.pdf" an identifier that never needs to change or be reassigned ... such as xri://@!9990!AF8F!1C3D/!2495".
It is possible for a URI authority to express its intent in the URIs it mints, so the same distinction between persistence and transience can be drawn using http: URIs. The http:-based XRI design will be shown shortly.
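To make the idea of intent expressed in syntax concrete, here is a minimal sketch that reads the "!" and "*" markers described above and reports the stated intent of each marked sub-segment. It is not an XRI parser: the function name, the regular expression, and the second example identifier are assumptions for illustration, and sub-segments without a marker are simply ignored.

```python
import re

# Markers as described in the text: "!" means persistent, "*" means re-assignable.
INTENT = {"!": "persistent", "*": "reassignable"}


def classify_subsegments(xri):
    """Return (sub-segment, intent) pairs for every marked sub-segment.

    Simplifying assumption: a sub-segment is a marker followed by any run
    of characters other than "!", "*" and "/".
    """
    prefix = "xri://"
    body = xri[len(prefix):] if xri.startswith(prefix) else xri
    return [(value, INTENT[marker])
            for marker, value in re.findall(r"([!*])([^!*/]+)", body)]
```

Applied to the identifier quoted from the XRI specification, every sub-segment is reported as persistent. An authority minting http: URIs could publish a comparable convention for its own path segments.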
XRI and Web architecture effectively give the same guidance, that it is a much better solution to assign an identifier that never changes, as in [Cool URIs].
Protocol independence is a goal of XRI. The earlier redirect examples have shown that HTTP can redirect, including to non-http: resources: starting from HTTP, a redirect to an ftp: resource is possible. An important but subtle aspect of the Web architecture is that the http: scheme for identifiers does not require the HTTP protocol to be used. This is explored in [Schemes and Protocols].
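A redirect from an http: URI to a resource in another scheme might look like the following. This is a hedged sketch using Python's standard `http.server` module; the path, the ftp: target, the port and the names `MOVED`, `redirect_for`, `RedirectHandler` and `serve` are all invented for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from local paths to targets in another URI scheme.
MOVED = {"/reports/govdoc.pdf": "ftp://ftp.example.org/reports/govdoc.pdf"}


def redirect_for(path, moved=MOVED):
    """Return (status, headers) for a request path."""
    target = moved.get(path)
    if target is None:
        return 404, {}
    # 301 tells the client that the resource now lives at the Location URI.
    return 301, {"Location": target}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = redirect_for(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()


def serve(port=8000):
    """Run the sketch server on the loopback interface until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` would start the server. Whether a given client then follows a redirect into the ftp: scheme depends on that client.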
We suggest XRI should create URIs using the http: scheme, rather than inventing a non-URI based scheme. The thread of documented good practices that leads to this conclusion starts from [AWWW]:
Good Practice
To benefit from and increase the value of the World Wide Web, agents should provide URIs as identifiers for resources.
Good Practice
A specification SHOULD reuse an existing URI scheme (rather than create a new one) when it provides the desired properties of identifiers and their relation to resources.
Intent is part of the potential metadata in URIs discussed in [Metadata in URI].
Good Practice
URI assignment authorities and the Web servers deployed for them may benefit from an orderly mapping from resource metadata into URIs.
The XRI community could define constraints on http: URIs containing a particular domain, such as xri.net.
The rules for persistence and location independence can be defined. They could start with http://xri.net followed by roughly the current XRI rules, taking into account URI character constraints.
For example, the previous XRI persistent identifier could be rendered as something similar to http://xri.net/@;9990;AF8F;1C3D/;2495. An alternative common practice for persistent identifiers is to use UUIDs in the URI, e.g. http://example.org/6B29FC40-CA47-1067-B31D-00DD010662DA. Location-independent identifiers can be achieved using HTTP redirects.
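The rewriting in the example above can be sketched in a few lines. The rule used here is an assumption inferred from that single example (replace "!" with ";" and prefix http://xri.net/); it is not a defined mapping, and the function names are hypothetical. The second function shows the UUID alternative using the standard `uuid` module.

```python
import uuid

XRI_PREFIX = "xri://"
HTTP_BASE = "http://xri.net/"


def xri_to_http(xri):
    """Rewrite an xri: identifier as an http: URI under xri.net.

    Assumption: "!" is the only character rewritten, and it becomes ";",
    mirroring the single example given in the text.
    """
    if not xri.startswith(XRI_PREFIX):
        raise ValueError("not an xri: identifier")
    return HTTP_BASE + xri[len(XRI_PREFIX):].replace("!", ";")


def mint_uuid_uri(base="http://example.org/"):
    """Mint a URI whose final segment is a freshly generated random UUID."""
    return base + str(uuid.uuid4()).upper()
```

With these definitions, `xri_to_http("xri://@!9990!AF8F!1C3D/!2495")` yields `http://xri.net/@;9990;AF8F;1C3D/;2495`, matching the example in the text.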
There are many varieties of constraints upon URIs and upon the use of HTTP redirects. One generalized framework for mapping XRIs or URNs to http: URIs is at [urn2http].
We have shown that http: identifiers for XRIs can achieve the goals of XRI with substantial benefits. The XRI goals of persistent identifiers and location independence are already available with http: identifiers. The previous analysis identified two concrete benefits of using XRIs: users cannot waste time by erroneously dereferencing namespace names that do not have namespace documents, and an extra HTTP GET request is avoided when documents move. The downsides of the XRI identifier solution are the software and human costs of adding a new identifier scheme, and seemingly mandatory increased network costs (our example shows 2 HTTP GETs instead of 1). Given these costs and benefits, deploying a new registrar, resolution mechanism and related software to layer on top of existing Web functionality is not justified.
Our analysis has shown that if the scheme definition for xri: says that it is dereferencable, and specifies a mechanism, then either that mechanism is HTTP, or it will have to provide all the functionality, and thus be heir to all the weaknesses, of HTTP. In either case little benefit has been gained over just using the http: scheme itself. Note that we have not yet compared the authority resolution mechanisms or the dependence upon centralized authority, nor have we compared the distributed authoring of identifiers.
In the myRI identifier scenario, the "location" to be used for knowledge is somewhere in the application or in some property of the myRI such as a URI scheme or URN (sub)scheme. The myRI proposals include means to transform a myRI into a dereferencable address via lookup using a registry server. This in turn requires the use of a dereferencable address for the server, or else all software intended for use with myRIs must have the registry server locations "hard-coded". As far as we can tell, all the myRI proposals expect the result of server lookup to be an http: URI, and also appear to use an http: URL to identify the location of the registry server.
A main advantage of http: URIs is the use of DNS to allow decentralized creation of vocabularies. This does bear the cost that humans can be confused by the mixing of location and identifiers. Another possibility is to create and register a scheme that does not have any protocol associated with it but follows all the rest of the http: syntax. This is something like a cross between URNs and http: URIs.
The intent is clear from the instance of the URI: HTTP is not to be used for dereferencing. Various programs will not "auto-complete" such URIs into clickable links. However, as with URNs, this removes the possibility of dereferencing the link to retrieve a document. If a human wants to find out about a specific namespace, how do they do so? Returning to context, a human must understand that the myns is an XML namespace and how XML namespaces are used. The use of id: or http: or urn: or xri: has done nothing to shield them from this requirement.
There are very few advantages and significant downsides to this approach. For these reasons, David Orchard never proceeded down the registration path for id:.