Understanding URI Hosting Practice as Support for Documentation Discovery

1 Introduction

This document gives a set of conditions under which a particular document ("representation" in the sense of [rfc3986]) might be considered valid, current, and/or canonical documentation for the meaning of a particular URI. Such a representation will be called a "nominal URI documentation carrier" for the URI.

The purpose of defining which representations are to be considered nominal URI documentation carriers is to coordinate uses of the URI. If all parties in a communication scenario agree on which representations are nominal URI documentation carriers and which ones are not, that will help to promote agreement on meaning and therefore correct interoperation.

This specification can be seen as inducing a protocol, namely the set of implied methods from existing protocols (HTTP, FTP, etc.) that allow a client to obtain a nominal URI documentation carrier for a given probe URI.

The definition of "nominal URI documentation carrier for a URI" records a best effort interpretation of [rfc3986] and the so-called "httpRange-14 resolution" [issue-14-resolved], with [httpbis-2] and [webarch] as background. The "Cool URIs for the Semantic Web" note [cooluris] is another description of the same architecture.

The uses targeted here are those involving notations such as RDF [rdf-concepts], and languages layered on RDF, in which declarative URI meaning figures centrally, but other languages, notations, and modes of "meaning" are not excluded.

After a review of the history of the principal controversy around URI documentation discovery, there is a discussion of the central concepts of URI documentation and "representation". The following two sections give discovery methods for URIs with and without a hash sign, respectively. The document concludes with discussion of inconsistency risks resulting from content negotiation, change over time, and other sources, and a comparison of the present interpretation with the literal text of [issue-14-resolved].

1.1 Historical note

This document is part of a conversation first started circa 2002 around the declarative meaning of "hashless" URIs. At the time two different conventions were proposed for the declarative use of URIs. One convention, inherited from the hypertext Web, was for a hashless URI to refer to the document-like entity ("information resource") served at that URI. This convention collided with a separate desire to use a hashless URI to refer to an entity described by that information resource. Which use would, or should, have priority was not clear at the time. After deliberation, the TAG adopted its so-called httpRange-14 resolution [issue-14-resolved], asking "the community" to use hashless URIs to refer to their information resources, not to what those information resources describe (except when the resource is self-describing). An exception allowed a hashless URI to refer according to a description in the case where no information resource was served at the URI, as signalled by a 303 HTTP response to a GET request.

A parallel question for URIs with a fragment identifier arose, but was easier to settle, since in any given case there was no ambiguity: either the URI was tied to a description, or it was tied to a document fragment, the choice being dictated by the media type of the response to a retrieval request on the "stem" URI (without the fragment identifier). In particular, if a media type specifies an RDF equivalence, then the equivalent RDF graph's use of the fragment identifier bears on its meaning.

With the growth of linked data [linked-data], some resistance to the architecture has been expressed. Reports of hash URIs being unacceptable in some situations, coupled with performance difficulties arising from the 303 redirection and the impossibility of deploying 303 redirects at all on many Web hosting services, have led to the current reexamination of the architecture. Some of the criticisms of the two approaches, and possible alternatives to them, are captured in [issue-57-report].

2 Preliminaries

The main result of this document, the circumscription of a small number of general URI documentation discovery methods, can be stated concisely once a framework has been established.

2.1 URI documentation

URI documentation is information that documents the intended meaning of a particular probe URI. URI documentation may be transmitted along with other information, such as documentation for other URIs, without any particular demarcation between the documentation for that URI and the other information. A typical example might be an ontology document in which one finds integral documentation for a set of URIs. The ontology document serves as URI documentation for a number of URIs at the same time.

URI documentation typically takes the form of a set of statements in which the probe URI occurs. The statements, by saying what is supposed to be true of the entity to which the probe URI refers, are meant to communicate the probe URI's intended meaning - what that entity is. There is always a risk that as a result the URI means nothing at all, or that it could refer to more than one thing. Treating such situations is outside the scope of this specification, which only addresses the discovery of URI documentation, not its interpretation.

2.2 Representations and nominal representations

This specification rests on Web retrieval, as defined in [rfc3986], so we will need precise terminology for talking about Web retrieval.

The word "representation" is used in two ways here, as in [rfc3986] and elsewhere, as a type and as a relationship.

As a type, "representation" is a term of art meaning an octet sequence (the "content") together with metadata, such as media type, that directs the interpretation of the content. In [rfc2616] the word "entity" is used for this. In the discussion that follows, "representation" on its own should always be understood this way (that is, as a type), following the usage in [webarch] and [httpbis-2].

As a relationship, "representation of" is not clearly defined in [rfc3986] and we take it to be undefined, perhaps possessing an ordinary language meaning. In a successful retrieval using a URI the server is saying that the retrieved representation is a representation of the resource identified by that URI. So we will call a representation that results (or could result, given a suitable request) from a successful, authoritative^[1] retrieval request using a URI a "nominal representation of" the resource identified by the URI, or, to reduce the amount of verbiage, a nominal representation "from" the URI.^[2]

2.3 Representations that carry URI documentation

A representation is hereby defined to carry URI documentation for a given URI if it contains the URI documentation (with or without other information), the syntax and semantics of the documentation are as determined by the media type of the representation, and the documentation occurs unqualified. It is difficult to define "unqualified" precisely for all media types, but we generally mean by this that the documentation is given "sincerely", not quoted, conditional, or modal. That is, if D is URI documentation and a carrier says in effect that D is not or might not be true, then D, although it occurs in the carrier, is not considered to be carried by it. (E.g. documentation that occurs in an XML literal inside of an application/rdf+xml representation is not unqualified, and therefore not "carried" by it under the present definition.)

A "URI documentation carrier" for a URI is a representation that carries URI documentation that bears on the meaning of that URI. Applying the adjective "nominal" is a technicality that signifies that being a URI documentation carrier for the URI is expected according to this specification, but that it might not actually be one (for example, if someone has made a mistake, documentation might not be found where it is expected).

Specifying an answer to the question "When is a given representation a nominal URI documentation carrier for a given URI?" is the purpose of this document.

The determination of URI semantics according to the content of a representation may be made either directly in media type documentation or by a chain of normative references starting from it. For example, for media type application/rdf+xml, this is accomplished by language in the media type registration [rfc3870] and normative references therein. For media type application/xhtml+xml, delegation is accomplished, among other ways, via XHTML's XML namespace document [xhtml-ns], which leads one (via the RDFa specification) to the algorithm for extracting an RDF graph from the XHTML markup. The RDF graph then can provide URI documentation.

It is not intended that a nominal URI documentation carrier is either objectively "authoritative" or exclusive of other sources of URI documentation.

3 Probe URI with local identifier

The syntax stem#id has come to be used not just for document fragment references as originally specified, but for any reference determined relative to content found at stem. Therefore the present document refers to id in stem#id as a 'local identifier' rather than a 'fragment identifier'. The two expressions may be considered synonymous but with different connotations.

When a URI is of the form stem#id (a 'hash' URI), a nominal representation from stem is a nominal URI documentation carrier for the probe URI.

Normal user-agent behavior implements this part of the specification, as ordinary retrieval behavior for stem#id involves a retrieval using stem.
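
As a minimal sketch of this rule (Python is used only for illustration; the function name stem_and_local_id is hypothetical and not part of this specification), the stem used for retrieval can be obtained by cutting the probe URI at its first hash sign:

```python
def stem_and_local_id(probe_uri):
    """Split a probe URI into (stem, local identifier).

    The local identifier starts after the first '#'. For a hashless
    URI the second element is None.
    """
    stem, hash_sign, local_id = probe_uri.partition("#")
    if not hash_sign:
        return probe_uri, None
    return stem, local_id


# Example with an assumed URI: retrieval would use the stem only.
stem, local_id = stem_and_local_id("http://example.com/ontology#Zebra")
```

A nominal representation retrieved using stem would then be treated as a nominal URI documentation carrier for the whole probe URI.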

3.1 Example: URI documentation via RDF graph

URI documentation can be provided in the form of an RDF graph. Sometimes an RDF graph can be specified by the media type, such as application/rdf+xml, application/xhtml+xml (for RDFa), or text/turtle, of a nominal URI documentation carrier. This is true for all URIs, but it is called out specifically for 'hash' URIs in the media type registration for application/rdf+xml [rfc3870]:

In RDF, the thing identified by a URI with fragment identifier does not necessarily bear any particular relationship to the thing identified by the URI alone. This differs from some readings of the URI specification, so attention is recommended when creating new RDF terms which use fragment identifiers. More details on RDF's treatment of fragment identifiers can be found in the section "Fragment Identifiers" of the RDF Concepts document. ^[3]

3.2 Example: URI documentation via markup

A local identifier can also get its meaning via format specifications that specify that certain local identifiers refer to document parts (fragments). For example, the @name attribute in HTML, or the @xml:id attribute in XML, are defined per their respective media types to provide 'anchors', and thereby to document that the local identifier refers to the enclosing element. The relevant markup then acts as URI documentation for the corresponding URI.
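
The following sketch illustrates the @xml:id case only, under the assumption that the retrieved content is well-formed XML. The function name element_for_local_id and the sample document are hypothetical; the sketch uses the Python standard library, which exposes xml:id under the XML namespace URI.

```python
import xml.etree.ElementTree as ET

# ElementTree spells xml:id using the reserved XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def element_for_local_id(xml_text, local_id):
    """Return the element whose xml:id equals local_id, or None."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.get(XML_ID) == local_id:
            return element
    return None


# Assumed content of a nominal representation from the stem URI.
SAMPLE = '<doc><section xml:id="intro"><title>Intro</title></section></doc>'
anchor = element_for_local_id(SAMPLE, "intro")
```

Here the markup that declares xml:id="intro" plays the role of URI documentation for the URI formed from the stem and the local identifier intro.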

4 Probe URI lacking local identifier

If the URI scheme of the probe URI is 'http' or 'https',^[4] nominal URI documentation carriers for the URI can be found in the following ways. The cases are not exclusive (e.g. both GET/200 and Link: may yield nominal URI documentation carriers for the URI).

4.1 General case

If a nominal representation from the probe URI includes a URI documentation link (see following) with target V in its response to the retrieval request, then nominal representations from V are nominal URI documentation carriers for the probe URI.

There are two ways to locate a URI documentation link in an HTTP response:

using the Location: response header of a 303 See Other response [httpbis-2], e.g.
```
303 See Other
Location: http://example.com/uri-documentation
```
using a Link: response header with link relation 'describedby' ([rfc5988], [powder]), e.g.
```
200 OK
Link: <http://example.com/uri-documentation>; rel="describedby"
```

Normal user-agent behavior partially implements this part of the specification, as retrieval yielding a 303 See Other response is ordinarily followed by a retrieval using the URI in the documentation link.
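
The two ways of locating a URI documentation link can be sketched as follows. This is an illustration, not part of this specification: the function name documentation_links and the shape of its arguments are assumptions, and the Link: parsing is deliberately simplified (it does not handle commas inside quoted parameter values, as a full [rfc5988] parser would).

```python
import re


def documentation_links(status, headers):
    """Return URI documentation link targets found in one HTTP response.

    `status` is the integer status code; `headers` is a list of
    (name, value) pairs, since Link: may occur more than once.
    Assumption: Link: is accepted on any status code, although the
    text above only shows it together with 200 OK.
    """
    targets = []
    for name, value in headers:
        lname = name.lower()
        if lname == "location" and status == 303:
            targets.append(value.strip())
        elif lname == "link":
            # Simplified grammar: <target>; params, <target>; params ...
            for match in re.finditer(r'<([^>]*)>([^,]*)', value):
                target, params = match.groups()
                rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.I)
                if rel and "describedby" in rel.group(1).lower().split():
                    targets.append(target)
    return targets
```

Nominal representations from each returned target would then be nominal URI documentation carriers for the probe URI. Returning a list rather than a single target reflects the fact that the cases are not exclusive.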

In the 303 case, the term "landing page" is sometimes applied to the redirect target document - it is "where you land" when you attempt a retrieval.

There is no type restriction on what the probe URI refers to or "identifies" in this case. It can refer to whatever the URI documentation specifies, which could be (and often is) an "information resource" (see below); a URI documentation link in itself does not say that the referent is not an "information resource". (But see below for the case when retrieval is successful.)

4.2 Information resource reference (probe URI is retrieval-enabled)

Editorial note

This section is the controversial one: the (a) clause of [issue-14-resolved]. Controversy surrounds the following:

the definitions of "identifies", "representation", and "information resource"
whether the (a) clause follows from the HTTP specification [rfc2616] or has the status of a separate good practice or recommendation
for any particular interpretation of the crucial terms, whether the (a) clause is actually a good idea or not from an engineering point of view
what should replace the (a) clause, if it's not a good idea

The editor is not aware of anyone who is happy with the status quo, which is what is presented here. Those desiring a change (that would be everyone) should submit a change proposal to modify or replace this section. Change proposals will be considered on an equal footing with this baseline.

The editor's best attempts so far to untangle the controversies may be found in [issue-57-report] and [generic]. [issue-57-report] is intended to be useful as an overview of the design space and a source of ideas for change proposals, and it provides a basis for evaluating potential change proposals.

If there is a nominal representation Z from the probe URI, then the client may consider this state of affairs as equivalent to the existence of a nominal URI documentation carrier for the probe URI that says that Z is a current representation of the resource identified by the probe URI, and, moreover, that the identified resource is an "information resource" (see below).

There can be many nominal representations over time or under different circumstances, but that makes this no less true. It is being said in this case that all of the nominal representations are current representations of the identified resource.

The following passage in [webarch] introduces the term "information resource":

It is conventional on the hypertext Web to describe Web pages, images, product catalogs, etc. as “resources”. The distinguishing characteristic of these resources is that all of their essential characteristics can be conveyed in a message. We identify this set as “information resources.”

The determination of which characteristics of any given resource are to be considered "essential," and what it means for any given essential characteristic to be "conveyable in a message," is left up to the reader, but some idea of what [webarch] intends is provided by the surrounding explanation and examples.

Being a nominal representation from a URI does not in itself qualify a representation as being a nominal URI documentation carrier for that URI.

4.3 Discovery via redirection

For purposes of discovery, redirect chains (HTTP status 301, 302, and 307) are often followed. That is, if retrieval is requested using a URI U1, and a retrieval using U1 yields a non-303 redirect to U2, and a retrieval request using U2 yields a result R, then R is considered a nominal representation from U2, and is consequently considered a nominal URI documentation carrier for U1. Note that this practice incurs vulnerability to mischief on the part of U2's URI owner as well as U1's - the eventual URI documentation may not be correct and may not even be "authoritative" in any agreed sense.
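
A minimal sketch of following such a chain is given below. It is illustrative only: the function follow_redirects, the hop limit, and the table of canned responses (standing in for real HTTP retrievals) are assumptions made for the example.

```python
from urllib.parse import urljoin

NON_303_REDIRECTS = {301, 302, 307}


def follow_redirects(probe_uri, fetch, max_hops=10):
    """Follow 301/302/307 redirects starting at probe_uri.

    `fetch` is a caller-supplied function mapping a URI to a pair
    (status, location_or_None). Returns the list of URIs visited,
    ending with the URI whose response was not such a redirect.
    """
    chain = [probe_uri]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in NON_303_REDIRECTS or location is None:
            return chain
        # Location may be relative; resolve it against the current URI.
        location = urljoin(chain[-1], location)
        if location in chain:
            raise ValueError("redirect loop at " + location)
        chain.append(location)
    raise ValueError("too many redirects")


# Hypothetical responses: U1 redirects to U2, which is retrievable.
RESPONSES = {
    "http://example.com/u1": (301, "http://example.org/u2"),
    "http://example.org/u2": (200, None),
}

chain = follow_redirects("http://example.com/u1", RESPONSES.__getitem__)
```

The result retrieved using the last URI in the chain would be treated as a nominal URI documentation carrier for the first. Keeping the whole chain, rather than only its end, records every URI owner on whom the outcome depends.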

5 Inconsistency risks

What happens if there are multiple URI documentation carriers (nominal or otherwise), and they provide inconsistent information, is outside the scope of this specification. This must be recognized as a risk by all users of this specification.

Potential sources of conflict arise in the following situations:

The set of nominal representations varies over time
Multiple methods apply, e.g. both 303 and Link:
The same method applies multiple times, e.g. several Link: headers
There are several nominal representations from the same documentation URI (either stem in stem#id or a documentation link URI), i.e. content negotiation ^[5]
URI documentation not carried by a nominal URI documentation carrier is used, such as in SPARQL query results
URI documentation in a nominal URI documentation carrier does not respect the URI scheme

Servers should endeavor to reduce the variability in URI documentation among these multiple sources, in order to maximize the utility of URI documentation discovery to receivers. Receivers should approach nominal URI documentation carriers with skepticism and seek independent assurance of their consistency with what their interlocutors have consulted.

5.1 Transactional inconsistency

Consider the situation where a sender S composes a message (or document, or "representation") M containing a URI U, and sends it to a receiver R (or leaves it somewhere for R to find). S may choose to rely on a nominal URI documentation carrier for U to decide how to use U in composing M, and R may choose to rely on a nominal URI documentation carrier for U as a way of understanding the use of U in M after receiving M.

It is possible that this specification will yield different representations as nominal URI documentation carriers in the two instances. Because of this, S and R should rely on this specification for URI meaning coordination only when there is a reasonable expectation that the meaning of U (as reflected in the retrieved URI documentation, and to the extent it is needed in context) is equivalent, or is only inconsequentially different, between the two nominal URI documentation carriers involved in the transaction. This is under the control of the agent(s) controlling retrieval at the probe URI (or its stem), but only partially under the sender's control (they can at least make sure they consulted a fresh version), and hardly at all under the receiver's control, so use of this specification entails trust between the sender and the publisher, and between the receiver and both other agents. If there is a discovery link then yet another agent might be involved.

A typical transaction might be:

P writes documentation D to URI U
S reads D
S composes message M using U based on D
S sends M to R
R reads D to understand U in M

If P writes D in between the two reads, or the second read otherwise yields a different nominal representation, then S and R's coordination attempt may fail. M acts in some ways as a partial cache of D. Note that R might read D before S does, and in effect cache it; the same problem arises in this case.
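
The failure mode can be made concrete with a small simulation. Everything in it is hypothetical: the Publisher class stands in for the agent P controlling retrieval at U, and the two documentation strings are invented for the example.

```python
class Publisher:
    """Hypothetical stand-in for agent P controlling retrieval at U."""

    def __init__(self, documentation):
        self.documentation = documentation

    def read(self):
        # A retrieval returns whatever P currently publishes.
        return self.documentation


p = Publisher("U refers to the zebra species")    # P writes D to U
d_seen_by_s = p.read()                             # S reads D
message = {"uses": "U", "sent_by": "S"}            # S composes M and sends it to R
p.documentation = "U refers to a zebra crossing"   # P rewrites D between the reads
d_seen_by_r = p.read()                             # R reads D to understand U in M

# S and R have consulted different documentation for the same URI.
coordinated = d_seen_by_s == d_seen_by_r
```

In this run coordinated is false, and nothing in M itself reveals the discrepancy to either S or R.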

5.2 Clients and servers that use incompatible practices

Many document and message formats include a specific indicator of the protocol being used. For example, every HTTP/1.1 request or response contains the string "HTTP/1.1" in a fixed location, and an XML document typically starts with an XML declaration giving the XML version number. Such an indicator is meant to convey that the originating agent is respecting some particular specification, and urges receiving agents to either understand according to that specification, or reject as not understood. This specification combines elements of existing protocols and formats in a manner that is largely compatible with current practice, so the risk of inconsistency is low. However, there is a failure case here when the following conditions hold:

the probe URI is a hashless http: URI
a retrieval is successful
the client assumes that this specification is being respected, i.e. the fact of the retrieval means the identified resource is an "information resource"
the server does not respect this specification through either ignorance or choice, i.e. the server uses the URI to refer to something that is not an "information resource"
the fact that the server uses the URI in this way could end up mattering somehow to the client.

Editorial note
In case this combination of circumstances is considered important, a possible change proposal might therefore be to revisit the assertion "risk of inconsistency is low" and introduce changes to the definition of "nominal URI documentation carrier" to avoid error in cases in which these conditions would otherwise hold.

5.3 Inconsistency with the URI scheme

URI meaning is subject to normative specifications such as RFC 3986 [rfc3986], applicable URI scheme registrations, and media type registrations. The purpose of URI documentation is to provide URI-specific information that goes beyond what the normative specifications say, while retaining compatibility with them. URI documentation should not be inconsistent with constraints imposed by these specifications. The http: scheme imposes no such constraints^[6], but other schemes such as mailto: do.

6 Comparison with the TAG resolution

The above gives an interpretation of the TAG resolution [issue-14-resolved]. This section lists some important points of comparison between the preceding and [issue-14-resolved].

For reference, the critical part of [issue-14-resolved] is reproduced below:

a) If an "http" resource responds to a GET request with a 2xx response, then the resource identified by that URI is an information resource;

b) If an "http" resource responds to a GET request with a 303 (See Other) response, then the resource identified by that URI could be any resource;

c) If an "http" resource responds to a GET request with a 4xx (error) response, then the nature of the resource is unknown.

'"http" resource' is used in [issue-14-resolved] but not defined there, but it seems to mean a resource that someone uses an http: (or possibly https:) URI to "identify". The distinction in kind between what is identified and what could be appears to be immaterial, especially in light of (b).

The purpose of a 2xx HTTP status code is to signal successful retrieval (per [rfc3986]), but the HTTP protocol is only one way to perform a retrieval. In order to harmonize this specification with the architecture articulated in [rfc3986], the editor has therefore made the obvious generalization from the resolution's narrow scope of the HTTP protocol to retrieval in general.

The (b) clause does not say anything about which resource is "identified", but an informal practice has emerged whereby the See Other link is to documentation meant to establish what the probe URI means - that is, the URI is understood to "identify" according to that URI documentation. This interpretation is corroborated by [httpbis-2], section 7.3.4, which says

The Location URI indicates a resource that is descriptive of the target resource, such that the follow-on representation might be useful to recipients without implying that it adequately represents the target resource.

As an obscure technicality, because nobody is authoritative for what is or isn't an "information resource", the (a) clause can only be interpreted to mean that the resource is nominally (i.e. said to be) an information resource, not that it is one.

7 Disclaimer regarding the meaning of "meaning"

Editorial note

Larry Masinter is concerned that too cavalier a treatment of meaning may lead to mistakes. Ashok Malhotra is concerned that going into as much detail as is given here is a distraction. Some decision will have to be reached on whether and how to include the material below.

Henry Thompson gives the following advice:

"The terminology used here and elsewhere in discussing what URIs are about/for, that is words such as 'meaning', 'refer', 'identify' etc., are notoriously hard to pin down. This specification follows the lead of its predecessors in attempting to provide no explicit definitions for these terms. This may leave open the possibility of misinterpretation: change proposals are invited to attempt to explicitly rule out any misinterpretations they feel are particularly pernicious and/or likely, but are advised not to attempt to [rush in where others have feared to tread and] provide rigid definitions of such terms themselves."

This document does not define "meaning", "reference", or "identification" in any absolute sense. It only specifies a particular manner of coordination that may be used by agents that choose to use it. The word "meaning" is meant to be broad enough to encompass a wide variety of uses.

Some meaning comes from the URI scheme, as when a piece of software responsible for retrieval takes http://example.com/zebra to "mean" that it should check its HTTP cache, do a domain lookup, talk to a server, and present some results. It is not expected that this specification will bear on that kind of "meaning", although in principle it could.
Some kinds of "meaning" have to do with habit or convention, as when the author of some HTML uses a URI in an @href attribute to "mean" or refer to a particular document to which the user is to be directed (which might or might not be the one that is actually received, depending on the current state of the Web). They do not actually mean to refer to whatever happens to be received, as they probably have no control over that, but rather what can be reasonably expected using that URI. A diligent reader may be able to track down the intended document in spite of protocol behavior.
In RDF, "meaning" sometimes has to do with documents with which the URI is associated on the Web, similar to the HTTP or HTML cases, or (in addition or instead) what the URI "refers to" in RDF statements, which could be anything at all, depending on how the URI is documented and used.
In Web architecture ([rfc3986], [webarch]), the meaning of a URI is what it "identifies".
To a human being, a URI typically means nothing at all, although to the technically literate it could mean a variety of things relating to the way the URI is spelled or administered, to Web behavior, or to content displayed on either end of a hyperlink.

"Meaning" in general encompasses all such manners of use, with different facets surfacing in different contexts.^[7]

8 Acknowledgments

Larry Masinter, Henry S. Thompson, Ashok Malhotra, and other TAG members gave valuable advice on drafts of this document. Many of the ideas grew out of work done by the TAG's AWWSW Task Group.

9 References

cooluris: Leo Sauermann and Richard Cyganiak. Cool URIs for the Semantic Web. W3C Interest Group Note, 03 December 2008. (See http://www.w3.org/TR/2008/NOTE-cooluris-20081203/.)
generic: Jonathan A. Rees, editor. Generic resources and Web metadata. Editor's draft, W3C, 2012. (See http://www.w3.org/2001/tag/awwsw/ir/20120127/.)
httpbis-1: R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, Y. Lafon (editor), and J. Reschke (editor). HTTP/1.1, part 1: URIs, Connections, and Message Parsing. Revision of [rfc2616]. Work in progress, version 18, IETF, 2012. (See http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-18.)
httpbis-2: R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, Y. Lafon (editor), and J. Reschke (editor). HTTP/1.1, part 2: Message Semantics. Revision of [rfc2616]. Work in progress, version 18, IETF, 2012. (See http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-18.)
issue-14-resolved: Roy Fielding. [httpRange-14] Resolved. Email to www-tag list, 2005. (See http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html.)
issue-57-report: Jonathan A. Rees, editor. Providing and discovering definitions of URIs. W3C editor's draft, 25 June 2011. (See http://www.w3.org/2001/tag/awwsw/issue57/20120202/.)
linked-data: Tim Berners-Lee. Linked Data. Design note, June 2009. (See http://www.w3.org/DesignIssues/LinkedData.html.)
powder: Phil Archer, Kevin Smith, and Andrea Perego, editors. Protocol for Web Description Resources (POWDER): Description Resources. W3C Recommendation, 1 September 2009. (See http://www.w3.org/TR/powder-dr/#appD.)
rdf-concepts: Graham Klyne and Jeremy J. Carroll, editors. Resource Description Framework (RDF): Concepts and Abstract Syntax. W3C Recommendation, 10 February 2004. (See http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/.)
rfc2616: R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1. RFC 2616, IETF, 1999. (See http://www.ietf.org/rfc/rfc2616.txt.)
rfc3870: A. Swartz. application/rdf+xml Media Type Registration. RFC 3870, IETF, 2004. (See http://www.ietf.org/rfc/rfc3870.txt.)
rfc3986: T. Berners-Lee, R. Fielding, L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. RFC 3986, IETF, 2005. (See http://www.ietf.org/rfc/rfc3986.txt.)
rfc5988: M. Nottingham. Web Linking. RFC 5988, IETF, 2010. (See http://www.ietf.org/rfc/rfc5988.txt.)
webarch: Ian Jacobs and Norman Walsh, editors. Architecture of the World Wide Web, Volume One. W3C Recommendation, December 2004. (See http://www.w3.org/TR/webarch/.)
xhtml-ns: XHTML namespace document. Namespace document, occasionally revised, retrieved 31 January 2012. This document is revised from time to time as new specifications bearing on the XHTML namespace are published. (See http://www.w3.org/1999/xhtml.)

10 Change log

2012-01-31 Backed off from the word "instance" as the relationship between IR and its representations, and from the informal description of the generic resource idea. (Rely on reference to [generic] instead.)
2012-01-31 Promoted Link:/describedby from information resource section to general case section.
2012-02-08 Recast the whole thing as a definition (of nominal URI doc. carrier) rather than as a protocol. New adjective "nominal" used liberally. New section on "meaning" attempting to answer some of Larry's concerns. Expanded "stability considerations" section to broader scope of "inconsistency risks".

End Notes

[1]

See HTTPbis part 1 section 2.7.1 [httpbis-1] for a definition of "authoritative" in the context of HTTP. To simplify a bit, it means the response comes from the origin server.

[2]

We need to distinguish being a (plain old) "representation" of something from being a "nominal representation" from a URI because "is a current representation of" is not actionable; it is too easy to argue about. What you get from a successful retrieval is a representation of something (at least according to [httpbis-2]), but all we know about what the representation is of is that the URI owner says that it is a current representation of the target resource. E.g. if a URI identifies (according to the URI owner) the Magna Carta, and a retrieval yields a representation carrying Jabberwocky, then the retrieved representation is a nominal representation from the URI, but not necessarily a representation of the Magna Carta. To infer that it is a representation we need either trust in the URI owner, or more information.

[3]

When a URI with local identifier occurs in an RDF graph (not just the graph found via a nominal representation), the following passage from RDF Concepts [rdf-concepts] applies to its meaning:

"a URI reference in an RDF graph is treated with respect to the MIME type application/rdf+xml. Given an RDF URI reference consisting of an hashless URI and a fragment identifier, the fragment identifer identifies the same thing that it does in an application/rdf+xml representation of the resource identified by the hashless URI component."

This simply reinforces the representation consistency directive quoted previously. If there is no application/rdf+xml nominal representation this makes any URI meaning coming from, say, RDFa or some XML-based MIME type registration, out of reach of RDF. To reconcile [rdf-concepts] with [rfc3986] we must assume that when a URI with local identifier is used in an RDF graph specified according to the media type, there is a potential equivalent application/rdf+xml representation defining all of the local identifiers, even if such a representation is never delivered in a retrieval response.

Editorial note
Is there talk in the RDF WG of amending this passage when RDF Concepts gets revised?

[4]

In the editor's opinion the restriction to these two URI schemes seems superfluous: there is no reason this cannot also be true of, say, ftp:, data:, or tag: URIs. These are rarely used with the HTTP protocol, of course, but other retrieval protocols may apply.

[5]

The following language from [rfc3986] bears on the semantics of local identifiers.

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced.

This text is somewhat confusing concerning the distinction between what is retrieved and what is identified, so we propose the following interpretation:

The semantics of a local identifier are defined by the set of nominal representations that might result from a retrieval action using the primary resource's URI. The retrievals' formats, and hence the identity of the secondary resource, are therefore dependent on the media types of potentially retrieved representations, even though such retrievals are only performed if the URI is dereferenced.

A consequence of this is that if there are multiple simultaneous representations then they need to be consistent in what they convey about a local identifier, if it is to be meaningful beyond a single representation. That is, if two nominal representations assign meanings to a given local identifier, the meanings should be consistent:

If the primary resource has multiple representations, as is often the case for resources whose representation is selected based on attributes of the retrieval request (a.k.a., content negotiation), then whatever is identified by the fragment should be consistent across all of those representations. Each representation should either define the fragment so that it corresponds to the same secondary resource, regardless of how it is represented, or should leave the fragment undefined (i.e., not found). [rfc3986]

For URI definition discovery in the presence of content negotiation to behave correctly for a given local identifier, all retrievable representations should define the local identifier, and they should define it consistently with one another.
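The consistency condition just stated can be sketched as a simple check. The Python fragment below is only a toy model under stated assumptions: each nominal representation is modelled as a dictionary from local identifiers to an opaque "meaning" label, keyed by media type, and two meanings count as consistent only if the labels are equal. The function name check_local_identifier, the data layout, and the example labels are all hypothetical; real representations would need media-type-specific processing to extract what they say about a local identifier.

```python
def check_local_identifier(representations, local_id):
    """Check one local identifier across a set of nominal representations.

    representations: dict mapping media type -> dict mapping
        local identifier -> meaning label (an assumed toy model).
    Returns (all_define, consistent):
        all_define  - every representation defines local_id
        consistent  - the representations that define it agree
    """
    meanings = [
        definitions[local_id]
        for definitions in representations.values()
        if local_id in definitions
    ]
    all_define = len(meanings) == len(representations)
    consistent = len(set(meanings)) <= 1
    return all_define, consistent


# Hypothetical negotiated variants of one primary resource.
variants = {
    "application/rdf+xml": {"Person": "class of persons"},
    "text/html": {"Person": "class of persons", "intro": "a section"},
}
print(check_local_identifier(variants, "Person"))
print(check_local_identifier(variants, "intro"))
```

In this model a result in which the first value is false corresponds to the situation [rfc3986] tolerates (some representation leaves the fragment undefined) but which makes discovery depend on which variant content negotiation happens to select.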

The topic of representation consistency is also covered in [webarch] section 3.2. All of the considerations for fragment identifiers also apply in the hashless URI case (303, Link:), when there are RDF graphs or other mechanisms for documenting hashless URIs involved.

[6]

[rfc2616] is difficult to interpret on this question. [issue-14-resolved] asserts that there are no constraints in the 303 case. In the 2xx case, the server's assertions that nominal representations are in fact representations impose a constraint on what is "identified", to the extent that "representation" has any agreed meaning.

[7]

In the philosophy of language, meaning (i.e. semantics) and reference are distinct properties of linguistic tokens. For example, when the word "now" is used at two different times, it refers to different times in the two instances, without any change in meaning. Meaning in context determines reference. Meaning and reference coincide in the case of proper names.

According to [rfc3986], the semantics of a given URI is supposed to be uniform across contexts of use.

When a URI appears to refer to or "identify" something, especially in a declaration or statement that says that what it refers to has some type or has properties with particular values, this is a referential use of the URI. Uses of URIs in RDF are referential. This document does not take a stand as to whether uses of a URI as a hypertext link target, XML namespace indicator, HTTP request URI, or HTTP header value (as in Location:) are referential.

It is customary to speak of a URI as "identifying" a "resource". Although "identification" is related to meaning, this document makes no particular assumption regarding the relation between what a URI "identifies" and what the URI refers to. (One might hope, however, that except in rare cases a URI would refer to what it identifies.)

Depending on what is meant by "resource" it may or may not be possible to refer to and/or identify something that isn't a resource, but this question is outside the scope of this document.