INTERNET-DRAFT U-REST
draft-ietf-frystyk-http-urest
Expires: XXXX

H. Frystyk Nielsen, W3C/MIT
John Mallery, MIT/AI
Lewis Girod, MIT/LCS
Benjie Chen, MIT/LCS
Monday, July 26, 1999

URI Resolver Transport Protocol (U-REST)

Status of this Document

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at "http://www.ietf.org/ietf/1id-abstracts.txt".

The list of Internet-Draft Shadow Directories can be accessed at "http://www.ietf.org/shadow.html".

Please send comments to the mailing list. This list is archived at "http://lists.w3.org/Archives/Public/ietf-http-ext/".

Abstract

In order for the Web to continue to grow and prosper, information publishers must have readily available and stable URIs [13] to identify Web resources. "Stable" can in this context be interpreted along many different axes, including stability over time, access mechanism, locality, etc. Although the stability of URIs depends highly on social engineering, including thinking about Web resources not as local files but as resources in a global information space, it is important that the Web infrastructure supports the evolution of protocols, transports, and access models without requiring changes to already deployed URIs.
Currently this form of extensibility is primarily supported through the invention of new URI schemes or through inline negotiation in the protocol used to resolve the URI, for example with the HTTP Upgrade header field. The former has serious implications for the long-term integrity of the Web, and the latter does not allow general resolver data to be exchanged as first class objects.

This document defines a lightweight extension to HTTP [14] called "URI Resolver Transport Protocol" (U-REST). U-REST is based on the HTTP Extension Framework [15] and extends HTTP to support the transfer of metadata about URIs and how to resolve them.

The purpose of U-REST is to allow stable URIs to be deployed while supporting distributed protocol evolution without having to change already deployed URIs or URI schemes. The extension does not define or require a particular resolution mechanism. Rather, it defines a simple mechanism for carrying URI resolver information using HTTP as a transport, with support for both implicit and explicit resolution of URIs.

Table of Contents

1. Introduction
   1.1 Terminology
   1.2 Purpose of U-REST
   1.3 Requirements
   1.4 Design Rationale for Using HTTP
2. Operational Overview
3. Notational Conventions
4. U-REST Extension Identifier
5. Protocol Specification
   5.1 HTTP Status code: 350 Resolution Delegated
   5.2 Resolver Location
   5.3 Resolver Control Directives
      5.3.1 Fragment
      5.3.2 Schema
6. Convergence Errors
7. Security Considerations
8. References
9. Acknowledgements
10. Examples
   10.1 Resolving a URN

1. Introduction

URIs are the fundamental mechanism for identifying resources in the Web. Many existing URI schemes contain implicit information about how to access a particular resource, typically as a function of the URI access scheme like "http:", "news:", etc. Some of these schemes are location dependent, some are not; some schemes may be intended to be persistent, others may not, and so on.

The problem is that these characteristics often change over time without any general mechanism for expressing these changes in the current Web model. The following are well-known examples of such characteristics that tend to change independently of each other:

Locality (where can I get it?)

   Global availability of resources may be obtained by local mirroring of parts of the URI space. Mirroring can either be directly supported in the URI scheme, as is the case for "news:", or be implemented in an ad hoc manner, as is often the case in HTTP based mirroring. Two typical mechanisms for mirroring "http:" URIs are to make the mirror explicit in the document ("This document can also be obtained from ...") or to do multihomed host DNS selection based on the AS number of the requestor; both have considerable drawbacks.

Access Mechanism (how can I get it?)

   A single resource may be available through different access protocols supported by the party serving the resource.
   These access protocols may or may not be compatible: HTTP/0.9, HTTP/1.0, and HTTP/1.1 are backwards compatible protocols, but HTTP running on top of SSL is not, although it is in fact using HTTP as one of the access protocols. Lacking the capability to express and negotiate protocol stacks forces the introduction of new URI schemes like "shttp:", causing serious deployment and evolvability problems.

Persistence (how long can I get it?)

   The persistence of a URI, that is, the time period during which the URI can be resolved, often depends on the contents or may be obtained through contractual agreements based on delegation of the URI space. Note that this is different from the max-age cache control directive in HTTP/1.1, which indicates how long an HTTP response is fresh, not the persistence of the URI itself. For example, a URI pointing to today's newspaper is not expected to change, but the content is. Being able to express additional information about the persistence of a URI would be of great potential benefit to search services, indexes, etc.

Along with features like content negotiation, these characteristics can be summarized in the following question: Under which circumstances can a URI be compared to itself as well as to other URIs; what are the semantics of the comparison; and how can this relationship be expressed without changing the URI itself?

1.1 Terminology

In this document, we use the following terminology, which is not new to the Web community but which we define explicitly here for clarity.

fragment or view

   A string at the end of a URI which identifies, within a Web document, a part or view to which one refers. The view, which is a function of the media type, is separated from the URI by a crosshatch ("#") character.

renderer

   An application that takes an input stream and produces a representation as output.
   The representation is typically a function of the media type of the input data, the requested view, and possibly stylistic information provided by style sheets etc.

resolver

   An application that translates a URI into another URI or, in case it is the authoritative resolver, directly into the requested resource.

resolution

   The sequenced, possibly nested set of operations performed by one or more resolvers, which may result in an entity being generated and returned to the requestor.

1.2 Purpose of U-REST

The purpose of U-REST is to allow relationship information such as the examples above to be returned by a URI resolver mechanism instead of having to change the information in the URI itself, often using ad hoc mechanisms. The advantage is that more stable names can be introduced in the Web.

Other information that may be passed around using U-REST is information about services supported by the resource, privacy policies, pricing policies, content ratings, etc.

A fundamental design principle behind URNs [18] has been to separate the URI assignment from the resolution of URIs. Regardless of the potential benefits of this approach, it tends to fall short on solving the bootstrapping problem of how to resolve a name if it doesn't contain any hints about how to resolve it.

Instead of attempting to design yet another URI resolver, this proposal breaks down the problem by separating the resolver transport protocol from the resolver mechanism and only specifies the former. U-REST is a simple extension to HTTP that allows HTTP to be used as a URI resolver transport protocol while imposing as few restrictions on the URI resolution mechanism as possible.

U-REST does not define the resolution process of finding which HTTP server to contact; this would have to be implemented as a separate service, which of course can use U-REST as its transport protocol.
1.3 Requirements

In order for one or more resolvers to be able to use U-REST for URI resolution, it is required that:

   o the protocol is independent of specific URI schemes;

   o the protocol can transfer authoritative and non-authoritative information provided by resolvers;

   o the protocol is independent of the format of the information passed around by the resolvers;

   o resolver information can be identified as a first class object using URIs;

   o it is possible for resolvers to detect infinite resolution loops and to decide whether a resolution is likely to converge or not;

   o the protocol can work without breaking existing Web applications.

Note that it is not a requirement for the resolver protocol to provide a mechanism for finding a resolver providing URI resolution services. Within HTTP, this is equivalent to the problem of a client finding a proxy for accessing the Web. Although this is a very important problem, it should be solved for all uses of HTTP proxies and not only for special cases.

1.4 Design Rationale for Using HTTP

As a potential resolver protocol [16][17], HTTP already fulfills the majority of the requirements described above:

   o HTTP is independent of specific URI schemes;

   o HTTP is independent of the format of the information passed around;

   o Web resources are identified as first class objects using URIs; and

   o HTTP does not break existing Web applications.

The remaining requirements are the additional features defined in this document:

   o the protocol can transfer authoritative and non-authoritative information provided by resolvers;

   o it is possible for resolvers to detect infinite resolution loops and to decide whether a resolution is likely to converge or not.

2. Operational Overview

A resource that is discovered through the resolution process will respond with a representation of itself, based on the user preferences, application capabilities, and view indicated in the request. This representation is returned to the client as a response entity and forms the input stream to the renderer. The resolution process can be described recursively as

   renderer ( resolution ( Request-URI, view ) )

where the Request-URI, the view, and the resolver may change while iterating through the resolution process.

By applying this model to HTTP itself, the origin server for a resource referenced by an "http:" URI is the authoritative resolver for that URI, and a proxy server is a non-authoritative resolver for that URI. Furthermore, as HTTP can carry arbitrary URIs in a request, an HTTP server can be a resolver for arbitrary URIs: proxy servers as non-authoritative resolvers and gateways as authoritative resolvers.

U-REST is intended to be used as follows:

   o A U-REST compliant client issues an HTTP request to its preferred resolver server with the URI to be resolved as the Request-URI. The client indicates that it understands U-REST using the HTTP Extension Framework [15];

   o If the resolver server is non-authoritative, it can either a) reply with information about where to find an alternative resolver, which may or may not be authoritative; or b) proxy or gateway the request itself;

   o If the resolver is authoritative, it replies as it normally would.

No promise is made by this extension that the resolution converges towards a resource, nor that a URI is guaranteed to be understood by the resolver (see section 6 for more details).

Currently, HTTP requests do not include the fragment or view identifier of the URI.
However, as the view is a function of the media type of the response entity, it is important that the view is passed along in the resolution process, as it may not be defined across the set of representations that the resource can return to the client.

3. Notational Conventions

This specification uses the same notational conventions and basic parsing constructs as RFC 2616 [14]. In particular the BNF constructs "token", "quoted-string", and "field-name" in this document are to be interpreted as described in RFC 2616 [14].

For definitive information on URI syntax and semantics, see RFC 2396 [13]. This specification adopts the definitions of "URI-reference" and "uric" from that specification.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [9].

4. U-REST Extension Identifier

The URI used to identify this extension within the HTTP Extension Framework [15] is defined as

   U-REST-specification-URN = "urn:specs:U-REST"

This identifier uniquely identifies the U-REST extension and MUST NOT be used for any other purpose.

5. Protocol Specification

This section defines the protocol components of the U-REST HTTP extension. It contains the definition of a new HTTP status code called 350 (Resolution Delegated) (see section 5.1) and two new header fields, which are both controlled by the rules defined by the HTTP Extension Framework [15] (see sections 5.2 and 5.3).

5.1 HTTP Status code: 350 Resolution Delegated

This status code indicates that the resolution process has been delegated to one or more alternative resolvers indicated in the response using the res-loc header field (see section 5.2).

A 350 response can be issued by any resolver receiving a request for resolving a URI.
Note that this differs from most other HTTP status codes in that it is issued by the resolver serving the response, which may not be within the same trust domain as the origin server serving the Request-URI. We expect that document signing will help establish the required trust between two parties to allow them to communicate the appropriate resolver information independent of any a priori trust relationship (section 7).

The response MAY include an entity containing additional metadata about the resource. If that entity is accessible from a separate location then this SHOULD be indicated using the Content-Location header field (see [14]).

350 responses are cacheable unless indicated otherwise.

5.2 Resolver Location

The res-loc response-header field MUST be included in 350 (Resolution Delegated) response messages (see section 5.1). The field value normally consists of one or more resolver addresses, each indicating the location of a possible next resolver in the resolution chain.

   resolver         = "res-loc" ":" #resolver-address
   resolver-address = <"> URI-reference <">

If the resolver-address is a relative URI, the relative URI is interpreted relative to the Request-URI.

If more than one resolver-address is provided, the client SHOULD try to determine which of the resolvers is the optimal one, for example based on connectivity, trust, etc.

If the res-loc field-value is empty then no resolver could be found that could resolve the URI and the resolution process stops.

5.3 Resolver Control Directives

The res-ctrl general-header field is used to specify directives that MUST be obeyed by the recipient. The directives typically override default resolution behavior or contain additional information intended to guide the resolution in a certain direction. Resolver control directives are unidirectional in that the presence of a directive in a request does not imply that the same directive should be given in the response.
   resolver-control            = "res-ctrl" ":" 1#resolver-directive
   resolver-directive          = resolver-request-directive
                               | resolver-response-directive
   resolver-request-directive  = "schema" "=" <"> URI-reference <">
                               | "fragment" "=" uric
                               | resolver-extension
   resolver-response-directive = resolver-extension
   resolver-extension          = token [ "=" ( token | quoted-string ) ]

Relative URIs in resolver-control directives are interpreted relative to the base URI of that message.

As the parameters used in the resolution process may change (see section 2), the resolver control directives may also change.

5.3.1 Fragment

The fragment resolver directive can be used to pass the fragment identifier of the Request-URI to the resolver. As mentioned in section 2, the fragment identifier may change as a result of the resolution and hence SHOULD be presented to the resolver.

5.3.2 Schema

In a request, the schema resolver directive can be used to indicate which additional metadata the client requests in a 350 (Resolution Delegated) response. The value of the schema directive is a URI defining the semantics of the metadata. It is not within the scope of this specification to define a language for describing or defining a schema.

If the response varies as a function of the schema resolver directive then this SHOULD be indicated using the HTTP Vary header field (see [15] for a description of how to use the Vary header field in HTTP Extensions based on the HTTP Extension Framework).

6. Convergence Errors

When a resolver receives a resolution request for a URI, it SHOULD attempt to resolve the URI, making use of any resolution control information provided by the client (section 5.3).

If the resolver is authoritative, it replies as it normally would. If the resolver cannot itself complete the resolution, it can do one of the following:

   1. Respond with a 350 (Resolution Delegated) status code indicating that the resolution has been delegated to alternate resolvers (section 5.1). Redirection and delegation information MUST be conveyed via the res-loc header field (section 5.2); an empty res-loc field-value indicates that no resolver could be found.

   2. Proxy or gateway the request to an upstream resolver and return the proxied response to the client. This mechanism can also be used to interoperate with non-U-REST compliant applications that may still be using a resolver mechanism.

If a server detects an unrecoverable error in the resolution process, it SHOULD return a 350 (Resolution Delegated) response to the client with an empty res-loc field-value. This can for example be the case if it detects a resolution loop or does not know how to resolve the Request-URI.

If a client detects a resolution loop by inspecting the res-loc header field, then it SHOULD immediately stop the resolution process and report an error.

7. Security Considerations

Possible conflicts between the HTTP trust model and the 350 response raise security concerns. In short, 350 responses without security extensions are responses from untrusted resolvers. Measures such as loop-avoidance should be applied to detect and prevent denial-of-service attacks.

Implementations of U-REST should follow the security restrictions of the environment the resolver operates in. For example, resolvers on firewalls operating under both single-step and delegation proxy behaviors may be required to filter out resolution requests from outside the firewall that intend to use an internal resource. Such requests, in most cases, are not allowed. However, it is essential that such proxying resolvers forward resolution requests from internal clients to the outside world, unless an organization intends to mirror resolution services over all URI namespaces internally.
Many security concerns of URI resolution, such as authenticity of resolution information, are problems that require further study. These considerations are beyond the scope of this document.

8. References

[9] S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, Harvard University, March 1997.

[10] R. Daniel, M. Mealling, "Resolution of Uniform Resource Identifiers using the Domain Name System", RFC 2168, June 1997.

[11] R. Daniel, "A Trivial Convention for using HTTP in URN Resolution", RFC 2169, June 1997.

[12] K. Sollins, "Architectural Principles of Uniform Resource Name Resolution", RFC 2276, MIT, September 1997.

[13] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax and Semantics", RFC 2396, August 1998.

[14] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, U.C. Irvine, DEC W3C/MIT, DEC, W3C/MIT, W3C/MIT, June 1999.

[15] H. F. Nielsen, P. Leach, S. Lawrence, "HTTP Extension Framework", draft-http-ext-mandatory, March 15, 1999. This is work in progress.

[16] Purls @@@what link@@@

[17] T-http @@@what link@@@

[18] URN framework - need reference to a draft saying this - know I have seen it - couldn't find the draft

9. Acknowledgements

The contribution of World Wide Web Consortium (W3C) staff is part of the W3C HTTP Activity (see "http://www.w3.org/Protocols/Activity").

10. Examples

10.1 Resolving a URN

A client wants to resolve "urn:cid:9802032044@thebe.lcs.mit.edu".
It sends a resolution request to "http://urn.org":

   GET urn:cid:9802032044@thebe.lcs.mit.edu HTTP/1.1
   Host: urn.org
   Opt: "urn:specs:U-REST"
   ...

The resolver at "http://urn.org" determines that the URI can be resolved using another resolver, and sends back a 350 (Resolution Delegated) response:

   HTTP/1.1 350 Resolution Delegated
   res-loc: "http://thebe.lcs.mit.edu/;scope=urn%3Acid%3A"
   Content-Type: application/rdf
   ...

To continue the resolution process, the client makes another resolution request, this time to "http://thebe.lcs.mit.edu":

   GET urn:cid:9802032044@thebe.lcs.mit.edu HTTP/1.1
   Host: thebe.lcs.mit.edu
   Opt: "urn:specs:U-REST"
   res-ctrl: hint="http://thebe.lcs.mit.edu/;scope=urn%3Acid%3A"

The resolver at thebe.lcs.mit.edu is the authoritative resolver for the URI. It returns the requested entity:

   HTTP/1.1 200 OK
   ...
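
The exchange in section 10.1 can be sketched as a small client loop. The Python fragment below is an illustration under stated assumptions, not part of the protocol definition. The names parse_res_loc, resolve, ResolutionError, and max_hops are hypothetical, as is the send callable, which stands in for whatever HTTP transport an implementation uses and is assumed to return response headers as a dictionary with lower-case field names. The sketch simply takes the first resolver-address, whereas section 5.2 says a client SHOULD try to determine the optimal one, and it omits res-ctrl directives.

```python
import re
from urllib.parse import urljoin

EXTENSION_ID = "urn:specs:U-REST"  # extension identifier from section 4


class ResolutionError(Exception):
    """Raised when the resolution cannot converge (hypothetical name)."""


def parse_res_loc(field_value, request_uri):
    """Return the resolver addresses listed in a res-loc field-value.

    Each resolver-address is a double-quoted URI-reference (section 5.2).
    Relative references are interpreted relative to the Request-URI.
    """
    addresses = re.findall(r'"([^"]*)"', field_value)
    return [urljoin(request_uri, address) for address in addresses]


def resolve(uri, resolver, send, max_hops=10):
    """Follow 350 responses until a resolver answers with another status.

    send(resolver, uri, headers) is an assumed transport hook that issues
    the HTTP request and returns a (status, headers, body) tuple.
    """
    visited = set()
    for _ in range(max_hops):
        # Client-side loop detection, as recommended in section 6.
        if resolver in visited:
            raise ResolutionError("resolution loop detected at " + resolver)
        visited.add(resolver)
        request_headers = {"Opt": '"%s"' % EXTENSION_ID}
        status, headers, body = send(resolver, uri, request_headers)
        if status != 350:
            return status, body
        candidates = parse_res_loc(headers.get("res-loc", ""), uri)
        if not candidates:
            # An empty res-loc field-value stops the resolution (section 5.2).
            raise ResolutionError("no resolver could be found for " + uri)
        resolver = candidates[0]  # simplification: take the first address
    raise ResolutionError("gave up after %d delegations" % max_hops)
```

Passing the transport in as a callable keeps the sketch independent of any particular HTTP library. The max_hops limit is an extra safeguard assumed here for resolutions that never revisit a resolver yet never converge; the draft itself does not define such a limit.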