RNodes

This document describes RNodes, the web equivalent of inodes.

Abstract

An RNode is a collection of content-related properties of a web resource. Formalizing the process for storing and retrieving these properties, and providing initial schemas for parameterizing them, will promote more rapid interoperability between web agents. RNodes are expressed in RDF, a language designed for describing web resources.

Language

Organizations and application domains will have interest in different RNode properties. The data needs to be recorded in a dynamic and extensible language. Because RNodes express relationships between different resources, the language must also represent graphs with links between arbitrary nodes. RDF was designed with these goals. RDF tools reduce the amount of application-specific code needed to process RNodes.

Common Components

Because RNodes are expressed in RDF, there is no limit to what properties of a resource may be described. What follows is a list of the likely common properties:

Access Control Lists
describe which parties have what access to a resource. (ACLs schema)
redirects
associate resources with temporary or permanent redirects. Dynamic redirects should minimize 404s. nobody needs DELETE, right?
cache control
tell intervening proxies when and how to cache the resource
ETag
identifies a specific version of a resource.
MIME info
collects information from the MIME headers used in HTTP.
Pipeline Processing Directives
describe the order of XML application processing needed for a "correct" understanding of the document. See the XML Processing Pipeline Model strawman.
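
As a rough illustration of the list above, an RNode can be pictured as a set of (subject, predicate, object) statements about one resource. The sketch below is only that: the `rnode:` predicate names, the example.org URLs, and the property values are all hypothetical, since this document does not define a vocabulary.

```python
from typing import Dict, List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def describe(resource: str, properties: Dict[str, str]) -> List[Triple]:
    """Turn a property mapping into triples about a single resource."""
    return [Triple(resource, pred, value) for pred, value in properties.items()]


def value_of(triples: List[Triple], subject: str, predicate: str) -> Optional[str]:
    """Return the first object for a known subject and predicate, or None."""
    for t in triples:
        if t.subject == subject and t.predicate == predicate:
            return t.obj
    return None


# Hypothetical resource and predicate names, chosen for illustration only.
RESOURCE = "http://example.org/2001/01/31-rnodes"
rnode = describe(
    RESOURCE,
    {
        "rnode:etag": '"abc123"',
        "rnode:contentType": "text/html; charset=iso-8859-1",
        "rnode:cacheControl": "max-age=3600",
        "rnode:redirectPermanent": "http://example.org/2001/02/rnodes",
    },
)

print(value_of(rnode, RESOURCE, "rnode:cacheControl"))
```

A real RNode would be serialized as RDF and read with an RDF toolkit; the plain tuples here only show the shape of the data.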

Omitted Topics

RDF provides a general mechanism for describing resources. Some schemas may be useful and ubiquitous but still not be considered RNodes:

Annotations
Records modifications, comments, and descriptive bookmarks for resources. (Annotations schema)
Topic and Site Maps
Collections of statements about the content of resources and the links between them.

These applications describe the same realm of objects as RNodes. In the data representation sense, the graphs touch. RNode metadata is not all the information a web agent may want to know about resources; it is the subset that is common to the logic and decision path of most servers.

Use Cases

Server Configuration
The most basic use of RNodes is to store server configuration. This provides a consistent, interface-free serialization that may be used by any number of servers. This is the main goal of RNodes, though the information can be extended and communicated to other agents to enable many other useful applications.
Site Mirror
An extension of the Server Configuration example is a situation where one site provides a mirror of some part of another site. The mirrored site may share with its mirrors the RNodes describing the portion of the site being mirrored.
Authentication and Realm Clarification
Browsers sampling from a site gather information from authentication-required and forbidden responses, apply heuristics, and help the user by reapplying authentication credentials. Some servers save a round trip by assuming that authentication information will be needed for subresources (subdirectories) of a resource already known to require authentication. Others only supply the information when asked. Communicating ACLs to browsers can help them present this information to the user (for example, links to documents to which the user has been denied access appear in another color) and prompt the user when entering a new realm.
Local Browsing With Negotiation
Many web servers use generic names for resources and resolve them to specific entities based on the negotiation constraints. Document authors are encouraged to use the generic names in links in order to enable this mechanism. Local browsing (via file: rather than http:) breaks when the browser is left with insufficient information to conclude that file:/2001/01/31-rnodes refers to file:/2001/01/31-rnodes.html or file:/2001/01/31-rnodes.xml. Communicating MIME headers like Content-type along with a direct (resource1--hasQuality->"0.8") or indirect (rule1--globExpression->"*.xml" rule1--hasQuality->"0.8") way to express the author's preferences will allow browsers to adopt server policies and behave in an offline mode as they would in an online mode.
Site Navigation
While site maps have been omitted, the information in an RNode is useful in navigating a site. Additional content and link information complements RNodes to provide the functionality of site maps.
Document Subscription
Passing entity (document) expiration information to proxies and clients enables them to tag those entities as expired and do a refresh the next time the resource is accessed. This is an economical and effective model for publish/subscribe.
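
The Local Browsing With Negotiation case above can be sketched as a small resolver that maps a generic name to a concrete file using either a direct quality statement or glob rules. This is a minimal sketch under stated assumptions: the function names are hypothetical, the "*.xml" rule with quality 0.8 is taken from the example in the text, and the "*.html" rule and the file list are invented for illustration.

```python
from fnmatch import fnmatch
from typing import Dict, List, Optional, Tuple

# Indirect preferences: (glob expression, quality) rules.
# "*.xml" at 0.8 comes from the text; "*.html" at 1.0 is an assumption.
RULES: List[Tuple[str, float]] = [("*.html", 1.0), ("*.xml", 0.8)]


def quality(path: str, rules: List[Tuple[str, float]], direct: Dict[str, float]) -> float:
    """Quality of one file: a direct statement takes priority over glob rules."""
    if path in direct:
        return direct[path]
    return max((q for pattern, q in rules if fnmatch(path, pattern)), default=0.0)


def resolve(
    generic: str,
    candidates: List[str],
    rules: List[Tuple[str, float]] = RULES,
    direct: Optional[Dict[str, float]] = None,
) -> Optional[str]:
    """Pick the highest-quality variant of a generic name, or None if none qualifies."""
    direct = direct or {}
    variants = [
        c for c in candidates
        if c.startswith(generic + ".") and quality(c, rules, direct) > 0
    ]
    return max(variants, key=lambda c: quality(c, rules, direct), default=None)


# Hypothetical local directory listing.
files = [
    "/2001/01/31-rnodes.html",
    "/2001/01/31-rnodes.xml",
    "/2001/01/31-rnodes.png",
]
print(resolve("/2001/01/31-rnodes", files))
```

A browser working offline could apply the same rules the server publishes, so that a link to the generic name resolves to the same variant in both modes. Real negotiation also weighs the client's own Accept preferences, which this sketch leaves out.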

Design Constraints

To minimize the query infrastructure requirements, the common uses of the RNode schema are designed so that every query has a known subject and predicate. For instance, looking up the "text/html; charset=iso-8859-1", "English" variant of a resource is an Algae query of the form

(ask '((neg::Set http://www.w3.org/ ?set)
      (neg::Mime-text-html ?set ?html)
      (neg::Language-English ?html ?html-eng))
      collect '(?html-eng))

Conversely, finding all the generic resources for ?html-eng requires a more difficult query of the form

(ask '((neg::Set ?r ?set)
      (?mime ?set ?html)
      (?language ?html ?html-eng))
      collect '(?r ?mime ?language))

which may not be supported by access-limited query engines like Algernon.
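
The difference between the two queries can be illustrated with a store that indexes statements only by (subject, predicate). This is a sketch, not a description of how Algae or Algernon work internally: the `SPIndex` class, the blank-node labels, and the `variant:html-eng` identifier are hypothetical, while the `neg::` predicates and the http://www.w3.org/ subject come from the queries above.

```python
from collections import defaultdict


class SPIndex:
    """Statements indexed by (subject, predicate) only."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, predicate, subject, obj):
        # Argument order mirrors the (predicate subject object) query form above.
        self._index[(subject, predicate)].append(obj)

    def objects(self, subject, predicate):
        """Forward lookup: one dictionary access."""
        return list(self._index.get((subject, predicate), []))

    def scan(self, obj):
        """Reverse lookup: every (subject, predicate) entry must be visited."""
        return [(s, p) for (s, p), objs in self._index.items() if obj in objs]


def follow(index, start, predicates):
    """Walk a chain of known predicates from a known subject."""
    frontier = [start]
    for predicate in predicates:
        frontier = [o for node in frontier for o in index.objects(node, predicate)]
    return frontier


def generic_resources(index, variant):
    """Find (resource, mime predicate, language predicate) by scanning backwards."""
    results = []
    for html, language in index.scan(variant):
        for set_node, mime in index.scan(html):
            for resource, pred in index.scan(set_node):
                if pred == "neg::Set":
                    results.append((resource, mime, language))
    return results


idx = SPIndex()
idx.add("neg::Set", "http://www.w3.org/", "_:set")
idx.add("neg::Mime-text-html", "_:set", "_:html")
idx.add("neg::Language-English", "_:html", "variant:html-eng")  # hypothetical variant id

chain = ["neg::Set", "neg::Mime-text-html", "neg::Language-English"]
print(follow(idx, "http://www.w3.org/", chain))
print(generic_resources(idx, "variant:html-eng"))
```

The forward walk costs one lookup per step, while the reverse walk scans the whole index at each step. An engine that only supports the first kind of access cannot answer the second query without an additional index.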

Implementations

GetMeta HTTP Extension
Written for Apache 2 and started for Mozilla.
RDF-based Access Control Lists
Access to the W3C web site is managed by access control lists written in RDF.

Related Links

earlier Rnodes document
rough sketches, team only

Eric Prud'hommeaux

Last modified: Wed Nov 14 16:42:14 EST 2001
