W3C NOTE-link-@@Date

Describing and Linking Web Resources

W3C Note @@Date

This version:
(unpublished)
$Id: NOTE-link.html,v 1.6 1996/11/13 14:39:22 connolly Exp $
Latest version:
http://www.w3.org/pub/WWW/Architecture/NOTE-link
Editor:
Dan Connolly
Authors:

Status of this document

This is an unpublished NOTE. It provides background for a number of W3C activities. See section "Related Work" for details.

Abstract

The World Wide Web is the universe of network-accessible information, an embodiment of human knowledge. This document presents the basic hypertext structures used to represent knowledge in the web.


Contents

  1. Introduction
    1. Related Work
  2. Architecture
    1. Notation
    2. Link Semantics
    3. @@Belief, Authority and Authenticity
  3. Some Useful Link Relationships
    1. Reflection and Bootstrapping
    2. Indirection
    3. Hierarchical Relationships
    4. Sequence Links
    5. Collection Relationships
  4. Acknowledgements
  5. References

Introduction

The World Wide Web is the universe of network-accessible information, an embodiment of human knowledge. It is a distributed system where servers provide resources and clients, or user agents, provide access to them. Resources can have links to other resources, and the links can be typed to express relationships.

While resources can be encoded in any internet media type and served via any information retrieval protocol, the Hypertext Markup Language (HTML) and the Hypertext Transfer Protocol (HTTP) were specifically designed for this purpose, and have become the dominant mechanisms.

The first part presents the architectural basis for describing and linking web resources, common to all data formats and protocols used in the web. The second part discusses a number of link relationships that have proved useful in a wide variety of contexts.

Related Work

CSCW, Hypertext, AI knowledge rep research, semantic nets, Xanadu (Dexter, Minsky, Nelson), (LINCKs, algernon)

RFC822, USENET news: Summary, Keywords

TimBL's design issues (Hypertext Design Issues , Link Types) circa 1990

HTTP 1.0 Link:, Location: and URI: headers

used by SCO in its online documentation and context-sensitive help system.

Murray's REL/REV draft: HTMLWG

WD-style for STYLESHEET link type.

IAFA, URI WG on URCs.

Web-crawling search engines. /robots.txt. Tim Bray raises the issue of aliases in search results on HTML-WG

HTML 2.0: Nov 95

Dublin core

PICS specs

indexing workshop paper on meta schemas. May 95?. Keywords:, Description, no-index, no-traverse (X-No-archive?)

Generic resources, Resource spec TimBL Nov 95

Distributed Authoring: typed links to access control, version history

Sitemap proposal

Applying the Model to Other Formats

There are many established formats for representing meta-content such as sitemaps. This is an open-ended section that provides some examples of how the Shoe Company example presented in section 2 of this document may be represented using some of these existing formats. Given below are a couple of examples.

Outcomes

Forum
Drafts
HTML ERB

Dist Auth WG

Search forum
Search/WD-search-meta.html (no-index, no-traverse, aliases, hierarchy)

Architecture

A number of data formats besides HTML are used to describe and link web resources: MARC, SOIF, MCF, and PICS, to name but a few.

Though these formats use a wide range of syntaxes, they attempt to convey similar content. For example, the following descriptions differ in syntax but overlap heavily in meaning.

SOIF
@FILE {"http://www.shoes.com" 
	Author{4}: Fred
	Supercedes{30}: http://www.provider.com/shoes
	  }
PICS @@@
HTML
<about href="http://www.shoes.com">
	<meta name=author content="Fred">
	<link rel=Supercedes href="http://www.provider.com/shoes">
</about>

The architectural model that is common to them is the basic structure of the web: a directed graph with labelled arcs. The nodes (aka points or vertices) of the graph are URLs--anchor or resource addresses. The arcs are links. The labels are link relationships.

Associated with each node is a set of attributes, or slots, or fields. Each attribute has a name and a value. Values are defined in a media-type specific manner.

Link relationships may be names in a namespace defined by the media type. Each use of a relationship name should be associated with a URL; this gives the relationship a media-type-independent representation, and allows unambiguous translation.
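
As a rough illustration of this model, the following Python sketch holds the graph as a set of labelled arcs plus a table of per-node attributes. The names Link, Web, add_link, set_attribute and targets are hypothetical and chosen only for this example; the URLs and values are those of the SOIF and HTML descriptions above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """One labelled arc: relationship `rel` from `source` to `target`."""
    rel: str
    source: str
    target: str


@dataclass
class Web:
    """A directed graph with labelled arcs, plus attributes on each node."""
    links: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)  # node URL -> {name: value}

    def add_link(self, rel, source, target):
        self.links.add(Link(rel, source, target))

    def set_attribute(self, node, name, value):
        self.attributes.setdefault(node, {})[name] = value

    def targets(self, rel, source):
        """All targets reachable from `source` over arcs labelled `rel`."""
        return {l.target for l in self.links
                if l.rel == rel and l.source == source}


# The shoes example, independent of SOIF or HTML syntax.
web = Web()
web.set_attribute("http://www.shoes.com", "Author", "Fred")
web.add_link("Supercedes", "http://www.shoes.com",
             "http://www.provider.com/shoes")
```

Either syntax could be parsed into such a structure, which is one way to picture what it means for the formats to share an architectural model.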

Notation

We could use HTML for all examples, but to emphasize the independence of the model from any particular data format, we borrow notation from predicate logic[cite???] for the purpose of discussion. We write R(S, T) for a link from S to T with relationship R. The same notation suffices for attributes: we write N(S, V) for an attribute named N on an anchor at S with value V. In the example above, we have:

@@figure: semantic net, translation into HTML

Link Semantics

Anything can be considered a point in the web--including people, organizations, dates and subject categories--by giving it a URL. A link or attribute in the web can be interpreted as an assertion, given an understanding of the semantics of the link relationship or attribute name. For example, given the following definitions:

for Author(S, V) read:
The Author of S is V.
for Supercedes(S, T) read:
S supercedes T

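we can interpret the HTML or SOIF data as the following set of assertions:

The Author of http://www.shoes.com is Fred.
http://www.shoes.com supercedes http://www.provider.com/shoes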
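
The reading rules above can be applied mechanically. The Python sketch below is only an illustration: the names READINGS, ASSERTIONS and read are hypothetical, and the templates simply restate the two definitions given for Author and Supercedes.

```python
# Reading templates keyed by relationship or attribute name,
# restating the definitions given in the text.
READINGS = {
    "Author": "The Author of {s} is {t}.",
    "Supercedes": "{s} supercedes {t}.",
}

# Assertions R(S, T) taken from the SOIF and HTML example.
ASSERTIONS = [
    ("Author", "http://www.shoes.com", "Fred"),
    ("Supercedes", "http://www.shoes.com", "http://www.provider.com/shoes"),
]


def read(rel, s, t):
    """Render R(S, T) as a sentence, or None if R has no known reading."""
    template = READINGS.get(rel)
    if template is None:
        return None
    return template.format(s=s, t=t)


sentences = [read(*assertion) for assertion in ASSERTIONS]
```

A link whose relationship has no known definition yields None here, reflecting that an assertion can only be interpreted given an understanding of its semantics.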

@@Belief, Authority and Authenticity

@@Who said so? When?

@@version skew, forgery, mistakes, inconsistency

"ABC believes that document MA is a mirror of document A".

Some Useful Link Relationships

While any URL can be used as the type of any link, this allows for the unfortunate case of different parties using different URLs to mean the same thing. This section provides a reference description of some common and useful link relationships.

Reflection and Bootstrapping

A link relationship can be considered a point in the web by giving it a URL. Dereferencing that URL should yield a definition of the link relationship, whether in human-readable or machine-readable form.

For this purpose, we define the following relationships:

global(S, T)
The anchor S, which represents a link relationship locally to a resource, is defined globally at T.
implies(S, T)
S implies T; that is, from any link/assertion S(X, Y), deduce T(X, Y)
equivalent(S, T)
S is equivalent to T; that is, implies(S, T) and implies(T, S)
converse(S, T)
S is the converse of T; that is, for any link/assertion S(X, Y), deduce T(Y, X)
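
The deduction rules for implies, equivalent and converse can be sketched as a small fixed-point computation. This is an illustration under stated assumptions, not a normative algorithm: links are represented as (relationship, source, target) tuples, the function name deduce and the file names are hypothetical, and converse is applied only in the one direction the definition spells out.

```python
def deduce(links, meta):
    """Close a set of (rel, s, t) links under implies/equivalent/converse.

    `meta` holds assertions about relationships themselves,
    for example ("converse", "CHILD", "PARENT").
    """
    rules = set()  # (kind, S, T); "same" keeps (X, Y), "swap" yields (Y, X)
    for kind, s, t in meta:
        if kind == "implies":
            rules.add(("same", s, t))
        elif kind == "equivalent":  # implies(S, T) and implies(T, S)
            rules.add(("same", s, t))
            rules.add(("same", t, s))
        elif kind == "converse":    # from S(X, Y), deduce T(Y, X)
            rules.add(("swap", s, t))

    known = set(links)
    changed = True
    while changed:  # repeat until no rule produces a new link
        changed = False
        for rel, x, y in list(known):
            for kind, s, t in rules:
                if rel != s:
                    continue
                new = (t, x, y) if kind == "same" else (t, y, x)
                if new not in known:
                    known.add(new)
                    changed = True
    return known


meta = [("converse", "CHILD", "PARENT"), ("equivalent", "PREV", "PREVIOUS")]
links = [("CHILD", "book.html", "ch1.html"), ("PREV", "ch2.html", "ch1.html")]
result = deduce(links, meta)
```

The loop terminates because only finitely many triples can be formed from the relationship names and nodes already present.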

For example, using global, any HTML document can use the definition of parent given in the next section, as follows:

	<head>
	<link name=parent rel=global.
		href="http://www.w3.org/TR/WD-resource#parent">
	</head>

The package mechanism defined in [HTMLLINK] uses these relationships to bind HTML link relationship names to URLs. For example:

	<head>
	<link rel=schema name=w3c href="http://www.w3.org/TR/NOTE-link">
	<link rel=succ.w3c href="ch3.html">
	<link rev=succ.w3c href="ch1.html">
	</head>

The relationship name succ.w3c is associated with the URL http://www.w3.org/pub/WWW/TR/WD-resource#succ.

Indirection

Indirection is useful for forwarding etc. @@connection to HTTP Redirect

for POINTER(S, T) read:
S is a pointer through T. From POINTER(S, T) and R(T, X), deduce R(S, X).

For example:

	<a REL=POINTER HREF="launchbase.html#x">

and in launchbase.html:
	<a name=x href="http://www.x.org/">X Consortium</a>
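
The POINTER rule can be sketched in Python as follows. This is a minimal illustration, assuming links are (relationship, source, target) tuples; the function name follow_pointers, the relationship name LINK standing for the untyped anchor in launchbase.html, and the source address doc.html#a are all hypothetical.

```python
def follow_pointers(links):
    """From POINTER(S, T) and R(T, X), deduce R(S, X), to a fixed point."""
    known = set(links)
    changed = True
    while changed:
        changed = False
        pointers = [(s, t) for rel, s, t in known if rel == "POINTER"]
        for s, t in pointers:
            for rel, src, x in list(known):
                if src != t:
                    continue
                new = (rel, s, x)  # S inherits every link that T has
                if new not in known:
                    known.add(new)
                    changed = True
    return known


links = {
    ("POINTER", "doc.html#a", "launchbase.html#x"),
    ("LINK", "launchbase.html#x", "http://www.x.org/"),
}
resolved = follow_pointers(links)
```

Because R may itself be POINTER, chains of pointers are followed as well.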

@@put about here?

Hierarchical Relationships

It is quite common for documents to be developed or defined using a hierarchical model, or tree-like structure.

The entire set of relationships may be used by a user agent to build a map of the hierarchical structure(s) of a set of resources. User agents should provide toolbar buttons (for example, an "up" button for PARENT) for PARENT, FIRST, LAST, NEXT, and PREVIOUS links.

for PARENT(C, P) read:
C's parent is P.
for CHILD(P, C) read:
C's parent is P. Note that we have CONVERSE(CHILD, PARENT)
for SIBLING(S1, S2) read:
S1 and S2 are siblings. SIBLING is symmetric; i.e. from SIBLING(S1, S2), deduce SIBLING(S2, S1). SIBLING is many-to-many. Siblings share parents; i.e. from SIBLING(S1, S2) and PARENT(S1, P) deduce PARENT(S2, P).
for ROOT(E, R) read:
E is contained in a hierarchy rooted at R. Children inherit ROOT from parents; i.e. from ROOT(P, R) and PARENT(C, P), deduce ROOT(C, R). ROOT weakly implies FIRST; that is, from ROOT(E, R) deduce FIRST(E, R) unless a different FIRST(E, I) is evident.
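
The firm deductions in these definitions can be sketched in Python as follows. This is an illustration only: the function name close_hierarchy and the file names are hypothetical, links are (relationship, source, target) tuples, and the weak implication from ROOT to FIRST is deliberately left out because it is a default rather than a strict rule.

```python
def close_hierarchy(links):
    """Apply the CHILD, SIBLING and ROOT deductions until nothing new appears."""
    known = set(links)
    while True:
        new = set()
        for rel, a, b in known:
            if rel == "CHILD":
                # CONVERSE(CHILD, PARENT): from CHILD(P, C), deduce PARENT(C, P)
                new.add(("PARENT", b, a))
            elif rel == "SIBLING":
                # SIBLING(S1, S2) gives SIBLING(S2, S1)
                new.add(("SIBLING", b, a))
                # siblings share parents
                new.update(("PARENT", b, p)
                           for r, s, p in known if r == "PARENT" and s == a)
            elif rel == "PARENT":
                # children inherit ROOT from parents
                new.update(("ROOT", a, root)
                           for r, p, root in known if r == "ROOT" and p == b)
        if new <= known:
            return known
        known |= new


links = {
    ("PARENT", "ch1.html", "book.html"),
    ("SIBLING", "ch1.html", "ch2.html"),
    ("CHILD", "ch2.html", "sec2a.html"),
    ("ROOT", "book.html", "index.html"),
}
tree = close_hierarchy(links)
```

A user agent could build its map of the hierarchy from the resulting PARENT and ROOT links.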

Sequence Links

Given a set of documents, it is possible and often desirable to specify linear sequences to navigate through the set. A book, for example, is often organized as a linear sequence. With sequence links in each document, a user agent can step through or gather an entire book programmatically.

for FIRST(E, I) read:
E is in a sequence beginning with I. BEGIN is a synonym. FIRST weakly implies ROOT; that is, from FIRST(E, I) deduce ROOT(E, I) unless a different ROOT(E, R) is evident.
for LAST(E, I) read:
E is in a sequence ending with I. END is a synonym.
for NEXT(S, T) read:
T is next after S in sequence.
for PREVIOUS(S, T) read:
NEXT(T, S). PREV is a synonym.
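
Stepping through a sequence programmatically might look like the Python sketch below. It assumes links are (relationship, source, target) tuples and that each document has at most one successor, which the definitions above do not guarantee; the function name gather_sequence and the file names are hypothetical.

```python
def gather_sequence(links, start):
    """List the documents of the sequence containing `start`, in order."""
    successor = {}
    first = start  # fall back to `start` if no FIRST link is given
    for rel, s, t in links:
        if rel == "NEXT":
            successor[s] = t
        elif rel in ("PREVIOUS", "PREV"):
            successor[t] = s  # PREVIOUS(S, T) reads as NEXT(T, S)
        elif rel in ("FIRST", "BEGIN") and s == start:
            first = t

    order, seen = [], set()
    node = first
    while node is not None and node not in seen:  # guard against cycles
        order.append(node)
        seen.add(node)
        node = successor.get(node)
    return order


links = [
    ("FIRST", "ch2.html", "ch1.html"),
    ("NEXT", "ch1.html", "ch2.html"),
    ("PREVIOUS", "ch3.html", "ch2.html"),
]
book = gather_sequence(links, "ch2.html")
```

Starting from any document that carries a FIRST link, the walk returns to the beginning and then follows NEXT until the sequence ends.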

Collection Relationships

To represent sets or classes, we have the containment and subset relationships:

for in.(E, C) read:
E is an element of the collection or set C, or E is in the class C.
for part.(C1, C2) read:
C1 is part of or a kind of C2, or C1 is a subset or subclass of C2. part is transitive; i.e. from part(C1, C2) and part(C2, C3) deduce part(C1, C3)
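
The transitivity of part can be sketched in Python as a transitive closure over (C1, C2) pairs. The function name part_closure and the collection URLs are hypothetical, chosen only to illustrate the rule.

```python
def part_closure(part_links):
    """From part(C1, C2) and part(C2, C3), deduce part(C1, C3), repeatedly."""
    closure = set(part_links)
    while True:
        new = {(a, d)
               for a, b in closure
               for c, d in closure
               if b == c} - closure
        if not new:
            return closure
        closure |= new


parts = {
    ("http://www.shoes.com/sneakers", "http://www.shoes.com/footwear"),
    ("http://www.shoes.com/footwear", "http://www.shoes.com/catalog"),
}
all_parts = part_closure(parts)
```

The definitions above state no rule that combines in with part, so this sketch leaves membership links alone.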

Acknowledgements

Hypertext link relationships, specified by using the REL and REV attributes of the LINK and A elements, were conceived as an early feature of the HTML language. Amidst all of the various and sundry efforts that have been undertaken to advance HTML and the World Wide Web, a small set of widely accepted hypertext relationships has yet to be agreed upon and deployed in user agents.

Hypertext link relationships, and the attendant REL and REV attributes of the LINK and A elements, are discussed in Dave Raggett's Internet Draft on HTML 3.0. In addition, The Santa Cruz Operation, Inc. (SCO) has developed an HTML user agent, based on Mosaic, which incorporates the use of the REL attribute of the LINK element.

Tim Berners-Lee

References

[DUBREF]
Dublin Core Metadata Element Set: Reference Description August 29, 1996
[HTMLLINK]
Hypertext Links in HTML