Version 2.0, 3rd April 1996 © Martin Bryan, The SGML Centre
This paper suggests how HTML could be extended to allow better management of the relationships between a source document and its translations. It also suggests how topic maps could be defined in HTML headers to provide indexes of topics covered in a document or document set.
This proposal takes into account the extensions to RFC 1866 currently being proposed by the HTML Working Group at the IETF. In particular it will refer to the proposal to allow an ID attribute to be associated with any element. Text showing the current definitions of the link and anchor elements, and proposed extensions to their use, is shown in Annex A.
The existing <LINK> and <A> elements allow a start to be made on controlling translation referencing. The first problem is that of identifying the translations of a retrieved document. If the document starts with link statements of the following form it will be possible to identify the translations that are available for the currently displayed document:
<BASE href="http://www.myco.org/pub/subject/en/myfile.htm">
<LINK name=author href="mailto:author@myco.com" title="Author" rev=made>
<LINK name=spanish href="../sp/myfile.htm" title="Español" rel=translation>
<LINK name=french href="../fr/myfile.htm" title="Français" rel=translation>
<LINK name=german href="../de/myfile.htm" title="Deutsch" rel=translation>
Unfortunately current browsers do not display link elements, so these elements will not be selectable by users.
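A user agent could gather these links itself and offer them as a menu. The following is a minimal Python sketch of that idea, not a description of any existing browser. It uses the standard html.parser module; the function name find_translations, the case-insensitive matching of the rel value and the explicit href attribute on BASE are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _TranslationCollector(HTMLParser):
    """Collects <BASE> and <LINK rel=translation> elements from a document."""

    def __init__(self):
        super().__init__()
        self.base = ""
        self.links = []  # (title, href) pairs in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "base" and attributes.get("href"):
            self.base = attributes["href"]
        elif tag == "link":
            # REL is a whitespace-separated list of relationship names.
            relationships = (attributes.get("rel") or "").lower().split()
            if "translation" in relationships and attributes.get("href"):
                self.links.append((attributes.get("title"), attributes["href"]))


def find_translations(html_text):
    """Return (title, absolute URL) pairs, one per translation link."""
    collector = _TranslationCollector()
    collector.feed(html_text)
    return [(title, urljoin(collector.base, href))
            for title, href in collector.links]


HEAD = """
<BASE href="http://www.myco.org/pub/subject/en/myfile.htm">
<LINK name=author href="mailto:author@myco.com" title="Author" rev=made>
<LINK name=spanish href="../sp/myfile.htm" title="Español" rel=translation>
<LINK name=french href="../fr/myfile.htm" title="Français" rel=translation>
<LINK name=german href="../de/myfile.htm" title="Deutsch" rel=translation>
"""

for title, url in find_translations(HEAD):
    print(title, url)
```

The author link is skipped because it carries rev=made rather than rel=translation, and each relative href is resolved against the BASE address.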
Until browsers offer mechanisms for listing links in menus that users can use to interconnect files, the only alternative is to turn the links into anchors within the body of the document, using definitions such as:
<A name=spanish href="../sp/myfile.htm" title="Espanol" rel=translation>
[Español]</a>
<A name=french href="../fr/myfile.htm" title="Francais" rel=translation>
[Français]</a>
<A name=german href="../de/myfile.htm" title="Deutsch" rel=translation>
[Deutsch]</a>
Note that there is a different model for links (which are declared using the EMPTY keyword and therefore have no content or end-tag) and anchors (which must have content or, at very least, an end-tag).
When the new set of common attributes becomes available it will be possible to extend the above descriptions as follows:
<A name=spanish href="../sp/myfile.htm" title="Espanol" rel=translation lang=es class=automatic>
[Español]</a>
<A name=french href="../fr/myfile.htm" title="Francais" rel=translation lang=fr class=manual>
[Français]</a>
<A name=german href="../de/myfile.htm" title="Deutsch" rel=translation lang=de class=semi-automatic>
[Deutsch]</a>
In this case I have used class to indicate whether conversion was done automatically by a program, manually by a skilled translator or semi-automatically, using human correction to an automatic translation.
Now let us look at how we can name points within a document. At present the only valid mechanism that will work across all browsers is through use of the <A name=xyz> mechanism, as the HTML 2.0 spec does not allow IDs to be associated with elements. If you want to attach a name to a paragraph you have to do something along the lines of:
<p><a name=SGML></a>The uses of SGML are legion ...
Note that I have named the paragraph here, not identified a piece of text as an anchor. The anchor in this case has no content. Whilst it would be possible to place the end-tag for the anchor at the end of the paragraph, this would serve no real purpose and would introduce the risk that the limited model of HTML could be broken by elements within the paragraph.
If I translate this file into French, the translated file would contain something like:
<p><a name=SGML></a>Les usages de SGML sont legion ...
[Excuse my lousy French!]
I don't need to rename the object as anchor names are local to the file they are in.
Note: HTML "fragment identifiers" can only identify the text within the anchor element they point to: they cannot be used to identify element sets. HTML anchors can at most identify a set of characters within a single element. Alternatively, as shown here, they can identify a spot in the document.
Links such as <a href="#SGML"> will take you to the point in the current document that has the name SGML assigned to an anchor. All you need to do to move from one translation to another is to switch from the BASE definition of the document you are working in to that of the translation listed in the LINK statements that is appropriate to your request. A mechanism for doing this would be very easy to define in an extension to the HTML 2.0 specification.
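As a rough illustration of how small such a mechanism could be, the following Python sketch keeps the current fragment identifier and swaps the document address for that of the chosen translation. The function name and its parameters are hypothetical; only the URLs come from the examples above.

```python
from urllib.parse import urldefrag, urljoin


def same_point_in_translation(current_url, base_url, translation_href):
    """Return the address of the identically named anchor in a translation."""
    # Keep only the fragment identifier of the point currently displayed.
    _, fragment = urldefrag(current_url)
    # Resolve the LINK element's href against the BASE address.
    translation_url = urljoin(base_url, translation_href)
    return f"{translation_url}#{fragment}" if fragment else translation_url


BASE = "http://www.myco.org/pub/subject/en/myfile.htm"
print(same_point_in_translation(BASE + "#SGML", BASE, "../fr/myfile.htm"))
# prints http://www.myco.org/pub/subject/fr/myfile.htm#SGML
```

This only works because the translator has kept the anchor names unchanged in each translated file.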
Where document structure changes during translation the translator must determine which is the most appropriate place in the translation to place the anchor. (There is a distinct advantage in the use of empty anchors in this context as they are much easier to reposition than anchors that are placed around text, as the latter may need to be mingled with other text when translated, as anyone who has struggled with maintaining links in English/German translations will tell you! In fact this makes a very good argument for using anchors rather than IDs to identify points in the document that are linked within translations.)
Within an SGML document the basic method of addressing an element is by assigning a unique identifier (ID) to the element and then making a reference to this identifier using an attribute whose declared value is either an id reference value (IDREF) or an id reference list (IDREFS).
Each unique identifier must be a valid SGML name, beginning with a letter, or
one of the alternative name start characters defined in the SGML declaration,
optionally followed by one or more letters, digits, or characters declared in
the SGML declaration to be valid name characters.
Note: URLs are not valid SGML names, and neither are fragment identifiers by default, as # is not a valid name character in HTML. The name used in an SGML IDREF is identical to that used in the ID being pointed to. All IDREFs are presumed to be local, so there is no need to add a # to distinguish local name references as HTML currently requires.
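The naming rule can be checked mechanically. The sketch below is in Python and assumes a name set of an initial letter followed by letters, digits, "-" and "." with letters restricted to A-Z and a-z; an SGML declaration may define other name characters, and name length limits are not checked. The function name is hypothetical.

```python
import re

# Assumed name set: a letter, then letters, digits, "-" or ".".
_SGML_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*")


def is_sgml_name(text):
    """Return True if text is a valid name under the assumed name set."""
    return _SGML_NAME.fullmatch(text) is not None


for candidate in ["SGML", "p-35", "#SGML", "../sp/myfile.htm"]:
    print(candidate, is_sgml_name(candidate))
```

Under these assumptions SGML and p-35 are accepted, while the fragment identifier #SGML and the relative URL are rejected.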
SGML's basic object addressing method has a number of limitations. The principal limitation is that the identified object must form part of the document, or subdocument, from which it is referenced. In addition, unique identifiers must be assigned to a single element's start-tag. This prevents an identifier from selecting more than one point in a document, though references can be made to more than one unique identifier from a reference point using an id reference list. A further restriction is that, because unique identifiers are defined as attributes of elements, they cannot be assigned to entity references, or to significant data strings.
HyTime's location address module allows SGML unique identifiers to be assigned to parts of a document which do not otherwise have identifiers. HyTime allows unique identifiers to be assigned to:
There are three main types of HyTime location address:
Name space locations are used to point to objects that have been assigned a name. Such objects include:
Each named location address associates a unique identifier (id) with one or more objects that are identified by name in a constructed name list. This name list is constructed by concatenating the name lists supplied by the name list specifications (nmlist) and name list queries (nmquery) that form the contents of the named location address.
The names in a name list specification are not checked for validity as a unique identifier or entity name until the named location address is referenced in a manner that requires access to the addressed object.
To understand the role of named locations, consider the following examples, which are constructed using the following elements:
<!ELEMENT topic - O (anchors+) >
<!ATTLIST topic HyTime NAME #FIXED nameloc
id ID #REQUIRED
set (set|notset) set >
<!ELEMENT anchors - O (#PCDATA) --lextype(NAMES)-- >
<!ATTLIST anchors HyTime NAME #FIXED nmlist
nmspace NAME element
obnames (obnames|nobnames) nobnames >
Using the default values for the non-compulsory attributes the following named location addresses could be specified:
<topic id=fleas><anchors>p-12 p-34 p-35</topic>
<topic id=dogs><anchors>p-3 p-24 p-35</topic>
Each of the names in the anchors name list specifications points to an anchor element defined in the currently active document. Each of the topic named location address elements identifies three paragraphs. Note that the paragraph called p-35 is included under both the dogs and fleas topics.
The following named location specification can be used to concatenate the two lists:
<topic id=allergies><anchors obnames>fleas dogs</topic>
In this example the name list specification is identified as pointing at the unique identifiers of one or more object names. The elements pointed to are the two named topic elements used in the preceding example. Because the default value assigned to the set attribute is set, duplicated entries will be removed, so the constructed name list known as allergies will contain the following entries:
p-3 p-12 p-24 p-34 p-35
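The construction of the allergies name list can be sketched in a few lines of Python. This is only an illustration of concatenation followed by duplicate removal, not HyTime engine behaviour: the function and variable names are hypothetical, and keeping the first occurrence of each name is an assumption about ordering, so the result is compared here by membership rather than by sequence.

```python
def build_name_list(name_lists, remove_duplicates=True):
    """Concatenate name lists, optionally dropping repeated names."""
    combined = [name for names in name_lists for name in names]
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each name, in order.
        combined = list(dict.fromkeys(combined))
    return combined


# The two topic elements from the example above.
TOPICS = {
    "fleas": ["p-12", "p-34", "p-35"],
    "dogs": ["p-3", "p-24", "p-35"],
}

allergies = build_name_list([TOPICS[name] for name in ["fleas", "dogs"]])
print(sorted(allergies))
```

With duplicate removal switched off the list would keep both references to p-35 and so contain six entries instead of five.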
HyTime coordinate locations can be used to identify the following types of objects that can be addressed as HyTime "quanta":
treeloc) or a depth first path location address (pathloc), or through their relationship with another node (relloc).
Each coordinate location addresses a location with respect to some other location, known as a location source. The combination of a location source and a coordinate location is known as a location view. Different location views can use the same location source. The location source for a coordinate location can be another location address element. In such cases the combination of locations is said to form a location ladder.
Note: Location sources allow relative addressing.
Structured documents such as SGML documents can be viewed as a hierarchically structured tree consisting of a number of nodes. For an SGML document the following node types can be identified:
#PCDATA token within an element's content model
Trees can be viewed in either a width-first or a depth-first manner. The width-first approach looks at each level in the tree as a separate list of nodes from which members can be selected. The depth-first approach considers each path down the tree as a separate list of nodes.
Both of these approaches to the tree structure can be used to define a set of measurement domains whose addressable range is determined by the number of nodes in a given node list. Nodes in a node list do not need to be unique. For example, the same data entity could occur at two points at a given level. Hence node lists do not form 'sets' in the mathematical sense. Each node list is 'ordered'; the order in which the nodes are listed is always preserved during processing.
Each node in a node list can be considered as a tree of one or more nodes. This allows location ladders to be built that find one node in a tree and then locate other nodes with respect to the located node.
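The two views of a tree can be illustrated with a small Python sketch. The tree representation (a label paired with a list of children) and the function names are assumptions chosen for brevity; they are not HyTime constructs.

```python
def levels(tree):
    """Width-first view: one ordered node list per level of the tree."""
    result, current = [], [tree]
    while current:
        result.append([label for label, _ in current])
        current = [child for _, children in current for child in children]
    return result


def paths(tree, prefix=()):
    """Depth-first view: one ordered node list per path down the tree."""
    label, children = tree
    here = prefix + (label,)
    if not children:
        return [list(here)]
    return [path for child in children for path in paths(child, here)]


# A document with two sections holding two and one paragraphs respectively.
TREE = ("doc", [("sec", [("p", []), ("p", [])]),
                ("sec", [("p", [])])])

print(levels(TREE))
print(paths(TREE))
```

In both views the same label occurs more than once in a list, which is why the node lists are ordered lists rather than sets.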
NOTE: This paper will not concern itself with the use of tree or path locators.
HyTime provides two types of links:
contextual links (clink) are embedded in the source document at the point from which the reference is being made
independent links (ilink) are stored independently of the data they reference, either in one of the documents being pointed to or in a completely separate document.
The HTML <A> element can be seen as a less constrained version of the HyTime clink model, with the name attribute taking the part of HyTime's id attribute and href taking the place of HyTime's linkend attribute.
As both the HTML attributes are defined using the CDATA keyword they are less constrained than their HyTime equivalents, which must follow SGML's naming rules. Those rules would not accommodate URLs without modification of the default name set (which is not a problem as HTML already redefines most of the other default SGML limitations).
HyTime independent links can point to two or more named location specifications via a linkends attribute. Each of these locations can be assigned a role by the anchor role (anchrole) attribute. Each role can have a method associated with it through the endterms attribute. Rules for traversing to each anchor can be controlled independently using intra and extra attributes, while an aggregate traversal (aggtrav) attribute determines whether the links are traversed in parallel, separately or under user control.
To see how HyTime independent links work consider the following example of what might be possible if HyTime locators and independent links were permitted in the header of an HTML document. For this example I will use a special element defined as follows:
<!ELEMENT extlink -- External link --
- O (#PCDATA) >
<!ATTLIST extlink HyTime NAME ilink
HyNames CDATA "anchrole languages
linkends locators
endterms show-as"
id ID #IMPLIED -- Default: none --
languages CDATA #REQUIRED --one per language--
locators IDREFS #REQUIRED --one per language--
show-as IDREFS #REQUIRED --one per language--
extra NAMES "A"
intra NAMES "A"
aggtrav NAMES agg >
In this example three of the default names assigned to HyTime attributes have been renamed using the HyNames attribute. This allows us to use languages as the name of the anchrole attribute, locators as the replacement for linkends and show-as in place of endterms.
Each use of the element must have three attributes defining which languages are to be available, which locators are used to identify the relevant point in each language, and what form should be used to identify the translations.
The following example will also make use of the topic and anchors elements that were defined earlier.
The external link (extlink), topic and anchors elements could be used as follows within an HTML header:
<html><head><title>Linked Translation Set</title>
<BASE href="http://www.myco.org/pub/subject/en/myfile.htm">
<LINK name=author href="mailto:author@myco.com" title="Author" rev=made>
<LINK name=spanish href="../sp/myfile.htm" title="Español" rel=translation>
<LINK name=french href="../fr/myfile.htm" title="Français" rel=translation>
<LINK name=german href="../de/myfile.htm" title="Deutsch" rel=translation>
<topic id=SGML-en>
<anchors>SGML HTML</topic>
<topic id=SGML-sp>
<anchors URL="../sp/myfile.htm">SGML HTML</topic>
<topic id=SGML-fr>
<anchors URL="../fr/myfile.htm">SGML HTML</topic>
<topic id=SGML-de>
<anchors URL="../de/myfile.htm">SGML HTML</topic>
<extlink id=connect-sgml languages="EN SP FR DE" locators="SGML-en SGML-sp SGML-fr SGML-de" show-as="hot-spot SP-flag FR-flag DE-flag">
...
</head>
<body>
<h1>Linking together the World Wide Web</h1>
<p><a name=HTML></a>
HTML is ....
<p><a name=SGML></a>
SGML is ...
</body></html>
Note that the anchors definition for the topic related to the local file (SGML-en) has no URL attribute. This is because the default value for this attribute is the local document or the document identified by the BASE element.
Using this form of coding it will be possible to provide a facility whereby, when you select a point in the document, the system will look for the nearest named element and then look in the header to identify which topics refer to that locator. Users can then either choose to go to related points in the same document, or to the similarly named points in any of the translations that have been identified as being associated with the document.
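The lookup described here can be sketched in Python as follows. The data structures are a hand-made model of the header example (topic locators, their anchors lists and the extlink languages and locators attributes); the names TOPICS, EXTLINK and related_points are hypothetical, and finding the nearest named element is assumed to have been done already.

```python
from urllib.parse import urljoin

BASE = "http://www.myco.org/pub/subject/en/myfile.htm"

# locator id -> (URL attribute of its anchors element, names it lists)
TOPICS = {
    "SGML-en": (None, ["SGML", "HTML"]),
    "SGML-sp": ("../sp/myfile.htm", ["SGML", "HTML"]),
    "SGML-fr": ("../fr/myfile.htm", ["SGML", "HTML"]),
    "SGML-de": ("../de/myfile.htm", ["SGML", "HTML"]),
}

# The languages and locators attributes of the extlink element, in step.
EXTLINK = {
    "languages": "EN SP FR DE".split(),
    "locators": "SGML-en SGML-sp SGML-fr SGML-de".split(),
}


def related_points(anchor_name):
    """Map each language to the address of the same named point."""
    result = {}
    for language, locator in zip(EXTLINK["languages"], EXTLINK["locators"]):
        url, names = TOPICS[locator]
        if anchor_name in names:
            # No URL attribute means the local (BASE) document.
            document = urljoin(BASE, url) if url else BASE
            result[language] = f"{document}#{anchor_name}"
    return result


print(related_points("SGML"))
```

An anchor name that no topic refers to simply produces an empty mapping, so the user agent would have nothing to offer for that point.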
Note: This document is incomplete. If the ideas are accepted more information on the role of topics and anchors will be provided by its authors.
The elements that can be used to link documents together in HTML are the <LINK> element, used in the header to identify meta-data links, and the <A> (anchor) element, used in the text to identify actual links. The current definition for <A> given in RFC 1866 is:
5.7.3. Anchor: A
The <A> element indicates a hyperlink anchor (see 7, "Hyperlinks"). At least one of the NAME and HREF attributes should be present. Attributes of the <A> element:
HREF gives the URI of the head anchor of a hyperlink.
NAME gives the name of the anchor, and makes it available as a head of a hyperlink.
TITLE suggests a title for the destination resource -- advisory only. The TITLE attribute may be used:
* for display prior to accessing the destination resource, for example, as a margin note or on a small box while the mouse is over the anchor, or while the document is being loaded;
* for resources that do not include a title, such as graphics, plain text and Gopher menus, for use as a window title.
REL The REL attribute gives the relationship(s) described by the hyperlink. The value is a whitespace separated list of relationship names. The semantics of link relationships are not specified in this document.
REV same as the REL attribute, but the semantics of the relationship are in the reverse direction. A link from A to B with REL="X" expresses the same relationship as a link from B to A with REV="X". An anchor may have both REL and REV attributes.
URN specifies a preferred, more persistent identifier for the head anchor of the hyperlink. The syntax and semantics of the URN attribute are not yet specified.
METHODS specifies methods to be used in accessing the destination, as a whitespace-separated list of names. The set of applicable names is a function of the scheme of the URI in the HREF attribute. For similar reasons as for the TITLE attribute, it may be useful to include the information in advance in the link. For example, the HTML user agent may chose a different rendering as a function of the methods allowed; for example, something that is searchable may get a different icon.
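The symmetry between REL and REV stated in this definition means a processor can reduce both attributes to one directed form. The Python sketch below does that; the function name and the (from, relationship, to) triple layout are assumptions made for illustration.

```python
def link_triples(source, target, rel="", rev=""):
    """Express REL and REV values as (from, relationship, to) triples."""
    # REL names describe the relationship from the source to the target.
    triples = [(source, name, target) for name in rel.split()]
    # REV names describe the same relationship in the reverse direction.
    triples += [(target, name, source) for name in rev.split()]
    return triples


DOCUMENT = "http://www.myco.org/pub/subject/en/myfile.htm"
print(link_triples(DOCUMENT, "mailto:author@myco.com", rev="made"))
print(link_triples(DOCUMENT, "../sp/myfile.htm", rel="translation"))
```

Both REL and REV are split on whitespace because each is defined as a list of relationship names.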
The definition for <LINK> is:
5.2.4. Link: LINK
The <LINK> element represents a hyperlink (see 7, "Hyperlinks"). Any number of LINK elements may occur in the <HEAD> element of an HTML document. It has the same attributes as the <A> element (see 5.7.3, "Anchor: A"). The <LINK> element is typically used to indicate authorship, related indexes and glossaries, older or more recent versions, document hierarchy, associated resources such as style sheets, etc.
These definitions are further qualified by the following descriptions of HTML hyperlinks:
7. Hyperlinks
In addition to general purpose elements such as paragraphs and lists, HTML documents can express hyperlinks. An HTML user agent allows the user to navigate these hyperlinks.
A hyperlink is a relationship between two anchors, called the head and the tail of the hyperlink[DEXTER]. Anchors are identified by an anchor address: an absolute Uniform Resource Identifier (URI), optionally followed by a '#' and a sequence of characters called a fragment identifier. For example:
http://www.w3.org/hypertext/WWW/TheProject.html
http://www.w3.org/hypertext/WWW/TheProject.html#z31
In an anchor address, the URI refers to a resource; it may be used in a variety of information retrieval protocols to obtain an entity that represents the resource, such as an HTML document. The fragment identifier, if present, refers to some view on, or portion of the resource.
Each of the following markup constructs indicates the tail anchor of a hyperlink or set of hyperlinks:
* <A> elements with HREF present.
* <LINK> elements.
* <IMG> elements.
* <INPUT> elements with the SRC attribute present.
* <ISINDEX> elements.
* <FORM> elements with `METHOD=GET'.
These markup constructs refer to head anchors by a URI, either absolute or relative, or a fragment identifier, or both. In the case of a relative URI, the absolute URI in the address of the head anchor is the result of combining the relative URI with a base absolute URI as in [RELURL]. The base document is taken from the document's <BASE> element, if present; else, it is determined as in [RELURL].
7.1. Accessing Resources
Once the address of the head anchor is determined, the user agent may obtain a representation of the resource. For example, if the base URI is `http://host/x/y.html' and the document contains:
<img src="../icons/abc.gif">
then the user agent uses the URI `http://host/icons/abc.gif' to access the resource, as in [URL].
7.2. Activation of Hyperlinks
An HTML user agent allows the user to navigate the content of the document and request activation of hyperlinks denoted by <A> elements. HTML user agents should also allow activation of <LINK> element hyperlinks.
To activate a link, the user agent obtains a representation of the resource identified in the address of the head anchor. If the representation is another HTML document, navigation may begin again with this new document.
7.4. Fragment Identifiers
Any characters following a `#' character in a hypertext address constitute a fragment identifier. In particular, an address of the form `#fragment' refers to an anchor in the same document.
The meaning of fragment identifiers depends on the media type of the representation of the anchor's resource. For `text/html' representations, it refers to the <A> element with a NAME attribute whose value is the same as the fragment identifier. The matching is case sensitive. The document should have exactly one such element. The user agent should indicate the anchor element, for example by scrolling to and/or highlighting the phrase.
For example, if the base URI is `http://host/x/y.html' and the user activated the link denoted by the following markup:
<p>See: <a href="app1.html#bananas">appendix 1</a> for more detail on bananas.
Then the user agent accesses the resource identified by `http://host/x/app1.html'. Assuming the resource is represented using the `text/html' media type, the user agent must locate the <A> element whose NAME attribute is `bananas' and begin navigation there.
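The two worked examples in the quoted text can be reproduced with Python's standard urllib.parse module, which is used here only as a convenient stand-in for the resolution rules the RFC refers to. The helper name head_anchor is hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def head_anchor(base_uri, reference):
    """Split a reference into (absolute URI, fragment identifier)."""
    absolute = urljoin(base_uri, reference)
    uri, fragment = urldefrag(absolute)
    return uri, fragment


BASE_URI = "http://host/x/y.html"
print(head_anchor(BASE_URI, "../icons/abc.gif"))
print(head_anchor(BASE_URI, "app1.html#bananas"))
```

The first call yields the image address with an empty fragment; the second yields the address of appendix 1 together with the name of the anchor at which navigation should begin.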
The 1st Feb 96 draft of HTML Tables proposes that the following attributes be added to all elements, including links and anchors:
Common Attributes
The following attributes occur in several of the elements and are defined here for brevity. In general, all attribute names and values in this specification are case insensitive, except where noted otherwise. The ID, CLASS and attributes are required for use with style sheets, while LANG and DIR are needed for internationalization.
<!ENTITY % attrs
"id ID #IMPLIED -- element identifier --
class NAMES #IMPLIED -- for subclassing elements --
lang NAME #IMPLIED -- as per RFC 1766 --
dir (ltr|rtl) #IMPLIED -- I18N text direction --">
ID Used to define a document-wide identifier. This can be used for naming positions within documents as the destination of a hypertext link. It may also be used by style sheets for rendering an element in a unique style. An ID attribute value is an SGML NAME token. NAME tokens are formed by an initial letter followed by letters, digits, "-" and "." characters. The letters are restricted to A-Z and a-z.
CLASS A space separated list of SGML NAME tokens. CLASS names specify that the element belongs to the corresponding named classes. It allows authors to distinguish different roles played by the same tag. The classes may be used by style sheets to provide different renderings as appropriate to these roles.
LANG A LANG attribute identifies the natural language used by the content of the associated element. The syntax and registry of language values are defined by RFC 1766. In summary the language is given as a primary tag followed by zero or more subtags, separated by "-". White space is not allowed and all tags are case insensitive. The name space of tags is administered by IANA. The two letter primary tag is an ISO 639 language abbreviation, while the initial subtag is a two letter ISO 3166 country code. Example values for LANG include: en, en-US, en-uk, i-cherokee, x-pig-latin.
DIR Human writing systems are grouped into scripts, which determine amongst other things, the direction the characters are written. Elements of the Latin script are nominally left to right, while those of the Arabic script are nominally right to left. These characters have what is called strong directionality. Other characters can be directionally neutral (spaces) or weak (punctuation). The DIR attribute specifies an encapsulation boundary which governs the interpretation of neutral and weakly directional characters. It does not override the directionality of strongly directional characters. The DIR attribute value is one of LTR for left to right, or RTL for right to left, e.g. DIR=RTL. When applied to TABLE, it indicates the geometric layout of rows (i.e. row 1 is on right if DIR=RTL, but on the left if DIR=LTR) and it indicates a default base directionality for any text in the table's content if no other DIR attribute applies to that text.
It should be noted, however, that this definition says nothing about how IDs could be used to identify fragments within URLs.
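The LANG syntax summarised in the quoted definition (a primary tag followed by zero or more subtags separated by "-", no white space, case insensitive) is easy to check mechanically. The Python sketch below assumes, as I read RFC 1766, that each tag consists of one to eight letters; the function name parse_lang is hypothetical and no check is made against the ISO 639 or IANA registries.

```python
import re

# Assumed grammar: 1-8 letters, then any number of "-" plus 1-8 letters.
_LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def parse_lang(value):
    """Return (primary tag, subtags) in lower case, or None if malformed."""
    if not _LANGUAGE_TAG.fullmatch(value):
        return None
    primary, *subtags = value.lower().split("-")
    return primary, subtags


for value in ["en", "en-US", "x-pig-latin", "en US"]:
    print(value, parse_lang(value))
```

Lower-casing the tags before comparison reflects the statement that all tags are case insensitive, so en-US and en-us identify the same language.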
In the December 1995 draft of Hypertext Links in HTML the following definitions were suggested for the use of the REL and REV attributes:
4a. Legacy The following are REL values which were known to be used as values of the REL and REV attributes on the World Wide Web in December 1995. MADE The REV=MADE relationship has been used to identify the author or "maker" of an HTML document. Typical HREF values include a `mailto:' URI or the URL of the author's home page. Example: <A REV=MADE HREF="mailto:murray@sq.com">Author</A> NEXT/PREVIOUS/TOC/INDEX/NAVIGATOR These values are described below, are used by SCO in its online documentation and context- sensitive help system. 4b. Browser-defined Links Some keywords are reserved and should not be used as REL/REV values. HTML user agents typically provide a mechanism for navigating through the recent history of a user's access to documents; traditionally these operations are referred to as "back" and "forward". These mechanisms allow a user to step back through the documents which led to the current location and then forward again to retrace the path. Additionally, most user agents provide a mechanism to immediately return to a user-defined location, traditionally referred to as the home page, or "home". Since these browser actions are internally implemented by the browser, REL/REV keywords associated with these relationships are disallowed. HOME RESERVED. Defined by the user (for example, using an environment variable or preference, e.g. WWW_HOME). This relationship may not be overridden; HTML user agents should ignore any author-supplied REL=HOME setting. BACK RESERVED. Defined by the browser. This relationship may not be overridden; HTML user agents should ignore any author-supplied REL=BACK setting. FORWARD RESERVED. Defined by the browser. This relationship may not be overridden; HTML user agents should ignore any author-supplied REL=FORWARD setting. 4c. Navigational Node Links Navigational nodes are commonly used document objects which are designed by authors to assist the user in navigating through a closed or extended document set. 
The most familiar and common form of navigational node is a table of contents, which is a well known publishing device used for enumerating and ordering the contents of a closed document set. CONTENTS or TOC The TOC relationship identifies a Table of Contents. When REL=TOC, the target document is the Table of Contents for the current document, or for the collection of documents of which the current document is a member. When REV=TOC, the current document is a Table of Contents and the target document is a related document. When REL=TOC and REV=TOC it indicates that the current document is a Table of Contents and the target document is also a Table of Contents. Additional REL/REV values may be used to specify the relationship between the two, such as PARENT/CHILD. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar. Or, if capable, an HTML user agent may present the Table of Contents in a concurrent window or pane, highlighting the current document. INDEX The INDEX relationship identifies an index. When REL=INDEX, the target document is an index for the current document, or for the collection of documents of which the current document is a member. When REV=INDEX, the current document is an index. Additional REL/REV values may be used to further specify the relationship between the two ends of the link. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar. An index may be presented as an HTML document which is organized and presented in a style reminiscent of a paper-based index. An index may also be presented as a form-based query into a full- text search database. NAVIGATOR The NAVIGATOR relationship identifies a navigational aid. When REL=NAVIGATOR, the target document is a navigational aid. 
A navigational aid may consist of a whole or partial Table of Contents, a list of related documents, an indication of the current document's location within a document hierarchy, or any other information which may be useful to the user. When REV=NAVIGATOR, the current document is a navigational aid. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar. 4d. Hierarchy Links It is quite common for documents to be developed or defined using a hierarchical model, or tree-like structure. The keywords listed below may be used within HTML documents to identify the hierarchical relationship of closely related nodes, such as the immediate parent, siblings and children. In addition, the TOP keyword may be used to identify the logical top (or root, depending on your perspective) of a hierarchical or tree-like structure. The entire set of relationships may be used by a user agent to build a map of the hierarchical structure(s) of which the current document is a node. Hypertext links to documents identified with PARENT and TOP values are more likely to be accessible through an icon or other mechanism than documents identified with CHILD or SIBLING. CHILD The CHILD relationship identifies a subordinate or subdocument. Any document may have multiple CHILD documents within the same hierarchy. When REL=CHILD, the target document is a hierarchical child, or subdocument, of the current document. When REV=CHILD, the current document is the hierarchical child, or subdocument, of the target. PARENT The PARENT relationship identifies the superior or container node. When REL=PARENT, the target document is the hierarchical parent, or container, of the current document. When REV=PARENT, the current document is the hierarchical parent, or container, of the target. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar. 
SIBLING The SIBLING relationship identifies a sibling in the current hierarchy. Any document may have multiple SIBLING documents within the same hierarchy. When REL=SIBLING, the target document is a child of a common parent, or a hierarchical peer of the current document. REL and REV have equivalent meanings for the SIBLING relationship. TOP or ORIGIN The TOP relationship identifies the logical top of a hierarchical tree of which the current document is a branch. BEGIN is a functional equivalent to TOP, if only one of these values is specified. When REL=TOP, the target document is the logical top node of the tree. When REV=TOP, the current document is the logical top of the tree. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar. NOTE: ORIGIN has been suggested as an alternative to TOP to provide metaphorical consistency with PARENT/CHILD/SIBLING. Comments are encouraged. 4e. Sequence Links Given a set of documents, it is possible and often desirable to specify linear sequences to navigate through the set. A book, for example, is often organized as a linear sequence. With sequence links in each document, a user agent can step through or gather an entire book programmatically. BEGIN or FIRST The BEGIN relationship identifies the author- defined start of a sequence of documents of which the current document is a node. TOP is a functional equivalent to BEGIN when only one of these values is specified. When REL=BEGIN, the target document is the beginning of the sequence. When REV=BEGIN, the current document is the beginning of the sequence. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar. END or LAST The END relationship identifies the author defined end of a sequence of documents of which the current document is a node. TOP is a functional equivalent to END when only one is specified. 
When REL=END, the target document is the end of the sequence. When REV=END, the current document is the end of the sequence. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar.
NEXT
The NEXT relationship identifies the next document in an author-defined sequence of documents, such as a linear book. When REL=NEXT, the target document is next after the current document. When REV=NEXT, the current document is next after the target. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar.
PREVIOUS or PREV
The PREVIOUS relationship identifies the previous document in an author-defined sequence of documents, such as a linear book. When REL=PREVIOUS, the target document is previous to the current document. When REV=PREVIOUS, the current document is previous to the target. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar.
4f. Related Documents
BIBLIOENTRY
The BIBLIOENTRY relationship identifies a bibliographic entry. BIBLIOENTRY would most typically be specified on an A element, as it would specify a hypertext link between a citation and a bibliographic entry describing the citation. Example:
<A REL=BIBLIOENTRY HREF="biblio.html#V.Bush"><CITE>As We May Think</CITE></A>
The resource identified by this link may take any form desired by the author/publisher. A bibliographic entry may be presented in the style of a paper-based bibliographic entry, or it may be presented as the result of a database query.
BIBLIOGRAPHY
The BIBLIOGRAPHY relationship identifies a bibliography. The resource identified by this link may take any form desired by the author/publisher. A bibliography may be presented as an HTML document which is organized and presented in a style reminiscent of a paper-based bibliography. A bibliography may also be presented as a form-based query into a bibliographic database.
If the hypertext link is specified with REL in a LINK element, an HTML user agent may present a labeled icon in a tool bar.
CITATION
The CITATION relationship identifies a bibliographic citation. When REL=CITATION, the target is a bibliographic citation. The anchor, in this case, may be a bibliographic entry. The anchor may also be a reference, thus allowing the reader a way to locate the citation:
... as described by Tim Berners-Lee <A REL=CITATION HREF="#TBL">[1]</A> ...
When REV=CITATION, the anchor is a citation. Typically, the anchor would also be enclosed within a CITE element as shown in the example below. The example shown here also corresponds to the previous example, serving as its target by use of the NAME attribute.
... is described in Tim Berners-Lee's <CITE><A NAME=TBL REV=CITATION HREF="./biblio/TBL">The HyperText Markup Language</A></CITE> ...
NOTE: an alternative (and preferred) approach would be to add a URI-valued attribute (HREF?) to the HTML CITE element.
DEFINITION
The DEFINITION relationship identifies a definition of a term. Definitions may be, but are not necessarily, contained within a glossary. DEFINITION would most typically be specified on an A element, as it would specify a hypertext link from a term to its definition.
<A REL=DEFINITION HREF="glossary.html#HTTP">HTTP</A>
FOOTNOTE
The FOOTNOTE relationship identifies a footnote. When REL=FOOTNOTE is specified on an A element, the anchor is a footnote marker and the target is a footnote. This can be used to link from the footnote marker (or a highlighted word, phrase, etc.) to an HTML document which contains the footnote text, or to a portion of the same document (see REV=FOOTNOTE). When REL=FOOTNOTE is specified on a LINK element, it can specify a hypertext link to a set of footnotes which are related to the current document, or to a set of end-notes.
When REV=FOOTNOTE is specified on an A element, the anchor is a footnote; that is, the actual content of the footnote, as opposed to a footnote marker. In this case, the target specified by the HREF value, if any, is the footnote marker.
It has been suggested that the combination of REV=FOOTNOTE and NAME=... on an A element may be used to imply that the enclosed content not be rendered until a link to it is explicitly traversed, at which time it can be presented in a popup window. This would allow for the inclusion of footnote text within a document that would not be visible until the reader wanted it to be presented. Developers of user agents are free to experiment with this proposed feature, but there is no requirement that it be implemented.
GLOSSARY
The GLOSSARY relationship identifies a glossary. When REL=GLOSSARY, the target document is a glossary. When REV=GLOSSARY, the current document is a glossary. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar.
A glossary may be directly presented as an HTML document which is organized and presented in a style reminiscent of a paper-based glossary. A glossary may also be accessed through an intermediary query mechanism. For example, the user highlights a word or phrase and presses the glossary button, thereby accessing the linked object and passing the highlighted text as an argument. The server returns the glossary entry relevant to the highlighted word.
4g. Meta Documents
There are classes of information which are not intrinsic to a document, but for which a clear and unambiguous association is often useful or even necessary. This section defines a small set of keywords which are related to ownership and legal notices. Any attempt to rigorously define a closed set of meta-data classes, types, and formats is doomed to failure, partly due to the need for ongoing experimentation.
Hence, the META keyword may be used to identify meta documents which do not necessarily have a clear or unambiguous definition. The content of the target node may be in a format as specific as a MARC record or an FGDC record, or it may be in an author-defined format.
For each of the relationship keywords listed in this section, if the relationship is specified with REL in a LINK element, an HTML user agent may present a labeled icon in a tool bar.
AUTHOR
The AUTHOR relationship identifies a hypertext link to an author. The hypertext link may be to the author's home page, a biography, an audio or video clip, or an agent which sends mail to the author (e.g., using the `mailto:' scheme).
COPYRIGHT
The COPYRIGHT relationship identifies a hypertext link to a copyright notice. While it is arguable whether a copyright notice is required in every HTML file to assert copyright protection on it, there is clearly a desire to express copyright notice among a sufficient portion of the user community to justify support. A basic copyright notice for this document may simply state: "Copyright 1995 by Murray C. Maloney". It may be desirable, in place of or in addition to such a notice, to have a hypertext link between each HTML document in a set and a single copyright notice, as in the following examples:
<LINK REL=COPYRIGHT HREF="copyright.html">
<A REL=COPYRIGHT HREF="copyright.html">Copyright 1995 by Murray C. Maloney</A>
DISCLAIMER
The DISCLAIMER relationship identifies a hypertext link to a legal disclaimer. Usage is expected to be similar to that of the COPYRIGHT hypertext link. As with the copyright notice, there is no intention or expectation that such a link would be the only way to express a disclaimer.
EDITOR
The EDITOR relationship identifies a hypertext link to an editor. Usage is expected to be similar to that of the AUTHOR hypertext link.
META
The META relationship identifies a hypertext link to a node which contains meta-information related to the current document.
This is intended to be a generalized meta-data relationship descriptor.
PUBLISHER
The PUBLISHER relationship identifies a hypertext link to a publisher. Usage is expected to be similar to that of the AUTHOR hypertext link.
TRADEMARK
The TRADEMARK relationship identifies a hypertext link to a trademark notice. Usage is expected to be similar to that of the COPYRIGHT hypertext link.
4h. Other REL and REV Values Under Discussion
The POINTER keyword is an invention of the author. The BANNER, BOOKMARK, HOTLIST and STYLESHEET keywords are described in Dave Raggett's Internet Draft on HTML 3.0. Recent discussions tend to indicate that these keywords may not be appropriate for use as REL/REV values. Dave Raggett's further explanation and justification are needed before any further discussion or decision can be made as to the future status of these keywords.
The LANG attribute is described in Dave Raggett's Internet Draft on HTML 3.0. It has been applied to various HTML elements, not including the LINK and A elements. The author suggests that LANG is a useful attribute to apply to the LINK and A elements. See also the discussion of REL=TRANSLATION.
BANNER
The BANNER relationship identifies a document banner. When REL=BANNER, the target document is to be included within the current document as a banner. A banner is typically used for corporate logos, custom toolbars, and other information which would not typically be scrolled with the body of a document. When REV=BANNER, the current document is a banner. This may be used, in future, to provide error-checking or to prevent the use of a document as a banner unless it has been explicitly identified as a valid source. (Or not! Sorry, I was reaching for a useful meaning.)
Compelling arguments have been made against the need for a REL=BANNER value, which is simply a special case of the INCLUDE mechanism.
BOOKMARK
The BOOKMARK relationship identifies a bookmark.
Bookmarks are used to provide direct links to key entry points into an extended document. The TITLE attribute may be used to label the bookmark. Several bookmarks may be defined in each document, and provide a means for orienting users in extended documents.
HOTLIST
RESERVED: This keyword has been proposed by Dave Raggett. Its meaning and purpose require further explanation. A placeholder is being maintained until such time as Dave has had an opportunity to provide further explanation, examples, discussion and justification. If the hypertext link is specified with REL in a LINK element, an HTML user agent may present an icon in a tool bar.
LANG
The LANG attribute indicates the language of the target document. The LANG attribute is optional and has no default value. It may be used for purely informational purposes by an HTML user agent, or by a robot for language classification. Used in combination with a proposed REL=TRANSLATION and a user's language preference setting, an HTML user agent may intelligently select from a collection of otherwise equivalent hypertext links expressed with the LINK element. If the user's language preference is not available, the user agent may present a virtual menu of language options. See the Internet Draft on the Internationalisation of HTML for a definition of the values of this attribute.
POINTER
The POINTER relationship identifies a hypertext pointer. That is, this is a way to do indirection in HTML. When REV=POINTER, the anchor is a pointer to the target document. When a hypertext link is traversed to a LINK or A element with REV=POINTER, the target specified by the HREF value should be traversed, and so on, until a target without REV=POINTER is retrieved.
<LINK NAME=PSEUDO REV=POINTER HREF="real.html">
When REL=POINTER, the target is a pointer to the real target. This value can be used by a user agent to perform a pre-fetch of the specified target for evaluation until the real target is reached.
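The traversal rule for REV=POINTER amounts to a loop, which can be sketched in Python. This is only an illustration under stated assumptions: the resolve_pointer function, the links table standing in for retrieved LINK and A elements, and the max_hops and cycle checks are hypothetical and not part of the proposal. A real user agent would retrieve and parse each target rather than look it up in a table.

```python
def resolve_pointer(start, links, max_hops=10):
    """Follow REV=POINTER links from `start` until a non-pointer target is reached.

    `links` maps an address to a (rev, href) pair describing the LINK or A
    element found at that address. Addresses missing from the map are treated
    as real (non-pointer) targets. All names here are hypothetical.
    """
    current = start
    seen = {current}
    for _ in range(max_hops):
        rev, href = links.get(current, (None, None))
        if rev is None or rev.upper() != "POINTER":
            return current  # a target without REV=POINTER: stop here
        if href in seen:
            raise ValueError("pointer cycle detected at " + href)
        seen.add(href)
        current = href
    raise ValueError("too many pointer hops")


# Hypothetical table modelled on <LINK NAME=PSEUDO REV=POINTER HREF="real.html">
links = {
    "doc.html#PSEUDO": ("POINTER", "real.html"),
}
resolved = resolve_pointer("doc.html#PSEUDO", links)
```

The hop limit and the cycle check are design choices added for the sketch, since the rule as stated would not terminate if two pointers referred to each other.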
NOTE: The authors propose that the NAME attribute be removed from the LINK element, or that a practical use for it should be defined. For example, hypertext indirection can be specified by providing both a NAME and an HREF value on the LINK element, in combination with a specific REL or REV value, such as POINTER. Some support exists among members of the HTML Working Group to provide for hypertext indirection with the LINK element. There is no other reason for an author to define a target by using the NAME attribute on a LINK element, since the resulting target address is functionally equivalent to the address of the document in which such a target is defined.
STYLESHEET
The STYLESHEET relationship identifies a stylesheet. When REL=STYLESHEET, the target document is a stylesheet. When associated with a LINK element, the author/publisher is expressing an expectation that the target stylesheet will be applied by the HTML user agent. When associated with an A element, an HTML user agent may simply retrieve the target stylesheet for display, or it may launch a stylesheet editor with the target stylesheet. When REV=STYLESHEET, the current document is a stylesheet and the target document may be a demonstration of its use. In general, it is not anticipated that stylesheets will contain LINK or A elements, as they are not projected to be HTML documents.
TRANSLATION
The TRANSLATION relationship specifies a translation to another language. When REL=TRANSLATION, the target is a translation to another language. This value will most typically be used with the LINK element, in combination with specification of the target document's language as a LANG attribute value. Presumably, REL=TRANSLATION can be used with the A element to specify a translation of a document fragment, such as a phrase in a foreign language. When REV=TRANSLATION, the current document, or document fragment, is a translation of the target.
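How a user agent might combine REL=TRANSLATION, the LANG attribute and a user's language preference setting can be sketched in Python. This is a minimal illustration rather than a defined algorithm: the pick_translation function, the tuple representation of LINK elements, and the language codes are assumptions made for the example. The paths reuse those of the translation links shown earlier in this paper.

```python
def pick_translation(links, preferred):
    """Return the HREF of a REL=TRANSLATION link matching the user's preference.

    `links` is a list of (rel, lang, href) tuples standing in for parsed LINK
    elements; `preferred` lists language codes, most wanted first.
    Returns None when nothing matches, in which case a user agent could fall
    back to presenting a menu of the available languages.
    """
    translations = {
        lang.lower(): href
        for rel, lang, href in links
        if rel.upper() == "TRANSLATION" and lang
    }
    for code in preferred:
        href = translations.get(code.lower())
        if href is not None:
            return href
    return None


# Hypothetical LINK elements; the language codes are assumed values of LANG
links = [
    ("TRANSLATION", "es", "../sp/myfile.htm"),
    ("TRANSLATION", "fr", "../fr/myfile.htm"),
    ("TRANSLATION", "de", "../de/myfile.htm"),
    ("COPYRIGHT", None, "copyright.html"),
]
choice = pick_translation(links, ["it", "fr", "de"])
```

Links without a LANG value are skipped here, because the sketch has no other way to tell which language a translation is in.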
URC
The URC relationship identifies a Uniform Resource Catalogue for the current document. This keyword has been proposed by Dave Raggett. Its meaning and purpose have not been explained to the author, but a placeholder is being maintained until such time as Dave has had an opportunity to provide explanation, examples, discussion and justification.