Re: Hyperlinks depend on GET (was: Re: REST and the Web) from Paul Prescod on 2002-03-30 (www-tag@w3.org from March 2002)

From: Paul Prescod <paul@prescod.net>
Date: Sat, 30 Mar 2002 00:04:55 -0800
To: www-tag@w3.org
Message-ID: <3CA571A7.5F3E1819@prescod.net>

noah_mendelsohn@us.ibm.com wrote:
> 
>....
> Paul: I agree with much of what you have written over the past few days,
> but not with this, and I think it's an important point.  Maybe it would be
> closer to the mark to say: "Without GET you don't have hyperlinking for
> browsing", but even that is arguably a bit of an oversimplification.

Actually, browsing has nothing to do with it. Consider:

<xsd:import ... schemaLocation="foo:/bar/baz/jaz.xsd"/>

<xsl:include href="foo:/bar/baz.xslt"/>

<img src="foo:/bar/baz.gif"/>

<xsl:value-of select="document('foo:/bar/baz.xml')"/>

In only one of these contexts is a browser likely to be involved. In all
of them, it is the semantics of the embedding document type that drives
the inclusion. Now I'm not saying that wherever you see a URI in any
context it is appropriate to GET it, but whenever I see *any* URI, with
*any* scheme prefix, in an <img src=".."/> context, or an xsd:import
context or an xsl:include context, etc., then the appropriate thing to
do is to try to resolve the URI to a representation that can be treated
as a fragment of XSD or XSL or as an image. Therefore I strongly
disagree with your assertion that I should choose what to do with the
URI based on whether it is an nfs: URI or a foobar: URI. The scheme
should never be a factor in your decision of what to do when you see a
URI.
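
To make that concrete, here is a minimal Python sketch of a processor that
decides what it expects from a link purely from the element and attribute
it appears in. The table EXPECTED, the function links() and the wrapper
<root> element are my own illustrative names, not part of any real
processor; only the three linking contexts come from the examples above.

```python
import xml.etree.ElementTree as ET

XSL = "http://www.w3.org/1999/XSL/Transform"
XSD = "http://www.w3.org/2001/XMLSchema"

# (element, attribute) -> the kind of representation the context expects.
# Note that the URI scheme does not appear anywhere in this table.
EXPECTED = {
    ("img", "src"): "image",
    (f"{{{XSL}}}include", "href"): "stylesheet fragment",
    (f"{{{XSD}}}import", "schemaLocation"): "schema fragment",
}


def links(doc):
    """Yield (uri, expected_kind) for each known linking context in doc."""
    root = ET.fromstring(doc)
    for elem in root.iter():
        for (tag, attr), kind in EXPECTED.items():
            if elem.tag == tag and attr in elem.attrib:
                yield elem.attrib[attr], kind


# Hypothetical wrapper document holding the links from the examples above.
SAMPLE = f"""<root xmlns:xsl="{XSL}" xmlns:xsd="{XSD}">
  <xsd:import schemaLocation="foo:/bar/baz/jaz.xsd"/>
  <xsl:include href="foo:/bar/baz.xslt"/>
  <img src="foo:/bar/baz.gif"/>
</root>"""

for uri, kind in links(SAMPLE):
    # Every URI would be dereferenced the same way; only the expected
    # result differs, and that comes from the context, not the scheme.
    print(f"GET {uri} and treat the result as: {kind}")
```

Swapping foo: for http:, ftp: or anything else changes nothing in the
dispatch, which is the point.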

The mailto: URI as it currently exists breaks my model. That's too bad.
Considering the rate at which new URI schemes are adopted on the Web, there
will never be another silly URI scheme like that again. Here's how I would have
implemented mailto:

Strategy 1:

<mailto address="...">Please Mail Me!</mailto>

Strategy 2:

<a href="mymailbox.wmbx">Please Mail Me!</a>

mymailbox.wmbx:

<web_mailbox>
   <smtp_address>paul@prescod.net</smtp_address>
   <uucp_address> ... </uucp_address>
   <jabber_address>...</jabber_address>
   ....
   <alternate_addresses>...</alternate_addresses>
</web_mailbox>
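
A rough Python sketch of how a client might consume such a document once
it has done a GET on mymailbox.wmbx. Everything here is an assumption for
illustration: the jabber and pager values are made up, <pager_address> is
an invented element standing in for some future extension, and
usable_addresses() is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Sample representation; only the smtp address comes from the example above.
MAILBOX = """<web_mailbox>
   <smtp_address>paul@prescod.net</smtp_address>
   <jabber_address>paul@jabber.example</jabber_address>
   <pager_address>555-0100</pager_address>
</web_mailbox>"""

# The address kinds this particular (hypothetical) client knows how to use.
KNOWN = ("smtp_address", "jabber_address")


def usable_addresses(xml_text, known=KNOWN):
    """Return the addresses this client understands, skipping the rest."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: (child.text or "").strip()
        for child in root
        if child.tag in known
    }


print(usable_addresses(MAILBOX))
```

An older client simply skips elements it does not recognise, so new
address types can be added to the document without breaking anyone.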

The important thing I'm trying to get across is that you decide what to
do based on the documents you are looking at, not the URI mechanism that
happens to be used to glue them together. That's part of the "URIs are
opaque" idea. What does it mean for XSL if the behaviour of the
stylesheet is now dependent in part on what kind of URIs you use? Also,
do we need to start being specific in our schemas to restrict explicitly
to URIs that we know support GET?

>...
> In general, I see hyperlinking as the generalized ability to refer in a
> uniform manner, in a broad range of contexts, to a large and perhaps
> diverse set of resources.  The existence and widespread deployment of URI
> architecture gives us this.  Indeed, I claim that one wants to hyperlink
> resources for which GET is not the obvious default operation.

There is never any harm (other than a tiny performance hit) in doing a GET
to ask the object what else can be done with it. I would be glad to
design an XML encoding for NFS mount points. Then de-referencing an
HTTP, FTP, file:/// or Gopher URI would give you the information you
need to do the mount.

> ...  Further, I
> claim that Metcalfe's law suggests that mixing all of these sorts of links
> in one system is much more powerful than a system in which hyperlinking is
> dependent in all cases on GET.

I hope I've proved that the GET-centric system is equally powerful. But
in addition to being equally powerful it is more extensible. Now that we've
got the mailto: URIs we're stuck with them.

There is no forward path to mailto: URIs that also support jabber or AOL
messenger or whatever. But XML representations are extensible and using
content negotiation, even "stupid" representations can be "upgraded" to
extensible ones. So as long as you do things my way you can always shove
a little bit more metadata into the document referenced by the URI and
make the system a little more intelligent and sophisticated. Plus, the
cost for deploying new XML vocabularies is a tiny fraction of the cost
for deploying new URI syntaxes and there are great fallback schemes we
can use which are not generally available for URIs.

>...
> I think we are underrating the role of the browser in that scenario if we
> assert that hyperlinking is intimately dependent on GET.  It's the browser
> that decides to do a GET.  

The browser decides to do the GET, but based on the knowledge of the
HTML or XSLT or XSD or ... vocabularies. Those vocabularies view the
URI-space as a sort of file system where things can be glued together
through GET. Every hypertext vocabulary I've ever heard of does. That's
why I say GET is a necessity for hypertext.

When you start having URIs that don't work with GET you break those
vocabularies for no benefit. Now you have to start thinking about which
URIs you put in which documents.

Consider the Java APIs for URLs:

http://java.sun.com/products/jdk/1.2/docs/api/java/net/URL.html

The only thing you are guaranteed about URLs is that you can do a
getContent() on them. I'm just pointing out that it is widely believed
that this is the "contract" supported by all URIs. The mailto: URI is
really a level-breaking hack. 
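
The same assumed contract shows up outside Java. As a small sketch, using
Python's standard urllib rather than the Java API cited above, one
function can fetch content from URIs of different schemes without the
caller ever looking at the scheme. The helper name get_content() and the
file name baz.xml are mine, chosen only for illustration.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def get_content(uri):
    """Dereference a URI; the caller never inspects the scheme."""
    with urlopen(uri) as response:
        return response.read()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "baz.xml"
    path.write_bytes(b"<baz/>")
    # A file: URI and a data: URI, fetched through the same call.
    uris = [path.as_uri(), "data:text/plain,hello"]
    contents = [get_content(u) for u in uris]

print(contents)
```

Hand the same function a mailto: URI and urllib reports an unknown URL
type, because there is nothing there to get.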

> .... Another application might know to DELETE every
> resource that's selected.   

That's fine for the application to do. But it says nothing about whether
the resource should support GET. If it supports DELETE then I guess
there is some data there to delete. Why shouldn't the resource support
GET so I can ask it about that data before I delete it? Maybe the
application wants to check what it is deleting before it deletes it!

> ... Another example:  let's say we have an NFS:
> URI scheme, a means of linking to NFS mountable volumes.  Cool idea.  When
> I click on such a link, do I want to GET the entire volume? 

You don't get the entire volume. You get something like:

<volume>
  <file>...</file>
  <file>...</file>
  <directory>...</directory>
  <directory>...</directory>
</volume>

If you want to establish some kind of connection ("a mount point") with
the server then you do a POST. But I'll point out again that if I see
the nfs: URI in (for example) an XSLT context then I want to do a GET
and do a transform or embed of whatever I get back.
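
As a sketch of what a client could do with that GET result, assuming the
<volume> vocabulary above and some made-up entry names, a few lines of
Python are enough to inspect the listing before deciding whether a mount
is worth requesting. The function summarize() is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical representation returned by GET on an nfs: URI.
VOLUME = """<volume>
  <file>README</file>
  <file>notes.txt</file>
  <directory>src</directory>
</volume>"""


def summarize(volume_xml):
    """List the files and directories named in a <volume> document."""
    root = ET.fromstring(volume_xml)
    return {
        "files": [f.text for f in root.findall("file")],
        "directories": [d.text for d in root.findall("directory")],
    }


print(summarize(VOLUME))
```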

> .... Probably not,
> but mounting it would be great.  And the fact that I could imagine putting
> such an NFS link on a web page is the essence of hyperlinking IMO.  Same
> with mailto: links;  they generally send rather than GET mail when
> selected.  Metcalfe's law:  the power of my system grows when I can mix all
> of these as needed.  That's why I don't want to limit the Web to
> retrievable documents.

As I've shown, retrievable documents are not limiting. They are
incredibly powerful, flexible and extensible. It is URIs that are
limiting -- when you try to use them for expressing behaviour, rather
than addressing.

They are, after all, just addresses. It's a little bit like looking at a
phone number and figuring that it starts with 976 so the "appropriate
thing to do" is ask the person on the other end what they are wearing.
Let me suggest that it would be better to find out whether the number is
ACTUALLY associated with a sex hotline based either on the context you
found it in, or the information you get when you call it. The form of
the phone number is irrelevant.

I can only think of two widely deployed, important cases where I've seen
many URIs that did not support GET. The mailto link is one. The other is
magical urn: URIs used primarily for XML namespaces. If something like
RDDL ever gets deployed and becomes useful, people who use those
non-resolvable URIs will wish they had just used http: URIs. We aren't
at that point yet, so I guess we'll have to wait and see. (By the way, I
was historically a big booster of urn:-style URIs for namespaces.)

 Paul Prescod
Received on Saturday, 30 March 2002 03:08:50 UTC