usage of 'resource' vs 'representation' in HTML 5, CSS, HTML 4, SVG, ... from Dan Connolly on 2009-12-10 (www-tag@w3.org from December 2009)

From: Dan Connolly <connolly@w3.org>
Date: Wed, 09 Dec 2009 23:55:19 -0500
To: "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <4B207F37.5030103@w3.org>
The clock has started on this issue in the HTML WG; proposals are due
January 16, 2010:
http://lists.w3.org/Archives/Public/public-html/2009Dec/0256.html

I'm thinking out loud a bit here...

I'm sympathetic to this viewpoint:

"the confusion is caused by trying to reference something that doesn't
exist. There is no such thing as what you call a "resource" -- it's an
abstract concept that has no correspondance to the real world. It is
unnecessary and makes talking about our infrastructure more complicated."
 -- http://lists.w3.org/Archives/Public/public-html/2009Sep/1133.html

Meanwhile, the translation impact suggests otherwise:
http://lists.w3.org/Archives/Public/public-html/2009Sep/1136.html

Ian points out usage that suggests "a resource is a bag of bits"
in HTML 4, CSS, SVG etc.
http://lists.w3.org/Archives/Public/public-html/2009Sep/1132.html

Roy Fielding dismisses those as "just examples," but I think it's a bit
more subtle than that... I think the webarch view of those usages is
that, typically, a URI identifies pretty much a file... the kind whose
contents change over time, not the contents of the file at any one time.
So to say '<xyz.html> identifies an HTML file' is not to say that it
identifies a bag/sequence of bits, but rather that it identifies a
resource whose representations have MIME type text/html.
But as I say, I'm sympathetic to the position that (outside of the
Semantic Web) this abstraction just makes talking about all this stuff
more complicated.

Meanwhile, Ian also says:

 This is actually intended to refer to "bag of bits". It identifies a
 bag of bits in the same way that a telephone number identifies a
 person. Sure, if you call a number at different times you might end up
 with different people, but you're still using a phone number to
 identify a person, you just don't know which one until you try to use
 the phone.

I find that usage of "identify" very unappealing. I think normal usage
of "identify" is unambiguous. If I said "In this game, teams are
identified by color" and then told you that blue identifies both team X
and a different team Y, you'd consider that nonsense.

I wonder about some terminology that just relates URIs with byte sequences,
without going thru the intermediate concept of resources, and yet doesn't
use "identify" in this confusing sense.

Something like:

  A URL is a key typically used to retrieve a page from the Web; more
  generally, it is used as an address in the Web, whether to find
  documents, mailboxes, services, applications, etc.

"navigation marker" also appeals to me, though I'm not sure there's any
specific place in the HTML 5 spec to talk about it that way.

So "find" in place of "identify". Somewhat ironic... "find" is a
synonym for "locate"... so maybe...

  A URL is a key, typically used to locate a Web page; more generally, it is
  used to locate mailboxes, services, applications, etc.

(footnote: I try to toe the party line where the standard term is 'URI'
rather than 'URL', but only out of duty/burden/obligation; somewhere
between RFC 2396 in '98 and RFC 3986 in '05, I tried to convince TimBL
and the TAG that it's pushing water uphill to try to get the community
to learn 'URI' rather than just going with the flow and using 'URL', but
I couldn't make the sale. I'm reasonably happy to see arguments on both
sides examined in some detail in the context of working out IRI interop
stuff.)

But maybe not... I think the analogy with files suggests that 'locate'
raises the same issues as 'identify'; that is: filenames name files...
or identify files... or locate files; in any case, when you open a file,
edit it, and save it back, it's still the same file, and the filename
identifies/names/locates/refers to the file, not its contents at a given
time. This analogy works with variables in a program, too:

 x = 1
 y = 2
 x = y + 1

There's just one variable called/named x; the name 'x' refers neither to
1 nor to 3, but rather to the place in memory that holds 1 at first and
then 3.
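
The same point can be sketched with a file instead of a variable. This is
only an illustration under stated assumptions: it uses Python, a throwaway
temporary directory, and the file name xyz.html borrowed from the example
earlier in this message; the contents written are made up.

```python
import os
import tempfile

# Hypothetical file name, borrowed from the <xyz.html> example above.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "xyz.html")

    # Save the file once and read back what the name retrieves.
    with open(path, "w", encoding="utf-8") as f:
        f.write("<p>first draft</p>")
    with open(path, encoding="utf-8") as f:
        first = f.read()

    # Edit the file and save it back under the same name.
    with open(path, "w", encoding="utf-8") as f:
        f.write("<p>second draft</p>")
    with open(path, encoding="utf-8") as f:
        second = f.read()

# One name, one file, two different byte sequences over time.
same_name_different_contents = first != second
```

The path never changes, yet what it retrieves does; the name refers to
the file, not to either of the two contents.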

I guess it's only in very informal glosses that you can skip from the
URL to the sequence-of-bytes without referring to the notion in
between... though 'retrieve' does seem to get around it. Filenames can
be used to retrieve sequences of bytes... variable names can be used to
retrieve values. 'retrieve' doesn't generalize to mailto: and POST so
well, but as Ian pointed out somewhere in the thread, the HTML 5 spec
doesn't need that generalization.

One specific case where the terminology showed up in the HTML 5 spec was
around caches, I think; in that case, it's clear to me that the simplest
way to talk about it is to talk about caching responses... or the
content of response messages. Something like that.
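
To make "caching responses" concrete, here is a rough sketch in Python.
It is not how the HTML 5 spec or any browser defines caching; the names
Response, ResponseCache, and fake_fetch, and the example.org URL, are all
hypothetical, and expiry and validation are ignored. It only shows that a
cache can be described as a table from URLs to stored responses, with no
mention of "resources".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Response:
    """Hypothetical stand-in for an HTTP response message."""
    status: int
    content_type: str
    body: bytes


class ResponseCache:
    """Maps a URL to the response stored for it (no expiry, no validation)."""

    def __init__(self, fetch: Callable[[str], Response]) -> None:
        self._fetch = fetch
        self._stored: Dict[str, Response] = {}

    def get(self, url: str) -> Response:
        # Fetch only on a miss; afterwards serve the stored response.
        if url not in self._stored:
            self._stored[url] = self._fetch(url)
        return self._stored[url]


# Stand-in for the network, so the sketch runs offline.
calls: List[str] = []


def fake_fetch(url: str) -> Response:
    calls.append(url)
    return Response(200, "text/html", b"<p>hello</p>")


cache = ResponseCache(fake_fetch)
a = cache.get("http://example.org/xyz.html")
b = cache.get("http://example.org/xyz.html")
```

The second get() is answered from the table, so fake_fetch runs once and
both calls return the same stored response.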

I hope to look at a few specific cases of HTML 5 spec text, but it's
late here and I already spent a lot more time on this message than I
intended to...

-- 
Dan
Received on Thursday, 10 December 2009 04:55:32 UTC