Re: Proposed issue: What does using an URI require of me and my software? from Tim Berners-Lee on 2003-09-26 (public-sw-meaning@w3.org from September 2003)

From: Tim Berners-Lee <timbl@w3.org>
Date: Fri, 26 Sep 2003 15:42:17 +0100
To: public-sw-meaning@w3.org
Message-Id: <A01DD31E-F02F-11D7-BE7B-000A9580D8C0@w3.org>
In-Reply-to Bijan's original
<http://lists.w3.org/Archives/Public/public-sw-meaning/2003Sep/0054.html>, clarifications.

1. First of all, it does seem a funny question. What can a symbol
require of a person? Nothing. What can receipt of a message which uses
a symbol require of a person? Nothing. You can't require people to do
things by sending them things.

We can define a *protocol* in which people do do things, and we can
demonstrate that if people adhere to the protocol interesting results
can be assured.  Then conformance with the protocol would require that
one do something.

2. Bijan, you mention as an aside

> (Thus far we have no account of whether it's only using a URI
> "assertionally" or even "without scare quotes" that so commits. Given
> that RDF has neither non-assertional modes nor scare quotes, it may be
> moot. I am curious, however, about the status of URIs in literals!!!)


N3 extends RDF with quotation using {} as you know.  There is no direct
commitment to the ontology of URIs used in quoted RDF.  The role of the
nested RDF statements (formula) depends completely on the predicate
they are used with.

{ ... }  a  log:Truth.
{ ... }  a :LoadOfOldCodswallop

A string which happens to be a valid URI occurring in an RDF statement  
is
not an occurrence of the URI as a symbol in the RDF statement.  This of
course allows one to reify quoted formulae.
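
The distinction in point 2 can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not how cwm
represents things: URIs used as symbols are plain strings, and the
names Literal, Quoted and symbols_used are hypothetical. URIs inside a
quoted formula, or inside a literal string, are not counted as symbols
used by the outer statements.

```python
from typing import NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

class Literal(NamedTuple):
    """A string value; it may look like a URI but is not used as a symbol."""
    value: str

class Quoted(NamedTuple):
    """A quoted formula { ... }: nested triples that are mentioned, not asserted."""
    triples: tuple

def symbols_used(triples):
    """Collect the URIs that occur directly as symbols in the given triples.

    Terms inside Quoted formulae and Literal values are deliberately skipped.
    """
    used = set()
    for triple in triples:
        for term in triple:
            if isinstance(term, str):   # a plain str stands for a URI symbol
                used.add(term)
    return used

# Hypothetical example data: a quoted formula classified by the outer
# statement, and a literal that merely contains a URI-shaped string.
inner = (("http://example.org/#sky",
          "http://example.org/#colour",
          "http://example.org/#green"),)
kb = [
    (Quoted(inner), RDF_TYPE, "http://example.org/#LoadOfOldCodswallop"),
    ("http://example.org/#doc",
     "http://example.org/#seeAlso",
     Literal("http://example.org/#sky")),
]
used = symbols_used(kb)
```

Here http://example.org/#sky never ends up in `used`, since it occurs
only inside the quotation and inside the literal.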

3.  Ontology access not required

> TECHNICAL POINT
>
> This resolves to a simple technical point: Should an RDF
> processor/reasoner/agent import, to the best of its ability, pace
> network outages, cacheing, etc. "the" ontology of every URI it sees in
> a document? There *is software that made this assumption*. In our
> group, a student wrote an editor, RIC, that did exactly this. DanC and
> Tim, at the WWW BOF, I'm pretty sure, said that this was what you *had*
> to do.

Correction: I certainly don't remember saying that you *had* to, and I  
believe
I would have argued strongly against me, had I!

> DanC said, I believe, that all his software already did that.
>
> I'm not against software doing that. I'm against the spec requiring it.

Good.

cwm will operate in a number of modes with respect to looking
stuff up as a function of the URIs loaded into the knowledge base.

     cwm --closure=flags

where flags are a combination of
   p - look up predicates
   s - look up subjects
   o - look up objects
   t  - look up the object only where the predicate is rdf:type

An interesting mode is cwm --closure=pt.
Let's call this "ontological closure".
It is a reasonable thing to do, as it adds to the KB the
machine-readable information which is assumed shared by the writer
and reader of a document.   If people do this a lot, then it is useful
to write documents whose ontological closure is of a manageable size.
This is the case with any real RDF files I've tried. It is what you
might expect - people define ontologies using ontologies but only to a
limited level.

Contrast with   --closure=spo  which pulls in the whole contiguous
semantic web starting at the given document.  This is not practical.
This is interesting, as it highlights a difference between p and s  and  
o
not only in the spec but in the topology of the web.
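
The p, s, o and t flags described above can be sketched in Python.
This is an illustration only, not cwm's actual code. It assumes that
triples are plain tuples, that a URI is a plain string, that the
document to look up for a symbol is its URI with the fragment stripped,
and that the fetch argument is a caller-supplied function returning the
triples found in a document. The names Literal, document_of,
uris_to_look_up and closure are all hypothetical.

```python
from typing import NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

class Literal(NamedTuple):
    """A literal value; never looked up, even if it looks like a URI."""
    value: str

def document_of(uri):
    """Assumed rule: the document for a symbol is its URI minus the fragment."""
    return uri.split("#", 1)[0]

def uris_to_look_up(triples, flags):
    """Return the documents a given combination of flags would look up."""
    docs = set()
    for s, p, o in triples:
        if "s" in flags and isinstance(s, str):
            docs.add(document_of(s))
        if "p" in flags and isinstance(p, str):
            docs.add(document_of(p))
        if "o" in flags and isinstance(o, str):
            docs.add(document_of(o))
        # t: the object only where the predicate is rdf:type
        if "t" in flags and p == RDF_TYPE and isinstance(o, str):
            docs.add(document_of(o))
    return docs

def closure(start_triples, flags, fetch):
    """Keep loading documents until the flags call for nothing new."""
    kb = set(start_triples)
    seen = set()
    pending = uris_to_look_up(kb, flags)
    while pending:
        doc = pending.pop()
        seen.add(doc)
        kb |= set(fetch(doc))
        pending = uris_to_look_up(kb, flags) - seen
    return kb

# Hypothetical data: one statement using an rdf:type object, one with a literal.
triples = [
    ("http://example.org/doc#sally", RDF_TYPE, "http://example.org/people#Girl"),
    ("http://example.org/doc#sally", "http://example.org/vocab#homepage",
     Literal("http://example.org/sally")),
]
pt_docs = uris_to_look_up(triples, "pt")
spo_docs = uris_to_look_up(triples, "spo")
```

With "pt" only the documents behind the predicates and the class are
candidates; with "spo" the document of every subject and object is
added as well, which is how the lookup spreads across everything
reachable.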

But all this is about things you *can* do and how we can make that useful.
It isn't what you *must* do.

(I think actually a lot of interesting work will be in finding
combinations of breadcrumbs to leave in the ontologies and algorithms
for following them.   cwm --closure=ptr also follows links to rule
files and loads the rules. How do we set expectations about the sorts
of rules one will set and what will happen if we run them? Can we
leave pointers to translations to other ontologies?  Can I write a tax
form ontology so that it points to enough stuff directly or indirectly
for an inference engine to fill it out? and so on.)

3. PROCEDURAL POINT

> I don't see how this is a matter for Web Architecture rather than the
> working groups. Not all uses of URIs entail inclusion of the document
> the URI points to, even in HTML. For example, <a
> href="...someImage.jpg">...</a> vs. <img src="...soemImage.jpg"/>.

It is an architectural issue in that the meaning/denotation of a URI
is defined in the architecture to be basically the same for everyone,
so there may be implications for other parts of the web.

That said, the global hypertext system built out of HTML and
the global KR system built out of RDF are different in many ways.
I think we should get the semantic web act together before
we look at ways in which we try to merge them too closely.
It might be reasonable to convert/coerce an RDF URI into a hypertext
anchor of a document describing what the thing is, but it is not
the primary goal.


4. Retrievability

> Retrievability is a total red herring. Much of Tim's language above
> seems devoted to weakening the requirement that you look up the
> ontology each time. That weakening just doesn't do anything for me.

There is no such requirement, so no weakening of it.

> The
> requirement is commitment to the URI owner's ontology, and, apparently,
> to the current URI owners *current* ontology (which I must try to
> ascertain to the best of my ability, and we're tolerant of web
> failures, etc.)

You seem to think that you are being constrained
as to what you, personally, or you as your software must do.
You don't have to do anything from the point of view of us defining
what the RDF means.  Whether your machine spends a certain
amount of time or effort to access things under various circumstances
depends on what you are trying to do.  If you have no goals, turn the
machine off.

It is much more a question of defining *if* you do these things, then
what will you be able to conclude?  From that may follow your decisions
as to what to do to achieve a goal.

Yes, to name the class of reasoners which does inference and looks
up ontological closures as it goes might be useful, just as we have
various levels of OWL reasoner.  But we don't expect it of everything  
with
an RDF parser.

5. Natural language definitions

> Natural language defintions [are a red herring].
> These are related but distinct. But let me
> tell you, if you think I'm committed to not only the *formal, machine
> readable* ontologies (in an importy sense) but the *natural language*
> ontologies...uh...well, let's just say I don't know how to import the
> latter. (Actually, this would suggest that I, a software writer, would
> have to check EVERY URI for the natural language spec and rewrite my
> software to conform. Yick.)

Only if you want to write software which does more useful things.
Yes, people who receive documents written in the OWL ontology, and
want to derive more useful information from them, have to read the
OWL specs and stuff. Yick.  Hard work.   But you only have to do that
if you need to.  Maybe you do it because you are claiming to
provide a service which includes inference as defined in an OWL
conformance level.

6. Limiting the damage

There is a corner case in which somebody writes
something rather irrelevant and potentially inconsistent
in an ontology.  Sally defines Girl as intersection of
YoungPerson and Female, and also mentions
that the weather is sunny.

One solution is to say that that is not a friendly way to do things,
she should not do that.

Another solution is to try to "limit the damage" as you say
so that somehow the information in the ontology to which
one is committed is only that which "defines" the predicate
in question for some value of "defines".  I can't see any
way to do that formally - only with hand waving.
I can see the distinction being used in court but not in a machine.
So I haven't explored that route.

7. Naive protocols and safe operating procedures

Actually, the whole question of damage raises another distinction.
Most of the "intuition pump" example is about things going wrong.

I think we may have to consciously distinguish in the design
of the semantic web in general between the normal expected ways
of going about things, which we can show will work, and the
operating procedures which will allow one to operate safely in
a potentially hostile environment.

Example: a purchase choice system chooses the cheapest product
which is offered as being compatible with an hp:p314159.
The naive protocol is for the seller to offer the compatibility and  
price
system in the catalogue which you get by dereferencing the
part URI, and the buyer loading the catalogs into its kb.
A more secure system filters the catalogs for lies about
which product is best.  You can define conformance with some
market protocol in that the catalog only has data of a given form,
but you still want to be careful about things which break it.
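
The contrast between the naive buyer and the more careful one can be
sketched in Python. This is a hedged illustration, not a proposed
protocol: the Offer record, the example catalog URIs and the idea of a
set of trusted sources are all assumptions made here. In particular,
"trusting only certain catalogs" is just one simple stand-in for
filtering out lies.

```python
from typing import NamedTuple

class Offer(NamedTuple):
    """One compatibility-and-price statement loaded into the buyer's kb."""
    product: str
    compatible_with: str
    price: float
    source: str   # the catalog document the statement came from

def cheapest_naive(offers, part):
    """Naive protocol: believe every catalog and take the cheapest claim."""
    candidates = [o for o in offers if o.compatible_with == part]
    return min(candidates, key=lambda o: o.price, default=None)

def cheapest_filtered(offers, part, trusted_sources):
    """More careful: only consider statements from catalogs we trust."""
    trusted = (o for o in offers if o.source in trusted_sources)
    return cheapest_naive(trusted, part)

# Hypothetical catalogs, including one making a suspiciously cheap claim.
offers = [
    Offer("ex:cartridgeA", "hp:p314159", 30.0, "http://hp.example/catalog"),
    Offer("ex:cartridgeB", "hp:p314159", 25.0, "http://reseller.example/catalog"),
    Offer("ex:cartridgeC", "hp:p314159", 1.0, "http://dodgy.example/catalog"),
    Offer("ex:cartridgeD", "hp:p271828", 5.0, "http://hp.example/catalog"),
]
trusted = {"http://hp.example/catalog", "http://reseller.example/catalog"}

naive_choice = cheapest_naive(offers, "hp:p314159")
careful_choice = cheapest_filtered(offers, "hp:p314159", trusted)
```

Both functions implement the same selection rule; they differ only in
which statements are allowed into the comparison.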

Focussing, then, back on what an RDF document means, which
was the original narrower scope than all of this, I would say we have
to first define the naive protocol,

- Use of an HTTP URI as a symbol in an RDF statement
  refers to one thing which the URI owner intended.

- The URI owner puts true, consistent, human- and/or machine-readable
  information in the document that you get should you choose to
  dereference the URI.

- Nobody hijacks the domain name system, the LAN or the server,
  or an intervening proxy, or the user's computer, etc


If we could get that nailed down first, then afterwards we could launch
into the questions of what happens when people lie, make mistakes,
fix mistakes, the net goes down, and so on, as to whether we should
make the best of it, model everything in an extra level of detail,
take someone to court or call them to discuss it over lunch, et cetera.

We can also define useful rules of friendly behavior which a community
could adopt to make a working system within that community.

Tim
Received on Friday, 26 September 2003 10:42:25 UTC