Re: Descriptors linked from RS or DR? (again) from Charles McCathieNevile on 2007-09-09 (public-powderwg@w3.org from September 2007)

From: Charles McCathieNevile <chaals@opera.com>
Date: Mon, 10 Sep 2007 07:17:58 +0900
To: "Smith, Kevin, VF-Group" <Kevin.Smith@vodafone.com>, public-powderwg@w3.org
Message-ID: <op.tyau9trqwxe0ny@pc129.n201.mita.keio.ac.jp>
Your Majesty, Your Eminences,

my comments, claims, and possible errors of commission are inserted below  
at the relevant point. The short answer is I am still holding out for this  
issue, because I think the current data structure means that an RDF  
processor would make incorrect assumptions about our data and therefore  
authorise treatment of it which follows the RDF spec but breaks our  
semantics.

On Wed, 05 Sep 2007 21:49:12 +0900, Smith, Kevin, VF-Group  
<Kevin.Smith@vodafone.com> wrote:

> It seems that if the RS contains the DR, it will need the
> embargo/expiration and the foaf:maker too so that trustworthiness can be
> established.

No. If you want to keep that information around, you have the original  
document, or you make a data structure that shifts it to suit your  
internal purposes. You do it where you want an optimisation. If you want  
the DR, use the full DR.

> -----Original Message-----
> From: public-powderwg-request@w3.org
> [mailto:public-powderwg-request@w3.org] On Behalf Of Phil Archer
> Sent: 04 September 2007 16:55
> To: public-powderwg@w3.org
> Subject: Re: Descriptors linked from RS or DR? (again)
>
>
> No, we didn't think of that, but as I understand Chaals, you'd look at
> the whole DR, including the attribution and expiry data and only then
> decide to process the RS and D. They wouldn't exist outwith the DR, but
> might be processed in an optimised system deep in a Norwegian Fjord once
> their credentials had been established.
>
> Putting the expiry dates etc. in the RS is getting close to getting rid
> of the DR altogether and just having an RS with all the stuff in it -
> which is really back to option A.

Right, and not terribly useful.

>> -----Original Message-----
>> From: public-powderwg-request@w3.org
>> [mailto:public-powderwg-request@w3.org] On Behalf Of Phil Archer
>> Sent: 04 September 2007 16:09
>> To: Public POWDER
>> Subject: Re: Descriptors linked from RS or DR? (again)

>> At present, the structure of a Description Resource is:
...
>> Call that option A.
>>
>> The question is, should we change this for option B:
>>
>> <wdr:DR>
>>    <attribution>
>>    <ResourceSet>
>>      <Descriptors>
>>    </ResourceSet>
>> </wdr:DR>
>>
>> Oh it looks so minor...
>>
>> Points in favour of option B (correct me if I'm wrong Charles)
>>
>>   1. Once you have parsed a DR and decided it's trustworthy, for
>> internal processing you can throw away the attribution and the rest of
>> the DR and just take the Resource Set and the Descriptors. This is a
>> pointer to efficiency.

Although this is an optimisation that explicitly removes the attribution,  
so you don't use it without realising that.
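
A minimal sketch of that optimisation, in Python, assuming a DR is held as
a plain dictionary with the descriptors nested inside the resource set as
in option B. The field names, the TRUSTED_MAKERS set and the function
strip_after_trust_check are all hypothetical, not POWDER vocabulary. The
attribution is dropped only after it has been checked, and the caller
drops it knowingly.

```python
from datetime import date

# Hypothetical list of makers this processor has decided to trust.
TRUSTED_MAKERS = {"http://www.example.org/foaf.rdf#me"}

def strip_after_trust_check(dr, today):
    """Return the resource set (with its descriptors) once the DR is vetted.

    Raises ValueError if the DR is untrusted or expired, so the stripped
    form never exists without its credentials having been established.
    """
    if dr["maker"] not in TRUSTED_MAKERS:
        raise ValueError("untrusted maker: " + dr["maker"])
    if today > dr["valid_until"]:
        raise ValueError("DR expired on " + dr["valid_until"].isoformat())
    # From here on the attribution is deliberately left behind.
    return dr["resource_set"]

dr = {
    "maker": "http://www.example.org/foaf.rdf#me",
    "valid_until": date(2008, 7, 19),
    "resource_set": {
        "includes_host": "example.org",            # placeholder scope
        "descriptors": {"child_friendly": True},   # placeholder descriptors
    },
}

resource_set = strip_after_trust_check(dr, date(2007, 9, 10))
```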

>>    2. It is much more sound semantically to have Resource Set ->
>> Descriptors. It's a lot closer to the established RDF model (although
>> I'm not convinced that you could process this entirely without having
>> to use POWDER-specific routines).

You cannot process this entirely without some POWDER-specific routines.  
However, you can take a resource, such as a DR, and do intelligent things  
with a pure RDF engine such as adding further information about where it  
came from, or noting that there are properties and classifying them even  
if you don't understand them, or translating the information to some other  
formalism (database tables, looking at them in tabulator, etc).

The reason for my proposal is that as I understand it, option A will lead  
any non-POWDER-aware system to process the data in a way which is actually
wrong. In other words, unless you isolate this data from all non-POWDER  
aware systems, you risk polluting the semantic web with a whole lot of  
statements that look like RDF but in fact lead to incorrect data  
structures.

>> The points in favour of option A are (correct me if I'm wrong Andrea)

>>    2. The real problem comes when you have a DR with a complex scope
>> such as:
>>
>> <wdr:DR>
>>    <attribution>
>>    <ResourceSet>
>>      <owl:unionOf>
>>        <ResourceSet1>
>>        <ResourceSet2>
>>      </owl:unionOf>
>>    </ResourceSet>
>> </wdr:DR>
>>
>> If we're going to link the Descriptors from the RS, there are three
>> possible places it could go. You could have:
>>
>> <wdr:DR>
>>    <attribution>
>>    <ResourceSet>
>>
>>      <owl:unionOf>
>>
>>        <ResourceSet1>
>>          <Descriptors_A>
>>        </ResourceSet1>
>>
>>        <ResourceSet2>
>>          <Descriptors_B>
>>        </ResourceSet2>
>>
>>      </owl:unionOf>
>>      <Descriptors_C>
>>    </ResourceSet>
>>
>> </wdr:DR>
>>
>> Call this example C
>>
>> That is, three separate Descriptor Classes, one for each RS. To which
>> Charles says something like 'so what?' Why is that bad?
>>
>> Well, IMHO it's only bad if it's an error. That is, if you didn't mean
>> to associate a Descriptors Class with a particular RS. Our thinking
>> tends to be: One DR has one Resource Set and one set of Descriptors
>> (although a given Resource Set or Descriptors set may comprise several
>> sets in union or intersection). Example C is only complicated because,
>> well, it's complicated. The simple version - option B - is no more
>> complex than option A. And it's the fact that the creator wants to
>> describe the three different Resource Sets differently - which is a
>> complex thing to want to do - that makes the DR complex.

But possible. I don't think you can do this at all in our current model -  
although you can merge two things with the same set of descriptors, or you
can merge two sets of resources and their sets of descriptors just blindly  
asserting that all the descriptions from each side are accurate for all  
the assertions.

Nothing would stop you doing that in the proposed model, except that you  
merge at the level of RS, not DR. Since whoever did the merge should take  
responsibility for the claim, this makes more sense than some magic merge  
where you somehow decide which set of attribution information to remove,  
and makes more sense than the further restriction that you can only merge  
where the attribution is the same. In the proposed model, if you decide to  
merge another resource set into your DR you just grab it. If you decide to
make claims about the joint set beyond those which are already in the RS  
that you accepted, you can do so.
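
To make the merge concrete, here is a rough Python sketch under the
proposed model, with descriptors hanging from each resource set. The names
ResourceSet, merge and descriptors_for are hypothetical, the URL-prefix
tests stand in for real scope definitions, and plain strings stand in for
blocks of descriptors.

```python
from dataclasses import dataclass, field

@dataclass
class ResourceSet:
    matches: object = None                          # membership test for a simple set
    union_of: list = field(default_factory=list)    # sub sets, if this is a union
    descriptors: set = field(default_factory=set)   # claims made about this set

    def contains(self, url):
        if self.union_of:
            return any(part.contains(url) for part in self.union_of)
        return bool(self.matches and self.matches(url))

def merge(parts, extra_descriptors=()):
    """Union existing sets as they are; extra claims attach to the joint set."""
    return ResourceSet(union_of=list(parts), descriptors=set(extra_descriptors))

def descriptors_for(rs, url):
    """Descriptors of rs and of every sub set that contains url."""
    if not rs.contains(url):
        return set()
    found = set(rs.descriptors)
    for part in rs.union_of:
        found |= descriptors_for(part, url)
    return found

mobile = ResourceSet(matches=lambda u: u.startswith("http://m.example.org/"),
                     descriptors={"mobileOK"})
accessible = ResourceSet(matches=lambda u: u.startswith("http://a.example.org/"),
                         descriptors={"accessible"})
# Whoever merges takes responsibility only for the claim added here.
joint = merge([mobile, accessible], extra_descriptors={"child-friendly"})
```

The merged sets keep their own descriptors untouched, so a resource in the
mobile set picks up the mobile claims plus the joint claim, and nothing
from the accessible set.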

(As I understand the OWL,

<Beer>
  <owl:unionOf>
    <Foo>
     <hasProp r:resource="ABar">
     <hasProp r:resource="ABetterBar">
    </Foo>
    <Foo>
     <hasProp r:resource="ABar">
     <hasProp r:resource="AnotherBar">
    </Foo>
  </owl:unionOf>
</Beer>

implies

<Beer>
  <owl:unionOf>
    <Foo>
     <hasProp r:resource="ABetterBar">
    </Foo>
    <Foo>
     <hasProp r:resource="AnotherBar">
    </Foo>
  </owl:unionOf>
  <hasProp r:resource="ABar">
</Beer>

but not vice versa, so if you wanted to further mess around with the bits  
inside you should keep their descriptors explicit inside them as well).
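
The rewrite in that parenthesis can be mimicked with plain sets. This is
only a sketch of the transformation as stated above, with a hypothetical
helper hoist_common; it does not check the claim against the OWL
semantics, and it does not attempt the reverse direction.

```python
def hoist_common(members):
    """Move property values shared by every member up to the union."""
    common = set.intersection(*members) if members else set()
    remaining = [m - common for m in members]
    return common, remaining

foo_1 = {"ABar", "ABetterBar"}
foo_2 = {"ABar", "AnotherBar"}
on_union, on_members = hoist_common([foo_1, foo_2])
```

Note that foo_1 and foo_2 are left as they were, which matches the advice
to keep the descriptors explicit inside the parts if you mean to work on
them further.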

>> How would you process Example C?
>>
>> You'd do some SPARQL to find out that the RS is a union of two others
>> and that the Descriptors linked from that union applied to both
>> its constituent sub sets.
>>
>> Then you'd parse each sub set and find a bunch of descriptors that
>> applied to each of those (and that, for example, Descriptors_B did not
>> describe RS 1). (More concrete example: RS 1 is the mobileOK stuff, RS
>> 2 is the accessible stuff, the overall RS is all child-friendly.)
>>
>> That doesn't seem bad.
>>
>> Can we constrain some of this to help mitigate errors? My suggestion
>> is that (if we go with option B)
>>
>> 1. We define the wdr:hasDescriptors property as having Resource Set
>> as its domain and that an RS MUST have this property (i.e.
>> owl:cardinality = 1)
>>
>> 2. We define a sub class of Resource Set that doesn't support the
>> hasDescriptors property. A processor would 'know' that such a class
>> was only described when considered part of the overall RS.
>>
>> It's probably a bad idea to call this a sub set but it does provide a
>> partial definition of a Resource Set so how about wdr:PartSetDef.
>>
>> So we might have something like
>>
>> <wdr:DR>
>>    <attribution />
>>    <ResourceSet1>
>>      <owl:intersectionOf>
>>
>>        <PartSetDef />
>>
>>        <ResourceSet2>
>>          <Descriptors_A>
>>        </ResourceSet2>
>>
>>      </owl:intersectionOf>
>>      <Descriptors_B>
>>    </ResourceSet1>
>>
>> </wdr:DR>
>>
>> This says that RS 2 is described by Descriptors A; the partial set
>> definition has no description but resources that meet that definition
>> AND are in RS 2 are described by Descriptors B.
>>
>> This is a little complex! Actually, I'd probably want to use a package
>> to do this a little differently, but the approach seems logically
>> sound to me. You too??

Yes, except that if RS supports the hasDescriptors property then so does
any subclass. Instead, we should define a "DescribedRS" that has to have a
descriptor (and should normally be used) and which, along with PartSetDef,
is a subclass of RS, and we should not require a plain RS to have a
descriptor.
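
As a loose analogy in Python (classes here are only a stand-in for RDF
classes, and the validate method is a hypothetical addition), the
hierarchy would look like this. PartSetDef inherits the ability to carry
descriptors because the base class has it; only DescribedRS adds a
requirement.

```python
class ResourceSet:
    """Base class: may carry descriptors but is not required to."""
    def __init__(self, descriptors=None):
        self.descriptors = list(descriptors or [])

    def validate(self):
        return []   # no constraint at this level

class DescribedRS(ResourceSet):
    """The normal case: must be linked to at least one block of descriptors."""
    def validate(self):
        if not self.descriptors:
            return ["DescribedRS has no descriptors"]
        return []

class PartSetDef(ResourceSet):
    """Partial definition, described only as part of an enclosing set.

    It still supports descriptors, since the base class does.
    """
```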

>> The point about introducing the requirement that an RS MUST be linked
>> to a block of Descriptors is that it provides another validation step.

Indeed. How important is this?

>> One more thing. What Charles actually proposed in Washington was that
>> an RS MUST have a property of 'hasTag' - which can link to anything,
>> including free text, and then we define a sub property of that which
>> is the hasDescriptors property.

Given the validation thing above, I would propose that an RS SHOULD have
one or more hasTag properties, and that a DescribedResourceSet MUST have
at least one (minCardinality 1), or even MUST have a hasDescriptors
property (the version where we know there is a URI, not just a random
tag).
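
A sketch of how a checker might apply those two levels, assuming triples
are held as (subject, property, value) tuples and that hasDescriptors is
declared a sub-property of hasTag. The SUB_PROPERTY_OF table, the function
names and the sample data are all hypothetical.

```python
# Hypothetical sub-property table: hasDescriptors is a kind of hasTag.
SUB_PROPERTY_OF = {"hasDescriptors": "hasTag"}

def values_of(triples, subject, prop):
    """Values of prop on subject, counting its direct sub-properties too."""
    props = {prop} | {p for p, parent in SUB_PROPERTY_OF.items() if parent == prop}
    return [o for s, p, o in triples if s == subject and p in props]

def check(triples, subject, rdf_class):
    """Return 'error', 'warning' or 'ok' for the proposed rules."""
    tags = values_of(triples, subject, "hasTag")
    if rdf_class == "DescribedResourceSet" and not tags:
        return "error"      # MUST have at least one (minCardinality 1)
    if not tags:
        return "warning"    # SHOULD have one or more
    return "ok"

triples = [
    ("rs1", "hasDescriptors", "http://example.org/descriptors#a"),
    ("rs2", "hasTag", "fun"),
]
```

The stricter variant, where a DescribedResourceSet must have a
hasDescriptors, would query "hasDescriptors" directly instead of "hasTag".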

>> Any feedback? We need to nail this as it's holding us up...

Actually I would be happy to note the issue in the draft and request  
explicit feedback, and then publish.

>> Phil Archer wrote:
...
>>> At its most basic, a DR states that:
>>>
>>> {Organisation} asserts that {Resource Set} is described by
>>> {Descriptors}
...
>>> Our current structure has the Descriptors linked from the DR thus:
>>>
>>> <wdr:DR>
>>>   <foaf:maker rdf:resource="http://www.example.org/foaf.rdf#me" />
>>>   <dcterms:issued>2007-07-20</dcterms:issued>
>>>   <wdr:validUntil>2008-07-19</wdr:validUntil>
>>>   <wdr:hasScope ... />
>>>   <wdr:hasDescriptors ... />
>>> </wdr:DR>
>>>
>>> Putting that in prose we get
>>>
>>> A {Description Resource} was created by {organisation} on {issue
>>> date} that will expire on {valid until date} that describes
>>> {Resource Set} as {Descriptors}.
>>>
>>> Which seems to me to have the right semantics.

Except in terms of RDF, which has no way, in the current structure, to  
understand that the hasDescriptors is meant to apply properties to the  
Resource Set, and instead specifies for any conformant RDF engine that the  
Descriptions in hasDescriptors actually apply to the DR itself.
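
Reduced to triples, the difference looks roughly like this. The sketch
assumes that the DR keeps pointing at its resource set with wdr:hasScope
in both options; the blank node names and the helper subjects_of are
placeholders.

```python
# Triples as (subject, predicate, object); node names are placeholders.
OPTION_A = [
    ("_:dr", "rdf:type", "wdr:DR"),
    ("_:dr", "foaf:maker", "http://www.example.org/foaf.rdf#me"),
    ("_:dr", "wdr:hasScope", "_:rs"),
    ("_:dr", "wdr:hasDescriptors", "_:d"),   # descriptors hang from the DR
]
OPTION_B = [
    ("_:dr", "rdf:type", "wdr:DR"),
    ("_:dr", "foaf:maker", "http://www.example.org/foaf.rdf#me"),
    ("_:dr", "wdr:hasScope", "_:rs"),
    ("_:rs", "wdr:hasDescriptors", "_:d"),   # descriptors hang from the RS
]

def subjects_of(triples, predicate):
    """What a vocabulary-blind engine takes a predicate to be describing."""
    return {s for s, p, o in triples if p == predicate}

described_in_a = subjects_of(OPTION_A, "wdr:hasDescriptors")
described_in_b = subjects_of(OPTION_B, "wdr:hasDescriptors")
```

An engine that knows nothing about POWDER can only go by the subject of
the triple, so under option A it attaches the descriptors to the DR node
and under option B to the resource set node.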

This seems to me a bad use of RDF. It is perfectly sound to require  
processors to understand particular terms in order to process some  
specific RDF vocabulary in a way that is designed to do something. Some  
FOAF and RDFIcal systems do this, interpreting the "nearestAirport"  
property to provide a map view or something. But it is not a good idea to  
say that the nearestAirport applies, in RDF terms, to the RDF document,  
and then expect people to make special processors, or just never try to  
interpret what is in fact valid RDF (although the information it tries to  
model is expressed incorrectly in terms of RDF's basic semantics).

An alternative would be to ask some hard-core RDF geeks what the model  
says - it may be that I am wrong in my interpretation, but I don't think  
so.

Cheers

Chaals

-- 
   Charles McCathieNevile, Opera Software: Standards Group
   hablo español  -  je parle français  -  jeg lærer norsk
chaals@opera.com   http://snapshot.opera.com - Kestrel (9.5α1)
Received on Sunday, 9 September 2007 22:18:12 UTC