Re: New Proposal (6.1) for GRAPHS from Sandro Hawke on 2012-04-03 (public-rdf-wg@w3.org from April 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 02 Apr 2012 20:57:13 -0400
To: Charles Greer <cgreer@marklogic.com>
Cc: Charles Greer <Charles.Greer@marklogic.com>, public-rdf-wg <public-rdf-wg@w3.org>
Message-ID: <1333414633.28199.137.camel@waldron>
On Mon, 2012-04-02 at 14:00 -0700, Charles Greer wrote:
> Thanks for responding Sandro.  I think that what I'm finding difficult, 
> or at least a significant departure from RDF as I have understood it in 
> the past, is that this TRIG document
> 
> <u1> { <a> <b> <c> . <d> <e> <f> }
> 
> is not equivalent to these n-quads:
> 
> <a> <b> <c> <u1>.
> <d> <e> <f> <u1>.
> 
> Or rather, you now need a document structure around n-quads as well in 
> order to provide the context in which rdf knows that these triples, and 
> only these triples, constitute the graph <u1>.
> 
> I had previously thought that RDF was a data model that didn't need any 
> notion of 'document' to work.  I'm not sure how another assertion that
> 
> { <u1> a rdf:Graph }
> 
> can assert the boundaries of <u1> unless either the { } syntax does more 
> than it appears to, or the document is a harder scope boundary than I 
> would have expected.  If the document has some relationship to scope, I 
> think that should be made explicit.

Two main points:

1.  That rdf:Graph declaration is a different thing.  It changes how <u1>
relates to the graph, but in a semantic (not syntactic) way.  It can be
in a different document, or deduced by the use of some predicates, or
known a priori by a data consumer.  Knowing it entitles the consumer to
see that <u1> actually identifies the graph directly, rather than just
being a label for the graph.     This might matter if we also know <u1>
dc:licence ...SomeLicensingTerms....   Is it the graph that's licensed,
or something else?     There are some use cases that suggest this
distinction is important, but if it turns out not to be, it's not bad,
people will just not use rdf:Graph declarations much.

2.  Whether or not your trig example and your n-quads example are
equivalent depends on your reading of n-quads.   This extends to your
reading of SPARQL as well.     My understanding is people are somewhat
informal about this, but they generally do expect that once they've seen
the whole trig file, or the whole n-quads file, or searched the whole
SPARQL end point, that they've seen all the triples in the graph with
that name/label.
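
One way to picture the reading in point 2 is to group the quads by
their fourth term and treat each group as the whole graph for that
label.  A minimal Python sketch, assuming quads are plain 4-tuples of
strings; the name graphs_from_quads and the data layout are made up
for illustration and come from no spec:

```python
from collections import defaultdict


def graphs_from_quads(quads):
    """Group (s, p, o, label) quads into {label: frozenset of triples}."""
    graphs = defaultdict(set)
    for s, p, o, label in quads:
        graphs[label].add((s, p, o))
    return {label: frozenset(triples) for label, triples in graphs.items()}


# The n-quads example from the quoted message, as hypothetical tuples.
nquads = [
    ("<a>", "<b>", "<c>", "<u1>"),
    ("<d>", "<e>", "<f>", "<u1>"),
]

# The TriG example, written directly as label -> graph.
trig = {"<u1>": frozenset({("<a>", "<b>", "<c>"), ("<d>", "<e>", "<f>")})}

# Under the complete-graph reading, the whole n-quads file and the
# TriG document describe the same dataset.
same_dataset = graphs_from_quads(nquads) == trig
```

The comparison only holds if the n-quads file is taken as a whole;
reading it as a possibly partial listing would not justify it.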

As a social test case, we could tell people this SPARQL query is run:

    SELECT ?s ?p ?o 
    WHERE { GRAPH <http://g1.example.org> { ?s ?p ?o } }

and that we got three result bindings back: 

    ?s  ?p  ?o
    === === ===
    <a> <b> 1.
    <a> <b> 2.
    <a> <b> 3.

Then we ask them: "According to this query, how many triples are in the
graph known to that endpoint as 'http://g1.example.org' ?"

What do you think they'll say?

I think most folks will say, "Three", even if you ask them to think
again and be pedantically precise.

I think that means they're using the complete-graph semantics I'm
suggesting.  If they were using partial-graph semantics, they'd have to
say, "Three or more".
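
The difference between the two answers can be sketched in Python.
This is only an illustration: the function triple_count_claim and the
(lower, upper) convention are assumptions of mine, not anything in
SPARQL or the proposal.

```python
def triple_count_claim(bindings, complete=True):
    """Return (lower, upper) bounds on the graph size implied by a result set.

    upper is None when the reading gives no upper bound.
    """
    seen = len(set(bindings))
    if complete:
        # Complete-graph reading: what we saw is the whole graph.
        return (seen, seen)
    # Partial-graph reading: there may be triples we were not shown.
    return (seen, None)


# The three result rows from the example query, as hypothetical tuples.
results = [("<a>", "<b>", "1"), ("<a>", "<b>", "2"), ("<a>", "<b>", "3")]

complete_reading = triple_count_claim(results, complete=True)   # "Three"
partial_reading = triple_count_claim(results, complete=False)   # "Three or more"
```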

You see what I'm saying?   When we have a complete protocol interaction,
via SPARQL, or transmitting a trig or n-quads file, I think the usual
assumption is that *all* the triples in the named graph are being sent,
not just some of them. 

I understand sometimes it would be nice to store/transmit just part of
some named graph.   But, as I discussed in a message a couple of minutes
ago, I think we have to pick one or the other, and I think the
complete-graph approach is better.  It's pretty easy to convey partial
graphs if we define the complete approach.

(I suppose if we defined the partial-graph approach we could transmit
complete graphs by transmitting partial graphs and including a
triple-count as metadata, so you know it's complete.   I guess that
would work, but it seems to me to be optimizing for the much-less-common
case.)
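
A rough Python sketch of that triple-count idea, assuming the count
arrives as a separate integer alongside the triples.  The function
is_known_complete and its parameters are hypothetical; no such
metadata is defined anywhere that I know of.

```python
def is_known_complete(received_triples, declared_count=None):
    """Under a partial-graph reading, say whether we can tell the graph is complete.

    Without a declared count there is no way to know, so the answer is False.
    """
    if declared_count is None:
        return False
    return len(set(received_triples)) == declared_count


received = [("<a>", "<b>", "<c>"), ("<d>", "<e>", "<f>")]

no_metadata = is_known_complete(received)                       # cannot tell
count_matches = is_known_complete(received, declared_count=2)   # complete
count_short = is_known_complete(received, declared_count=3)     # still partial
```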

Coming back to:

> I had previously thought that RDF was a data model that didn't need any 
> notion of 'document' to work. 

Yeah, it depends what you're doing with it.   There's a lot you can do
with RDF without paying any attention to what documents particular bits
of RDF were found in, but I think most of the Graphs use cases involve
situations where you do need to pay attention to these document
boundaries.    

> Thanks for your willingness to understand my points --- I'm sure that my 
> formal language will improve over time.

It's a long process.   :-)    Interestingly, it seems to be helped by
arguing.

    -- Sandro

> 
> Charles
> 
> 
> 
> On 04/02/2012 08:36 AM, Sandro Hawke wrote:
> > On Thu, 2012-03-29 at 09:25 -0700, Charles Greer wrote:
> >> I really like this solution and it seems to satisfy the use cases
> >> familiar to me from when I actually worked a lot with RDF in the wild.
> >>
> >> One thing I'm tripping over though --  The scope of a TRIG document or
> >> RDF dataset in effect 'closes the world.'  Is the idea of "merge" only
> >> within a TRIG document/dataset?
> >>
> >> I can only see two ways to really assert a graph literal -- either by
> >> sanctifying the boundaries of  a dataset, thereby making merges with
> >> external data problematic, or by signing bytes.  Am I missing something,
> >> as usual?
> > There's some misunderstanding here, yes.   Maybe you can talk through
> > some particular thing you imagine doing, involving merging and TriG, and
> > I'll be able to pick it up.   From what you've written, I'm confused.
> >
> > Maybe I can clarify by translating this TriG document:
> >
> >          <u1>   {<a>   <b>   <c>  }
> >
> > into this English declaration:
> >
> >          The URI 'u1' denotes something, and that thing has exactly one
> >          associated RDF Graph.   That associated RDF graph consists of
> >          one RDF triple, which we can write in turtle as "<a>  <b>  <c>".
> >
> > So, perhaps it's more clear, now.  If you merged that with another TriG
> > document:
> >
> >          <u1>   {<a>   <b>   <d>  }
> >
> > Then, trying to accept both documents at once, you'd be saying:
> >
> >          The URI 'u1' denotes something, and that thing has exactly one
> >          associated RDF graph.  In one document that associated graph is
> >          claimed to be the RDF triple "<a>  <b>  <c>", but in another
> >          document that graph is claimed to be the RDF triple "<a>  <b>
> >          <d>".
> >
> > So, in this case, you can try to merge the documents, but when you do,
> > you find there is a contradiction, since there is only allowed to be one
> > associated graph, but in this case there are two different ones.
> >
> >         -- Sandro
> >
> >> Charles
> >>
> >>
> >> On 03/27/2012 07:23 PM, Sandro Hawke wrote:
> >>> I've written up design 6 (originally suggested by Andy) in more
> >>> detail.  I've called it 6.1 since I've changed/added a few details that
> >>> Andy might not agree with.  Eric has started writing up how the use
> >>> cases are addressed by this proposal.
> >>>
> >>> This proposal addresses all 15 of our old open issues concerning graphs.
> >>> (I'm sure it will have its own issues, though.)
> >>>
> >>> The basic idea is to use trig syntax, and to support the different
> >>> desired relationships between labels and their graphs via class
> >>> information on the labels.  In particular, according to this proposal,
> >>> in this trig document:
> >>>
> >>>      <u1>   {<a>   <b>   <c>   }
> >>>
> >>> ... we only know that <u1> is some kind of label for the RDF Graph <a>
> >>> <b>   <c>, like today.  However, in this trig document:
> >>>
> >>>      {<u2>   a rdf:Graph }
> >>>      <u2>   {<a>   <b>   <c>   }
> >>>
> >>> we know that <u2> is an rdf:Graph and, what's more, we know that <u2>
> >>> actually is the RDF Graph {<a>   <b>   <c>   }.  That is, in this case, we
> >>> know that URL "u2" is a name we can use in RDF to refer to that g-snap.
> >>>
> >>> Details are here: http://www.w3.org/2011/rdf-wg/wiki/Graphs_Design_6.1
> >>>
> >>> That page includes answers to all the current GRAPHS issues, including
> >>> ISSUE-5, ISSUE-14, etc.
> >>>
> >>> Eric has started going through Why Graphs and adding the examples as
> >>> addressed by Proposal 6.1:
> >>> http://www.w3.org/2011/rdf-wg/wiki/Why_Graphs_6.1
> >>>
> >>>        -- Sandro (with Eric nearby)
> >>>
> >>>
> >>
> >
> 
>
Received on Tuesday, 3 April 2012 00:57:28 UTC