Re: RDFa API - graph?

Hi Nathan,

On Sun, Oct 31, 2010 at 3:32 PM, Nathan <nathan@webr3.org> wrote:
> Mark Birbeck wrote:
>>
>> So, if you are saying that a store is something that holds many graphs:
>>
>>> ...it clearly separates the concepts of "Store" (somewhere to
>>> store graphs and triples) and "Graph" (a set of triples, an RDF Graph)...
>>
>> then that's great, and we can get on with making this happen (more
>> below). If on the other hand you are simply suggesting we rename
>> stores to graphs:
>
> That's correct, specifically I'm suggesting that the DataStore Interface
> outlined in ISSUE-52:
>
>  http://www.w3.org/2010/02/rdfa/track/issues/52
>
> Be renamed to Graph or RDFGraph, and is a lightweight set of triples, an
> RDFGraph, distinct from the notion of Store.

Ah...ok. In which case I don't agree. We need both Graphs /and/ Stores.


> This would also allow us, or libraries, to extend Graph with a NamedGraph
> interface, by simply adding in the name/uri of the graph.

Well...a map of Graphs keyed by URI is all you need to
implement named graphs. For example (from the backplanejs API):

  RDFStore.prototype.getGraph = function( graphURI ) {
    // A null or empty graph URI means use the default graph.
    //
    graphURI = graphURI || "default";

    // If the graph we want doesn't exist, then create it.
    //
    if ( !this.graphs[ graphURI ] ) {
      this.graphs[ graphURI ] = new RDFGraph( );
    }
    return this.graphs[ graphURI ];
  };

It's pretty easy, and doesn't require the Graph interface to be extended at all.


> It would also open the door for the definition of a Store in the way you
> describe, one which handles multiple named graphs and which can be SPARQL'd
> over.

I think I must be misunderstanding you...aren't you saying that Store
is replaced with Graph?


> Certainly we need to support multiple graphs (like the default and processor
> graphs), and ensure the API is compatible with the notion of NamedGraphs (as
> in easily extended to provide named graph functionality, as outlined above).
>
> However, I'm entirely unsure whether we're the ones who should be defining
> what are essentially NamedGraphs, a QuadStore and SPARQL Query interfaces
> ...

No-one mentioned SPARQL at this point. :)

In fact, no-one mentioned quad-stores either!

I'm saying that a Store comprises one or more Graphs, because whilst
RDFa is defined to create triples in a default graph, it allows for
the fact that parsers may generate other triples in other graphs.

This means that if we don't define how that is done, developers might
go off in different directions.


> Personally my gut feeling is that this is out of scope, but that the API
> needs defined in such a way that it's compatible with the future definition
> of such interfaces.

I disagree, and think it's very much in scope.


> Perhaps it's enough for us to define an RDF Graph interface, allowing
> iteration and quantification, one which can be augmented with more triples,
> or filtered to produce a new (sub) graph.

Yes, that's pretty much all we need.

In the main, the Graph interface needs to have methods that act
directly on the graph (such as adding triples), whilst the Store
interface needs to have an additional parameter so that the graph to
operate on can be specified (such as which graph to add the triple
to).
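
As a rough sketch of that split (the names SketchGraph and SketchStore
are hypothetical, and this is not the backplanejs code or any agreed
interface), the Graph's add() acts on itself, whilst the Store's add()
takes one extra, optional graph parameter:

```javascript
// Hypothetical names throughout: an illustration of the Graph/Store split,
// not the backplanejs implementation or the RDFa API draft.
function SketchGraph() {
  this.triples = [];
}

// Graph methods act directly on the graph.
SketchGraph.prototype.add = function (subject, predicate, object) {
  this.triples.push({ subject: subject, predicate: predicate, object: object });
  return this;
};

function SketchStore() {
  // A prototype-less object avoids clashes with inherited keys.
  this.graphs = Object.create(null);
}

// A null or empty graph URI means the default graph; graphs are created on demand.
SketchStore.prototype.getGraph = function (graphURI) {
  graphURI = graphURI || "default";
  if (!this.graphs[graphURI]) {
    this.graphs[graphURI] = new SketchGraph();
  }
  return this.graphs[graphURI];
};

// Store methods take an additional parameter naming the graph to operate on.
SketchStore.prototype.add = function (subject, predicate, object, graphURI) {
  this.getGraph(graphURI).add(subject, predicate, object);
  return this;
};

// Usage: one triple in the default graph, one in a named graph.
// The example.org URIs are placeholders.
var store = new SketchStore();
store.add("http://example.org/#me", "http://xmlns.com/foaf/0.1/name", "Mark");
store.add("http://example.org/doc", "http://example.org/vocab#note", "parsed",
          "http://example.org/processor-graph");
```

The only difference between the two add() methods is the trailing
parameter, so code written against a single Graph carries over to a
Store almost unchanged.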


> Whilst we're on this subject I'm also unsure about the merge method, perhaps
> it should be more of a concat which treats the Graph as immutable (like
> filter does), returning a new Graph which is the combination of the two -
> that is perhaps a different conversation though.

Yes, that's probably best discussed separately.


>> In the backplanejs RDFa API a lot of work went into making things work
>> at the graph level. Triples can be stored in specific graphs within a
>> store, and then queried either by explicitly stating a graph or by
>> querying across 'all graphs' in that store.
>>
>> This requires therefore that there's a graph interface that supports a
>> simple add() method (like we have at the moment for store), but that
>> the add() method on a store is changed to support an extra parameter
>> to indicate which graph to add to. (In the backplanejs API this is set
>> to either the string "default" or a full URI depending on where you
>> want the triples stored.)
>
> Kudos for getting named graph support and a multi graph store in there!

:)


>> Queries are much the same except that we need an additional concept of
>> 'all graphs'.
>
> Unsure on that one, certainly when you have multiple (named) graphs you need
> support for querying either single or multiple graphs, but surely there's a
> notion of just querying the single graph / set of triples you have which we
> can work with? something like a SPARQL select without the 'FROM', like the
> where() introduced in Jeni T's library, or like my own prototype of RDF
> Selectors, essentially something along the lines of:
>
>  querylang.query("some query", graph);

No...it's not like that at all. The parallel to what I'm talking about
would be writing a SPARQL query without expressing any 'FROM' clauses;
in that case the query is run against the store's default 'RDF dataset',
which many stores set up to cover 'all graphs'.

It's very common now to place data from multiple sources into a
triple-store and use named graphs to track the provenance, but then
query the store as if it were one graph. (See the example that I gave
in response to Ivan's email for more on this.)
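
To make the 'all graphs' idea concrete, here is a minimal sketch. The
queryGraphs() function, the s/p/o property names and the example.org
URIs are all assumptions for illustration, not part of any existing
API. Leaving out the graph URI searches every graph, and each result
records the graph it came from so the provenance isn't lost:

```javascript
// Hypothetical data: a map from graph URI to an array of triples.
var graphs = {
  "default": [
    { s: "http://example.org/#mark", p: "http://xmlns.com/foaf/0.1/name", o: "Mark" }
  ],
  "http://example.org/source-b": [
    { s: "http://example.org/#nathan", p: "http://xmlns.com/foaf/0.1/name", o: "Nathan" }
  ]
};

// Match a triple pattern; a null or undefined term acts as a wildcard.
// If graphURI is omitted, the query runs across all graphs.
function queryGraphs(graphs, pattern, graphURI) {
  var names = graphURI ? [graphURI] : Object.keys(graphs);
  var results = [];
  names.forEach(function (name) {
    // Ignore unknown graph names and inherited properties.
    var triples = Object.prototype.hasOwnProperty.call(graphs, name) ? graphs[name] : [];
    triples.forEach(function (t) {
      var matches =
        (pattern.s == null || pattern.s === t.s) &&
        (pattern.p == null || pattern.p === t.p) &&
        (pattern.o == null || pattern.o === t.o);
      if (matches) {
        // Keep the source graph alongside the triple.
        results.push({ s: t.s, p: t.p, o: t.o, graph: name });
      }
    });
  });
  return results;
}

// Across all graphs, then restricted to a single graph.
var everyName = queryGraphs(graphs, { p: "http://xmlns.com/foaf/0.1/name" });
var defaultOnly = queryGraphs(graphs, { p: "http://xmlns.com/foaf/0.1/name" }, "default");
```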


> Best and thanks for replying, hopefully we can get the details of this
> agreed :)

Not at all...in fact I'm jumping on this. ;) I raised the distinction
between graphs and stores a while back but got some pushback; this is
a great way to revisit some of the issues.

Regards,

Mark

Received on Sunday, 31 October 2010 16:49:59 UTC