Re: Review of "SPARQL 1.1 Update"

On 07/01/2010 6:35 PM, Paul Gearon wrote:
> After being snowed in, Christmas, and then being sick, I'm finally
> getting to my backlog...
>
> On Tue, Dec 22, 2009 at 6:57 AM, Andy Seaborne <andy.seaborne@talis.com> wrote:
>> == Review of SPARQL 1.1 Update
>> Version of 18 December 2009
>>
>> = Overall
>>
>> 1/ We have
>> "SPARQL 1.1 Update"
>> "SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs"
>> "SPARQL 1.1 Protocol for RDF"
>>
>> Somewhere there needs to be an explanation of the three specs that relate
>> to this area.  I suggest we have another doc which is a guide to SPARQL 1.1
>> documents.
>
> I agree that we need something like this, but it will be pretty short, right?

Yes - I'd expect so.

Or we could include common boilerplate when XSLT is run, but the HTTP doc 
is HTML, not xmlspec.

I like the idea of a short doc that then becomes the main link 
destination for SPARQL 1.1 documentation.

>> For this publication:
>> SPARQL 1.1 Update and SPARQL-HTTP need to have some text relating them.
>
> To get the ball rolling, I came up with:
>
> "A related document is SPARQL 1.1 Uniform HTTP Protocol for Managing
> RDF Graphs. This other specification employs the HTTP protocol to
> perform update operations using standard HTTP methods, such as PUT and
> DELETE. While providing a simple and well known API, it is necessarily
> restricted in its operations due to the limited methods in the HTTP
> protocol. In contrast, SPARQL 1.1 Update permits multiple
> modifications in a single operation, and can use complex SPARQL
> queries for constructing data to be inserted, or choosing data to be
> deleted. Also, the use of an update language facilitates operations
> over other APIs, along with allowing other protocols that may have
> different properties for maintaining connections."
>
> Suggestions for improvement please?

Looks good to me.

>
>> 2/ The term "SPARQL-Update" is used frequently but that isn't the name any
>> more.
>> e.g. Abstract
>>
>> "This document describes SPARQL-Update"
>> ==>
>> "This document describes SPARQL 1.1 Update"
>
> Hmmm, I didn't realize this was the name. I thought it was
> SPARQL-Update 1.1. The name "SPARQL 1.1 Update" makes me think that
> it's an "Update" on "SPARQL 1.1", not that it's an "Update" language.
>
> Mind you, if anyone were ever confused by this then I'm sure that
> they'd get over it pretty quickly. I've updated all occurrences.

"SPARQL 1.1 Update" is the title of the document, both currently and at 
the last publication.  It's in big letters at the top of the document :-)

>> 3/ The language is ambiguous and really could do with fixing (or, at the
>> very least, noting the fact)
>
> Can you provide pointers to specific sections please?

There is a long description of the syntactic ambiguity in the main 
section of my review "2/ Ambiguity." below.

>> 4/ Fix INSERT DATA and DELETE DATA to take GRAPH and not use FROM/INTO.
>>   Fixing now will reduce confusion later.
>
> OK. I hadn't thought about the new syntax being applied to DATA, but I
> guess this is more consistent. (naysayers?)
>
> Should the WITH modifier be applicable here too? Since "INTO" is no
> longer being used for INSERT, then should it still be used with LOAD?
> (I'm guessing so).

I'm neutral as to the keywords for LOAD but there has to be some way to 
identify the destination graph.  More below.

>> = Main review
...

>> == 1.2 Document Conventions
>>
>> Need to discuss PREFIX and BASE somewhere.
>>
>> Maybe a "structure of a SPARQL Update Request" section that also uses the
>> material about multiple operations in one request, defines some terminology:
>> Graph store (already described), operation, request, modify_template and
>> other terms used.
>>
>> Maybe the final section "2 Terminology" depending on how section 6
>> (definitions) works out.
>
> Should the descriptions of structure and terminology be in the
> same section? I'd rather keep them separate, but it's hard to describe
> the terminology without it relating to the structure of a request, and
> vice versa.

Personally, I think structure can be done by discussing the overall 
syntax structure.  Terminology is defining words and phrases in a 
formal(ish) manner so they are different.

> I can put this in straight after "Document Conventions". Did you have
> anything to say on this section (it's woefully short)?

This is a working draft publication - what's there is fine.


>> == 4.1 Note
>> No.
>> Protocol/Syntax does not allow it.
>
> OK. What about ISSUE-20? I'm working on the assumption that
> non-existent and empty graphs are different things, and I note that
> you (Andy) expressed the same point of view in the most recent email
> discussing it. Are we close to resolving this issue?

I don't understand the relationship to issue 20.  The note in sec 4.1 
says "Can a SELECT and an INSERT be done in the same query?"

As currently in the docs, update has no reply content so there is no way 
to get a result back from a SELECT.

>
>> == 4.2
>> "In the case of two different update services,"
>> and also connection to SPARQL-HTTP.
>
> You mean that I should add a link to SPARQL-HTTP? Do you mean Protocol
> 1.1, or the HTTP Protocol for Managing RDF Graphs? It looks like you'd
> like a link to the latter, but I'd have thought a link to the former
> to be more appropriate (since SPARQL 1.1 Update cannot be used on the
> latter). I've included a link to Protocol 1.1.
>
>> A system should be able to provide an update service at:
>>   http://host/update
>>
>> and PUT/POST with
>>   http://host/update/localgraph
>>   http://host/update/?graph=http://host2/remoteGraph
>
> My reading of this section is that it refers to two (or more) separate
> services, such as:
>
> http://host/update-1/localgraph
> http://host/update-2/localgraph
>
> Where "localgraph" refers to the same graph accessed through these two
> separate services. This seems (to me) to be orthogonal to what you're
> discussing here.

It does not matter if the updates to the other update service are "HTTP 
Protocol" or "SPARQL Update" - the point is that two graph stores can 
have graphs of the same name, and those graphs may be unrelated (or may 
be exactly the same).
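
A minimal sketch of that point, assuming a graph store is modelled as a 
plain mapping from graph name to a set of triples.  The class, the 
variable names and the example IRI are hypothetical and only for 
illustration; nothing here comes from either document.

```python
# Hypothetical model: a graph store maps graph names (IRIs) to sets of triples.
class GraphStore:
    def __init__(self):
        self.graphs = {}  # graph IRI -> set of (s, p, o) triples

    def insert(self, graph_iri, triple):
        # Add one triple to the named graph held by *this* store only.
        self.graphs.setdefault(graph_iri, set()).add(triple)


NAME = "http://example/localgraph"  # assumed graph name, shared by both stores

store_1 = GraphStore()  # e.g. the store behind one update service
store_2 = GraphStore()  # e.g. the store behind another update service

# The same graph name is used in both stores, but the contents are independent.
store_1.insert(NAME, (":x", ":p", "123"))
store_2.insert(NAME, (":y", ":q", "456"))
```

Whether the two graphs end up the same depends only on what each service 
was sent, not on the shared name.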

Maybe "HTTP protocol" and "SPARQL Update" could share some common 
terminology and a common conceptual model.

The protocol form has the update language forms but does not talk about 
graph stores.  The language form should, I think, recognize that updates 
may also be coming in HTTP protocol style, and from elsewhere.

>> == 5.1 Graph Update
>>
>> 1/
>> WITH/FROM/INTO/GRAPH
>> 5.1.1 (INTO), 5.1.2 (FROM), and WITH in 5.1.3, 5.1.4, 5.1.5
>> We have three mechanisms for the same thing which is confusing.
>>
>> It's weird it's:
>>   DELETE DATA FROM <uri> { :x :p 123 . }
>> but equivalent in effect to:
>>   WITH <uri> DELETE { :x :p 123 . }
>> or
>>   DELETE { GRAPH <uri> { :x :p 123 . } }
>>
>> Shouldn't DELETE DATA use GRAPH?
>>   DELETE DATA { GRAPH <uri> { :x :p 123 . } }
>> and eliminate FROM and INTO.
>
> Much as I don't like the syntax, I agree that consistency is more
> important. To that end, should WITH be supported on INSERT DATA and
> DELETE DATA? It's really only convenience, and for these operations it
> seems superfluous. But to be really consistent then maybe it should be
> added.

Maybe WITH should be a scoped form

WITH <uri>
{
    LOAD ...
    INSERT DATA ...

    INSERT...DELETE...WHERE
}

rather than a modifier of an operation.  Yet another alternative would 
be to apply from that point in the request (c.f. PREFIX / BASE).
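
To make the difference between those two alternatives concrete, here is 
a rough sketch.  It assumes a request has already been parsed into a 
Python list, with operations as plain strings and WITH as a tuple; that 
representation, the function names and the `<g>` graph name are all 
hypothetical.  Each function returns (operation, target graph) pairs, 
with None standing for "no WITH in effect".

```python
def resolve_scoped(request):
    """WITH as a scoped form: ("WITH", uri, [ops]) covers only the enclosed ops."""
    out = []
    for item in request:
        if isinstance(item, tuple) and item[0] == "WITH":
            _, uri, ops = item
            out.extend((op, uri) for op in ops)
        else:
            out.append((item, None))  # outside any WITH block
    return out


def resolve_from_point(request):
    """WITH as a declaration: ("WITH", uri) applies to every later operation."""
    out, current = [], None
    for item in request:
        if isinstance(item, tuple) and item[0] == "WITH":
            current = item[1]  # stays in effect until replaced
        else:
            out.append((item, current))
    return out


scoped = resolve_scoped(
    ["LOAD", ("WITH", "<g>", ["INSERT DATA", "DELETE DATA"]), "INSERT DATA"]
)
flat = resolve_from_point(
    ["LOAD", ("WITH", "<g>"), "INSERT DATA", "DELETE DATA", "INSERT DATA"]
)
```

The only difference is the trailing operation: in the scoped form it is 
outside the block and unaffected, while in the "from that point" form 
it inherits the graph.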

>> 2/ Ambiguity.

Let's take this, and other syntax matters, to a separate thread.

>> [[
>> 5.1.1 INSERT DATA
>>
>> Insert data into a graph:
>>
>> INSERT DATA [ INTO <uri> ]*
>> { triples }
>> ]]
>>
>> That's insert into one or more graphs.  But replace with restricted
>> modify_template* anyway.
>>
>> Same for DELETE
>
> So you're pointing out that the modify_template syntax doesn't allow
> the same set of triples to be inserted into (or deleted from) multiple
> graphs without restating those triples for each graph? Are you saying
> that you're OK with it? It's unfortunate, but I don't see it as a huge
> issue.

I am suggesting that "{ triples }" needs to allow quads via GRAPH so one 
INSERT DATA can include triples for different graphs.

We have two ways of doing similar (but different) things.  Insert one 
data stream into a number of graphs, and insert the same data into a 
number of graphs.

INSERT DATA {
   :s :p 1 .
   GRAPH :g1 { :s :p 2 . }
   GRAPH :g2 { :s :p 3 . }
}

seems quite reasonable to me - it's a single operation to stream-load a 
dataset (pure quads would be nice especially for the HTTP form).

But keep [ INTO <uri> ]* which affects several graphs with the same data.
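
The two operations can be sketched as follows, assuming the store is 
just a mapping from graph name to a set of triples.  The function 
names, the DEFAULT marker and the extra `:q` triple are hypothetical 
and only there to show the contrast.

```python
from collections import defaultdict

DEFAULT = None  # hypothetical marker for the default graph


def insert_quads(store, quads):
    """One data stream; each quad names its own graph (the GRAPH form)."""
    for graph, s, p, o in quads:
        store[graph].add((s, p, o))


def insert_into(store, graphs, triples):
    """The same triples copied into every listed graph (the INTO-repeated form)."""
    for graph in graphs:
        store[graph].update(triples)


store = defaultdict(set)

# Mirrors the INSERT DATA example above: three triples, three different graphs.
insert_quads(store, [
    (DEFAULT, ":s", ":p", 1),
    (":g1", ":s", ":p", 2),
    (":g2", ":s", ":p", 3),
])

# One triple, duplicated into both named graphs.
insert_into(store, [":g1", ":g2"], [(":s", ":q", 4)])
```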


Does anyone want the "insert data into more than one graph at a time" 
operation (the "*" part of "[ INTO <uri> ]*")?  It can be done 
verbosely with the INSERT/WHERE form.
...

If we have WITH being a scoped modifier, then a consistent design 
without the "*" is

WITH <uri> { INSERT DATA ... }


>> == B SPARQL-Update Grammar
>> Out of date.
>> Better to remove it and leave a placeholder note than to publish
>> conflicting information.
>
> Removed for the moment.
>
> I don't trust myself on grammars. I'll have a go at it, but will need
> someone to triple check me.

I'm happy to do the mechanics of producing the grammar HTML including 
writing the generator input when we have an agreed design.  Yacker can 
be used to test designs if anyone is comfortable with writing EBNF.

>
> Regards,
> Paul Gearon

	Andy

Received on Friday, 8 January 2010 12:05:57 UTC