major technical: semantics are poorly specified

The semantics are poorly specified.  In general terms, the specification
can be characterized as consisting of three disjoint (though interleaved)
styles of presentation: the EBNF in Appendix A, the formal definitions
enclosed in boxes, and the discursive text outside the boxes.  The
problem is that there is almost no attempt to relate these three things
to one another.  Instead, it is left to the reader to deduce that
a particular term in the informal English refers to the same thing
as a particular syntactic element, and that these things correlate with
particular symbols in the formal semantics.

For example, consider section 7, "RDF dataset".
It seems that the second definition, RDF dataset graph pattern, is
intended to be the formal semantic definition of the GRAPH clause
presented in section 8, "Querying the dataset".  However, there is
no text connecting this definition with that syntax (nor, indeed, is there
any explicit connection between that section and rule [23]
GraphGraphPattern, aside from the keyword GRAPH in the examples).
What is needed is some definition that starts with syntax (the rules
of Appendix A) and maps the syntactic elements to the symbols you use
in the definition.  For example, there is no text connecting the
formal symbol { G, (<u1>, G1), ... } with the
FROM clause.  Presumably what happens is that the RDF merge of all
graphs specified by rule [10] DefaultGraphClause is G,
and for each rule [11] NamedGraphClause, the IRI becomes a <ui> and
the graph that it identifies is Gi.
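
To make the presumed mapping concrete, here is a minimal sketch in Python.
It is only my reading of the intent, not something taken from the
specification.  The names build_dataset and fetch are hypothetical, graphs
are modelled as sets of triples, and the RDF merge is approximated by plain
set union (which assumes that no blank nodes need to be renamed apart).

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def build_dataset(
    default_iris: Iterable[str],
    named_iris: Iterable[str],
    fetch: Callable[[str], Graph],
) -> Tuple[Graph, Dict[str, Graph]]:
    """Build { G, (<u1>, G1), ... } from the dataset clauses of a query.

    default_iris: IRIs from rule [10] DefaultGraphClause (FROM).
    named_iris:   IRIs from rule [11] NamedGraphClause (FROM NAMED).
    fetch:        hypothetical helper returning the graph an IRI identifies.
    """
    # G is the merge of all graphs named by FROM clauses.
    # Assumption: set union stands in for RDF merge (no blank nodes).
    default_graph: Graph = frozenset().union(*(fetch(iri) for iri in default_iris))
    # Each FROM NAMED clause contributes one pair (<ui>, Gi).
    named_graphs = {iri: fetch(iri) for iri in named_iris}
    return default_graph, named_graphs
```

A definition of roughly this shape, stated in the document's own formal
notation, would tie rules [10] and [11] to the symbols G, <ui> and Gi.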

As a somewhat related problem, the discursive sections are not always
carefully written.  For example, section 2.4 "Pattern solutions"
says "How a particular graph pattern matches a dataset is described
for each kind of graph pattern below".  However, the following sections
do not rigorously define the word "match".  In particular,
section 2.5 "Basic graph pattern" seems to take "match" as a primitive
concept.  Thus we read in that section "For a basic graph pattern
to match some dataset, there must be a solution where each of the
triple patterns matches the dataset with that solution".  In this sentence,
the first occurrence of "match" is defined in terms of the second
occurrence, but the second occurrence is not defined.
It seems that the most primitive concept is a match of a triple pattern.
I think the definition is "If S is a pattern solution, then S is a
match for a triple pattern TP if S(TP) is a member of the dataset."
This reduces the definition to the mathematically understood primitive
notion of membership.  After defining a match of a triple pattern,
the definition of a match of a basic graph pattern (i.e., a set of
triple patterns) is "If S is a pattern solution, then S is a match
for a basic graph pattern BGP if, for every triple pattern TP in BGP,
S is a match for TP."  And so forth: in this way all varieties of
match can be formally defined.
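
The two definitions I propose above can be sketched in Python as follows.
This is an illustration of my suggested wording, not of anything in the
specification.  The function names are hypothetical, variables are
represented as strings beginning with "?", a pattern solution S is a dict
from variables to terms, and "the dataset" is simplified to a single set of
triples.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def is_variable(term: str) -> bool:
    # Assumption: variables are written with a leading "?".
    return term.startswith("?")


def substitute(solution: Dict[str, str], pattern: Triple) -> Triple:
    """Compute S(TP): replace each bound variable by its value under S."""
    s, p, o = (solution.get(t, t) if is_variable(t) else t for t in pattern)
    return (s, p, o)


def matches_triple_pattern(
    solution: Dict[str, str], pattern: Triple, graph: Set[Triple]
) -> bool:
    """S is a match for TP if S(TP) is a member of the graph."""
    return substitute(solution, pattern) in graph


def matches_bgp(
    solution: Dict[str, str], bgp: Iterable[Triple], graph: Set[Triple]
) -> bool:
    """S is a match for BGP if S is a match for every TP in BGP."""
    return all(matches_triple_pattern(solution, tp, graph) for tp in bgp)
```

The point of the sketch is that each notion of "match" is defined in terms
of an earlier one, and the chain bottoms out in set membership.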

I will also note here that Appendix D, "Collected formal definitions",
is missing, which makes it hard for me to assure myself that the
formal definitions are complete and correct.  It certainly seems that the
formal definitions for "match" are incomplete.

Fred Zemke

Received on Thursday, 12 January 2006 21:34:53 UTC