Comments on TF-Graphs/Minimal-dataset-semantics

Ref. http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics

I understand this proposal is due to be discussed soon by the RDF working group, 
and would like to offer some comments based on my work with the W3C provenance 
WG.  (Although derived from my contact with the provenance WG work, these are my 
personal comments, and have not been discussed with or endorsed by the 
provenance WG.)

I am particularly keen that RDF Datasets can represent the kind of situation 
that is intended to be addressed by the provenance "mention" construct; cf. 
http://www.w3.org/TR/2012/WD-prov-dm-20120724/#term-mention, 
http://lists.w3.org/Archives/Public/public-prov-comments/2012Aug/0001.html.


First, my thanks to the authors of this proposal; generally, it seems to me to 
be a nicely crafted and useful proposal for underpinning semantically 
justifiable uses of RDF Datasets.

I'll respond to the discussion points raised in the proposal.  Some of my 
responses are marked "(preference)", indicating that I don't currently think the 
choice made is critical - that I see possible workarounds if the opposite choice 
is adopted.  I regard my responses concerning DD0 and DD5 as the important ones.


DD0: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD0:_Do_we_define_a_semantics_for_RDF_datasets.3F

Yes, please define semantics for datasets.  I feel that to fail to provide some 
level of framework for associating semantics with datasets would be a failure of 
the working group's charter.  Even if relatively few people actually read or 
understand the formal semantics, I feel they provide a "centre of gravity" that 
helps to promote consistent treatment of RDF constructs, particularly where 
subtle alternative uses are possible.


DD1: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD1:_Different_regime_for_default_graphs_and_named_graphs.3F

(preference)

I don't feel strongly about this, but on balance I feel that applying a single 
entailment regime across all graphs is simpler and easier to understand, hence less 
likely to lead to divergent understanding or expectations.  I'm not aware of any 
compelling case for supporting multiple entailment regimes in a dataset.


DD2: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD2:_No-Semantics

(preference)

It's not clear to me what purpose is served by a weakened entailment regime. 
Depending on how and when it might be applied, it could even be considered 
contrary to the current RDF semantics, which requires all semantic extensions 
to be consistent with base RDF semantics.


DD3: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD3:_Let_the_dataset_announce_its_assumed_entailment_regime.3F

(preference)

I am inclined to respond "no" to this, for 2 reasons:
(1) it is a new feature that might introduce further complications and 
difficulties for implementations and modellers.  As far as I can tell, not 
defining it now does not preclude defining such a feature as a future semantic 
extension when its implications are better understood.
(2) many applications will not support entailment regimes, or may have their own 
local ones, and defining a dataset to depend on them could limit its utility.  Failure 
to implement an entailment regime should not, as I understand, lead to incorrect 
results, just incomplete ones.
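
To illustrate the "incomplete, not incorrect" point in (2), here is a minimal 
Python sketch.  It is my own illustration, not part of the proposal: the ex: 
names are hypothetical, and only a single RDFS-style subclass rule is applied. 
It compares what a consumer sees with and without entailment.

```python
# Hypothetical data; the ex: names are illustrative only.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

graph = {
    ("ex:Dog", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:tom", TYPE, "ex:Animal"),
}


def rdfs_subclass_closure(triples):
    """Apply only the rule: (x type C), (C subClassOf D) => (x type D)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(closed):
            if p != TYPE:
                continue
            for (c, q, d) in list(closed):
                if q == SUBCLASS and c == o and (s, TYPE, d) not in closed:
                    closed.add((s, TYPE, d))
                    changed = True
    return closed


def instances_of(triples, cls):
    """Return every subject explicitly typed as cls in the given triples."""
    return {s for (s, p, o) in triples if p == TYPE and o == cls}


# A consumer that ignores entailment sees fewer answers, but no wrong ones.
without_entailment = instances_of(graph, "ex:Animal")
with_entailment = instances_of(rdfs_subclass_closure(graph), "ex:Animal")
```

The answers found without entailment are a subset of those found with it, 
which is the sense in which I would call them incomplete rather than incorrect.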


DD4, DD5: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD4:_Does_the_graph_extension_assign_graphs_to_resources_or_to_IRIs.3F

I'm treating these together, because I think my response to DD5 renders DD4 
somewhat moot.

I think it would be very useful if a graph name n *does* denote the IGEXT(n) 
graph, as this would provide a hook for future semantic extensions.  In the 
context of provenance, we want to be able to express contexts/situations that 
are specializations of others (e.g. when talking about a web document on a 
particular date as a particular instance of that document during a particular 
year).  While I would not (necessarily) expect the specifics of such a mechanism 
to be part of the RDF Dataset semantics, having a name for talking about the 
graphs leaves open the possibility of introducing new properties with their own 
extension semantics.  The inconsistencies that would arise if the URI is used as 
some other kind of resource seem to me to be quite benign (i.e. "don't do that").
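
As a rough sketch of the kind of hook I have in mind (in Python, with plain 
sets standing in for graphs): if graph names denote their graphs, then triples 
in the default graph can relate one graph to another.  Everything here is an 
assumption for illustration, including the ex:specializationOf property and 
the dictionary layout; none of it comes from the proposal or from PROV.

```python
# All ex: names, including ex:specializationOf, are hypothetical.
dataset = {
    "default": {
        ("ex:doc-2012-09-18", "ex:specializationOf", "ex:doc-2012"),
        ("ex:alice", "ex:knows", "ex:bob"),
    },
    "named": {
        "ex:doc-2012-09-18": {("ex:doc", "ex:status", "ex:Draft")},
        "ex:doc-2012": {("ex:doc", "ex:editor", "ex:alice")},
    },
}


def statements_about_graphs(ds):
    """Default-graph triples whose subject is also used as a graph name."""
    names = set(ds["named"])
    return {t for t in ds["default"] if t[0] in names}


about_graphs = statements_about_graphs(dataset)
```

Only when the name denotes the graph can such triples be read as statements 
about the graphs themselves; what ex:specializationOf would then mean is for 
a semantic extension to define.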

BUT: this raises a further question: is there any way to refer to the default 
graph (or some graph that entails the default graph) in an RDF Dataset?


DD6: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD6:_Open-graph_or_closed-graph_semantics

(preference)

I favour open graph semantics.  I think that is more consistent with current 
RDF, and less likely to lead to surprises.  (Based on my understanding of RDF, I 
find some of the illustrated results of closed graph semantics to be surprising.)


DD7: 
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD7:_Is_the_default_graph_universally_true.3F

(preference)

Nit:  I don't know what is meant by a graph satisfying another graph.  I assume 
that "Should the truth of a named graph require that the named graph satisfies 
the default graph?" is asking "Should the truth of a named graph require that 
any interpretation satisfying the named graph also satisfies the default graph?"

My response to this would be "no".  I think this kind of additional semantic 
constraint should be for an extension to introduce (see my response above to DD5).

I feel that requiring this constraint universally could make it harder to say 
things about hypothetical or fictitious contexts.
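
A small Python sketch of my concern follows.  It is a deliberate 
simplification: set union stands in for "any interpretation satisfying the 
named graph also satisfies the default graph", and all ex: names are 
hypothetical.

```python
# Hypothetical dataset: the named graph describes a fictitious context.
dataset = {
    "default": {("ex:earth", "ex:shape", "ex:Round")},
    "named": {
        "ex:flat-earth-story": {("ex:earth", "ex:shape", "ex:Flat")},
    },
}


def asserted_in(ds, name, default_universally_true):
    """Triples that hold in the named context under either choice for DD7."""
    graph = set(ds["named"][name])
    if default_universally_true:
        # Crude stand-in: the context must also carry the default graph.
        graph |= ds["default"]
    return graph


independent = asserted_in(dataset, "ex:flat-earth-story", False)
constrained = asserted_in(dataset, "ex:flat-earth-story", True)
```

With the constraint, the fictitious context is forced to carry the 
default-graph claim as well as its own, so it can no longer be described in 
isolation from what the default graph asserts.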

...

#g

Received on Tuesday, 18 September 2012 09:25:02 UTC