RE: bNodes as graph identifiers (ISSUE-131) from Markus Lanthaler on 2013-06-03 (public-rdf-wg@w3.org from June 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Mon, 3 Jun 2013 13:05:25 +0200
To: <public-rdf-wg@w3.org>
Message-ID: <00f901ce604a$40a0ced0$c1e26c70$@lanthaler@gmx.net>

On Sunday, June 02, 2013 8:09 AM, Pat Hayes wrote:
> >>> The main problem with doing so is that it would mean that if
> >>> someone Skolemized the graph names in a dataset, they'd be 
> >>> losing these conditions, which might significantly change the
> >>> meaning of the dataset.
> >> Right. Skolemization does lose meaning, in fact.

That's something I still don't fully understand. Isn't the whole point of
standardizing Skolem IRIs (and reserving some IRI space) to be able to
recognize them as Skolem IRIs? If so, isn't a Skolem IRI equivalent to a
bnode ID that just happens to use a different scheme, most likely "http"
instead of "_"?

If Skolem IRIs != bnode IDs, what's the point of standardizing them? Any
arbitrary IRI would yield the same results then.


> > As long as that's true, I don't see how we can be telling Steve, et
> > al, to just Skolemize.

Well, as I see it, Skolem IRIs are just a trick to work around limitations
in (legacy) systems. You mint a Skolem IRI before ingesting the data into
such a system. When the data is retrieved, the Skolem IRI is replaced with a
bnode ID again.
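
To make the round trip concrete, here is a minimal sketch of what I have in
mind. It is an illustration only, not anything the drafts prescribe: terms
are plain strings, a bnode ID is any term starting with "_:", and the
"example.com" authority as well as the function names are made up. The
".well-known/genid/" path is assumed to be the reserved, recognizable space
for Skolem IRIs.

```python
import uuid

# Assumption: Skolem IRIs live under a reserved, recognizable prefix.
# The authority "example.com" is hypothetical.
SKOLEM_PREFIX = "http://example.com/.well-known/genid/"


def skolemize(triples):
    """Replace each bnode ID ("_:x") with a freshly minted Skolem IRI.

    The same bnode ID always maps to the same Skolem IRI within one call.
    """
    minted = {}

    def convert(term):
        if term.startswith("_:"):
            if term not in minted:
                minted[term] = SKOLEM_PREFIX + uuid.uuid4().hex
            return minted[term]
        return term

    return [tuple(convert(t) for t in triple) for triple in triples]


def deskolemize(triples):
    """Turn every recognizable Skolem IRI back into a bnode ID."""

    def convert(term):
        if term.startswith(SKOLEM_PREFIX):
            return "_:" + term[len(SKOLEM_PREFIX):]
        return term

    return [tuple(convert(t) for t in triple) for triple in triples]


# Hypothetical data: one bnode used in two triples.
data = [
    ("_:g1", "http://example.org/p", "http://example.org/o"),
    ("http://example.org/s", "http://example.org/q", "_:g1"),
]
stored = skolemize(data)         # what the legacy system ingests
restored = deskolemize(stored)   # what a consumer gets back
```

The reverse step only works because the prefix is recognizable. With an
arbitrary IRI there would be nothing for deskolemize() to match on, which is
exactly why I am asking what standardizing them buys us otherwise.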


> > That's easy to know when dealing with parsing/generating g-texts,
> > but not when there might be references via API calls.

Could you explain this?


> >>> Also, I think our minimalist approach to dataset semantics is
> >>> going to be confusing to people and I think having this additional
> >>> twist, as simple as it is when one understands it, would be quite
> >>> confusing to people who haven't quite grasped it.
> >> The part that is confusing is the minimalist view of IRIs as graph
> >> labels.

Huge +1. What's the point of having a graph label if I can't use it to talk
about the graph in RDF? Quoting RDF Concepts:

"It is common to have the default graph contain triples that involve the
graph names of the other graphs in the dataset."

I would bet that 99.999% of people would interpret this as "I can use the
graph name to make statements about the graph" - which is wrong. To be
fair, RDF Concepts also includes the following *non-normative* note:

"Despite the use of the word "name" in "named graph", the graph name does
not formally denote the graph. It is merely syntactically paired with the
graph. RDF does not place any formal restrictions on what resource the graph
name may denote, nor on the relationship between that resource and the
graph."
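
A small sketch of what "merely syntactically paired" amounts to in practice.
The data model here is my own simplification (a dict from graph name to a
set of triples, terms as plain strings), and all example.org IRIs and the
helper name are hypothetical:

```python
# Hypothetical dataset: a default graph plus (name, graph) pairs.
default_graph = {
    ("http://example.org/g1",
     "http://purl.org/dc/terms/creator",
     "http://example.org/alice"),
}
named_graphs = {
    "http://example.org/g1": {
        ("http://example.org/bob",
         "http://xmlns.com/foaf/0.1/knows",
         "http://example.org/alice"),
    },
}


def graph_names_used_as_subjects(default_graph, named_graphs):
    """Return the graph names that also occur as subjects in the default graph.

    This is a pure string match on the IRI. It says nothing about whether
    the IRI denotes the graph it is paired with.
    """
    return {s for (s, _p, _o) in default_graph if s in named_graphs}


matches = graph_names_used_as_subjects(default_graph, named_graphs)
```

The lookup finds http://example.org/g1, and most people would read the
creator triple as a statement about the graph. Going by the note quoted
above, though, that IRI may just as well denote something else entirely.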


> At least we have here an opportunity to have *some* graph
> labels that actually make semantic sense. How can it be confusing to be
> told that labelling a graph with a label means that the label refers to
> the graph??  What else could it possibly mean?

I've asked that question several times, but so far I haven't gotten an
answer :-(


> > It could mean that it refers to the graph indirectly, in the same way
> > IRI graph names do.
> 
> No, they don't refer to the graph ***at all***. There is no notion of
> "indirect reference" in RDF, in the current documents. Look, Sandro,
> PLEASE get this straight. The WG has taken a decision which implies
> that an IRI used as a graph name can refer to something other than the
> graph. That means that it DOES NOT REFER TO THE GRAPH. That is an
> absolute end of story regarding graph names and graphs. There is NO WAY
> in the current semantics to get around this.
>
> > Having both direct reference and indirect reference in one system is
> > pretty complicated.
> 
> It would be if that is what we would have. But in fact, what we would
> have is one (and only one) way to have graph names referring to their
> graphs, which is to use bnode graph labels. Without this we have no way
> to refer to graphs. Using IRIs as graph labels, the WG has decided,
> does not provide any referential connection between the label and the
> graph.

I think this is a fundamental problem in the current model. Unfortunately, I
don't believe we can solve it in a timely manner, even though I haven't
heard many arguments for this decision apart from the fact that systems
(which used datasets before they were standardized) have been built under
this assumption.


> > If we only have indirect reference, then people can (nearly all the
> > time) just think of it as reference and have things work.

I'm sure people will use datasets under the assumption that the graph name
denotes the graph. IMHO, this will lead to a situation in the future where
we will probably have to decide to change the semantics in a
backwards-incompatible way.

I would prefer to do that now, knowing that all previous systems relied on
extensions that haven't been standardized. The consequence of postponing
that decision is that we will have to introduce a backwards-compatibility
break in a model that has already been standardized.



--
Markus Lanthaler
@markuslanthaler
Received on Monday, 3 June 2013 11:05:58 UTC