Abstract data model update - discussion

Well, I have started thinking about my action to update the abstract data model.
When I get round to it, there are some choices I need to make that will 
inevitably be controversial.

This message is to raise those issues for discussion now.

Issue 1
======
Can XML Literals be datatyped?
In RDF/XML is this legal:

<rdf:Description>
  <eg:prop rdf:datatype="&eg;foo" rdf:parseType="Literal">
  the value <em>ha</em>
 </eg:prop>
</rdf:Description>

I am intending to disallow this.

Issue 2
======
Does the literal label of a datatyped literal include the lang tag?
I am intending to allow this - bowing to pressure from Patrick and Pat (against 
my better? judgment), but note that then, without the xsd engine, RDF alone 
cannot conclude that American Jenny and Italian Ginevra have the same age.

<rdf:RDF xml:lang="it">
<rdf:Description  rdf:ID="Ginevra">
  <eg:name>Ginevra</eg:name>
  <eg:age rdf:datatype="&xsd;int">10</eg:age>
</rdf:Description>
</rdf:RDF>

<rdf:RDF xml:lang="en-us">
<rdf:Description  rdf:ID="Jenny">
  <eg:name>Jenny</eg:name>
  <eg:age rdf:datatype="&xsd;int">10</eg:age>
</rdf:Description>
</rdf:RDF>
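
To make the consequence concrete, here is a minimal Python sketch, assuming the literal label is modelled as a (string, lang tag, datatype URI) triple. The class name DatatypedLiteral and the helper xsd_value are hypothetical and stand in for the abstract data model and the xsd engine respectively; only xsd:int is handled.

```python
from dataclasses import dataclass

# Full URI that the &xsd;int entity reference is assumed to expand to.
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


@dataclass(frozen=True)
class DatatypedLiteral:
    """Hypothetical label of a datatyped literal: string, lang tag, datatype URI."""
    string: str
    lang: str
    datatype: str


def xsd_value(literal: DatatypedLiteral) -> int:
    """Stand-in for an xsd engine; only xsd:int is handled in this sketch."""
    if literal.datatype != XSD_INT:
        raise ValueError(f"unsupported datatype: {literal.datatype}")
    return int(literal.string)


ginevra_age = DatatypedLiteral("10", "it", XSD_INT)
jenny_age = DatatypedLiteral("10", "en-us", XSD_INT)

# Comparing labels alone: the lang tags differ, so the labels differ.
same_label = ginevra_age == jenny_age

# Comparing values needs the datatype mapping, which ignores the lang tag.
same_value = xsd_value(ginevra_age) == xsd_value(jenny_age)

print(same_label, same_value)
```

Under these assumptions the two labels compare unequal, and only the datatype mapping shows that the two ages are the same.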

Issue 3
======
How untidy is the graph?
Options range from saying nothing (so that there may be multiple occurrences of 
conceptually tidy nodes [e.g. URI labelled ones] with the same label - this 
then leaves Pat to do the necessary tidying); to an extreme syntactic version 
of WG decisions in which URI and datatyped literal nodes are tidied and 
untyped literal nodes are untidy.
I think I prefer the latter for the following reasons:
- the WG (and the community) has a general tendency to prefer syntactic 
expression of semantic truths where possible (hence the damp squib of my 
attempt to separate syntactic and semantic tidiness)

Issue 4
======
Can an untyped literal be the object of two triples?
I intend to answer "NO". (Strict untidiness of untyped literals).
Such *strictly* untidy literals do not need to be named in N-triples, which 
leaves implementors with less to do. Also, permitting untyped literals to 
occur as the object of multiple statements reintroduces the serialization 
problems that we have seen with bNodes (fixed for bNodes with rdf:nodeID - 
which doesn't immediately generalize because of the empty property element 
production).

A test case is:

<rdf:RDF xml:base="http://example/">
  <rdf:Description rdf:about="#subj">
    <eg:prop rdf:ID="reify">literal</eg:prop>
  </rdf:Description>
</rdf:RDF>

does entail

<rdf:RDF xml:base="http://example/">
  <rdf:Statement rdf:about="#reify">
    <rdf:object>literal</rdf:object>
  </rdf:Statement>
  <rdf:Description rdf:about="#subj">
    <eg:prop>literal</eg:prop>
  </rdf:Description>
</rdf:RDF>

but neither entails

<rdf:RDF xml:base="http://example/">
  <rdf:Statement rdf:about="#reify">
    <rdf:object rdf:nodeID="blank"/>
  </rdf:Statement>
  <rdf:Description rdf:about="#subj">
    <eg:prop rdf:nodeID="blank"/>
  </rdf:Description>
</rdf:RDF>


That is, the untyped literal node created for the reification is a different 
literal node from that created for the triple itself.
If the object were a typed literal or a uriref node then the usual tidiness 
rules would have resulted in the entailment above.
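
The distinction can be sketched in Python, assuming a graph in which node identity is object identity. The names Node, Graph, uri, untyped_literal and add are hypothetical, and the eg namespace URI is a placeholder since the message does not give one. URI nodes are tidied by label; every untyped literal gets a fresh node that may be the object of only one triple.

```python
class Node:
    """Graph node. Two Node objects are the same node only if they are the same object."""

    def __init__(self, kind: str, label: str):
        self.kind = kind      # "uri" or "literal"
        self.label = label


class Graph:
    def __init__(self):
        self._uri_nodes = {}  # label -> Node, so URI nodes are tidy
        self.triples = []

    def uri(self, label: str) -> Node:
        """Return the single node carrying this URI label."""
        if label not in self._uri_nodes:
            self._uri_nodes[label] = Node("uri", label)
        return self._uri_nodes[label]

    def untyped_literal(self, string: str) -> Node:
        """Always create a fresh node, even for a repeated string."""
        return Node("literal", string)

    def add(self, subject: Node, predicate: Node, obj: Node) -> None:
        # Strict untidiness: an untyped literal node is the object of one triple only.
        if obj.kind == "literal" and any(o is obj for _, _, o in self.triples):
            raise ValueError("untyped literal node already used as an object")
        self.triples.append((subject, predicate, obj))


RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EG = "http://example/ns#"  # placeholder: the eg prefix is not defined in the message

g = Graph()
subj = g.uri("http://example/#subj")
reify = g.uri("http://example/#reify")

# The triple and its reification each get their own literal node.
triple_object = g.untyped_literal("literal")
reified_object = g.untyped_literal("literal")
g.add(subj, g.uri(EG + "prop"), triple_object)
g.add(reify, g.uri(RDF + "object"), reified_object)

literals_are_distinct = triple_object is not reified_object
uris_are_shared = g.uri("http://example/#subj") is subj
```

In this sketch the two literal nodes carry the same string but are different nodes, which is why the rdf:nodeID document above is not entailed, whereas asking for the same URI twice returns the same node.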

Issue 5
======
reacting to xml:lang=""
I intend to make the lang component of a literal compulsory, defaulting to "".
(I suggest N-triples does not need to include an empty lang tag)

Issue 6
======
Are RDF XML Literals tidy or untidy?
They are untyped inline literals, so I will make them untidy.
However we haven't formally decided that.
Test case:

<rdf:RDF xml:base="http://example/">
  <rdf:Description rdf:about="#s1">
    <eg:prop1>literal</eg:prop1>
  </rdf:Description>
  <rdf:Description rdf:about="#s2">
    <eg:prop2>literal</eg:prop2>
  </rdf:Description>
</rdf:RDF>

does not entail

<rdf:RDF xml:base="http://example/">
  <rdf:Description rdf:about="#s1">
    <eg:prop1 rdf:nodeID="b" />
  </rdf:Description>
  <rdf:Description rdf:about="#s2">
    <eg:prop2 rdf:nodeID="b" />
  </rdf:Description>
</rdf:RDF>



SUMMARY
=========

Thus I am imagining that a literal in N-triples will need to show:

A lang tag (if not "")
A string
Either "xml" or a datatype URI or nothing.

It will not need to show 
- both xml and a datatype at the same time
- both a literal and a node identifier at the same time
- any sort of unknown datatype (datatypes are always URIrefs).
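
The summary can be read as a small record with one constraint. The Python sketch below assumes that reading; the names Literal, is_xml and parts_to_show are hypothetical, and it deliberately lists which components would be shown rather than proposing any concrete N-triples syntax.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal record following the summary above."""
    string: str
    lang: str = ""                  # compulsory component, defaulting to ""
    is_xml: bool = False            # True for an XML literal
    datatype: Optional[str] = None  # datatype URIref, or None if untyped

    def __post_init__(self):
        # Issue 1: an XML literal cannot also be datatyped.
        if self.is_xml and self.datatype is not None:
            raise ValueError("a literal cannot be both xml and datatyped")


def parts_to_show(literal: Literal) -> List[Tuple[str, str]]:
    """List the components a serialization would need to show."""
    parts = []
    if literal.lang != "":          # an empty lang tag is not shown
        parts.append(("lang", literal.lang))
    parts.append(("string", literal.string))
    if literal.is_xml:
        parts.append(("kind", "xml"))
    elif literal.datatype is not None:
        parts.append(("datatype", literal.datatype))
    return parts


XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

plain_parts = parts_to_show(Literal("literal"))
typed_parts = parts_to_show(Literal("10", lang="en-us", datatype=XSD_INT))
```

Here plain_parts contains only the string, while typed_parts contains the lang tag, the string and the datatype URI; constructing a Literal with both is_xml=True and a datatype raises ValueError.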

Jeremy

Received on Saturday, 21 September 2002 06:52:51 UTC