User:Azimmerm/RDF-semantics
This is an attempt to redefine the RDF semantics in such a way that interpretations do not depend on a particular vocabulary. In this proposal, all IRIs and all literals are interpreted as resources. One consequence is that the domain of an interpretation (its universe) is always infinite, because it contains at least all Unicode character strings. I also introduce the notion of LV-entailment, a weak semantics that simply makes graphs equivalent up to semantic equivalence of datatype values (related to ISSUE-90). This modification has very few consequences for existing implementations, except that it solves ISSUE-84, which is regarded as a bug in the RDF 1.0 Semantics.
Solves:
- ISSUE-84: "Bug" in D-entailment with literals in non-canonical form.
- ISSUE-90: Define a simple form of “literal value entailment”.
Does not solve:
- ISSUE-85: Update RDF Semantics to distinguish between the identity of values and the (numeric) equality of values to be in line with XSD 1.1.
Interpretation
A simple-interpretation I is a tuple (IR, IP, IEXT, IS, IL, LV) such that:
- IR is a non-empty set of resources, called the domain or universe of I.
- IP is a set, called the set of properties of I.
- IEXT is a mapping from IP into the powerset of IR x IR, i.e. the set of sets of pairs <x,y> with x and y in IR.
- IS is a mapping from U into (IR union IP), where U denotes the set of all IRIs.
- IL is a mapping from typed literals into IR.
- LV is a distinguished subset of IR, called the set of literal values, which contains all character strings and pairs <str,lang> (???).
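The tuple above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `SimpleInterpretation` and the method `is_well_formed` are hypothetical, and the sets are finite stand-ins, whereas in this proposal IR is always infinite and IS and IL are total mappings.

```python
from dataclasses import dataclass


@dataclass
class SimpleInterpretation:
    """Finite toy fragment of a simple interpretation (hypothetical helper)."""
    IR: set    # resources (finite stand-in for the infinite universe)
    IP: set    # properties
    IEXT: dict # property -> set of (x, y) pairs with x, y in IR
    IS: dict   # IRI -> element of IR or IP
    IL: dict   # typed literal (lexical form, datatype IRI) -> element of IR
    LV: set    # literal values, a subset of IR

    def is_well_formed(self) -> bool:
        """Check the structural constraints listed in the definition."""
        return (
            len(self.IR) > 0
            and self.LV <= self.IR
            and set(self.IEXT) == self.IP  # IEXT is defined on all of IP
            and all(x in self.IR and y in self.IR
                    for pairs in self.IEXT.values() for (x, y) in pairs)
            and all(r in self.IR or r in self.IP for r in self.IS.values())
            and all(r in self.IR for r in self.IL.values())
        )
```

For example, an interpretation whose IEXT mentions a resource outside IR fails the check, while one whose pairs all stay inside IR passes.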
Denotation of ground graphs
Semantic conditions for ground graphs:
- if E is an xsd:string-typed literal "aaa"^^xsd:string then I(E) = aaa
- if E is a language-tagged literal "aaa"@ttt then I(E) = <aaa, ttt>
- if E is a typed literal then I(E) = IL(E)
- if E is an IRI then I(E) = IS(E)
- if E is a ground triple (s, p, o) then I(E) = true if I(p) is in IP and <I(s),I(o)> is in IEXT(I(p)); otherwise I(E) = false.
- if E is a ground RDF graph then I(E) = false if I(E') = false for some triple E' in E, otherwise I(E) = true.
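The conditions above can be sketched as a small evaluator. This is a minimal sketch under assumptions that are not part of the proposal: terms are encoded as tagged tuples (`("iri", iri)`, `("lit", lexical_form, datatype_iri)`, `("lang", lexical_form, tag)`), IS and IL are finite dictionaries, and the function names are hypothetical.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def denote_term(term, IS, IL):
    """Return I(E) for an IRI or a literal, following the conditions above."""
    kind = term[0]
    if kind == "iri":
        return IS[term[1]]
    if kind == "lang":
        # A language-tagged literal denotes the pair <aaa, ttt>.
        return (term[1], term[2])
    if kind == "lit":
        lexical_form, datatype = term[1], term[2]
        if datatype == XSD_STRING:
            # An xsd:string-typed literal denotes the string itself.
            return lexical_form
        return IL[(lexical_form, datatype)]
    raise ValueError(f"unknown term kind: {kind!r}")


def denote_triple(triple, IS, IL, IP, IEXT):
    """True iff I(p) is in IP and <I(s), I(o)> is in IEXT(I(p))."""
    s, p, o = triple
    prop = denote_term(p, IS, IL)
    pair = (denote_term(s, IS, IL), denote_term(o, IS, IL))
    return prop in IP and pair in IEXT.get(prop, set())


def denote_graph(graph, IS, IL, IP, IEXT):
    """A ground graph is false iff some triple in it is false."""
    return all(denote_triple(t, IS, IL, IP, IEXT) for t in graph)
```

Under this reading the empty graph is true in every interpretation, since no triple in it can be false.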
Blank Nodes as Existential Variables
[No change to the RDF 2004 spec]
LV Interpretations (at risk)
Let D be a datatype map. We define an LV-interpretation(D) (or, LV-interpretation with respect to D) as a simple interpretation I which satisfies the following conditions:
LV semantic conditions.
- if <aaa,x> is in D then for any typed literal "sss"^^ddd with I(ddd) = x,
- if sss is in the lexical space of x then IL("sss"^^ddd) = L2V(x)(sss),
- otherwise IL("sss"^^ddd) is not in LV
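As an illustration of why these conditions address ISSUE-84, the sketch below applies them to xsd:integer. It makes several assumptions that are not in the proposal: the datatype IRI is taken to denote the datatype directly, the lexical space is approximated by a regular expression, and the names `il`, `in_lv` and `IllTyped` are hypothetical, with `IllTyped` standing in for a resource outside LV.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Approximation of the xsd:integer lexical space: optional sign, ASCII digits.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


class IllTyped:
    """Hypothetical stand-in for a resource that is not in LV."""

    def __init__(self, lexical_form, datatype):
        self.lexical_form = lexical_form
        self.datatype = datatype


def il(lexical_form, datatype):
    """IL for xsd:integer literals under the LV semantic conditions."""
    if datatype != XSD_INTEGER:
        raise KeyError(f"datatype not in this toy datatype map: {datatype}")
    if _INTEGER_LEXICAL.fullmatch(lexical_form):
        # In the lexical space: IL("sss"^^ddd) = L2V(x)(sss).
        return int(lexical_form)
    # Otherwise the literal denotes some resource that is not in LV.
    return IllTyped(lexical_form, datatype)


def in_lv(value):
    """Toy membership test for LV: everything except IllTyped stand-ins."""
    return not isinstance(value, IllTyped)
```

With this sketch, `il("01", XSD_INTEGER)` and `il("1", XSD_INTEGER)` are the same value, so two graphs that differ only in the lexical form of such a literal are true in exactly the same interpretations, while `il("abc", XSD_INTEGER)` falls outside LV.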
RDF Interpretations
An rdf-interpretation is a simple interpretation I which satisfies the extra conditions described in the following list and all the RDF axiomatic triples.
RDF semantic conditions.
- x is in IP if and only if <x, I(rdf:Property)> is in IEXT(I(rdf:type))
- if xxx is a well-typed XML literal string, then:
- IL("xxx"^^rdf:XMLLiteral) is the XML value of xxx;
- IL("xxx"^^rdf:XMLLiteral) is in LV;
- IEXT(I(rdf:type)) contains <IL("xxx"^^rdf:XMLLiteral), I(rdf:XMLLiteral)>
- if xxx is an ill-typed XML literal string, then
- IL("xxx"^^rdf:XMLLiteral) is not in LV;
- IEXT(I(rdf:type)) does not contain <IL("xxx"^^rdf:XMLLiteral), I(rdf:XMLLiteral)>;
- if xxx is a Unicode string then:
- IEXT(I(rdf:type)) contains <xxx,I(xsd:string)>;
- IEXT(I(rdf:type)) contains <IL("xxx"^^rdf:HTML),I(rdf:HTML)>;
- if xxx is a Unicode string and t is a canonical language tag then IEXT(I(rdf:type)) contains <<xxx,t>,I(rdf:langString)>.
RDFS Interpretations
An rdfs-interpretation is an rdf-interpretation I which satisfies the RDFS semantic conditions in RDF 2004 and all the RDFS axiomatic triples.
[No need to change the semantic conditions here.]
The following additional axiomatic triples should be added:
rdf:HTML rdf:type rdfs:Datatype .
rdf:HTML rdfs:subClassOf rdfs:Literal .
xsd:string rdf:type rdfs:Datatype .
xsd:string rdfs:subClassOf rdfs:Literal .
rdf:langString rdfs:subClassOf rdfs:Literal .
Datatyped Interpretations
If D is a datatype map, a D-interpretation is any rdfs-interpretation I which satisfies the following extra conditions for every pair < aaa, x > in D:
General semantic conditions for datatypes.
- if <aaa,x> is in D then I(aaa) = x
- if <aaa,x> is in D then ICEXT(x) is the value space of x and is a subset of LV
- if <aaa,x> is in D then for any typed literal "sss"^^ddd with I(ddd) = x,
- if sss is in the lexical space of x then IL("sss"^^ddd) = L2V(x)(sss),
- otherwise IL("sss"^^ddd) is not in LV
- if <aaa,x> is in D then I(aaa) is in ICEXT(I(rdfs:Datatype))
Note that a D-interpretation is also an LV-interpretation with respect to D.
Some consequences on entailment:
- with RDF entailment, the following triples are tautologies (entailed by every RDF graph, even the empty graph):
_:bnode1 rdf:type rdf:XMLLiteral .
_:bnode2 rdf:type xsd:string .
_:bnode3 rdf:type rdf:langString .
_:bnode4 rdf:type rdf:HTML .
- with RDFS entailment, for every IRI uuu, the triple:
uuu rdf:type rdfs:Resource .
is RDFS-entailed by all RDF graphs, even the empty graph.
A system that performs inference materialization should avoid materializing those triples, as well as the triples of the form:
rdf:_i rdf:type rdf:Property .
and similar.
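A materializing system could skip such triples with a simple filter. The sketch below is only one possible approach and covers just two of the patterns named above (`uuu rdf:type rdfs:Resource` for an IRI uuu, and `rdf:_i rdf:type rdf:Property`). It assumes triples are tuples of strings in which blank nodes start with `_:`. The function names are hypothetical.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
RDF_TYPE = RDF_NS + "type"
RDF_PROPERTY = RDF_NS + "Property"
RDFS_RESOURCE = RDFS_NS + "Resource"

# Container membership properties rdf:_1, rdf:_2, ...
_MEMBERSHIP = re.compile(re.escape(RDF_NS) + r"_[1-9][0-9]*")


def is_trivial(triple):
    """True for triples that are entailed by every graph (two patterns only)."""
    s, p, o = triple
    if p != RDF_TYPE:
        return False
    if o == RDFS_RESOURCE and not s.startswith("_:"):
        return True  # uuu rdf:type rdfs:Resource, for an IRI uuu
    if o == RDF_PROPERTY and _MEMBERSHIP.fullmatch(s):
        return True  # rdf:_i rdf:type rdf:Property
    return False


def drop_trivial(triples):
    """Yield only the triples worth materializing."""
    return (t for t in triples if not is_trivial(t))
```

The filter is deliberately conservative: anything it does not recognize is kept, so it can only reduce the size of the materialization, never lose an inferred triple that carries information.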