StringLiterals/LangTagDatatypes

From RDF Working Group Wiki

Adding a datatype for each language tag may work as follows:

For a language tag {langTag}, "xxx"@{langTag} would be interpreted as a typed literal of type rdf:lang-{langTag}. For any language tag {langTag}, there is a datatype rdf:lang-{langTag} such that:

  • the lexical space is the set of all Unicode strings;
  • the value space is the set of all pairs <string,{langTag}>;
  • the lexical-to-value mapping is L2V(rdf:lang-{langTag})(xxx)=<xxx,{langTag}>.
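
The definition above can be sketched in a few lines of Python. This is only an illustration of the mapping, not part of the proposal: the function name lang_datatype_l2v and the representation of a value as a Python tuple are assumptions made for the example.

```python
from typing import Callable, Tuple

# A value of a lang datatype, modelled as the pair <string, langTag>.
LangValue = Tuple[str, str]


def lang_datatype_l2v(lang_tag: str) -> Callable[[str], LangValue]:
    """Return the lexical-to-value mapping of rdf:lang-{lang_tag}.

    Hypothetical helper: one mapping exists per language tag, so the
    mappings are built on demand instead of being enumerated.
    """
    def l2v(lexical_form: str) -> LangValue:
        # Every Unicode string is in the lexical space, so nothing is rejected.
        return (lexical_form, lang_tag)

    return l2v


# "chat"@en and "chat"@fr share a lexical form but denote different values.
l2v_en = lang_datatype_l2v("en")
l2v_fr = lang_datatype_l2v("fr")
print(l2v_en("chat"))  # ('chat', 'en')
print(l2v_fr("chat"))  # ('chat', 'fr')
```

Building the mapping on demand is one way an implementation could cope with the unbounded family of datatypes without materialising any of them.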

There is an infinite number of lang datatypes. {langTag} SHOULD be restricted to what RFC 5646 defines, but implementations MAY accept any string as a lang tag (e.g., "foo"@mylangtag-bar42 MAY be considered a valid literal by parsers, leaving conformance to RFC 5646 to an external processor when needed).
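
The split between a lenient parser and an external conformance check might look like the sketch below. The names parse_literal and looks_well_formed are hypothetical, and the regular expression only approximates the shape of a tag (subtags of at most 8 characters separated by hyphens). It is not the full RFC 5646 grammar, so a real checker would need to be stricter.

```python
import re

# Simplified shape check, NOT the full RFC 5646 grammar: a primary subtag of
# 1-8 letters followed by any number of 1-8 character alphanumeric subtags.
_TAG_SHAPE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def parse_literal(lexical_form, lang_tag):
    """Lenient parser step (hypothetical): accept any string as a lang tag."""
    return (lexical_form, lang_tag)


def looks_well_formed(lang_tag):
    """External check (hypothetical): does the tag have a plausible shape?"""
    return _TAG_SHAPE.fullmatch(lang_tag) is not None


# The parser accepts the literal even though its tag is not RFC 5646 conformant.
literal = parse_literal("foo", "mylangtag-bar42")
print(literal)                         # ('foo', 'mylangtag-bar42')
print(looks_well_formed(literal[1]))   # False: "mylangtag" has 9 letters
print(looks_well_formed("en-US"))      # True
```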

Additionally, there exists a class, for the time being named rdf:LangTaggedString, that corresponds to the set of all pairs <string,tag>, so that it is a superclass of all the lang tag datatypes.

Consequently, the following triples hold under the appropriate entailment regime:

rdf:lang-{langTag} rdf:type rdfs:Datatype;
                   rdfs:subClassOf rdf:LangTaggedString .
rdf:LangTaggedString rdf:type rdfs:Class;
                   rdfs:subClassOf rdf:PlainLiteral .

In OWL, we have for all pairs of distinct {langTag1} and {langTag2}:

rdf:lang-{langTag1} owl:disjointWith rdf:lang-{langTag2}.
rdf:LangTaggedString owl:equivalentClass [
    rdf:type rdfs:Datatype;
    owl:onDatatype rdf:PlainLiteral;
    owl:withRestrictions( [rdf:langRange "*"] )
].
rdf:lang-{langTag} owl:equivalentClass [
    rdf:type rdfs:Datatype;
    owl:onDatatype rdf:PlainLiteral;
    owl:withRestrictions( [rdf:langRange "{langTag}"] )
].
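
The subclass and disjointness statements can be checked informally by modelling each value space as a membership predicate. This is a sketch under stated assumptions: the function names are hypothetical, values are Python tuples, and tags are compared by exact string equality, whereas RFC 5646 treats tags as case-insensitive.

```python
def in_lang_tagged_string(value):
    """Membership in rdf:LangTaggedString: any pair <string, tag>."""
    return (
        isinstance(value, tuple)
        and len(value) == 2
        and all(isinstance(part, str) for part in value)
    )


def in_lang_datatype(value, lang_tag):
    """Membership in the value space of rdf:lang-{lang_tag}.

    Assumption: tags are compared by exact string equality.
    """
    return in_lang_tagged_string(value) and value[1] == lang_tag


sample = ("colour", "en-GB")

# Subclass: a member of rdf:lang-en-GB is also a member of rdf:LangTaggedString.
print(in_lang_datatype(sample, "en-GB"), in_lang_tagged_string(sample))  # True True

# Disjointness: the same value cannot belong to two distinct lang datatypes.
print(in_lang_datatype(sample, "en-GB") and in_lang_datatype(sample, "fr"))  # False
```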

Drawbacks

This proposal has the following drawbacks:

  • there is an infinite number of datatypes (but smart implementations would not suffer from this);
  • OWL 2 does not talk about these new types, so the OWL 2 RDF-based semantics would be incomplete with respect to the RDF 1.1 semantics;
  • others?

Advantages

  • compared to rdf:PlainLiteral, the lexical form is more natural;
  • there is a vocabulary to talk about strings with a language tag;
  • one can define language-specific range restrictions (e.g., ex:englishLabel rdfs:range rdf:lang-en .) in RDF without the need for OWL 2 datatype machinery;
  • compared to RDF alone, we have everything typed, which can be seen as a simplification;
  • others?