This is a proposal for addressing the following time-permitting item from the charter:
Reconcile various forms of string literals: at the moment we have plain literals, rdf:plainLiteral, and xsd:string literals. They are very very close to one another but they are officially different. In practice this means that, eg, SPARQL queries have to have a three branch UNION to handle all of these. Worth looking at some sort of a reconciliation of these.
This is ISSUE-12.
1. Abolish plain literals without language tag from the abstract syntax
2. "foo" and corresponding forms in other concrete syntaxes are syntactic sugar for "foo"^^xsd:string. In general, both forms MAY be used and represent identical literals in the abstract syntax.
3. N-Triples has use cases that are hampered by the variability introduced by syntactic sugar. One of the two forms is therefore forbidden when serializing. The restriction applies only to serializing, so that legacy documents can still be parsed. The N-Triples editors will take care of this.
4. This proposal does not consider what ought to be done about language-tagged strings or rdf:PlainLiteral; these questions are to be addressed in a separate proposal and decision.
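Items 1 and 2 can be illustrated with a minimal Python sketch. It assumes a deliberately simplified token grammar (no escapes, no language tags, full datatype IRIs in angle brackets); the names `Literal`, `parse_literal` and `_TOKEN` are hypothetical and not taken from any RDF library.

```python
import re
from typing import NamedTuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    """Abstract-syntax literal: there is no datatype-less variant any more."""
    lexical: str
    datatype: str


# Hypothetical, simplified grammar: "lexical" optionally followed by ^^<IRI>.
# Escapes and language tags are not handled (item 4 leaves tags out of scope).
_TOKEN = re.compile(r'^"(?P<lex>[^"\\]*)"(?:\^\^<(?P<dt>[^>]+)>)?$')


def parse_literal(token: str) -> Literal:
    """Map a concrete literal token to its abstract-syntax literal."""
    match = _TOKEN.match(token)
    if match is None:
        raise ValueError(f"unsupported literal token: {token!r}")
    # Item 2: the bare form is syntactic sugar for ^^xsd:string.
    return Literal(match.group("lex"), match.group("dt") or XSD_STRING)


sugar = parse_literal('"foo"')
explicit = parse_literal('"foo"^^<http://www.w3.org/2001/XMLSchema#string>')
print(sugar == explicit)
```

Under these assumptions both concrete forms yield the same abstract literal, so nothing downstream of the parser needs to know which form the document used.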
- This implies some changes to SPARQL. How to handle this, given the different WG timeframes? It's probably too late for SPARQL 1.1, or is it?
- In queries, "foo" now matches "foo"^^xsd:string and vice versa
- The notion of “simple literal” in the SPARQL spec no longer reflects RDF Concepts
- datatype("foo") is now xsd:string without the need for an exception in the spec
- SPARQL Results XML/JSON are hampered by the variability introduced by syntactic sugar. One of the two forms should be forbidden when answering queries over RDF 1.1.
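The effect on term matching and on datatype() can be sketched in Python as follows. This is an illustration only, not an implementation of SPARQL: literals are modelled as `(lexical, datatype)` pairs, with `None` standing for the datatype-less plain literal of the current model, and all function names are hypothetical.

```python
from typing import Optional, Tuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# (lexical form, datatype IRI); None models the current datatype-less plain literal.
Term = Tuple[str, Optional[str]]


def same_term_current(a: Term, b: Term) -> bool:
    """Current RDF: "foo" and "foo"^^xsd:string are distinct terms."""
    return a == b


def normalize(term: Term) -> Tuple[str, str]:
    """Proposal: a literal without an explicit datatype is an xsd:string literal."""
    lexical, datatype = term
    return (lexical, XSD_STRING if datatype is None else datatype)


def same_term_proposed(a: Term, b: Term) -> bool:
    """Proposal: compare terms after the sugar has been removed."""
    return normalize(a) == normalize(b)


def datatype_proposed(term: Term) -> str:
    """Proposal: datatype("foo") is xsd:string with no special case."""
    return normalize(term)[1]


plain = ("foo", None)
typed = ("foo", XSD_STRING)
print(same_term_current(plain, typed), same_term_proposed(plain, typed))
```

In this sketch the two forms only differ before normalization, which is why a query pattern written with either form would match data written with the other.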
Comparison of current RDF and proposal
Blue italics indicate changes between current RDF and new proposal.
|Literals in the new proposal|
|Kind of literal||Concrete syntaxes||Abstract syntax||Value|
|Concrete syntax form||Allowed?||Turtle||N-Triples||SPARQL||SPARQL Results XML||RDFa||RDF/XML||Abstract syntax form||Allowed?|
| Strings without language tag
|"foo@"^^rdf:PlainLiteral||MUST NOT||✓||✓||✓||✓||✓||✓||"foo@"^^rdf:PlainLiteral||MUST NOT|
| Strings with language tag
|"foo"@en||✓||✓||✓||✓||✓||✓||<"foo",@en>|| <Unicode string,|
|"foo@en"^^rdf:PlainLiteral||MUST NOT||✓||✓||✓||✓||✓||✓||"foo@en"^^rdf:PlainLiteral||MUST NOT|
|Other literals||"lexical"^^datatype||✓||✓||✓||✓||✓||✓||"lexical"^^datatype||Depends on L2V mapping of datatype|
- One output syntax form should be stated as preferred. Suggestion: "Serializing an RDF graph SHOULD use the plain literal syntax "foo" in preference to the "foo"^^xsd:string form."
- This can then be suggested to other RDF-related formats, including SPARQL results formats.
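The suggested serialization preference could look roughly like this in Python. The function name `serialize_literal` and the `prefer_plain` flag are hypothetical, and only backslashes and double quotes are escaped, so this is a sketch rather than a conforming serializer for any concrete syntax.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def serialize_literal(lexical: str, datatype: str, prefer_plain: bool = True) -> str:
    """Write one literal, using the plain form for xsd:string when preferred."""
    # Simplified escaping (assumption): only backslash and double quote.
    quoted = '"' + lexical.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if datatype == XSD_STRING and prefer_plain:
        # Suggested SHOULD: emit "foo" rather than "foo"^^xsd:string.
        return quoted
    return f"{quoted}^^<{datatype}>"


print(serialize_literal("foo", XSD_STRING))
print(serialize_literal("foo", XSD_STRING, prefer_plain=False))
```

A format that forbids one of the two forms when serializing, as proposed for N-Triples, would correspond to fixing `prefer_plain` to a single value instead of exposing it as an option.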