Proposal for ISSUE-12, string literals

I took an action today to draft text for RDF Concepts that resolves ISSUE-12. I put it on the wiki here:
http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
A plain text copy is attached below.

Best,
Richard



SHORT SUMMARY

1. RDF Concepts puts more emphasis on the distinction between (syntactic) “literal equality” and (semantic, important for applications) “value equality”
2. RDF Concepts explicitly points out the specific string value equalities that already arise from RDF Semantics
3. RDF Concepts declares one of the string literal forms as canonical
4. Implementations MAY canonicalize, but don't have to
5. The canonical form is plain literals


WHY?

1. No changes to the abstract syntax required
2. No changes to any concrete syntax or parser required
3. No changes to any implementations of any of the existing entailment regimes required
4. Those who are ok with canonicalization can do that, and don't need to deal with entailment
5. Those who don't want to canonicalize have the option of supporting only string value equality at query time, without RDFS- and D-Entailment
6. “MAY canonicalize” softly discourages the use of xsd:string typed literals, without abolishing them outright or declaring them archaic
7. Standardizing on xsd:string was never an option because of language tags
8. Standardizing on rdf:PlainLiteral was never an option because it MUST NOT be used in serializations that support plain literals


CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
http://www.w3.org/TR/rdf-concepts/#section-Literal-Value


§1 Rename it to “6.5.1 The Value Corresponding to a Literal” and move it ahead of the current 6.5.1 (Literal Equality)

§2 Add to the beginning:
“The value of a plain literal without language tag is the same Unicode string as its lexical form.

The value of a plain literal with language tag is a pair consisting of 1. the same Unicode string as its lexical form, and 2. its language tag.

For typed literals, …” (continue with rest of section as is)

§3 Remove the Note at the end of the section


CHANGES TO 6.5.1 Literal Equality
http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality


§4 Rename section to “6.5.2 Literal Equality and Canonical Forms”

§5 Add to the beginning:
“Equality of literals can be evaluated based on their syntax, or based on their value.”

§6 Change “Two literals are equal …” to: “Two literals are syntactically equal …” in the current first paragraph.

§7 Add to the end:
“In application contexts, comparing the values of literals (see section 6.5.1) is usually more helpful than comparing their syntactic forms. Literals with different lexical forms and with different datatypes can have the same value. In particular:

- A plain literal with lexical form aaa and no language tag has the same value as a typed literal with lexical form aaa and datatype IRI xsd:string
- A plain literal with lexical form aaa and no language tag has the same value as a typed literal with lexical form aaa@ and datatype IRI rdf:PlainLiteral
- A plain literal with lexical form aaa and language tag xx has the same value as a typed literal with lexical form aaa@xx and datatype IRI rdf:PlainLiteral”
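
To make the three equalities above concrete, here is a minimal Python sketch. It is illustrative only and not part of the proposed spec text. The Literal class and the string_value and value_equal helpers are hypothetical names. The sketch assumes language tags are compared exactly as written, and that an rdf:PlainLiteral lexical form is split at its last "@".

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal model: plain literals have no datatype,
    typed literals have no language tag."""
    lexical: str
    language: Optional[str] = None
    datatype: Optional[str] = None


def string_value(lit: Literal) -> Union[str, Tuple[str, str], None]:
    """Return the value of a string-like literal: a string, or a
    (string, language tag) pair. Returns None for other datatypes,
    which this sketch does not cover."""
    if lit.datatype is None:
        # Plain literal: the value is the lexical form, paired with the tag if any.
        return lit.lexical if lit.language is None else (lit.lexical, lit.language)
    if lit.datatype == XSD_STRING:
        return lit.lexical
    if lit.datatype == RDF_PLAIN_LITERAL:
        # Lexical form is "text@tag"; an empty tag means no language.
        text, sep, tag = lit.lexical.rpartition("@")
        if not sep:
            return None  # ill-formed: no "@" in the lexical form
        return text if tag == "" else (text, tag)
    return None


def value_equal(a: Literal, b: Literal) -> bool:
    """Value equality, falling back to syntactic equality when a
    literal's value is outside the scope of this sketch."""
    if a == b:
        return True
    va, vb = string_value(a), string_value(b)
    return va is not None and va == vb
```

With these definitions, Literal("aaa") and Literal("aaa", datatype=XSD_STRING) are syntactically different (the dataclass comparison says unequal) but value_equal reports them as equal.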

§8 Also add to the end:
“Some literals are canonical forms. Implementations MAY replace any literal with a canonical form if the two are syntactically different but have the same value. All plain literals, with or without language tag, are canonical forms.”
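
For implementers who choose to canonicalize, the replacement step could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not proposed spec text. The Literal tuple and the canonicalize function are hypothetical names, and ill-formed rdf:PlainLiteral lexical forms (those without an "@") are simply left untouched.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


class Literal(NamedTuple):
    """Hypothetical literal model; datatype is None for plain literals."""
    lexical: str
    language: Optional[str] = None
    datatype: Optional[str] = None


def canonicalize(lit: Literal) -> Literal:
    """Replace xsd:string and rdf:PlainLiteral typed literals with the
    plain literal that has the same value; return anything else unchanged."""
    if lit.datatype == XSD_STRING:
        return Literal(lit.lexical)
    if lit.datatype == RDF_PLAIN_LITERAL:
        # Split "text@tag" at the last "@"; an empty tag means no language.
        text, sep, tag = lit.lexical.rpartition("@")
        if sep:
            return Literal(text, language=tag or None)
    # Plain literals are already canonical; other datatypes are out of scope here.
    return lit
```

Because the step is optional (MAY), a store could apply canonicalize at load time, while a store that skips it would instead need value-based comparison at query time.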


CHANGES TO 6.3 Graph Equivalence
http://www.w3.org/TR/rdf-concepts/#section-graph-equality


§9 Append this leftover sentence, which was removed from 6.5.1:
“Note: For comparing RDF Graphs, semantic notions of entailment (see [RDF-SEMANTICS]) are usually more helpful than the syntactic equivalence defined here.”


EXTENDING THIS TO NUMERIC LITERALS???

(While we're at it, we might also cover equalities between the built-in numeric XSD types, and between different lexical forms of the same built-in XSD datatype.)

Received on Wednesday, 11 May 2011 21:53:04 UTC