Re: Proposal for ISSUE-12, string literals from Steve Harris on 2011-05-13 (public-rdf-wg@w3.org from May 2011)

From: Steve Harris <steve.harris@garlik.com>
Date: Fri, 13 May 2011 16:31:18 +0100
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Alex Hall <alexhall@revelytix.com>, Pat Hayes <phayes@ihmc.us>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <D2F3E9FF-B2DC-4DC0-A074-C545B4C0DF50@garlik.com>

On 2011-05-13, at 16:00, Richard Cyganiak wrote:

> On 13 May 2011, at 15:33, Alex Hall wrote:
>> It's for this reason that I'd prefer to keep rdf:PlainLiteral out of the core RDF specs and reserve it for exchanging language-tagged literals with systems that don't support that notion.  Having to deal with the extraneous '@' for literals without language tags seems like needless complexity for what should be a simple string manipulation.
> 
> Strong +1. Earlier I tried to work out the changes to the spec that would be required to make rdf:PlainLiteral the unified representation of strings, and it's a bloody mess and I really don't want to go there. I kept my notes on the wiki anyway:
> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/SyntacticSugarProposal
> 
>> If we're going to say that everything has a datatype, I'd prefer to see "foo" get normalized to "foo"^^xsd:string.  But my reasons there are more aesthetic; it just seems wrong to single out that one particular primitive datatype and say that it should not be used.
>> 
> 
>> FWIW, my preferred approach would be to:
>> 1. Say that every literal has *either* a datatype *or* a language tag.
>> 2. Say that the datatype of the surface form "foo" is xsd:string.
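
The two points above can be sketched as a small data model. This is only an illustration in Python: the `Literal` class and the `XSD_STRING` constant are hypothetical names made up for the sketch, not taken from any RDF library or from spec text.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: datatypes are written as prefixed names for brevity.
XSD_STRING = "xsd:string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal carrying either a datatype or a language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

    def __post_init__(self):
        # Point 1: a datatype and a language tag are mutually exclusive.
        if self.datatype is not None and self.lang is not None:
            raise ValueError("a literal has a datatype or a language tag, not both")
        # Point 2: the bare surface form "foo" is given the datatype xsd:string.
        if self.datatype is None and self.lang is None:
            object.__setattr__(self, "datatype", XSD_STRING)
```

Under these assumptions `Literal("foo")` and `Literal("foo", datatype=XSD_STRING)` compare equal, while `Literal("foo", lang="en")` has no datatype at all.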
> 
> This feels weird. Ok, "foo" is of type string, even though the type is implicit, I can understand that. But why is it no longer a string if I tag it as English? Shouldn't it still have an implicit type of string? So you have replaced one weird thing (multiple ways of representing a string) with another weird thing (a notion of string datatypes that doesn't make sense).
> 
> I think the sensible way would be:
> 1) every literal has *both* a datatype and a (possibly empty) language tag;

I suspect that a lot of existing RDF systems are built on the assumption that you can only have one or the other, not both.

That might be OK if serialisations are explicitly disallowed from using both, but it adds more weirdness.

> 2) of the built-in datatypes, only xsd:string can have non-empty language tags;
> 3) plain literals and rdf:PlainLiterals don't exist;
> 4) "foo" in concrete syntaxes is syntactic sugar for "foo"^^xsd:string.
> 5) "foo"@en in concrete syntaxes is syntactic sugar for "foo"^^xsd:string@en.

It seems a little weird that "foo" and "foo"@en have different types, but it's not the end of the world I guess, and it's the case in SPARQL anyway.
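
For comparison, here is a rough Python sketch of the five-point model quoted above, where every literal has both a datatype and a possibly empty language tag and the short surface forms are treated as syntactic sugar. The `Literal` tuple, the `parse` function and the regular expression are all hypothetical, and the parser deliberately ignores escapes and full IRIs.

```python
import re
from typing import NamedTuple

# Assumption: datatypes are written as prefixed names for brevity.
XSD_STRING = "xsd:string"


class Literal(NamedTuple):
    """Hypothetical literal with both a datatype and a language tag."""
    lexical: str
    datatype: str
    lang: str  # "" stands for the empty language tag


# Matches "lex", "lex"^^dt, "lex"@lang and "lex"^^dt@lang (no escape handling).
_SURFACE = re.compile(
    r'^"(?P<lex>[^"]*)"(?:\^\^(?P<dt>[^@\s]+))?(?:@(?P<lang>[A-Za-z0-9-]+))?$'
)


def parse(surface: str) -> Literal:
    m = _SURFACE.match(surface)
    if m is None:
        raise ValueError("unsupported surface form: " + surface)
    # Points 4 and 5: a missing datatype is sugar for xsd:string.
    datatype = m.group("dt") or XSD_STRING
    lang = m.group("lang") or ""
    # Point 2: only xsd:string may carry a non-empty language tag.
    if lang and datatype != XSD_STRING:
        raise ValueError("only xsd:string may carry a non-empty language tag")
    return Literal(m.group("lex"), datatype, lang)
```

In this sketch `parse('"foo"')` equals `parse('"foo"^^xsd:string')`, and `parse('"foo"@en')` equals `parse('"foo"^^xsd:string@en')`, while the tagged and untagged forms compare unequal because their language-tag fields differ.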

- Steve


> This *might* work better than the rdf:PlainLiteral mess when translated into spec changes, but it raises backwards-compatibility issues and requires changes to the syntax specs to add the syntactic sugar. So I prefer the proposal that says implementations MAY unify to plain literals, as it doesn't require changes to the abstract syntax.
> 
>> As long as the surface forms "foo" and "foo"^^xsd:string get normalized to the same thing (or systems have permission to do such normalization) then I'm happy.
> 
> Good to hear that.
> 
> Best,
> Richard

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Friday, 13 May 2011 15:31:49 UTC