Re: ISSUE-12: xs:string VS plain literals: proposed resolution

On Wed, May 4, 2011 at 2:29 PM, Eric Prud'hommeaux <eric@w3.org> wrote:

> * Alex Hall <alexhall@revelytix.com> [2011-05-04 14:08-0400]
> > On Wed, May 4, 2011 at 1:36 PM, Lee Feigenbaum <lee@thefigtrees.net> wrote:
> >
> > > On 5/4/2011 1:17 PM, Pat Hayes wrote:
> > >
> > >>
> > >> On May 4, 2011, at 9:08 AM, Lee Feigenbaum wrote:
> > >>
> > >>> I'd like to understand if the proposed resolution of this issue is
> > >>> ("merely") a recommendation, or is a change to RDF syntactic equality.
> > >>> In particular, will we be changing
> > >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality such that
> > >>> "foo" and "foo"^^xsd:string are equal literals?
> > >>>
> > >>> Looking at this through SPARQL's eyes (as I am wont to do), one of the
> > >>> goals of this change is so that I can write:
> > >>>
> > >>> SELECT ... { ?s :p "foo" }
> > >>>
> > >>> and have that match whether the data that was loaded into the store was
> > >>> "foo" or "foo"^^xsd:string.
> > >>>
> > >>> Recommending that stores canonicalize to "foo" would be one way to
> > >>> accomplish this, but only for new data. (And even then, is only a
> > >>> recommendation.) If we changed (or made a SHOULD-style change) literal
> > >>> equality, then the above query would match against
> > >>> :s :p "foo"^^xsd:string as well as :s :p "foo", which -- for me -- is
> > >>> the goal of this issue.
> > >>>
> > >>
> > >> Well, have SPARQL decide that the appropriate entailment is
> > >> {xsd:string}-entailment (that is, D-entailment where D={xsd:string}), and
> > >> that fixes the necessary matching. Seems to me that this is not RDF
> > >> business, in fact. RDF already provides the machinery for doing this, all
> > >> SPARQL has to do is use the existing RDF specs appropriately.
> > >>
> > >
> > > Then maybe I don't understand the original motivation behind ISSUE-12 in
> > > this working group at all.
> > >
> > > *shrug*
> > >
> > >
> > From what I can tell based on looking at the charter, the original
> > motivation was exactly what you stated: to make querying for string data
> > simpler in SPARQL.
> >
> > Unfortunately, the only ways I can see of making that work transparently in
> > SPARQL are:
> > 1. Follow Pat's suggestion and define SPARQL BGP matching in terms of
> > {xsd:string}-entailment.
> > 2. Modify the abstract syntax specified in RDF Concepts so that there's only
> > one way of expressing string data in an RDF literal, which seems to be what
> > you're asking for.
>
> 3. Add a little text saying that plain literals are preferred to
> literals of type xsd:string.
>
> The RDB2RDF WG faced this in defining the Direct Mapping of relational
> databases to RDF. The ISO SQL committee provides a mapping of SQL
> types to XSD types, and naturally SQL's string types (STRING, CHAR(n),
> VARCHAR(n)) map to xsd:string. Because we didn't want to needlessly
> encumber users with a typed literal when a plain literal would do, we
> overrode the mapping for strings (ints, etc. still map per ISO). A
> little guidance text could encourage others to do the same and
> unification will get that much easier.
>
>
I understand, and I'm not arguing against doing this.  What I'm saying is,
if the goal is to make queries of string data work *transparently* from
SPARQL, then encouraging one form over another doesn't go far enough.  Users
will still have to account for the presence of legacy data or the
possibility of producers ignoring the guidance.
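
To make that concrete, here is a minimal sketch in Python. It is my own toy illustration, not anything from the RDF or SPARQL specs: a literal is modeled as a (lexical form, datatype, language tag) tuple, and the names canonicalize and subjects_matching are hypothetical. It shows that a purely syntactic match on the plain literal misses legacy "foo"^^xsd:string data unless the consumer normalizes it.

```python
# Toy model, for illustration only:
# a literal is (lexical form, datatype IRI or None, language tag or None).
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def canonicalize(lit):
    """Rewrite an xsd:string typed literal as a plain literal without language tag."""
    lexical, datatype, lang = lit
    if datatype == XSD_STRING:
        return (lexical, None, None)
    return lit


def subjects_matching(triples, predicate, obj, normalize=False):
    """Return subjects that have a (predicate, obj) triple, comparing literals syntactically."""
    result = []
    for s, p, o in triples:
        if normalize:
            o = canonicalize(o)
        if p == predicate and o == obj:
            result.append(s)
    return result


# Hypothetical data: one producer followed the guidance, one (legacy) did not.
data = [
    (":s1", ":p", ("foo", None, None)),        # "foo"
    (":s2", ":p", ("foo", XSD_STRING, None)),  # "foo"^^xsd:string
]
plain_foo = ("foo", None, None)

print(subjects_matching(data, ":p", plain_foo))                  # [':s1']
print(subjects_matching(data, ":p", plain_foo, normalize=True))  # [':s1', ':s2']
```

With guidance alone, the normalize step (or an equivalent rewrite of the query) remains the user's responsibility.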

If transparency from a query perspective is not our goal, or is not a
feasible goal, then I'm +0 on the proposed resolution as stated by Antoine.

-Alex



>
> > I'm not fundamentally opposed to either of those approaches, but they both
> > would require significant changes to deployed code.  Given a choice, I would
> > go with the second one because I don't think the problem is confined to
> > SPARQL.  I personally think that making a breaking change to the abstract
> > syntax would be worthwhile in this case because string data is so pervasive,
> > but I wouldn't be surprised if there's backlash from the community over
> > that.
> >
> > The proposed resolution for ISSUE-12 appears to me to be avoiding making any
> > breaking changes by recommending that data producers prefer one
> > syntactic form over another.  I share your skepticism over how well that
> > will work in the long run.
> >
> > -Alex
> >
> >
> >
> > > Lee
> > >
> > >
> > >
> > >> Pat
> > >>
> > >>
> > >>> (SPARQL defines matching based on subgraphs, which in turn is based on
> > >>> RDF graph equivalence.)
> > >>>
> > >>> I'm not an expert on the RDF standards documents, admittedly, so I might
> > >>> be missing something.
> > >>>
> > >>> thanks,
> > >>> Lee
> > >>>
> > >>> On 5/4/2011 6:04 AM, Antoine Zimmermann wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>>
> > >>>> With respect to ISSUE-12, I propose that we reformulate the resolution
> > >>>> as follows:
> > >>>>
> > >>>> "PROPOSED: Recommend that data publishers use plain literals instead of
> > >>>> xs:string typed literals and tell systems to silently convert xs:string
> > >>>> literals to plain literals without language tag."
> > >>>>
> > >>>> In the text of the spec, we may want to add some more details, saying:
> > >>>>
> > >>>> "In XSD-interpretations, any xs:string-typed literal "aaa"^^xs:string is
> > >>>> interpreted as the character string "aaa", that is, it is the same as
> > >>>> the plain literal "aaa". Thus, to ensure a canonical form of character
> > >>>> strings and better interoperability, we recommend that data publishers
> > >>>> always use plain literals instead of xs:string typed literals and tell
> > >>>> systems to silently convert xs:string literals to plain literals without
> > >>>> language tag whenever they occur in an RDF graph."
> > >>>>
> > >>>>
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>
> > >>>
> > >>>
> > >> ------------------------------------------------------------
> > >> IHMC                                     (850)434 8903 or (650)494 3973
> > >> 40 South Alcaniz St.           (850)202 4416   office
> > >> Pensacola                            (850)202 4440   fax
> > >> FL 32502                              (850)291 0667   mobile
> > >> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >
>
> --
> -ericP
>

Received on Wednesday, 4 May 2011 18:55:08 UTC