Re: ISSUE-147: PROPOSAL for rdfa:defaultDatatype

Sebastian,

- I understand your use case, and I do not deny that it is a valid one. The problem is that there are many use cases that are just the opposite, namely where the markup is there purely for presentation or other HTML purposes and has no meaning for the RDF part. I have run into this problem myself many, many times when doing something as simple as writing a FOAF profile. A typical example from my own FOAF profile is something like:

<a rel="foaf:workplaceHomepage" href="http://www.cwi.nl"><span property="dc:title">Centre Mathematics and Computer Sciences (<abbr title="Centrum voor Wiskunde en Informatica">CWI</abbr>)</span></a>

The current RDFa 1.1 generates

 <> foaf:workplaceHomepage <http://www.cwi.nl> .
 <http://www.cwi.nl> dc:title "Centre Mathematics and Computer Sciences (CWI)" .

whereas, with XMLLiteral as the default, it would generate

 <> foaf:workplaceHomepage <http://www.cwi.nl> .
 <http://www.cwi.nl> dc:title """Centre Mathematics and Computer Sciences (<abbr title="Centrum voor Wiskunde en Informatica">CWI</abbr>)""" .

Obviously, the second set of triples is not what one wants: the <abbr> is used here for user interface/accessibility purposes.
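
To make the difference concrete, here is a minimal Python sketch of the two readings of the <span> above. It is only an illustration using the standard library's ElementTree, not how any actual RDFa processor is implemented; the function names are made up for this example, and the XML serialization is not canonicalized the way a real XMLLiteral would be.

```python
import xml.etree.ElementTree as ET

# The element carrying @property in the example above.
SPAN = (
    '<span property="dc:title">Centre Mathematics and Computer Sciences '
    '(<abbr title="Centrum voor Wiskunde en Informatica">CWI</abbr>)</span>'
)

def plain_literal(element):
    """Concatenate all descendant text nodes, dropping the markup."""
    return "".join(element.itertext())

def xml_literal(element):
    """Serialize the element's content, keeping child markup (not canonicalized)."""
    parts = [element.text or ""]
    for child in element:
        # ET.tostring also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

span = ET.fromstring(SPAN)
print(plain_literal(span))  # the text only, as in the RDFa 1.1 output
print(xml_literal(span))    # the text with the <abbr> element kept
```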

In the run-up to RDFa 1.1 we received a lot of feedback about such structures, and the overall message was that generating XML Literals by default creates lots of problems for users, who are forced to use the @datatype="" hack.

So, after long discussions, we made a decision to change this. That boat has sailed.

- Your solution means, if my understanding is correct, that we would have some sort of parametrization of the RDFa processor's behaviour. I.e., by adding some parameters to the file, the behaviour of the processor changes insofar as it would generate an XML (or HTML) literal. This is a major change in the way the processing works and also opens the floodgates: I could imagine a number of other such cases that could depend on parameters. (A good example: should we keep white space in the text as it is in the source, or should we collapse it? The current decision is the former although, I must admit, I would have preferred the latter.)

I would be extremely uneasy about adding this to HTML5+RDFa; HTML5+RDFa just defines a language/host profile for RDFa Core and is not supposed to introduce such a radical departure from the way an RDFa processor works (and this sort of approach would fairly radically change existing implementations). It would also mean a major incompatibility between HTML5+RDFa and XHTML+RDFa.

Finally... you reject the current solution, whereby the author would have to put an explicit @datatype into the code, because the author may get it wrong. Well, isn't this the same problem? We have seen authors forget to put namespace/prefix declarations into the RDFa source all the time, hence our approach of defining a default context for RDFa 1.1; what makes you think that this extra parameter will not be forgotten just as often?

I can imagine particular RDFa implementations introducing non-standard parameters that would govern whether HTML/XML literals are generated by default or not. If there is a real need for that, implementations will do it. I could even go as far as defining that parameter's name in the HTML5+RDFa document, or even in a revised version of RDFa 1.1 Core, as long as implementing the parameter is not required (but an implementation that wants to support it would use the predefined name). That is as far as I can see us going.
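
Purely as an illustration of what such an optional, implementation-level switch might look like, here is a rough Python sketch. Everything in it is an assumption made for this example: the function name, the keep_markup option, and the simplified string comparison on @datatype (a real processor would resolve the CURIE) are hypothetical and not defined by any RDFa specification.

```python
import xml.etree.ElementTree as ET

def literal_for(element, keep_markup=False):
    """Return (lexical form, datatype) for an element carrying @property.

    keep_markup is a hypothetical, non-standard processor option. Its
    default (False) mirrors the RDFa 1.1 behaviour of producing a plain
    literal. An explicit @datatype on the element always takes precedence.
    """
    explicit = element.get("datatype")
    has_children = len(element) > 0
    wants_xml = explicit == "rdf:XMLLiteral" or (
        explicit is None and keep_markup and has_children
    )
    if wants_xml:
        # Keep child markup; this serialization is not canonicalized.
        body = (element.text or "") + "".join(
            ET.tostring(child, encoding="unicode") for child in element
        )
        return body, "rdf:XMLLiteral"
    # Plain literal: text content only. datatype="" also ends up here.
    return "".join(element.itertext()), explicit or None
```

With the option left off, existing documents are processed exactly as before; only a user who deliberately switches it on gets the markup back, and an author's explicit @datatype still overrides it.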

Sorry...

Ivan


On Jan 8, 2013, at 14:35 , Sebastian Heath <sebastian.heath@gmail.com> wrote:

> First I'd like to thank Ivan [1], Gregg [2], and Manu [3] for their
> thoughtful replies on 29/12/2012. Other commitments kept me from
> responding right away but I am hoping to provide more context before
> the teleconference on the 10th.
> 
> Preliminarily and germane to the PROPOSAL I'll make in this e-mail,
> I'd like to consider a few of the points made in those messages.
> 
> Gregg wrote that it seemed I was very concerned with the archival
> document that I am creating. This is true but not my only concern. My
> most immediate concern is the workflows I am developing for present
> processing and analysis of XHTML+RDFa 1.1 documents.
> 
> The generic scenario is as follows:
> 
> 1) Use a command line processor to extract triples from an XHTML+RDFa 1.1 document
> 2) Identify the parts of the document that are of interest to users
> on the basis of a SPARQL query
> 3) Display those parts of the document to users.
> 
> I hope it is clear that the current XHTML5 spec will by default
> discard information in this workflow. A more specific example is
> that I have texts which make reference to geographic entities. We mark
> these up in the following way:
> 
> <span rel="dcterms:references" typeof="dcterms:Location">
>  <a rel="rdfs:isDefinedBy" property="rdfs:label"
> href="http://pleiades.stoa.org/places/668331">Palmyra (<span
> xml:lang="grc">Πάλμυρα</span>)</a>
> </span>
> 
> 
> I want rapper to identify all the dcterms:Location entities in my
> documents, discover their lat/long via reference to the Pleiades URI,
> and then create a map labelled with the rdfs:label value as it appears
> in the text. Sure I can do various code-driven things to go back into
> the text and grab the original but that is a practical burden. Yes, I
> could have my editors put on a @datatype. But these are Manu's
> "beginners". They will make mistakes. It is much simpler for RDFa to
> respect the fact that these strings are XML and to retain the markup
> by default.
> 
> This overlaps with Manu's suggestion that I am asking for purity. I
> am not. I am in the real world here and do not want RDFa to coerce
> (which term I use in its CS meaning) to the simple - dare I say "pure"
> - form of a plain literal. Instead I would like it to preserve the
> messy reality of my actual data. I don't just mean to say "Hey wait,
> you're being pure. I'm the realist" but to point out that there are
> many, many real and practical uses for RDFa that are negatively
> impacted by not pursuing ISSUE-147 to a more robust solution than
> just closing it. The real world is messy and currently the RDFa 1.1 in
> HTML5 spec makes it hard to deal with that messiness. Hard for
> developers and hard for beginners. Especially in that it silently
> discards [4] intentional markup.
> 
> So....
> 
> PROPOSAL: Define an rdfa:defaultDatatype attribute that can be used
> in XHTML5 texts. This would take the form of:
> 
> 
>  rdfa:defaultDatatype="rdf:HTML" (slightly more technically, it would
> have an rdfs:range of rdfs:Resource).
> 
> When this attribute is in scope, @property processing will produce
> that datatype for elements that have children. If the value is
> "rdf:HTML" or "rdf:XMLLiteral" processing will be according to the W3C
> rules defined for those types.
> 
> I am not a spec writer but I hope the intent is clear. I believe this
> is a flexible mechanism that provides for robust preservation of the
> original intent of markup undertaken by both beginners and experts. I
> believe it accommodates the other use cases and evidence I have raised
> previously on this issue. I believe it is timely in that we are
> considering more substantial incompatibilities such as @itemref.
> 
> Thanks,
> 
> Sebastian.
> 
> 
> 
> [1] http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Dec/0083.html
> [2] http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Dec/0084.html
> [3] http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Dec/0086.html
> [4] While I have used "destroy" in recent e-mails, "discard" is
> more accurate and maybe we should keep to that.
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Tuesday, 8 January 2013 16:05:06 UTC