Re: ISSUE-147 (preserve markup by default): RDFa Processors should preserve markup by default [RDFa 1.1 in HTML5]

Sebastian, one of your concerns seems to be that archived documents retain their meaning over time. The change from RDFa 1.0 to RDFa 1.1, and the different behavior with XHTML1 vs. what you propose suggest that depending on default behavior would be incompatible with your objectives. This would go for preservation of markup as well as the use of pre-defined prefixes.

In any case, for your purposes, adding datatype="rdf:HTML" would seem like a safe choice to make sure that the intent is clear. Wanting to change default behavior seems to be related to uses other than for your preservation, and you're suggesting that this is the right thing to do for everyone else, even if the behavior is different from that of other host languages.

Of your arguments, the most convincing to me is the preservation of accessibility information, particularly when sub-elements with different LR encodings might be used together.

It is something that we could do for HTML5+RDFa, but it has significant consequences. My inclination is to leave the rules as they are, but add a note: to fully preserve literal state, and to ensure that the document will be processed as intended through possible future revisions to RDFa, authors should not rely on default behavior, such as the definition of prefixes or the default representation of literal values. After all, this is what the rdf:HTML datatype is designed for, and recommending that authors use it to preserve the intent of the markup would be a best practice.
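
To make the difference concrete, here is a minimal Python sketch of the two literal forms under discussion: the text-only value that results when child markup is discarded, and the markup-preserving value an author would be asking for with datatype="rdf:HTML". This is only an illustration, not an RDFa processor; the names LiteralExtractor and literal_forms and the sample citation are invented for the example, and serialization details (void elements, attribute quoting) are simplified.

```python
from html import escape
from html.parser import HTMLParser


class LiteralExtractor(HTMLParser):
    """Hypothetical helper: collects a text-only and a markup-preserving
    rendering of an HTML fragment (the inner content of a @property element)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_parts = []    # text only, child markup discarded
        self.markup_parts = []  # text plus re-serialized child elements

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f' {name}' if value is None else f' {name}="{escape(value)}"'
            for name, value in attrs
        )
        self.markup_parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.markup_parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.text_parts.append(data)
        # Re-escape text so the preserved form stays valid markup.
        self.markup_parts.append(escape(data, quote=False))


def literal_forms(inner_html):
    """Return (plain_text, preserved_markup) for an HTML fragment."""
    parser = LiteralExtractor()
    parser.feed(inner_html)
    parser.close()
    return "".join(parser.text_parts), "".join(parser.markup_parts)


# Invented sample: a citation whose title carries language markup.
fragment = 'The <span lang="la">Historia Augusta</span>, 2nd ed.'
plain, preserved = literal_forms(fragment)
print(plain)      # child elements dropped, including the lang attribute
print(preserved)  # child elements kept
```

The text-only form loses the lang attribute on the inner span, which is the kind of information an explicit datatype would keep.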

Perhaps you can join us on a future telecon to make your case further.

Gregg Kellogg
gregg@greggkellogg.net

On Dec 29, 2012, at 11:47 AM, Sebastian Heath <sebastian.heath@gmail.com> wrote:

> On Sat, Dec 29, 2012 at 8:25 AM, Ivan Herman <ivan@w3.org> wrote:
>> I agree with you and Gregg. The issue on XML Literal has been discussed a
>> lot. It wasn't an obvious issue, but the decision has been made.
>> 
>> Procedurally, it is correct to say that this WG has the right to define the
>> behaviour of HTML5+RDFa differently for XML Literals and/or for HTML
>> literals. However, the discussions for RDFa Core, as well as for
>> XHTML1+RDFa, obviously took into account the most important prospective
>> deployment of RDFa, i.e., HTML5, too. I also do not see any new evidence in
>> this thread that would justify essentially reopening this issue, and
>> introducing a major incompatibility between XHTML1, SVG, etc., and HTML5. I
>> am sorry, Andreas, but I am definitely not in favour of this change.
>> 
>> Ivan
>> 
>> ---
>> Ivan Herman
>> Tel:+31 641044153
>> http://www.ivan-herman.net
>> 
>> (Written on mobile, sorry for brevity and misspellings...)
> 
> 
> Ivan,
> 
> Thank you for your response. And thank you for clarifying that the WG
> can adapt the RDFa 1.1 Core document to the specific needs of an
> (X)HTML5 context, though I'm sure we all agree that that is inherent
> in the process of describing how RDFa works in a host language.
> 
> And again thank you for indicating that there is a requirement for
> new evidence. I am not sure why that is the case since this is a new
> issue opened on "RDFa 1.1 in HTML", but if that requirement does
> exist, I believe this situation meets it. Of course, that keeps us on
> procedural issues, which are less interesting than the basic point of
> making sure "RDFa 1.1 in HTML" is a well-constructed and robust spec
> that can meet the needs of many users. But since the objections to
> considering ISSUE 147 seem to be procedural, I'll address those.
> 
> New Evidence:
> 
> * The existence of RDFa Lite
>    When the decision to discard child elements when distilling RDF
> triples from RDFa+XML was made (ISSUE 19) [1], RDFa Lite did not exist
> [2]. This is relevant because it seems that a reason for discarding
> child elements is how "semantic data is consumed in the marketplace"
> [3].
> 
> I believe I'm on fairly firm ground in saying that one inspiration
> for RDFa Lite was to ease the process by which semantic data is
> firstly created and secondly consumed in an SEO oriented
> "marketplace". Although I do not use RDFa Lite, I fully recognize its
> need and utility. But to the extent that one downstream use of
> semantic data is determining default behaviors, I suggest we can now
> encapsulate those behaviors in RDFa Lite. Meaning, that RDFa Lite can
> retain the default production of Plain Literals when processing HTML5.
> In fact, I think that is a perfect use of RDFa Lite, one that was not
> possible when ISSUE 19 was decided. This should not lead to ISSUE 19
> being re-opened, but rather the current ISSUE 147 being considered on
> its merits.
> 
> 
> * Use cases in which child elements in (X)HTML5 are not a mistake
> 
> I think the fact that such uses can be considered new evidence has already been
> recognized by Manu's inclusion of point 2 in his list of items that
> the WG should re-examine [4]. So perhaps the following is also my
> contribution to that re-examination.
> 
> Here's my use case and some of its history by way of explaining why I
> raised this issue now and in the context of "RDFa 1.1 in HTML5":
> 
> As I said, I edit an online scholarly journal "ISAW Papers" [5]. The
> end result of the publication process will be the deposition of XHTML
> files in the New York University Faculty Digital Archive for permanent
> preservation. It is my hope to also use RDFa to encode the semantic
> data in these articles. I have begun to do so and an example of the
> current state of work can be seen in the publicly accessible version
> of ISAW Papers 2 as delivered by the NYU library [6]. If you look in
> the source of that page, you will see many @property values which have
> child elements, in particular, dcterms:bibliographicCitation. It is
> not a mistake that those are there, they are important, and it is
> important to me that they be preserved in workflows that process this
> data. That is the source of my request that ("non-lite") RDFa
> distillers preserve that markup.
> 
> But in terms of procedure and the requirement for new evidence, here
> is where I am. For some time now I have been developing ISAW Papers
> with XHTML+RDFa 1.0, which did preserve child elements.  The RDFa Core
> Rec came out in June. The community's attention turned more firmly to
> RDFa in HTML5, as did mine. In the fall I began converting my content
> to HTML5 and RDFa 1.1. (If you poke around ISAW Papers you'll see
> evidence of an ongoing process...) In making that conversion I saw
> that child elements were discarded.  This is a concern to me because
> I am hoping to encode my semantic data using RDFa 1.1, then use
> standardized tools that extract that semantic data, and then process
> that semantic data for both automatic-agent and human consumption. In
> all these cases, preservation of markup is extremely important.
> 
> It did take me some time to confirm that the discarding of markup was
> not caused by error on my part. Knowing now that it is a feature of
> the REC, I have now reported my concern in the context of the "RDFa
> 1.1 in HTML5", and I have made this report while the spec is a Working
> Draft. I don't think it is unusual that I have found this issue only
> after taking considerable time to work with both specs, and I hope the
> realities of my timing aren't determinative in addressing ISSUE 147.
> To put that another way, I think I have done the "right thing" in
> pursuing the conversion now and in offering my feedback to the WG.
> 
> I do recognize that the development process for RDFa 1.1 Core looked
> forward to its deployment in HTML5. Ivan, you noted this above. But I
> do think it's important that RDFa Core describes RDF in the context of
> generic XML documents and so did not bear the burden of fully
> addressing the role of RDFa in HTML5. This seems clear from the
> well-understood need for the "RDFa 1.1 in HTML5" product that we are
> working on now.
> 
> * Accessibility
> I am not an expert in this topic so I raise it with some hesitancy.
> But I would like my XHTML to be accessible as defined by the W3C's Web
> Accessibility Initiative [7]. I see there that it suggests the use of
> HTML markup to achieve its goals, see the suggested use of the dfn
> element [8]. It is likely that such elements will end up in RDFa
> marked content such as dcterms:abstract. The current "RDFa 1.1 HTML5"
> spec discards that accessibility markup. But let me be clear, I defer to
> Shane's greater expertise or any other official feedback from the WAI
> on this issue.
> 
> 
> Ivan, on the basis of the above, I wonder if you would be willing to
> reconsider your determination that insufficient new evidence has been
> introduced in order to consider ISSUE 147 within its stated "RDFa 1.1
> in HTML5" context.
> 
> Broadening this discussion slightly, in earlier messages I have made
> what I think are substantive points as to why either rdf:XMLLiteral or
> rdf:HTML should be the default production when parsing elements that
> have child elements. I won't repeat those here, other than to note
> that I highlighted issues of language, which I think are especially
> relevant as RDFa is deployed in HTML5. It looks like there was some
> consideration of language in the teleconference that resolved ISSUE
> 19, with the suggestion that this is where people might want markup
> preserved [9]. I agree and think "RDFa 1.1 in HTML5" is the right
> place to pursue that topic.
> 
> 
> To sum up with reference to previous messages:
> 
> 1) The consideration of ISSUE 147 within the context of the Working
> Draft of "RDFa 1.1 in HTML5" is timely.
> 
> 2) There is new evidence in the form of the existence of RDFa Lite,
> the introduction of a use case in which child elements are not a
> mistake, full consideration of multi-lingual issues as they appear in
> HTML5 as used in the real world, and the possibility of WAI impact.
> 
> 3) There is a substantive case for the default production of
> rdf:XMLLiteral and rdf:HTML in the context of HTML5 and its variants.
> See "2)" immediately above.
> 
> 4) Some solutions have been offered in the form of: restricting the
> default discarding of markup to RDFa Lite in HTML5, writing separate
> specs for RDFa in XHTML5 and HTML5, reinforcing that elements
> without child markup should produce a Plain Literal.
> 
> I do hope the above discussion allows us to move beyond procedural
> issues to full consideration of the merits of ISSUE 147.
> 
> My final point is that I hope it's clear that this issue is of great
> importance to me. I want to use XHTML+RDFa but this default
> behavior is a real impediment. One that I have only recently
> discovered. So I've tried to be clear in my
> language (though it's hard to be concise!). Please don't take
> that as abruptness or rudeness.
> 
> Thank you,
> 
> Sebastian.
> 
> [1] http://www.w3.org/2010/02/rdfa/track/issues/19
> [2] http://www.w3.org/standards/history/rdfa-lite
> [3] http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Dec/0077.html
> [4] http://www.w3.org/2010/02/rdfa/track/issues/147
> [5] http://isaw.nyu.edu/publications/isaw-papers
> [6] http://dlib.nyu.edu/awdl/isaw/isaw-papers/2/
> [7] http://www.w3.org/WAI/
> [8] http://www.w3.org/TR/2012/NOTE-WCAG20-TECHS-20120103/H54
> [9] http://www.w3.org/2010/02/rdfa/meetings/2010-05-13#ISSUE__2d_19__3a__Default_generation_of_XMLLiterals
> 

Received on Saturday, 29 December 2012 21:05:45 UTC