GRDDL Issue

Small intro on GRDDL

The goal of GRDDL is to make it easy to transform XML content into RDF by associating a suitable transformation with the XML data. This can be done by associating transformations either with the XML instance data or with the Namespace document. In the latter case the idea is that a GRDDL processor would find the reference to the Namespace in the XML instance data, would find references to the transformations in the Namespace document, and would use those to transform the XML instance data.
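
The two discovery routes described above can be sketched roughly as follows. This is only an illustration: it assumes the `grddl:transformation` attribute and the `http://www.w3.org/2003/g/data-view#` namespace defined by the GRDDL recommendation, while the function names, the sample document, and the `example.org` URIs are hypothetical. The Namespace document lookup is left to a caller-supplied function rather than implemented.

```python
import xml.etree.ElementTree as ET

# GRDDL namespace as given in the GRDDL recommendation.
GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def instance_transformations(xml_text):
    """Return transformation URIs attached directly to the XML instance."""
    root = ET.fromstring(xml_text)
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    return value.split()  # the attribute holds a space-separated URI list


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None."""
    tag = ET.fromstring(xml_text).tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def discover(xml_text, namespace_lookup):
    """Collect transformation URIs from the instance and its Namespace document.

    namespace_lookup is a hypothetical callable: it takes a namespace URI and
    returns the transformation URIs advertised by that Namespace document.
    """
    uris = instance_transformations(xml_text)
    ns = root_namespace(xml_text)
    if ns is not None:
        uris.extend(u for u in namespace_lookup(ns) if u not in uris)
    return uris


# Hypothetical sample data; none of these URIs come from the discussion.
SAMPLE = (
    '<doc xmlns="http://example.org/ns" '
    'xmlns:grddl="http://www.w3.org/2003/g/data-view#" '
    'grddl:transformation="http://example.org/a.xsl"/>'
)
NAMESPACE_DOCS = {"http://example.org/ns": ["http://example.org/ns2rdf.xsl"]}

found = discover(SAMPLE, lambda ns: NAMESPACE_DOCS.get(ns, []))
print(found)
```

A real processor would then retrieve each URI and apply the transformation to the instance data; that step is omitted here.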

The most widespread use of GRDDL is via XSLT, i.e., the references to the transformations are URIs of XSLT transformations, and the GRDDL processor applies those transformations to the original XML data to yield RDF (usually RDF/XML). Smarter GRDDL processors may cache the transformation(s), to avoid retrieving them every time they see the XML data, or even use another, equivalent transformation. This means that the URIs of the transformations act, in fact, as some sort of unique identifiers. The details of this caching are not described in the GRDDL recommendation.
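
Since the recommendation leaves caching unspecified, the following is just one way a processor might do it: a minimal sketch that keys cached transformations by their URI. The class name, the `fetch` callable, and the `example.org` URI are all hypothetical.

```python
class TransformationCache:
    """Cache transformation bodies, keyed by the transformation URI."""

    def __init__(self, fetch):
        # fetch is a caller-supplied function (hypothetical here) that
        # retrieves the transformation for a URI, e.g. over HTTP.
        self._fetch = fetch
        self._store = {}

    def get(self, uri):
        # Retrieve the transformation only the first time the URI is seen.
        if uri not in self._store:
            self._store[uri] = self._fetch(uri)
        return self._store[uri]


# Stand-in fetcher that records how often it is called.
calls = []


def fake_fetch(uri):
    calls.append(uri)
    return "stylesheet body for " + uri


cache = TransformationCache(fake_fetch)
first = cache.get("http://example.org/owlxml2rdf.xsl")
second = cache.get("http://example.org/owlxml2rdf.xsl")
print(len(calls))  # number of actual retrievals
```

The URI serves purely as a lookup key here, which is the "unique identifier" role mentioned above.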

It must be noted that the GRDDL recommendation does not specify that the transformation must be XSLT. It can be any other transformation or Web Service, as long as it has a URI (though it is not really clear how a GRDDL processor would find out the type of transformation). In practice, however, most (if not all) GRDDL implementations today understand XSLT only.

To make things more complicated, one way of reading the spec is that it does not require the URI to identify anything executable; in other words, the “unique identifier” aspect, as described above, could be the only necessary feature. The discussion on the GRDDL mailing list revealed that there seems to be a mismatch between what the majority of the Working Group intended at the time of writing the recommendation and the recommendation itself. The representatives of the GRDDL WG (note that the WG is not active any more) did not see any practical usage of a non-executable reference. But the fact remains that the recommendation itself is not 100% clear in that respect. Some would consider that a bug, others a feature; this is not something the OWL WG should take sides on.

The possible choices for the OWL WG

In light of what has been said, the following options are available for the OWL WG:

  1. Adopt GRDDL by developing an XSLT transformation for OWL/XML->RDF/XML, and add the suitable references to the Namespace document of OWL/XML
  2. Consider GRDDL only in the non-executable sense, i.e., add a reference to, say, the OWL/XML specification only
  3. Do nothing, i.e., do not add anything related to GRDDL to OWL/XML

A number of arguments for and against the various options came up during the discussions. These are:

  • (A) Pro 1: With this option, existing GRDDL processors would work out of the box and provide an easy bridge from OWL/XML to RDF/XML. I.e., users could write their ontologies in OWL/XML if they wish, or tools could produce them in this format, and the result would be readily available in RDF/XML. (Note that a number of RDF-oriented tools, such as OpenLink's Virtuoso, Jena, and a number of RDF browsers, already have a GRDDL processor built in.)
  • (B) Con 1: What would be the really important use case?
  • (C) Con 1: The Working Group’s manpower is getting thin, and there is no real justification for the amount of work that this would require. Such an (XSLT or other) transformation would be equivalent to the specification itself, and the Working Group should not duplicate its work by (a) writing a recommendation and (b) developing software.
  • (D) Con 1: Software would not go through the same rigorous review process by W3C members as the recommendation itself. This means that its quality could not be guaranteed in the same way as that of the recommendation.
  • (E) Con (D): It will be the task of the WG to develop an exhaustive test suite for the OWL/XML->RDF/XML transformation. If a piece of software (XSLT or otherwise) passes all those tests, its quality can be assured.
    • We're already behind on the test suite. Producing the test suite alone is a major, major effort. Producing one that is an actual test suite for software (e.g., conformance worthy) is almost certainly not possible. There are lots of areas we need to test besides this transformation. --BJP
    • There are many other aspects of an implementation besides correctness of the transform (including scalability, security, etc.)--BJP
    • Another way of putting this point is that it anticipates a huge amount of future work. Even supposing we could do that work, is that the best use of our time? Is this a precedent we want to set? (I don't) --BJP.
  • (F) Con 1: A GRDDL transformation added to the Namespace document (which is controlled by the Working Group) would be seen as the “canonical” transformation. That would discourage others from developing similar transformations. W3C should not be in the business of developing or “blessing” software that would be viewed as a competitor to software that, say, members would develop, possibly as part of their business model.
  • (G) Con (F): The “unique identifier” aspect, described above, makes it perfectly possible for a GRDDL processor to use another service or transformation, in case the implementor considers it superior in any respect. The transformation referred to in the Namespace document could be considered a “fall-back” in case the GRDDL processor does not know anything a priori about OWL/XML.
    • Then the fact that one could possibly install another browser on Windows95 meant that having the prominent default being IE had no impact on IE's market share? Note that *fall-back* means that something else is tried first. The current GRDDL model is that the XSLT is the first choice (by default). --BJP
      • The GRDDL model is perfectly o.k. with using the URI of the transformation to identify a local or otherwise alternative implementation. “Fall-back” means that if the implementation does not have access to, or knowledge of, other implementations, it can always decide to use the “real” transformation referred to by the URI.--IH
  • (H) Con 2: It is unclear what the use of that approach would be. It would not bring anything that the Namespace URI does not already provide by itself. (Another way of saying this is that if alternative #1 is rejected, then alternative #3 should be considered.)
  • (I) Pro 1: The WG charter includes a reference to the GRDDL transformation, and alternative 2 is not what the community expects from a GRDDL-aware setup (see the discussion on the different readings of the recommendation)
    • The wider web community expects differently. Explicit plugins from a vendor (rather than a namespace document) are the norm.--BJP
      • which is not in contradiction with the "unique identifier" aspect above.--IH
    • The charter has enough wiggle room that we can drop the GRDDL requirement.--BJP
      • that may not be the opinion of some other members... --IH
      • Well, either it does, or it doesn't. From a textual point of view, it clearly does, see Deliverables: "Other deliverables may include (at least):" and then the XML syntax with GRDDL. The "may include" clearly indicates the total set of possibilities, not strict inclusion (for example, we could drop the XML and add the Manchester syntax). Therefore, we could add an XML syntax without GRDDL and drop the one with. One might not like that possibility, but it's clearly there in the charter text. If the counter is "reasonable people might have thought otherwise", well, then we're back to my reasonable belief that we could do something minimal. I'll also add that popping stuff like this into the charter is really an anti-social, anti-consensus move. Thus, it is more in the spirit of the W3C to read such requirements lightly, rather than strictly. --BJP
  • (J) Con 1: Strain on the W3C infrastructure
    • That mail, however, related primarily to an issue of a different scale than the OWL/XML GRDDL transformation would ever reach, namely the HTML DTDs that browsers and, above all, HTML editors and authoring tools download all the time instead of caching them. -- IH
    • So, either there will *not* be sufficiently wide scale, in which case, why are we bothering, or there *will* be, and we need to plan for it. --BJP
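
The “fall-back” reading argued under (G) can be illustrated with a short sketch. It shows only that one reading, in which a processor prefers an implementation it already knows for a given transformation URI and otherwise runs the transformation the URI refers to; as noted in the replies, current processors may simply use the referenced XSLT by default. All names and URIs below are hypothetical.

```python
def choose_transformation(uri, local_implementations, run_referenced):
    """Pick an implementation for the transformation identified by uri.

    local_implementations maps transformation URIs to callables the
    processor already knows; run_referenced(uri, xml_text) is a
    caller-supplied stand-in for retrieving and running the
    transformation the URI points to.
    """
    local = local_implementations.get(uri)
    if local is not None:
        return local  # the URI acts only as an identifier
    # Fall back to the transformation actually referred to by the URI.
    return lambda xml_text: run_referenced(uri, xml_text)


# Hypothetical stand-ins for demonstration only.
KNOWN_URI = "http://example.org/owlxml2rdf.xsl"
UNKNOWN_URI = "http://example.org/other.xsl"


def local_impl(xml_text):
    return "local:" + xml_text


def run_referenced(uri, xml_text):
    return "referenced(" + uri + "):" + xml_text


registry = {KNOWN_URI: local_impl}
known_result = choose_transformation(KNOWN_URI, registry, run_referenced)("<x/>")
fallback_result = choose_transformation(UNKNOWN_URI, registry, run_referenced)("<x/>")
print(known_result)
print(fallback_result)
```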

References