My review of RDFa Core 1.1 (2011-12-15 version) from Niklas Lindström on 2012-01-19 (public-rdfa-wg@w3.org from January 2012)

From: Niklas Lindström <lindstream@gmail.com>
Date: Thu, 19 Jan 2012 15:54:19 +0100
To: public-rdfa-wg <public-rdfa-wg@w3.org>
Message-ID: <CADjV5jf6ms3PR1ykemPnxy_eTxaFwigObfohchJCogVEpq-SmQ@mail.gmail.com>
Hi all!

Here are the results of my review of RDFa Core 1.1.


"Abstract"
----------

* The first occurrence of "RDFa Primer" is a direct link to
<http://www.w3.org/TR/xhtml-rdfa-primer/>. Shouldn't it be given in
the same way as the other references to it, i.e. as "RDFa Primer
[RDFA-PRIMER]" with the square bracket reference to the link in the
"Normative references"? Also, the link to the document is incorrect;
it should be <http://www.w3.org/TR/rdfa-primer/>.

"How to Read this Document"
---------------------------

* How about splitting the sentence "There is a lot of material about
RDF on the web, and a growing range of tools that support RDFa, this
document only contains enough background on RDF to make the goals of
RDFa more clear." into two: "There is a lot of material about RDF on
the web and a growing range of tools that support RDFa. This document
only contains enough background on RDF to make the goals of RDFa more
clear."?

"Status of This Document"
-------------------------

* I think that the part describing the differences from RDFa 1.0
(paragraph 3 and subsequent list) should link to the appendix about
"Major differences with RDFa Syntax 1.0". Also, perhaps the list isn't
needed, since it only mentions some of these changes?

"1. Motivation"
---------------

* In "1. Motivation", I find the prominence of the initial part on
validation a bit strange. Although host languages may be validated,
the expressed RDF graphs aren't automatically validated because of it.
Instead, I would bring to the forefront the publishing aspect. Also, I
recall that Manu has expressed some very good points about why the
adding metadata inline is easier for a lot of publishers, than to
publish e.g. RDF/XML in parallel. It would be great to add a piece on
that!

"2.2 Examples"
--------------

* The first example uses the terms "author", "prev" and "next". But
these aren't mapped to property IRIs anymore, right? If they don't I
believe it's a poor example. Although I'm a bit confused by the
XHTML+RDFa 1.1 spec which still links to
<http://www.w3.org/2011/rdfa-context/xhtml-rdfa-1.1>, defining
these... In any case, it is probably confusing to have this in RDFa
Core 1.1 if it relies on terms defined for XHTML only...

* In the text "If some displayed text is different to the actual
'value' it represents" it should probably read "different from".

* In the text "In many cases a block of markup will contain a number
of properties that relate to the same item; it's possible with RDFa
to", change the semicolon to a period and make a new sentence.

* I find the example a bit awkward since it builds up an event by
first implying that the current document is the event, before
enclosing it as a bnode of type cal:Vevent..

* Is it wise to use urn:ISBN URIs for the bibo:Book examples? I'd
rather see even example.org IRIs, or ideally stable real world IRIs
from some linked data library service..

* In the next-to-last example, and only there, we show the produced
data as Turtle. Wouldn't it be more uniform to show the same example
using CURIEs or full IRIs instead?

* The text for the last example "when the element does not contain
@rel, @datatype, of @content" should read "when the element does not
contain @rel, @datatype, or @content" (s/of/or/).

* Should we follow our own advice in "RDFa Processor Conformance" and
remove any unnecessary white space in plain literals of these
examples?

"3.4 Plain literals"
--------------------

* As I already brought up, the description of literals in section "3.4
Plain literals" isn't entirely correct. It might be good to add an
example of a literal with language related to the ongoing example
here, such as:

    <http://dbpedia.org/resource/German_Empire>
        rdfs:label "German Empire"@en;
        rdfs:label "Deutsches Kaiserreich"@de .

"3.6 Turtle"
------------

* Perhaps the first two examples should include the full data being
discussed for the sake of completeness?

"3.8 Compact URI Expressions"
-----------------------------

* The last sentence: "Full details on how CURIEs are processed are in
the section titled CURIE Processing." can be removed, as it reiterates
the last part of the first paragraph. This first paragraph also links
to the same section as the last, but with the correct heading, i.e.
"CURIE and IRI Processing". I noted though that the fragment is still
named "#s_curieprocessing". Should we hunt down and fix any such
heading/fragment mismatch?

"4.1 RDFa Processor Conformance"
--------------------------------

* In the note at the end of this section, could we add something like
"or an overriding user argument" to the parenthesis exemplifying
additional mechanisms?

"5. Attributes and Syntax"
--------------------------

* In the definition of "prefix", replace xs:anyURI with xsd:anyURI.

* The definition of "context" says "a 'plain literal object'", it
should only read "a 'literal object'" (since @content can be also
combined with either @datatype or lang)

* The definition of "property" says "and some literal text". It should
be changed to something like "and either a resource object if given or
some literal text".

* How about adding a new sub-section here explaining the various roles
of the RDFa attributes? I'd suggest something like:

  syntax attributes: prefix, vocab
  subject attributes: about
  predicate attributes: property, rel, rev
  resource attributes: resource, href, src
  literal attributes: datatype, content, *lang attribute e.g. xml:lang*
  macro attributes: typeof, inlist

"6. CURIE Syntax Definition"
----------------------------

* The following is stated:
[[[
A CURIE is comprised of two components, a prefix and a reference. The
prefix is separated from the reference by a colon (:). In general use
it is possible to omit the prefix, and so create a CURIE that makes
use of the 'default prefix' mapping; in RDFa the 'default prefix'
mapping is http://www.w3.org/1999/xhtml/vocab#. It's also possible to
omit both the prefix and the colon, and so create a CURIE that
contains just a reference which makes use of the 'no prefix' mapping.
This specification does not define a 'no prefix' mapping. RDFa Host
Languages must not define a 'no prefix' mapping.
]]]

I find this confusing on three accounts:

- Is the default prefix mapping set? Shouldn't it be possible to set
it to what ever the default *vocabulary* is? So that someone can use
e.g.:

    <a vocab="http://example.org/vocab#" rel=":describedby" href="">

To mean:

    <> <http://example.org/vocab#describedby> <> .

- This definition of CURIEs state that terms are also CURIEs, does it not?

- Although I interpret "RDFa Host Languages must not define a 'no
prefix' mapping" to mean TODO

* I would like to see a note here about CURIEs effectively working
like protocol shorthands, with appropriate warnings about how they
*may* overshadow existing or future protocols (especially profiles
with many prefixes could cause this in a non-obvious way). This is
what we discussed when we resolved ISSUE-90 [1]. Note though that the
resolution of ISSUE-125 [2] might remove the need for this.


"7.2 Evaluation Context"
------------------------

* Should it be explained that "parent subject" is only used, as far as
I can see, for handling incomplete triples? (Or am I missing
something?)

* The point of "parent object" uses the wording "this property is used
to convey this value" twice. Instead of "property" we should probably
use "member", to avoid confusion with an RDF property.

* Perhaps the concept "parent object" could be renamed to "parent
resource", due to the way it is used in the processing.

* The phrase "A list mapping mapping IRIs to lists." could use some elaboration.

* The point on default vocabulary, says "a value to use as the prefix
IRI when a term is used". Perhaps we should make that more precise by
saying when an *unknown* term is used?

* "Additionally, some rules will cause new triples to be created"
might read a little funny since that's the heart of the process...
Change "Additionally, some rules" to at least "Then, the core rules"?

"7.4 CURIE and IRI Processing"
--------------------------------

* About TERMorCURIEorAbsIRI, replace "If the value is an term" with
"If the value is a term".

* Text above second example reads: "In RDFa these mappings are
expressed using the XML namespace syntax:". Should be something like
"In RDFa these mappings are defined with a @prefix attribute:"

"7.4.1 Scoping of Prefix Mappings"
------------------------------------

* We might want to add a note stating that redefining prefixes is bad
practise and that care should be taken to use the same prefix for the
same vocabulary consistently across a document. (And ideally a
recommended, well-known prefix if such exists.)

"7.4.2 General Use of CURIEs in Attributes"
-------------------------------------------

* The note states: "An empty attribute value (e.g., typeof='') is
still a CURIE, and is processed as such.". Is that really true? Isn't
it rather so that certain attributes have meaning (effect on the
processing) even when empty? The notion of an empty CURIE strikes me
as strange.

"7.4.4 Use of CURIEs in Specific Attributes"
--------------------------------------------

* The same issue as in "2.2 Examples" appears. Here, "next" is used in
the example of expansion of predefined terms. We could replace this
with an example using e.g. "license".

"7.4.5 Referencing Blank Nodes"
-------------------------------

* The parenthesis in the text "The RDFa Processor must also ensure
that no bnode created automatically (as a result of chaining)" should
possibly begin with "e.g." or be removed (since handling anonymous
typed nodes and lists also involves automatically created bnodes).

* The last sentence "As a special case, _: is also a valid reference
for one specific bnode." is the only explanation of what "_:" means. I
think it should be elaborated a little upon, making it clear how it
works and why. (Also I was under the impression that it should
generate a unique bnode each time it is used (and not represent the
same bnode across the document), but that does not seem to be the
case?)

"7.5 Sequence"
--------------

* Step 1 (and 3). Shouldn't the local list of IRI mappings actually be
set to *a copy of* the list of IRI mappings from the evaluation
context? In step 3, this local list is mutated by adding to it, so we
must ensure that someone implementing it like this doesn't affect the
list for following sibling elements. Either that or express "adding to
the local list" differently, in functional terms.

* Step 3:

- s/before processing an mappings from/before processing any mappings from/
- Replace "value" with "prefix" in "the value to be mapped must be
converted to lower case" to be more clear. Also, why does it have to
be lower-cased?

* Step 5:

- s/does not contains/does not contain/
- s/is set the IRI/is set to the IRI/

* Step 5.1 and 5.2:

- Both substeps use "a @href, @src, or @resource attribute" and gets
an IRI  "from the first value from this set of attributes". That
sounds wrong, since e.g. it's unclear that @resource should override
any @href or @src. I believe the rule for establishing of the current
object resource as described in step 6 should be used here as well.

- The wording "resource attribute" is used twice here. It might be a
good idea to define this (to be the set of @href, @src and @resource).

* Step 6. Neither "Then the current object resource is set to the IRI
obtained from the first match from the following rules:" nor "Note
that final value of the current object resource will either be null
(from initialization) or a full IRI." are entirely correct, since the
current object resource can also be set to a bnode (as per the last
bullet point). It may do to replace mentions of the IRI here with
"resource".

* Step 7. In "object" it says "full IRI of 'type'". It should probably
say "current full IRI from the set of IRIs in @typeof" or similar.
Also, I suppose one *could* use a bnode CURIE in @typeof as well (I
don't know why though), so maybe "current resource from the set of
resources given in @typeof" is better.

* Step 9. Shouldn't there be an error or warning if both @rev and
@inlist are present?

* Step 10. It would be good to explain why "Also, current object
resource should be set to a newly created bnode".

* Step 12:

- The text "This list is iterated and is and each of the predicates is
used with parent subject and [...]", is malformed. The part "is
iterated and is and each of" should be something like "is iterated
over and each of".

- In the text "has a direction value that it used to determine",
replace "it used" with "is used".

- Instead of: "If direction is neither 'forward' nor 'none' then this
is the triple generated", I suggest: "If direction is 'reverse' then
this is the triple generated"

* Step 14:

- I'm not sure I follow: "For each IRI in the local list mapping, if
the equivalent list does not exist in the evaluation context,
indicating that the list was originally defined on the current
element, use the list as follows". It should probably be reworded, and
the "originally defined on the current element" part explained a bit.

- Instead of "element" in "for each element", I'd say "item" or
"object" to avoid confusion with "XML element".

- It says: "For each pair of bnode and IRI", but the local list
mapping can contain literals as well. So the IRI should really be
"object" or similar.

To sum up on "7.5 Sequence", I get the feeling that the sequence could
be refactored to use less branching, but as long as it's working and
is verified by a couple of independent implementations it is probably
good enough. I did have a go at an implementation a while ago (in
Clojure) but alas haven't had time to finish it. My impression when
implementing was that there *might* be a different set of things in
the evaluation context which could reduce the number of
if-else-branches. But it is unfair of me to criticise this further
without doing the full implementation. Most importantly, to my mind
the processing steps do fully cover the possible variations and corner
cases, which of course is paramount and thus excellent!

"7.6 Processor Status"
----------------------

* While this section is good, and the concept of a processor graph in
general can be useful, I do wonder if it's really tangential to
defining the core of RDFa processing? Have we considered placing it in
a separate document? If I understand correctly, it came to life as a
part of the now removed custom profile processing mechanics? Mostly
thinking out loud here though.

"7.6.2 Processor Graph Terms"
-----------------------------

* The final example uses the prefix "http:". Ouch. As mentioned above,
we have to be very vigilant about this due to how CURIEs work now.

* Perhaps it's better to remove the final example, since it's a bit speculative?

"7.7 Vocabulary Expansion"
---------------------------

* Only "limited RDFS entailment" is mentioned. That should probably
read "limited RDFS and OWL entailment".

"8. RDFa Processing in detail"
------------------------------

Unfortunately I didn't have time to review all the details here.
Especially I intended to verify the examples with an RDFa 1.1
implementation. I really recommend that we add an action for doing
that.

"8.2 Completing incomplete triples'"
------------------------------------

* Remove the spurious trailing single quote from the name of this section.

"8.3.1 Object resolution for the @property attribute"
------------------------------------------------------

*  This section begins with "An object literal will be generated when
@property is present.", which is no longer entirely true. We need
wording to explain that this only happens unless a resource attribute
is present.

* The image with caption "Literal object resolution" is too big,
causing scrollbars to appear.

"8.4 List generation"
---------------------

* s/Admittingly/Admittedly/

* In "when all list elements are resources and not literal" change
"literal" to "literals".

* I think the last sentence: "Note that it is also possible to
generate an empty list easily, without @inlist, using:" should be:
"Note that it is also possible to express an empty list, without
@inlist, using:". (Although rdf:nil really denotes *the* empty list..)

"9. Initial Contexts"
-----------------------

* The text "and/or default vocabulary declarations" should be "and/or
a default vocabulary declaration".

"10. RDFa Vocabulary Expansion"
--------------------------------

* The headings for 10 and 10.1 are in bold italic, since they are
wrapped in <dfn> elements. Is this correct? (I find no other headings
simultaneously used as definitions.)

* Section "10. RDFa Vocabulary Expansion" should include the relevant
triple from the cc vocabulary used for expansion?

"10.2.1 Notes to RDFa Vocabulary Implementations and Publishing"
----------------------------------------------------------------

* s/dereferencable/dereferenceable/

"B.4 Changes"
-------------

Could this perhaps be merged with its only subsection "B.4.1 Major
differences with RDFa Syntax 1.0"?

More importantly, I believe this should be a top level section (i.e.
"C. Changes") and not part of the parent section "B. The RDFa
Vocabulary".


Other
-----

* Eleven occurrences of "a IRI" to be replaced with "an IRI"


This should complete my ACTION-109.

Best regards,
Niklas
--
<http://neverspace.net/>

[1]: http://www.w3.org/2010/02/rdfa/track/issues/90
[2]: http://www.w3.org/2010/02/rdfa/track/issues/125
Received on Thursday, 19 January 2012 14:55:30 UTC