Re: Jim Melton: XML Query WG review of RIF Datatypes and Built-Ins 1.0

> Many thanks for your detailed response to the XML Query WG's comments 
> (and for the heads-up phone call this afternoon!).  My WG will not 
> meet again until Tuesday, 6 October; because you have expressed a 
> strong desire to publish tomorrow, I'm taking some liberties here for 
> which I hope my WG will forgive me.  That is, I'm responding 
> unilaterally, hoping that my response accurately reflects what the WG 
> participants would endorse.  If not, then we may have to send you a 
> retraction and/or supplementary comments that you would have to 
> consider after this publication cycle.

Thanks, that makes sense.  I'm responding directly as well, without
yet consulting the rest of the RIF WG.  This message is addressed to
them, too.

In general, I see no problem with adding the notes you suggest.  Two
other points inline below.

> My high-level response is that we appreciate the serious 
> consideration that you gave our comments and that we, except as 
> indicated below, are satisfied with your responses.  I do not believe 
> that we have any reason to object to your progression to CR since you 
> have already agreed (below) to further consider a few of our comments 
> during the CR period.
> 
> My responses below are all preceded by the string "Jim:" for easy 
> identification.
> 
> At 9/29/2009 10:24 PM, Sandro Hawke wrote:
> 
> >Dear Jim,
> >
> >Thank you and the XML Query WG for your detailed review [1] of RIF
> >Datatypes and Builtins.  We appreciate the time you put in, finding
> >weaknesses and errors in our draft.
> >
> >Our responses to your comments are inline below.  It would be most
> >helpful if you could let us know very soon whether you find these
> >responses satisfactory.  (If you are satisfied, we can go ahead and
> >publish as Candidate Recommendation immediately.  Best case, we could
> >publish on Thursday October 1.)
> >
> >Our wiki has the latest version (including the changes made in
> >response to your comments):
> >
> >   http://www.w3.org/2005/rules/wiki/DTB
> >
> >and a diff of those changes:
> >
> >   http://www.w3.org/2005/rules/wiki/index.php?title=DTB&diff=11377&oldid=11060
> >
> > > The XML Query WG has completed its review of RIF Datatypes and
> > > Built-Ins 1.0 and has developed some comments.  Please note that
> > > Sharon and I initially agreed to submit the XML Query WG's comments and
> > > the XSL WG's comments jointly, but my WG objected on the grounds that
> > > they had not yet seen the XSL WG's comments and did not want to wait for
> > > them.  Consequently, Sharon will submit the XSL WG's comments
> > > separately whenever they are ready.
> > >
> > >
> > > <comments>
> > >
> > >
> > > 1) Thanks for giving us the opportunity to review this
> > > document.  We were very pleased to see that you have based much of
> > > this spec on the Functions and Operators specification that we developed,
> > > as well as on the XML Schema Part 2 Datatypes spec on which we also
> > > depend.  There are other W3C WGs whose documents made use of the
> > > F&O functions, but redefined the functions instead of incorporating
> > > them by reference.  Your approach is manifestly appropriate.
> > > Thanks!
> >
> >We were glad to be able to reuse so much of your work.  (I expect our
> >users and implementors will be glad, too.  Several implementors have
> >said they plan to re-use xpath libraries.)
> 
> Jim: That's nice to know -- thanks.
> 
> 
> > > 2) We are slightly concerned by the fact that you state in the Overview
> > > that "A large part of the definitions of the listed functions and
> > > operators are adopted from [XPath-Functions]," but that you define a
> > > different namespace (http://www.w3.org/2007/rif-builtin-function#)
> > > for the functions instead of using the defined namespace
> > > (http://www.w3.org/2005/xpath-functions) of those functions that
> > > have been adopted.
> > >
> > > We note that Section 4 uses the word
> > > "adapted" instead of "adopted", which has
> > > significantly different connotations.  We have concluded from
> > > additional text in the document that "adapted" is the word that
> > > you intended to use and recommend that you resolve the discrepancy by
> > > correcting the Overview.
> >
> >We believe "adapted" is the more accurate term here, and have changed
> >the document to reflect this.  At the end of section 4 (just before
> >4.1) we added this clarifying sentence:
> >
> >"The differences from the original [XPath-Functions] include the
> >handling of errors, the differentiation between predicates and
> >functions, and a few specific differences noted in the definitions
> >below."
> 
> Jim: That sentence is most welcome, as I often find it difficult to 
> discover the ways one spec has "adapted" material in another spec.  I 
> now have some guidance on what those differences are and how to look
> for them.
> 
> 
> >The different namespace URIs are intended to allow for these changes,
> >and are also to save users from having to remember which RIF functions
> >are xpath functions and which are xpath operators (for which RIF would
> >have to provide a namespace, since xpath does not) or new RIF
> >functions.  (RIF does not have the function/operator distinction.)
> 
> Jim: I expect that you understand that the XPath "operator functions" 
> are for definition purposes only and are not in any way normative for 
> implementations.  That, of course, is why we didn't define a 
> namespace for them.  Some of our participants are a little 
> uncomfortable with a referencing specification choosing to make those 
> operator functions normative, but overall we do not object.
> 
> 
> > > 3) In section 2.2.1, we read the statement "since xs:duration does
> > > not have a well-defined value space."  We believe that
> > > mischaracterizes the rationale for the creation of the types
> > > xs:dayTimeDuration and xs:yearMonthDuration.  The rationale is
> > > actually that the xs:duration data type is not fully ordered, while the
> > > two types derived from xs:duration are fully ordered.  It is
> > > unlikely that XML Schema will be able to redefine xs:duration in a way
> > > that is both compatible and fully ordered.
> >
> >It looks like this comment about xs:duration in our draft was based on
> >the XSD 1.0 definition.  As you suggest, it may have been fixed in XSD
> >1.1.  We have decided not to add it, however, since we already agreed
> >to have the same datatypes as OWL 2, which is already at Proposed
> >Recommendation.
> 
> Jim: Thanks for the explanation (and the removal of the incorrect 
> explanation).  Do you think it would be worthwhile adding a 
> (non-normative) note telling the reader the reasoning behind the 
> decision (that is, the relationship with OWL 2)?
> 
> 
> >We have removed the now-incorrect explanation you cite.
> >
> > > 4) Also, in section 2.2.1, since xs:dateTimeStamp is taken from XSD 1.1,
> > > it would also make sense to take xs:dayTimeDuration and
> > > xs:yearMonthDuration from XSD 1.1, rather than from XDM. The definitions
> > > are equivalent by design. (This also affects section 2.3.)
> >
> >We would rather avoid this change, at this point, because it would
> >increase the risky dependency on XSD 1.1.  The OWL WG was recently
> >delayed because of such a dependency, and is left with the awkward
> >work-around you see here:
> >
> >   http://www.w3.org/TR/2009/PR-owl2-overview-20090922/#sotd-xml-dep
> >
> >If XSD 1.1 makes it to PR before RIF, we can make this change at that
> >point.
> 
> Jim: Fair enough.
> 
> 
> > > 5) In section 2.3, the type hierarchy for integer subtypes
> > > appears to be incorrect.  unsignedLong should not be a subtype of
> > > positiveInteger (because it allows the value zero). Also the prefix
> > > "xs:" is included or omitted indiscriminately.
> >
> >Thanks for catching this; it's fixed now.
> >
> > > 6) In section 4.3, we learn that "Itruth Iexternal( ?arg1;
> > > pred:is-literal-not-DATATYPE ( ?arg1 ) )(s1) = t if and only if s1 is in
> > > the value space of one of the datatypes in DTS
> > > <http://www.w3.org/TR/rif-dtb/#sec-data-types> but not in the
> > > value space of the datatype with shortname DATATYPE, and f
> > > otherwise."  We believe that means that the predicate
> > > pred:is-literal-not-integer returns f if the value of its argument is not
> > > in the value space of any datatype in DTS!  If that is true, then it
> > > is highly misleading, because returning false implies that the value is a
> > > literal of type integer.  We recommend that you reconsider this
> > > definition so that the predicate returns true when the value is either
> > > (a) not in the value space of any datatype in DTS or (b) in the value
> > > space of some datatype in DTS but not in the value space of the
> > > specified datatype.
> >
> >We believe the definition as given is correct, but that the intended
> >meaning of negative guards was not clear.  We have added this note to
> >the end of section 4.3:
> >
> >"Note: The semantics of negative guards may be surprising. The
> >is-literal-not-String guard essentially asks, "Is this a literal, and
> >(if it is) is it something other than a String?" It could also be read
> >as "Is this a decimal or a float or a double or a date or a dateTime,
> >etc, [for every datatype except string] ?". The negative guards are
> >formulated like this to allow for rules which detect, for instance,
> >some kinds of bad inputs, while still using the open world assumption
> >of some RIF dialects."
> >
> >Hopefully, that's detailed enough to show that the definition is
> >correct.  A more-detailed explanation of why we can't provide
> >is-not-String seems out-of-scope for this document.
> 
> Jim: Thanks for the explanation. I now understand why the predicate 
> has the semantics that it does.  I must say, though, that I find the 
> name itself unfortunate because of its counter-intuitiveness.  Full 
> disclosure: I have long advised people to not depend on intuition or 
> on Webster's Dictionary for the meaning of keywords and function names 
> in a programming language, but to depend solely on the language 
> spec.  This is obviously a case where I am not following my own 
> advice.  But I also advise designers of languages to avoid 
> consciously using counter-intuitive terms whenever possible.
> 
> Jim: In spite of the conflicting tone of the preceding paragraph, I 
> do not ask that you reconsider the name of the predicate, because 
> there is great value in having consistency amongst the names used for 
> similar purposes in a programming language and that consideration 
> probably outweighs the counter-intuitiveness (which might not affect 
> every reader anyway).

Can you explain how the name seems counter-intuitive to you?  Given the
meaning (test that something is a literal and is not a string), it seems
to me that is-literal-not-String is pretty clear.  It could also be
is-literal-and-is-not-String, but that doesn't seem that much clearer.

I guess I'm surprised that, knowing the meaning, you find the name to be
a problem.  That surprise makes me think there's some aspect of the
situation I'm missing.
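
To pin down the meaning I have in mind, here is a rough Python sketch
of the negative-guard semantics from the section 4.3 note.  All the
names in it (DTS, is_literal, is_literal_not) are made up for this
message, and the three-entry datatype table is a toy stand-in for the
real DTB datatype list, so please read it as an illustration only.

```python
# Hypothetical stand-in for DTS: shortname -> membership test for that
# datatype's value space.  Only three toy datatypes are modelled here.
DTS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "double": lambda v: isinstance(v, float),
}


def is_literal(value):
    """True if the value is in the value space of some datatype in DTS."""
    return any(test(value) for test in DTS.values())


def is_literal_not(shortname, value):
    """Negative guard: a literal, but not of the named datatype."""
    return is_literal(value) and not DTS[shortname](value)


print(is_literal_not("string", 42))        # True: a literal, not a string
print(is_literal_not("string", "abc"))     # False: it is a string
print(is_literal_not("string", object()))  # False: not a literal at all
```

The last case is the one your comment 6 singled out: the guard is false
for anything outside every value space in DTS, so a false result does
not by itself mean "this is a string".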

> > > 7) In section 4.4.1, we discovered the trivial typographical error
> > > "funcitons".  We also noticed the trivial typographical
> > > error "ab" (should be "an").
> >
> >Fixed, thanks.
> >
> > > 8) In section 4.5.1, Numeric functions, it is not clear whether functions
> > > such as func:numeric-add accept arguments of mixed type (e.g. integer
> > > plus double).  Although neither sections 1.3, 1.4 and 6.2 of
> > > Functions and Operators nor appendix B of XPath 2.0 are wonderfully clear
> > > on the point, our reading is that the underlying function
> > > op:numeric-add() does not accept mixed arguments; rather, when the XPath
> > > "+" operator is applied to an integer and a double, the integer
> > > is promoted to a double and the function op:numeric-add(double, double)
> > > is called. The operator accepts mixed-type arguments, but the underlying
> > > function does not. (Others may disagree with this reading, as it really
> > > isn't 100% clear.)
> >
> >Thanks for pointing out this omission.  We have added text requiring
> >types be promoted in RIF, in the Mapping part of section 4.5.1.
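
For the record, the reading in comment 8 (the operator promotes, the
underlying function only ever sees matching types) can be sketched in
Python roughly as follows.  The function names and the three-step
ladder are my own stand-ins: int, Decimal and float play the roles of
integer, decimal and double.  This is not the text added to 4.5.1.

```python
from decimal import Decimal

# Promotion ladder, lowest to highest (stand-in for integer -> decimal -> double).
LADDER = [int, Decimal, float]


def rank(value):
    """Position of the value's exact type on the promotion ladder."""
    for index, numeric_type in enumerate(LADDER):
        if type(value) is numeric_type:
            return index
    raise TypeError(f"not a supported numeric type: {type(value).__name__}")


def numeric_add_same_type(a, b):
    """Stand-in for the underlying function: rejects mixed-type arguments."""
    if type(a) is not type(b):
        raise TypeError("arguments must have the same numeric type")
    return a + b


def add_with_promotion(a, b):
    """Stand-in for the operator: promote both arguments, then delegate."""
    target = LADDER[max(rank(a), rank(b))]
    return numeric_add_same_type(target(a), target(b))


print(add_with_promotion(1, 2.5))             # 3.5
print(add_with_promotion(1, Decimal("2.5")))  # 3.5 (as a Decimal)
```

Calling numeric_add_same_type(1, 2.5) directly raises TypeError, which
is the distinction the comment draws between operator and function.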
> >
> > > 9) Section 4.7.1.2. Note that for reasons that are entirely
> > > paternalistic, the fn:concat() function requires two or more arguments.
> > > Also, the reference to xs:anyAtomicType seems odd: this abstract type
> > > doesn't seem to be present in RIF.
> >
> >Okay, we've gone with two-or-more, and removed anyAtomicType.
> >
> > > 10) Section 4.11: We suspect there is a fourth difference between RIF
> > > Lists and XPath sequences: in RIF, there is no equivalence between an
> > > atomic value and a singleton list containing that value. (Otherwise,
> > > pred:is-list() would be meaningless).
> >
> >Fixed, thanks.
> >
> > > 11) Section 4.11.1. Is it wise to number positions in a list starting
> > > from zero, while numbering characters within a string (for example, in
> > > the substring() function) from 1?  We think this inconsistency will
> > > confuse your readers and users.
> >
> >We struggled with this some more today, but decided to leave indexing
> >as is.  It's a really unfortunate situation, and we can't see any way
> >forward which won't confuse users.  Given that lists are substantially
> >different from xpath sequences, well, hopefully people will understand
> >and tolerate this approach.
> 
> Jim: This is most unfortunate, and may be the only thing to which we 
> might actually object.  Please note that our comment didn't use the 
> example of sequences, because your language doesn't contain that 
> concept; we used strings, which your language does have.  Yes, lists 
> and character strings are not the same thing, but many application 
> programmers will be familiar with treating strings as lists of 
> characters.  Numbering lists of characters (that is, strings) 
> starting with 1, but numbering other kinds of lists starting with 0 
> is, in our opinion, likely to be a serious source of confusion and 
> erroneous code.  We strongly urge you to further consider this 
> decision.  Assuming that you will choose to publish the CR without 
> making a change here, I must advise you that our WG might choose to 
> make an additional comment on this point during your CR 
> period.  (Personally, I sincerely doubt that any of our members would 
> go so far as to raise a formal objection, so you need not fear that outcome.)
> 
> Jim: If you do not make changes to resolve this concern, then please 
> be sure that the spec clearly points out the different base value for 
> the two kinds of positions.  That is probably best done with 
> non-normative notes in two places -- where character strings are 
> defined and where lists are defined, cross referencing one another.

Perhaps a short-term solution is for us to mark the indexing (perhaps
for strings and lists?) as At Risk.  That lets us defer the decision
until the end of CR, and get feedback from implementors and the
community.  It would allow us to change this later, without a second
Last Call and second CR.

For myself, mostly programming Python these days, I find it odd that in
xpath and RIF I can't use the same functions on lists and strings
(treating strings as sequences of 1-character strings).  The fact that
concat is for strings and concatenate is for sequences/lists is
... unfortunate.  (I wonder if we could define our list builtins to also
work on strings, as if they were lists like that....  Ah well, maybe
it's too late for that.)
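
Just to make the mismatch concrete, here's a small Python sketch.  The
helper names are hypothetical; substring() is a simplified,
integer-only imitation of the 1-based fn:substring (the real function
also takes doubles and rounds them), and list_get() only illustrates
0-based list positions, nothing more.

```python
def substring(source, start, length=None):
    """1-based substring, integer arguments only (simplified fn:substring)."""
    begin = max(start, 1)  # positions before 1 select nothing
    end = len(source) + 1 if length is None else start + length
    return source[begin - 1:max(end - 1, begin - 1)]


def list_get(items, position):
    """0-based list access, as in the current list indexing."""
    return items[position]


word = "abc"
chars = list(word)  # the string viewed as a list of 1-character strings

print(substring(word, 1, 1))  # a   (first character is at position 1)
print(list_get(chars, 0))     # a   (first item is at position 0)
print(list_get(chars, 1))     # b   (same index 1, different element)
```

So the same "first element" is position 1 in one API and position 0 in
the other, which is exactly the off-by-one trap Jim is worried about.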

I'm hoping that a more elegant DTB 2.0 based on user experience won't be
too many years off.

   -- Sandro

> > > 12) Section 4.11.4.11: There is no function fn:union. The link is to
> > > op:union, but the RIF function is essentially unrelated to op:union, as
> > > it is defined on atomic values rather than nodes. Same applies to
> > > 4.11.4.13 fn:intersect and 4.11.4.14 fn:except. XPath contains no
> > > functions to manipulate sequences of atomic values in this way: such
> > > functions can easily be written by users as explained in F+O appendix
> > > E.2.
> >
> >Our sense was that this difference was within the wiggle-room of
> >"adapted from", but I guess we could change it to "inspired by" or
> >"contrast with", if you think that's important.  We did want to
> >highlight the fact that xpath does have something with the same name.
> 
> Jim: I don't think you can get away with the "adapted from" argument 
> here.  The semantics (and signature) of the op:union operator from 
> XPath and those of the corresponding RIF function are different in 
> very important ways.  I believe that my WG's members would argue that 
> you should highlight this function as not coming from the F&O spec, 
> characterize it as "inspired by" if you wish, and explicitly state 
> that XPath has something with the same name, but that it is not the 
> same function.  This can all be done in a non-normative note, but I 
> believe that it's important to avoid misunderstanding by your readers.
> 
> 
> > > 13) Section 4.11.4.12: it's not clear what "in the same order"
> > > means. Order of first appearance, perhaps?
> >
> >Fixed, thanks.
> >
> > > 14) In various places in section 4, we read phrases such as "the
> > > value of the function is unspecified".  The discussion early in
> > > section 4 of that term states that implementations are free to do as they
> > > wish, including returning either true or false, as well as aborting
> > > evaluation of the containing expression/query.  In the specs for
> > > which we are responsible, as well as some well-known international
> > > standards, the term "implementation-dependent" is used for the
> > > same purpose.  You might consider the use of that term instead.
> >
> >Thanks, I agree with you on this, but the group hasn't had a chance to
> >talk it through.  I'd like to let addressing this wait until the next
> >round.
> 
> Jim: Fair enough.
> 
> 
> > > 15) Near the beginning of section 2.3 and in Appendix 6, we see three
> > > places where an unexpected character (a hollow square box) appears.
> >
> >There is a community, including some of our editors, who use that box
> >to signal the end of a formal definition.  Personally, I find it
> >confusing, and would prefer we use some CSS styling to set off the
> >definition.  The WG hasn't had a chance to talk about this, and I'd
> >like to wait until the next round for this, as well.
> 
> Jim: I don't personally object to the convention.  But you are 
> undoubtedly aware that a great many fonts use that glyph as a signal 
> that there is a (Unicode) character for which the font does not have 
> a corresponding glyph.  Most readers of your spec will see those 
> boxes and assume that they are spurious characters for which the font 
> their browser is using does not have the proper glyph.  If you must 
> use that convention, then you must clearly tell your readers what it 
> means.  I actually think your readers will be better served if you 
> look into other W3C specs to see how they deal with formal 
> definitions and use those specs as inspiration.
> 
> 
> > > </comments>
> > >
> > > Hope this helps,
> > >    Jim
> >
> >Indeed, thank you again for catching all this, and please let us know
> >if our response is satisfactory.
> >
> >      -- Sandro (on behalf of RIF WG)
> 
> Jim: It was our pleasure to help in whatever way we could.
> 
> Jim: Going out on that limb, I will assert that we do not object to 
> progressing this spec to CR, but that we may submit additional 
> comments during the CR period.
> 
> Hope this helps,
>     Jim
> 
> 
> 
> >[1] http://lists.w3.org/Archives/Public/public-rif-comments/2009Sep/0008
> 
> ========================================================================
> Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
>    Chair, W3C XML Query WG; XQX (etc.) editor       Fax : +1.801.942.3345
> Oracle Corporation        Oracle Email: jim dot melton at oracle dot com
> 1930 Viscounti Drive      Standards email: jim dot melton at acm dot org
> Sandy, UT 84093-1063 USA          Personal email: jim at melton dot name
> ========================================================================
> =  Facts are facts.   But any opinions expressed are the opinions      =
> =  only of myself and may or may not reflect the opinions of anybody   =
> =  else with whom I may or may not have discussed the issues at hand.  =
> ========================================================================  
> 

Received on Wednesday, 30 September 2009 23:23:55 UTC