This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 2402 - Text Comparison (Literals056 and more generally)
Summary: Text Comparison (Literals056 and more generally)
Status: CLOSED FIXED
Alias: None
Product: XML Query Test Suite
Classification: Unclassified
Component: XML Query Test Suite
Version: 0.7.0
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Mike Rorke
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-10-21 11:27 UTC by David Carlisle
Modified: 2005-10-27 21:21 UTC

Description David Carlisle 2005-10-21 11:27:02 UTC
This was raised by Jonathan Robie in #1863 (marked as closed) but is still
present in the 0.7 release.

Literals056.txt is specified as Text comparison but contains the text &amp;
where I would have expected &.


Actually the meaning of the text comparison is still very unclear.
http://www.w3.org/XML/Query/test-suite/Guidelines%20for%20Running%20the%20XML%20Query%20Test%20Suite.html

says of the Text comparison 

  Text: text is compared using byte-comparison.

I, and apparently other testers, have interpreted this as meaning that the
result should be serialised using the text output method in UTF-8 and compared
to the supplied result. (Actually, I compare the string value of the result
with the expected result as XPath strings, so it is a Unicode character
equality, not a byte comparison of its UTF-8 encoding.)

However the new guidelines are now explicit that the results are always to be
serialised with a specified set of serialisation parameters, including method=xml.

If Literals056.txt is to be compared with the output of an XML serialisation,
then it does need to be &amp; not & (and the other examples that -were- changed
for bug #1863 need changing back). In this case other results would need to be
allowed (e.g. using a numeric character reference or a CDATA section to quote
the &). Otherwise the guidelines need to say that text serialisation should be
used for the Text comparison.
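
To illustrate why other results would need to be allowed, here is a minimal
Python sketch. It assumes the standard-library ElementTree parser and a
hypothetical wrapper element named "r"; neither comes from the test suite. It
shows three XML spellings of the single character & that differ as bytes but
parse to the same string.

```python
import xml.etree.ElementTree as ET

# Three XML serialisations of the same one-character string "&".
candidates = [
    "&amp;",          # predefined entity reference
    "&#38;",          # numeric character reference
    "<![CDATA[&]]>",  # CDATA section
]


def text_value(fragment: str) -> str:
    """Parse the fragment inside a hypothetical wrapper element and return its text."""
    return ET.fromstring(f"<r>{fragment}</r>").text or ""


# String values after parsing: all three collapse to the same character.
values = [text_value(c) for c in candidates]

# Byte-for-byte, every candidate is distinct from the others.
bytes_differ = len({c.encode("utf-8") for c in candidates}) == len(candidates)
```

A byte comparison against a single expected file would accept only one of
these spellings, even though all three represent the same result.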

Alternatively (and perhaps preferably) all instances of Text comparison could be
replaced by Fragment. That reduces the dependence on support for multiple
serialisation forms, which is an optional feature for XQuery.
http://www.w3.org/TR/2003/WD-xquery-20031112/#id-serialization
makes it clear that even if serialisation is supported a system need only
support method="xml", which means that a conforming application may fail all
tests using Text comparison.
Comment 1 Mike Rorke 2005-10-27 19:45:16 UTC
We have clarified this in the task force and will add some text to the upcoming
release to make this clearer. Basically, we require all results to be
serialized using XML serialization. In this case, simple string literals will
be output as a top-level text node with the special XML characters escaped. So,
for Literals056, the result is correct - while we had incorrect results for
the output of the other special XML characters. I have now fixed these.
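
As a rough illustration of this rule, the sketch below uses Python's
xml.sax.saxutils.escape as a stand-in for the XML output method applied to a
top-level text node. The helper name serialize_text_node is hypothetical, and a
real serializer may make different choices for characters such as >.

```python
from xml.sax.saxutils import escape


def serialize_text_node(value: str) -> str:
    """Approximate XML serialization of a top-level text node.

    Assumption: escape() stands in for the XML output method here.
    It replaces &, < and > with their predefined entity references.
    """
    return escape(value)


ampersand = serialize_text_node("&")
less_than = serialize_text_node("<")
mixed = serialize_text_node("a & b < c")
```

Under this assumption a string whose value is the single character & is
written out as &amp;, which is the form the expected result file then has to
contain.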
Comment 2 David Carlisle 2005-10-27 20:45:10 UTC
I must say this is a slightly surprising outcome (although at least it will be
consistent, even though it reverses the resolution of bug #1863).
However, I now can't see any reason for the Text comparison at all: there are
virtually no features of an XML serialisation that can be compared safely in a
byte-for-byte manner. In my own test harness I currently don't serialise the
result at all. XML and Fragment are compared by loading the file as XML (after
wrapping in an element in the case of Fragment) and then comparing using
deep-equal(). (Or I could write a different recursive equality function that
took different actions on comments etc., but that wouldn't change this issue.)

Currently I compare Text by reading the Expected result with unparsed-text() and
comparing using = with the string value of the result. With the clarification
you give here I would change my test harness to treat Text in exactly the same
way as Fragment. Even if I was serialising and then comparing (which is closer
to the official method) I still couldn't see any difference between Text and
Fragment comparison.

If the output has no attributes or namespaces, Text and Fragment are the same,
and if the output does have attributes or namespaces, you'd want to compare as
Fragment rather than Text so that you can write all attributes in some canonical
order before comparing, wouldn't you?
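
A minimal Python sketch of a Fragment-style comparison along these lines,
assuming the standard-library ElementTree parser. The names parse_fragment,
same_tree and fragments_match and the wrapper element are hypothetical, and
same_tree is only a rough analogue of deep-equal(), not an implementation of it.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment: str) -> ET.Element:
    """Wrap the fragment in a hypothetical element so it parses as a document."""
    return ET.fromstring(f"<wrapper>{fragment}</wrapper>")


def same_tree(a: ET.Element, b: ET.Element) -> bool:
    """Recursively compare two elements.

    Attributes are held in dicts, so their order in the source is ignored.
    Comments are dropped by the default parser and so are not compared.
    """
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


def fragments_match(actual: str, expected: str) -> bool:
    """Compare two serialised fragments by their parsed structure, not their bytes."""
    return same_tree(parse_fragment(actual), parse_fragment(expected))


# Differs in attribute order, empty-element syntax and the spelling of "&".
reordered = fragments_match(
    '<a x="1" y="2"/>text &amp; more',
    '<a y="2" x="1"></a>text &#38; more',
)

# Plain text with no markup goes through the same code path as a fragment.
text_only = fragments_match("&amp;", "&#38;")
different = fragments_match("&amp;", "&lt;")
```

With a comparison of this shape, a result that is plain text needs no separate
code path.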

David
Comment 3 Mike Rorke 2005-10-27 20:53:48 UTC
I agree - technically, the 'Text' comparator could be subsumed by
the 'Fragment' comparator. So could the 'XML' comparator (provided we do not
have expected results which contain an XML declaration and/or DTD - which we do
not). The original idea of the different comparators was to give the user an
indication of what to expect from the results - initially, either an XML result
or a textual string. But, with the additional complications raised by the
different serialization methods and the necessity of adding the 'Fragment'
comparator, these differences have been made less important.

So, basically, these comparators are just an indication to the user about what
type of results to expect. The actual technical details of how each comparator
is implemented are pretty similar - the comparator is more of a conceptual idea
than a technical requirement. We do provide technical details of how each
should be implemented in the Test Execution Guidelines document but, as you
have pointed out, these implementations are all very similar.
Comment 4 David Carlisle 2005-10-27 21:21:02 UTC
Thanks for the further feedback.
Personally I'd prefer to see XML on all well-formed expected results and
Fragment on any that are not well-formed, and no Text at all, but if you think
that having some marked as Text will help some other testers, I don't actually
object to that.
So I'm closing this report, as at least I now know what I'm supposed to do,
which is the main thing!

David