3756 – Pre-canonicalize the testsuite

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	XML Query Test Suite
Classification:	Unclassified
Component:	XML Query Test Suite
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Andrew Eisenberg
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2006-09-20 22:50 UTC by Per Bothner
Modified:	2006-09-25 22:48 UTC
CC List:	0 users

Attachments
sort order of namespace declarations in some test output (9.39 KB, patch) 2006-09-21 19:04 UTC, Per Bothner

Description Per Bothner 2006-09-20 22:50:31 UTC

The following files contain XML or Fragment expected output with xmlns declarations that are *not* in canonical order:

(See http://www.w3.org/TR/xml-c14n#DocumentOrder.)

fn-union-node-args-015.txt
fn-union-node-args-016.txt
fn-union-node-args-017.txt
fn-intersect-node-args-015.txt
fn-intersect-node-args-016.txt
fn-except-node-args-017.txt
Constr-cont-nsmode-1.xml
Constr-inscope-1.xml ('XXX' is lexicographically before 'foo')
Constr-inscope-2.xml
Constr-inscope-3.xml
Constr-inscope-4.xml
ns-queries-results-q2.txt
ns-queries-results-q3.txt
ns-queries-results-q5.txt
ns-queries-results-q6.txt
ns-queries-results-q7.txt
ns-queries-results-q8.txt

There may be similar issues with the order of attribute nodes; I haven't inspected that yet.
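
As a rough illustration of the ordering rule behind this list, the sketch below flags start tags whose xmlns declarations are not sorted by prefix (default namespace first, then code-point order, which is why 'XXX' comes before 'foo'). The name `misordered_start_tags` and both regular expressions are illustrative assumptions, not part of XQTS. The scan is deliberately naive: it assumes no '>' occurs inside an attribute value, and it does not check that namespace declarations precede ordinary attributes.

```python
import re

# Naive start-tag matcher: assumes no '>' appears inside attribute values.
START_TAG = re.compile(r"<[^/!?<>][^<>]*>")
# Matches xmlns="..." (group 1 is None) or xmlns:prefix="..." (group 1 is the prefix).
XMLNS_DECL = re.compile(r"\sxmlns(?::([^\s=]+))?\s*=")


def misordered_start_tags(text: str) -> list:
    """Return start tags whose namespace declarations are not in prefix order."""
    flagged = []
    for tag in START_TAG.findall(text):
        # The default namespace has an empty prefix, so it sorts first.
        prefixes = [m.group(1) or "" for m in XMLNS_DECL.finditer(tag)]
        if prefixes != sorted(prefixes):
            flagged.append(tag)
    return flagged


# 'foo' declared before 'XXX' is flagged; the reverse order is not.
print(misordered_start_tags('<e xmlns:foo="urn:a" xmlns:XXX="urn:b"/>'))
print(misordered_start_tags('<e xmlns:XXX="urn:b" xmlns:foo="urn:a"/>'))
```

Running a scan like this over the expected-output files would only produce candidates for manual review, since a regex cannot tell markup from text that merely looks like markup.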

Comment 1 Michael Kay 2006-09-21 07:46:14 UTC

The spec says that "The test harness must canonicalize both the actual result and the expected result according to the Canonical XML recommendation".

There's no statement anywhere, AFAIK, that the published results are already canonicalized.

Comment 2 Frans Englich 2006-09-21 09:13:36 UTC

I believe Michael's comment is right here. None of the XML results are purposefully canonical (although that can be the case for some tests); instead they require that the implementor do the canonicalization, or use a semantically equivalent method. For example, I load the XML into DOM trees and compare them. Slow, but it works.

Therefore, I'm closing this report as invalid. Per, if this resolution is ok, feel free to change status to CLOSED, otherwise re-open the bug.

Comment 3 Per Bothner 2006-09-21 15:49:45 UTC

Ok, I changed the severity to "enhancement".  The reason is I sometimes run the testsuite many times an hour.  Having to canonicalize the expected output adds an extra processing step which slows down running the testsuite.  (Since I haven't yet implemented this extra step I don't know by how much.)  It would be much more efficient to pre-canonicalize the testsuite.  As there don't seem to be that many test cases that aren't already canonicalized it should be fairly easy to do.  (I volunteer to send patches if you'll accept this change in principle.)  It makes more sense to change a few test cases rather than run an extra canonicalization step on *each* non-error testcase.

Comment 4 Frans Englich 2006-09-21 16:11:13 UTC

I think it is a reasonable suggestion, and I wouldn't mind seeing it done either. The question is whether it's feasible:

How would fragments be handled? Those are messy since c14n tools will choke on non-XML input (I guess). One could wrap them with an element and leave it there, but that would require existing drivers to be changed, and also require them to wrap their output in a particular way. Another approach is to try to remove the document element afterwards, but this is all getting messy. As I currently see it, XML fragments are a show stopper for this enhancement.

It should probably be implemented as a tool that the task force runs on the suite (just like all our other scripts), to ensure maintainability. The smartest thing would probably be to extract the K-* tests into the CVS rep., as suggested by several (including me), otherwise they would pose a problem.

But again, in principle the idea is good. Especially since it feels like this test infrastructure will be used for future XQuery extensions. However, there's a general conservatism towards changes in the XQTS (understandably), so I wouldn't expect anything to change any time soon.

Comment 5 Per Bothner 2006-09-21 16:56:06 UTC

(In reply to comment #4)
> How would fragments be handled? Those are messy since c14n tools will choke on
> non-XML input (I guess). One could wrap them with an element and leave it there
> but that would require existing drivers to be changed, and also require them to
> wrap their output in a particular way. Another approach is to try to remove the
> document element afterwards, but this is all getting messy. As I currently see
> it, XML fragments are a show stopper for this enhancement.

First, we may not need a tool, if most of the tests are already canonicalized, which they seem to be.  The number of mis-ordered namespace declarations that I found is quite modest.  Attribute order may be another (unknown) problem.  And there may be some whitespace issues.  If the discrepancies are modest, we could just fix them manually.  I can certainly submit patches for the namespace-declaration "errors".

Secondly, I don't see that handling Fragments should be difficult.  Surround the Fragment by some magical header like <xqts-magic-frag-wrapper> ... </xqts-magic-frag-wrapper> before running the canonicalizer.  Removing the header afterwards should just be a trivial sed/perl script.  Am I missing something?
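
The wrap-then-strip idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0 rather than the Canonical XML 1.0 recommendation the test suite refers to (the two differ, for example in how unused namespace declarations are treated), and the helper name `canonicalize_fragment` is hypothetical.

```python
from xml.etree.ElementTree import canonicalize

WRAPPER = "xqts-magic-frag-wrapper"  # element name taken from the comment above


def canonicalize_fragment(fragment: str) -> str:
    """Canonicalize a fragment by wrapping it in a temporary document element."""
    open_tag, close_tag = f"<{WRAPPER}>", f"</{WRAPPER}>"
    canonical = canonicalize(open_tag + fragment + close_tag)
    # The wrapper has no attributes or namespaces, so it is expected to be
    # emitted verbatim and can be sliced off both ends of the result.
    if not (canonical.startswith(open_tag) and canonical.endswith(close_tag)):
        raise ValueError("wrapper element was not preserved")
    return canonical[len(open_tag):len(canonical) - len(close_tag)]


# A fragment with two top-level elements and a top-level text node.
print(canonicalize_fragment('<a/>text<b y="2" x="1"/>'))
```

A fragment that carries its own XML declaration would not survive the wrapping step, so a real tool would have to strip or reject that case first.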

Comment 6 Per Bothner 2006-09-21 19:04:20 UTC

Created attachment 439
sort order of namespace declarations in some test output

This "fixes" 6 of the test cases on my list so the namespace declarations are "canonically" sorted.  This improves the number of passing tests for Qexo.

Comment 7 Per Bothner 2006-09-21 19:13:47 UTC

If the previous patch is accepted (which I would much appreciate) I'll submit a similar patch for the remaining namespace-declaration-order cases.

I did find one rather larger discrepancy from canonical output: Most of the results "short-cut" end-tags, which is not allowed by canonical XML.  I.e.
<foo attributes.../> rather than <foo attributes...></foo>.  When I changed my output routine to emit the latter rather than the former, then the number of passes dropped by 174 in my testing.  "Fixing" these manually will be somewhat more tedious.

Of course the goal isn't so much "canonical XML" as a well-specified and predictable output format.  So "canonical XML but with end-tag abbreviation" is a reasonable option for expected output files, at least as an interim step for XQTS 1.0.x.
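
To make the end-tag point concrete, here is a small sketch of "canonical XML but with end-tag abbreviation". It assumes Python's `xml.etree.ElementTree.canonicalize` (C14N 2.0, used here as a stand-in for Canonical XML 1.0); the helper `canonical_with_short_tags` and the `EMPTY_PAIR` pattern are hypothetical names introduced only for this illustration.

```python
import re
from xml.etree.ElementTree import canonicalize

# Matches <name attrs></name> with nothing between the tags. Assumes no raw
# '>' inside attribute values; such elements are simply left expanded.
EMPTY_PAIR = re.compile(r"<([^\s<>/]+)((?:\s[^<>]*)?)></\1>")


def canonical_with_short_tags(xml_text: str) -> str:
    """Canonicalize, then abbreviate empty elements back to <name .../>."""
    return EMPTY_PAIR.sub(r"<\1\2/>", canonicalize(xml_text))


# Canonical form spells out the end tag and sorts the attributes...
print(canonicalize('<foo b="2" a="1"/>'))
# ...while the abbreviated variant keeps the sorting but restores <foo .../>.
print(canonical_with_short_tags('<foo b="2" a="1"/>'))
```

The regex pass is a convenience, not a parser; a serializer option that emits empty-element tags directly would be the more robust way to produce this format.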

Comment 8 Michael Kay 2006-09-21 22:05:47 UTC

What I do is to first compare my test results with the published results as a simple string compare. Only if that fails do I try to canonicalize. If, as you suggest, most of the results are in canonical form already, this procedure effectively eliminates any performance impact of canonicalization.
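
A minimal sketch of this two-step comparison, assuming both results are well-formed XML documents and using Python's `xml.etree.ElementTree.canonicalize` (C14N 2.0) in place of whatever canonicalizer a given harness uses. The function name `results_match` is hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def results_match(actual: str, expected: str) -> bool:
    """Cheap string comparison first; canonicalize only when that fails."""
    if actual == expected:
        return True  # fast path: no parsing at all
    # Slow path: only reached for results that differ textually.
    return canonicalize(actual) == canonicalize(expected)


print(results_match('<a x="1"/>', '<a x="1"/>'))                 # identical strings
print(results_match('<a y="2" x="1"/>', '<a x="1" y="2"></a>'))  # equal after c14n
print(results_match('<a x="1"/>', '<a x="2"/>'))                 # genuinely different
```

Note that `canonicalize` raises a parse error on input that is not well-formed, so a harness would need to decide separately how fragments and malformed output reach the slow path.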

Comment 9 Per Bothner 2006-09-22 05:54:47 UTC

(In reply to comment #8)
> What I do is to first compare my test results with the published results as a
> simple string compare. Only if that fails do I try to canonicalize. If, as you
> suggest, most of the results are in canonical form already, this procedure
> effectively eliminates any performance impact of canonicalization.

That is certainly a practical work-around. But we're talking about a testsuite, and running the testsuite should be as simple and regular as possible, to avoid artifacts and errors in the testsuite and testing framework.  The more complicated the comparison is, and the more different ways we try finding a match between expected and actual output, then the greater the risk of errors in the testing framework or otherwise missing actual bugs in the implementation.

Comment 10 Michael Kay 2006-09-22 08:37:40 UTC

>running the testsuite should be as simple and regular as possible

That suggests that it's better for the test suite to use the output format that most processors are likely to generate from their everyday serializers, for example <a/>, rather than the canonicalized form which is <a></a>.

But I'm probably influenced by the fact that I chose not to use canonicalization as my comparison method: instead (assuming string comparison fails) I use a customized variant of the deep-equal() function applied to the two node trees.

Comment 11 Frans Englich 2006-09-22 14:36:33 UTC

If any kind of output normalization is deployed it needs to be done in a verifiable and maintainable way. A patch or two can fix some issues that are found by a particular harness driver at this point, but it doesn't help when bugs are fixed and the test suite is potentially further developed in the future. That's why I mention a tool here, such that the task force can guarantee any promises it gives.

So, I doubt patches would be accepted. However, I do think the more general ideas behind this surely have their merits and that they should be considered for further development.

My best advice is what Michael said: do string comparisons and, in the cases where they fail, fall back to c14n comparison (or a functional equivalent).

This is only my personal response. The task force will discuss and take action on this report.

Comment 12 Andrew Eisenberg 2006-09-25 20:12:12 UTC

The Testing TF discussed this at our meeting on Sept. 21 and declined to make this change. In part this is because our decision to have implementors canonicalize the expected result is one that was made a very long time ago and has been found acceptable by many implementors. In part this is also because we are about to "declare victory" when we publish XQTS 1.0.1 (containing bug fixes for reports that we've received over the last several weeks).

I suggest that if you want to reduce the cost of running the test suite repeatedly, you first replace our expected results with their canonical representations on your local copy. We may produce some versions of XQTS beyond 1.0.1, but these will likely be few and far between.

By the way, we do acknowledge and appreciate the offer of your labor for the change that you suggest.

Please close this bug report if you agree with this resolution.

Comment 13 Per Bothner 2006-09-25 21:47:57 UTC

(In reply to comment #12)
> In part this is because our decision to have implementors
> canonicalize the expected result is one that was made a very long time ago and
> has been found acceptable by many implementors.

The problem is that this can more easily hide errors, thus making the testsuite strictly less powerful.  To verify matches(ACTUAL,EXPECTED) by computing matches(canonicalize(ACTUAL),canonicalize(EXPECTED)) is to perform a strictly less powerful test.  A bug in canonicalize can lead to a false positive.  Alternatively, if canonicalize "throws away too much information" you can also get false positives.  E.g. if someone implements canonicalize by taking the string value of the argument then a test case is more likely to pass even if an implementation fails to emit correct namespace declarations.
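
The extreme case can be reproduced in a few lines. In this sketch `lossy_canonicalize` is a deliberately wrong, hypothetical "canonicalizer" that keeps only the string value, and Python's `xml.etree.ElementTree.canonicalize` (C14N 2.0) stands in for a correct one; the sample documents are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def lossy_canonicalize(xml_text: str) -> str:
    """Deliberately wrong 'canonicalizer': keeps only the string value."""
    return "".join(ET.fromstring(xml_text).itertext())


expected = '<p:a xmlns:p="urn:right">42</p:a>'
actual = '<a>42</a>'  # the namespace declaration and prefix are missing

plain = actual == expected
lossy = lossy_canonicalize(actual) == lossy_canonicalize(expected)
proper = ET.canonicalize(actual) == ET.canonicalize(expected)

# Only the lossy comparison reports a match, i.e. a false positive.
print(plain, lossy, proper)
```

The same pattern, with a less obvious defect in the normalization step, is what would let a real serialization bug slip through unnoticed.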

> In part this is also because we
> are about to "declare victory" when we publish XQTS 1.0.1 (containing bug fixes
> for reports that we've received over the last several weeks).

Yes, I understand the timing issue, and I apologize for not bringing this issue up earlier.
> 
> I suggest that if you want to reduce the cost of running the test suite
> repeatedly, that you first replace our expected results with their canonic
> representations on your local copy.

I'm not so concerned about the cost (as Mike Kay said, one only needs to do canonicalization for the relatively few tests that aren't canonicalized), but it does reduce the "power" of the testsuite, and its usefulness in avoiding regressions.

Comment 14 Michael Kay 2006-09-25 22:28:51 UTC

>if someone implements canonicalize by taking the string value of the argument

but that wouldn't implement canonicalize.

Comment 15 Per Bothner 2006-09-25 22:48:15 UTC

(In reply to comment #14)
> but that wouldn't implement canonicalize.

My point is: how could you tell?  Unless you had a separate test-suite for canonicalization.  Using the string value is an extreme and unlikely case, but some more subtle bug in the canonicalization mechanism could easily mask a real error.

You mentioned you use a "customized variant" of deep-equal.  A lazy implementor might use deep-equal, without remembering that it doesn't check namespace declarations or element/attribute prefixes or nested comments or nested processing instructions.
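
The risk can be illustrated with a naive tree comparison. This is not fn:deep-equal itself, only a rough analogue built on Python's `xml.etree.ElementTree`, whose default parser discards comments and processing instructions and replaces prefixes with namespace URIs; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def naive_tree_equal(a: ET.Element, b: ET.Element) -> bool:
    """Compare expanded names, attributes, text and children recursively."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(naive_tree_equal(x, y) for x, y in zip(a, b))
    )


def naive_match(actual: str, expected: str) -> bool:
    # Prefixes, comments and processing instructions are already gone by the
    # time the trees are compared, so differences in them cannot be detected.
    return naive_tree_equal(ET.fromstring(actual), ET.fromstring(expected))


# Different prefixes bound to the same URI compare as equal.
print(naive_match('<q:a xmlns:q="urn:x"/>', '<p:a xmlns:p="urn:x"/>'))
# A missing comment and processing instruction also go unnoticed.
print(naive_match('<a><b/></a>', '<a><!-- note --><?pi data?><b/></a>'))
```

Whether such differences should count as failures is a policy question for the harness; the sketch only shows that this style of comparison cannot see them.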

Plus of course one might want to check that serialization works as expected.  Using a variant of deep-equal doesn't do that.