ACTION-549: Handling of arbitrary QName content in c14n 2.0

The current c14n 2.0 proposal includes an explicit option for recognizing
QName prefixes inside the xsi:type attribute. I would favor replacing this
with, or adding alongside it, a more generic option that carries a list of
elements and attributes to be treated as QName-valued. I think this is a
better replacement for the inclusive prefix option, because a prefix gets
declared only when it actually shows up in the document (which is how
exclusive c14n is intended to work).

I do not propose trying to recognize QName data embedded in arbitrary text
content. I propose identifying only nodes whose complete value is a single
QName (xsi:type being one example of that).
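
As a rough illustration of the "complete value is a single QName" rule, here
is a minimal Python sketch. The function name qname_prefix is hypothetical,
the NCName pattern is an ASCII-only simplification of the real production,
and trimming surrounding whitespace is an assumption rather than something
the proposal specifies.

```python
import re

# Simplified, ASCII-only approximation of NCName (assumption: real NCNames
# also allow many non-ASCII characters).
_NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
_QNAME = re.compile(rf"(?:({_NCNAME}):)?({_NCNAME})")


def qname_prefix(value):
    """Return the prefix of a value that consists of exactly one QName.

    Returns "" for an unprefixed QName, and None when the value is not a
    single QName (for example, free text that merely contains one).
    """
    # Assumption: leading/trailing whitespace is not significant.
    match = _QNAME.fullmatch(value.strip())
    if match is None:
        return None
    return match.group(1) or ""
```

A canonicalizer following this idea would need a namespace declaration only
for the prefix returned here, and would leave values that yield None alone.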

The only complication to this idea (aside from concerns, which I don't
share, that checking nodes against a list would be a performance issue) is
that it's difficult to concisely express unqualified attributes of specific
elements, because those attributes can't themselves be identified by a
namespace/localname pair.

The best I could come up with to express the list of QName-valued nodes
would be this kind of syntax:

<QNameValued xmlns="....c14n20">
	<!-- identify a QName-valued element -->
	<Element ns="..." name="..."/>

	<!-- identify a QName-valued qualified attr -->
	<Attr ns="..." name="..."/>

	<!-- identify a QName-valued unqualified attr -->
	<Attr name="..." parentns="..." parentname="..."/>
</QNameValued>
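
To make the matching concrete, here is a hedged Python sketch of how a
processor might load such a list and test each node against it. Everything
named here is an assumption for illustration: the namespace
urn:example:c14n20 stands in for the elided one above, the urn:example:app
vocabulary is invented, and load_rules, split_tag and qname_valued_nodes are
hypothetical helpers, not part of any c14n implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; the real one is left unspecified above.
C14N_NS = "urn:example:c14n20"

CONFIG = """
<QNameValued xmlns="urn:example:c14n20">
  <Element ns="urn:example:app" name="Kind"/>
  <Attr ns="http://www.w3.org/2001/XMLSchema-instance" name="type"/>
  <Attr name="ref" parentns="urn:example:app" parentname="Item"/>
</QNameValued>
"""


def load_rules(xml_text):
    """Parse the list into three sets for constant-time lookups."""
    elements, qualified, unqualified = set(), set(), set()
    for child in ET.fromstring(xml_text):
        if child.tag == f"{{{C14N_NS}}}Element":
            elements.add((child.get("ns"), child.get("name")))
        elif child.tag == f"{{{C14N_NS}}}Attr":
            if child.get("ns") is not None:
                qualified.add((child.get("ns"), child.get("name")))
            else:
                # Unqualified attributes are keyed by their parent element.
                unqualified.add((child.get("parentns"),
                                 child.get("parentname"),
                                 child.get("name")))
    return elements, qualified, unqualified


def split_tag(tag):
    """Split ElementTree's '{ns}local' form into (ns, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return None, tag


def qname_valued_nodes(doc_root, rules):
    """Yield (kind, name, value) for every node the rules mark as QName-valued."""
    elements, qualified, unqualified = rules
    for el in doc_root.iter():
        ens, ename = split_tag(el.tag)
        if (ens, ename) in elements:
            yield ("element", el.tag, el.text)
        for key, value in el.attrib.items():
            ans, aname = split_tag(key)
            if ans is not None:
                hit = (ans, aname) in qualified
            else:
                hit = (ens, ename, aname) in unqualified
            if hit:
                yield ("attr", key, value)
```

Each node costs a handful of set lookups, which is the per-node check the
performance concern refers to. The prefixes found in the yielded values are
the ones that would need declarations in the canonical output.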

A more compact approach would be to use QNames to identify these nodes, but
that creates a circular requirement: the expressions themselves would also
have to be recognized as containing QNames (though obviously that would be
built in).

I tend to think QName-valued content is rare enough, in the systems that
would care enough to use this option, that the size of the expressions is
not a big concern.

I also think that for some people, correctness and robustness against these
problems trump squeezing out every last drop of performance, but I can
understand if some people would take the opposite view.

I can write up a formal proposal on this if there's any agreement to support
this capability, and am open to any suggestions on how to encode it.

-- Scott

Received on Thursday, 10 June 2010 00:29:47 UTC