This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 29588 - [fo31] xml-to-json - rules for duplicate keys
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Functions and Operators 3.1
Version: Candidate Recommendation
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2016-04-23 20:43 UTC by Michael Kay
Modified: 2016-07-21 15:20 UTC

Description Michael Kay 2016-04-23 20:43:45 UTC
The spec for xml-to-json states that duplicate keys are disallowed, as a consequence of the requirement for the input XML to be valid against our published schema. Our schema says that the @key attribute must be unique for all the element children of a <j:map> element.

This doesn't quite achieve the desired effect. For example, the following pair is disallowed as duplicates, when in fact the keys are different (the first is a newline character, the second is a backslash followed by "n"):

<j:null key="\n" escaped-key="true"/>
<j:null key="\n" escaped-key="false"/>

and the following pair is not disallowed, although the two are different representations of the same key (a single newline character):

<j:null key="\n" escaped-key="true"/>
<j:null key="&#xa;"/>

It's possible to solve the first problem by tweaking the schema (making the uniqueness constraint apply to the combination of @key and @escaped-key). Solving the second problem is harder. It could potentially be done using an XSD 1.1 assertion, using a regex to unescape the key values and compare them in unescaped form. (But handling escaped surrogate pairs, even well-formed ones, using regular expressions is not easy!) It would be simpler to state the constraint in prose as part of the xml-to-json function specification, rather than relying on the schema.
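
To make the two problem cases concrete, here is a minimal Python sketch that compares keys in unescaped form. It is an illustration only, not part of the spec or the schema: the helper names unescape_json_key and same_key are hypothetical, and Python's json.loads is used as a stand-in for the JSON unescaping rules, on the assumption that the key contains no unescaped quotation marks or control characters.

```python
import json


def unescape_json_key(raw: str) -> str:
    """Expand JSON escapes by parsing the key as a JSON string literal.

    Assumption: raw contains no unescaped quotation marks or control
    characters, so wrapping it in quotes yields valid JSON.
    """
    return json.loads('"' + raw + '"')


def same_key(key1: str, escaped1: bool, key2: str, escaped2: bool) -> bool:
    """Compare two keys in unescaped form (hypothetical helper)."""
    k1 = unescape_json_key(key1) if escaped1 else key1
    k2 = unescape_json_key(key2) if escaped2 else key2
    return k1 == k2


# First pair: identical @key text, but the keys differ
# (a newline versus a backslash followed by "n").
first_pair = same_key(r"\n", True, r"\n", False)

# Second pair: different @key text ("&#xa;" is a newline once the XML
# has been parsed), but the same key.
second_pair = same_key(r"\n", True, "\n", False)

# An escaped surrogate pair denotes one character outside the BMP.
surrogate_pair = same_key(r"\ud83d\ude00", True, "\U0001F600", False)
```

A general-purpose JSON parser joins the surrogate pair into one character, which is the part that is awkward to reproduce with a regex inside a schema assertion.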
Comment 1 Andrew Coleman 2016-05-06 09:48:22 UTC
At the meeting on 2016-05-03, the WG agreed to make this change by making the prose normative for duplicate keys. Action A-642-02 was raised to track this.
Comment 2 Michael Kay 2016-06-09 10:29:13 UTC
The following changes are made to implement this decision:

(A) The schema for JSON is changed: the uniqueness constraint for maps changes from

        <xs:unique name="unique-key">
            <xs:selector xpath="*"/>
            <xs:field xpath="@key"/>
        </xs:unique>

to

        <xs:unique name="unique-key">
            <xs:selector xpath="*"/>
            <xs:field xpath="@key"/>
            <xs:field xpath="@escaped-key"/>
        </xs:unique>

(Note: the semantics of xs:unique ensure that (a) if there is no @escaped-key value, then the default (false) is used; (b) evaluation of the constraint effectively uses the typed value rather than the string value of the attribute, so "true", "1", and " 1 " are equivalent.)
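
The effect of the two-field constraint can be imitated outside a schema processor. The following Python sketch is only an approximation of what the note describes: xs_boolean and unique_key_violations are hypothetical names, the (key, escaped-key) pairs stand for the attributes of the children of one map, and the xs:boolean handling covers just the whitespace collapsing and the four lexical forms.

```python
from typing import Iterable, List, Optional, Tuple


def xs_boolean(lexical: Optional[str]) -> bool:
    """Typed value of an xs:boolean attribute; absent means the default, false."""
    if lexical is None:
        return False
    value = lexical.strip(" \t\r\n")  # whitespace is collapsed before typing
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError("not a valid xs:boolean: %r" % lexical)


def unique_key_violations(
    children: Iterable[Tuple[str, Optional[str]]]
) -> List[Tuple[str, bool]]:
    """Return the (key, escaped-key) combinations that occur more than once."""
    seen = set()
    violations = []
    for key, escaped in children:
        field_values = (key, xs_boolean(escaped))
        if field_values in seen:
            violations.append(field_values)
        seen.add(field_values)
    return violations


# Same @key text, different @escaped-key: no longer reported.
ok = unique_key_violations([("\\n", "true"), ("\\n", "false")])

# "true" and " 1 " have the same typed value, so this is still a duplicate.
dup = unique_key_violations([("\\n", "true"), ("\\n", " 1 ")])

# Escaped and unescaped forms of the same key still slip through;
# that case is left to the prose rule.
missed = unique_key_violations([("\\n", "true"), ("\n", None)])
```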

(B) In the rules for fn:xml-to-json, after the first numbered list, add:

Furthermore, the input must satisfy the following constraint (which cannot be conveniently expressed in the schema). Every element M in the input tree having local name "map" must satisfy the following rule: there must not be two distinct children of M (say C1 and C2) such that the normalized key of C1 is equal to the normalized key of C2. The normalized key of an element C is as follows:

1. If C has the attribute value escaped-key='true', then the value of the key attribute, with all JSON escape sequences expanded to the corresponding Unicode characters according to the JSON escaping rules.

2. Otherwise (the escaped-key attribute is absent or set to false), the value of the key attribute.

(C) In the second error condition (FOJS0006) add the condition "or if a map element has two children whose normalized key values are the same".

(D) Expand Note 4. "Duplicate key values are not permitted. Most cases of duplicate keys are prevented by the rules in the schema; additional cases (where the keys are equal only after expanding JSON escape sequences) are prevented by the prose rules of this function."

(E) If time permits, add some examples.
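
As an illustration of the rule in (B), here is a minimal Python sketch of the normalized-key check. It is not taken from any implementation: expand_json_escapes, normalized_key and duplicate_keys are hypothetical names, the unescaper handles only the standard JSON escapes, and the sketch returns the offending keys where a real fn:xml-to-json would raise FOJS0006.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# The single-character JSON escapes and the characters they stand for.
_SIMPLE = {'"': '"', "\\": "\\", "/": "/", "b": "\b",
           "f": "\f", "n": "\n", "r": "\r", "t": "\t"}
# A backslash followed by uXXXX, by any one character, or by end of input.
_ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|.|\Z)", re.DOTALL)


def expand_json_escapes(text: str) -> str:
    """Expand JSON escape sequences to the characters they denote."""
    def replace(match):
        body = match.group(1)
        if body.startswith("u") and len(body) == 5:
            return chr(int(body[1:], 16))
        if body in _SIMPLE:
            return _SIMPLE[body]
        raise ValueError("invalid JSON escape: \\" + body)

    expanded = _ESCAPE.sub(replace, text)
    # Join escaped surrogate pairs into single characters by a round trip
    # through UTF-16; unpaired surrogates pass through unchanged.
    return expanded.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass")


def normalized_key(child: ET.Element) -> str:
    """Rule 1 if escaped-key is true, otherwise rule 2."""
    key = child.get("key", "")
    escaped = child.get("escaped-key", "false").strip() in ("true", "1")
    return expand_json_escapes(key) if escaped else key


def duplicate_keys(root: ET.Element) -> List[str]:
    """Normalized keys that occur more than once under any map element."""
    duplicates = []
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] != "map":  # compare local names only
            continue
        seen = set()
        for child in elem:
            key = normalized_key(child)
            if key in seen:
                duplicates.append(key)
            seen.add(key)
    return duplicates


NS = "http://www.w3.org/2005/xpath-functions"

# Different representations of the same key: a duplicate.
same_key_doc = ET.fromstring(
    '<j:map xmlns:j="%s"><j:null key="\\n" escaped-key="true"/>'
    '<j:null key="&#xa;"/></j:map>' % NS)

# Same @key text but different keys: not a duplicate.
different_key_doc = ET.fromstring(
    '<j:map xmlns:j="%s"><j:null key="\\n" escaped-key="true"/>'
    '<j:null key="\\n" escaped-key="false"/></j:map>' % NS)

same_key_result = duplicate_keys(same_key_doc)
different_key_result = duplicate_keys(different_key_doc)
```

The UTF-16 round trip is one way to avoid matching surrogate pairs with a regex; a hand-written scanner that pairs high and low surrogates explicitly would do equally well.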
Comment 3 Michael Kay 2016-06-09 12:01:16 UTC
The changes have been applied to both the F+O3.1 and the XSLT3.0 specifications.

XSLT 3.0 test cases have been updated.
Comment 4 Michael Kay 2016-06-09 12:06:50 UTC
QT3 test cases fn-xml-to-json-D-501, fn-xml-to-json-D-502 and fn-xml-to-json-D-503 have been updated.