[Bug 28812] New: JSON options 'unescape' and 'liberal' prevent use of off-the-shelf JSON parsers

https://www.w3.org/Bugs/Public/show_bug.cgi?id=28812

            Bug ID: 28812
           Summary: JSON options 'unescape' and 'liberal' prevent use of
                    off-the-shelf JSON parsers
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Functions and Operators 3.1
          Assignee: mike@saxonica.com
          Reporter: josh.spiegel@oracle.com
        QA Contact: public-qt-comments@w3.org
             Group: XSLXQuery_WG

Query processors should be able to support functions like fn:parse-json and
fn:json-doc using off-the-shelf JSON parsers.  This allows end users to plug in
parsers that meet their specific needs.  For example, for fn:doc and
fn:parse-xml, we allow our users to specify an entity resolver that controls
the XML parser and the set of available documents.  See:

  Document Entities, Parser Factory Entities:
  https://docs.oracle.com/database/121/JAXML/oracle/xml/xquery/OXQEntity.html

This level of customization has been important to our users in the past.

In Java, JSR-353 provides a "standard" API for JSON parsing:

  https://jsonp.java.net/

This API does not standardize a liberal parsing mode or provide access to
unescaped strings.  

There is also Jackson, a popular JSON parsing library I have seen used by a
number of Java projects.

  http://wiki.fasterxml.com/JacksonHome

Jackson also does not provide access to unescaped strings.
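
The same behavior is easy to observe in other stock parsers.  The sketch
below is an illustration only: it uses Python's standard json module as a
stand-in, not JSR-353 or Jackson, and takes "unescaped strings" to mean
string values whose escape sequences are left as written in the input.  The
helper name accepts() is made up for this example.

```python
import json

# The raw JSON text contains the escape sequences \u0041 and \n literally.
raw = r'{"key": "\u0041\n"}'

# The parser returns the decoded value; the original escape sequences
# cannot be recovered from the result.
decoded = json.loads(raw)["key"]


def accepts(text):
    """Return True if the stock parser accepts the text as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Here decoded holds the letter A followed by a newline, and inputs such as
{'key': 1} (single quotes) or {"key": 1,} (trailing comma) are rejected, so
neither 'unescape=false' nor a liberal mode can be built on this interface
alone.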

I would like the working group to consider creating an error code that
implementations may raise if they do not support 'unescape=false' or
'liberal=true'.  This would allow implementations to stay conformant while
using off-the-shelf parsers.
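
As a rough sketch of what this could look like inside an implementation
(Python used for illustration only), the option check might run before the
input is handed to the plugged-in parser.  The exception class, the function
name and the error code string are all hypothetical; no such code exists in
the specification today.

```python
class UnsupportedJsonOption(Exception):
    """Hypothetical error; the code below is a placeholder, not a real one."""
    code = "err:UNSUPPORTED-JSON-OPTION"


# Option values named in this report as ones a stock parser cannot honor.
UNSUPPORTED = {("unescape", False), ("liberal", True)}


def check_options(options):
    """Raise if the caller asks for behavior the plugged-in parser lacks."""
    for name, value in options.items():
        if (name, value) in UNSUPPORTED:
            raise UnsupportedJsonOption(
                f"{name}={str(value).lower()} is not supported")
    return options
```

A call such as check_options({"liberal": True}) would raise the error, while
the default-style options {"unescape": True, "liberal": False} pass through
unchanged.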

Alternatively, I would also support removing the options 'unescape' and
'liberal'.  'liberal=true' is too loosely specified to ensure much
interoperability.  And I don't see the use case for 'unescape=false':
  * This sort of thing isn't supported for XML (i.e. parse-xml has no feature
    to preserve character references).
  * In general, it doesn't solve the problem of non-XML characters in JSON.
    json-doc may still produce non-XML characters even when unescape=false.
    Assuming that it will not is a possible gotcha for users.
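
The second point can be illustrated with a small sketch, again using
Python's json module as a stand-in and assuming the parser passes literal
characters through untouched.  The JSON text contains U+FFFF as a literal
character rather than as an escape sequence, so leaving escapes alone has
nothing to act on.  The range check follows the Char production of XML 1.0;
the helper name is made up for this example.

```python
import json


def is_xml10_char(ch):
    """True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


# The JSON text holds U+FFFF as a literal character, with no escape sequence.
raw = '"' + chr(0xFFFF) + '"'
value = json.loads(raw)

# True when the parsed string contains a character XML 1.0 does not allow.
has_non_xml = any(not is_xml10_char(c) for c in value)
```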


Received on Monday, 15 June 2015 16:43:50 UTC