W3C

XQuery 3.1 Requirements and Use Cases

W3C First Public Working Draft 24 April 2014

This version:
http://www.w3.org/TR/2014/WD-xquery-31-requirements-20140424/
Latest version:
http://www.w3.org/TR/xquery-31-requirements/
Editor:
Jonathan Robie, EMC Corporation

Abstract

This document specifies goals and requirements for XQuery 3.1.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a First Public Working Draft as described in the Process Document. It was developed by the W3C XML Query Working Group, which is part of the XML Activity. The group does not expect this document to become a W3C Recommendation; it expects to publish the document eventually as a W3C Working Group Note.

These Requirements identify extensions to the XQuery 3.0 Recommendation, published 08 April 2014, that have been requested by WG participants and by reviewers who do not participate in the W3C activities. The XML Query WG has not yet fully reviewed these requirements.

Please report errors in this document using W3C's public Bugzilla system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string “[XQuery31Req]” in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Goals
2 Requirements
    2.1 Terminology
    2.2 General Requirements
        2.2.1 Backward compatibility
        2.2.2 Extension compatibility
    2.3 Maps, Arrays, Nulls, and JSON
        2.3.1 Maps
        2.3.2 Arrays
        2.3.3 Nulls
        2.3.4 Serialization
    2.4 Usability Features
        2.4.1 Scientific Notation
        2.4.2 Type Aliases
        2.4.3 Invoking XSLT Transformations
        2.4.4 Collations
3 Use Cases
    3.1 Streaming
        3.1.1 Simple Grouping
            3.1.1.1 Solution in XQuery 3.0
            3.1.1.2 Solution in XQuery 3.0 with XSLT Maps
            3.1.1.3 Solution in XSLT 3.0
        3.1.2 Simultaneous Grouping
            3.1.2.1 Solution in XQuery 3.0
            3.1.2.2 Solution in XQuery 3.0 with XSLT Maps
        3.1.3 Word Count by Lemma
            3.1.3.1 Input Data
            3.1.3.2 Result
            3.1.3.3 Solution in XQuery 3.0 with XSLT Maps:
            3.1.3.4 Alternative Solution in XQuery 3.0 with XSLT Maps:
            3.1.3.5 Solution Using Grouping in XQuery 3.0:
            3.1.3.6 Solution in XSLT 3.0:
    3.2 Compound Values
        3.2.1 Complex Number Library
            3.2.1.1 Solution in XQuery 3.0 with XSLT Maps:
            3.2.1.2 Solution in XSLT 3.0 (using type-alias proposal, still in discussion):
    3.3 Manual Indexing
        3.3.1 Simple Manual Join
            3.3.1.1 Input Data
            3.3.1.2 Solution in XQuery 3.0 with XSLT Maps:
            3.3.1.3 Solution in XSLT 3.0:
    3.4 Interface / Implementation Pattern
        3.4.1 Data Variety
            3.4.1.1 Input Data
            3.4.1.2 Solution in XQuery 3.0 with XSLT Maps:
            3.4.1.3 Solution in XSLT 3.0:
        3.4.2 Search and Snippeting
            3.4.2.1 Solution in XQuery Full Text 3.0 with XSLT Maps:
        3.4.3 Abstracting Document Structure
            3.4.3.1 Solution in XQuery 3.0 with XSLT Maps:
    3.5 Parameter Passing
        3.5.1 XSLT Stylesheet Parameters
            3.5.1.1 Solution in XQuery 3.0 with XSLT Maps:
        3.5.2 Function Options
            3.5.2.1 Solution in XQuery 3.0 with XSLT Maps:
            3.5.2.2 Solution in XQuery 3.0 with XSLT Maps enhanced with stronger typing:
        3.5.3 Translation
            3.5.3.1 Solution in XQuery 3.0 with XSLT Maps:
        3.5.4 Cipher Functions
            3.5.4.1 Solution in XQuery 3.0 with XSLT Maps:
    3.6 Natural Language Processing
        3.6.1 Input Data
        3.6.2 Convert Part of Speech Data to XML
        3.6.3 Converting arrays to maps
        3.6.4 Group by Part of Speech
        3.6.5 Trigrams
        3.6.6 Partitioning using filters
    3.7 Comparing Sequences in Optical Character Recognition
    3.8 Transforms for Graphics
    3.9 JSON
        3.9.1 Information Retrieval
            3.9.1.1 Input Data
            3.9.1.2 Result
            3.9.1.3 Solution in XQuery 3.0 with XSLT Maps:
            3.9.1.4 Alternative Solution in XQuery 3.0 with XSLT Maps:
            3.9.1.5 Solution in JSONiq:
        3.9.2 Converting JSON to XML
            3.9.2.1 Input Data
            3.9.2.2 Result
            3.9.2.3 Solution in XQuery 3.0 with XSLT Maps:
            3.9.2.4 Solution in JSONiq:
            3.9.2.5 Solution in XSLT 3.0:
        3.9.3 Update by Copying
            3.9.3.1 Input Data
            3.9.3.2 Solution in XQuery 3.0 with XSLT Maps:
            3.9.3.3 Solution in XSLT 3.0:
        3.9.4 Joins
            3.9.4.1 Input Data
            3.9.4.2 Solution in JSONiq:
            3.9.4.3 Solution in XSLT 3.0:
        3.9.5 Grouping Queries for JSON
            3.9.5.1 Input Data
            3.9.5.2 Result
            3.9.5.3 Solution in JSONiq:
            3.9.5.4 Solution in XSLT 3.0:
        3.9.6 More Complex Grouping Queries for JSON
            3.9.6.1 Input Data
            3.9.6.2 Result
            3.9.6.3 Solution in JSONiq:
            3.9.6.4 Solution in XSLT 3.0:
        3.9.7 JSON to JSON Transformations
            3.9.7.1 Input Data
            3.9.7.2 Result
            3.9.7.3 Solution in JSONiq:
            3.9.7.4 Solution in XSLT 3.0:
        3.9.8 Converting XML to JSON
            3.9.8.1 Input Data
            3.9.8.2 Result
            3.9.8.3 Solution in JSONiq:
        3.9.9 Transforming JSON to SVG
            3.9.9.1 Input Data
            3.9.9.2 Solution in JSONiq:
        3.9.10 Transforming Arrays to HTML Tables
            3.9.10.1 Input Data
            3.9.10.2 Solution in JSONiq:
        3.9.11 Windowing Queries
            3.9.11.1 Input Data
            3.9.11.2 Result
            3.9.11.3 Solution in JSONiq:
        3.9.12 JSON views in middleware
            3.9.12.1 Input Data
            3.9.12.2 Solution in JSONiq:
        3.9.13 In-Place Updates
            3.9.13.1 Input Data
            3.9.13.2 Solution in JSONiq:
        3.9.14 Data Transformations
            3.9.14.1 Input Data
            3.9.14.2 Solution in JSONiq:

Appendix

A References

End Notes


1 Goals

The primary goal of XML Query 3.1 is to extend XML Query 3.0 with support for JSON maps and arrays, and to leverage these structures to make XQuery more useful. These data structures are also part of XPath 3.1, and are used in XSLT as well as XQuery.

Other features that improve usability or compatibility will be considered as time permits.

Satisfying these goals may require changes to the set of seven documents that have progressed to Recommendation together (Data Model 3.1, Functions and Operators 3.1, Serialization 3.1, XPath 3.1, XQuery 3.1, XQueryX 3.1, and XSLT 3.0).

2 Requirements

2.1 Terminology

The following keywords are used throughout the document to specify the extent to which an item is a requirement for the work of the XML Query Working Group:

MUST

The item is an absolute requirement.

MUST NOT

The item is an absolute prohibition.

SHOULD

There may exist valid reasons not to treat this item as a requirement, but the full implications should be understood and the case carefully weighed before discarding this item.

SHOULD NOT

There may exist valid reasons when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.

MAY

An item deserves attention, but further study is needed to determine whether the item should be treated as a requirement.

When the words MUST, SHOULD, or MAY are used in this technical sense [IETF RFC 2119], they occur as a hyperlink to these definitions. These words will also be used with their conventional English meaning, in which case there is no hyperlink. For instance, the phrase "the full implications should be understood" uses the word "should" in its conventional English sense, and therefore occurs without the hyperlink.

Each requirement also includes a status section, indicating its current situation in the XQuery/XPath/XSLT family of specifications. Three status levels are used:

"Green" status

This indicates that the requirement, according to its original formulation, has been completely met. Optional clarifying text may follow.

"Yellow" status

This indicates that the requirement has been partially met according to its original formulation. When this happens, explanatory text is provided to clarify the current scope of the requirement.

"Red" status

This indicates that the requirement, according to its original formulation, has not been met. If this is the case, explanatory text is provided.

2.2 General Requirements

2.2.1 Backward compatibility

XQuery 3.1 MUST be backward compatible with [XQuery 3.0].

Every valid XQuery 3.0 expression MUST be valid in XQuery 3.1 and it MUST evaluate to the same result.

green status Status: this requirement has been met.

2.2.2 Extension compatibility

XQuery 3.1 MUST be compatible with XQuery 3.0 extensions developed by the XML Query Working Group, including [XQuery Update Facility 3.0] and [XQuery and XPath Full Text 3.0].

green status Status: this requirement has been met.

2.3 Maps, Arrays, Nulls, and JSON

2.3.1 Maps

XQuery 3.1 MUST support collections of name/value pairs, which we call maps. (In JSON they are called objects; in other languages they are sometimes called records, structs, dictionaries, hash tables, keyed lists, or associative arrays.)

green status Status: this requirement has been met.

The map feature MUST provide a convenient syntax for creating maps.

green status Status: this requirement has been met.

The map feature MUST provide a convenient syntax for returning the value associated with a key.

green status Status: this requirement has been met.

The map feature MUST provide a convenient way to enumerate the keys in a map.

green status Status: this requirement has been met (using functions).

The map feature MUST provide a convenient way to create modified copies of maps, e.g. by adding or deleting entries.

green status Status: this requirement has been met (using functions).

The map feature MUST NOT preclude in-situ updates analogous to updates in the XQuery Update Facility.

green status Status: this requirement has been met.

A map SHOULD allow any atomic value as a key. The map feature SHOULD allow keys of different types to be used in the same map.

green status Status: this requirement has been met.

A map SHOULD allow any XDM sequence as a value. A map MUST allow any XDM item, map, or array as a value.

green status Status: this requirement has been met.

A map MUST be allowed as a member of an XDM sequence.

green status Status: this requirement has been met.

It MAY be possible to use a map as a function.

green status Status: this requirement has been met.

For the sake of optimizability, a map SHOULD NOT expose identity via the is, <<, >>, union, intersect, or except operators, or any operation that exposes document order.

green status Status: this requirement has been met.
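
As a non-normative illustration of the map requirements above, the following sketch assumes the syntax and function names later published in the XQuery 3.1 Recommendation (the `map { key : value }` constructor, the `?` lookup operator, and the `map:keys`, `map:put`, and `map:remove` functions). These postdate this draft and differ from the `:=` notation used in the use cases below. The names and salaries are invented for illustration.

```xquery
xquery version "3.1";
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Hypothetical data, for illustration only. :)
let $salary := map { "alice": 5200, "bob": 4100 }

(: A modified copy: add one entry, remove another. $salary itself is unchanged. :)
let $updated := map:remove(map:put($salary, "carol", 6000), "bob")

return (
  $salary?alice,        (: lookup operator: 5200 :)
  $salary("bob"),       (: a map can be called as a function: 4100 :)
  map:keys($updated)    (: "alice" and "carol", in implementation-dependent order :)
)
```

Because `map:put` and `map:remove` return new maps, the sketch shows the "modified copies" requirement rather than in-situ update.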

2.3.2 Arrays

XQuery 3.1 MUST support arrays, which can nest.

green status Status: this requirement has been met.

XQuery 3.1 MUST provide a convenient syntax for creating arrays.

green status Status: this requirement has been met.

Arrays MUST provide a convenient syntax for returning the value found in a given position.

green status Status: this requirement has been met (using function call syntax).

Arrays SHOULD provide a convenient way to create modified copies of an array, e.g. by adding or deleting entries.

green status Status: this requirement has been met (using functions).

Arrays MUST NOT preclude in-situ updates analogous to updates in the XQuery Update Facility.

green status Status: this requirement has been met.

An array MUST allow any XDM item, array, or map as a member.

green status Status: this requirement has been met.

An array MUST be allowed as a member of an XDM sequence.

green status Status: this requirement has been met.

It MAY be possible to use an array as a function.

green status Status: this requirement has been met.

For the sake of optimizability, an array SHOULD NOT expose identity via the is, <<, >>, union, intersect, or except operators, or any operation that exposes document order.

green status Status: this requirement has been met.
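
The array requirements above can be sketched in the same way. This non-normative example assumes the square-bracket array constructor and the `array:append`, `array:remove`, and `array:size` functions as later published in the XQuery 3.1 Recommendation. The values are invented for illustration.

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: An array whose members are an integer, a nested array, and a map. :)
let $a := [ 1, [ 2, 3 ], map { "k": "v" } ]

(: A modified copy: append a member, then remove the first one. :)
let $b := array:remove(array:append($a, 4), 1)

return (
  $a(2)(1),         (: function-call syntax, first member of the second member: 2 :)
  $a?3?k,           (: lookup operator, through the array into the map: "v" :)
  array:size($b)    (: 3 :)
)
```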

2.3.3 Nulls

XQuery 3.1 MUST support nulls. It MAY represent nulls using the empty sequence, or it MAY represent nulls with a new item.

red status Status: this requirement has not been met. The representation of nulls is still under investigation.

2.3.4 Serialization

XQuery 3.1 MUST support JSON serialization.

green status Status: this requirement has been met.

XQuery 3.1 MAY support serialization to multiple resources from a single query.

green status Status: this requirement has been met (via fn:put()).
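
A minimal, non-normative sketch of JSON serialization follows. It assumes the `json` output method as later published in the Serialization 3.1 Recommendation, selected here through an option declaration in the query prolog. The map content is invented for illustration.

```xquery
xquery version "3.1";
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: Ask the processor to serialize the query result as JSON. :)
declare option output:method "json";

map {
  "lemma": "go",
  "count": 3,
  "forms": [ "go", "went", "goes" ]
}
```

The result is a single JSON object. The order in which its keys appear is not guaranteed, because maps are unordered.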

2.4 Usability Features

2.4.1 Scientific Notation

XQuery 3.1 MUST provide support for numbers in scientific notation.

green status Status: this requirement has been met.

2.4.2 Type Aliases

XQuery 3.1 MAY support aliases for types.

red status Status: this requirement has not been met.

2.4.3 Invoking XSLT Transformations

XQuery 3.1 MUST provide a means to invoke XSLT transformations.

red status Status: this requirement has not been met.

2.4.4 Collations

XQuery 3.1 MAY provide a standard mechanism for referring to collations.

green status Status: this requirement has been met.

3 Use Cases

The solutions provided for the following Use Cases are written in the following languages: XQuery 3.0, XQuery 3.0 with XSLT Maps, XQuery Full Text 3.0 with XSLT Maps, XSLT 3.0, and JSONiq.

Note:

None of these solutions are in the XQuery 3.1 language. These solutions are shown in languages we used to investigate the requirements for XQuery 3.1. The next publication of these use cases will replace the current set of solutions with XQuery 3.1 solutions.

3.1 Streaming

In a streaming application you only get one chance to look at each piece of data in the source file. Therefore, if the output is not a pure event-to-event function of the input, you have to selectively remember some of the things you have seen in the input for use later. This sometimes creates a need for data structures to hold working data in memory. This is an important motivating use case for maps in XSLT. Some of the motivating examples for XSLT can be solved in other ways in XQuery; because XQuery does not have a streaming facility, it's unclear whether maps would be the best solution for these examples in a streaming XQuery processor.

Note:

This is solved in XSLT 3.0 using the streaming facility.

3.1.1 Simple Grouping

Find the highest earning employee in each department.

3.1.1.1 Solution in XQuery 3.0
for $e in doc("employees.xml")/employees/employee,
    $d in $e/department
group by $d
return
   <department name="{$d}">
     {
       let $max := max($e/salary)
       return $e[salary=$max]
     }
   </department>
3.1.1.2 Solution in XQuery 3.0 with XSLT Maps
declare function local:search-employees(
  $employees as element(employee)*,
  $highest-earners as map(xs:string, element(employee))
)
{
  if(empty($employees)) then $highest-earners else

  let $this := head($employees)
  let $existing := $highest-earners($this/department)
  let $new-earners :=
    if ($existing/salary gt $this/salary) then $highest-earners
    else map:new(($highest-earners, map:entry($this/department, $this)))
  return local:search-employees(tail($employees), $new-earners)
};

let $highest-earners := local:search-employees(doc("employees.xml")/*/employee, map:new())
for $department in map:keys($highest-earners)
return
  <department name="{$department}">{ $highest-earners($department) }</department>
3.1.1.3 Solution in XSLT 3.0
<xsl:stream href="employees.xml">
  <xsl:iterate select="*/employee">
    <xsl:param name="highest-earners"
               as="map(xs:string, element(employee))"
               select="map:new()"/>
    <xsl:variable name="this" select="copy-of(.)" as="element(employee)"/>
    <xsl:next-iteration>
      <xsl:with-param name="highest-earners"
                      select="let $existing := $highest-earners($this/department)
                              return if ($existing/salary gt $this/salary)
                                then $highest-earners
                                else map:new(($highest-earners,
                                  map:entry($this/department, $this)))"/>
    </xsl:next-iteration>
    <xsl:on-completion>
      <xsl:for-each select="map:keys($highest-earners)">
        <department name="{.}">
          <xsl:copy-of select="$highest-earners(.)"/>
        </department>
      </xsl:for-each>
    </xsl:on-completion>
  </xsl:iterate>
</xsl:stream>

3.1.2 Simultaneous Grouping

Find both the highest earning employee in each department and the total number of employees by job type across all departments.

3.1.2.1 Solution in XQuery 3.0
for $employee in doc("employees.xml")/*/employee
let $salary := $employee/salary
group by $department := $employee/department
let $max-salary := max($salary)
let $highest-earners := $employee[salary = $max-salary]
return
   <department name="{$department}">{ $highest-earners }</department>,

for $employee in doc("employees.xml")/*/employee
let $salary := $employee/salary
group by $job-type := $employee/job-type
let $totals := count($employee)
return
   <total-by-job-type type="{$job-type}">{ $totals }</total-by-job-type>

          
3.1.2.2 Solution in XQuery 3.0 with XSLT Maps
declare function local:search-employees(
  $employees as element(employee)*,
  $highest-earners as map(xs:string, element(employee)),
  $totals as map(xs:string, xs:double)
)
{
  if(empty($employees)) then ($highest-earners, $totals) else

  let $this := head($employees)
  let $existing := $highest-earners($this/department)
  let $new-earners :=
    if ($existing/salary gt $this/salary) then $highest-earners
    else map:new(($highest-earners, map:entry($this/department, $this)))
  let $job-type := $this/job-type/string()
  let $new-totals := map:new(($totals, map { $job-type := ($totals($job-type), 0)[1] + 1 }))
  return local:search-employees(tail($employees), $new-earners, $new-totals)
};

let $results := local:search-employees(doc("employees.xml")/*/employee, map:new(), map:new())
let $highest-earners := $results[1]
let $totals := $results[2]
return (
  for $department in map:keys($highest-earners)
  return
    <department name="{$department}">{ $highest-earners($department) }</department>,
  for $job-type in map:keys($totals)
  return
    <total-by-job-type type="{$job-type}">{ $totals($job-type) }</total-by-job-type>
)

3.1.3 Word Count by Lemma

Calculate the word count by lemma of the verbs in the following document.

3.1.3.1 Input Data

The XML document, gnt.xml.

<gnt>
<s>
 <w pos="PP">I</w>
 <w pos="V" lemma="go">go</w>
 <pu>.</pu>
</s>
<s>
 <w pos="PP">She</w>
 <w pos="V" lemma="go">went</w>
 <pu>.</pu>
</s>
<s>
 <w pos="PP">He</w>
 <w pos="V" lemma="go">goes</w>
 <pu>.</pu>
</s>
<s>
 <w pos="PP">I</w>
 <w pos="V" lemma="see">see</w>
 <pu>.</pu>
</s>
<s>
 <w pos="PP">She</w>
 <w pos="V" lemma="see">sees</w>
 <pu>.</pu>
</s>
<s>
 <w pos="PP">I</w>
 <w pos="V" lemma="have">have</w>
 <pu>.</pu>
</s>
<s>
 <w pos="PP">She</w>
 <w pos="V" lemma="have">has</w>
 <pu>.</pu>
</s>
</gnt>
3.1.3.2 Result
<verb lemma="go" count="3"/>
<verb lemma="see" count="2"/>
<verb lemma="have" count="2"/>
3.1.3.3 Solution in XQuery 3.0 with XSLT Maps:
declare function local:word-count($words, $result)
{
  if(empty($words)) then $result else

  let $word := head($words)
  return local:word-count(tail($words),
    map:new(($result, map { $word/@lemma := ($result($word/@lemma), 0)[1] + 1 })))
};

let $counts := local:word-count(doc("gnt.xml")//w[m:is-verb(.)], map{})
for $lemma in map:keys($counts)
let $count := $counts($lemma)
order by $count descending
return
  <verb lemma="{ $lemma }" count="{ $count }"/>
3.1.3.4 Alternative Solution in XQuery 3.0 with XSLT Maps:
let $counts := fold-left(doc("gnt.xml")//w[m:is-verb(.)], map{},
  function($map, $word) {
    map:new(($map, map { $word/@lemma := ($map($word/@lemma), 0)[1] + 1 }))
  })
for $lemma in map:keys($counts)
let $count := $counts($lemma)
order by $count descending
return
  <verb lemma="{ $lemma }" count="{ $count }"/>
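
For comparison, the fold-based solution above might look as follows in XQuery 3.1. This is a non-normative sketch that assumes the `map { key : value }` constructor and `map:put` as later published in the XQuery 3.1 Recommendation. To keep it self-contained, it uses a shortened inline copy of gnt.xml and replaces the undefined `m:is-verb` helper with the test `@pos = "V"`; both are assumptions made for illustration.

```xquery
xquery version "3.1";
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Shortened inline stand-in for gnt.xml. :)
declare variable $gnt :=
  <gnt>
    <s><w pos="PP">I</w><w pos="V" lemma="go">go</w></s>
    <s><w pos="PP">She</w><w pos="V" lemma="go">went</w></s>
    <s><w pos="PP">I</w><w pos="V" lemma="see">see</w></s>
  </gnt>;

(: Fold over the verbs, carrying a map from lemma to running count. :)
let $counts :=
  fold-left($gnt//w[@pos = "V"], map { },
    function($map, $word) {
      let $lemma := string($word/@lemma)
      return map:put($map, $lemma, ($map($lemma), 0)[1] + 1)
    })
for $lemma in map:keys($counts)
let $count := $counts($lemma)
order by $count descending
return <verb lemma="{ $lemma }" count="{ $count }"/>
```

With this shortened input the query returns `<verb lemma="go" count="2"/>` followed by `<verb lemma="see" count="1"/>`.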
3.1.3.5 Solution Using Grouping in XQuery 3.0:

A solution just using grouping, without maps.

for $word in doc("gnt.xml")//w
let $lemma := $word/@lemma
where m:is-verb($word)
group by $lemma
order by count($word) descending
return
  <verb lemma="{ $lemma }" count="{count($word)}" />
3.1.3.6 Solution in XSLT 3.0:
<xsl:iterate select="doc('gnt.xml')//w[m:is-verb(.)]">
  <xsl:param name="result" select="map{}"/>
  <xsl:next-iteration>
    <xsl:with-param name="result"
      select="map:new(($result, map { @lemma := ($result(@lemma), 0)[1] + 1 }))"/>
  </xsl:next-iteration>
  <xsl:on-completion>
    <xsl:for-each select="map:keys($result)">
      <xsl:sort select="$result(.)" order="descending"/>
      <verb lemma="{ . }" count="{ $result(.) }"/>
    </xsl:for-each>
  </xsl:on-completion>
</xsl:iterate>

3.2 Compound Values

3.2.1 Complex Number Library

Implement a complex number library for XQuery or XSLT 3.0. Each complex number should be represented as a single item, so that complex numbers can be manipulated like regular numbers, for example by returning sequences of them.

3.2.1.1 Solution in XQuery 3.0 with XSLT Maps:
declare function i:complex(
  $real as xs:double,
  $imaginary as xs:double
) as map(xs:boolean, xs:double)
{
  map{ true() := $real, false() := $imaginary }
};

declare function i:real(
  $complex as map(xs:boolean, xs:double)
) as xs:double
{
  $complex(true())
};

declare function i:imaginary(
  $complex as map(xs:boolean, xs:double)
) as xs:double
{
  $complex(false())
};

declare function i:add(
  $arg1 as map(xs:boolean, xs:double),
  $arg2 as map(xs:boolean, xs:double)
) as map(xs:boolean, xs:double)
{
  i:complex(i:real($arg1)+i:real($arg2),
    i:imaginary($arg1)+i:imaginary($arg2))
};

declare function i:multiply(
  $arg1 as map(xs:boolean, xs:double),
  $arg2 as map(xs:boolean, xs:double)
) as map(xs:boolean, xs:double)
{
  i:complex(
    i:real($arg1)*i:real($arg2) - i:imaginary($arg1)*i:imaginary($arg2),
    i:real($arg1)*i:imaginary($arg2) + i:imaginary($arg1)*i:real($arg2))
};
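
A non-normative sketch of the same idea in XQuery 3.1 follows. It assumes the `map { key : value }` constructor and the `?` lookup operator as later published in the XQuery 3.1 Recommendation, and it uses the string keys `"re"` and `"im"` instead of boolean keys purely for readability. The namespace URI is hypothetical.

```xquery
xquery version "3.1";
declare namespace i = "http://example.com/complex";  (: hypothetical namespace :)

(: A complex number is a single item: a map with keys "re" and "im". :)
declare function i:complex(
  $re as xs:double,
  $im as xs:double
) as map(xs:string, xs:double)
{
  map { "re": $re, "im": $im }
};

declare function i:multiply(
  $a as map(xs:string, xs:double),
  $b as map(xs:string, xs:double)
) as map(xs:string, xs:double)
{
  i:complex($a?re * $b?re - $a?im * $b?im,
            $a?re * $b?im + $a?im * $b?re)
};

(: (1 + 2i) * (3 + 4i) = -5 + 10i :)
let $z := i:multiply(i:complex(1, 2), i:complex(3, 4))
return ($z?re, $z?im)
```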
3.2.1.2 Solution in XSLT 3.0 (using type-alias proposal, still in discussion):
<xsl:type-alias name="i:complex" as="map(xs:boolean, xs:double)"/>

<xsl:function name="i:complex" as="i:complex">
<xsl:param name="real" as="xs:double"/>
<xsl:param name="imaginary" as="xs:double"/>
<xsl:sequence select="map{ true() := $real, false() := $imaginary }"/>
</xsl:function>

<xsl:function name="i:real" as="xs:double">
<xsl:param name="complex" as="i:complex"/>
<xsl:sequence select="$complex(true())"/>
</xsl:function>

<xsl:function name="i:imaginary" as="xs:double">
<xsl:param name="complex" as="i:complex"/>
<xsl:sequence select="$complex(false())"/>
</xsl:function>

<xsl:function name="i:add" as="i:complex">
<xsl:param name="arg1" as="i:complex"/>
<xsl:param name="arg2" as="i:complex"/>
<xsl:sequence select="i:complex(i:real($arg1)+i:real($arg2),
  i:imaginary($arg1)+i:imaginary($arg2))"/>
</xsl:function>

<xsl:function name="i:multiply" as="i:complex">
<xsl:param name="arg1" as="i:complex"/>
<xsl:param name="arg2" as="i:complex"/>
<xsl:sequence select="i:complex(
       i:real($arg1)*i:real($arg2) - i:imaginary($arg1)*i:imaginary($arg2),
       i:real($arg1)*i:imaginary($arg2) + i:imaginary($arg1)*i:real($arg2))"/>
</xsl:function>

3.3 Manual Indexing

Build an index to manually optimize retrieval of books in a catalog by their ISBN number.

3.3.1 Simple Manual Join

Construct a list of all authors, and the books they have written.

3.3.1.1 Input Data

Book elements of the form:

<book>
<isbn>0470192747</isbn>
<publisher>Wiley</publisher>
<title>XSLT 2.0 and XPath 2.0 Programmer's Reference</title>
</book>

Author elements of the form:

<author>
<name>Michael H. Kay</name>
<isbn>0470192747</isbn>
<isbn>...</isbn>
</author>
3.3.1.2 Solution in XQuery 3.0 with XSLT Maps:
declare variable $index := map:new(//book ! map{isbn := .});

<table>{
  for $a in //author
  return <tr>
    <td>{ $a/name/string() }</td>
    <td>{ string-join($a/isbn ! $index(.)/title/string(), ", ") }</td>
  </tr>
}</table>
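
The same index can be sketched in XQuery 3.1. This non-normative example assumes the `map { key : value }` constructor and `map:merge` as later published in the XQuery 3.1 Recommendation. The `catalog` wrapper element and the inline data are assumptions, added so that the sketch is self-contained.

```xquery
xquery version "3.1";
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Inline stand-in for the book and author data; the wrapper is hypothetical. :)
declare variable $catalog :=
  <catalog>
    <book>
      <isbn>0470192747</isbn>
      <title>XSLT 2.0 and XPath 2.0 Programmer's Reference</title>
    </book>
    <author>
      <name>Michael H. Kay</name>
      <isbn>0470192747</isbn>
    </author>
  </catalog>;

(: Build the index once: one single-entry map per book, merged into one map. :)
declare variable $index :=
  map:merge($catalog/book ! map { string(isbn): . });

<table>{
  for $a in $catalog/author
  return
    <tr>
      <td>{ $a/name/string() }</td>
      <td>{ string-join($a/isbn ! $index(string(.))/title/string(), ", ") }</td>
    </tr>
}</table>
```

Each author's ISBNs are then resolved by a map lookup instead of a scan over all book elements.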
3.3.1.3 Solution in XSLT 3.0:

XSLT has the xsl:key functionality, which is preferable. However, a straightforward translation of the XQuery solution follows:

<xsl:variable name="index" select="map:new(//book ! map{isbn := .})"/>

<table>
  <xsl:for-each select="//author">
    <tr>
      <td><xsl:value-of select="name"/></td>
      <td><xsl:value-of select="string-join(isbn ! $index(.)/title/string(), ', ')"/></td>
    </tr>
  </xsl:for-each>
</table>

3.4 Interface / Implementation Pattern

As in JavaScript, a map whose keys are strings and whose associated values are function items can be used in a similar way to a class in object-oriented programming languages.

3.4.1 Data Variety

Suppose an application needs to handle customer order information that may arrive in three different formats, each with a different hierarchical arrangement.

An application can isolate itself from these differences by defining a set of functions to navigate the relationships between customers, orders, and products: orders-for-customer, orders-for-product, customer-for-order, product-for-order. These functions can be implemented in different ways for the three different input formats.

3.4.1.1 Input Data

Flat structure:

<customer id="c123">...</customer>
<product id="p789">...</product>
<order customer="c123" product="p789">...</order>

Orders within customer elements:

<customer id="c123">
<order product="p789">...</order>
</customer>
<product id="p789">...</product>

Orders within product elements:

<customer id="c123">...</customer>
<product id="p789">
<order customer="c123">...</order>
</product>
3.4.1.2 Solution in XQuery 3.0 with XSLT Maps:

For example, with the first format the implementation might be:

let $flat-input-functions as map(xs:string, function(*))* :=
  map {
    'orders-for-customer' := function($c as element(customer)) as element(order)*
      { $c/../order[@customer=$c/@id] },
    'orders-for-product' := function($p as element(product)) as element(order)*
      { $p/../order[@product=$p/@id] },
    'customer-for-order' := function($o as element(order)) as element(customer)
      { $o/../customer[@id=$o/@customer] },
    'product-for-order' := function($o as element(order)) as element(product)
      { $o/../product[@id=$o/@product] }
  }
return $flat-input-functions
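
To show how a caller might use such a map, the following non-normative sketch assumes XQuery 3.1 syntax as later published in the Recommendation. The `data` wrapper element, the variable names, and the sample records are hypothetical, and only two of the four functions are shown.

```xquery
xquery version "3.1";

(: Inline stand-in for the flat input format; the wrapper is hypothetical. :)
declare variable $data :=
  <data>
    <customer id="c123"/>
    <product id="p789"/>
    <order customer="c123" product="p789"/>
  </data>;

declare variable $flat as map(xs:string, function(*)) := map {
  "orders-for-customer":
    function($c as element(customer)) as element(order)* {
      $c/../order[@customer = $c/@id]
    },
  "customer-for-order":
    function($o as element(order)) as element(customer) {
      $o/../customer[@id = $o/@customer]
    }
};

(: The caller depends only on the map's keys, not on the document layout. :)
let $orders := $flat("orders-for-customer")($data/customer)
return count($orders)   (: 1 :)
```

Supporting another input format would mean supplying a different map with the same keys, leaving the calling code unchanged.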
3.4.1.3 Solution in XSLT 3.0:
<xsl:variable name="flat-input-functions" as="map(xs:string, function(*))*"
   select="map {
             'orders-for-customer' :=
                  function($c as element(customer)) as element(order)*
                     {$c/../order[@customer=$c/@id]},
             'orders-for-product' :=
                  function($p as element(product)) as element(order)*
                     {$p/../order[@product=$p/@id]},
             'customer-for-order' :=
                  function($o as element(order)) as element(customer)
                     {$o/../customer[@id=$o/@customer]},
             'product-for-order' :=
                  function($o as element(order)) as element(product)
                     {$o/../product[@id=$o/@product]} }
          "/>

3.4.2 Search and Snippeting

Create a general interface that takes as input some words, does a full-text search for them, and returns snippets of the top 10 results, ordered by score, where the nodes to search, their structure, how to construct snippets and how to score them differ for different data sets.

3.4.2.1 Solution in XQuery Full Text 3.0 with XSLT Maps:

Create a template method and use a map of functions to define the implementation of the plug-in points.

(: General interface module :)

module namespace this="http://example.com/search-interface/";

declare function this:search(
    $words as xs:string*, $collection as map(xs:string, function(*)) )
{
    (for $d in $collection('select')()[. contains text {$words} any word]
     order by $collection('score')($d, $words) descending
     return $collection('snippet')($d, $words))[position()<=10]
};

(: Specific implementation example :)

import module namespace s="http://example.com/search-interface/";

declare variable $twitter as map(xs:string, function(*)) :=
    map {
      'select' := function() as node()*
          { collection("twitter") },
      'score' := function($n as node(), $words as xs:string*) as xs:double
          { let score $s1 := $n contains text {$words} any word
            let score $s2 := $n contains text {$words} all words
            return $s1 + $s2
          },
      'snippet' := function($node as node(), $words as xs:string*) as node()
          { $node }
    };

declare variable $blog as map(xs:string, function(*)) :=
    map {
      'select' := function() as node()*
          { collection("blogs")/body },
      'score' := function($n as node(), $words as xs:string*) as xs:double
          {
            let $s1 :=
              avg(
                for $p score $s in $n/para[. contains text {$words} any word]
                return $s)
            let $s2 :=
              avg(
                for $p score $s in
                  $n/comment[. contains text {$words} any word weight {0.5}]
                return $s)
            let score $s3 := $n/title contains text {$words} any word weight {5.0}
            return $s1 + $s2 + $s3
          },
      'snippet' := function($node as node(), $words as xs:string*) as node()
          { <result>{$node/title, $node/para[1], $node/comment[1]}</result> }
    };

declare variable $books as map(xs:string, function(*)) :=
    map {
      'select' := function() as node()*
          { collection()//chapter },
      'score' := function($n as node(), $words as xs:string*) as xs:double
          { let score $s1 := $n contains text {$words} any word
            let score $s2 := $n/title contains text {$words} any word weight {5.0}
            return $s1 + $s2
          },
      'snippet' := function($node as node(), $words as xs:string*) as node()
          { <result>{$node/title,
            ((for $p score $s in $node/p[. contains text {$words} all words]
              order by $s
              return $p),
             (for $p score $s in $node/p[. contains text {$words} any word]
              order by $s
              return $p))[1]
            }</result> }
    };

(: Get top 10 from various sources :)
s:search(("fire","earthquake"),$books),
s:search(("fire","earthquake"),$twitter),
s:search(("fire","earthquake"),$blog)

3.4.3 Abstracting Document Structure

Provide access to various pieces of metadata to an application, insulating the application code from variations in document structure.

3.4.3.1 Solution in XQuery 3.0 with XSLT Maps:

Define the metadata interface through a map of functions.

(: Specific implementations :)
declare namespace xh="http://www.w3.org/1999/xhtml";
declare variable $xhtml as map(xs:string, function(*)) :=
    map {
      'title' := function($n as document-node()) as xs:string?
          { $n/xh:html/xh:head/xh:title },
      'author' := function($n as document-node()) as xs:string?
          { $n/xh:html/xh:head/xh:meta[@name='author']/@content },
      'pubdate' := function($n as document-node()) as xs:string?
          { $n/xh:html/xh:head/xh:meta[@name='created']/@content },
      'publisher' := function($n as document-node()) as xs:string?
          { () }
    };

declare variable $medline-citation as map(xs:string, function(*)) :=
    map {
      'title' := function($n as document-node()) as xs:string?
          { $n/MedlineCitation/Article/ArticleTitle },
      'author' := function($n as document-node()) as xs:string?
          {
            string-join(
              for $a in $n/MedlineCitation//Author return
              concat($a/LastName, ", ", $a/ForeName), "; ")
          },
      'pubdate' := function($n as document-node()) as xs:string?
          {
             let $d := $n/MedlineCitation/Article/PubDate
             return string-join(($d/Day,$d/Month,$d/Year), " ")
          },
      'publisher' := function($n as document-node()) as xs:string?
          {  $n/MedlineCitation/MedlineJournalInfo/MedlineTA }
    };
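
The pattern is not specific to XQuery. As an illustration only (it is not part of the proposal), the following Python sketch models a document format as a dictionary of accessor functions, so that calling code never navigates the document structure itself. The element names follow the MEDLINE example above; the sample document, its values, and the describe() helper are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, shaped like the MEDLINE example above.
SAMPLE = """
<MedlineCitation>
  <Article>
    <ArticleTitle>A Study of Maps</ArticleTitle>
    <PubDate><Day>24</Day><Month>Apr</Month><Year>2014</Year></PubDate>
  </Article>
  <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
  <Author><LastName>Roe</LastName><ForeName>Rick</ForeName></Author>
</MedlineCitation>
"""

# One dictionary of accessor functions per document format.
medline_citation = {
    "title": lambda root: root.findtext("Article/ArticleTitle"),
    "author": lambda root: "; ".join(
        f"{a.findtext('LastName')}, {a.findtext('ForeName')}"
        for a in root.iter("Author")
    ),
    "pubdate": lambda root: " ".join(
        root.findtext(f"Article/PubDate/{part}", default="")
        for part in ("Day", "Month", "Year")
    ),
}


def describe(root, accessors):
    """Collect all metadata through the accessor map, whatever the format."""
    return {name: get(root) for name, get in accessors.items()}


metadata = describe(ET.fromstring(SAMPLE), medline_citation)
```

Supporting another format would mean supplying another dictionary with the same keys; describe() itself would not change.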

3.5 Parameter Passing

Often library functions may have a large number of optional arguments, which are awkward or impossible to provide using the existing mechanism of variable arity functions.

3.5.1 XSLT Stylesheet Parameters

Pass the list of parameter names and values to the xdmp:xslt-invoke() function, which invokes an XSLT stylesheet.

3.5.1.1 Solution in XQuery 3.0 with XSLT Maps:
declare function xdmp:xslt-invoke($path as xs:string, $input as node(),
  $params as map(xs:QName, item()*)) as document-node()* external;

xdmp:xslt-invoke("my-stylesheet.xsl", doc("my-doc.xml"), map {
  xs:QName("toc") := true(),
  xs:QName("index") := doc("index_terms.xml")
})

3.5.2 Function Options

Provide a mechanism to supply (otherwise defaulted) option values to the fn:doc() function, which control aspects of its behaviour, including:

  • Parsing of external entities

  • DTD validation

  • XML Schema validation

  • Lax (XML Schema) validation

  • Whitespace stripping

  • URI resolution

Using maps in this scenario brings benefits over using XML structure, including:

  • Nodes are not copied; identity is retained

  • Atomic items are not serialized, and retain their specific type

  • Functions can be passed in as options - the relevant example in this case being the URI resolver.
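
The last point can be illustrated outside XQuery. The following Python sketch is an illustration only: the option names mirror those used in this use case, while the DEFAULTS table, the load_options() helper, and the example base URI are assumptions. It shows an options map that is merged over defaults and that carries a function value (the URI resolver) without any serialization.

```python
from urllib.parse import urljoin

# Assumed defaults; a real fn:doc() implementation would define its own.
DEFAULTS = {
    "schema-validation": False,
    "lax-validation": False,
    "strip-whitespace": False,
    "uri-resolver": lambda uri: uri,  # identity resolver by default
}


def load_options(options=None):
    """Merge caller-supplied options over the defaults, rejecting unknown keys."""
    options = options or {}
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**DEFAULTS, **options}


# The resolver is passed as a function value, not as a serialized name.
opts = load_options({
    "strip-whitespace": True,
    "uri-resolver": lambda uri: urljoin("http://example.org/base/", uri),
})
resolved = opts["uri-resolver"]("../relative-uri.xml")
```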

3.5.2.1 Solution in XQuery 3.0 with XSLT Maps:
declare function fn:doc($uri as xs:string, $options as map(xs:string, item()*)) as document-node()? external;

(: Enable lax XML Schema validation :)
doc("validate-me.xml", map {
  "schema-validation" := true(),
  "lax-validation" := true()
}),

(: Enable whitespace stripping, and a custom URI resolution :)
doc("../relative-uri.xml", map {
  "strip-whitespace" := true(),
  "uri-resolver" := resolve-uri(?, base-uri())
})
3.5.2.2 Solution in XQuery 3.0 with XSLT Maps enhanced with stronger typing:
declare function fn:doc(
  $uri as xs:string,
  $options as strong-map(
    external-entities as xs:boolean?,
    dtd-validation as xs:boolean?,
    schema-validation as xs:boolean?,
    lax-validation as xs:boolean?,
    strip-whitespace as xs:boolean?,
    uri-resolver as function(xs:string) as xs:string
  )
) as document-node()? external;

(: Enable lax XML Schema validation :)
doc("validate-me.xml", map {
  xs:QName("schema-validation") := true(),
  xs:QName("lax-validation") := true()
}),

(: Enable whitespace stripping, and a custom URI resolution :)
doc("../relative-uri.xml", map {
  xs:QName("strip-whitespace") := true(),
  xs:QName("uri-resolver") := resolve-uri(?, base-uri())
})

3.5.3 Translation

Design a language-agnostic game (here just the core), which allows a translation function or map as a parameter.

3.5.3.1 Solution in XQuery 3.0 with XSLT Maps:
declare function local:play(
  $secret-number as xs:integer,
  $guessed-number as xs:integer,
  $translator as function(xs:string) as xs:string)
{
  switch (true())
  case $guessed-number eq $secret-number
    return $translator("You won!")
  case $guessed-number lt $secret-number
    return $translator("The secret number is greater.")
  default (: $guessed-number gt $secret-number :)
    return $translator("The secret number is lower.")
};

local:play(76, 86, function($x) { $x }), (: Keep English :)

local:play(76, 86, map {
  "You won!" := "Du hast gewonnen!",
  "The secret number is greater." := "Die geheime Zahl ist groesser.",
  "The secret number is lower." := "Die geheime Zahl ist kleiner." }
),

local:play(76, 86, $automated-translator-based-on-natural-language-processing)
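
The point of this use case is that a map can stand in wherever a function from string to string is expected. A rough Python analogue, given purely as an illustration, passes either a function or a dictionary lookup as the translator. The names play and german are assumptions of this sketch.

```python
def play(secret_number, guessed_number, translate):
    """Core of the game; translate is any callable from str to str."""
    if guessed_number == secret_number:
        return translate("You won!")
    if guessed_number < secret_number:
        return translate("The secret number is greater.")
    return translate("The secret number is lower.")


german = {
    "You won!": "Du hast gewonnen!",
    "The secret number is greater.": "Die geheime Zahl ist groesser.",
    "The secret number is lower.": "Die geheime Zahl ist kleiner.",
}

english_result = play(76, 86, lambda text: text)   # keep English
german_result = play(76, 86, german.__getitem__)   # dictionary lookup used as a function
```

In Python the dictionary has to be adapted explicitly through its lookup method; in the maps proposal the map itself is a function, so no adapter is needed.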

3.5.4 Cipher Functions

Provide an encryption function which will encode some input according to a cipher that can be a codebook implemented as a map or an explicit algorithm.

3.5.4.1 Solution in XQuery 3.0 with XSLT Maps:
declare function local:encode(
  $input as xs:string,
  $cipher as function(xs:integer) as xs:integer)
{
  codepoints-to-string(string-to-codepoints($input) ! $cipher(.))
};

let $code := map {
  string-to-codepoints("a") := string-to-codepoints("z"),
  string-to-codepoints("b") := string-to-codepoints("e"),
  ...
}
return
local:encode("Message", $code),

local:encode("Message",
  function($c) { $c + 3 (: Caesar's cipher :) })
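
As an illustration only, the same design can be sketched in Python: the cipher is a function from code point to code point, and a codebook is one way of supplying it. The two codebook entries come from the example above. The fallback that leaves unmapped characters unchanged is an assumption of this sketch, not something the use case specifies.

```python
def encode(text, cipher):
    """Apply cipher (code point -> code point) to every character of text."""
    return "".join(chr(cipher(ord(ch))) for ch in text)


# Codebook cipher: only the two entries shown in the use case are known.
codebook = {ord("a"): ord("z"), ord("b"): ord("e")}


def codebook_cipher(code_point):
    # Assumption: characters missing from the codebook pass through unchanged.
    return codebook.get(code_point, code_point)


def caesar_cipher(code_point):
    # Caesar's cipher as in the use case: shift every code point by 3.
    return code_point + 3


from_codebook = encode("abc", codebook_cipher)
from_caesar = encode("Message", caesar_cipher)
```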

3.6 Natural Language Processing

Software used for natural language processing and text analytics frequently uses data structures like maps and arrays. For instance, the Python Natural Language Toolkit (NLTK) uses lists and tuples extensively. In this use case, we use a library that invokes NLTK to perform simple natural language processing, returning results in a format very similar to that used by NLTK, and perform a variety of simple tasks.

3.6.1 Input Data

In this use case, we are using the Gutenberg edition of Jane Austen's "Emma", as packaged in NLTK. To return the sentences of a text, we use the nltk:sentences() function, which returns sentences using the same data structures as NLTK.

Here are a few sentences resulting from the function call nltk:sentences('austen-emma.txt'), using arrays to represent Python's list structures:

Sentence Representation:

[
  ['I', 'must', 'put', 'on', 'a', 'few', 'ornaments', 'now', ',', 'because', 'it', 'is', 'expected', 'of', 'me', '.'],
  ['A', 'bride', ',', 'you', 'know', ',', 'must', 'appear', 'like', 'a', 'bride', ',', 'but', 'my', 'natural', 'taste', 
   'is', 'all', 'for', 'simplicity', ';', 'a', 'simple', 'style', 'of', 'dress', 'is', 'so', 'infinitely', 'preferable', 
   'to', 'finery', '.'],
  ['But', 'I', 'am', 'quite', 'in', 'the', 'minority', ',', 'I', 'believe', ';', 'few', 'people', 'seem', 'to', 'value', 
   'simplicity', 'of', 'dress', ',--', 'show', 'and', 'finery', 'are', 'every', 'thing', '.']
]
      

NLTK has multiple representations of sentences. If $s is bound to the second sentence in the above data structure, then nltk:pos-tag($s) returns the following:

Part of Speech Representation:

[['A', 'DT'], ['bride', 'NN'], [',', ','], ['you', 'PRP'], ['know', 'VBP'], [',', ','], ['must', 'MD'], 
 ['appear', 'VB'], ['like', 'IN'], ['a', 'DT'], ['bride', 'NN'], [',', ','], ['but', 'CC'], ['my', 'PRP$'], 
 ['natural', 'JJ'], ['taste', 'NN'], ['is', 'VBZ'], ['all', 'DT'], ['for', 'IN'], ['simplicity', 'NN'], [';', ':'], 
 ['a', 'DT'], ['simple', 'JJ'], ['style', 'NN'], ['of', 'IN'], ['dress', 'NN'], ['is', 'VBZ'], 
 ['so', 'RB'], ['infinitely', 'RB'], ['preferable', 'JJ'], ['to', 'TO'], ['finery', 'VB'], ['.', '.']
]
      

3.6.2 Convert Part of Speech Data to XML

If $s is bound to a part of speech representation, we can convert it to an XML format using the following query:

<s>
 {
  for $w in $s()
  return <w pos="{ $w(2) }">{ $w(1) }</w>
 }
</s>
      

Or if we prefer to use meaningful names instead of the numeric positions, we can create an index that maps between names and positions and use it as follows:

declare variable $index := { "pos" : 2, "lemma" : 1 };

<s>
 {
  for $w in $s()
  return <w pos="{ $w($index("pos")) }">{ $w($index("lemma")) }</w>
 }
</s>
      

Both queries have the same result:

<s>
  <w pos="DT">A</w>
  <w pos="NN">bride</w>
  <w pos=",">,</w>
  <w pos="PRP">you</w>
  <w pos="VBP">know</w>
  <w pos=",">,</w>
  <w pos="MD">must</w>
  <w pos="VB">appear</w>
  <w pos="IN">like</w>
  <w pos="DT">a</w>
  <w pos="NN">bride</w>
  <w pos=",">,</w>
  <w pos="CC">but</w>
  <w pos="PRP$">my</w>
  <w pos="JJ">natural</w>
  <w pos="NN">taste</w>
  <w pos="VBZ">is</w>
  <w pos="DT">all</w>
  <w pos="IN">for</w>
  <w pos="NN">simplicity</w>
  <w pos=":">;</w>
  <w pos="DT">a</w>
  <w pos="JJ">simple</w>
  <w pos="NN">style</w>
  <w pos="IN">of</w>
  <w pos="NN">dress</w>
  <w pos="VBZ">is</w>
  <w pos="RB">so</w>
  <w pos="RB">infinitely</w>
  <w pos="JJ">preferable</w>
  <w pos="TO">to</w>
  <w pos="VB">finery</w>
  <w pos=".">.</w>
</s>
      
      

3.6.3 Converting arrays to maps

If $s is bound to a sentence in part of speech representation, the following query converts it to an array of maps with meaningful property names:

[
  for $w in $s()
  return { "pos" : $w(2), "lemma" : $w(1) }
]
       

Here is the output of the above query:

[ { "pos" : "DT", "lemma" : "A" }, 
  { "pos" : "NN", "lemma" : "bride" }, 
  { "pos" : ",", "lemma" : "," }, 
  { "pos" : "PRP", "lemma" : "you" }, 
  { "pos" : "VBP", "lemma" : "know" }, 
  { "pos" : ",", "lemma" : "," }, 
  { "pos" : "MD", "lemma" : "must" }, 
  { "pos" : "VB", "lemma" : "appear" }, 
  { "pos" : "IN", "lemma" : "like" }, 
  { "pos" : "DT", "lemma" : "a" }, 
  { "pos" : "NN", "lemma" : "bride" }, 
  { "pos" : ",", "lemma" : "," }, 
  { "pos" : "CC", "lemma" : "but" }, 
  { "pos" : "PRP$", "lemma" : "my" }, 
  { "pos" : "JJ", "lemma" : "natural" }, 
  { "pos" : "NN", "lemma" : "taste" }, 
  { "pos" : "VBZ", "lemma" : "is" }, 
  { "pos" : "DT", "lemma" : "all" }, 
  { "pos" : "IN", "lemma" : "for" }, 
  { "pos" : "NN", "lemma" : "simplicity" }, 
  { "pos" : ":", "lemma" : ";" }, 
  { "pos" : "DT", "lemma" : "a" }, 
  { "pos" : "JJ", "lemma" : "simple" }, 
  { "pos" : "NN", "lemma" : "style" }, 
  { "pos" : "IN", "lemma" : "of" }, 
  { "pos" : "NN", "lemma" : "dress" }, 
  { "pos" : "VBZ", "lemma" : "is" }, 
  { "pos" : "RB", "lemma" : "so" }, 
  { "pos" : "RB", "lemma" : "infinitely" }, 
  { "pos" : "JJ", "lemma" : "preferable" }, 
  { "pos" : "TO", "lemma" : "to" }, 
  { "pos" : "VB", "lemma" : "finery" }, 
  { "pos" : ".", "lemma" : "." } 
]
       

3.6.4 Group by Part of Speech

If $s is bound to a sentence in part of speech representation, the following query groups words by part of speech, selecting parts of speech particularly illustrative of Jane Austen's writing style.

for $word in $s()
let $pos := $word(2)
let $lexeme := $word(1)
where $pos = ("JJ", "NN", "RB", "VB")
group by $pos
order by $pos
return 
  <pos name="{$pos}">
    { 
      for $l in distinct-values($lexeme)
      return <lexeme>{ $l }</lexeme>
    }
  </pos>
      

Here is the output of the above query:

<pos name="JJ">
  <lexeme>natural</lexeme>
  <lexeme>simple</lexeme>
  <lexeme>preferable</lexeme>
</pos>
<pos name="NN">
  <lexeme>bride</lexeme>
  <lexeme>taste</lexeme>
  <lexeme>simplicity</lexeme>
  <lexeme>style</lexeme>
  <lexeme>dress</lexeme>
</pos>
<pos name="RB">
  <lexeme>so</lexeme>
  <lexeme>infinitely</lexeme>
</pos>
<pos name="VB">
  <lexeme>appear</lexeme>
  <lexeme>finery</lexeme>
</pos>
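
For comparison with the data structures NLTK itself uses, the grouping can be sketched directly in Python. This is an illustration only. The compact word/TAG notation and the variable names are assumptions of the sketch; the tagged words are those of the sentence above.

```python
# The tagged sentence from this use case, written compactly as word/TAG tokens.
SENTENCE = (
    "A/DT bride/NN ,/, you/PRP know/VBP ,/, must/MD appear/VB like/IN a/DT "
    "bride/NN ,/, but/CC my/PRP$ natural/JJ taste/NN is/VBZ all/DT for/IN "
    "simplicity/NN ;/: a/DT simple/JJ style/NN of/IN dress/NN is/VBZ so/RB "
    "infinitely/RB preferable/JJ to/TO finery/VB ./."
)
tagged = [token.rsplit("/", 1) for token in SENTENCE.split()]

wanted = ("JJ", "NN", "RB", "VB")
groups = {}
for word, pos in tagged:
    if pos in wanted:
        lexemes = groups.setdefault(pos, [])
        if word not in lexemes:        # keep distinct values, in first-seen order
            lexemes.append(word)

by_pos = dict(sorted(groups.items()))  # order the groups by part of speech
```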
      

3.6.5 Trigrams

In corpus linguistics, n-grams are the basis for certain statistical techniques used to explore and compare texts; for instance, they are used to determine authorship of texts. If $s is bound to a sentence in part of speech representation, the following query drops punctuation and computes the trigrams of the remaining words:

declare function local:words-only($s)
{
  for $w in $s
  where not($w(2) = (".", ",", ";", ":"))
  return $w(1)
};

for sliding window $w in local:words-only($s())
    start at $i when true()
    only end at $j when $j - $i eq 2
return [ $w ]

Here is the result for a sentence used in an earlier example:

[ "A", "bride", "you" ], 
[ "bride", "you", "know" ], 
[ "you", "know", "must" ], 
[ "know", "must", "appear" ], 
[ "must", "appear", "like" ], 
[ "appear", "like", "a" ], 
[ "like", "a", "bride" ], 
[ "a", "bride", "but" ], 
[ "bride", "but", "my" ], 
[ "but", "my", "natural" ], 
[ "my", "natural", "taste" ], 
[ "natural", "taste", "is" ], 
[ "taste", "is", "all" ], 
[ "is", "all", "for" ], 
[ "all", "for", "simplicity" ], 
[ "for", "simplicity", "a" ], 
[ "simplicity", "a", "simple" ], 
[ "a", "simple", "style" ], 
[ "simple", "style", "of" ], 
[ "style", "of", "dress" ], 
[ "of", "dress", "is" ], 
[ "dress", "is", "so" ], 
[ "is", "so", "infinitely" ], 
[ "so", "infinitely", "preferable" ], 
[ "infinitely", "preferable", "to" ], 
[ "preferable", "to", "finery" ]
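
The sliding window can be sketched in Python as a slice that advances one word at a time. This is an illustration only; it uses just the first seven tokens of the sentence, and the word/TAG notation is an assumption of the sketch.

```python
# First seven tokens of the tagged sentence, as word/TAG.
tagged = [t.rsplit("/", 1) for t in "A/DT bride/NN ,/, you/PRP know/VBP ,/, must/MD".split()]

# Drop punctuation by looking at the tag, as local:words-only() does.
punctuation_tags = (".", ",", ";", ":")
words = [word for word, pos in tagged if pos not in punctuation_tags]

# Every run of three consecutive words is one trigram.
trigrams = [words[i:i + 3] for i in range(len(words) - 2)]
```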
          

3.6.6 Partitioning using filters

Filters can be used to partition the words of a sentence in a variety of ways. In this simple example, we use filters to distinguish verbs from other parts of speech. In NLTK, parse codes that start with the string VB denote verb forms.

In this example, the variable $s is bound to a sentence in parsed format, e.g.

[
 ['A', 'DT'], ['bride', 'NN'], [',', ','], ['you', 'PRP'], ['know', 'VBP'], [',', ','], ['must', 'MD'], 
 ['appear', 'VB'], ['like', 'IN'], ['a', 'DT'], ['bride', 'NN'], [',', ','], ['but', 'CC'], ['my', 'PRP$'], 
 ['natural', 'JJ'], ['taste', 'NN'], ['is', 'VBZ'], ['all', 'DT'], ['for', 'IN'], ['simplicity', 'NN'], [';', ':'], 
 ['a', 'DT'], ['simple', 'JJ'], ['style', 'NN'], ['of', 'IN'], ['dress', 'NN'], ['is', 'VBZ'], 
 ['so', 'RB'], ['infinitely', 'RB'], ['preferable', 'JJ'], ['to', 'TO'], ['finery', 'VB'], ['.', '.']
]

The filter function takes a sequence and a boolean function, and returns one array with those items that satisfy the function, and a second array with those items that do not.

declare function local:filter($s as item()*, $p as function(item()) as xs:boolean)
{
  [ $s[$p(.)] ],   [ $s[not($p(.))] ]
};
        

We can call it with the starts-with() function to partition a sentence.

let $f := function($a) { starts-with($a(2), "VB") }
return
  local:filter($s(), $f)
       

Here is the output of the query for the sentence shown above.

[ [ "know", "VBP" ], [ "appear", "VB" ], [ "is", "VBZ" ], [ "is", "VBZ" ], 
[ "finery", "VB" ] ],

[ [ "A", "DT" ], [ "bride", "NN" ], [ ",", "," ], [ "you", "PRP" ], 
  [ ",", "," ], [ "must", "MD" ], [ "like", "IN" ], [ "a", "DT" ], 
  [ "bride", "NN" ], [ ",", "," ], [ "but", "CC" ], [ "my", "PRP$" ], 
  [ "natural", "JJ" ], [ "taste", "NN" ], [ "all", "DT" ], [ "for", "IN" ], 
  [ "simplicity", "NN" ], [ ";", ":" ], [ "a", "DT" ], [ "simple", "JJ" ], 
  [ "style", "NN" ], [ "of", "IN" ], [ "dress", "NN" ], [ "so", "RB" ], 
  [ "infinitely", "RB" ], [ "preferable", "JJ" ], [ "to", "TO"], 
  [ ".", "." ] ]
       

A programmer might choose to represent filter results using a map instead of an array, as shown in the following code.

declare function local:filter($s as item()*, $p as function(item()) as xs:boolean)
{
  {
    true() : [ $s[$p(.)] ],   
    false() : [ $s[not($p(.))] ]
  }
};


let $f := function($a) { starts-with($a(2), "VB") }
return
  local:filter($s(), $f)
      

Here is the output of the above query using the same data.

{ 

  "true" : 
             [ [ "know", "VBP" ], [ "appear", "VB" ], [ "is", "VBZ" ],
               ["is", "VBZ" ], [ "finery", "VB" ] ],

  "false" :  

             [ [ "A", "DT" ], ["bride", "NN" ], [ ",", "," ], 
               [ "you", "PRP" ], [ ",", "," ], [ "must", "MD" ], 
               [ "like", "IN" ], [ "a", "DT" ], [ "bride", "NN" ], 
               [ ",", "," ], [ "but", "CC" ], [ "my", "PRP$" ], 
               [ "natural", "JJ" ], [ "taste", "NN" ], [ "all", "DT"],
               [ "for", "IN" ], [ "simplicity", "NN" ], [ ";", ":" ],
               [ "a", "DT" ], [ "simple", "JJ" ], [ "style", "NN" ], 
               [ "of", "IN" ], [ "dress", "NN" ], [ "so", "RB" ], 
               [ "infinitely", "RB" ], [ "preferable", "JJ" ], 
               [ "to", "TO" ], [ ".", "." ] ] 
}
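
A Python rendering of the map-based variant, given as an illustration only, makes the partitioning explicit. The function name partition and the eight-token sample are assumptions of this sketch.

```python
def partition(items, predicate):
    """Split items into those that satisfy predicate (True) and those that do not (False)."""
    result = {True: [], False: []}
    for item in items:
        result[bool(predicate(item))].append(item)
    return result


# First eight tokens of the tagged sentence, as word/TAG.
sample = "A/DT bride/NN ,/, you/PRP know/VBP ,/, must/MD appear/VB"
tagged = [token.rsplit("/", 1) for token in sample.split()]

# Tags that start with "VB" denote verb forms.
parts = partition(tagged, lambda pair: pair[1].startswith("VB"))
```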
      

3.7 Comparing Sequences in Optical Character Recognition

When the Rigaudon optical character recognition software is used for multilingual texts, languages are identified by character set where possible, and the output is written in hocr format. For instance, the text "the other possible derivation from ἡ ἐπιοῦσα, dies crastinus", which contains English, Greek, and Latin, might be represented as follows in raw OCR output (the format is simplified somewhat for the sake of presentation).

<span class="ocr_word" title="bbox 1388 430 1461 474">the</span> 
<span class="ocr_word" title="bbox 1514 433 1635 476">other</span>
<span class="ocr_word" title="bbox 133 498 317 554">pcssible</span> 
<span class="ocr_word" title="bbox 354 498 590 541">derivation</span> 
<span class="ocr_word" title="bbox 631 497 738 538">from</span> 
<span class="ocr_word" title="bbox 772 495 799 547" lang="grc" xml:lang="grc">ἡ</span> 
<span class="ocr_word" title="bbox 835 495 1019 538" lang="grc" xml:lang="grc">ἐπιοῦσα</span> 
<span class="ocr_word" title="bbox 134 567 220 607">dies</span> 
<span class="ocr_word" title="bbox 257 566 462 607">erastinus</span>
    

In the above output, two words were not correctly recognized: the English word "possible" and the Latin word "crastinus". Rigaudon uses multilingual spell checkers to find the nearest likely word in one of the languages likely to be used in a given text. For this particular text, we expect to find English, Greek, and Latin.

In this use case, we take the above hocr as input and call the spellcheck function, implemented as an external function, to identify which words are likely in each candidate language. Having done so, we combine the results to construct the most likely text.

The following function extracts the text from the above data.

declare function local:extract-text($spans)
{
  for $s in $spans return string($s)
};
    

Here is the output of the function for the data shown above.

"the", "other", "pcssible", "derivation", "from", "ἡ", "ἐπιοῦσα", "dies", "erastinus"
    

The following function performs a spellcheck in a set of languages, creating a map that holds the list of languages, the raw text, and the lookup results for each language.

declare variable $languages := ("English", "Greek", "Latin");

declare function local:spellcheck($languages, $text)
{
  {|
     { "languages" : $languages },
     { "raw" : $text  },

     for $l in $languages
     return { 
       $l : [
         for $w in $text
         return ext:sc($l, $w)
       ]
     }
  |}
};

let $t := local:extract-text($spans)
return local:spellcheck($languages, $t)
    

Here is the output of the above query.

{ 
   "languages" : ( "English", "Greek", "Latin" ), 
   "raw" :     [ "the", "other", "pcssible", "derivation", "from", "ἡ", "ἐπιοῦσα", "dies", "erastinus" ], 
   "English" : [ "the", "other", "possible", "derivation", "from", null, null, "dies", null ], 
   "Greek" :   [ null, null, null, null, null, "ἡ", "ἐπιοῦσα", null, null ],
   "Latin" :   [ null, null, null, null, null, null, null, "dies", "crastinus" ]
}
    

The following function merges lookup results in the above format. The first parameter lists a set of languages, in preference order. For each word, the function picks the non-null lookup result for the most preferred language available, or the original "raw" word if all lookups return null. In this code, we assume that $m is bound to the data structure shown above.

declare variable $languages := ("English", "Greek", "Latin");

declare function local:merge($languages, $m)
{
  let $size := count($m("raw")())
  for $i in 1 to $size
  let $candidates := ($languages ! $m(.)($i)[ . ne null] , $m("raw")($i))
  return $candidates[1]
};

local:merge($languages, $m)
    

Here is the result of the query:

the other possible derivation from ἡ ἐπιοῦσα dies crastinus
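
The merge logic can be checked against a small Python sketch, given as an illustration only. It assumes that the Latin lookup returns the spell checker's suggestion "crastinus" for the misrecognized final word, which is what the merged result above requires. None stands in for null.

```python
lookup = {
    "raw":     ["the", "other", "pcssible", "derivation", "from", "ἡ", "ἐπιοῦσα", "dies", "erastinus"],
    "English": ["the", "other", "possible", "derivation", "from", None, None, "dies", None],
    "Greek":   [None, None, None, None, None, "ἡ", "ἐπιοῦσα", None, None],
    "Latin":   [None, None, None, None, None, None, None, "dies", "crastinus"],
}


def merge(languages, m):
    """For each word, take the first non-null lookup in preference order, else the raw word."""
    merged = []
    for i, raw_word in enumerate(m["raw"]):
        candidates = [m[lang][i] for lang in languages if m[lang][i] is not None]
        merged.append(candidates[0] if candidates else raw_word)
    return merged


text = " ".join(merge(["English", "Greek", "Latin"], lookup))
```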

3.8 Transforms for Graphics

This use case uses rotation matrices to rotate a shape in three dimensions.

The following library implements three-dimensional rotation in XQuery:

declare function local:rotate-x( $theta )
{
   [
     [ 1, 0, 0 ],
     [ 0, cosine($theta), - sine($theta) ],
     [ 0, sine($theta), cosine($theta) ]
   ]
}; 

declare function local:rotate-y( $theta )
{
   [
     [ cosine($theta), 0, sine($theta) ],
     [ 0, 1, 0],
     [ - sine($theta), 0, cosine($theta) ]
   ]
}; 

declare function local:rotate-z( $theta )
{
   [
     [ cosine($theta), - sine($theta), 0 ],
     [ sine($theta), cosine($theta), 0 ],
     [ 0, 0, 1]
   ]
}; 

declare function local:rotate($pitch as xs:double, $yaw as xs:double, $roll as xs:double)
{
   let $p := local:rotate-x($pitch)
   let $y := local:rotate-y($yaw)
   let $r := local:rotate-z($roll)
   let $py :=local:mult($p, $y)
   return local:mult($py, $r)
};

declare function local:mult( $matrix1, $matrix2 )
{
  if (length($matrix1(1)) != length($matrix2))
  then error((), "Matrices must be m*n and n*p to multiply!")
  else [
     for $i in 1 to length($matrix1)
     return [
         for $j in 1 to length($matrix2(1))
         return
            sum (
               for $k in 1 to length($matrix2)
               return $matrix1($i)($k) * $matrix2($k)($j)
            )
     ]
  ]
};

let $rect := [[0, 0, 0], [10, 0, 0], [10, 10, 0], [0, 10, 0], [0, 0, 0]]
let $rot := for $r in $rect()
            return local:mult([ $r ], local:rotate( 10, 10, 10 ))
return img:render( $rot )
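
The arithmetic can be sanity-checked with a short Python sketch, given as an illustration only. It implements the same multiplication over nested lists and applies a single rotation about the z axis, with the angle in radians; the function names are assumptions. Because each point is a row vector multiplied on the left of the matrix, the matrix is applied in transposed form, so the rotation runs in the opposite sense to the usual column-vector convention.

```python
import math


def rotate_z(theta):
    """Rotation matrix about the z axis, theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def mult(a, b):
    """Multiply an m*n matrix by an n*p matrix, both given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("Matrices must be m*n and n*p to multiply!")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# The rectangle's corner points form a 5*3 matrix, one row per point.
rect = [[0, 0, 0], [10, 0, 0], [10, 10, 0], [0, 10, 0], [0, 0, 0]]
rotated = mult(rect, rotate_z(math.pi / 2))
```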
                        
        

3.9 JSON

JSON is becoming an important data format that many XQuery and XSLT users have to deal with. Tasks performed can include importing JSON, processing it, and exporting JSON.

3.9.1 Information Retrieval

Import a JSON document and retrieve the mobile phone number from it.

The fn:parse-json() function parses a JSON document into an XDM value as follows:

  1. A JSON object is converted into a map of type map(xs:string, item()?).

  2. A JSON array is converted into a map of type map(xs:integer, item()?).

  3. A JSON string is converted into an xs:string atomic value.

  4. A JSON number is converted into an xs:double atomic value.

  5. A JSON boolean is converted into an xs:boolean atomic value.

  6. A JSON null is converted into the empty sequence.
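
Most host languages define a comparable mapping. As an illustration only, Python's standard json module follows the same outline with two differences: arrays become lists rather than integer-keyed maps, and numbers without a fraction become integers rather than doubles. The document below is a trimmed copy of mildred.json; the "married" and "spouse" members are not in that file and are added here only to show the boolean and null cases.

```python
import json

TEXT = """
{
  "firstname": "Mildred",
  "age": 32,
  "married": false,
  "spouse": null,
  "phone": [
    { "type": "home",   "number": "01869 378073" },
    { "type": "mobile", "number": "07356 740756" }
  ]
}
"""

doc = json.loads(TEXT)

# Record the Python type chosen for each top-level member.
kinds = {key: type(value).__name__ for key, value in doc.items()}

# Retrieve the mobile phone number, as the use case asks.
mobile = [entry["number"] for entry in doc["phone"] if entry["type"] == "mobile"]
```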

3.9.1.1 Input Data

The JSON document, mildred.json:

{
     "firstname": "Mildred",
     "lastname": "Moore",
     "age": 32,
     "address":
     {
         "street": "91 High Street",
         "town": "Biscester",
         "county": "Oxfordshire",
         "postcode": "OX6 3PD"
     },
     "phone":
     [
         {
           "type": "home",
           "number": "01869 378073"
         },
         {
           "type": "mobile",
           "number": "07356 740756"
         }
     ]
}
3.9.1.2 Result
"07356 740756"
3.9.1.3 Solution in XQuery 3.0 with XSLT Maps:
let $phoneArray := parse-json(unparsed-text("mildred.json"))("phone")
for $n in map:keys($phoneArray)
let $entry := $phoneArray($n)
where $entry("type") = "mobile"
return $entry("number")
3.9.1.4 Alternative Solution in XQuery 3.0 with XSLT Maps:
declare function map:entries($map as map(*)) as map(*)*
{
  for $k in map:keys($map)
  return map { "key" := $k, "value" := $map($k) }
};

parse-json(unparsed-text("mildred.json"))
  ("phone") ! map:entries(.)[.("value")("type") = "mobile"] ! .("value")("number")
3.9.1.5 Solution in JSONiq:
let $mildred := json("mildred.json")
let $phones := values($mildred("phone"))
return $phones[.("type") = "mobile"]("number")

3.9.2 Converting JSON to XML

Convert a JSON data file to XML.

3.9.2.1 Input Data

The JSON document, employees.json:

{ "accounting" : [
      { "firstName" : "John",
        "lastName"  : "Doe",
        "age"       : 23 },

      { "firstName" : "Mary",
        "lastName"  : "Smith",
        "age"       : 32 }
                 ],
  "sales"     : [
      { "firstName" : "Sally",
        "lastName"  : "Green",
        "age"       : 27 },

      { "firstName" : "Jim",
        "lastName"  : "Galley",
        "age"       : 41 }
                  ]
}
3.9.2.2 Result
<department name="accounting">
  <employee>
    <firstName>John</firstName>
    <lastName>Doe</lastName>
    <age>23</age>
  </employee>
  <employee>
    <firstName>Mary</firstName>
    <lastName>Smith</lastName>
    <age>32</age>
  </employee>
</department>
<department name="sales">
  <employee>
    <firstName>Sally</firstName>
    <lastName>Green</lastName>
    <age>27</age>
  </employee>
  <employee>
    <firstName>Jim</firstName>
    <lastName>Galley</lastName>
    <age>41</age>
  </employee>
</department>
3.9.2.3 Solution in XQuery 3.0 with XSLT Maps:
let $input := parse-json(unparsed-text('employees.json'))
for $k in map:keys($input)
return
  <department name="{$k}">{
    let $array := $input($k)
    for $i in map:keys($array)
    let $emp := $array($i)
    return
      <employee>
        <firstName>{ $emp('firstName') }</firstName>
        <lastName>{ $emp('lastName') }</lastName>
        <age>{ $emp('age') }</age>
      </employee>
  }</department>
3.9.2.4 Solution in JSONiq:
for $dept in pairs(json("employees.json"))
return
   <department name="{ name($dept) }"> {
       for $employee in members(value($dept))
       return
         <employee>
           <firstName>{ $employee('firstName') }</firstName>
           <lastName>{ $employee('lastName') }</lastName>
           <age>{ $employee('age') }</age>
         </employee>
   }</department>
     
3.9.2.5 Solution in XSLT 3.0:
<xsl:template name="main">
  <xsl:variable name="input"
                as="map(xs:string, map(xs:string, xs:anyAtomicType)*)"
                select="parse-json(unparsed-text('employees.json'))"/>
  <xsl:for-each select="map:keys($input)">
    <department name="{.}">
      <xsl:for-each select="$input(.)">
        <employee>
          <firstName><xsl:value-of select=".('firstName')"/></firstName>
          <lastName><xsl:value-of select=".('lastName')"/></lastName>
          <age><xsl:value-of select=".('age')"/></age>
        </employee>
      </xsl:for-each>
    </department>
  </xsl:for-each>
</xsl:template>

3.9.3 Update by Copying

Update the first name of the author "Dan Suciu" to "John" in the "bookinfo.json" document.

3.9.3.1 Input Data

The JSON document, bookinfo.json:

{
    "book": {
        "title": "Data on the Web",
        "year": 2000,
        "author": [
            {
                "last": "Abiteboul",
                "first": "Serge"
            },
            {
                "last": "Buneman",
                "first": "Peter"
            },
            {
                "last": "Suciu",
                "first": "Dan"
            }
        ],
        "publisher": "Morgan Kaufmann Publishers",
        "price": 39.95
    }
}
3.9.3.2 Solution in XQuery 3.0 with XSLT Maps:
declare function local:map-transform($arg as item()*)
{
  typeswitch($arg)
  case $map as map(*) return
    map:new((
      for $k in map:keys($map)
      let $v := $map($k)
      return map { $k := local:map-transform($v) },
      if($map('last')='Suciu') then map { 'first' := "John" } else ()
    ))
  default return $arg
};

local:map-transform(parse-json(unparsed-text("bookinfo.json")))
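
Update by copying means that the input value is never modified: the transformation builds a new structure and substitutes the changed entry on the way. The following Python sketch is an illustration only. It uses a trimmed copy of bookinfo.json, and the function name transform is an assumption.

```python
def transform(value):
    """Return a deep copy of value in which the author Suciu has the first name John."""
    if isinstance(value, dict):
        copy = {key: transform(item) for key, item in value.items()}
        if copy.get("last") == "Suciu":
            copy["first"] = "John"   # override in the copy only
        return copy
    if isinstance(value, list):
        return [transform(item) for item in value]
    return value                     # atomic values are returned as they are


bookinfo = {
    "book": {
        "title": "Data on the Web",
        "author": [
            {"last": "Abiteboul", "first": "Serge"},
            {"last": "Buneman", "first": "Peter"},
            {"last": "Suciu", "first": "Dan"},
        ],
    }
}

updated = transform(bookinfo)
```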
3.9.3.3 Solution in XSLT 3.0:

Assuming a function map:entries() which returns the entries in a map as a sequence of singleton maps.

<xsl:template match="~map(*)" mode="john" as="map(*)">
  <xsl:variable name="entries" as="map(*)*">
    <xsl:apply-templates select="map:entries(.)" mode="john"/>
  </xsl:variable>
  <xsl:sequence select="map:new($entries)"/>
</xsl:template>

<xsl:template match="~map(*)[.('last')='Suciu']" mode="john">
  <xsl:sequence select="map:new((., map{'first':='John'}))"/>
</xsl:template>

3.9.4 Joins

3.9.4.1 Input Data

The following queries are based on a social media site that allows users to interact with their friends. collection("users") contains data on users and their friends:

{
  "name" : "Sarah",
  "age" : 13,
  "gender" : "female",
  "friends" : [ "Jim", "Mary", "Jennifer"]
}

{
  "name" : "Jim",
  "age" : 13,
  "gender" : "male",
  "friends" : [ "Sarah" ]
}
          
3.9.4.2 Solution in JSONiq:

The following query performs a join on Sarah's friend list to return the Object representing each of her friends:

for $sarah in collection("users"),
    $friend in collection("users")
where $sarah("name") = "Sarah"
  and values($sarah("friends")) = $friend("name")
return $friend 

The query can be simplified using a filter. In the following expression, [.("name") eq "Sarah"] is a filter that restricts the set of users to the one named "Sarah":

let $sarah := collection("users")[.("name") eq "Sarah"]
for $friend in values($sarah("friends"))
return collection("users")[.("name") eq $friend]
          
3.9.4.3 Solution in XSLT 3.0:

Solution using the XSLT maps proposal: essentially the same as the above, assuming (a) the existence of some mechanism similar to collection() to get a collection of JSON inputs and parse them using the parse-json() function, and (b) the existence of a (potentially user-written) function values() to extract the values of the map representing a JSON array. This function might be written:

<xsl:function name="values" as="item()*">
  <xsl:param name="array" as="map(xs:integer, item())"/>
  <xsl:for-each select="map:keys($array)">
    <xsl:sequence select="$array(.)"/>
  </xsl:for-each>
</xsl:function>

3.9.5 Grouping Queries for JSON

Note:

These queries are based on similar queries in the XQuery 3.0 Use Cases.

3.9.5.1 Input Data

The input is a sequence (whose order is of no concern) that contains the following sales data, represented here in JSON notation:

{ "product" : "broiler", "store number" : 1, "quantity" : 20  },
{ "product" : "toaster", "store number" : 2, "quantity" : 100 },
{ "product" : "toaster", "store number" : 2, "quantity" : 50 },
{ "product" : "toaster", "store number" : 3, "quantity" : 50 },
{ "product" : "blender", "store number" : 3, "quantity" : 100 },
{ "product" : "blender", "store number" : 3, "quantity" : 150 },
{ "product" : "socks", "store number" : 1, "quantity" : 500 },
{ "product" : "socks", "store number" : 2, "quantity" : 10 },
{ "product" : "shirt", "store number" : 3, "quantity" : 10 }

We want to group sales by product, across stores.

3.9.5.2 Result
{
  "blender" : 250,
  "broiler" : 20,
  "shirt" : 10,
  "socks" : 510,
  "toaster" : 200
}
3.9.5.3 Solution in JSONiq:

We assume a function collection("sales") that returns a sequence of items representing the rows in this table.

Query:

{
  for $sales in collection("sales")
  let $pname := $sales("product")
  group by $pname
  return $pname : sum(for $s in $sales return $s("quantity"))
}       
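
The expected totals can be verified with a few lines of Python, given as an illustration only; the sales data is the input shown for this use case.

```python
sales = [
    {"product": "broiler", "store number": 1, "quantity": 20},
    {"product": "toaster", "store number": 2, "quantity": 100},
    {"product": "toaster", "store number": 2, "quantity": 50},
    {"product": "toaster", "store number": 3, "quantity": 50},
    {"product": "blender", "store number": 3, "quantity": 100},
    {"product": "blender", "store number": 3, "quantity": 150},
    {"product": "socks", "store number": 1, "quantity": 500},
    {"product": "socks", "store number": 2, "quantity": 10},
    {"product": "shirt", "store number": 3, "quantity": 10},
]

# Group by product and sum the quantities across all stores.
totals = {}
for sale in sales:
    totals[sale["product"]] = totals.get(sale["product"], 0) + sale["quantity"]

by_product = dict(sorted(totals.items()))
```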
3.9.5.4 Solution in XSLT 3.0:

Solution using the XSLT maps proposal: assuming that collection("sales") delivers a sequence of unparsed JSON texts, and that the result is to be serialized as a JSON text:

  <xsl:variable name="entries" as="map(xs:string, xs:double)*">
    <xsl:for-each-group select="collection('sales')!parse-json(.)" group-by=".('product')">
      <xsl:sequence select="map{ current-grouping-key() := sum(current-group()!.('quantity')) }"/>
    </xsl:for-each-group>
  </xsl:variable>
  <xsl:sequence select="serialize-json(map:new($entries))"/>
  

3.9.6 More Complex Grouping Queries for JSON

Now let's do a more complex grouping query, showing sales by category within each state. We need further data to describe the categories of products and the location of stores.

3.9.6.1 Input Data

collection("products") contains the following data:

{ "name" : "broiler", "category" : "kitchen", "price" : 100, "cost" : 70 },
{ "name" : "toaster", "category" : "kitchen", "price" : 30, "cost" : 10 },
{ "name" : "blender", "category" : "kitchen", "price" : 50, "cost" : 25 },
{ "name" : "socks", "category" : "clothes", "price" : 5, "cost" : 2 },
{ "name" : "shirt", "category" : "clothes", "price" : 10, "cost" : 3 }

collection("stores") contains the following data:

{ "store number" : 1, "state" : "CA" },
{ "store number" : 2, "state" : "CA" },
{ "store number" : 3, "state" : "MA" },
{ "store number" : 4, "state" : "MA" }
        
3.9.6.2 Result
            [
              { "CA" : 
                [
                  {"kitchen" : { "broiler" : 20, "toaster" : 150 }},
                  {"clothes" : { "socks" : 510 }}
                ]
              },
              { "MA" : 
                [ 
                  { "kitchen" : { "blender" : 250, "toaster" : 50 }},
                  { "clothes" : { "shirt" : 10 }}
                ]
              }
            ]
        
3.9.6.3 Solution in JSONiq:

The following query groups by state, then by category, then lists individual products and the sales associated with each.

Query:

{
  for $store in collection("stores")
  let $state := $store("state")
  group by $state
  return
     $state : {
       for $product in collection("products")
       let $category := $product("category")
       group by $category
       return
         $category : {
            for $sales in collection("sales")
            where $sales("store number") = $store("store number")
              and $sales("product") = $product("name")
            let $pname := $sales("product")
            group by $pname
            return $pname : sum( for $s in $sales return $s("quantity") )
         }
      }
}
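
The expected totals can be checked with a short Python sketch that follows the same three steps (group stores by state, group products by category, then total the matching sales). The function name sales_by_state_and_category and the in-memory lists standing in for the three collections are assumptions for illustration; the output shape follows the Result in 3.9.6.2.

```python
# Stand-ins for collection("sales"), collection("products"), collection("stores").
SALES = [
    {"product": "broiler", "store number": 1, "quantity": 20},
    {"product": "toaster", "store number": 2, "quantity": 100},
    {"product": "toaster", "store number": 2, "quantity": 50},
    {"product": "toaster", "store number": 3, "quantity": 50},
    {"product": "blender", "store number": 3, "quantity": 100},
    {"product": "blender", "store number": 3, "quantity": 150},
    {"product": "socks", "store number": 1, "quantity": 500},
    {"product": "socks", "store number": 2, "quantity": 10},
    {"product": "shirt", "store number": 3, "quantity": 10},
]
PRODUCTS = [
    {"name": "broiler", "category": "kitchen", "price": 100, "cost": 70},
    {"name": "toaster", "category": "kitchen", "price": 30, "cost": 10},
    {"name": "blender", "category": "kitchen", "price": 50, "cost": 25},
    {"name": "socks", "category": "clothes", "price": 5, "cost": 2},
    {"name": "shirt", "category": "clothes", "price": 10, "cost": 3},
]
STORES = [
    {"store number": 1, "state": "CA"},
    {"store number": 2, "state": "CA"},
    {"store number": 3, "state": "MA"},
    {"store number": 4, "state": "MA"},
]


def sales_by_state_and_category(stores, products, sales):
    """Return one {state: [{category: {product: total}}]} object per state."""
    result = []
    for state in sorted({store["state"] for store in stores}):
        store_numbers = {s["store number"] for s in stores if s["state"] == state}
        categories = []
        # dict.fromkeys keeps categories in order of first appearance.
        for category in dict.fromkeys(p["category"] for p in products):
            totals = {}
            for product in products:
                if product["category"] != category:
                    continue
                quantities = [
                    sale["quantity"]
                    for sale in sales
                    if sale["product"] == product["name"]
                    and sale["store number"] in store_numbers
                ]
                if quantities:  # omit products with no sales in this state
                    totals[product["name"]] = sum(quantities)
            categories.append({category: totals})
        result.append({state: categories})
    return result


print(sales_by_state_and_category(STORES, PRODUCTS, SALES))
```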
        
3.9.6.4 Solution in XSLT 3.0:

An equivalent XSLT solution is given below. This uses the syntax of the proposed maps facility in XSLT.

<xsl:stylesheet version="3.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="map xs">
    
    <xsl:output method="text"/>
    
    <xsl:variable name="sales" as="map(*)*" select='
        map{ "product" := "broiler", "store number" := 1, "quantity" := 20  },
        map{ "product" := "toaster", "store number" := 2, "quantity" := 100 },
        map{ "product" := "toaster", "store number" := 2, "quantity" := 50 },
        map{ "product" := "toaster", "store number" := 3, "quantity" := 50 },
        map{ "product" := "blender", "store number" := 3, "quantity" := 100 },
        map{ "product" := "blender", "store number" := 3, "quantity" := 150 },
        map{ "product" := "socks", "store number" := 1, "quantity" := 500 },
        map{ "product" := "socks", "store number" := 2, "quantity" := 10 },
        map{ "product" := "shirt", "store number" := 3, "quantity" := 10 }'/>
    
    <xsl:variable name="products" as="map(*)*" select='
        map{ "name" := "broiler", "category" := "kitchen", "price" := 100, "cost" := 70 },
        map{ "name" := "toaster", "category" := "kitchen", "price" := 30, "cost" := 10 },
        map{ "name" := "blender", "category" := "kitchen", "price" := 50, "cost" := 25 },
        map{ "name" := "socks", "category" := "clothes", "price" := 5, "cost" := 2 },
        map{ "name" := "shirt", "category" := "clothes", "price" := 10, "cost" := 3 }'/>
    
    <xsl:variable name="stores" as="map(*)*" select='
        map{ "store number" := 1, "state" := "CA" },
        map{ "store number" := 2, "state" := "CA" },
        map{ "store number" := 3, "state" := "MA" },
        map{ "store number" := 4, "state" := "MA" }'/>
    
    <xsl:template name="main">
        <xsl:variable name="state-maps" as="map(*)*">
            <xsl:for-each-group select="$stores" group-by=".('state')">
                <xsl:variable name="state" select="current-grouping-key()" 
                                           as="xs:string"/>
                <xsl:variable name="stores-in-state" select="current-group()!.('store number')" 
                                                     as="xs:integer*"/>
                <xsl:variable name="state-map-entry" as="map(*)*">
                    <xsl:for-each-group select="$products" group-by=".('category')">
                        <xsl:variable name="category" select="current-grouping-key()" as="xs:string"/>
                        <xsl:variable name="products-in-category" select="current-group()" as="map(*)*"/>
                        <xsl:variable name="totals-map" as="map(*)*">
                            <xsl:variable name="totals-map-entries" as="map(*)*">
                                <xsl:for-each select="$products-in-category">
                                   <xsl:variable name="product-name" select=".('name')"/>
                                   <xsl:variable name="product-sales" 
                                       select="$sales[.('product') = $product-name and 
                                                         .('store number') = $stores-in-state]"/>                      
                                   <xsl:if test="exists($product-sales)">                      
                                      <xsl:sequence select="map{ $product-name := 
                                                                 sum($product-sales!.('quantity')) }"/>
                                   </xsl:if>   
                                </xsl:for-each>
                            </xsl:variable>
                            <xsl:sequence select="map:new($totals-map-entries)"/>
                        </xsl:variable>
                        <xsl:sequence select="map{ $category := $totals-map }"/>
                    </xsl:for-each-group>
                </xsl:variable>    
                <xsl:sequence select=" map { $state := $state-map-entry }"/>
            </xsl:for-each-group>
        </xsl:variable>
        <xsl:value-of select="serialize-json($state-maps, map{ 'indent' := true()} )"/>
    </xsl:template>   
    
</xsl:stylesheet>

Note that this example appears to suffer badly from the lack of composability between the XPath map{} construct and the XSLT xsl:for-each-group instruction. For such use cases, an XSLT instruction to construct maps could be a better approach.

3.9.7 JSON to JSON Transformations

The following query takes satellite data, and summarizes which satellites are visible. The data for the query is a simplified version of a Stellarium file that contains this information.

3.9.7.1 Input Data
{
  "creator" : "Satellites plugin version 0.6.4",
  "satellites" : {
    "AAU CUBESAT" : {
      "tle1" : "1 27846U 03031G 10322.04074654  .00000056  00000-0  45693-4 0  8768",
      "visible" : false
    },
    "AJISAI (EGS)" : {
      "tle1" : "1 16908U 86061A 10321.84797408 -.00000083  00000-0  10000-3 0  3696",
      "visible" : true
    },
    "AKARI (ASTRO-F)" : {
      "tle1" : "1 28939U 06005A 10321.96319841  .00000176  00000-0  48808-4 0  4294",
      "visible" : true
    }
  }
}

We want to query this data to return a summary that looks like this.

3.9.7.2 Result
{
  "visible" : [
     "AJISAI (EGS)",
     "AKARI (ASTRO-F)"
  ],
  "invisible" : [
     "AAU CUBESAT"
  ]
}       
3.9.7.3 Solution in JSONiq:

The following is a JSONiq query that returns the desired result.

Query:

let $sats := json("satellites.json")("satellites")
return {
  "visible" : [
     for $sat in pairs($sats)
     where $sat("visible")
     return name($sat)
  ],
  "invisible" : [
     for $sat in pairs($sats)
     where not($sat("visible"))
     return name($sat)
  ]
  }
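
For comparison, the same partition can be sketched in Python. The function name summarize_visibility is an assumption for illustration, and the "tle1" members of the input are left out of the sample because the query does not use them.

```python
# Simplified stand-in for satellites.json ("tle1" members omitted).
DATA = {
    "creator": "Satellites plugin version 0.6.4",
    "satellites": {
        "AAU CUBESAT": {"visible": False},
        "AJISAI (EGS)": {"visible": True},
        "AKARI (ASTRO-F)": {"visible": True},
    },
}


def summarize_visibility(document):
    """Split satellite names into visible and invisible lists."""
    sats = document["satellites"]
    return {
        "visible": [name for name, sat in sats.items() if sat["visible"]],
        "invisible": [name for name, sat in sats.items() if not sat["visible"]],
    }


print(summarize_visibility(DATA))
```

Both lists are derived from the single "visible" member; the input has no separate "invisible" member.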
3.9.7.4 Solution in XSLT 3.0:

Equivalent using the XSLT maps proposal:

  <xsl:variable name="sats" select="parse-json(unparsed-text('satellites.json'))('satellites')"/>
  <xsl:sequence select="map{
     'visible' := array(map:keys($sats)[$sats(.)('visible')]),
     'invisible' := array(map:keys($sats)[not($sats(.)('visible'))])}"/>
     

This assumes the existence of a (potentially user-written) function array() that takes a sequence and turns it into a map with consecutive integer keys:

<xsl:function name="array" as="map(xs:integer, item())">
  <xsl:param name="seq" as="item()*"/>
  <xsl:sequence select="map:new(for $i in 1 to count($seq) return map{$i := $seq[$i]})"/>
</xsl:function>
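
The effect of such an array() function can be illustrated with a small Python sketch. It is only an analogy: a Python dict stands in for the map, and the keys start at 1 as in the XSLT function above.

```python
def array(seq):
    """Return a dict with consecutive integer keys 1..len(seq)."""
    return {index: item for index, item in enumerate(seq, start=1)}


print(array(["AJISAI (EGS)", "AKARI (ASTRO-F)"]))
```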

3.9.8 Converting XML to JSON

JSON programmers frequently need to convert XML to JSON. The following query is based on a Wikipedia XML export format, using data from the category "Origami". Here is an excerpt of this data:

3.9.8.1 Input Data
<mediawiki>
  <siteinfo>
    <sitename>Wikipedia</sitename>

    <page>
      <title>Kawasaki's theorem</title>
      <id>14511776</id>
      <revision>
        <id>435519187</id>
        <timestamp>2011-06-21T20:08:56Z</timestamp>
        <contributor>
          <username>Some jerk on the Internet</username>
          <id>6636894</id>
        </contributor>

!!! SNIP !!!

    <page>
      <title>Origami techniques</title>
      <id>193590</id>
      <revision>
        <id>447687387</id>
        <timestamp>2011-08-31T17:21:49Z</timestamp>
        <contributor>
          <username>Dmcq</username>
          <id>3784322</id>
        </contributor>

!!! SNIP !!!

    <page>
      <title>Mathematics of paper folding</title>
      <id>232840</id>
      <revision>
        <id>440970828</id>
        <timestamp>2011-07-23T09:10:42Z</timestamp>
        <contributor>
          <username>Tabletop</username>
          <id>173687</id>
        </contributor>
       
3.9.8.2 Result
[
 {
  "title" : "Kawasaki's theorem",
  "id" : "14511776",
  "timestamp" : "2011-06-21T20:08:56Z",
  "authors" : ["Some jerk on the Internet" ]
 },
 {
  "title" : "Origami techniques",
  "id" : "193590",
  "timestamp" : "2011-08-31T17:21:49Z",
  "authors" : ["Dmcq" ]
 },
 {
  "title" : "Mathematics of paper folding",
  "id" : "232840",
  "timestamp" : "2011-07-23T09:10:42Z",
  "authors" : ["Tabletop" ]
 }
]
          
3.9.8.3 Solution in JSONiq:

The following query converts this data to JSON:

Query:

[
 for $page in doc("Wikipedia-Origami.xml")//page
 return {
  "title": string($page/title),
  "id" : string($page/id),
  "timestamp" : string($page/revision[1]/timestamp),
  "authors" : [
       for $a in $page/revision/contributor/username
       return string($a)
  ]
 }
]          
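
The same conversion can be sketched in Python with the standard xml.etree.ElementTree module. The sample document is an assumption: it is a shortened, well-formed version of the excerpt above, with the closing tags that the excerpt elides added back. The function name pages_to_json is hypothetical, and the member names follow the Result in 3.9.8.2.

```python
import json
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for Wikipedia-Origami.xml (closing tags assumed).
SAMPLE = """<mediawiki>
  <siteinfo><sitename>Wikipedia</sitename></siteinfo>
  <page>
    <title>Kawasaki's theorem</title>
    <id>14511776</id>
    <revision>
      <id>435519187</id>
      <timestamp>2011-06-21T20:08:56Z</timestamp>
      <contributor>
        <username>Some jerk on the Internet</username>
        <id>6636894</id>
      </contributor>
    </revision>
  </page>
</mediawiki>"""


def pages_to_json(xml_text):
    """Build one JSON-style object per <page> element."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter("page"):
        pages.append({
            "title": page.findtext("title"),
            "id": page.findtext("id"),  # the page id, not the revision id
            "timestamp": page.findtext("revision/timestamp"),  # first revision
            "authors": [
                user.text
                for user in page.findall("revision/contributor/username")
            ],
        })
    return pages


print(json.dumps(pages_to_json(SAMPLE), indent=1))
```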

3.9.9 Transforming JSON to SVG

Suppose a JavaScript implementation provides an interface for JSONiq queries, and a JavaScript program contains the following data [1]:

3.9.9.1 Input Data
var data = {
   "color" : "blue",
   "closed" : true,
   "points" : [[10,10], [20,10], [20,20], [10,20]]
   };
          
3.9.9.2 Solution in JSONiq:

This data can be converted to SVG by placing the text of a query in a JavaScript variable and calling the appropriate JavaScript function to invoke the query:

var query =
 "declare variable $stroke := attribute stroke { color };
  declare variable $points := attribute points { points };
  if (closed) then
    <svg><polygon>{ $stroke, $points }</polygon></svg>
  else
    <svg><polyline>{ $stroke, $points }</polyline></svg>" 

This query can be invoked with a JavaScript API call:

jsoniq(data, query)
          

Here is the result of the above query:

<svg><polygon stroke="blue" points="10 10 20 10 20 20 10 20" /></svg>
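
The mapping from the data to the SVG element can also be sketched in Python. This is an illustration only: the function name shape_to_svg is hypothetical, and the sketch builds the markup as a string rather than through any JSONiq interface.

```python
from xml.sax.saxutils import quoteattr


def shape_to_svg(data):
    """Render the object as a polygon (closed) or polyline (open)."""
    element = "polygon" if data["closed"] else "polyline"
    # Flatten [[x, y], ...] into "x y x y ...".
    points = " ".join(str(n) for point in data["points"] for n in point)
    return "<svg><{0} stroke={1} points={2} /></svg>".format(
        element, quoteattr(data["color"]), quoteattr(points)
    )


data = {
    "color": "blue",
    "closed": True,
    "points": [[10, 10], [20, 10], [20, 20], [10, 20]],
}
print(shape_to_svg(data))
```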

3.9.10 Transforming Arrays to HTML Tables

The data in a JSON array is frequently displayed using HTML tables. The following query shows how to transform from the former to the latter.

3.9.10.1 Input Data

The following Object contains the labels desired for columns and rows, as well as the data for the table.

{
  "col labels" : ["singular", "plural"],
  "row labels" : ["1p", "2p", "3p"],
  "data" :
     [
        ["spinne", "spinnen"],
        ["spinnst", "spinnt"],
        ["spinnt", "spinnen"]
     ]
}
3.9.10.2 Solution in JSONiq:

The following query creates an HTML table, using the column headings and row labels as well as the data in the Object shown above.

<html>
  <body>

    <table>
      <tr> (: Column headings :)
         {
            <th> </th>,
            for $th in values(json("table.json")("col labels"))
            return <th>{ $th }</th>
         }
      </tr>
      {  (: Data for each row :)
         for $r at $i in values(json("table.json")("data"))
         return
            <tr>
             {
               <th>{ values(json("table.json")("row labels"))[$i] }</th>,
               for $c in values($r)
               return <td>{ $c }</td>
             }
            </tr>
      }
    </table>

  </body>
</html>    
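
A rough Python equivalent shows how the row labels are paired with the data rows by position. The function name to_html_table is an assumption for illustration, and only the table element is produced.

```python
from html import escape

# Stand-in for table.json, as shown in 3.9.10.1.
TABLE = {
    "col labels": ["singular", "plural"],
    "row labels": ["1p", "2p", "3p"],
    "data": [
        ["spinne", "spinnen"],
        ["spinnst", "spinnt"],
        ["spinnt", "spinnen"],
    ],
}


def to_html_table(table):
    """Build an HTML table with a heading row and one labelled row per data row."""
    heading = "<tr><th></th>" + "".join(
        "<th>{}</th>".format(escape(label)) for label in table["col labels"]
    ) + "</tr>"
    rows = []
    # zip pairs the i-th row label with the i-th data row.
    for label, row in zip(table["row labels"], table["data"]):
        cells = "".join("<td>{}</td>".format(escape(cell)) for cell in row)
        rows.append("<tr><th>{}</th>{}</tr>".format(escape(label), cells))
    return "<table>" + heading + "".join(rows) + "</table>"


print(to_html_table(TABLE))
```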

3.9.11 Windowing Queries

XQuery provides support for both sliding windows and tumbling windows, frequently used to analyze event streams or other sequential data. This simple windowing example converts a sequence of items to a table with three columns, using as many rows as necessary.

3.9.11.1 Input Data
[
  { "color" : "Green" },
  { "color" : "Pink" },
  { "color" : "Lilac" },
  { "color" : "Turquoise" },
  { "color" : "Peach" },
  { "color" : "Opal" },
  { "color" : "Champagne" }
]
          
3.9.11.2 Result

Result:

<table>
  <tr>
    <td>Green</td>
    <td>Pink</td>
    <td>Lilac</td>
  </tr>
  <tr>
    <td>Turquoise</td>
    <td>Peach</td>
    <td>Opal</td>
  </tr>
  <tr>
    <td>Champagne</td>
  </tr>
</table>
          
3.9.11.3 Solution in JSONiq:

Query:

<table>{
  for tumbling window $w in values(json("colors.json"))
    start at $x when fn:true()
    end at $y when $y - $x = 2
  return
    <tr>{
      for $i in $w
      return
        <td>{ $i("color") }</td>
    }</tr>
}</table>
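
The effect of the tumbling window can be illustrated in Python by cutting the sequence into consecutive, non-overlapping slices of three items. The function names tumbling_windows and to_table are assumptions for illustration.

```python
# Stand-in for colors.json, as shown in 3.9.11.1.
COLORS = [
    {"color": "Green"},
    {"color": "Pink"},
    {"color": "Lilac"},
    {"color": "Turquoise"},
    {"color": "Peach"},
    {"color": "Opal"},
    {"color": "Champagne"},
]


def tumbling_windows(items, size=3):
    """Split items into consecutive, non-overlapping windows of at most `size`."""
    return [items[start:start + size] for start in range(0, len(items), size)]


def to_table(colors):
    """Render one table row per window and one cell per color."""
    rows = []
    for window in tumbling_windows([item["color"] for item in colors]):
        cells = "".join("<td>{}</td>".format(name) for name in window)
        rows.append("<tr>" + cells + "</tr>")
    return "<table>" + "".join(rows) + "</table>"


print(to_table(COLORS))
```

The last window is shorter than the others because the input holds seven items; the same happens in the query, where the final window ends with the sequence.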
          

3.9.12 JSON views in middleware

This example assumes a middleware system that presents relational tables as JSON arrays. The following two tables are used as sample data.

3.9.12.1 Input Data
Users

userid  firstname  lastname
W0342   Walter     Denisovich
M0535   Mick       Goulish

The JSON representation this particular implementation provides for the above table looks like this:

[
  { "userid" : "W0342", "firstname" : "Walter", "lastname" : "Denisovich" },
  { "userid" : "M0535", "firstname" : "Mick", "lastname" : "Goulish" }
]       
Holdings

userid  ticker  shares
W0342   DIS     153212312
M0535   DIS     10
M0535   AIG     23412

The JSON representation this particular implementation provides for the above table looks like this:

[
  { "userid" : "W0342", "ticker" : "DIS", "shares" : 153212312 },
  { "userid" : "M0535", "ticker" : "DIS", "shares" : 10 },
  { "userid" : "M0535", "ticker" : "AIG", "shares" : 23412 }
]       
3.9.12.2 Solution in JSONiq:

The following query uses the fictitious vendor's vendor:table() function to retrieve the values from a table, and creates an Object for each user, with a list of the user's holdings in the value of that Object.

[
  for $u in vendor:table("Users")
  order by $u("userid")
  return {
    "userid" : $u("userid"),
    "first" :  $u("firstname"),
    "last" :   $u("lastname"),
    "holdings" : [
         for $h in vendor:table("Holdings")
         where $h("userid") = $u("userid")
         order by $h("ticker")
         return {
            "ticker" : $h("ticker"),
            "share" : $h("shares")
         }
    ]
  }
]       
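
The join between the two tables can be sketched in Python as follows. The lists stand in for what vendor:table() would return, and the function name users_with_holdings is an assumption for illustration.

```python
# Stand-ins for vendor:table("Users") and vendor:table("Holdings").
USERS = [
    {"userid": "W0342", "firstname": "Walter", "lastname": "Denisovich"},
    {"userid": "M0535", "firstname": "Mick", "lastname": "Goulish"},
]
HOLDINGS = [
    {"userid": "W0342", "ticker": "DIS", "shares": 153212312},
    {"userid": "M0535", "ticker": "DIS", "shares": 10},
    {"userid": "M0535", "ticker": "AIG", "shares": 23412},
]


def users_with_holdings(users, holdings):
    """Return one object per user (ordered by userid) with nested holdings."""
    return [
        {
            "userid": user["userid"],
            "first": user["firstname"],
            "last": user["lastname"],
            "holdings": [
                {"ticker": holding["ticker"], "share": holding["shares"]}
                for holding in sorted(holdings, key=lambda h: h["ticker"])
                if holding["userid"] == user["userid"]
            ],
        }
        for user in sorted(users, key=lambda u: u["userid"])
    ]


print(users_with_holdings(USERS, HOLDINGS))
```

Note that the ticker and share values are taken from the holding, not from the user.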

3.9.13 In-Place Updates

The XQuery Update Facility allows XML data to be updated. JSONiq provides updating functions to allow JSON to be updated.

Suppose an application receives an order that contains a credit card number, and needs to put the user on probation.

3.9.13.1 Input Data

Data for an order:

{
  "user" : "Deadbeat Jim",
  "credit card" : "VISA 4111 1111 1111 1111",
  "product" : "lottery tickets",
  "quantity" : 243
}
        

collection("users") contains the data for each individual user:

{
  "name" : "Deadbeat Jim",
  "address" : "1 E 161st St, Bronx, NY 10451",
  "risk tolerance" : "high"
}
        
3.9.13.2 Solution in JSONiq:

The following query adds "status" : "credit card declined" to the user's record.

let $dbj := collection("users")[ .("name") = "Deadbeat Jim" ]
return json:insert-into($dbj, "status" : "credit card declined")
        

After the update is finished, the user's record looks like this:

{
  "name" : "Deadbeat Jim",
  "address" : "1 E 161st St, Bronx, NY 10451",
  "status" : "credit card declined",
  "risk tolerance" : "high"
}
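
The effect of the update can be illustrated with a Python sketch that modifies the selected record in place. The helper name insert_into is hypothetical, and refusing to overwrite an existing member is an assumption of this sketch, not a statement about JSONiq semantics.

```python
# Stand-in for collection("users").
USERS = [
    {
        "name": "Deadbeat Jim",
        "address": "1 E 161st St, Bronx, NY 10451",
        "risk tolerance": "high",
    },
]


def insert_into(obj, pairs):
    """Add new members to obj in place; reject names that already exist."""
    duplicates = [name for name in pairs if name in obj]
    if duplicates:
        raise KeyError("member already present: {}".format(duplicates[0]))
    obj.update(pairs)


for user in USERS:
    if user["name"] == "Deadbeat Jim":
        insert_into(user, {"status": "credit card declined"})

print(USERS[0])
```

Member order in a JSON object is not significant, so the position of "status" in the listing above carries no meaning.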
        

3.9.14 Data Transformations

Many applications need to modify data before forwarding it to another source. The XQuery Update Facility provides an expression called a transform expression that can be used to create modified copies. The transform expression uses updating expressions to perform a transformation. JSONiq defines updating functions for JSON, which can be used in the XQuery transform expression.

3.9.14.1 Input Data

Suppose an application makes videos available using feeds from YouTube. The following data comes from one such feed:

{
    "encoding" : "UTF-8",
    "feed" : {
        "author" : [
            {
                "name" : {
                    "$t" : "YouTube"
                },
                "uri" : {
                    "$t" : "http://www.youtube.com/"
                }
            }
        ],
        "category" : [
            {
                "scheme" : "http://schemas.google.com/g/2005#kind",
                "term" : "http://gdata.youtube.com/schemas/2007#video"
            }
        ],
        "entry" : [
            {
                "app$control" : {
                    "yt$state" : {
                        "$t" : "Syndication of this video was restricted by its owner.",
                        "name" : "restricted",
                        "reasonCode" : "limitedSyndication"
                    }
                },
                "author" : [
                    {
                        "name" : {
                            "$t" : "beyonceVEVO"
                        },
                        "uri" : {
                            "$t" : "http://gdata.youtube.com/feeds/api/users/beyoncevevo"
                        }
                    }
                ]
!!! SNIP !!!         
3.9.14.2 Solution in JSONiq:

The following query creates a modified copy of the feed by removing all entries that restrict syndication.

let $feed := json("incoming.json")
return
   copy $out := $feed
   modify
      for $entry in $out("feed")("entry")
      where $entry("app$control")("yt$state")("name") = "restricted"
      return json:delete($entry)
   return $out
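
The copy-and-modify pattern can be sketched in Python with a deep copy followed by a filter. The sample feed is a heavily shortened stand-in for incoming.json; its second entry is hypothetical and exists only so that something survives the filter. The function name without_restricted is also an assumption.

```python
import copy

# Shortened stand-in for incoming.json; the second entry is hypothetical.
FEED = {
    "encoding": "UTF-8",
    "feed": {
        "entry": [
            {
                "app$control": {
                    "yt$state": {
                        "name": "restricted",
                        "reasonCode": "limitedSyndication",
                    }
                },
                "author": [{"name": {"$t": "beyonceVEVO"}}],
            },
            {"author": [{"name": {"$t": "hypothetical unrestricted author"}}]},
        ]
    },
}


def is_restricted(entry):
    """True when the entry's yt$state name is "restricted"."""
    state = entry.get("app$control", {}).get("yt$state", {})
    return state.get("name") == "restricted"


def without_restricted(feed):
    """Return a modified copy of the feed; the input is left untouched."""
    out = copy.deepcopy(feed)
    out["feed"]["entry"] = [
        entry for entry in out["feed"]["entry"] if not is_restricted(entry)
    ]
    return out


print(len(without_restricted(FEED)["feed"]["entry"]))
```

As with the transform expression, the original feed is unchanged and only the copy loses the restricted entries.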

A References

RFC 2119
S. Bradner. Key Words for use in RFCs to Indicate Requirement Levels. IETF RFC 2119. See http://www.ietf.org/rfc/rfc2119.txt.
XQuery 3.0
XQuery 3.0: An XML Query Language, Jonathan Robie, Don Chamberlin, Michael Dyck, John Snelson, Editors. World Wide Web Consortium, 08 April 2014. This version is http://www.w3.org/TR/2014/REC-xquery-30-20140408/. The latest version is available at http://www.w3.org/TR/xquery-30/.
XQuery and XPath Data Model 3.1
XQuery and XPath Data Model (XDM) 3.1, Norman Walsh, John Snelson, Editors. World Wide Web Consortium, 24 April 2014. This version is http://www.w3.org/TR/2014/WD-xpath-datamodel-31-20140424/. The latest version is available at http://www.w3.org/TR/xpath-datamodel-31/.
XPath 3.1
XML Path Language (XPath) 3.1, Jonathan Robie, Michael Dyck, John Snelson, Editors. World Wide Web Consortium, 24 April 2014. This version is http://www.w3.org/TR/2014/WD-xpath-31-20140424/. The latest version is available at http://www.w3.org/TR/xpath-31/.
XQuery 3.1
XQuery 3.1: An XML Query Language, Jonathan Robie, Michael Dyck, John Snelson, Editors. World Wide Web Consortium, 24 April 2014. This version is http://www.w3.org/TR/2014/WD-xquery-31-20140424/. The latest version is available at http://www.w3.org/TR/xquery-31/.
XSLT 3.0
XSL Transformations (XSLT) Version 3.0, Michael Kay, Editor. World Wide Web Consortium, 12 December 2013. This version is http://www.w3.org/TR/2013/WD-xslt-30-20131212/. The latest version is available at http://www.w3.org/TR/xslt-30/.
XQuery Update Facility 3.0
XQuery Update Facility 3.0, John Snelson, Editor. World Wide Web Consortium, 08 January 2013. This version is http://www.w3.org/TR/2013/WD-xquery-update-30-20130108/. The latest version is available at http://www.w3.org/TR/xquery-update-30/.
XQuery and XPath Full Text 3.0
XQuery and XPath Full Text 3.0, Mary Holstege, Editor. World Wide Web Consortium, 08 January 2013. This version is http://www.w3.org/TR/2013/WD-xpath-full-text-30-20130108/. The latest version is available at http://www.w3.org/TR/xpath-full-text-30/.
JSONiq
Jonathan Robie, Matthias Brantner, Daniela Florescu, Ghislain Fourny, Till Westmann. JSONiq: XQuery for JSON, JSON for XQuery. See http://www.jsoniq.org/docs/JSONiqExtensionToXQuery/pdf/Language_Specification-0.4.42-JSONiq-en-US.pdf.
JSONiq Use Cases
Jonathan Robie, Matthias Brantner, Daniela Florescu, Ghislain Fourny, Till Westmann. JSONiq Use Cases. See http://www.jsoniq.org/docs/JSONiq-usecases/html-single/.

End Notes

[1]

This example is based on an example on Stefan Goessner's JSONT site (http://goessner.net/articles/jsont/).