W3C

XML Query Use Cases

W3C Working Draft 15 September 2005

This version:
http://www.w3.org/TR/2005/WD-xquery-use-cases-20050915/
Latest version:
http://www.w3.org/TR/xquery-use-cases/
Previous version:
http://www.w3.org/TR/2005/WD-xquery-use-cases-20050404/
Editors:
Don Chamberlin, IBM Almaden Research Center <chamberlin@almaden.ibm.com >
Peter Fankhauser, Infonyte GmbH <fankhauser@infonyte.com>
Daniela Florescu, Oracle corporation <dana.florescu@oracle.com>
Massimo Marchiori, University of Venice <massimo@w3.org>
Jonathan Robie, DataDirect Technologies <jonathan.robie@datadirect.com>

This document is also available in these non-normative formats: XML.


Abstract

This document specifies usage scenarios for XQuery.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This version of the Use Cases document corresponds to the XQuery Working Draft released on 15 September 2005. The queries in this document have been parsed using a parser generated from the same grammar used to create the documentation for the XQuery Working Draft. The syntax of types has changed, and this is reflected in the queries. A number of errors have been corrected. See B Change Log for more information on changes.

This is a W3C Working Draft for review by W3C Members and other interested parties. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document has been produced as part of the W3C XML Activity, following the procedures set out for the W3C Process. The document has been written by the XML Query Working Group The goals of the XML Query working group are discussed in the XML Query Working Group charter.

The XML Query Working Group feels that the contents of this Working Draft are relatively stable, and therefore encourages feedback on this version.

As of this publication, the Working Group expects to eventually publish this document as a Working Group Note. It is not expected to become a W3C Recommendation, and therefore it has no associated W3C Patent Policy licensing obligations.

Comments for this specification should be entered into the last call issue tracking system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla. If access to that system is not feasible, you may send your comments to the W3C mailing list public-qt-comments@w3.org. (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with "[use]" at the beginning of the subject field.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Table of Contents

1 Use Cases for XML Queries
    1.1 Use Case "XMP": Experiences and Exemplars
        1.1.1 Document Type Definitions (DTD)
        1.1.2 Sample Data
        1.1.3 DTD for Q5
        1.1.4 Sample Data for Q5
        1.1.5 DTD for Q9
        1.1.6 Data for Q9
        1.1.7 DTD for Q10
        1.1.8 Data for Q10
        1.1.9 Queries and Results
            1.1.9.1 Q1
            1.1.9.2 Q2
            1.1.9.3 Q3
            1.1.9.4 Q4
            1.1.9.5 Q5
            1.1.9.6 Q6
            1.1.9.7 Q7
            1.1.9.8 Q8
            1.1.9.9 Q9
            1.1.9.10 Q10
            1.1.9.11 Q11
            1.1.9.12 Q12
    1.2 Use Case "TREE": Queries that preserve hierarchy
        1.2.1 Description
        1.2.2 Document Type Definition (DTD)
        1.2.3 Sample Data
        1.2.4 Queries and Results
            1.2.4.1 Q1
            1.2.4.2 Q2
            1.2.4.3 Q3
            1.2.4.4 Q4
            1.2.4.5 Q5
            1.2.4.6 Q6
    1.3 Use Case "SEQ" - Queries based on Sequence
        1.3.1 Description
        1.3.2 Document Type Definition (DTD)
        1.3.3 Sample Data
        1.3.4 Queries and Results
            1.3.4.1 Q1
            1.3.4.2 Q2
            1.3.4.3 Q3
            1.3.4.4 Q4
            1.3.4.5 Q5
    1.4 Use Case "R" - Access to Relational Data
        1.4.1 Description
        1.4.2 Document Type Definition (DTD)
        1.4.3 Sample Data
        1.4.4 Queries and Results
            1.4.4.1 Q1
            1.4.4.2 Q2
            1.4.4.3 Q3
            1.4.4.4 Q4
            1.4.4.5 Q5
            1.4.4.6 Q6
            1.4.4.7 Q7
            1.4.4.8 Q8
            1.4.4.9 Q9
            1.4.4.10 Q10
            1.4.4.11 Q11
            1.4.4.12 Q12
            1.4.4.13 Q13
            1.4.4.14 Q14
            1.4.4.15 Q15
            1.4.4.16 Q16
            1.4.4.17 Q17
            1.4.4.18 Q18
    1.5 Use Case "SGML": Standard Generalized Markup Language
        1.5.1 Description
        1.5.2 Document Type Definition (DTD)
        1.5.3 Sample Data
        1.5.4 Queries and Results
            1.5.4.1 Q1
            1.5.4.2 Q2
            1.5.4.3 Q3
            1.5.4.4 Q4
            1.5.4.5 Q5
            1.5.4.6 Q6
            1.5.4.7 Q7
            1.5.4.8 Q8a
            1.5.4.9 Q8b
            1.5.4.10 Q9
            1.5.4.11 Q10
    1.6 Use Case "STRING": String Search
        1.6.1 Description
        1.6.2 Document Type Definition (DTD)
        1.6.3 Sample Data
        1.6.4 Queries and Results
            1.6.4.1 Q1
            1.6.4.2 Q2
            1.6.4.3 Q3
            1.6.4.4 Q4
            1.6.4.5 Q5
    1.7 Use Case "NS" - Queries Using Namespaces
        1.7.1 Description
        1.7.2 Document Type Definition (DTD)
        1.7.3 Sample Data
        1.7.4 Queries and Results
            1.7.4.1 Q1
            1.7.4.2 Q2
            1.7.4.3 Q3
            1.7.4.4 Q4
            1.7.4.5 Q5
            1.7.4.6 Q6
            1.7.4.7 Q7
            1.7.4.8 Q8
    1.8 Use Case "PARTS" - Recursive Parts Explosion
        1.8.1 Description
        1.8.2 Document Type Definitions (DTD)
        1.8.3 Sample Data
        1.8.4 Queries and Results
            1.8.4.1 Q1
    1.9 Use Case "STRONG" - queries that exploit strongly typed data
        1.9.1 Description
        1.9.2 Schema
        1.9.3 Sample Data
        1.9.4 Queries
            1.9.4.1 Q1
            1.9.4.2 Q2
            1.9.4.3 Q3
            1.9.4.4 Q4
            1.9.4.5 Q5
            1.9.4.6 Q6
            1.9.4.7 Q7
            1.9.4.8 Q8
            1.9.4.9 Q9
            1.9.4.10 Q10
            1.9.4.11 Q11
            1.9.4.12 Q12

Appendices

A Acknowledgements
B Change Log (Non-Normative)
    B.1 31 Aug 2005
    B.2 11 July 2005
    B.3 04 April 2005
    B.4 30 Jan 2005
C References (Non-Normative)


1 Use Cases for XML Queries

The use cases listed below were created by the XML Query Working Group to illustrate important applications for an XML query language. Each use case is focused on a specific application area, and contains a Document Type Definition (DTD) and example input data. Each use case specifies a set of queries that might be applied to the input data, and the expected results for each query. Since the English description of each query is concise, the expected results form an important part of the definition of each query, specifying the expected output format. These use cases were originally published as part of the [Requirements] document, without solutions in concrete query languages. Now it is being republished with solutions for [XQuery]. These use cases are also being used by the W3C XML Query Testing Task Force.

The input environment for each use case is stated in its Document Type Definition (DTD) section. All of these use cases assume that input is provided in the form of one or more documents with specific names. For instance, the authors in a document may be accessed with expressions like this:

doc("http://bstore1.example.com/bib.xml")//author

Some implementations of XQuery bind input to external variables. If the environment has bound the external variable $b to the same document used in the above query, this expression would return the same set of authors:

$b//author

Some implementations of XQuery predefine a single 'context item', which is available at the root level of a query, and which is used to resolve paths that begin with a leading slash. In such an implementation, if the context item is bound to document node of the same well-formed document used in the previous examples, this expression would return the same set of authors:

//author

Previous versions of this document accessed implicit documents using the input() function, which no longer exists. The input() function had similar functionality to a predefined context item, except that it could be bound to a sequence of nodes, whereas the context item may only be bound to a single node. The use cases that used input() have been rewritten to use explicit file names.

Several implementors have asked that we make the queries from these use cases available in a separate file to make it easier for them to test their parsers. These queries may be found in [Use Case Sample Queries]. Also, the queries from the XQuery specification itself have been made available in [XQuery Sample Queries].

To make output more readable, the output of queries has been formatted using whitespace which may not be returned by a query processor. This whitespace should not be considered normative for the correctness of results.

These use cases represent a snapshot of an ongoing work. Some important application areas are not yet adequately covered by a use case. The XML Query Working Group reserves the right to add, delete, or modify individual queries or whole use cases as the work progresses. The presence of a query in this set of use cases does not necessarily indicate that the query will be expressible in the XML Query Language(s) to be created by the XML Query Working Group.

1.1 Use Case "XMP": Experiences and Exemplars

This use case contains several example queries that illustrate requirements gathered from the database and document communities.

1.1.1 Document Type Definitions (DTD)

Most of the example queries in this use case are based on a bibliography document named "http://bstore1.example.com/bib.xml" with the following DTD:

<!ELEMENT bib  (book* )>
<!ELEMENT book  (title,  (author+ | editor+ ), publisher, price )>
<!ATTLIST book  year CDATA  #REQUIRED >
<!ELEMENT author  (last, first )>
<!ELEMENT editor  (last, first, affiliation )>
<!ELEMENT title  (#PCDATA )>
<!ELEMENT last  (#PCDATA )>
<!ELEMENT first  (#PCDATA )>
<!ELEMENT affiliation  (#PCDATA )>
<!ELEMENT publisher  (#PCDATA )>
<!ELEMENT price  (#PCDATA )>

1.1.2 Sample Data

Here is the data found at "bstore1.example.com/bib.xml":

<bib>
    <book year="1994">
        <title>TCP/IP Illustrated</title>
        <author><last>Stevens</last><first>W.</first></author>
        <publisher>Addison-Wesley</publisher>
        <price>65.95</price>
    </book>
 
    <book year="1992">
        <title>Advanced Programming in the Unix environment</title>
        <author><last>Stevens</last><first>W.</first></author>
        <publisher>Addison-Wesley</publisher>
        <price>65.95</price>
    </book>
 
    <book year="2000">
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <author><last>Suciu</last><first>Dan</first></author>
        <publisher>Morgan Kaufmann Publishers</publisher>
        <price>39.95</price>
    </book>
 
    <book year="1999">
        <title>The Economics of Technology and Content for Digital TV</title>
        <editor>
               <last>Gerbarg</last><first>Darcy</first>
                <affiliation>CITI</affiliation>
        </editor>
            <publisher>Kluwer Academic Publishers</publisher>
        <price>129.95</price>
    </book>
 
</bib>
                    

1.1.3 DTD for Q5

Q5 also uses information on book reviews and prices from a separate data source named "http://bstore2.example.com/reviews.xml" with the following DTD:

<!ELEMENT reviews (entry*)>
<!ELEMENT entry   (title, price, review)>
<!ELEMENT title   (#PCDATA)>
<!ELEMENT price   (#PCDATA)>
<!ELEMENT review  (#PCDATA)>
 

1.1.4 Sample Data for Q5

Here are the contents of "http://bstore2.example.com/reviews.xml":

<reviews>
    <entry>
        <title>Data on the Web</title>
        <price>34.95</price>
        <review>
               A very good discussion of semi-structured database
               systems and XML.
        </review>
    </entry>
    <entry>
        <title>Advanced Programming in the Unix environment</title>
        <price>65.95</price>
        <review>
               A clear and detailed discussion of UNIX programming.
        </review>
    </entry>
    <entry>
        <title>TCP/IP Illustrated</title>
        <price>65.95</price>
        <review>
               One of the best books on TCP/IP.
        </review>
    </entry>
</reviews>

1.1.5 DTD for Q9

Q9 uses an input document named "books.xml", with the following DTD:

<!ELEMENT chapter (title, section*)>
<!ELEMENT section (title, section*)>
<!ELEMENT title (#PCDATA)>
 

1.1.6 Data for Q9

Here are the contents of books.xml:

<chapter>
    <title>Data Model</title>
    <section>
        <title>Syntax For Data Model</title>
    </section>
    <section>
        <title>XML</title>
        <section>
            <title>Basic Syntax</title>
        </section>
        <section>
            <title>XML and Semistructured Data</title>
        </section>
    </section>
</chapter>

1.1.7 DTD for Q10

Q10 uses an input document named "prices.xml", with the following DTD:

<!ELEMENT prices (book*)>
<!ELEMENT book (title, source, price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT source (#PCDATA)>
<!ELEMENT price (#PCDATA)>
                    

1.1.8 Data for Q10

Here are the contents of prices.xml:

<prices>
    <book>
        <title>Advanced Programming in the Unix environment</title>
        <source>bstore2.example.com</source>
        <price>65.95</price>
    </book>
    <book>
        <title>Advanced Programming in the Unix environment</title>
        <source>bstore1.example.com</source>
        <price>65.95</price>
    </book>
    <book>
        <title>TCP/IP Illustrated</title>
        <source>bstore2.example.com</source>
        <price>65.95</price>
    </book>
    <book>
        <title>TCP/IP Illustrated</title>
        <source>bstore1.example.com</source>
        <price>65.95</price>
    </book>
    <book>
        <title>Data on the Web</title>
        <source>bstore2.example.com</source>
        <price>34.95</price>
    </book>
    <book>
        <title>Data on the Web</title>
        <source>bstore1.example.com</source>
        <price>39.95</price>
    </book>
</prices>

1.1.9 Queries and Results

1.1.9.1 Q1

List books published by Addison-Wesley after 1991, including their year and title.

Solution in XQuery:

<bib>
 {
  for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
  return
    <book year="{ $b/@year }">
     { $b/title }
    </book>
 }
</bib> 

Expected Result:

<bib>
    <book year="1994">
        <title>TCP/IP Illustrated</title>
    </book>
    <book year="1992">
        <title>Advanced Programming in the Unix environment</title>
    </book>
</bib>
                        
1.1.9.2 Q2

Create a flat list of all the title-author pairs, with each pair enclosed in a "result" element.

Solution in XQuery:

<results>
  {
    for $b in doc("http://bstore1.example.com/bib.xml")/bib/book,
        $t in $b/title,
        $a in $b/author
    return
        <result>
            { $t }    
            { $a }
        </result>
  }
</results>
                        

Expected Result:

<results>
    <result>
        <title>TCP/IP Illustrated</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </result>
    <result>
        <title>Advanced Programming in the Unix environment</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </result>
    <result>
        <title>Data on the Web</title>
        <author>
            <last>Abiteboul</last>
            <first>Serge</first>
        </author>
    </result>
    <result>
        <title>Data on the Web</title>
        <author>
            <last>Buneman</last>
            <first>Peter</first>
        </author>
    </result>
    <result>
        <title>Data on the Web</title>
        <author>
            <last>Suciu</last>
            <first>Dan</first>
        </author>
    </result>
</results>
1.1.9.3 Q3

For each book in the bibliography, list the title and authors, grouped inside a "result" element.

Solution in XQuery:

<results>
{
    for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
    return
        <result>
            { $b/title }
            { $b/author  }
        </result>
}
</results> 

Expected Result:

<results>
    <result>
        <title>TCP/IP Illustrated</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </result>
    <result>
        <title>Advanced Programming in the Unix environment</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </result>
    <result>
        <title>Data on the Web</title>
        <author>
            <last>Abiteboul</last>
            <first>Serge</first>
        </author>
        <author>
            <last>Buneman</last>
            <first>Peter</first>
        </author>
        <author>
            <last>Suciu</last>
            <first>Dan</first>
        </author>
    </result>
    <result>
        <title>The Economics of Technology and Content for Digital TV</title>
    </result>
</results>
                        
1.1.9.4 Q4

For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.

Solution in XQuery:

<results>
  {
    let $a := doc("http://bstore1.example.com/bib/bib.xml")//author
    for $last in distinct-values($a/last),
        $first in distinct-values($a[last=$last]/first)
    order by $last, $first
    return
        <result>
            <author>
               <last>{ $last }</last>
               <first>{ $first }</first>
            </author>
            {
                for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
                where some $ba in $b/author 
                      satisfies ($ba/last = $last and $ba/first=$first)
                return $b/title
            }
        </result>
  }
</results> 

The order in which values are returned by distinct-values() is undefined. The distinct-values() function returns atomic values, extracting the names from the elements.

Expected Result:

<results>
    <result>
        <author>
            <last>Abiteboul</last>
            <first>Serge</first>
        </author>
        <title>Data on the Web</title>
    </result>
    <result>
        <author>
            <last>Buneman</last>
            <first>Peter</first>
        </author>
        <title>Data on the Web</title>
    </result>
    <result>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
        <title>TCP/IP Illustrated</title>
        <title>Advanced Programming in the Unix environment</title>
    </result>
    <result>
        <author>
            <last>Suciu</last>
            <first>Dan</first>
        </author>
        <title>Data on the Web</title>
    </result>
</results>
                        
1.1.9.5 Q5

For each book found at both bstore1.example.com and bstore2.example.com, list the title of the book and its price from each source.

Solution in XQuery:

<books-with-prices>
  {
    for $b in doc("http://bstore1.example.com/bib.xml")//book,
        $a in doc("http://bstore2.example.com/reviews.xml")//entry
    where $b/title = $a/title
    return
        <book-with-prices>
            { $b/title }
            <price-bstore2>{ $a/price/text() }</price-bstore2>
            <price-bstore1>{ $b/price/text() }</price-bstore1>
        </book-with-prices>
  }
</books-with-prices>

Expected Result:

<books-with-prices>
    <book-with-prices>
        <title>TCP/IP Illustrated</title>
        <price-bstore2>65.95</price-bstore2>
        <price-bstore1>65.95</price-bstore1>
    </book-with-prices>
    <book-with-prices>
        <title>Advanced Programming in the Unix environment</title>
        <price-bstore2>65.95</price-bstore2>
        <price-bstore1>65.95</price-bstore1>
    </book-with-prices>
    <book-with-prices>
        <title>Data on the Web</title>
        <price-bstore2>34.95</price-bstore2>
        <price-bstore1>39.95</price-bstore1>
    </book-with-prices>
</books-with-prices> 
1.1.9.6 Q6

For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors.

Solution in XQuery:

<bib>
  {
    for $b in doc("http://bstore1.example.com/bib.xml")//book
    where count($b/author) > 0
    return
        <book>
            { $b/title }
            {
                for $a in $b/author[position()<=2]  
                return $a
            }
            {
                if (count($b/author) > 2)
                 then <et-al/>
                 else ()
            }
        </book>
  }
</bib>

Expected Result:

<bib>
    <book>
        <title>TCP/IP Illustrated</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </book>
    <book>
        <title>Advanced Programming in the Unix environment</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </book>
    <book>
        <title>Data on the Web</title>
        <author>
            <last>Abiteboul</last>
            <first>Serge</first>
        </author>
        <author>
            <last>Buneman</last>
            <first>Peter</first>
        </author>
        <et-al/>
    </book>
</bib> 
1.1.9.7 Q7

List the titles and years of all books published by Addison-Wesley after 1991, in alphabetic order.

Solution in XQuery:

<bib>
  {
    for $b in doc("http://bstore1.example.com/bib.xml")//book
    where $b/publisher = "Addison-Wesley" and $b/@year > 1991
    order by $b/title
    return
        <book>
            { $b/@year }
            { $b/title }
        </book>
  }
</bib> 

Expected Result:

<bib>
    <book year="1992">
        <title>Advanced Programming in the Unix environment</title>
    </book>
    <book year="1994">
        <title>TCP/IP Illustrated</title>
    </book>
</bib>     
1.1.9.8 Q8

Find books in which the name of some element ends with the string "or" and the same element contains the string "Suciu" somewhere in its content. For each such book, return the title and the qualifying element.

Solution in XQuery:

for $b in doc("http://bstore1.example.com/bib.xml")//book
let $e := $b/*[contains(string(.), "Suciu") 
               and ends-with(local-name(.), "or")]
where exists($e)
return
    <book>
        { $b/title }
        { $e }
    </book> 

In the above solution, string(), local-name() and ends-with() are functions defined in the Functions and Operators document.

Expected Result:

<book>
        <title>Data on the Web</title>
        <author>
            <last>Suciu</last>
            <first>Dan</first>
        </author>
 </book>
1.1.9.9 Q9

In the document "books.xml", find all section or chapter titles that contain the word "XML", regardless of the level of nesting.

Solution in XQuery:

<results>
  {
    for $t in doc("books.xml")//(chapter | section)/title
    where contains($t/text(), "XML")
    return $t
  }
</results> 

Expected Result:

<results>
    <title>XML</title>
    <title>XML and Semistructured Data</title>
</results> 
1.1.9.10 Q10

In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.

Solution in XQuery:

<results>
  {
    let $doc := doc("prices.xml")
    for $t in distinct-values($doc//book/title)
    let $p := $doc//book[title = $t]/price
    return
      <minprice title="{ $t }">
        <price>{ min($p) }</price>
      </minprice>
  }
</results> 

Expected Result:

<results>
    <minprice title="Advanced Programming in the Unix environment">
        <price>65.95</price>
    </minprice>
    <minprice title="TCP/IP Illustrated">
        <price>65.95</price>
    </minprice>
    <minprice title="Data on the Web">
        <price>34.95</price>
    </minprice>
</results> 
1.1.9.11 Q11

For each book with an author, return the book with its title and authors. For each book with an editor, return a reference with the book title and the editor's affiliation.

Solution in XQuery:

<bib>
{
        for $b in doc("http://bstore1.example.com/bib.xml")//book[author]
        return
            <book>
                { $b/title }
                { $b/author }
            </book>
}
{
        for $b in doc("http://bstore1.example.com/bib.xml")//book[editor]
        return
          <reference>
            { $b/title }
            {$b/editor/affiliation}
          </reference>
}
</bib>  

Expected Result:

<bib>
    <book>
        <title>TCP/IP Illustrated</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </book>
    <book>
        <title>Advanced Programming in the Unix environment</title>
        <author>
            <last>Stevens</last>
            <first>W.</first>
        </author>
    </book>
    <book>
        <title>Data on the Web</title>
        <author>
            <last>Abiteboul</last>
            <first>Serge</first>
        </author>
        <author>
            <last>Buneman</last>
            <first>Peter</first>
        </author>
        <author>
            <last>Suciu</last>
            <first>Dan</first>
        </author>
    </book>
    <reference>
        <title>The Economics of Technology and Content for Digital TV</title>
        <affiliation>CITI</affiliation>
    </reference>
</bib> 
1.1.9.12 Q12

Find pairs of books that have different titles but the same set of authors (possibly in a different order).

Solution in XQuery:

<bib>
{
    for $book1 in doc("http://bstore1.example.com/bib.xml")//book,
        $book2 in doc("http://bstore1.example.com/bib.xml")//book
    let $aut1 := for $a in $book1/author 
                 order by $a/last, $a/first
                 return $a
    let $aut2 := for $a in $book2/author 
                 order by $a/last, $a/first
                 return $a
    where $book1 << $book2
    and not($book1/title = $book2/title)
    and deep-equal($aut1, $aut2) 
    return
        <book-pair>
            { $book1/title }
            { $book2/title }
        </book-pair>
}
</bib> 

Expected Result:

<bib>
    <book-pair>
        <title>TCP/IP Illustrated</title>
        <title>Advanced Programming in the Unix environment</title>
    </book-pair>
</bib> 

The above solution uses a function, deep-equal(), which compares sequences. Two sequences are equal if all items in corresponding positions in the two sequences are equal - if the sequences are node sequences, the values of the nodes are used for comparison.

1.2 Use Case "TREE": Queries that preserve hierarchy

Some XML document-types have a very flexible structure in which text is mixed with elements and many elements are optional. These document-types show a wide variation in structure from one document to another. In documents of these types, the ways in which elements are ordered and nested are usually quite important.

1.2.1 Description

An XML query language should have the ability to extract elements from documents while preserving their original hierarchy. This Use Case illustrates this requirement by means of a flexible document type named Book.

1.2.2 Document Type Definition (DTD)

This use case is based on an input document named "book.xml". The DTD for this schema is found in a file called "book.dtd":

  <!ELEMENT book (title, author+, section+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (title, (p | figure | section)* )>
  <!ATTLIST section
      id         ID    #IMPLIED
      difficulty CDATA #IMPLIED>
  <!ELEMENT p (#PCDATA)>
  <!ELEMENT figure (title, image)>
  <!ATTLIST figure
       width   CDATA   #REQUIRED
       height  CDATA   #REQUIRED >
  <!ELEMENT image EMPTY>
  <!ATTLIST image
       source  CDATA   #REQUIRED >  

1.2.3 Sample Data

The queries in this use case are based on the following sample data.

<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
  <title>Data on the Web</title>
  <author>Serge Abiteboul</author>
  <author>Peter Buneman</author>
  <author>Dan Suciu</author>
  <section id="intro" difficulty="easy" >
    <title>Introduction</title>
    <p>Text ... </p>
    <section>
      <title>Audience</title>
      <p>Text ... </p>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <p>Text ... </p>
      <figure height="400" width="400">
        <title>Traditional client/server architecture</title>
        <image source="csarch.gif"/>
      </figure>
      <p>Text ... </p>
    </section>
  </section>
  <section id="syntax" difficulty="medium" >
    <title>A Syntax For Data</title>
    <p>Text ... </p>
    <figure height="200" width="500">
      <title>Graph representations of structures</title>
      <image source="graphs.gif"/>
    </figure>
    <p>Text ... </p>
    <section>
      <title>Base Types</title>
      <p>Text ... </p>
    </section>
    <section>
      <title>Representing Relational Databases</title>
      <p>Text ... </p>
      <figure height="250" width="400">
        <title>Examples of Relations</title>
        <image source="relations.gif"/>
      </figure>
    </section>
    <section>
      <title>Representing Object Databases</title>
      <p>Text ... </p>
    </section>       
  </section>
</book> 

1.2.4 Queries and Results

1.2.4.1 Q1

Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each <section> element, if any.

Solution in XQuery:

declare function local:toc($book-or-section as element()) as element()*
{
    for $section in $book-or-section/section
    return
      <section>
         { $section/@* , $section/title , local:toc($section) }                 
      </section>
};

<toc>
   {
     for $s in doc("book.xml")/book return local:toc($s)
   }
</toc> 

Expected Result:

<toc>
    <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <section>
            <title>Audience</title>
        </section>
        <section>
            <title>Web Data and the Two Cultures</title>
        </section>
    </section>
    <section id="syntax" difficulty="medium">
        <title>A Syntax For Data</title>
        <section>
            <title>Base Types</title>
        </section>
        <section>
            <title>Representing Relational Databases</title>
        </section>
        <section>
            <title>Representing Object Databases</title>
        </section>
    </section>
</toc> 
1.2.4.2 Q2

Prepare a (flat) figure list for Book1, listing all the figures and their titles. Preserve the original attributes of each <figure> element, if any.

Solution in XQuery:

<figlist>
  {
    for $f in doc("book.xml")//figure
    return
        <figure>
            { $f/@* }
            { $f/title }
        </figure>
  }
</figlist> 

Expected Result:

<figlist>
    <figure  height="400" width="400">
        <title>Traditional client/server architecture</title>
    </figure>
    <figure  height="200" width="500">
        <title>Graph representations of structures</title>
    </figure>
    <figure  height="250" width ="400">
        <title>Examples of Relations</title>
    </figure>
</figlist> 
1.2.4.3 Q3

How many sections are in Book1, and how many figures?

Solution in XQuery:

<section_count>{ count(doc("book.xml")//section) }</section_count>, 
<figure_count>{ count(doc("book.xml")//figure) }</figure_count> 

Expected Result:

<section_count>7</section_count>
<figure_count>3</figure_count> 
1.2.4.4 Q4

How many top-level sections are in Book1?

Solution in XQuery:

<top_section_count>
 { 
   count(doc("book.xml")/book/section) 
 }
</top_section_count>

Expected Result:

<top_section_count>2</top_section_count> 
1.2.4.5 Q5

Make a flat list of the section elements in Book1. In place of its original attributes, each section element should have two attributes, containing the title of the section and the number of figures immediately contained in the section.

Solution in XQuery:

<section_list>
  {
    for $s in doc("book.xml")//section
    let $f := $s/figure
    return
        <section title="{ $s/title/text() }" figcount="{ count($f) }"/>
  }
</section_list> 

Expected Result:

<section_list>
    <section title="Introduction" figcount="0"/>
    <section title="Audience" figcount="0"/>
    <section title="Web Data and the Two Cultures" figcount="1"/>
    <section title="A Syntax For Data" figcount="1"/>
    <section title="Base Types" figcount="0"/>
    <section title="Representing Relational Databases" figcount="1"/>
    <section title="Representing Object Databases" figcount="0"/>
</section_list>
1.2.4.6 Q6

Make a nested list of the section elements in Book1, preserving their original attributes and hierarchy. Inside each section element, include the title of the section and an element that includes the number of figures immediately contained in the section.

Solution in XQuery:

declare function local:section-summary($book-or-section as element()*)
  as element()*
{
  for $section in $book-or-section
  return
    <section>
       { $section/@* }
       { $section/title }       
       <figcount>         
         { count($section/figure) }
       </figcount>                
       { local:section-summary($section/section) }                      
    </section>
};

<toc>
  {
    for $s in doc("book.xml")/book/section
    return local:section-summary($s)
  }
</toc> 
Editorial note  
This solution was provided by Michael Wenger, a student at the University of Würzburg.

Expected Result:

<toc>
    <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <figcount>0</figcount>
        <section>
            <title>Audience</title>
            <figcount>0</figcount>
        </section>
        <section>
            <title>Web Data and the Two Cultures</title>
            <figcount>1</figcount>
        </section>
    </section>
    <section id="syntax" difficulty="medium">
        <title>A Syntax For Data</title>
        <figcount>1</figcount>
        <section>
            <title>Base Types</title>
            <figcount>0</figcount>
        </section>
        <section>
            <title>Representing Relational Databases</title>
            <figcount>1</figcount>
        </section>
        <section>
            <title>Representing Object Databases</title>
            <figcount>0</figcount>
        </section>
    </section>
</toc> 

1.3 Use Case "SEQ" - Queries based on Sequence

This use case illustrates queries based on the sequence in which elements appear in a document.

1.3.1 Description

Although sequence is not significant in most traditional database systems or object systems, it can be quite significant in structured documents. This use case presents a series of queries based on a medical report.

1.3.2 Document Type Definition (DTD)

This use case is based on a medical report using the HL7 Patient Record Architecture. We simplify the DTD in this example, using only what is needed to understand the queries.

<!DOCTYPE report [
  <!ELEMENT report (section*)>
  <!ELEMENT section (section.title, section.content)>
  <!ELEMENT section.title  (#PCDATA )>
  <!ELEMENT section.content  (#PCDATA | anesthesia | prep 
                            | incision | action | observation )*>
  <!ELEMENT anesthesia (#PCDATA)>
  <!ELEMENT prep ( (#PCDATA | action)* )>
  <!ELEMENT incision ( (#PCDATA | geography | instrument)* )>
  <!ELEMENT action ( (#PCDATA | instrument )* )>
  <!ELEMENT observation (#PCDATA)>
  <!ELEMENT geography (#PCDATA)>
  <!ELEMENT instrument (#PCDATA)>
]> 

1.3.3 Sample Data

The queries in this use case are based on the following sample data.

<report>
  <section>
    <section.title>Procedure</section.title>
     <section.content>
      The patient was taken to the operating room where she was placed
      in supine position and
      <anesthesia>induced under general anesthesia.</anesthesia>
      <prep> 
        <action>A Foley catheter was placed to decompress the bladder</action>
        and the abdomen was then prepped and draped in sterile fashion.
      </prep>  
      <incision>
        A curvilinear incision was made
        <geography>in the midline immediately infraumbilical</geography>
        and the subcutaneous tissue was divided
        <instrument>using electrocautery.</instrument>
      </incision>
      The fascia was identified and
      <action>#2 0 Maxon stay sutures were placed on each side of the midline.
      </action>
      <incision>
        The fascia was divided using
        <instrument>electrocautery</instrument>
        and the peritoneum was entered.
      </incision>
      <observation>The small bowel was identified.</observation>
      and
      <action>
        the
        <instrument>Hasson trocar</instrument>
        was placed under direct visualization.
      </action>
      <action>
        The
        <instrument>trocar</instrument>
        was secured to the fascia using the stay sutures.
      </action>
     </section.content>
  </section>
</report>

1.3.4 Queries and Results

1.3.4.1 Q1

In the Procedure section of Report1, what Instruments were used in the second Incision?

Solution in XQuery:

for $s in doc("report1.xml")//section[section.title = "Procedure"]
return ($s//incision)[2]/instrument

Expected Result:

<instrument>electrocautery</instrument>
1.3.4.2 Q2

In the Procedure section of Report1, what are the first two Instruments to be used?

Solution in XQuery:

for $s in doc("report1.xml")//section[section.title = "Procedure"]
return ($s//instrument)[position()<=2]

Expected Result:

<instrument>using electrocautery.</instrument>
<instrument>electrocautery</instrument> 
1.3.4.3 Q3

In Report1, what Instruments were used in the first two Actions after the second Incision?

Solution in XQuery:

let $i2 := (doc("report1.xml")//incision)[2]
for $a in (doc("report1.xml")//action)[. >> $i2][position()<=2]
return $a//instrument 

Expected Result:

<instrument>Hasson trocar</instrument>
<instrument>trocar</instrument> 
1.3.4.4 Q4

In Report1, find "Procedure" sections where no Anesthesia element occurs before the first Incision

Solution in XQuery:

for $p in doc("report1.xml")//section[section.title = "Procedure"]
where not(some $a in $p//anesthesia satisfies
        $a << ($p//incision)[1] )
return $p 

Expected Result:


                        

(No sections satisfy Q4, thankfully.)

1.3.4.5 Q5

In Report1, what happened between the first Incision and the second Incision?

Solution in XQuery:

declare function local:precedes($a as node(), $b as node()) as xs:boolean 
{
    $a << $b
      and
    empty($a//node() intersect $b)
};


declare function local:follows($a as node(), $b as node()) as xs:boolean 
{
    $a >> $b
      and
    empty($b//node() intersect $a)
};

<critical_sequence>
 {
  let $proc := doc("report1.xml")//section[section.title="Procedure"][1]
  for $n in $proc//node()
  where local:follows($n, ($proc//incision)[1])
    and local:precedes($n, ($proc//incision)[2])
  return $n
 }
</critical_sequence> 

Here is another solution that is perhaps more efficient and less readable:

<critical_sequence>
 {
  let $proc := doc("report1.xml")//section[section.title="Procedure"][1],
      $i1 :=  ($proc//incision)[1],
      $i2 :=  ($proc//incision)[2]
  for $n in $proc//node() except $i1//node()
  where $n >> $i1 and $n << $i2
  return $n 
 }
</critical_sequence> 

Expected Result:

<critical_sequence>
      The fascia was identified and
      <action>#2 0 Maxon stay sutures were placed on each side of the midline.
      </action>#2 0 Maxon stay sutures were placed on each side of the midline.
</critical_sequence> 

In the above output, the contents of the critical sequence element include a text node, an action element, and the text node containing the content of the action element. But the serialization we are using already shows all descendants of a given node. If $c is bound to a sequence of nodes, the following expression eliminates members of the sequence that are descendants of another node already found in the sequence:

$c except $c//node()

In the following solution, the between() function takes a sequence of nodes, a starting node, and an ending node, and returns the nodes between them:

declare function local:between($seq as node()*, $start as node(), $end as node())
 as item()*
{
  let $nodes :=
    for $n in $seq except $start//node()
    where $n >> $start and $n << $end
    return $n
  return $nodes except $nodes//node()
};

<critical_sequence>
 {
  let $proc := doc("report1.xml")//section[section.title="Procedure"][1],
      $first :=  ($proc//incision)[1],
      $second:=  ($proc//incision)[2]
  return local:between($proc//node(), $first, $second)
 }
</critical_sequence>
                        

Here is the output from the above query:

<critical_sequence>
      The fascia was identified and
      <action>#2 0 Maxon stay sutures were placed on each side of the midline.
      </action>
</critical_sequence> 

1.4 Use Case "R" - Access to Relational Data

One important use of an XML query language will be to access data stored in relational databases. This use case describes one possible way in which this access might be accomplished.

1.4.1 Description

A relational database system might present a view in which each table (relation) takes the form of an XML document. One way to represent a database table as an XML document is to allow the document element to represent the table itself, and each row (tuple) inside the table to be represented by a nested element. Inside the tuple-elements, each column is in turn represented by a nested element. Columns that allow null values are represented by optional elements, and a missing element denotes a null value.

As an example, consider a relational database used by an online auction. The auction maintains a USERS table containing information on registered users, each identified by a unique userid, who can either offer items for sale or bid on items. An ITEMS table lists items currently or recently for sale, with the userid of the user who offered each item. A BIDS table contains all bids on record, keyed by the userid of the bidder and the item number of the item to which the bid applies.

The three tables used by the online auction are below, with their column-names indicated in parentheses.

 USERS ( USERID, NAME, RATING ) ITEMS ( ITEMNO,
                        DESCRIPTION, OFFERED_BY, START_DATE, END_DATE,
                        RESERVE_PRICE ) BIDS ( USERID, ITEMNO, BID, BID_DATE
                    )

1.4.2 Document Type Definition (DTD)

This use case is based on three separate input documents named users.xml, items.xml, and bids.xml. Each of the documents represents one of the tables in the relational database described above, using the following DTDs:

 <!DOCTYPE users [ <!ELEMENT users
                        (user_tuple*)> <!ELEMENT user_tuple
                        (userid, name, rating?)> <!ELEMENT userid
                        (#PCDATA)> <!ELEMENT name
                        (#PCDATA)> <!ELEMENT rating
                        (#PCDATA)> ]> <!DOCTYPE items [
                        <!ELEMENT items (item_tuple*)>
                        <!ELEMENT item_tuple (itemno, description,
                        offered_by, start_date?, end_date?, reserve_price?
                        )> <!ELEMENT itemno (#PCDATA)>
                        <!ELEMENT description (#PCDATA)>
                        <!ELEMENT offered_by (#PCDATA)>
                        <!ELEMENT start_date (#PCDATA)>
                        <!ELEMENT end_date (#PCDATA)>
                        <!ELEMENT reserve_price (#PCDATA)>
                        ]> <!DOCTYPE bids [ <!ELEMENT bids
                        (bid_tuple*)> <!ELEMENT bid_tuple (userid,
                        itemno, bid, bid_date)> <!ELEMENT userid
                        (#PCDATA)> <!ELEMENT itemno
                        (#PCDATA)> <!ELEMENT bid (#PCDATA)>
                        <!ELEMENT bid_date (#PCDATA)>
                    ]>

1.4.3 Sample Data

Here is an abbreviated set of data showing the XML format of the instances:

<items>
  <item_tuple>
    <itemno>1001</itemno>
    <description>Red Bicycle</description>
    <offered_by>U01</offered_by>
    <start_date>1999-01-05</start_date>
    <end_date>1999-01-20</end_date>
    <reserve_price>40</reserve_price>
  </item_tuple>
  <!-- !!! Snip !!! -->

<users>
  <user_tuple>
    <userid>U01</userid>
    <name>Tom Jones</name>
    <rating>B</rating>
  </user_tuple>
  <!-- !!! Snip !!! -->

<bids>
  <bid_tuple> 
    <userid>U02</userid> 
    <itemno>1001</itemno> 
    <bid>35</bid> 
    <bid_date>1999-01-07</bid_date> 
    </bid_tuple> 
  <bid_tuple> 
  <!-- !!! Snip !!! -->
                    

The entire data set is represented by the following table:

USERS
USERID NAME RATING
U01 Tom Jones B
U02 Mary Doe A
U03 Dee Linquent D
U04 Roger Smith C
U05 Jack Sprat B
U06 Rip Van Winkle B
ITEMS
ITEMNO DESCRIPTION OFFERED_BY START_DATE END_DATE RESERVE_PRICE
1001 Red Bicycle U01 1999-01-05 1999-01-20 40
1002 Motorcycle U02 1999-02-11 1999-03-15 500
1003 Old Bicycle U02 1999-01-10 1999-02-20 25
1004 Tricycle U01 1999-02-25 1999-03-08 15
1005 Tennis Racket U03 1999-03-19 1999-04-30 20
1006 Helicopter U03 1999-05-05 1999-05-25 50000
1007 Racing Bicycle U04 1999-01-20 1999-02-20 200
1008 Broken Bicycle U01 1999-02-05 1999-03-06 25
BIDS
USERID ITEMNO BID BID_DATE
U02 1001 35 1999-01-07
U04 1001 40 1999-01-08
U02 1001 45 1999-01-11
U04 1001 50 1999-01-13
U02 1001 55 1999-01-15
U01 1002 400 1999-02-14
U02 1002 600 1999-02-16
U03 1002 800 1999-02-17
U04 1002 1000 1999-02-25
U02 1002 1200 1999-03-02
U04 1003 15 1999-01-22
U05 1003 20 1999-02-03
U01 1004 40 1999-03-05
U03 1007 175 1999-01-25
U05 1007 200 1999-02-08
U04 1007 225 1999-02-12

1.4.4 Queries and Results

1.4.4.1 Q1

List the item number and description of all bicycles that currently have an auction in progress, ordered by item number.

Solution in XQuery:

<result>
  {
    for $i in doc("items.xml")//item_tuple
    where $i/start_date <= current-date()
      and $i/end_date >= current-date() 
      and contains($i/description, "Bicycle")
    order by $i/itemno
    return
        <item_tuple>
            { $i/itemno }
            { $i/description }
        </item_tuple>
  }
</result>

Note:

This solution assumes that the current date is 1999-01-31.

Expected Result:

<result>
    <item_tuple>
        <itemno>1003</itemno>
        <description>Old Bicycle</description>
    </item_tuple>
    <item_tuple>
        <itemno>1007</itemno>
        <description>Racing Bicycle</description>
    </item_tuple>
</result>

The above query returns an element named item_tuple, but its definition does not match the definition of item_tuple in the DTD.

1.4.4.2 Q2

For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.

Solution in XQuery:

<result>
  {
    for $i in doc("items.xml")//item_tuple
    let $b := doc("bids.xml")//bid_tuple[itemno = $i/itemno]
    where contains($i/description, "Bicycle")
    order by $i/itemno
    return
        <item_tuple>
            { $i/itemno }
            { $i/description }
            <high_bid>{ max($b/bid) }</high_bid>
        </item_tuple>
  }
</result> 

Expected Result:

<result>
  <item_tuple>
    <itemno>1001</itemno>
    <description>Red Bicycle</description>
    <high_bid>55</high_bid>
  </item_tuple>
  <item_tuple>
    <itemno>1003</itemno>
    <description>Old Bicycle</description>
    <high_bid>20</high_bid>
  </item_tuple>
  <item_tuple>
    <itemno>1007</itemno>
    <description>Racing Bicycle</description>
    <high_bid>225</high_bid>
  </item_tuple>
  <item_tuple>
    <itemno>1008</itemno>
    <description>Broken Bicycle</description>
    <high_bid></high_bid>
  </item_tuple>
</result> 
1.4.4.3 Q3

Find cases where a user with a rating worse (alphabetically, greater) than "C" is offering an item with a reserve price of more than 1000.

Solution in XQuery:

<result>
  {
    for $u in doc("users.xml")//user_tuple
    for $i in doc("items.xml")//item_tuple
    where $u/rating > "C" 
       and $i/reserve_price > 1000 
       and $i/offered_by = $u/userid
    return
        <warning>
            { $u/name }
            { $u/rating }
            { $i/description }
            { $i/reserve_price }
        </warning>
  }
</result>

Expected Result:

<result>
    <warning>
        <name>Dee Linquent</name>
        <rating>D</rating>
        <description>Helicopter</description>
        <reserve_price>50000</reserve_price>
    </warning>
</result>
1.4.4.4 Q4

List item numbers and descriptions of items that have no bids.

Solution in XQuery:

<result>
  {
    for $i in doc("items.xml")//item_tuple
    where empty(doc("bids.xml")//bid_tuple[itemno = $i/itemno])
    return
        <no_bid_item>
            { $i/itemno }
            { $i/description }
        </no_bid_item>
  }
</result> 

Expected Result:

<result>
    <no_bid_item>
        <itemno>1005</itemno>
        <description>Tennis Racket</description>
    </no_bid_item>
    <no_bid_item>
        <itemno>1006</itemno>
        <description>Helicopter</description>
    </no_bid_item>
    <no_bid_item>
        <itemno>1008</itemno>
        <description>Broken Bicycle</description>
    </no_bid_item>
</result>
1.4.4.5 Q5

For bicycle(s) offered by Tom Jones that have received a bid, list the item number, description, highest bid, and name of the highest bidder, ordered by item number.

Solution in XQuery:

<result>
  {
    for $seller in doc("users.xml")//user_tuple,
        $buyer in  doc("users.xml")//user_tuple,
        $item in  doc("items.xml")//item_tuple,
        $highbid in  doc("bids.xml")//bid_tuple
    where $seller/name = "Tom Jones"
      and $seller/userid  = $item/offered_by
      and contains($item/description , "Bicycle")
      and $item/itemno  = $highbid/itemno
      and $highbid/userid  = $buyer/userid
      and $highbid/bid = max(
                              doc("bids.xml")//bid_tuple
                                [itemno = $item/itemno]/bid
                         )
    order by ($item/itemno)
    return
        <jones_bike>
            { $item/itemno }
            { $item/description }
            <high_bid>{ $highbid/bid }</high_bid>
            <high_bidder>{ $buyer/name }</high_bidder>
        </jones_bike>
  }
</result> 

The above query does several joins, and requires the results in a particular order. If there were no order by clause, results would be reported in document order. If you do not care about the order, you can use the unordered function to inform the query processor that the order of the lists in the for clause is not significant, which means that the tuples can be generated in any order. This can enable better optimization.

Unordered Solution in XQuery:

<result>
  {
   unordered (
    for $seller in doc("users.xml")//user_tuple,
        $buyer in doc("users.xml")//user_tuple,
        $item in doc("items.xml")//item_tuple,
        $highbid in  doc("bids.xml")//bid_tuple
    where $seller/name = "Tom Jones"
      and $seller/userid  = $item/offered_by
      and contains($item/description , "Bicycle")
      and $item/itemno  = $highbid/itemno
      and $highbid/userid  = $buyer/userid
      and $highbid/bid = max(
                              doc("bids.xml")//bid_tuple
                                [itemno = $item/itemno]/bid
                         )
    return
        <jones_bike>
            { $item/itemno }
            { $item/description }
            <high_bid>{ $highbid/bid }</high_bid>
            <high_bidder>{ $buyer/name }</high_bidder>
        </jones_bike>
    )
  }
</result> 

Expected Result:

<result>
    <jones_bike>
        <itemno>1001</itemno>
        <description>Red Bicycle</description>
        <high_bid>
            <bid>55</bid>
        </high_bid>
        <high_bidder>
            <name>Mary Doe</name>
        </high_bidder>
    </jones_bike>
</result>
1.4.4.6 Q6

For each item whose highest bid is more than twice its reserve price, list the item number, description, reserve price, and highest bid.

Solution in XQuery:

<result>
  {
    for $item in doc("items.xml")//item_tuple
    let $b := doc("bids.xml")//bid_tuple[itemno = $item/itemno]
    let $z := max($b/bid)
    where $item/reserve_price * 2 < $z
    return
        <successful_item>
            { $item/itemno }
            { $item/description }
            { $item/reserve_price }
            <high_bid>{$z }</high_bid>
         </successful_item>
  }
</result> 

Expected Result:

<result>
    <successful_item>
        <itemno>1002</itemno>
        <description>Motorcycle</description>
        <reserve_price>500</reserve_price>
        <high_bid>1200</high_bid>
    </successful_item>
    <successful_item>
        <itemno>1004</itemno>
        <description>Tricycle</description>
        <reserve_price>15</reserve_price>
        <high_bid>40</high_bid>
    </successful_item>
</result> 
1.4.4.7 Q7

Find the highest bid ever made for a bicycle or tricycle.

Solution in XQuery:

let $allbikes := doc("items.xml")//item_tuple
                    [contains(description, "Bicycle") 
                     or contains(description, "Tricycle")]
let $bikebids := doc("bids.xml")//bid_tuple[itemno = $allbikes/itemno]
return
    <high_bid>
      { 
        max($bikebids/bid) 
      }
    </high_bid> 

Expected Result:

<high_bid>225</high_bid>
1.4.4.8 Q8

How many items were actioned (auction ended) in March 1999?

Solution in XQuery:

let $item := doc("items.xml")//item_tuple
  [end_date >= xs:date("1999-03-01") and end_date <= xs:date("1999-03-31")]
return
    <item_count>
      { 
        count($item) 
      }
    </item_count>

Expected Result:

<item_count>3</item_count>
1.4.4.9 Q9

List the number of items auctioned each month in 1999 for which data is available, ordered by month.

Solution in XQuery:

<result>
  {
    let $end_dates := doc("items.xml")//item_tuple/end_date
    for $m in distinct-values(for $e in $end_dates 
                              return month-from-date($e))
    let $item := doc("items.xml")
        //item_tuple[year-from-date(end_date) = 1999 
                     and month-from-date(end_date) = $m]
    order by $m
    return
        <monthly_result>
            <month>{ $m }</month>
            <item_count>{ count($item) }</item_count>
        </monthly_result>
  }
</result>

Expected Result:

<result>
    <monthly_result>
        <month>1</month>
        <item_count>1</item_count>
    </monthly_result>
    <monthly_result>
        <month>2</month>
        <item_count>2</item_count>
    </monthly_result>
    <monthly_result>
        <month>3</month>
        <item_count>3</item_count>
    </monthly_result>
    <monthly_result>
        <month>4</month>
        <item_count>1</item_count>
    </monthly_result>
    <monthly_result>
        <month>5</month>
        <item_count>1</item_count>
    </monthly_result>
</result> 
1.4.4.10 Q10

For each item that has received a bid, list the item number, the highest bid, and the name of the highest bidder, ordered by item number.

Solution in XQuery:

<result>
 {
    for $highbid in doc("bids.xml")//bid_tuple,
        $user in doc("users.xml")//user_tuple
    where $user/userid = $highbid/userid 
      and $highbid/bid = max(doc("bids.xml")//bid_tuple[itemno=$highbid/itemno]/bid)
    order by $highbid/itemno
    return
        <high_bid>
            { $highbid/itemno }
            { $highbid/bid }
            <bidder>{ $user/name/text() }</bidder>
        </high_bid>
  }
</result> 

Expected Result:

<result>
    <high_bid>
        <itemno>1001</itemno>
        <bid>55</bid>
        <bidder>Mary Doe</bidder>
    </high_bid>
    <high_bid>
        <itemno>1002</itemno>
        <bid>1200</bid>
        <bidder>Mary Doe</bidder>
    </high_bid>
    <high_bid>
        <itemno>1003</itemno>
        <bid>20</bid>
        <bidder>Jack Sprat</bidder>
    </high_bid>
    <high_bid>
        <itemno>1004</itemno>
        <bid>40</bid>
        <bidder>Tom Jones</bidder>
    </high_bid>
    <high_bid>
        <itemno>1007</itemno>
        <bid>225</bid>
        <bidder>Roger Smith</bidder>
    </high_bid>
</result> 
1.4.4.11 Q11

List the item number and description of the item(s) that received the highest bid ever recorded, and the amount of that bid.

Solution in XQuery:

let $highbid := max(doc("bids.xml")//bid_tuple/bid)
return
    <result>
     {
        for $item in doc("items.xml")//item_tuple,
            $b in doc("bids.xml")//bid_tuple[itemno = $item/itemno]
        where $b/bid = $highbid
        return
            <expensive_item>
                { $item/itemno }
                { $item/description }
                <high_bid>{ $highbid }</high_bid>
            </expensive_item>
     }
    </result> 

Expected Result:

<result>
    <expensive_item>
        <itemno>1002</itemno>
        <description>Motorcycle</description>
        <high_bid>1200</high_bid>
    </expensive_item>
</result> 
1.4.4.12 Q12

List the item number and description of the item(s) that received the largest number of bids, and the number of bids it (or they) received.

Solution in XQuery:

declare function local:bid_summary()
  as element()*
{
    for $i in distinct-values(doc("bids.xml")//itemno)
    let $b := doc("bids.xml")//bid_tuple[itemno = $i]
    return
        <bid_count>
            <itemno>{ $i }</itemno>
            <nbids>{ count($b) }</nbids>
        </bid_count>
};

<result>
 {
    let $bid_counts := local:bid_summary(),
        $maxbids := max($bid_counts/nbids),
        $maxitemnos := $bid_counts[nbids = $maxbids]
    for $item in doc("items.xml")//item_tuple,
        $bc in $bid_counts
    where $bc/nbids =  $maxbids and $item/itemno = $bc/itemno
    return
        <popular_item>
            { $item/itemno }
            { $item/description }
            <bid_count>{ $bc/nbids/text() }</bid_count>
        </popular_item>
 }
</result> 

Expected Result:

<result>
    <popular_item>
        <itemno>1001</itemno>
        <description>Red Bicycle</description>
        <bid_count>5</bid_count>
    </popular_item>
    <popular_item>
        <itemno>1002</itemno>
        <description>Motorcycle</description>
        <bid_count>5</bid_count>
    </popular_item>
</result> 
1.4.4.13 Q13

For each user who has placed a bid, give the userid, name, number of bids, and average bid, in order by userid.

Solution in XQuery:

<result>
 {
    for $uid in distinct-values(doc("bids.xml")//userid),
        $u in doc("users.xml")//user_tuple[userid = $uid]
    let $b := doc("bids.xml")//bid_tuple[userid = $uid]
    order by $u/userid
    return
        <bidder>
            { $u/userid }
            { $u/name }
            <bidcount>{ count($b) }</bidcount>
            <avgbid>{ avg($b/bid) }</avgbid>
        </bidder>
  }
</result> 

Expected Result:

<result>
    <bidder>
        <userid>U01</userid>
        <name>Tom Jones</name>
        <bidcount>2</bidcount>
        <avgbid>220</avgbid>
    </bidder>
    <bidder>
        <userid>U02</userid>
        <name>Mary Doe</name>
        <bidcount>5</bidcount>
        <avgbid>387</avgbid>
    </bidder>
    <bidder>
        <userid>U03</userid>
        <name>Dee Linquent</name>
        <bidcount>2</bidcount>
        <avgbid>487.5</avgbid>
    </bidder>
    <bidder>
        <userid>U04</userid>
        <name>Roger Smith</name>
        <bidcount>5</bidcount>
        <avgbid>266</avgbid>
    </bidder>
    <bidder>
        <userid>U05</userid>
        <name>Jack Sprat</name>
        <bidcount>2</bidcount>
        <avgbid>110</avgbid>
    </bidder>
</result> 
1.4.4.14 Q14

List item numbers and average bids for items that have received three or more bids, in descending order by average bid.

Solution in XQuery:

<result>
 {
    for $i in distinct-values(doc("bids.xml")//itemno)
    let $b := doc("bids.xml")//bid_tuple[itemno = $i]
    let $avgbid := avg($b/bid)
    where count($b) >= 3
    order by $avgbid descending
    return
        <popular_item>
            <itemno>{ $i }</itemno>
            <avgbid>{ $avgbid }</avgbid>
        </popular_item>
  }
</result> 

Expected Result:

<result>
    <popular_item>
        <itemno>1002</itemno>
        <avgbid>800</avgbid>
    </popular_item>
    <popular_item>
        <itemno>1007</itemno>
        <avgbid>200</avgbid>
    </popular_item>
    <popular_item>
        <itemno>1001</itemno>
        <avgbid>45</avgbid>
    </popular_item>
</result>
1.4.4.15 Q15

List names of users who have placed multiple bids of at least $100 each.

Solution in XQuery:

<result>
  {
    for $u in doc("users.xml")//user_tuple
    let $b := doc("bids.xml")//bid_tuple[userid=$u/userid and bid>=100]
    where count($b) > 1
    return
        <big_spender>{ $u/name/text() }</big_spender>
  }
</result>
                        

Expected Result:

<result>
    <big_spender>Mary Doe</big_spender>
    <big_spender>Dee Linquent</big_spender>
    <big_spender>Roger Smith</big_spender>
</result>
1.4.4.16 Q16

List all registered users in order by userid; for each user, include the userid, name, and an indication of whether the user is active (has at least one bid on record) or inactive (has no bid on record).

Solution in XQuery:

<result>
  {
    for $u in doc("users.xml")//user_tuple
    let $b := doc("bids.xml")//bid_tuple[userid = $u/userid]
    order by $u/userid
    return
        <user>
            { $u/userid }
            { $u/name }
            {
                if (empty($b))
                  then <status>inactive</status>
                  else <status>active</status>
            }
        </user>
  }
</result>
                        

Expected Result:

<result>
    <user>
        <userid>U01</userid>
        <name>Tom Jones</name>
        <status>active</status>
    </user>
    <user>
        <userid>U02</userid>
        <name>Mary Doe</name>
        <status>active</status>
    </user>
    <user>
        <userid>U03</userid>
        <name>Dee Linquent</name>
        <status>active</status>
    </user>
    <user>
        <userid>U04</userid>
        <name>Roger Smith</name>
        <status>active</status>
    </user>
    <user>
        <userid>U05</userid>
        <name>Jack Sprat</name>
        <status>active</status>
    </user>
    <user>
        <userid>U06</userid>
        <name>Rip Van Winkle</name>
        <status>inactive</status>
    </user>
</result>
1.4.4.17 Q17

List the names of users, if any, who have bid on every item.

Solution in XQuery:

<frequent_bidder>
  {
    for $u in doc("users.xml")//user_tuple
    where 
      every $item in doc("items.xml")//item_tuple satisfies 
        some $b in doc("bids.xml")//bid_tuple satisfies 
          ($item/itemno = $b/itemno and $u/userid = $b/userid)
    return
        $u/name
  }
</frequent_bidder>

Expected Result:

<frequent_bidder /> 

(No users satisfy Q17.)

1.4.4.18 Q18

List all users in alphabetic order by name. For each user, include descriptions of all the items (if any) that were bid on by that user, in alphabetic order.

Solution in XQuery:

<result>
  {
    for $u in doc("users.xml")//user_tuple
    order by $u/name
    return
        <user>
            { $u/name }
            {
                for $b in distinct-values(doc("bids.xml")//bid_tuple
                                             [userid = $u/userid]/itemno)
                for $i in doc("items.xml")//item_tuple[itemno = $b]
                let $descr := $i/description/text()
                order by $descr
                return
                    <bid_on_item>{ $descr }</bid_on_item>
            }
        </user>
  }
</result>
                        

Expected Result:

<result>
    <user>
        <name>Dee Linquent</name>
        <bid_on_item>Motorcycle</bid_on_item>
        <bid_on_item>Racing Bicycle</bid_on_item>
    </user>
    <user>
        <name>Jack Sprat</name>
        <bid_on_item>Old Bicycle</bid_on_item>
        <bid_on_item>Racing Bicycle</bid_on_item>
    </user>
    <user>
        <name>Mary Doe</name>
        <bid_on_item>Motorcycle</bid_on_item>
        <bid_on_item>Red Bicycle</bid_on_item>
    </user>
    <user>
        <name>Rip Van Winkle</name>
    </user>
    <user>
        <name>Roger Smith</name>
        <bid_on_item>Motorcycle</bid_on_item>
        <bid_on_item>Old Bicycle</bid_on_item>
        <bid_on_item>Racing Bicycle</bid_on_item>
        <bid_on_item>Red Bicycle</bid_on_item>
    </user>
    <user>
        <name>Tom Jones</name>
        <bid_on_item>Motorcycle</bid_on_item>
        <bid_on_item>Tricycle</bid_on_item>
    </user>
</result>

1.5 Use Case "SGML": Standard Generalized Markup Language

1.5.1 Description

The example document and queries in this Use Case were first created for a 1992 conference on Standard Generalized Markup Language (SGML). For our use, the Document Type Definition (DTD) and example document have been translated from SGML to XML.

1.5.2 Document Type Definition (DTD)

This use case is based on data conforming to the DTD shown below.

 <!NOTATION cgm PUBLIC "Computer Graphics Metafile">
 <!NOTATION ccitt PUBLIC "CCITT group 4 raster">

 <!ENTITY % text "(#PCDATA | emph)*">
 <!ENTITY infoflow SYSTEM "infoflow.ccitt" NDATA ccitt>
 <!ENTITY tagexamp SYSTEM "tagexamp.cgm"   NDATA cgm>

 <!ELEMENT report (title, chapter+)>
 <!ELEMENT title %text;>
 <!ELEMENT chapter (title, intro?, section*)>
 <!ATTLIST chapter
               shorttitle CDATA #IMPLIED>
 <!ELEMENT intro (para | graphic)+>
 <!ELEMENT section (title, intro?, topic*)>
 <!ATTLIST section
               shorttitle CDATA #IMPLIED
               sectid ID #IMPLIED>
 <!ELEMENT topic (title, (para | graphic)+)>
 <!ATTLIST topic
               shorttitle CDATA #IMPLIED
               topicid ID #IMPLIED>
 <!ELEMENT para (#PCDATA | emph | xref)*>
 <!ATTLIST para
               security (u | c | s | ts) "u">
 <!ELEMENT emph %text;>
 <!ELEMENT graphic EMPTY>
 <!ATTLIST graphic
               graphname ENTITY #REQUIRED>
 <!ELEMENT xref EMPTY>
 <!ATTLIST xref
               xrefid IDREF #IMPLIED>
                    

1.5.3 Sample Data

The queries in this use case are based on the following sample data, which is found in the file "sgml.xml". Line numbers have been added to the data to allow the results of queries to be conveniently specified.

 0: <!DOCTYPE report SYSTEM "report.dtd">
 1: <report>
 2: <title>Getting started with SGML</title>
 3: <chapter>
 4: <title>The business challenge</title>
 5: <intro>
 6: <para>With the ever-changing and growing global market, companies and
 7: large organizations are searching for ways to become more viable and
 8: competitive. Downsizing and other cost-cutting measures demand more
 9: efficient use of corporate resources. One very important resource is
10: an organization's information.</para>
11: <para>As part of the move toward integrated information management,
12: whole industries are developing and implementing standards for
13: exchanging technical information. This report describes how one such
14: standard, the Standard Generalized Markup Language (SGML), works as
15: part of an overall information management strategy.</para>
16: <graphic graphname="infoflow"/></intro></chapter>
17: <chapter>
18: <title>Getting to know SGML</title>
19: <intro>
20: <para>While SGML is a fairly recent technology, the use of
21: <emph>markup</emph> in computer-generated documents has existed for a
22: while.</para></intro>
23: <section shorttitle="What is markup?">
24: <title>What is markup, or everything you always wanted to know about
25: document preparation but were afraid to ask?</title>
26: <intro>
27: <para>Markup is everything in a document that is not content. The
28: traditional meaning of markup is the manual <emph>marking</emph> up
29: of typewritten text to give instructions for a typesetter or
30: compositor about how to fit the text on a page and what typefaces to
31: use. This kind of markup is known as <emph>procedural markup</emph>.</para></intro>
32: <topic topicid="top1">
33: <title>Procedural markup</title>
34: <para>Most electronic publishing systems today use some form of
35: procedural markup. Procedural markup codes are good for one
36: presentation of the information.</para></topic>
37: <topic topicid="top2">
38: <title>Generic markup</title>
39: <para>Generic markup (also known as descriptive markup) describes the
40: <emph>purpose</emph> of the text in a document. A basic concept of
41: generic markup is that the content of a document must be separate from
42: the style. Generic markup allows for multiple presentations of the
43: information.</para></topic>
44: <topic topicid="top3">
45: <title>Drawbacks of procedural markup</title>
46: <para>Industries involved in technical documentation increasingly
47: prefer generic over procedural markup schemes. When a company changes
48: software or hardware systems, enormous data translation tasks arise,
49: often resulting in errors.</para></topic></section>
50: <section shorttitle="What is SGML?">
51: <title>What <emph>is</emph> SGML in the grand scheme of the universe, anyway?</title>
52: <intro>
53: <para>SGML defines a strict markup scheme with a syntax for defining
54: document data elements and an overall framework for marking up
55: documents.</para>
56: <para>SGML can describe and create documents that are not dependent on
57: any hardware, software, formatter, or operating system. Since SGML documents
58: conform to an international standard, they are portable.</para></intro></secti