This document is also available in these non-normative formats: XML.
Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This version of the Use Cases document corresponds to the XQuery Working Draft released on 15 September 2005. The queries in this document have been parsed using a parser generated from the same grammar used to create the documentation for the XQuery Working Draft. The syntax of types has changed, and this is reflected in the queries. A number of errors have been corrected. See B Change Log for more information on changes.
This is a W3C Working Draft for review by W3C Members and other interested parties. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document has been produced as part of the W3C XML Activity, following the procedures set out for the W3C Process. The document has been written by the XML Query Working Group The goals of the XML Query working group are discussed in the XML Query Working Group charter.
The XML Query Working Group feels that the contents of this Working Draft are relatively stable, and therefore encourages feedback on this version.
As of this publication, the Working Group expects to eventually publish this document as a Working Group Note. It is not expected to become a W3C Recommendation, and therefore it has no associated W3C Patent Policy licensing obligations.
Comments for this specification should be entered into the last call issue tracking system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla. If access to that system is not feasible, you may send your comments to the W3C mailing list public-qt-comments@w3.org. (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with "[use]" at the beginning of the subject field.
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.
1 Use Cases for XML Queries
1.1 Use Case "XMP": Experiences and Exemplars
1.1.1 Document Type Definitions (DTD)
1.1.2 Sample Data
1.1.3 DTD for Q5
1.1.4 Sample Data for Q5
1.1.5 DTD for Q9
1.1.6 Data for Q9
1.1.7 DTD for Q10
1.1.8 Data for Q10
1.1.9 Queries and Results
1.1.9.1 Q1
1.1.9.2 Q2
1.1.9.3 Q3
1.1.9.4 Q4
1.1.9.5 Q5
1.1.9.6 Q6
1.1.9.7 Q7
1.1.9.8 Q8
1.1.9.9 Q9
1.1.9.10 Q10
1.1.9.11 Q11
1.1.9.12 Q12
1.2 Use Case "TREE": Queries that preserve hierarchy
1.2.1 Description
1.2.2 Document Type Definition (DTD)
1.2.3 Sample Data
1.2.4 Queries and Results
1.2.4.1 Q1
1.2.4.2 Q2
1.2.4.3 Q3
1.2.4.4 Q4
1.2.4.5 Q5
1.2.4.6 Q6
1.3 Use Case "SEQ" - Queries based on Sequence
1.3.1 Description
1.3.2 Document Type Definition (DTD)
1.3.3 Sample Data
1.3.4 Queries and Results
1.3.4.1 Q1
1.3.4.2 Q2
1.3.4.3 Q3
1.3.4.4 Q4
1.3.4.5 Q5
1.4 Use Case "R" - Access to Relational Data
1.4.1 Description
1.4.2 Document Type Definition (DTD)
1.4.3 Sample Data
1.4.4 Queries and Results
1.4.4.1 Q1
1.4.4.2 Q2
1.4.4.3 Q3
1.4.4.4 Q4
1.4.4.5 Q5
1.4.4.6 Q6
1.4.4.7 Q7
1.4.4.8 Q8
1.4.4.9 Q9
1.4.4.10 Q10
1.4.4.11 Q11
1.4.4.12 Q12
1.4.4.13 Q13
1.4.4.14 Q14
1.4.4.15 Q15
1.4.4.16 Q16
1.4.4.17 Q17
1.4.4.18 Q18
1.5 Use Case "SGML": Standard Generalized Markup Language
1.5.1 Description
1.5.2 Document Type Definition (DTD)
1.5.3 Sample Data
1.5.4 Queries and Results
1.5.4.1 Q1
1.5.4.2 Q2
1.5.4.3 Q3
1.5.4.4 Q4
1.5.4.5 Q5
1.5.4.6 Q6
1.5.4.7 Q7
1.5.4.8 Q8a
1.5.4.9 Q8b
1.5.4.10 Q9
1.5.4.11 Q10
1.6 Use Case "STRING": String Search
1.6.1 Description
1.6.2 Document Type Definition (DTD)
1.6.3 Sample Data
1.6.4 Queries and Results
1.6.4.1 Q1
1.6.4.2 Q2
1.6.4.3 Q3
1.6.4.4 Q4
1.6.4.5 Q5
1.7 Use Case "NS" - Queries Using Namespaces
1.7.1 Description
1.7.2 Document Type Definition (DTD)
1.7.3 Sample Data
1.7.4 Queries and Results
1.7.4.1 Q1
1.7.4.2 Q2
1.7.4.3 Q3
1.7.4.4 Q4
1.7.4.5 Q5
1.7.4.6 Q6
1.7.4.7 Q7
1.7.4.8 Q8
1.8 Use Case "PARTS" - Recursive Parts Explosion
1.8.1 Description
1.8.2 Document Type Definitions (DTD)
1.8.3 Sample Data
1.8.4 Queries and Results
1.8.4.1 Q1
1.9 Use Case "STRONG" - queries that exploit strongly typed data
1.9.1 Description
1.9.2 Schema
1.9.3 Sample Data
1.9.4 Queries
1.9.4.1 Q1
1.9.4.2 Q2
1.9.4.3 Q3
1.9.4.4 Q4
1.9.4.5 Q5
1.9.4.6 Q6
1.9.4.7 Q7
1.9.4.8 Q8
1.9.4.9 Q9
1.9.4.10 Q10
1.9.4.11 Q11
1.9.4.12 Q12
A Acknowledgements
B Change Log (Non-Normative)
B.1 31 Aug 2005
B.2 11 July 2005
B.3 04 April 2005
B.4 30 Jan 2005
C References (Non-Normative)
The use cases listed below were created by the XML Query Working Group to illustrate important applications for an XML query language. Each use case is focused on a specific application area, and contains a Document Type Definition (DTD) and example input data. Each use case specifies a set of queries that might be applied to the input data, and the expected results for each query. Since the English description of each query is concise, the expected results form an important part of the definition of each query, specifying the expected output format. These use cases were originally published as part of the [Requirements] document, without solutions in concrete query languages. Now it is being republished with solutions for [XQuery]. These use cases are also being used by the W3C XML Query Testing Task Force.
The input environment for each use case is stated in its Document Type Definition (DTD) section. All of these use cases assume that input is provided in the form of one or more documents with specific names. For instance, the authors in a document may be accessed with expressions like this:
doc("http://bstore1.example.com/bib.xml")//author
Some implementations of XQuery bind input to external variables. If the environment has bound the external variable $b to the same document used in the above query, this expression would return the same set of authors:
$b//author
Some implementations of XQuery predefine a single 'context item', which is available at the root level of a query, and which is used to resolve paths that begin with a leading slash. In such an implementation, if the context item is bound to document node of the same well-formed document used in the previous examples, this expression would return the same set of authors:
//author
Previous versions of this document accessed implicit documents using the input() function, which no longer exists. The input() function had similar functionality to a predefined context item, except that it could be bound to a sequence of nodes, whereas the context item may only be bound to a single node. The use cases that used input() have been rewritten to use explicit file names.
Several implementors have asked that we make the queries from these use cases available in a separate file to make it easier for them to test their parsers. These queries may be found in [Use Case Sample Queries]. Also, the queries from the XQuery specification itself have been made available in [XQuery Sample Queries].
To make output more readable, the output of queries has been formatted using whitespace which may not be returned by a query processor. This whitespace should not be considered normative for the correctness of results.
These use cases represent a snapshot of an ongoing work. Some important application areas are not yet adequately covered by a use case. The XML Query Working Group reserves the right to add, delete, or modify individual queries or whole use cases as the work progresses. The presence of a query in this set of use cases does not necessarily indicate that the query will be expressible in the XML Query Language(s) to be created by the XML Query Working Group.
This use case contains several example queries that illustrate requirements gathered from the database and document communities.
Most of the example queries in this use case are based on a bibliography document named "http://bstore1.example.com/bib.xml" with the following DTD:
<!ELEMENT bib (book* )> <!ELEMENT book (title, (author+ | editor+ ), publisher, price )> <!ATTLIST book year CDATA #REQUIRED > <!ELEMENT author (last, first )> <!ELEMENT editor (last, first, affiliation )> <!ELEMENT title (#PCDATA )> <!ELEMENT last (#PCDATA )> <!ELEMENT first (#PCDATA )> <!ELEMENT affiliation (#PCDATA )> <!ELEMENT publisher (#PCDATA )> <!ELEMENT price (#PCDATA )>
Here is the data found at "bstore1.example.com/bib.xml":
<bib>
<book year="1994">
<title>TCP/IP Illustrated</title>
<author><last>Stevens</last><first>W.</first></author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
<author><last>Stevens</last><first>W.</first></author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
<book year="2000">
<title>Data on the Web</title>
<author><last>Abiteboul</last><first>Serge</first></author>
<author><last>Buneman</last><first>Peter</first></author>
<author><last>Suciu</last><first>Dan</first></author>
<publisher>Morgan Kaufmann Publishers</publisher>
<price>39.95</price>
</book>
<book year="1999">
<title>The Economics of Technology and Content for Digital TV</title>
<editor>
<last>Gerbarg</last><first>Darcy</first>
<affiliation>CITI</affiliation>
</editor>
<publisher>Kluwer Academic Publishers</publisher>
<price>129.95</price>
</book>
</bib>
Q5 also uses information on book reviews and prices from a separate data source named "http://bstore2.example.com/reviews.xml" with the following DTD:
<!ELEMENT reviews (entry*)> <!ELEMENT entry (title, price, review)> <!ELEMENT title (#PCDATA)> <!ELEMENT price (#PCDATA)> <!ELEMENT review (#PCDATA)>
Here are the contents of "http://bstore2.example.com/reviews.xml":
<reviews>
<entry>
<title>Data on the Web</title>
<price>34.95</price>
<review>
A very good discussion of semi-structured database
systems and XML.
</review>
</entry>
<entry>
<title>Advanced Programming in the Unix environment</title>
<price>65.95</price>
<review>
A clear and detailed discussion of UNIX programming.
</review>
</entry>
<entry>
<title>TCP/IP Illustrated</title>
<price>65.95</price>
<review>
One of the best books on TCP/IP.
</review>
</entry>
</reviews>
Q9 uses an input document named "books.xml", with the following DTD:
<!ELEMENT chapter (title, section*)> <!ELEMENT section (title, section*)> <!ELEMENT title (#PCDATA)>
Here are the contents of books.xml:
<chapter>
<title>Data Model</title>
<section>
<title>Syntax For Data Model</title>
</section>
<section>
<title>XML</title>
<section>
<title>Basic Syntax</title>
</section>
<section>
<title>XML and Semistructured Data</title>
</section>
</section>
</chapter>
Q10 uses an input document named "prices.xml", with the following DTD:
<!ELEMENT prices (book*)>
<!ELEMENT book (title, source, price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT source (#PCDATA)>
<!ELEMENT price (#PCDATA)>
Here are the contents of prices.xml:
<prices>
<book>
<title>Advanced Programming in the Unix environment</title>
<source>bstore2.example.com</source>
<price>65.95</price>
</book>
<book>
<title>Advanced Programming in the Unix environment</title>
<source>bstore1.example.com</source>
<price>65.95</price>
</book>
<book>
<title>TCP/IP Illustrated</title>
<source>bstore2.example.com</source>
<price>65.95</price>
</book>
<book>
<title>TCP/IP Illustrated</title>
<source>bstore1.example.com</source>
<price>65.95</price>
</book>
<book>
<title>Data on the Web</title>
<source>bstore2.example.com</source>
<price>34.95</price>
</book>
<book>
<title>Data on the Web</title>
<source>bstore1.example.com</source>
<price>39.95</price>
</book>
</prices>
List books published by Addison-Wesley after 1991, including their year and title.
Solution in XQuery:
<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
return
<book year="{ $b/@year }">
{ $b/title }
</book>
}
</bib>
Expected Result:
<bib>
<book year="1994">
<title>TCP/IP Illustrated</title>
</book>
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
</book>
</bib>
Create a flat list of all the title-author pairs, with each pair enclosed in a "result" element.
Solution in XQuery:
<results>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book,
$t in $b/title,
$a in $b/author
return
<result>
{ $t }
{ $a }
</result>
}
</results>
Expected Result:
<results>
<result>
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</result>
<result>
<title>Advanced Programming in the Unix environment</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</result>
<result>
<title>Data on the Web</title>
<author>
<last>Abiteboul</last>
<first>Serge</first>
</author>
</result>
<result>
<title>Data on the Web</title>
<author>
<last>Buneman</last>
<first>Peter</first>
</author>
</result>
<result>
<title>Data on the Web</title>
<author>
<last>Suciu</last>
<first>Dan</first>
</author>
</result>
</results>
For each book in the bibliography, list the title and authors, grouped inside a "result" element.
Solution in XQuery:
<results>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
return
<result>
{ $b/title }
{ $b/author }
</result>
}
</results>
Expected Result:
<results>
<result>
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</result>
<result>
<title>Advanced Programming in the Unix environment</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</result>
<result>
<title>Data on the Web</title>
<author>
<last>Abiteboul</last>
<first>Serge</first>
</author>
<author>
<last>Buneman</last>
<first>Peter</first>
</author>
<author>
<last>Suciu</last>
<first>Dan</first>
</author>
</result>
<result>
<title>The Economics of Technology and Content for Digital TV</title>
</result>
</results>
For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.
Solution in XQuery:
<results>
{
let $a := doc("http://bstore1.example.com/bib/bib.xml")//author
for $last in distinct-values($a/last),
$first in distinct-values($a[last=$last]/first)
order by $last, $first
return
<result>
<author>
<last>{ $last }</last>
<first>{ $first }</first>
</author>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
where some $ba in $b/author
satisfies ($ba/last = $last and $ba/first=$first)
return $b/title
}
</result>
}
</results>
The order in which values are returned by distinct-values() is undefined. The distinct-values() function returns atomic values, extracting the names from the elements.
Expected Result:
<results>
<result>
<author>
<last>Abiteboul</last>
<first>Serge</first>
</author>
<title>Data on the Web</title>
</result>
<result>
<author>
<last>Buneman</last>
<first>Peter</first>
</author>
<title>Data on the Web</title>
</result>
<result>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
<title>TCP/IP Illustrated</title>
<title>Advanced Programming in the Unix environment</title>
</result>
<result>
<author>
<last>Suciu</last>
<first>Dan</first>
</author>
<title>Data on the Web</title>
</result>
</results>
For each book found at both bstore1.example.com and bstore2.example.com, list the title of the book and its price from each source.
Solution in XQuery:
<books-with-prices>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book,
$a in doc("http://bstore2.example.com/reviews.xml")//entry
where $b/title = $a/title
return
<book-with-prices>
{ $b/title }
<price-bstore2>{ $a/price/text() }</price-bstore2>
<price-bstore1>{ $b/price/text() }</price-bstore1>
</book-with-prices>
}
</books-with-prices>
Expected Result:
<books-with-prices>
<book-with-prices>
<title>TCP/IP Illustrated</title>
<price-bstore2>65.95</price-bstore2>
<price-bstore1>65.95</price-bstore1>
</book-with-prices>
<book-with-prices>
<title>Advanced Programming in the Unix environment</title>
<price-bstore2>65.95</price-bstore2>
<price-bstore1>65.95</price-bstore1>
</book-with-prices>
<book-with-prices>
<title>Data on the Web</title>
<price-bstore2>34.95</price-bstore2>
<price-bstore1>39.95</price-bstore1>
</book-with-prices>
</books-with-prices>
For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors.
Solution in XQuery:
<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book
where count($b/author) > 0
return
<book>
{ $b/title }
{
for $a in $b/author[position()<=2]
return $a
}
{
if (count($b/author) > 2)
then <et-al/>
else ()
}
</book>
}
</bib>
Expected Result:
<bib>
<book>
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</book>
<book>
<title>Advanced Programming in the Unix environment</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</book>
<book>
<title>Data on the Web</title>
<author>
<last>Abiteboul</last>
<first>Serge</first>
</author>
<author>
<last>Buneman</last>
<first>Peter</first>
</author>
<et-al/>
</book>
</bib>
List the titles and years of all books published by Addison-Wesley after 1991, in alphabetic order.
Solution in XQuery:
<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
order by $b/title
return
<book>
{ $b/@year }
{ $b/title }
</book>
}
</bib>
Expected Result:
<bib>
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
</book>
<book year="1994">
<title>TCP/IP Illustrated</title>
</book>
</bib>
Find books in which the name of some element ends with the string "or" and the same element contains the string "Suciu" somewhere in its content. For each such book, return the title and the qualifying element.
Solution in XQuery:
for $b in doc("http://bstore1.example.com/bib.xml")//book
let $e := $b/*[contains(string(.), "Suciu")
and ends-with(local-name(.), "or")]
where exists($e)
return
<book>
{ $b/title }
{ $e }
</book>
In the above solution, string(), local-name() and ends-with() are functions defined in the Functions and Operators document.
Expected Result:
<book>
<title>Data on the Web</title>
<author>
<last>Suciu</last>
<first>Dan</first>
</author>
</book>
In the document "books.xml", find all section or chapter titles that contain the word "XML", regardless of the level of nesting.
Solution in XQuery:
<results>
{
for $t in doc("books.xml")//(chapter | section)/title
where contains($t/text(), "XML")
return $t
}
</results>
Expected Result:
<results>
<title>XML</title>
<title>XML and Semistructured Data</title>
</results>
In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.
Solution in XQuery:
<results>
{
let $doc := doc("prices.xml")
for $t in distinct-values($doc//book/title)
let $p := $doc//book[title = $t]/price
return
<minprice title="{ $t }">
<price>{ min($p) }</price>
</minprice>
}
</results>
Expected Result:
<results>
<minprice title="Advanced Programming in the Unix environment">
<price>65.95</price>
</minprice>
<minprice title="TCP/IP Illustrated">
<price>65.95</price>
</minprice>
<minprice title="Data on the Web">
<price>34.95</price>
</minprice>
</results>
For each book with an author, return the book with its title and authors. For each book with an editor, return a reference with the book title and the editor's affiliation.
Solution in XQuery:
<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book[author]
return
<book>
{ $b/title }
{ $b/author }
</book>
}
{
for $b in doc("http://bstore1.example.com/bib.xml")//book[editor]
return
<reference>
{ $b/title }
{$b/editor/affiliation}
</reference>
}
</bib>
Expected Result:
<bib>
<book>
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</book>
<book>
<title>Advanced Programming in the Unix environment</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
</book>
<book>
<title>Data on the Web</title>
<author>
<last>Abiteboul</last>
<first>Serge</first>
</author>
<author>
<last>Buneman</last>
<first>Peter</first>
</author>
<author>
<last>Suciu</last>
<first>Dan</first>
</author>
</book>
<reference>
<title>The Economics of Technology and Content for Digital TV</title>
<affiliation>CITI</affiliation>
</reference>
</bib>
Find pairs of books that have different titles but the same set of authors (possibly in a different order).
Solution in XQuery:
<bib>
{
for $book1 in doc("http://bstore1.example.com/bib.xml")//book,
$book2 in doc("http://bstore1.example.com/bib.xml")//book
let $aut1 := for $a in $book1/author
order by $a/last, $a/first
return $a
let $aut2 := for $a in $book2/author
order by $a/last, $a/first
return $a
where $book1 << $book2
and not($book1/title = $book2/title)
and deep-equal($aut1, $aut2)
return
<book-pair>
{ $book1/title }
{ $book2/title }
</book-pair>
}
</bib>
Expected Result:
<bib>
<book-pair>
<title>TCP/IP Illustrated</title>
<title>Advanced Programming in the Unix environment</title>
</book-pair>
</bib>
The above solution uses a function, deep-equal(), which compares sequences. Two sequences are equal if all items in corresponding positions in the two sequences are equal - if the sequences are node sequences, the values of the nodes are used for comparison.
Some XML document-types have a very flexible structure in which text is mixed with elements and many elements are optional. These document-types show a wide variation in structure from one document to another. In documents of these types, the ways in which elements are ordered and nested are usually quite important.
An XML query language should have the ability to extract elements from documents while preserving their original hierarchy. This Use Case illustrates this requirement by means of a flexible document type named Book.
This use case is based on an input document named "book.xml". The DTD for this schema is found in a file called "book.dtd":
<!ELEMENT book (title, author+, section+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT section (title, (p | figure | section)* )>
<!ATTLIST section
id ID #IMPLIED
difficulty CDATA #IMPLIED>
<!ELEMENT p (#PCDATA)>
<!ELEMENT figure (title, image)>
<!ATTLIST figure
width CDATA #REQUIRED
height CDATA #REQUIRED >
<!ELEMENT image EMPTY>
<!ATTLIST image
source CDATA #REQUIRED >
The queries in this use case are based on the following sample data.
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
<title>Data on the Web</title>
<author>Serge Abiteboul</author>
<author>Peter Buneman</author>
<author>Dan Suciu</author>
<section id="intro" difficulty="easy" >
<title>Introduction</title>
<p>Text ... </p>
<section>
<title>Audience</title>
<p>Text ... </p>
</section>
<section>
<title>Web Data and the Two Cultures</title>
<p>Text ... </p>
<figure height="400" width="400">
<title>Traditional client/server architecture</title>
<image source="csarch.gif"/>
</figure>
<p>Text ... </p>
</section>
</section>
<section id="syntax" difficulty="medium" >
<title>A Syntax For Data</title>
<p>Text ... </p>
<figure height="200" width="500">
<title>Graph representations of structures</title>
<image source="graphs.gif"/>
</figure>
<p>Text ... </p>
<section>
<title>Base Types</title>
<p>Text ... </p>
</section>
<section>
<title>Representing Relational Databases</title>
<p>Text ... </p>
<figure height="250" width="400">
<title>Examples of Relations</title>
<image source="relations.gif"/>
</figure>
</section>
<section>
<title>Representing Object Databases</title>
<p>Text ... </p>
</section>
</section>
</book>
Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each <section> element, if any.
Solution in XQuery:
declare function local:toc($book-or-section as element()) as element()*
{
for $section in $book-or-section/section
return
<section>
{ $section/@* , $section/title , local:toc($section) }
</section>
};
<toc>
{
for $s in doc("book.xml")/book return local:toc($s)
}
</toc>
Expected Result:
<toc>
<section id="intro" difficulty="easy">
<title>Introduction</title>
<section>
<title>Audience</title>
</section>
<section>
<title>Web Data and the Two Cultures</title>
</section>
</section>
<section id="syntax" difficulty="medium">
<title>A Syntax For Data</title>
<section>
<title>Base Types</title>
</section>
<section>
<title>Representing Relational Databases</title>
</section>
<section>
<title>Representing Object Databases</title>
</section>
</section>
</toc>
Prepare a (flat) figure list for Book1, listing all the figures and their titles. Preserve the original attributes of each <figure> element, if any.
Solution in XQuery:
<figlist>
{
for $f in doc("book.xml")//figure
return
<figure>
{ $f/@* }
{ $f/title }
</figure>
}
</figlist>
Expected Result:
<figlist>
<figure height="400" width="400">
<title>Traditional client/server architecture</title>
</figure>
<figure height="200" width="500">
<title>Graph representations of structures</title>
</figure>
<figure height="250" width ="400">
<title>Examples of Relations</title>
</figure>
</figlist>
How many sections are in Book1, and how many figures?
Solution in XQuery:
<section_count>{ count(doc("book.xml")//section) }</section_count>,
<figure_count>{ count(doc("book.xml")//figure) }</figure_count>
Expected Result:
<section_count>7</section_count> <figure_count>3</figure_count>
How many top-level sections are in Book1?
Solution in XQuery:
<top_section_count>
{
count(doc("book.xml")/book/section)
}
</top_section_count>
Expected Result:
<top_section_count>2</top_section_count>
Make a flat list of the section elements in Book1. In place of its original attributes, each section element should have two attributes, containing the title of the section and the number of figures immediately contained in the section.
Solution in XQuery:
<section_list>
{
for $s in doc("book.xml")//section
let $f := $s/figure
return
<section title="{ $s/title/text() }" figcount="{ count($f) }"/>
}
</section_list>
Expected Result:
<section_list>
<section title="Introduction" figcount="0"/>
<section title="Audience" figcount="0"/>
<section title="Web Data and the Two Cultures" figcount="1"/>
<section title="A Syntax For Data" figcount="1"/>
<section title="Base Types" figcount="0"/>
<section title="Representing Relational Databases" figcount="1"/>
<section title="Representing Object Databases" figcount="0"/>
</section_list>
Make a nested list of the section elements in Book1, preserving their original attributes and hierarchy. Inside each section element, include the title of the section and an element that includes the number of figures immediately contained in the section.
Solution in XQuery:
declare function local:section-summary($book-or-section as element()*)
as element()*
{
for $section in $book-or-section
return
<section>
{ $section/@* }
{ $section/title }
<figcount>
{ count($section/figure) }
</figcount>
{ local:section-summary($section/section) }
</section>
};
<toc>
{
for $s in doc("book.xml")/book/section
return local:section-summary($s)
}
</toc>
| Editorial note | |
| This solution was provided by Michael Wenger, a student at the University of Würzburg. | |
Expected Result:
<toc>
<section id="intro" difficulty="easy">
<title>Introduction</title>
<figcount>0</figcount>
<section>
<title>Audience</title>
<figcount>0</figcount>
</section>
<section>
<title>Web Data and the Two Cultures</title>
<figcount>1</figcount>
</section>
</section>
<section id="syntax" difficulty="medium">
<title>A Syntax For Data</title>
<figcount>1</figcount>
<section>
<title>Base Types</title>
<figcount>0</figcount>
</section>
<section>
<title>Representing Relational Databases</title>
<figcount>1</figcount>
</section>
<section>
<title>Representing Object Databases</title>
<figcount>0</figcount>
</section>
</section>
</toc>
This use case illustrates queries based on the sequence in which elements appear in a document.
Although sequence is not significant in most traditional database systems or object systems, it can be quite significant in structured documents. This use case presents a series of queries based on a medical report.
This use case is based on a medical report using the HL7 Patient Record Architecture. We simplify the DTD in this example, using only what is needed to understand the queries.
<!DOCTYPE report [
<!ELEMENT report (section*)>
<!ELEMENT section (section.title, section.content)>
<!ELEMENT section.title (#PCDATA )>
<!ELEMENT section.content (#PCDATA | anesthesia | prep
| incision | action | observation )*>
<!ELEMENT anesthesia (#PCDATA)>
<!ELEMENT prep ( (#PCDATA | action)* )>
<!ELEMENT incision ( (#PCDATA | geography | instrument)* )>
<!ELEMENT action ( (#PCDATA | instrument )* )>
<!ELEMENT observation (#PCDATA)>
<!ELEMENT geography (#PCDATA)>
<!ELEMENT instrument (#PCDATA)>
]>
The queries in this use case are based on the following sample data.
<report>
<section>
<section.title>Procedure</section.title>
<section.content>
The patient was taken to the operating room where she was placed
in supine position and
<anesthesia>induced under general anesthesia.</anesthesia>
<prep>
<action>A Foley catheter was placed to decompress the bladder</action>
and the abdomen was then prepped and draped in sterile fashion.
</prep>
<incision>
A curvilinear incision was made
<geography>in the midline immediately infraumbilical</geography>
and the subcutaneous tissue was divided
<instrument>using electrocautery.</instrument>
</incision>
The fascia was identified and
<action>#2 0 Maxon stay sutures were placed on each side of the midline.
</action>
<incision>
The fascia was divided using
<instrument>electrocautery</instrument>
and the peritoneum was entered.
</incision>
<observation>The small bowel was identified.</observation>
and
<action>
the
<instrument>Hasson trocar</instrument>
was placed under direct visualization.
</action>
<action>
The
<instrument>trocar</instrument>
was secured to the fascia using the stay sutures.
</action>
</section.content>
</section>
</report>
In the Procedure section of Report1, what Instruments were used in the second Incision?
Solution in XQuery:
for $s in doc("report1.xml")//section[section.title = "Procedure"]
return ($s//incision)[2]/instrument
Expected Result:
<instrument>electrocautery</instrument>
In the Procedure section of Report1, what are the first two Instruments to be used?
Solution in XQuery:
for $s in doc("report1.xml")//section[section.title = "Procedure"]
return ($s//instrument)[position()<=2]
Expected Result:
<instrument>using electrocautery.</instrument> <instrument>electrocautery</instrument>
In Report1, what Instruments were used in the first two Actions after the second Incision?
Solution in XQuery:
let $i2 := (doc("report1.xml")//incision)[2]
for $a in (doc("report1.xml")//action)[. >> $i2][position()<=2]
return $a//instrument
Expected Result:
<instrument>Hasson trocar</instrument> <instrument>trocar</instrument>
In Report1, find "Procedure" sections where no Anesthesia element occurs before the first Incision
Solution in XQuery:
for $p in doc("report1.xml")//section[section.title = "Procedure"]
where not(some $a in $p//anesthesia satisfies
$a << ($p//incision)[1] )
return $p
Expected Result:
(No sections satisfy Q4, thankfully.)
In Report1, what happened between the first Incision and the second Incision?
Solution in XQuery:
declare function local:precedes($a as node(), $b as node()) as xs:boolean
{
$a << $b
and
empty($a//node() intersect $b)
};
declare function local:follows($a as node(), $b as node()) as xs:boolean
{
$a >> $b
and
empty($b//node() intersect $a)
};
<critical_sequence>
{
let $proc := doc("report1.xml")//section[section.title="Procedure"][1]
for $n in $proc//node()
where local:follows($n, ($proc//incision)[1])
and local:precedes($n, ($proc//incision)[2])
return $n
}
</critical_sequence>
Here is another solution that is perhaps more efficient and less readable:
<critical_sequence>
{
let $proc := doc("report1.xml")//section[section.title="Procedure"][1],
$i1 := ($proc//incision)[1],
$i2 := ($proc//incision)[2]
for $n in $proc//node() except $i1//node()
where $n >> $i1 and $n << $i2
return $n
}
</critical_sequence>
Expected Result:
<critical_sequence>
The fascia was identified and
<action>#2 0 Maxon stay sutures were placed on each side of the midline.
</action>#2 0 Maxon stay sutures were placed on each side of the midline.
</critical_sequence>
In the above output, the contents of the critical sequence element include a text node, an action element, and the text node containing the content of the action element. But the serialization we are using already shows all descendants of a given node. If $c is bound to a sequence of nodes, the following expression eliminates members of the sequence that are descendants of another node already found in the sequence:
$c except $c//node()
In the following solution, the between() function takes a sequence of nodes, a starting node, and an ending node, and returns the nodes between them:
declare function local:between($seq as node()*, $start as node(), $end as node())
as item()*
{
let $nodes :=
for $n in $seq except $start//node()
where $n >> $start and $n << $end
return $n
return $nodes except $nodes//node()
};
<critical_sequence>
{
let $proc := doc("report1.xml")//section[section.title="Procedure"][1],
$first := ($proc//incision)[1],
$second:= ($proc//incision)[2]
return local:between($proc//node(), $first, $second)
}
</critical_sequence>
Here is the output from the above query:
<critical_sequence>
The fascia was identified and
<action>#2 0 Maxon stay sutures were placed on each side of the midline.
</action>
</critical_sequence>
One important use of an XML query language will be to access data stored in relational databases. This use case describes one possible way in which this access might be accomplished.
A relational database system might present a view in which each table (relation) takes the form of an XML document. One way to represent a database table as an XML document is to allow the document element to represent the table itself, and each row (tuple) inside the table to be represented by a nested element. Inside the tuple-elements, each column is in turn represented by a nested element. Columns that allow null values are represented by optional elements, and a missing element denotes a null value.
As an example, consider a relational database used by an online auction. The auction maintains a USERS table containing information on registered users, each identified by a unique userid, who can either offer items for sale or bid on items. An ITEMS table lists items currently or recently for sale, with the userid of the user who offered each item. A BIDS table contains all bids on record, keyed by the userid of the bidder and the item number of the item to which the bid applies.
The three tables used by the online auction are below, with their column-names indicated in parentheses.
USERS ( USERID, NAME, RATING ) ITEMS ( ITEMNO,
DESCRIPTION, OFFERED_BY, START_DATE, END_DATE,
RESERVE_PRICE ) BIDS ( USERID, ITEMNO, BID, BID_DATE
)
This use case is based on three separate input documents named users.xml, items.xml, and bids.xml. Each of the documents represents one of the tables in the relational database described above, using the following DTDs:
<!DOCTYPE users [ <!ELEMENT users
(user_tuple*)> <!ELEMENT user_tuple
(userid, name, rating?)> <!ELEMENT userid
(#PCDATA)> <!ELEMENT name
(#PCDATA)> <!ELEMENT rating
(#PCDATA)> ]> <!DOCTYPE items [
<!ELEMENT items (item_tuple*)>
<!ELEMENT item_tuple (itemno, description,
offered_by, start_date?, end_date?, reserve_price?
)> <!ELEMENT itemno (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT offered_by (#PCDATA)>
<!ELEMENT start_date (#PCDATA)>
<!ELEMENT end_date (#PCDATA)>
<!ELEMENT reserve_price (#PCDATA)>
]> <!DOCTYPE bids [ <!ELEMENT bids
(bid_tuple*)> <!ELEMENT bid_tuple (userid,
itemno, bid, bid_date)> <!ELEMENT userid
(#PCDATA)> <!ELEMENT itemno
(#PCDATA)> <!ELEMENT bid (#PCDATA)>
<!ELEMENT bid_date (#PCDATA)>
]>
Here is an abbreviated set of data showing the XML format of the instances:
<items>
<item_tuple>
<itemno>1001</itemno>
<description>Red Bicycle</description>
<offered_by>U01</offered_by>
<start_date>1999-01-05</start_date>
<end_date>1999-01-20</end_date>
<reserve_price>40</reserve_price>
</item_tuple>
<!-- !!! Snip !!! -->
<users>
<user_tuple>
<userid>U01</userid>
<name>Tom Jones</name>
<rating>B</rating>
</user_tuple>
<!-- !!! Snip !!! -->
<bids>
<bid_tuple>
<userid>U02</userid>
<itemno>1001</itemno>
<bid>35</bid>
<bid_date>1999-01-07</bid_date>
</bid_tuple>
<bid_tuple>
<!-- !!! Snip !!! -->
The entire data set is represented by the following table:
| USERID | NAME | RATING |
| U01 | Tom Jones | B |
| U02 | Mary Doe | A |
| U03 | Dee Linquent | D |
| U04 | Roger Smith | C |
| U05 | Jack Sprat | B |
| U06 | Rip Van Winkle | B |
| ITEMNO | DESCRIPTION | OFFERED_BY | START_DATE | END_DATE | RESERVE_PRICE |
| 1001 | Red Bicycle | U01 | 1999-01-05 | 1999-01-20 | 40 |
| 1002 | Motorcycle | U02 | 1999-02-11 | 1999-03-15 | 500 |
| 1003 | Old Bicycle | U02 | 1999-01-10 | 1999-02-20 | 25 |
| 1004 | Tricycle | U01 | 1999-02-25 | 1999-03-08 | 15 |
| 1005 | Tennis Racket | U03 | 1999-03-19 | 1999-04-30 | 20 |
| 1006 | Helicopter | U03 | 1999-05-05 | 1999-05-25 | 50000 |
| 1007 | Racing Bicycle | U04 | 1999-01-20 | 1999-02-20 | 200 |
| 1008 | Broken Bicycle | U01 | 1999-02-05 | 1999-03-06 | 25 |
| USERID | ITEMNO | BID | BID_DATE |
| U02 | 1001 | 35 | 1999-01-07 |
| U04 | 1001 | 40 | 1999-01-08 |
| U02 | 1001 | 45 | 1999-01-11 |
| U04 | 1001 | 50 | 1999-01-13 |
| U02 | 1001 | 55 | 1999-01-15 |
| U01 | 1002 | 400 | 1999-02-14 |
| U02 | 1002 | 600 | 1999-02-16 |
| U03 | 1002 | 800 | 1999-02-17 |
| U04 | 1002 | 1000 | 1999-02-25 |
| U02 | 1002 | 1200 | 1999-03-02 |
| U04 | 1003 | 15 | 1999-01-22 |
| U05 | 1003 | 20 | 1999-02-03 |
| U01 | 1004 | 40 | 1999-03-05 |
| U03 | 1007 | 175 | 1999-01-25 |
| U05 | 1007 | 200 | 1999-02-08 |
| U04 | 1007 | 225 | 1999-02-12 |
List the item number and description of all bicycles that currently have an auction in progress, ordered by item number.
Solution in XQuery:
<result>
{
for $i in doc("items.xml")//item_tuple
where $i/start_date <= current-date()
and $i/end_date >= current-date()
and contains($i/description, "Bicycle")
order by $i/itemno
return
<item_tuple>
{ $i/itemno }
{ $i/description }
</item_tuple>
}
</result>
Note:
This solution assumes that the current date is 1999-01-31.
Expected Result:
<result>
<item_tuple>
<itemno>1003</itemno>
<description>Old Bicycle</description>
</item_tuple>
<item_tuple>
<itemno>1007</itemno>
<description>Racing Bicycle</description>
</item_tuple>
</result>
The above query returns an element named item_tuple, but its definition does not match the definition of item_tuple in the DTD.
For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.
Solution in XQuery:
<result>
{
for $i in doc("items.xml")//item_tuple
let $b := doc("bids.xml")//bid_tuple[itemno = $i/itemno]
where contains($i/description, "Bicycle")
order by $i/itemno
return
<item_tuple>
{ $i/itemno }
{ $i/description }
<high_bid>{ max($b/bid) }</high_bid>
</item_tuple>
}
</result>
Expected Result:
<result>
<item_tuple>
<itemno>1001</itemno>
<description>Red Bicycle</description>
<high_bid>55</high_bid>
</item_tuple>
<item_tuple>
<itemno>1003</itemno>
<description>Old Bicycle</description>
<high_bid>20</high_bid>
</item_tuple>
<item_tuple>
<itemno>1007</itemno>
<description>Racing Bicycle</description>
<high_bid>225</high_bid>
</item_tuple>
<item_tuple>
<itemno>1008</itemno>
<description>Broken Bicycle</description>
<high_bid></high_bid>
</item_tuple>
</result>
Find cases where a user with a rating worse (alphabetically, greater) than "C" is offering an item with a reserve price of more than 1000.
Solution in XQuery:
<result>
{
for $u in doc("users.xml")//user_tuple
for $i in doc("items.xml")//item_tuple
where $u/rating > "C"
and $i/reserve_price > 1000
and $i/offered_by = $u/userid
return
<warning>
{ $u/name }
{ $u/rating }
{ $i/description }
{ $i/reserve_price }
</warning>
}
</result>
Expected Result:
<result>
<warning>
<name>Dee Linquent</name>
<rating>D</rating>
<description>Helicopter</description>
<reserve_price>50000</reserve_price>
</warning>
</result>
List item numbers and descriptions of items that have no bids.
Solution in XQuery:
<result>
{
for $i in doc("items.xml")//item_tuple
where empty(doc("bids.xml")//bid_tuple[itemno = $i/itemno])
return
<no_bid_item>
{ $i/itemno }
{ $i/description }
</no_bid_item>
}
</result>
Expected Result:
<result>
<no_bid_item>
<itemno>1005</itemno>
<description>Tennis Racket</description>
</no_bid_item>
<no_bid_item>
<itemno>1006</itemno>
<description>Helicopter</description>
</no_bid_item>
<no_bid_item>
<itemno>1008</itemno>
<description>Broken Bicycle</description>
</no_bid_item>
</result>
For bicycle(s) offered by Tom Jones that have received a bid, list the item number, description, highest bid, and name of the highest bidder, ordered by item number.
Solution in XQuery:
<result>
{
for $seller in doc("users.xml")//user_tuple,
$buyer in doc("users.xml")//user_tuple,
$item in doc("items.xml")//item_tuple,
$highbid in doc("bids.xml")//bid_tuple
where $seller/name = "Tom Jones"
and $seller/userid = $item/offered_by
and contains($item/description , "Bicycle")
and $item/itemno = $highbid/itemno
and $highbid/userid = $buyer/userid
and $highbid/bid = max(
doc("bids.xml")//bid_tuple
[itemno = $item/itemno]/bid
)
order by ($item/itemno)
return
<jones_bike>
{ $item/itemno }
{ $item/description }
<high_bid>{ $highbid/bid }</high_bid>
<high_bidder>{ $buyer/name }</high_bidder>
</jones_bike>
}
</result>
The above query does several joins, and requires the results in a particular order. If there were no order by clause, results would be reported in document order. If you do not care about the order, you can use the unordered function to inform the query processor that the order of the lists in the for clause is not significant, which means that the tuples can be generated in any order. This can enable better optimization.
Unordered Solution in XQuery:
<result>
{
unordered (
for $seller in doc("users.xml")//user_tuple,
$buyer in doc("users.xml")//user_tuple,
$item in doc("items.xml")//item_tuple,
$highbid in doc("bids.xml")//bid_tuple
where $seller/name = "Tom Jones"
and $seller/userid = $item/offered_by
and contains($item/description , "Bicycle")
and $item/itemno = $highbid/itemno
and $highbid/userid = $buyer/userid
and $highbid/bid = max(
doc("bids.xml")//bid_tuple
[itemno = $item/itemno]/bid
)
return
<jones_bike>
{ $item/itemno }
{ $item/description }
<high_bid>{ $highbid/bid }</high_bid>
<high_bidder>{ $buyer/name }</high_bidder>
</jones_bike>
)
}
</result>
Expected Result:
<result>
<jones_bike>
<itemno>1001</itemno>
<description>Red Bicycle</description>
<high_bid>
<bid>55</bid>
</high_bid>
<high_bidder>
<name>Mary Doe</name>
</high_bidder>
</jones_bike>
</result>
For each item whose highest bid is more than twice its reserve price, list the item number, description, reserve price, and highest bid.
Solution in XQuery:
<result>
{
for $item in doc("items.xml")//item_tuple
let $b := doc("bids.xml")//bid_tuple[itemno = $item/itemno]
let $z := max($b/bid)
where $item/reserve_price * 2 < $z
return
<successful_item>
{ $item/itemno }
{ $item/description }
{ $item/reserve_price }
<high_bid>{$z }</high_bid>
</successful_item>
}
</result>
Expected Result:
<result>
<successful_item>
<itemno>1002</itemno>
<description>Motorcycle</description>
<reserve_price>500</reserve_price>
<high_bid>1200</high_bid>
</successful_item>
<successful_item>
<itemno>1004</itemno>
<description>Tricycle</description>
<reserve_price>15</reserve_price>
<high_bid>40</high_bid>
</successful_item>
</result>
Find the highest bid ever made for a bicycle or tricycle.
Solution in XQuery:
let $allbikes := doc("items.xml")//item_tuple
[contains(description, "Bicycle")
or contains(description, "Tricycle")]
let $bikebids := doc("bids.xml")//bid_tuple[itemno = $allbikes/itemno]
return
<high_bid>
{
max($bikebids/bid)
}
</high_bid>
Expected Result:
<high_bid>225</high_bid>
How many items were actioned (auction ended) in March 1999?
Solution in XQuery:
let $item := doc("items.xml")//item_tuple
[end_date >= xs:date("1999-03-01") and end_date <= xs:date("1999-03-31")]
return
<item_count>
{
count($item)
}
</item_count>
Expected Result:
<item_count>3</item_count>
List the number of items auctioned each month in 1999 for which data is available, ordered by month.
Solution in XQuery:
<result>
{
let $end_dates := doc("items.xml")//item_tuple/end_date
for $m in distinct-values(for $e in $end_dates
return month-from-date($e))
let $item := doc("items.xml")
//item_tuple[year-from-date(end_date) = 1999
and month-from-date(end_date) = $m]
order by $m
return
<monthly_result>
<month>{ $m }</month>
<item_count>{ count($item) }</item_count>
</monthly_result>
}
</result>
Expected Result:
<result>
<monthly_result>
<month>1</month>
<item_count>1</item_count>
</monthly_result>
<monthly_result>
<month>2</month>
<item_count>2</item_count>
</monthly_result>
<monthly_result>
<month>3</month>
<item_count>3</item_count>
</monthly_result>
<monthly_result>
<month>4</month>
<item_count>1</item_count>
</monthly_result>
<monthly_result>
<month>5</month>
<item_count>1</item_count>
</monthly_result>
</result>
For each item that has received a bid, list the item number, the highest bid, and the name of the highest bidder, ordered by item number.
Solution in XQuery:
<result>
{
for $highbid in doc("bids.xml")//bid_tuple,
$user in doc("users.xml")//user_tuple
where $user/userid = $highbid/userid
and $highbid/bid = max(doc("bids.xml")//bid_tuple[itemno=$highbid/itemno]/bid)
order by $highbid/itemno
return
<high_bid>
{ $highbid/itemno }
{ $highbid/bid }
<bidder>{ $user/name/text() }</bidder>
</high_bid>
}
</result>
Expected Result:
<result>
<high_bid>
<itemno>1001</itemno>
<bid>55</bid>
<bidder>Mary Doe</bidder>
</high_bid>
<high_bid>
<itemno>1002</itemno>
<bid>1200</bid>
<bidder>Mary Doe</bidder>
</high_bid>
<high_bid>
<itemno>1003</itemno>
<bid>20</bid>
<bidder>Jack Sprat</bidder>
</high_bid>
<high_bid>
<itemno>1004</itemno>
<bid>40</bid>
<bidder>Tom Jones</bidder>
</high_bid>
<high_bid>
<itemno>1007</itemno>
<bid>225</bid>
<bidder>Roger Smith</bidder>
</high_bid>
</result>
List the item number and description of the item(s) that received the highest bid ever recorded, and the amount of that bid.
Solution in XQuery:
let $highbid := max(doc("bids.xml")//bid_tuple/bid)
return
<result>
{
for $item in doc("items.xml")//item_tuple,
$b in doc("bids.xml")//bid_tuple[itemno = $item/itemno]
where $b/bid = $highbid
return
<expensive_item>
{ $item/itemno }
{ $item/description }
<high_bid>{ $highbid }</high_bid>
</expensive_item>
}
</result>
Expected Result:
<result>
<expensive_item>
<itemno>1002</itemno>
<description>Motorcycle</description>
<high_bid>1200</high_bid>
</expensive_item>
</result>
List the item number and description of the item(s) that received the largest number of bids, and the number of bids it (or they) received.
Solution in XQuery:
declare function local:bid_summary()
as element()*
{
for $i in distinct-values(doc("bids.xml")//itemno)
let $b := doc("bids.xml")//bid_tuple[itemno = $i]
return
<bid_count>
<itemno>{ $i }</itemno>
<nbids>{ count($b) }</nbids>
</bid_count>
};
<result>
{
let $bid_counts := local:bid_summary(),
$maxbids := max($bid_counts/nbids),
$maxitemnos := $bid_counts[nbids = $maxbids]
for $item in doc("items.xml")//item_tuple,
$bc in $bid_counts
where $bc/nbids = $maxbids and $item/itemno = $bc/itemno
return
<popular_item>
{ $item/itemno }
{ $item/description }
<bid_count>{ $bc/nbids/text() }</bid_count>
</popular_item>
}
</result>
Expected Result:
<result>
<popular_item>
<itemno>1001</itemno>
<description>Red Bicycle</description>
<bid_count>5</bid_count>
</popular_item>
<popular_item>
<itemno>1002</itemno>
<description>Motorcycle</description>
<bid_count>5</bid_count>
</popular_item>
</result>
For each user who has placed a bid, give the userid, name, number of bids, and average bid, in order by userid.
Solution in XQuery:
<result>
{
for $uid in distinct-values(doc("bids.xml")//userid),
$u in doc("users.xml")//user_tuple[userid = $uid]
let $b := doc("bids.xml")//bid_tuple[userid = $uid]
order by $u/userid
return
<bidder>
{ $u/userid }
{ $u/name }
<bidcount>{ count($b) }</bidcount>
<avgbid>{ avg($b/bid) }</avgbid>
</bidder>
}
</result>
Expected Result:
<result>
<bidder>
<userid>U01</userid>
<name>Tom Jones</name>
<bidcount>2</bidcount>
<avgbid>220</avgbid>
</bidder>
<bidder>
<userid>U02</userid>
<name>Mary Doe</name>
<bidcount>5</bidcount>
<avgbid>387</avgbid>
</bidder>
<bidder>
<userid>U03</userid>
<name>Dee Linquent</name>
<bidcount>2</bidcount>
<avgbid>487.5</avgbid>
</bidder>
<bidder>
<userid>U04</userid>
<name>Roger Smith</name>
<bidcount>5</bidcount>
<avgbid>266</avgbid>
</bidder>
<bidder>
<userid>U05</userid>
<name>Jack Sprat</name>
<bidcount>2</bidcount>
<avgbid>110</avgbid>
</bidder>
</result>
List item numbers and average bids for items that have received three or more bids, in descending order by average bid.
Solution in XQuery:
<result>
{
for $i in distinct-values(doc("bids.xml")//itemno)
let $b := doc("bids.xml")//bid_tuple[itemno = $i]
let $avgbid := avg($b/bid)
where count($b) >= 3
order by $avgbid descending
return
<popular_item>
<itemno>{ $i }</itemno>
<avgbid>{ $avgbid }</avgbid>
</popular_item>
}
</result>
Expected Result:
<result>
<popular_item>
<itemno>1002</itemno>
<avgbid>800</avgbid>
</popular_item>
<popular_item>
<itemno>1007</itemno>
<avgbid>200</avgbid>
</popular_item>
<popular_item>
<itemno>1001</itemno>
<avgbid>45</avgbid>
</popular_item>
</result>
List names of users who have placed multiple bids of at least $100 each.
Solution in XQuery:
<result>
{
for $u in doc("users.xml")//user_tuple
let $b := doc("bids.xml")//bid_tuple[userid=$u/userid and bid>=100]
where count($b) > 1
return
<big_spender>{ $u/name/text() }</big_spender>
}
</result>
Expected Result:
<result>
<big_spender>Mary Doe</big_spender>
<big_spender>Dee Linquent</big_spender>
<big_spender>Roger Smith</big_spender>
</result>
List all registered users in order by userid; for each user, include the userid, name, and an indication of whether the user is active (has at least one bid on record) or inactive (has no bid on record).
Solution in XQuery:
<result>
{
for $u in doc("users.xml")//user_tuple
let $b := doc("bids.xml")//bid_tuple[userid = $u/userid]
order by $u/userid
return
<user>
{ $u/userid }
{ $u/name }
{
if (empty($b))
then <status>inactive</status>
else <status>active</status>
}
</user>
}
</result>
Expected Result:
<result>
<user>
<userid>U01</userid>
<name>Tom Jones</name>
<status>active</status>
</user>
<user>
<userid>U02</userid>
<name>Mary Doe</name>
<status>active</status>
</user>
<user>
<userid>U03</userid>
<name>Dee Linquent</name>
<status>active</status>
</user>
<user>
<userid>U04</userid>
<name>Roger Smith</name>
<status>active</status>
</user>
<user>
<userid>U05</userid>
<name>Jack Sprat</name>
<status>active</status>
</user>
<user>
<userid>U06</userid>
<name>Rip Van Winkle</name>
<status>inactive</status>
</user>
</result>
List the names of users, if any, who have bid on every item.
Solution in XQuery:
<frequent_bidder>
{
for $u in doc("users.xml")//user_tuple
where
every $item in doc("items.xml")//item_tuple satisfies
some $b in doc("bids.xml")//bid_tuple satisfies
($item/itemno = $b/itemno and $u/userid = $b/userid)
return
$u/name
}
</frequent_bidder>
Expected Result:
<frequent_bidder />
(No users satisfy Q17.)
List all users in alphabetic order by name. For each user, include descriptions of all the items (if any) that were bid on by that user, in alphabetic order.
Solution in XQuery:
<result>
{
for $u in doc("users.xml")//user_tuple
order by $u/name
return
<user>
{ $u/name }
{
for $b in distinct-values(doc("bids.xml")//bid_tuple
[userid = $u/userid]/itemno)
for $i in doc("items.xml")//item_tuple[itemno = $b]
let $descr := $i/description/text()
order by $descr
return
<bid_on_item>{ $descr }</bid_on_item>
}
</user>
}
</result>
Expected Result:
<result>
<user>
<name>Dee Linquent</name>
<bid_on_item>Motorcycle</bid_on_item>
<bid_on_item>Racing Bicycle</bid_on_item>
</user>
<user>
<name>Jack Sprat</name>
<bid_on_item>Old Bicycle</bid_on_item>
<bid_on_item>Racing Bicycle</bid_on_item>
</user>
<user>
<name>Mary Doe</name>
<bid_on_item>Motorcycle</bid_on_item>
<bid_on_item>Red Bicycle</bid_on_item>
</user>
<user>
<name>Rip Van Winkle</name>
</user>
<user>
<name>Roger Smith</name>
<bid_on_item>Motorcycle</bid_on_item>
<bid_on_item>Old Bicycle</bid_on_item>
<bid_on_item>Racing Bicycle</bid_on_item>
<bid_on_item>Red Bicycle</bid_on_item>
</user>
<user>
<name>Tom Jones</name>
<bid_on_item>Motorcycle</bid_on_item>
<bid_on_item>Tricycle</bid_on_item>
</user>
</result>
The example document and queries in this Use Case were first created for a 1992 conference on Standard Generalized Markup Language (SGML). For our use, the Document Type Definition (DTD) and example document have been translated from SGML to XML.
This use case is based on data conforming to the DTD shown below.
<!NOTATION cgm PUBLIC "Computer Graphics Metafile">
<!NOTATION ccitt PUBLIC "CCITT group 4 raster">
<!ENTITY % text "(#PCDATA | emph)*">
<!ENTITY infoflow SYSTEM "infoflow.ccitt" NDATA ccitt>
<!ENTITY tagexamp SYSTEM "tagexamp.cgm" NDATA cgm>
<!ELEMENT report (title, chapter+)>
<!ELEMENT title %text;>
<!ELEMENT chapter (title, intro?, section*)>
<!ATTLIST chapter
shorttitle CDATA #IMPLIED>
<!ELEMENT intro (para | graphic)+>
<!ELEMENT section (title, intro?, topic*)>
<!ATTLIST section
shorttitle CDATA #IMPLIED
sectid ID #IMPLIED>
<!ELEMENT topic (title, (para | graphic)+)>
<!ATTLIST topic
shorttitle CDATA #IMPLIED
topicid ID #IMPLIED>
<!ELEMENT para (#PCDATA | emph | xref)*>
<!ATTLIST para
security (u | c | s | ts) "u">
<!ELEMENT emph %text;>
<!ELEMENT graphic EMPTY>
<!ATTLIST graphic
graphname ENTITY #REQUIRED>
<!ELEMENT xref EMPTY>
<!ATTLIST xref
xrefid IDREF #IMPLIED>
The queries in this use case are based on the following sample data, which is found in the file "sgml.xml". Line numbers have been added to the data to allow the results of queries to be conveniently specified.
0: <!DOCTYPE report SYSTEM "report.dtd"> 1: <report> 2: <title>Getting started with SGML</title> 3: <chapter> 4: <title>The business challenge</title> 5: <intro> 6: <para>With the ever-changing and growing global market, companies and 7: large organizations are searching for ways to become more viable and 8: competitive. Downsizing and other cost-cutting measures demand more 9: efficient use of corporate resources. One very important resource is 10: an organization's information.</para> 11: <para>As part of the move toward integrated information management, 12: whole industries are developing and implementing standards for 13: exchanging technical information. This report describes how one such 14: standard, the Standard Generalized Markup Language (SGML), works as 15: part of an overall information management strategy.</para> 16: <graphic graphname="infoflow"/></intro></chapter> 17: <chapter> 18: <title>Getting to know SGML</title> 19: <intro> 20: <para>While SGML is a fairly recent technology, the use of 21: <emph>markup</emph> in computer-generated documents has existed for a 22: while.</para></intro> 23: <section shorttitle="What is markup?"> 24: <title>What is markup, or everything you always wanted to know about 25: document preparation but were afraid to ask?</title> 26: <intro> 27: <para>Markup is everything in a document that is not content. The 28: traditional meaning of markup is the manual <emph>marking</emph> up 29: of typewritten text to give instructions for a typesetter or 30: compositor about how to fit the text on a page and what typefaces to 31: use. This kind of markup is known as <emph>procedural markup</emph>.</para></intro> 32: <topic topicid="top1"> 33: <title>Procedural markup</title> 34: <para>Most electronic publishing systems today use some form of 35: procedural markup. Procedural markup codes are good for one 36: presentation of the information.</para></topic> 37: <topic topicid="top2"> 38: <title>Generic markup</title> 39: <para>Generic markup (also known as descriptive markup) describes the 40: <emph>purpose</emph> of the text in a document. A basic concept of 41: generic markup is that the content of a document must be separate from 42: the style. Generic markup allows for multiple presentations of the 43: information.</para></topic> 44: <topic topicid="top3"> 45: <title>Drawbacks of procedural markup</title> 46: <para>Industries involved in technical documentation increasingly 47: prefer generic over procedural markup schemes. When a company changes 48: software or hardware systems, enormous data translation tasks arise, 49: often resulting in errors.</para></topic></section> 50: <section shorttitle="What is SGML?"> 51: <title>What <emph>is</emph> SGML in the grand scheme of the universe, anyway?</title> 52: <intro> 53: <para>SGML defines a strict markup scheme with a syntax for defining 54: document data elements and an overall framework for marking up 55: documents.</para> 56: <para>SGML can describe and create documents that are not dependent on 57: any hardware, software, formatter, or operating system. Since SGML documents 58: conform to an international standard, they are portable.</para></intro></secti