

W3C

XML Query Use Cases

W3C Working Draft 8 June 2006

This version:
Latest version:
Previous version:
Editors:
Don Chamberlin, IBM Almaden Research Center
Peter Fankhauser, Infonyte GmbH
Daniela Florescu, Oracle corporation
Massimo Marchiori, University of Venice
Jonathan Robie, DataDirect Technologies

This document is also available in these non-normative formats: XML.


Abstract

This document specifies usage scenarios for XQuery.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This version of the Use Cases document corresponds to the XQuery Working Draft released on 8 June 2006. The queries in this document have been parsed using a parser generated from the same grammar used to create the documentation for the XQuery Working Draft.

This is a public W3C Working Draft for review by W3C Members and other interested parties. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced jointly by the XML Query Working Group and the XSL Working Group, both of which are part of the XML Activity.

Comments on this document are invited and should be made in W3C's public Bugzilla system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string [XQueryUseCases] in the subject line of your comment, whether made in Bugzilla or in email. Each Bugzilla entry and email message should contain only one comment. Archives of the comments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. This document is informative only. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Use Cases for XML Queries
    1.1 Use Case "XMP": Experiences and Exemplars
        1.1.1 Document Type Definitions (DTD)
        1.1.2 Sample Data
        1.1.3 DTD for Q5
        1.1.4 Sample Data for Q5
        1.1.5 DTD for Q9
        1.1.6 Data for Q9
        1.1.7 DTD for Q10
        1.1.8 Data for Q10
        1.1.9 Queries and Results
            1.1.9.1 Q1
            1.1.9.2 Q2
            1.1.9.3 Q3
            1.1.9.4 Q4
            1.1.9.5 Q5
            1.1.9.6 Q6
            1.1.9.7 Q7
            1.1.9.8 Q8
            1.1.9.9 Q9
            1.1.9.10 Q10
            1.1.9.11 Q11
            1.1.9.12 Q12
    1.2 Use Case "TREE": Queries that preserve hierarchy
        1.2.1 Description
        1.2.2 Document Type Definition (DTD)
        1.2.3 Sample Data
        1.2.4 Queries and Results
            1.2.4.1 Q1
            1.2.4.2 Q2
            1.2.4.3 Q3
            1.2.4.4 Q4
            1.2.4.5 Q5
            1.2.4.6 Q6
    1.3 Use Case "SEQ" - Queries based on Sequence
        1.3.1 Description
        1.3.2 Document Type Definition (DTD)
        1.3.3 Sample Data
        1.3.4 Queries and Results
            1.3.4.1 Q1
            1.3.4.2 Q2
            1.3.4.3 Q3
            1.3.4.4 Q4
            1.3.4.5 Q5
    1.4 Use Case "R" - Access to Relational Data
        1.4.1 Description
        1.4.2 Document Type Definition (DTD)
        1.4.3 Sample Data
        1.4.4 Queries and Results
            1.4.4.1 Q1
            1.4.4.2 Q2
            1.4.4.3 Q3
            1.4.4.4 Q4
            1.4.4.5 Q5
            1.4.4.6 Q6
            1.4.4.7 Q7
            1.4.4.8 Q8
            1.4.4.9 Q9
            1.4.4.10 Q10
            1.4.4.11 Q11
            1.4.4.12 Q12
            1.4.4.13 Q13
            1.4.4.14 Q14
            1.4.4.15 Q15
            1.4.4.16 Q16
            1.4.4.17 Q17
            1.4.4.18 Q18
    1.5 Use Case "SGML": Standard Generalized Markup Language
        1.5.1 Description
        1.5.2 Document Type Definition (DTD)
        1.5.3 Sample Data
        1.5.4 Queries and Results
            1.5.4.1 Q1
            1.5.4.2 Q2
            1.5.4.3 Q3
            1.5.4.4 Q4
            1.5.4.5 Q5
            1.5.4.6 Q6
            1.5.4.7 Q7
            1.5.4.8 Q8a
            1.5.4.9 Q8b
            1.5.4.10 Q9
            1.5.4.11 Q10
    1.6 Use Case "STRING": String Search
        1.6.1 Description
        1.6.2 Document Type Definition (DTD)
        1.6.3 Sample Data
        1.6.4 Queries and Results
            1.6.4.1 Q1
            1.6.4.2 Q2
            1.6.4.3 Q3
            1.6.4.4 Q4
            1.6.4.5 Q5
    1.7 Use Case "NS" - Queries Using Namespaces
        1.7.1 Description
        1.7.2 Document Type Definition (DTD)
        1.7.3 Sample Data
        1.7.4 Queries and Results
            1.7.4.1 Q1
            1.7.4.2 Q2
            1.7.4.3 Q3
            1.7.4.4 Q4
            1.7.4.5 Q5
            1.7.4.6 Q6
            1.7.4.7 Q7
            1.7.4.8 Q8
    1.8 Use Case "PARTS" - Recursive Parts Explosion
        1.8.1 Description
        1.8.2 Document Type Definitions (DTD)
        1.8.3 Sample Data
        1.8.4 Queries and Results
            1.8.4.1 Q1
    1.9 Use Case "STRONG" - queries that exploit strongly typed data
        1.9.1 Description
        1.9.2 Schema
        1.9.3 Sample Data
        1.9.4 Queries
            1.9.4.1 Q1
            1.9.4.2 Q2
            1.9.4.3 Q3
            1.9.4.4 Q4
            1.9.4.5 Q5
            1.9.4.6 Q6
            1.9.4.7 Q7
            1.9.4.8 Q8
            1.9.4.9 Q9
            1.9.4.10 Q10
            1.9.4.11 Q11
            1.9.4.12 Q12

Appendices

A Acknowledgements
B Change Log (Non-Normative)
    B.1 8 May 2006
    B.2 31 Aug 2005
    B.3 11 July 2005
    B.4 04 April 2005
    B.5 30 Jan 2005
C References (Non-Normative)


1 Use Cases for XML Queries

The use cases listed below were created by the XML Query Working Group to illustrate important applications for an XML query language. Each use case is focused on a specific application area, and contains a Document Type Definition (DTD) and example input data. Each use case specifies a set of queries that might be applied to the input data, and the expected results for each query. Since the English description of each query is concise, the expected results form an important part of the definition of each query, specifying the expected output format. These use cases were originally published as part of the [XQuery Requirements] document, without solutions in concrete query languages. They are now being republished with solutions in [XQuery]. These use cases are also being used by the W3C XML Query Testing Task Force.

The input environment for each use case is stated in its Document Type Definition (DTD) section. All of these use cases assume that input is provided in the form of one or more documents with specific names. For instance, the authors in a document may be accessed with expressions like this:

doc("http://bstore1.example.com/bib.xml")//author

Some implementations of XQuery bind input to external variables. If the environment has bound the external variable $b to the same document used in the above query, this expression would return the same set of authors:

$b//author
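How an external variable receives its value depends on the implementation. As a minimal sketch, assuming the host environment supplies the document node of the document used above for a variable named $b, the query prolog could declare the variable before using it:

```xquery
(: Sketch only: $b is assumed to be bound by the host environment to the
   document node of http://bstore1.example.com/bib.xml. How that binding
   is made is implementation-defined. :)
declare variable $b as document-node() external;

$b//author
```

The type declaration is optional. It is included here only to state the assumption that $b holds a single document node.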

Some implementations of XQuery predefine a single 'context item', which is available at the root level of a query, and which is used to resolve paths that begin with a leading slash. In such an implementation, if the context item is bound to the document node of the same well-formed document used in the previous examples, this expression would return the same set of authors:

//author

Previous versions of this document accessed implicit documents using the input() function, which no longer exists. The input() function had similar functionality to a predefined context item, except that it could be bound to a sequence of nodes, whereas the context item may only be bound to a single node. The use cases that used input() have been rewritten to use explicit file names.

Several implementors have asked that we make the queries from these use cases available in a separate file to make it easier for them to test their parsers. These queries may be found in [Use Case Sample Queries]. Also, the queries from the XQuery specification itself have been made available in [XQuery Sample Queries].

To make output more readable, the output of queries has been formatted using whitespace which may not be returned by a query processor. This whitespace should not be considered normative for the correctness of results.

These queries were tested with a dynamic implementation of XQuery. Some queries may require additional type declarations to be used with an implementation that implements the Static Typing feature.

1.1 Use Case "XMP": Experiences and Exemplars

This use case contains several example queries that illustrate requirements gathered from the database and document communities.

1.1.9 Queries and Results

1.2 Use Case "TREE": Queries that preserve hierarchy

Some XML document-types have a very flexible structure in which text is mixed with elements and many elements are optional. These document-types show a wide variation in structure from one document to another. In documents of these types, the ways in which elements are ordered and nested are usually quite important.

1.2.4 Queries and Results

1.3 Use Case "SEQ" - Queries based on Sequence

This use case illustrates queries based on the sequence in which elements appear in a document.

1.3.4 Queries and Results

1.3.4.5 Q5

In Report1, what happened between the first Incision and the second Incision?

Solution in XQuery:

Here is another solution that is perhaps more efficient and less readable:

Expected Result:

In the above output, the contents of the critical sequence element include a text node, an action element, and the text node containing the content of the action element. But the serialization we are using already shows all descendants of a given node. If $c is bound to a sequence of nodes, the following expression eliminates members of the sequence that are descendants of another node already found in the sequence:
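The expression is not reproduced in this copy. One expression with the described effect, given as a sketch on a small constructed tree (the element names r, x, y and z are hypothetical and not taken from the use case data), is:

```xquery
(: Sketch only: r, x, y and z are made-up element names. :)
let $r := <r><x><y/></x><z/></r>
(: $c holds x, its descendant y, and the unrelated node z. :)
let $c := ($r/x, $r/x/y, $r/z)
(: "except" removes every node that is a descendant of some member of $c
   and returns the remaining nodes in document order. :)
return $c except $c/descendant::node()
```

Here y is dropped because it is a descendant of x, which is already in the sequence, so only x and z remain.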

In the following solution, the between() function takes a sequence of nodes, a starting node, and an ending node, and returns the nodes between them:
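The solution itself is not reproduced in this copy. The following is a sketch of how such a function might be written, not necessarily the Working Group's text. The function name between() comes from the description above. The file name report1.xml and the element names section, section.title and incision are assumptions about the report data, and the sketch assumes the procedure contains at least two incisions.

```xquery
(: Sketch only: the function body and the element names used in the
   call below are assumptions, not confirmed by this document. :)
declare function local:between(
  $seq as node()*, $start as node(), $end as node()
) as node()*
{
  (: Keep the nodes that follow $start and precede $end in document
     order, skipping the descendants of $start itself. :)
  let $nodes :=
    for $n in $seq except $start//node()
    where $n >> $start and $n << $end
    return $n
  (: Drop nodes that are descendants of another selected node. :)
  return $nodes except $nodes//node()
};

<critical_sequence>
 {
  let $proc := doc("report1.xml")//section[section.title = "Procedure"][1],
      $first := ($proc//incision)[1],
      $second := ($proc//incision)[2]
  return local:between($proc//node(), $first, $second)
 }
</critical_sequence>
```

The final except step applies the descendant-elimination idea described above, so each node between the two incisions is returned once, at its outermost level.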

Here is the output from the above query:

1.4 Use Case "R" - Access to Relational Data

One important use of an XML query language will be to access data stored in relational databases. This use case describes one possible way in which this access might be accomplished.

1.4.3 Sample Data

Here is an abbreviated set of data showing the XML format of the instances:

The entire data set is represented by the following table:

USERS
USERID  NAME            RATING
U01     Tom Jones       B
U02     Mary Doe        A
U03     Dee Linquent    D
U04     Roger Smith     C
U05     Jack Sprat      B
U06     Rip Van Winkle  B

ITEMS
ITEMNO  DESCRIPTION     OFFERED_BY  START_DATE  END_DATE    RESERVE_PRICE
1001    Red Bicycle     U01         1999-01-05  1999-01-20  40
1002    Motorcycle      U02         1999-02-11  1999-03-15  500
1003    Old Bicycle     U02         1999-01-10  1999-02-20  25
1004    Tricycle        U01         1999-02-25  1999-03-08  15
1005    Tennis Racket   U03         1999-03-19  1999-04-30  20
1006    Helicopter      U03         1999-05-05  1999-05-25  50000
1007    Racing Bicycle  U04         1999-01-20  1999-02-20  200
1008    Broken Bicycle  U01         1999-02-05  1999-03-06  25

BIDS
USERID  ITEMNO  BID   BID_DATE
U02     1001    35    1999-01-07
U04     1001    40    1999-01-08
U02     1001    45    1999-01-11
U04     1001    50    1999-01-13
U02     1001    55    1999-01-15
U01     1002    400   1999-02-14
U02     1002    600   1999-02-16
U03     1002    800   1999-02-17
U04     1002    1000  1999-02-25
U02     1002    1200  1999-03-02
U04     1003    15    1999-01-22
U05     1003    20    1999-02-03
U01     1004    40    1999-03-05
U03     1007    175   1999-01-25
U05     1007    200   1999-02-08
U04     1007    225   1999-02-12

1.4.4 Queries and Results

1.4.4.5 Q5

For bicycle(s) offered by Tom Jones that have received a bid, list the item number, description, highest bid, and name of the highest bidder, ordered by item number.

Solution in XQuery:

The above query does several joins, and requires the results in a particular order. If there were no order by clause, results would be reported in document order. If you do not care about the order, you can use the unordered function to inform the query processor that the order of the lists in the for clause is not significant, which means that the tuples can be generated in any order. This can enable better optimization.

Unordered Solution in XQuery:
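The solution is not reproduced in this copy. The following sketch shows one way the joins could be wrapped in the unordered function. It assumes the three tables are stored in files named users.xml, items.xml and bids.xml, that each row is an element named user_tuple, item_tuple or bid_tuple, and that each column is a child element named after the lowercased column heading. The result element names (result, jones_bike, high_bid, high_bidder) are likewise hypothetical.

```xquery
(: Sketch only: file names, tuple element names and result element
   names are assumptions, not confirmed by this document. :)
<result>
 {
  unordered(
    for $seller in doc("users.xml")//user_tuple,
        $buyer in doc("users.xml")//user_tuple,
        $item in doc("items.xml")//item_tuple,
        $highbid in doc("bids.xml")//bid_tuple
    where $seller/name = "Tom Jones"
      and $seller/userid = $item/offered_by
      and contains($item/description, "Bicycle")
      and $item/itemno = $highbid/itemno
      and $highbid/userid = $buyer/userid
      (: Keep only the highest bid on this item. :)
      and $highbid/bid =
          max(doc("bids.xml")//bid_tuple[itemno = $item/itemno]/bid)
    return
      <jones_bike>
        { $item/itemno }
        { $item/description }
        <high_bid>{ data($highbid/bid) }</high_bid>
        <high_bidder>{ data($buyer/name) }</high_bidder>
      </jones_bike>
  )
 }
</result>
```

With the table data above, Tom Jones (U01) offers two items whose description contains "Bicycle", 1001 and 1008, and only 1001 has received bids. Under these assumptions the sketch would report item 1001, Red Bicycle, with a highest bid of 55 from Mary Doe.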

Expected Result:

1.5 Use Case "SGML": Standard Generalized Markup Language

1.5.3 Sample Data

The queries in this use case are based on the following sample data, which is found in the file "sgml.xml". Line numbers have been added to the data to allow the results of queries to be conveniently specified.

 0: <!DOCTYPE report SYSTEM "report.dtd">
 1: <report>
 2: <title>Getting started with SGML</title>
 3: <chapter>
 4: <title>The business challenge</title>
 5: <intro>
 6: <para>With the ever-changing and growing global market, companies and
 7: large organizations are searching for ways to become more viable and
 8: competitive. Downsizing and other cost-cutting measures demand more
 9: efficient use of corporate resources. One very important resource is
10: an organization's information.</para>
11: <para>As part of the move toward integrated information management,
12: whole industries are developing and implementing standards for
13: exchanging technical information. This report describes how one such
14: standard, the Standard Generalized Markup Language (SGML), works as
15: part of an overall information management strategy.</para>
16: <graphic graphname="infoflow"/></intro></chapter>
17: <chapter>
18: <title>Getting to know SGML</title>
19: <intro>
20: <para>While SGML is a fairly recent technology, the use of
21: <emph>markup</emph> in computer-generated documents has existed for a
22: while.</para></intro>
23: <section shorttitle="What is markup?">
24: <title>What is markup, or everything you always wanted to know about
25: document preparation but were afraid to ask?</title>
26: <intro>
27: <para>Markup is everything in a document that is not content. The
28: traditional meaning of markup is the manual <emph>marking</emph> up
29: of typewritten text to give instructions for a typesetter or
30: compositor about how to fit the text on a page and what typefaces to
31: use. This kind of markup is known as <emph>procedural markup</emph>.</para></intro>
32: <topic topicid="top1">
33: <title>Procedural markup</title>
34: <para>Most electronic publishing systems today use some form of
35: procedural markup. Procedural markup codes are good for one
36: presentation of the information.</para></topic>
37: <topic topicid="top2">
38: <title>Generic markup</title>
39: <para>Generic markup (also known as descriptive markup) describes the
40: <emph>purpose</emph> of the text in a document. A basic concept of
41: generic markup is that the content of a document must be separate from
42: the style. Generic markup allows for multiple presentations of the
43: information.</para></topic>
44: <topic topicid="top3">
45: <title>Drawbacks of procedural markup</title>
46: <para>Industries involved in technical documentation increasingly
47: prefer generic over procedural markup schemes. When a company changes
48: software or hardware systems, enormous data translation tasks arise,
49: often resulting in errors.</para></topic></section>
50: <section shorttitle="What is SGML?">
51: <title>What <emph>is</emph> SGML in the grand scheme of the universe, anyway?</title>
52: <intro>
53: <para>SGML defines a strict markup scheme with a syntax for defining
54: document data elements and an overall framework for marking up
55: documents.</para>
56: <para>SGML can describe and create documents that are not dependent on
57: any hardware, software, formatter, or operating system. Since SGML documents
58: conform to an international standard, they are portable.</para></intro></section>
59: <section shorttitle="How does SGML work?">
60: <title>How is SGML and would you recommend it to your grandmother?</title>
61: <intro>
62: <para>You can break a typical document into three layers: structure,
63: content, and style. SGML works by separating these three aspects and
64: deals mainly with the relationship between structure and content.</para></intro>
65: <topic topicid="top4">
66: <title>Structure</title>
67: <para>At the heart of an SGML application is a file called the DTD, or
68: Document Type Definition. The DTD sets up the structure of a document,
69: much like a database schema describes the types of information it
70: handles.</para>
71: <para>A database schema also defines the relationships between the
72: various types of data. Similarly, a DTD specifies <emph>rules</emph>
73: to help ensure documents have a consistent, logical structure.</para></topic>
74: <topic topicid="top5">
75: <title>Content</title>
76: <para>Content is the information itself. The method for identifying
77: the information and its meaning within this framework is called
78: <emph>tagging</emph>. Tagging must
79: conform to the rules established in the DTD (see <xref xrefid="top4"/>).</para>
80: <graphic graphname="tagexamp"/></topic>
81: <topic topicid="top6">
82: <title>Style</title>
83: <para>SGML does not standardize style or other processing methods for
84: information stored in SGML.</para></topic></section></chapter>
85: <chapter>
86: <title>Resources</title>
87: <section>
88: <title>Conferences, tutorials, and training</title>
89: <intro>
90: <para>The Graphic Communications Association has been
91: instrumental in the development of SGML. GCA provides conferences,
92: tutorials, newsletters, and publication sales for both members and
93: non-members.</para>
94: <para security="c">Exiled members of the former Soviet Union's secret
95: police, the KGB, have infiltrated the upper ranks of the GCA and are
96: planning the Final Revolution as soon as DSSSL is completed.</para>
97: </intro>
98: </section>
99: </chapter>
100:</report>
                    

1.5.4 Queries and Results