Copyright ©2000 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document specifies goals, requirements, and usage scenarios for the W3C XML Query data model, algebra, and query language.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C. This document is a revision of the first public XML Query Requirements working draft that takes into account comments processed up to June 10, 2000.
This is a W3C Working Draft for review by W3C Members and other interested parties. It is a draft document and may be updated, replaced or made obsolete by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". This is work in progress and does not imply endorsement by the W3C membership.
This document has been produced as part of the W3C XML Activity, following the procedures set out for the W3C Process. The document has been written by the XML Query Working Group (W3C members only). The goals of the XML Query Working Group are discussed in the XML Query Working Group charter (W3C members only).
The XML Query Working Group feels that the contents of this Working Draft are relatively stable, and therefore encourages feedback on this version.
Comments on this document should be sent to the W3C mailing list www-xml-query-comments@w3.org (archived at http://lists.w3.org/Archives/Public/www-xml-query-comments/).
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.
The goal of the XML Query Working Group is to produce a data model for XML documents, a set of query operators on that data model, and a query language based on these query operators. The data model will be based on the W3C XML Infoset, and will include support for Namespaces.
Queries operate on single documents or fixed collections of documents. They can select whole documents or subtrees of documents that match conditions defined on document content and structure, and can construct new documents based on what is selected.
The following usage scenarios describe how XML queries may be used in various environments, and cover a wide range of activities and needs representative of the problem space to be addressed. They are intended to be used as design cases during the development of XML Query, and should be reviewed when critical decisions are made. These usage scenarios should also help readers outside the XML Query Working Group understand the intent and goals of the project.
Perform queries on structured documents and collections of documents, such as technical manuals, to retrieve individual documents, to generate tables of contents, to search for information in structures found within a document, or to generate new documents as the result of a query.
Perform queries on the XML representation of database data, object data, or other traditional data sources to extract data from these sources, to transform data into new XML representations, or to integrate data from multiple heterogeneous data sources. The XML representation of data sources may be either physical or virtual; that is, data may be physically encoded in XML, or an XML representation of the data may be produced.
Perform both document-oriented and data-oriented queries on documents with embedded data, such as catalogs, patient health records, employment records, or business analysis documents.
Perform queries on configuration files, user profiles, or administrative logs represented in XML.
Perform queries on streams of XML data to process the data in a manner analogous to UNIX filters. This might be used to process logs of email messages, network packets, stock market data, newswire feeds, EDI, or weather data to filter and route messages represented in XML, to extract data from XML streams, or to transform data in XML streams.
Perform queries on DOM structures to return sets of nodes that meet the specified criteria.
Perform queries on collections of documents managed by native XML repositories or web servers.
Perform queries to search catalogs that describe document servers, document types, XML schemas, or documents. Such catalogs may be combined to support search among multiple servers. A document-retrieval system could use queries to allow the user to select server catalogs, represented in XML, by the information provided by the servers, by access cost, or by authorization. Once a server is selected, a retrieval system could query the kinds of documents found on the server and allow the user to query those documents.
Queries may be used in many environments. For example, a query might be embedded in a URL, an XML page, or a JSP or ASP page; represented by a string in a program written in a general-purpose programming language; provided as an argument on the command line or on standard input; or supported by a protocol, such as DASL or Z39.50.
The following key words are used throughout the document to specify the extent to which an item is a requirement for the work of the XML Query Working Group:
MUST: This word means that the item is an absolute requirement.
SHOULD: This word means that there may exist valid reasons not to treat this item as a requirement, but the full implications should be understood and the case carefully weighed before discarding this item.
MAY: This word means that an item deserves attention, but further study is needed to determine whether the item should be treated as a requirement.
When the words MUST, SHOULD, or MAY are used in this technical sense, they occur as a hyperlink to these definitions. These words will also be used with their conventional English meaning, in which case there is no hyperlink. For instance, the phrase "the full implications should be understood" uses the word "should" in its conventional English sense, and therefore occurs without the hyperlink.
The XML Query Language MAY have more than one syntax binding. One query language syntax MUST be convenient for humans to read and write. One query language syntax MUST be expressed in XML in a way that reflects the underlying structure of the query.
The XML Query Language MUST be declarative. Notably, it MUST not enforce a particular evaluation strategy.
The XML Query Language MUST be defined independently of any protocols with which it is used. (Relationships to some specific protocols are discussed in [4 Relationship to Other Activities].)
The XML Query Language MUST define standard error conditions that can occur during the execution of a query, such as processing errors within expressions, unavailability of external functions to the query processor, or processing errors generated by external functions.
Version 1.0 of the XML Query Language MUST not preclude the ability to add update capabilities in future versions.
The XML Query Language MUST be defined for finite instances of the data model. It MAY be defined for infinite instances.
The XML Query Data Model relies on information provided by XML Processors and Schema Processors, and it MUST ensure that it does not require information that is not made available by such processors. For XML constructs found in XML 1.0 or the Namespaces Recommendation, the XML Query Data Model MUST show how the equivalent XML Query Data Model constructs are built from items in the XML Information Set. The XML Query Data Model SHOULD represent all information items, or provide justification for any information items omitted. For information found in the XML Schema, such as datatypes, the XML Query Working Group MUST coordinate with the XML Schema Working Group to ensure that schema processors may be relied on to provide the information needed to construct the Data Model.
The XML Query Data Model MUST represent both XML 1.0 character data and the simple and complex types of the XML Schema specification.
The XML Query Data Model MUST represent collections of documents and collections of simple and complex values. (Note that collections are not part of the current XML Infoset.)
The XML Query Data Model MUST include support for references, including both references within an XML document and references from one XML document to another.
Queries MUST be possible whether or not a schema is available (in this document, the term "schema" may refer to either an XML Schema or a DTD). If a schema is available, the data model MUST represent any items that the schema defines for its instances, such as default attributes, entity expansions, or data types. These items will not be present if a schema is not present.
The XML Query Language and XML Query Language Data Model MUST be namespace aware.
The XML Query Language MUST support operations on all data types represented by the XML Query Data Model (see datatypes, collections, references).
Queries MUST be able to express simple conditions on text, including conditions on text that spans element boundaries.
Operations on collections MUST include support for universal and existential quantifiers.
Queries MUST support operations on hierarchy and sequence of document structures.
The XML Query Language MUST be able to combine related information from different parts of a given document or from multiple documents.
The XML Query Language MUST be able to compute summary information from a group of related document elements (this operation is sometimes called "aggregation").
The XML Query Language MUST be able to sort query results.
The XML Query Language MUST support expressions in which operations can be composed, including the use of queries as operands.
The XML Query Language MUST include support for NULL values. Therefore, all operators, including logical operators, MUST take NULL values into account.
Queries MUST be able to preserve the relative hierarchy and sequence of input document structures in query results.
Queries MUST be able to transform XML structures and MUST be able to create new structures.
Queries MUST be able to traverse intra- and inter-document references.
Queries MUST be able to preserve the identity of items in the XML Query Data Model.
Queries SHOULD be able to operate on XML Query Data Model instances specified with the query ("literal" data).
Queries MUST be able to perform simple operations on names, such as tests for equality in element names, attribute names, and processing instruction targets, and to perform simple operations on combinations of names and data. Queries MAY perform more powerful operations on names.
Queries SHOULD provide access to the XML schema or DTD for a document, if there is one. If the schema is represented as a DTD, a mapping to an appropriate XML Schema representation MAY be required.
The XML Query Language SHOULD support the use of externally defined functions on all datatypes of the XML Query Data Model. The interface to such functions SHOULD be defined by the Query Language, and SHOULD distinguish these functions from functions defined in the Query Language. The implementation of externally defined functions is not part of the Query Language.
The XML Query Language MUST provide access to information derived from the environment in which the query is executed, such as the current date, time, or user.
Queries MUST be closed with respect to the XML Query Data Model. Both the input to a query and the output of a query MUST be defined purely in terms of the XML Query Data Model. Non-XML sources such as traditional databases or objects may be queried if they are given an XML Query Data Model representation. Similarly, query results are defined purely in terms of the XML Query Data Model. In software systems these results may be instantiated in any convenient representation such as DOM nodes, hyperlinks, XML text, or various data formats.
XML has become a strategic technology in W3C and in the global Web market. The deliverable of the XML Query Working Group MUST satisfy the dependencies from the following Working Groups before it can advance to Proposed Recommendation. Some dependencies to and from the following W3C Working Groups will require close cooperation during the development process; the requirements posed for the Query work by these Working Groups may change during the development process, which means the interdependency of the Query work with these Working Groups must be managed actively:
The XML Query Language must be able to return results in a form that can be used in DOM programs, such as DOM Nodes or the Iterators and TreeWalkers defined in the Traversal specification.
Both XSLT and XPointer use the XML Path Language (XPath), which defines a location path syntax that can be used to search for matching parts of an XML document. The XML Query work will take into consideration the expressibility and search facilities of XPath when formulating its algebra and query syntax, and where desirable try to encompass those functionalities into its query language. The XML Query WG will also take into consideration the additional functionality in the XSLT and XPointer specifications.
It is a goal of the XML Query work to be compatible with the work of the XML Schema Working Group (W3C members only), including both Structures and Datatypes.
For example, it should be possible to base query predicates on the existing DTD or XSDL definition of the content of an XML document and on the new data types being defined as part of the XDTL.
The XML Query work will define a formal data model of XML documents. This model must be based on the model of the XML Infoset. In case incompatibilities arise, requirements must be posed to the W3C XML Core Working Group (W3C members only). In any case, the final model used by the XML Query working group will have to be based on, and totally compatible with, the model of the XML Infoset.
There are no requirements for co-development of features with the following Working Groups, but there are points of contact between their work and that of this Working Group, and thus logical dependency between their deliverables and those of this Working Group. Requirements from these Working Groups are expected to be well suited for communication via documents:
Reuse of common constructs greatly facilitates accessibility; the WAI PF Working Group (W3C members only) will review work on the XML query facilities to be sure cost/benefit design decisions are informed of the benefits of accessibility.
The Internationalization Working Group (W3C members only) will review the work of the XML Query Working Group in order to ensure that it satisfies W3C goals for international access to the Web.
It may be necessary for the XML Query Working Group to reference the XML Fragment specification if a valid query return type is an XML fragment.
XML Query must strive for smooth interaction with the IETF DASL (DAV Searching & Locating) Working Group, in such a way that the XML query language can be easily incorporated into the DASL protocol.
Formal liaison between the XML Query Working Group and other W3C working groups, including the other XML working groups and the WAI (Web Accessibility Initiative) group, as well as organizations outside of the W3C, shall be accomplished by the exchange of documents (requirements, reviews, etc.) transmitted through the XML Coordination Group.
The following references are some of the works considered by the WG in deriving its requirements.
A quantifier applies a predicate to the elements of a collection. A universally quantified predicate evaluates to true if it holds for all elements of the collection; an existentially quantified predicate evaluates to true if it holds for at least one element.
A document consists of the set of nodes and edges in the subtree descended from a Document node in the XML Query Data Model.
References that refer to nodes that do not reside in the same XML document as the reference itself.
References that reside in the same XML document as the nodes they reference.
Literal fragments of an XML document such as <name><first>Joe</first><last>Doe</last></name>, which may be used for comparison.
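The quantifier definition in this glossary can be illustrated with a minimal Python sketch. The price list and the predicate `cheaper_than_100` are hypothetical and chosen only for illustration; Python's built-in `all` and `any` stand in for the universal and existential quantifiers, and nothing here is meant to suggest the syntax of the eventual XML Query Language.

```python
# Hypothetical collection: the four book prices from the bibliography sample data.
prices = [65.95, 65.95, 39.95, 129.95]


def cheaper_than_100(price):
    """Predicate applied to each element of the collection."""
    return price < 100


# Universal quantifier: true only if the predicate holds for every element.
every_book_is_cheap = all(cheaper_than_100(p) for p in prices)

# Existential quantifier: true if the predicate holds for at least one element.
some_book_is_cheap = any(cheaper_than_100(p) for p in prices)

print(every_book_is_cheap)  # False, because 129.95 fails the predicate
print(some_book_is_cheap)   # True
```

Note that Python treats an empty collection as satisfying `all` and failing `any`; whether the XML Query Language adopts the same convention is not stated in this document.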
The use cases listed below were created by the XML Query Working Group to illustrate important applications for an XML query language. Each use case is focused on a specific application area, and contains a Document Type Definition (DTD) and example input data. Each use case specifies a set of queries that might be applied to the input data, and the expected results for each query. Since the English description of each query is concise, the expected results form an important part of the definition of each query, specifying the expected output format.
Some of the use cases assume that input is provided in the form of one or more documents with specific names such as "http://www.bn.com/bib.xml". Other use cases are based on implicit (unnamed) input documents. The input environment for each use case is stated in its Document Type Definition (DTD) section.
These use cases represent a snapshot of ongoing work. Some important application areas are not yet adequately covered by a use case. The XML Query Working Group reserves the right to add, delete, or modify individual queries or whole use cases as the work progresses. The presence of a query in this set of use cases does not necessarily indicate that the query will be expressible in the XML Query Language(s) to be created by the XML Query Working Group.
This use case contains several example queries that illustrate requirements gathered from the database and document communities.
Most of the example queries in this use case are based on a bibliography document named "http://www.bn.com/bib.xml" with the following DTD:
<!ELEMENT bib (book* )> <!ELEMENT book (title, (author+ | editor+ ), publisher, price )> <!ATTLIST book year CDATA #REQUIRED > <!ELEMENT author (last, first )> <!ELEMENT editor (last, first, affiliation )> <!ELEMENT title (#PCDATA )> <!ELEMENT last (#PCDATA )> <!ELEMENT first (#PCDATA )> <!ELEMENT affiliation (#PCDATA )> <!ELEMENT publisher (#PCDATA )> <!ELEMENT price (#PCDATA )>
Here is the data found at "http://www.bn.com/bib.xml":
<bib> <book year="1994"> <title>TCP/IP Illustrated</title> <author><last>Stevens</last><first>W.</first></author> <publisher>Addison-Wesley</publisher> <price> 65.95</price> </book> <book year="1992"> <title>Advanced Programming in the Unix environment</title> <author><last>Stevens</last><first>W.</first></author> <publisher>Addison-Wesley</publisher> <price>65.95</price> </book> <book year="2000"> <title>Data on the Web</title> <author><last>Abiteboul</last><first>Serge</first></author> <author><last>Buneman</last><first>Peter</first></author> <author><last>Suciu</last><first>Dan</first></author> <publisher>Morgan Kaufmann Publishers</publisher> <price> 39.95</price> </book> <book year="1999"> <title>The Economics of Technology and Content for Digital TV</title> <editor> <last>Gerbarg</last><first>Darcy</first> <affiliation>CITI</affiliation> </editor> <publisher>Kluwer Academic Publishers</publisher> <price>129.95</price> </book> </bib>
Q5 also uses information on book reviews and prices from a separate data source named "http://www.amazon.com/reviews.xml" with the following DTD:
<!ELEMENT reviews (entry*)> <!ELEMENT entry (title, price, review)> <!ELEMENT title (#PCDATA)> <!ELEMENT price (#PCDATA)> <!ELEMENT review (#PCDATA)>
Here are the contents of "http://www.amazon.com/reviews.xml":
<reviews> <entry> <title> Data on the Web</title> <price>34.95</price> <review> A very good discussion of semi-structured database systems and XML. </review> </entry> <entry> <title> Advanced Programming in the Unix environment</title> <price>65.95</price> <review> A clear and detailed discussion of UNIX programming. </review> </entry> <entry> <title>TCP/IP Illustrated</title> <price>65.95</price> <review> One of the best books on TCP/IP. </review> </entry> </reviews>
Q9 uses an input document named "books.xml", with the following DTD:
<!ELEMENT chapter (title, section*)> <!ELEMENT section (title, section*)> <!ELEMENT title (#PCDATA)>
Here are the contents of books.xml:
<chapter> <title>Data Model</title> <section> <title> Syntax For Data Model</title> </section> <section> <title>XML</title> <section> <title>Basic Syntax</title> </section> <section> <title> XML and Semistructured Data</title> </section> </section> </chapter>
Q10 uses an input document named "prices.xml", with the following DTD:
<!ELEMENT prices (book*)> <!ELEMENT book (title, source, price)> <!ELEMENT title (#PCDATA)> <!ELEMENT source (#PCDATA)> <!ELEMENT price (#PCDATA)>
Here are the contents of prices.xml:
<prices> <book> <title>Advanced Programming in the Unix environment</title> <source>www.amazon.com</source> <price>65.95</price> </book> <book> <title>Advanced Programming in the Unix environment </title> <source>www.bn.com</source> <price>65.95</price> </book> <book> <title> TCP/IP Illustrated </title> <source>www.amazon.com</source> <price>65.95</price> </book> <book> <title> TCP/IP Illustrated </title> <source>www.bn.com</source> <price>65.95</price> </book> <book> <title>Data on the Web</title> <source>www.amazon.com</source> <price>34.95</price> </book> <book> <title>Data on the Web</title> <source>www.bn.com</source> <price>39.95</price> </book> </prices>
List books published by Addison-Wesley after 1991, including their year and title.
<bib> <book year="1994"> <title>TCP/IP Illustrated</title> </book> <book year="1992"> <title>Advanced Programming in the Unix environment</title> </book> </bib>
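As a rough illustration of the selection and construction this query involves, the following Python sketch uses the standard `xml.etree.ElementTree` module. It is not the XML Query Language, which this document does not define; the function name `addison_wesley_after` is hypothetical, and the inline data is an abbreviated copy of the bib.xml sample (authors, editors, and prices omitted).

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the bib.xml sample data (an assumption for brevity).
BIB_XML = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title>
    <publisher>Addison-Wesley</publisher></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title>
    <publisher>Addison-Wesley</publisher></book>
  <book year="2000"><title>Data on the Web</title>
    <publisher>Morgan Kaufmann Publishers</publisher></book>
  <book year="1999"><title>The Economics of Technology and Content for Digital TV</title>
    <publisher>Kluwer Academic Publishers</publisher></book>
</bib>
"""


def addison_wesley_after(bib, year):
    """Build a new <bib> holding the year and title of each matching book."""
    result = ET.Element("bib")
    for book in bib.findall("book"):
        if (book.findtext("publisher") == "Addison-Wesley"
                and int(book.get("year")) > year):
            out = ET.SubElement(result, "book", year=book.get("year"))
            ET.SubElement(out, "title").text = book.findtext("title")
    return result


q1 = addison_wesley_after(ET.fromstring(BIB_XML), 1991)
print(ET.tostring(q1, encoding="unicode"))
```

The condition combines a test on element content (publisher) with a test on an attribute (year), and the result is a newly constructed document rather than a subset of input nodes.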
Create a flat list of all the title-author pairs, with each pair enclosed in a "result" element.
<results> <result> <title>TCP/IP Illustrated</title> <author><last>Stevens</last><first>W.</first></author> </result> <result> <title>Advanced Programming in the Unix environment</title> <author><last>Stevens</last><first>W.</first></author> </result> <result> <title>Data on the Web</title> <author><last>Abiteboul</last><first>Serge</first></author> </result> <result> <title> Data on the Web</title> <author><last>Buneman</last><first>Peter</first></author> </result> <result> <title>Data on the Web</title> <author><last>Suciu</last><first>Dan</first></author> </result> </results>
For each book in the bibliography, list the title and authors, grouped inside a "result" element.
<results> <result> <title>TCP/IP Illustrated</title> <author><last>Stevens</last><first>W.</first></author> </result> <result> <title>Advanced Programming in the Unix environment</title> <author><last>Stevens</last><first>W.</first></author> </result> <result> <title>Data on the Web</title> <author><last>Abiteboul</last><first>Serge</first></author> <author><last>Buneman</last><first>Peter</first></author> <author><last>Suciu</last><first>Dan</first></author> </result> </results>
For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.
<results> <result> <author><last>Stevens</last><first>W.</first></author> <title>TCP/IP Illustrated</title> <title>Advanced Programming in the Unix environment</title> </result> <result> <author><last>Abiteboul</last><first>Serge</first></author> <title>Data on the Web</title> </result> <result> <author><last>Buneman</last><first>Peter</first></author> <title>Data on the Web</title> </result> <result> <author><last>Suciu</last><first>Dan</first></author> <title>Data on the Web</title> </result> </results>
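The restructuring in this query (inverting the book-to-author hierarchy) can be sketched in Python as a grouping step. This is only an illustration under stated assumptions: two authors are treated as the same when their last and first names match, the hypothetical function `titles_by_author` returns a plain dictionary instead of constructing `result` elements, and the inline data is an abbreviated copy of bib.xml.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the bib.xml sample data (books with authors only).
BIB_XML = """
<bib>
  <book><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author></book>
</bib>
"""


def titles_by_author(bib):
    """Map each (last, first) author name to the titles of that author's books."""
    # dict keeps insertion order, so authors appear in order of first occurrence.
    groups = {}
    for book in bib.findall("book"):
        for author in book.findall("author"):
            key = (author.findtext("last"), author.findtext("first"))
            groups.setdefault(key, []).append(book.findtext("title"))
    return groups


q4 = titles_by_author(ET.fromstring(BIB_XML))
for (last, first), titles in q4.items():
    print(last, first, titles)
```

Comparing authors by value, as done here, is a design choice of the sketch; the requirements on identity and references elsewhere in this document leave room for other notions of equality.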
For each book found at both bn.com and amazon.com, list the title of the book and its price from each source.
<books-with-prices> <book-with-prices> <title>TCP/IP Illustrated</title> <price-amazon>65.95</price-amazon> <price-bn>65.95</price-bn> </book-with-prices> <book-with-prices> <title>Advanced Programming in the Unix environment</title> <price-amazon>65.95</price-amazon> <price-bn>65.95</price-bn> </book-with-prices> <book-with-prices> <title>Data on the Web</title> <price-amazon>34.95</price-amazon> <price-bn>39.95</price-bn> </book-with-prices> </books-with-prices>
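This query combines information from two documents, which amounts to a join on the book title. The Python sketch below shows one way to express that join; it is illustrative only. The helper names `text` and `prices_from_both` are hypothetical, the inline data is an abbreviated subset of the two sample documents, and titles and prices are stripped of surrounding whitespace before comparison, which is an assumption of this sketch, not something the use case prescribes.

```python
import xml.etree.ElementTree as ET

# Abbreviated subsets of bib.xml and reviews.xml (an assumption for brevity).
BN_XML = """
<bib>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>Data on the Web</title><price>39.95</price></book>
  <book><title>The Economics of Technology and Content for Digital TV</title>
    <price>129.95</price></book>
</bib>
"""
AMAZON_XML = """
<reviews>
  <entry><title>Data on the Web</title><price>34.95</price></entry>
  <entry><title>TCP/IP Illustrated</title><price>65.95</price></entry>
</reviews>
"""


def text(elem, tag):
    """Text of the first child with the given tag, without surrounding whitespace."""
    return elem.findtext(tag).strip()


def prices_from_both(bib, reviews):
    """Join the two documents on title and emit one row per matching book."""
    amazon = {text(e, "title"): text(e, "price") for e in reviews.findall("entry")}
    result = ET.Element("books-with-prices")
    for book in bib.findall("book"):
        title = text(book, "title")
        if title in amazon:  # keep only books present in both sources
            row = ET.SubElement(result, "book-with-prices")
            ET.SubElement(row, "title").text = title
            ET.SubElement(row, "price-amazon").text = amazon[title]
            ET.SubElement(row, "price-bn").text = text(book, "price")
    return result


q5 = prices_from_both(ET.fromstring(BN_XML), ET.fromstring(AMAZON_XML))
print(ET.tostring(q5, encoding="unicode"))
```

Building a lookup table from one document and probing it from the other is just one evaluation strategy; a declarative query language would leave that choice to the processor.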
For each book, list the title and first two authors, and an empty "et-al" element if the book has additional authors.
<bib> <book> <title>TCP/IP Illustrated</title> <author><last>Stevens</last><first>W.</first></author> </book> <book> <title>Advanced Programming in the Unix environment</title> <author><last>Stevens</last><first>W.</first></author> </book> <book> <title>Data on the Web</title> <author><last>Abiteboul</last><first> Serge</first></author> <author><last>Buneman</last><first>Peter</first></author> <et-al/> </book> </bib>
List the titles and years of all books published by Addison-Wesley after 1991, in alphabetic order.
<bib> <book year="1992"> <title>Advanced Programming in the Unix environment</title> </book> <book year="1994"> <title>TCP/IP Illustrated</title> </book> </bib>
Find books in which some element has a tag ending in "or" and the same element contains the string "Suciu" (at any level of nesting). For each such book, return the title and the qualifying element.
<book> <title> Data on the Web </title> <author> <last> Suciu </last> <first> Dan </first> </author> </book>
In the document "books.xml", find all section or chapter titles that contain the word "XML", regardless of the level of nesting.
<results> <title>XML</title> <title>XML and Semistructured Data</title> </results>
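A search "regardless of the level of nesting" corresponds to visiting every descendant of the document root. The following Python sketch shows the idea with `Element.iter`, which walks the whole tree in document order. The function name `titles_containing` is hypothetical, and "contains the word" is interpreted here as a whitespace-delimited token match, which is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

# Copy of the books.xml sample data, with surrounding whitespace in titles removed.
BOOKS_XML = """
<chapter>
  <title>Data Model</title>
  <section><title>Syntax For Data Model</title></section>
  <section><title>XML</title>
    <section><title>Basic Syntax</title></section>
    <section><title>XML and Semistructured Data</title></section>
  </section>
</chapter>
"""


def titles_containing(root, word):
    """Return the text of every <title>, at any depth, that contains the word."""
    return [t.text.strip() for t in root.iter("title")
            if word in t.text.split()]


q9 = titles_containing(ET.fromstring(BOOKS_XML), "XML")
print(q9)  # ['XML', 'XML and Semistructured Data']
```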
In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.
<results> <minprice title="Advanced Programming in the Unix environment"> 65.95 </minprice> <minprice title="TCP/IP Illustrated"> 65.95 </minprice> <minprice title="Data on the Web"> 34.95 </minprice> </results>
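This query is an instance of aggregation: books are grouped by title and a minimum is computed per group. A minimal Python sketch follows, assuming prices can be parsed as decimal numbers and titles are compared after trimming whitespace. The function name `min_prices` is hypothetical, and the inline data reproduces prices.xml with whitespace normalized.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The prices.xml sample data, with whitespace in titles normalized.
PRICES_XML = """
<prices>
  <book><title>Advanced Programming in the Unix environment</title>
    <source>www.amazon.com</source><price>65.95</price></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <source>www.bn.com</source><price>65.95</price></book>
  <book><title>TCP/IP Illustrated</title>
    <source>www.amazon.com</source><price>65.95</price></book>
  <book><title>TCP/IP Illustrated</title>
    <source>www.bn.com</source><price>65.95</price></book>
  <book><title>Data on the Web</title>
    <source>www.amazon.com</source><price>34.95</price></book>
  <book><title>Data on the Web</title>
    <source>www.bn.com</source><price>39.95</price></book>
</prices>
"""


def min_prices(prices):
    """Group books by title and emit one <minprice> element per title."""
    lowest = {}  # title -> lowest price seen so far, in order of first occurrence
    for book in prices.findall("book"):
        title = book.findtext("title").strip()
        price = Decimal(book.findtext("price").strip())
        if title not in lowest or price < lowest[title]:
            lowest[title] = price
    result = ET.Element("results")
    for title, price in lowest.items():
        ET.SubElement(result, "minprice", title=title).text = str(price)
    return result


q10 = min_prices(ET.fromstring(PRICES_XML))
print(ET.tostring(q10, encoding="unicode"))
```

`Decimal` is used instead of `float` so that the price text round-trips unchanged into the result.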
From each book with an author, return the book with its title and authors. For each book with an editor, return a reference with the book title and the editor's affiliation.
<bib> <book> <title>TCP/IP Illustrated</title> <author><last> Stevens </last> <first> W.</first></author> </book> <book> <title>Advanced Programming in the Unix environment</title> <author><last>Stevens</last><first>W.</first></author> </book> <book> <title>Data on the Web</title> <author><last>Abiteboul</last><first>Serge</first></author> <author><last>Buneman</last><first>Peter</first></author> <author><last>Suciu</last><first>Dan</first></author> </book> <reference> <title>The Economics of Technology and Content for Digital TV</title> <org>CITI</org> </reference> </bib>
Find pairs of books that have different titles but the same set of authors.
<bib> <book-pair> <title> TCP/IP Illustrated </title> <title> Advanced Programming in the Unix environment </title> </book-pair> </bib>
Some XML document-types have a very flexible structure in which text is mixed with elements and many elements are optional. These document-types show a wide variation in structure from one document to another. In documents of these types, the ways in which elements are ordered and nested are usually quite important.
An XML query language should have the ability to extract elements from documents while preserving their original hierarchy. This use case illustrates this requirement by means of a flexible document type named Book.
This use case is based on an input document named "book.xml", with the following DTD:
<!DOCTYPE book [ <!ELEMENT book (title, author+, section+)> <!ELEMENT title (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT section (title, (p | figure | section)* )> <!ATTLIST section id ID #IMPLIED difficulty CDATA #IMPLIED> <!ELEMENT p (#PCDATA)> <!ELEMENT figure (title, image)> <!ATTLIST figure width CDATA #REQUIRED height CDATA #REQUIRED > <!ELEMENT image EMPTY> <!ATTLIST image source CDATA #REQUIRED > ]>
The queries in this use case are based on the following sample data.
<?xml version="1.0"?> <!DOCTYPE book SYSTEM "book.dtd"> <book> <title>Data on the Web</title> <author>Serge Abiteboul</author> <author>Peter Buneman</author> <author>Dan Suciu</author> <section id="intro" difficulty="easy" > <title>Introduction</title> <p>Text ... </p> <section> <title>Audience</title> <p>Text ... </p> </section> <section> <title>Web Data and the Two Cultures</title> <p>Text ... </p> <figure height="400" width="400"> <title>Traditional client/server architecture</title> <image source="csarch.gif"/> </figure> <p>Text ... </p> </section> </section> <section id="syntax" difficulty="medium" > <title>A Syntax For Data</title> <p>Text ... </p> <figure height="200" width="500"> <title>Graph representations of structures</title> <image source="graphs.gif"/> </figure> <p>Text ... </p> <section> <title>Base Types</title> <p>Text ... </p> </section> <section> <title>Representing Relational Databases</title> <p>Text ... </p> <figure height="250" width="400"> <title>Examples of Relations</title> <image source="relations.gif"/> </figure> </section> <section> <title>Representing Object Databases</title> <p>Text ... </p> </section> </section> </book>
Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each <section> element, if any.
<toc> <section id="intro" difficulty="easy"> <title>Introduction</title> <section> <title>Audience</title> </section> <section> <title>Web Data and the Two Cultures</title> </section> </section> <section id="syntax" difficulty="medium"> <title>A Syntax For Data</title> <section> <title>Base Types</title> </section> <section> <title>Representing Relational Databases</title> </section> <section> <title>Representing Object Databases</title> </section> </section> </toc>
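Because sections nest to arbitrary depth, a table of contents that preserves the hierarchy is naturally expressed as a recursive transformation. The Python sketch below is one way to do it with `xml.etree.ElementTree`; the function names `toc_entry` and `table_of_contents` are hypothetical, and the inline data is an abbreviated version of the sample book.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the sample book (an assumption for brevity).
BOOK_XML = """
<book>
  <title>Data on the Web</title>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <p>Text ... </p>
    <section><title>Audience</title><p>Text ... </p></section>
  </section>
  <section id="syntax" difficulty="medium">
    <title>A Syntax For Data</title>
    <figure height="200" width="500">
      <title>Graph representations of structures</title>
    </figure>
  </section>
</book>
"""


def toc_entry(section):
    """Copy a section's attributes and title, then recurse into child sections."""
    entry = ET.Element("section", dict(section.attrib))
    ET.SubElement(entry, "title").text = section.findtext("title")
    for child in section.findall("section"):
        entry.append(toc_entry(child))
    return entry


def table_of_contents(book):
    """Build a <toc> element from the top-level sections of the book."""
    toc = ET.Element("toc")
    for section in book.findall("section"):
        toc.append(toc_entry(section))
    return toc


toc = table_of_contents(ET.fromstring(BOOK_XML))
print(ET.tostring(toc, encoding="unicode"))
```

Only `section` and `title` children are copied, so paragraphs and figures drop out while the nesting and the original attributes survive.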
Prepare a (flat) figure list for Book1, listing all the figures and their titles. Preserve the original attributes of each <figure> element, if any.
<figlist> <figure height="400" width="400"> <title>Traditional client/server architecture</title> </figure> <figure height="200" width="500"> <title>Graph representations of structures</title> </figure> <figure height="250" width ="400"> <title>Examples of Relations</title> </figure> </figlist>
How many sections are in Book1, and how many figures?
<section_count> 7 </section_count> <figure_count> 3 </figure_count>
How many top-level sections are in Book1?
<top_section_count> 2 </top_section_count>
Make a flat list of the section elements in Book1. In place of its original attributes, each section element should have two attributes, containing the title of the section and the number of figures immediately contained in the section.
<section_list> <section title="Introduction" figcount="0"> </section> <section title="Audience" figcount="0"> </section> <section title="Web Data and the Two Cultures" figcount="1"> </section> <section title="A Syntax For Data" figcount="1"> </section> <section title="Base Types" figcount="0"> </section> <section title="Representing Relational Databases" figcount="1"> </section> <section title="Representing Object Databases" figcount="0"> </section> </section_list>
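This query flattens the section hierarchy and replaces each section's attributes with computed ones. The following Python sketch illustrates the two parts: `iter` visits sections at every depth in document order, while `findall("figure")` counts only figures that are direct children. The function name `section_list` is hypothetical, and the inline data is an abbreviated version of the sample book.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the sample book (an assumption for brevity).
BOOK_XML = """
<book>
  <title>Data on the Web</title>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <section>
      <title>Web Data and the Two Cultures</title>
      <figure height="400" width="400">
        <title>Traditional client/server architecture</title>
      </figure>
    </section>
  </section>
  <section id="syntax" difficulty="medium">
    <title>A Syntax For Data</title>
    <figure height="200" width="500">
      <title>Graph representations of structures</title>
    </figure>
    <section><title>Base Types</title></section>
  </section>
</book>
"""


def section_list(book):
    """Flatten all sections, recording each title and its direct figure count."""
    result = ET.Element("section_list")
    for section in book.iter("section"):  # every depth, in document order
        ET.SubElement(
            result,
            "section",
            title=section.findtext("title"),
            # findall with a bare tag name matches direct children only
            figcount=str(len(section.findall("figure"))),
        )
    return result


q5_tree = section_list(ET.fromstring(BOOK_XML))
print(ET.tostring(q5_tree, encoding="unicode"))
```

The "Introduction" section reports zero figures even though a nested section contains one, which matches the "immediately contained" wording of the query.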
Make a nested list of the section elements in Book1, preserving their original attributes and hierarchy. Inside each section element, include the title of the section and an element that includes the number of figures immediately contained in the section.
<section_summary> <section id="intro" difficulty="easy"> <title>Introduction</title> <figcount> 0 </figcount> <section> <title>Audience</title> <figcount> 0 </figcount> </section> <section> <title>Web Data and the Two Cultures</title> <figcount> 1 </figcount> </section> </section> <section id="syntax" difficulty="medium"> <title>A Syntax For Data</title> <figcount> 1 </figcount> <section> <title>Base Types</title> <figcount> 0 </figcount> </section> <section> <title>Representing Relational Databases</title> <figcount> 1 </figcount> </section> <section> <title>Representing Object Databases</title> <figcount> 0 </figcount> </section> </section> </section_summary>
This use case illustrates queries based on the sequence in which elements appear in a document.
Although sequence is not significant in most traditional database systems or object systems, it can be quite significant in structured documents. This use case presents a series of queries based on a medical report.
This use case is based on a medical report using the HL7 Patient Record Architecture. We simplify the DTD in this example, using only what is needed to understand the queries.
<!DOCTYPE report [ <!ELEMENT report (section*)> <!ELEMENT section (section.title, section.content)> <!ELEMENT section.title (#PCDATA )> <!ELEMENT section.content (#PCDATA | anesthesia | prep | incision | action | observation )*> <!ELEMENT anesthesia (#PCDATA)> <!ELEMENT prep ( (#PCDATA | action)* )> <!ELEMENT incision ( (#PCDATA | geography | instrument)* )> <!ELEMENT action ( (#PCDATA | instrument )* )> <!ELEMENT observation (#PCDATA)> <!ELEMENT geography (#PCDATA)> <!ELEMENT instrument (#PCDATA)> ]>
The queries in this use case are based on the following sample data.
<report> <section> <section.title>Procedure</section.title> <section.content> The patient was taken to the operating room where she was placed in supine position and <anesthesia>induced under general anesthesia.</anesthesia> <prep> <action>A Foley catheter was placed to decompress the bladder</action> and the abdomen was then prepped and draped in sterile fashion. </prep> <incision> A curvilinear incision was made <geography>in the midline immediately infraumbilical</geography> and the subcutaneous tissue was divided <instrument>using electrocautery.</instrument> </incision> The fascia was identified and <action>#2 0 Maxon stay sutures were placed on each side of the midline.</action> <incision> The fascia was divided using <instrument>electrocautery</instrument> and the peritoneum was entered. </incision> <observation>The small bowel was identified.</observation> and <action> the <instrument>Hasson trocar</instrument> was placed under direct visualization. </action> <action> The <instrument>trocar</instrument> was secured to the fascia using the stay sutures. </action> </section.content> </section> </report>
In the Procedure section of Report1, what Instruments were used in the second Incision?
<result> <instrument>electrocautery</instrument> </result>
In the Procedure section of Report1, what are the first two Instruments to be used?
<result> <instrument>using electrocautery.</instrument> <instrument>electrocautery</instrument> </result>
In Report1, what Instruments were used in the first two Actions after the second Incision?
<result> <instrument>Hasson trocar</instrument> <instrument>trocar</instrument> </result>
In Report1, find "Procedure" sections where no Anesthesia element occurs before the first Incision.
<result />
(No sections satisfy Q4.)
In Report1, what happened between the first Incision and the second Incision?
<result> <action>#2 0 Maxon stay sutures were placed on each side of the midline.</action> </result>
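As an illustration of how sequence-based queries such as Q1, Q2, and Q3 might be evaluated, the following Python sketch uses the standard xml.etree.ElementTree module, whose iter() method visits elements in document order. The report text is abbreviated from the sample data, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample report; element order matches the sample data.
REPORT = """
<report>
  <section>
    <section.title>Procedure</section.title>
    <section.content>
      <anesthesia>induced under general anesthesia.</anesthesia>
      <prep><action>A Foley catheter was placed to decompress the bladder</action></prep>
      <incision>A curvilinear incision was made
        <geography>in the midline immediately infraumbilical</geography>
        <instrument>using electrocautery.</instrument>
      </incision>
      <action>#2 0 Maxon stay sutures were placed on each side of the midline.</action>
      <incision>The fascia was divided using
        <instrument>electrocautery</instrument>
      </incision>
      <observation>The small bowel was identified.</observation>
      <action>the <instrument>Hasson trocar</instrument> was placed under direct visualization.</action>
      <action>The <instrument>trocar</instrument> was secured to the fascia.</action>
    </section.content>
  </section>
</report>
"""

def procedure_content(root):
    """Return the section.content of the section titled "Procedure", if any."""
    for section in root.iter("section"):
        titles = [t.text for t in section.iter("section.title")]
        if "Procedure" in titles:
            return next(section.iter("section.content"))
    return None

def instruments_in_nth_incision(content, n):
    """Q1: instruments inside the n-th incision (1-based)."""
    incision = content.findall("incision")[n - 1]
    return [i.text for i in incision.iter("instrument")]

def first_n_instruments(content, n):
    """Q2: the first n instruments in document order."""
    return [i.text for i in content.iter("instrument")][:n]

def instruments_after_incision(content, incision_no, action_count):
    """Q3: instruments in the first action_count actions after an incision."""
    nodes = list(content.iter())                      # document order
    incision = content.findall("incision")[incision_no - 1]
    start = nodes.index(incision)
    actions = [n for n in nodes[start + 1:] if n.tag == "action"][:action_count]
    return [i.text for a in actions for i in a.iter("instrument")]

content = procedure_content(ET.fromstring(REPORT))
q1 = instruments_in_nth_incision(content, 2)
q2 = first_n_instruments(content, 2)
q3 = instruments_after_incision(content, 2, 2)
```

Each query reduces to taking a position in the document-order sequence of elements, which is the capability this use case is meant to exercise.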
One important use of an XML query language will be to access data stored in relational databases. This use case describes one possible way in which this access might be accomplished.
A relational database system might present a view in which each table (relation) takes the form of an XML document. One way to represent a database table as an XML document is to allow the document element to represent the table itself, and each row (tuple) inside the table to be represented by a nested element. Inside the tuple-elements, each column is in turn represented by a nested element. Columns that allow null values are represented by optional elements, and a missing element denotes a null value.
As an example, consider a relational database used by an online auction. The auction maintains a USERS table containing information on registered users, each identified by a unique userid, who can either offer items for sale or bid on items. An ITEMS table lists items currently or recently for sale, with the userid of the user who offered each item. A BIDS table contains all bids on record, keyed by the userid of the bidder and the item number of the item to which the bid applies.
The three tables used by the online auction are listed below, with their column names in parentheses.
USERS ( USERID, NAME, RATING )
ITEMS ( ITEMNO, DESCRIPTION, OFFERED_BY, START_DATE, END_DATE, RESERVE_PRICE )
BIDS ( USERID, ITEMNO, BID, BID_DATE )
This use case is based on three separate input documents named users.xml, items.xml, and bids.xml. Each of the documents represents one of the tables in the relational database described above, using the following DTDs:
<!DOCTYPE users [ <!ELEMENT users (user_tuple*)> <!ELEMENT user_tuple (userid, name, rating?)> <!ELEMENT userid (#PCDATA)> <!ELEMENT name (#PCDATA)> <!ELEMENT rating (#PCDATA)> ]> <!DOCTYPE items [ <!ELEMENT items (item_tuple*)> <!ELEMENT item_tuple (itemno, description, offered_by, start_date?, end_date?, reserve_price? )> <!ELEMENT itemno (#PCDATA)> <!ELEMENT description (#PCDATA)> <!ELEMENT offered_by (#PCDATA)> <!ELEMENT start_date (#PCDATA)> <!ELEMENT end_date (#PCDATA)> <!ELEMENT reserve_price (#PCDATA)> ]> <!DOCTYPE bids [ <!ELEMENT bids (bid_tuple*)> <!ELEMENT bid_tuple (userid, itemno, bid, bid_date)> <!ELEMENT userid (#PCDATA)> <!ELEMENT itemno (#PCDATA)> <!ELEMENT bid (#PCDATA)> <!ELEMENT bid_date (#PCDATA)> ]>
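The table-to-document mapping described above (document element for the table, one nested element per tuple, one nested element per column, and a missing element for a null value) can be sketched as follows. This is a minimal Python illustration, not a prescribed mapping; the function name `table_to_xml` and the user U07 are hypothetical, and lowercasing column names is an assumption chosen to match the DTDs.

```python
import xml.etree.ElementTree as ET

def table_to_xml(table_name, tuple_name, columns, rows):
    """Build <table_name> with one <tuple_name> child per row.

    A column whose value is None (a null) is represented by omitting
    its element, as the optional elements in the DTDs allow.
    """
    root = ET.Element(table_name)
    for row in rows:
        tup = ET.SubElement(root, tuple_name)
        for column, value in zip(columns, row):
            if value is None:
                continue  # null value: no element is emitted
            ET.SubElement(tup, column.lower()).text = str(value)
    return root

users = table_to_xml(
    "users", "user_tuple",
    ["USERID", "NAME", "RATING"],
    [
        ("U01", "Tom Jones", "B"),
        ("U07", "New User", None),  # hypothetical user with a null rating
    ],
)
xml_text = ET.tostring(users, encoding="unicode")
print(xml_text)
```

The second tuple has no rating element, which is how a query over this view would recognize the null.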
The following tables contain example data. Representation of this data in the DTD of this Use Case is straightforward but rather bulky, and is not included here.
USERID | NAME | RATING |
U01 | Tom Jones | B |
U02 | Mary Doe | A |
U03 | Dee Linquent | D |
U04 | Roger Smith | C |
U05 | Jack Sprat | B |
U06 | Rip Van Winkle | B |
ITEMNO | DESCRIPTION | OFFERED_BY | START_DATE | END_DATE | RESERVE_PRICE |
1001 | Red Bicycle | U01 | 99-01-05 | 99-01-20 | 40 |
1002 | Motorcycle | U02 | 99-02-11 | 99-03-15 | 500 |
1003 | Old Bicycle | U02 | 99-01-10 | 99-02-20 | 25 |
1004 | Tricycle | U01 | 99-02-25 | 99-03-08 | 15 |
1005 | Tennis Racket | U03 | 99-03-19 | 99-04-30 | 20 |
1006 | Helicopter | U03 | 99-05-05 | 99-05-25 | 50000 |
1007 | Racing Bicycle | U04 | 99-01-20 | 99-02-20 | 200 |
1008 | Broken Bicycle | U01 | 99-02-05 | 99-03-06 | 25 |
USERID | ITEMNO | BID | BID_DATE |
U02 | 1001 | 35 | 99-01-07 |
U04 | 1001 | 40 | 99-01-08 |
U02 | 1001 | 45 | 99-01-11 |
U04 | 1001 | 50 | 99-01-13 |
U02 | 1001 | 55 | 99-01-15 |
U01 | 1002 | 400 | 99-02-14 |
U02 | 1002 | 600 | 99-02-16 |
U03 | 1002 | 800 | 99-02-17 |
U04 | 1002 | 1000 | 99-02-25 |
U02 | 1002 | 1200 | 99-03-02 |
U04 | 1003 | 15 | 99-01-22 |
U05 | 1003 | 20 | 99-02-03 |
U01 | 1004 | 40 | 99-03-05 |
U03 | 1007 | 175 | 99-01-25 |
U05 | 1007 | 200 | 99-02-08 |
U04 | 1007 | 225 | 99-02-12 |
The following results assume that the queries were executed on Feb. 1, 1999.
List the item number and description of all bicycles that currently have an auction in progress, ordered by item number.
<result> <item_tuple> <itemno> 1003 </itemno> <description> Old Bicycle </description> </item_tuple> <item_tuple> <itemno> 1007 </itemno> <description> Racing Bicycle </description> </item_tuple> </result>
For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.
<result> <item_tuple> <itemno> 1001 </itemno> <description> Red Bicycle </description> <high_bid> 55 </high_bid> </item_tuple> <item_tuple> <itemno> 1003 </itemno> <description> Old Bicycle </description> <high_bid> 20 </high_bid> </item_tuple> <item_tuple> <itemno> 1007 </itemno> <description> Racing Bicycle </description> <high_bid> 225 </high_bid> </item_tuple> <item_tuple> <itemno> 1008 </itemno> <description> Broken Bicycle </description> </item_tuple> </result>
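The query above behaves like an outer join between items and bids: every bicycle is listed, and a high_bid element appears only when at least one bid exists. The following Python sketch shows one way to compute it over XML views of the two tables. The helper `build`, the function `high_bids`, and the substring test on the description are assumptions made for illustration, and only the columns the query needs are carried over.

```python
import xml.etree.ElementTree as ET

def build(table, tuple_tag, columns, rows):
    """Build the XML view of a table: one tuple element per '|'-separated row."""
    root = ET.Element(table)
    for row in rows.strip().splitlines():
        tup = ET.SubElement(root, tuple_tag)
        for column, value in zip(columns, row.split("|")):
            ET.SubElement(tup, column).text = value.strip()
    return root

items = build("items", "item_tuple", ["itemno", "description"], """
1001|Red Bicycle
1002|Motorcycle
1003|Old Bicycle
1004|Tricycle
1005|Tennis Racket
1006|Helicopter
1007|Racing Bicycle
1008|Broken Bicycle
""")

bids = build("bids", "bid_tuple", ["userid", "itemno", "bid"], """
U02|1001|35
U04|1001|40
U02|1001|45
U04|1001|50
U02|1001|55
U01|1002|400
U02|1002|600
U03|1002|800
U04|1002|1000
U02|1002|1200
U04|1003|15
U05|1003|20
U01|1004|40
U03|1007|175
U05|1007|200
U04|1007|225
""")

def high_bids(items, bids, word):
    """List matching items by item number, adding high_bid only if bids exist."""
    result = ET.Element("result")
    matching = [t for t in items.findall("item_tuple")
                if word in t.findtext("description")]
    for item in sorted(matching, key=lambda t: int(t.findtext("itemno"))):
        itemno = item.findtext("itemno")
        amounts = [int(b.findtext("bid")) for b in bids.findall("bid_tuple")
                   if b.findtext("itemno") == itemno]
        out = ET.SubElement(result, "item_tuple")
        ET.SubElement(out, "itemno").text = itemno
        ET.SubElement(out, "description").text = item.findtext("description")
        if amounts:  # items without bids get no high_bid element
            ET.SubElement(out, "high_bid").text = str(max(amounts))
    return result

result = high_bids(items, bids, "Bicycle")
summary = [(t.findtext("itemno"), t.findtext("high_bid")) for t in result]
print(summary)
```

Item 1008 appears in the summary with no high bid, mirroring the missing high_bid element in the expected result.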
Find cases where a user with a rating worse (alphabetically, greater) than "C" is offering an item with a reserve price of more than 1000.
<result> <warning> <user_name> Dee Linquent </user_name> <user_rating> D </user_rating> <item_description> Helicopter </item_description> <reserve_price> 50000 </reserve_price> </warning> </result>
List item numbers and descriptions of items that have no bids.
<result> <no_bid_item> <itemno> 1005 </itemno> <description> Tennis Racket </description> </no_bid_item> <no_bid_item> <itemno> 1006 </itemno> <description> Helicopter </description> </no_bid_item> <no_bid_item> <itemno> 1008 </itemno> <description> Broken Bicycle </description> </no_bid_item> </result>
For bicycle(s) offered by Tom Jones, list the item number, description, highest bid (if any), and name of the highest bidder, ordered by item number.
<result> <auction_item> <itemno> 1001 </itemno> <description> Red Bicycle </description> <high_bid> 55 </high_bid> <bidder> Mary Doe </bidder> </auction_item> <auction_item> <itemno> 1008 </itemno> <description> Broken Bicycle </description> </auction_item> </result>
For each item whose highest bid is more than twice its reserve price, list the item number, description, reserve price, and highest bid.
<result> <successful_item> <itemno> 1002 </itemno> <description> Motorcycle </description> <reserve_price> 500 </reserve_price> <high_bid> 1200 </high_bid> </successful_item> <successful_item> <itemno> 1004 </itemno> <description> Tricycle </description> <reserve_price> 15 </reserve_price> <high_bid> 40 </high_bid> </successful_item> </result>
Find the highest bid ever made for a bicycle or tricycle.
<high_bid> 225 </high_bid>
How many items were auctioned (auction ended) in March 1999?
<item_count> 3 </item_count>
List the number of items auctioned each month in 1999 for which data is available, ordered by month.
<result> <monthly_result> <month> 1 </month> <item_count> 1 </item_count> </monthly_result> <monthly_result> <month> 2 </month> <item_count> 2 </item_count> </monthly_result> <monthly_result> <month> 3 </month> <item_count> 3 </item_count> </monthly_result> <monthly_result> <month> 4 </month> <item_count> 1 </item_count> </monthly_result> <monthly_result> <month> 5 </month> <item_count> 1 </item_count> </monthly_result> </result>
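The monthly grouping can be sketched in Python as follows, assuming that an item counts as auctioned in the month in which its END_DATE falls and that dates use the YY-MM-DD form shown in the ITEMS table. The names `END_DATES` and `items_per_month` are hypothetical.

```python
from collections import Counter

# END_DATE column of the ITEMS table, keyed by ITEMNO.
END_DATES = {
    1001: "99-01-20", 1002: "99-03-15", 1003: "99-02-20", 1004: "99-03-08",
    1005: "99-04-30", 1006: "99-05-25", 1007: "99-02-20", 1008: "99-03-06",
}

def items_per_month(end_dates, year="99"):
    """Count items per month of the given year, ordered by month."""
    counts = Counter()
    for date in end_dates.values():
        yy, mm, _dd = date.split("-")
        if yy == year:
            counts[int(mm)] += 1
    return sorted(counts.items())

monthly = items_per_month(END_DATES)
print(monthly)  # [(1, 1), (2, 2), (3, 3), (4, 1), (5, 1)]
```

Months with no ended auctions do not appear, which matches the phrase "for which data is available" in the query.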
For each item that has received a bid, list the item number, the highest bid, and the name of the highest bidder, ordered by item number.
<result> <high_bid> <itemno> 1001 </itemno> <bid> 55 </bid> <bidder> Mary Doe </bidder> </high_bid> <high_bid> <itemno> 1002 </itemno> <bid> 1200 </bid> <bidder> Mary Doe </bidder> </high_bid> <high_bid> <itemno> 1003 </itemno> <bid> 20 </bid> <bidder> Jack Sprat </bidder> </high_bid> <high_bid> <itemno> 1004 </itemno> <bid> 40 </bid> <bidder> Tom Jones </bidder> </high_bid> <high_bid> <itemno> 1007 </itemno> <bid> 225 </bid> <bidder> Roger Smith </bidder> </high_bid> </result>
List the item number and description of the item(s) that received the highest bid ever recorded, and the amount of that bid.
<result> <expensive_item> <itemno> 1002 </itemno> <description> Motorcycle </description> <high_bid> 1200 </high_bid> </expensive_item> </result>
List the item number and description of the item(s) that received the largest number of bids, and the number of bids it (or they) received.
<result> <popular_item> <itemno> 1001 </itemno> <description> Red Bicycle </description> <bid_count> 5 </bid_count> </popular_item> <popular_item> <itemno> 1002 </itemno> <description> Motorcycle </description> <bid_count> 5 </bid_count> </popular_item> </result>
For each user who has placed a bid, give the userid, name, number of bids, and average bid, in order by userid.
<result> <bidder> <userid> U01 </userid> <name> Tom Jones </name> <bidcount> 2 </bidcount> <avgbid> 220 </avgbid> </bidder> <bidder> <userid> U02 </userid> <name> Mary Doe </name> <bidcount> 5 </bidcount> <avgbid> 387 </avgbid> </bidder> <bidder> <userid> U03 </userid> <name> Dee Linquent </name> <bidcount> 2 </bidcount> <avgbid> 487.5 </avgbid> </bidder> <bidder> <userid> U04 </userid> <name> Roger Smith </name> <bidcount> 5 </bidcount> <avgbid> 266 </avgbid> </bidder> <bidder> <userid> U05 </userid> <name> Jack Sprat </name> <bidcount> 2 </bidcount> <avgbid> 110 </avgbid> </bidder> </result>
List item numbers and average bids for items that have received three or more bids, in descending order by average bid.
<result> <popular_item> <itemno> 1002 </itemno> <avgbid> 800 </avgbid> </popular_item> <popular_item> <itemno> 1007 </itemno> <avgbid> 200 </avgbid> </popular_item> <popular_item> <itemno> 1001 </itemno> <avgbid> 45 </avgbid> </popular_item> </result>
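This query combines grouping, a filter on the group size, and ordering by an aggregate. A minimal Python sketch follows; the (ITEMNO, BID) pairs are copied from the BIDS table and stand in for bid_tuple elements, and the name `popular_items` is hypothetical.

```python
from collections import defaultdict

# (ITEMNO, BID) pairs from the BIDS table.
BIDS = [
    (1001, 35), (1001, 40), (1001, 45), (1001, 50), (1001, 55),
    (1002, 400), (1002, 600), (1002, 800), (1002, 1000), (1002, 1200),
    (1003, 15), (1003, 20),
    (1004, 40),
    (1007, 175), (1007, 200), (1007, 225),
]

def popular_items(bids, min_bids=3):
    """Average bid per item with at least min_bids bids, highest average first."""
    by_item = defaultdict(list)
    for itemno, amount in bids:
        by_item[itemno].append(amount)
    averages = [
        (itemno, sum(amounts) / len(amounts))
        for itemno, amounts in by_item.items()
        if len(amounts) >= min_bids
    ]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)

popular = popular_items(BIDS)
print(popular)  # [(1002, 800.0), (1007, 200.0), (1001, 45.0)]
```

Items 1003 and 1004 are dropped by the group-size filter before the averages are ordered.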
List names of users who have placed multiple bids of at least $100 each.
<result> <big_spender> Mary Doe </big_spender> <big_spender> Dee Linquent </big_spender> <big_spender> Roger Smith </big_spender> </result>
List all registered users in order by userid; for each user, include the userid, name, and an indication of whether the user is active (has at least one bid on record) or inactive (has no bid on record).
<result> <user> <userid> U01 </userid> <name> Tom Jones </name> <status> active </status> </user> <user> <userid> U02 </userid> <name> Mary Doe </name> <status> active </status> </user> <user> <userid> U03 </userid> <name> Dee Linquent </name> <status> active </status> </user> <user> <userid> U04 </userid> <name> Roger Smith </name> <status> active </status> </user> <user> <userid> U05 </userid> <name> Jack Sprat </name> <status> active </status> </user> <user> <userid> U06 </userid> <name> Rip Van Winkle </name> <status> inactive </status> </user> </result>
List the names of users, if any, who have bid on every item.
<frequent_bidder />
(No users satisfy Q17.)
List all users in alphabetic order by name. For each user, include descriptions of all the items (if any) that were bid on by that user, in alphabetic order.
<result> <user> <name> Dee Linquent </name> <bid_on_item> Motorcycle </bid_on_item> <bid_on_item> Racing Bicycle </bid_on_item> </user> <user> <name> Jack Sprat </name> <bid_on_item> Old Bicycle </bid_on_item> <bid_on_item> Racing Bicycle </bid_on_item> </user> <user> <name> Mary Doe </name> <bid_on_item> Motorcycle </bid_on_item> <bid_on_item> Red Bicycle </bid_on_item> </user> <user> <name> Rip Van Winkle </name> </user> <user> <name> Roger Smith </name> <bid_on_item> Motorcycle </bid_on_item> <bid_on_item> Old Bicycle </bid_on_item> <bid_on_item> Racing Bicycle </bid_on_item> <bid_on_item> Red Bicycle </bid_on_item> </user> <user> <name> Tom Jones </name> <bid_on_item> Motorcycle </bid_on_item> <bid_on_item> Tricycle </bid_on_item> </user> </result>
The example document and queries in this Use Case were first created for a 1992 conference on Standard Generalized Markup Language (SGML). For our use, the Document Type Definition (DTD) and example document have been translated from SGML to XML.
This use case is based on an implicit (unnamed) input data set, using the DTD shown below.
<!NOTATION cgm PUBLIC "Computer Graphics Metafile"> <!NOTATION ccitt PUBLIC "CCITT group 4 raster"> <!ENTITY % text "(#PCDATA | emph)*"> <!ENTITY infoflow SYSTEM "infoflow.ccitt" NDATA ccitt> <!ENTITY tagexamp SYSTEM "tagexamp.cgm" NDATA cgm> <!ELEMENT report (title, chapter+)> <!ELEMENT title %text;> <!ELEMENT chapter (title, intro?, section*)> <!ATTLIST chapter shorttitle CDATA #IMPLIED> <!ELEMENT intro (para | graphic)+> <!ELEMENT section (title, intro?, topic*)> <!ATTLIST section shorttitle CDATA #IMPLIED sectid ID #IMPLIED> <!ELEMENT topic (title, (para | graphic)+)> <!ATTLIST topic shorttitle CDATA #IMPLIED topicid ID #IMPLIED> <!ELEMENT para (#PCDATA | emph | xref)*> <!ATTLIST para security (u | c | s | ts) "u"> <!ELEMENT emph %text;> <!ELEMENT graphic EMPTY> <!ATTLIST graphic graphname ENTITY #REQUIRED> <!ELEMENT xref EMPTY> <!ATTLIST xref xrefid IDREF #IMPLIED>
The queries in this use case are based on the following sample data. Line numbers have been added to the data to allow the results of queries to be conveniently specified.
<?xml version="1.0"?> <!-- 0--> <!DOCTYPE report SYSTEM "report.dtd"> <!-- 1--> <report> <!-- 2--> <title>Getting started with SGML</title> <!-- 3--> <chapter> <!-- 4--> <title>The business challenge</title> <!-- 5--> <intro> <!-- 6--> <para>With the ever-changing and growing global market, companies and <!-- 7--> large organizations are searching for ways to become more viable and <!-- 8--> competitive. Downsizing and other cost-cutting measures demand more <!-- 9--> efficient use of corporate resources. One very important resource is <!--10--> an organization's information.</para> <!--11--> <para>As part of the move toward integrated information management, <!--12--> whole industries are developing and implementing standards for <!--13--> exchanging technical information. This report describes how one such <!--14--> standard, the Standard Generalized Markup Language (SGML), works as <!--15--> part of an overall information management strategy.</para> <!--16--> <graphic graphname="infoflow"/></intro></chapter> <!--17--> <chapter> <!--18--> <title>Getting to know SGML</title> <!--19--> <intro> <!--20--> <para>While SGML is a fairly recent technology, the use of <!--21--> <emph>markup</emph> in computer-generated documents has existed for a <!--22--> while.</para></intro> <!--23--> <section shorttitle="What is markup?"> <!--24--> <title>What is markup, or everything you always wanted to know about <!--25--> document preparation but were afraid to ask?</title> <!--26--> <intro> <!--27--> <para>Markup is everything in a document that is not content. The <!--28--> traditional meaning of markup is the manual <emph>marking</emph> up <!--29--> of typewritten text to give instructions for a typesetter or <!--30--> compositor about how to fit the text on a page and what typefaces to <!--31--> use. 
This kind of markup is known as <emph>procedural markup</emph>.</para></intro> <!--32--> <topic topicid="top1"> <!--33--> <title>Procedural markup</title> <!--34--> <para>Most electronic publishing systems today use some form of <!--35--> procedural markup. Procedural markup codes are good for one <!--36--> presentation of the information.</para></topic> <!--37--> <topic topicid="top2"> <!--38--> <title>Generic markup</title> <!--39--> <para>Generic markup (also known as descriptive markup) describes the <!--40--> <emph>purpose</emph> of the text in a document. A basic concept of <!--41--> generic markup is that the content of a document must be separate from <!--42--> the style. Generic markup allows for multiple presentations of the <!--43--> information.</para></topic> <!--44--> <topic topicid="top3"> <!--45--> <title>Drawbacks of procedural markup</title> <!--46--> <para>Industries involved in technical documentation increasingly <!--47--> prefer generic over procedural markup schemes. When a company changes <!--48--> software or hardware systems, enormous data translation tasks arise, <!--49--> often resulting in errors.</para></topic></section> <!--50--> <section shorttitle="What is SGML?"> <!--51--> <title>What <emph>is</emph> SGML in the grand scheme of the universe, anyway?</title> <!--52--> <intro> <!--53--> <para>SGML defines a strict markup scheme with a syntax for defining <!--54--> document data elements and an overall framework for marking up <!--55--> documents.</para> <!--56--> <para>SGML can describe and create documents that are not dependent on <!--57--> any hardware, software, formatter, or operating system. 
Since SGML documents <!--58--> conform to an international standard, they are portable.</para></intro></section> <!--59--> <section shorttitle="How does SGML work?"> <!--60--> <title>How is SGML and would you recommend it to your grandmother?</title> <!--61--> <intro> <!--62--> <para>You can break a typical document into three layers: structure, <!--63--> content, and style. SGML works by separating these three aspects and <!--64--> deals mainly with the relationship between structure and content.</para></intro> <!--65--> <topic topicid="top4"> <!--66--> <title>Structure</title> <!--67--> <para>At the heart of an SGML application is a file called the DTD, or <!--68--> Document Type Definition. The DTD sets up the structure of a document, <!--69--> much like a database schema describes the types of information it <!--70--> handles.</para> <!--71--> <para>A database schema also defines the relationships between the <!--72--> various types of data. Similarly, a DTD specifies <emph>rules</emph> <!--73--> to help ensure documents have a consistent, logical structure.</para></topic> <!--74--> <topic topicid="top5"> <!--75--> <title>Content</title> <!--76--> <para>Content is the information itself. The method for identifying <!--77--> the information and its meaning within this framework is called <!--78--> <emph>tagging</emph>. Tagging must <!--79--> conform to the rules established in the DTD (see <xref xrefid="top4"/>).</para> <!--80--> <graphic graphname="tagexamp"/></topic> <!--81--> <topic topicid="top6"> <!--82--> <title>Style</title> <!--83--> <para>SGML does not standardize style or other processing methods for <!--84--> information stored in SGML.</para></topic></section></chapter> <!--85--> <chapter> <!--86--> <title>Resources</title> <!--87--> <section> <!--88--> <title>Conferences, tutorials, and training</title> <!--89--> <intro> <!--90--> <para>The Graphic Communications Association has been <!--91--> instrumental in the development of SGML. 
GCA provides conferences, <!--92--> tutorials, newsletters, and publication sales for both members and <!--93--> non-members.</para> <!--94--> <para security="c">Exiled members of the former Soviet Union's secret <!--95--> police, the KGB, have infiltrated the upper ranks of the GCA and are <!--96--> planning the Final Revolution as soon as DSSSL is completed.</para> <!--97--> </intro> <!--98--> </section> <!--99--> </chapter> <!--100--></report>
Locate all paragraphs in the report (all "para" elements occurring anywhere within the "report" element).
Expected Results: Elements whose start-tags are on lines 6, 11, 20, 27, 34, 39, 46, 53, 56, 62, 67, 71, 76, 83, 90, 94
Locate all paragraph elements in an introduction (all "para" elements directly contained within an "intro" element).
Expected Results: Elements whose start-tags are on lines 6, 11, 20, 27, 53, 56, 62, 90, 94
Locate all paragraphs in the introduction of a section that is in a chapter that has no introduction (all "para" elements directly contained within an "intro" element directly contained in a "section" element directly contained in a "chapter" element. The "chapter" element must not directly contain an "intro" element).
Expected Results: Elements whose start-tags are on lines 90, 94
Locate the second paragraph in the third section in the second chapter (the second "para" element occurring in the third "section" element occurring in the second "chapter" element occurring in the "report").
Expected Results: Element whose start-tag is on line 67
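The difference between "occurring anywhere within" and "directly contained within" in Q1 through Q4 can be sketched in Python with xml.etree.ElementTree. The skeleton below keeps only the structural elements of the sample report. The `line` attribute is not part of the DTD; it is an assumption added here so that results can be compared with the line numbers given above.

```python
import xml.etree.ElementTree as ET

# Structural skeleton of the sample report; "line" is a hypothetical attribute
# recording the line on which each para start-tag appears in the sample data.
SKELETON = """
<report>
  <chapter>
    <intro><para line="6"/><para line="11"/></intro>
  </chapter>
  <chapter>
    <intro><para line="20"/></intro>
    <section>
      <intro><para line="27"/></intro>
      <topic><para line="34"/></topic>
      <topic><para line="39"/></topic>
      <topic><para line="46"/></topic>
    </section>
    <section>
      <intro><para line="53"/><para line="56"/></intro>
    </section>
    <section>
      <intro><para line="62"/></intro>
      <topic><para line="67"/><para line="71"/></topic>
      <topic><para line="76"/></topic>
      <topic><para line="83"/></topic>
    </section>
  </chapter>
  <chapter>
    <section>
      <intro><para line="90"/><para line="94"/></intro>
    </section>
  </chapter>
</report>
"""

root = ET.fromstring(SKELETON)

def lines(paras):
    """Return the recorded line number of each para element."""
    return [int(p.get("line")) for p in paras]

# Q1: para elements anywhere within the report (descendants).
q1 = lines(root.iter("para"))

# Q2: para elements that are direct children of an intro element.
q2 = lines(root.findall(".//intro/para"))

# Q3: chapter/section/intro/para, where the chapter has no intro child.
q3 = lines(
    p
    for chapter in root.findall("chapter")
    if chapter.find("intro") is None
    for p in chapter.findall("section/intro/para")
)

# Q4: second para occurring in the third section of the second chapter.
third_section = root.findall("chapter")[1].findall("section")[2]
q4 = lines(list(third_section.iter("para"))[1:2])
```

Q4 counts the intro paragraph on line 62 as the first paragraph of the section, so the second is the one on line 67.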
Locate all classified paragraphs (all "para" elements whose "security" attribute has the value "c").
Expected Results: Element whose start-tag is on line 94
List the short titles of all sections (the values of the "shorttitle" attributes of all "section" elements, expressing each short title as the value of a new element.)
Expected Results: Attribute values in start-tags on lines 23, 50, 59
Locate the initial letter of the initial paragraph of all introductions (the first character in the content [character content as well as element content] of the first "para" element contained in an "intro" element).
Expected Results: Character after start-tag on lines 6, 20, 27, 53, 62, 90
Locate all sections with a title that has "is SGML" in it (all "section" elements that contain a "title" element that has the consecutive characters "is SGML" in its content). The string can be interrupted by sub-elements.
Expected Results: Elements whose start-tags are on lines 50, 59
Same as (Q8a), but the string cannot be interrupted by sub-elements.
Expected Results: Element whose start-tag is on line 59
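The distinction between Q8a and Q8b comes down to whether the search runs over the concatenated character content of the title or over each text node separately. A minimal Python sketch, assuming the three section titles from the second chapter of the sample data wrapped in a hypothetical `sections` element:

```python
import xml.etree.ElementTree as ET

# Section titles copied from the sample data; the <sections> wrapper is hypothetical.
TITLES = """
<sections>
  <section shorttitle="What is markup?"><title>What is markup, or everything you always wanted to know about document preparation but were afraid to ask?</title></section>
  <section shorttitle="What is SGML?"><title>What <emph>is</emph> SGML in the grand scheme of the universe, anyway?</title></section>
  <section shorttitle="How does SGML work?"><title>How is SGML and would you recommend it to your grandmother?</title></section>
</sections>
"""

root = ET.fromstring(TITLES)
PHRASE = "is SGML"

def matches_across_subelements(title, phrase):
    """Q8a: search the full character content, ignoring element boundaries."""
    return phrase in "".join(title.itertext())

def matches_within_one_text_node(title, phrase):
    """Q8b: the phrase must lie entirely inside a single text node."""
    return any(phrase in piece for piece in title.itertext())

q8a = [s.get("shorttitle") for s in root.findall("section")
       if matches_across_subelements(s.find("title"), PHRASE)]
q8b = [s.get("shorttitle") for s in root.findall("section")
       if matches_within_one_text_node(s.find("title"), PHRASE)]
```

In the second title the emph element splits the phrase into the text nodes "is" and " SGML in the grand scheme...", so it satisfies Q8a but not Q8b.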
Locate all the topics referenced by a cross-reference anywhere in the report (all the "topic" elements whose "topicid" attribute value is the same as an "xrefid" attribute value of any "xref" element).
Expected Results: Element whose start-tag is on line 65
Locate the closest title preceding the cross-reference ("xref") element whose "xrefid" attribute is "top4" (the "title" element that would be touched last before this "xref" element when touching each element in document order).
Expected Results: Given xref on line 79, element whose start-tag is on line 75
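Q9 and Q10 can be sketched in Python as a reference lookup and a document-order scan, respectively. The fragment below is abbreviated from the third section of the second chapter of the sample data; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated fragment of the sample data; paragraph text is shortened.
FRAGMENT = """
<section shorttitle="How does SGML work?">
  <title>How is SGML and would you recommend it to your grandmother?</title>
  <topic topicid="top4">
    <title>Structure</title>
    <para>At the heart of an SGML application is a file called the DTD.</para>
  </topic>
  <topic topicid="top5">
    <title>Content</title>
    <para>Tagging must conform to the rules established in the DTD
      (see <xref xrefid="top4"/>).</para>
  </topic>
  <topic topicid="top6">
    <title>Style</title>
    <para>SGML does not standardize style.</para>
  </topic>
</section>
"""

root = ET.fromstring(FRAGMENT)

def referenced_topics(root):
    """Q9: topics whose topicid equals the xrefid of some xref element."""
    targets = {x.get("xrefid") for x in root.iter("xref")}
    return [t for t in root.iter("topic") if t.get("topicid") in targets]

def closest_preceding_title(root, xrefid):
    """Q10: the last title seen, in document order, before the given xref."""
    last_title = None
    for elem in root.iter():  # iter() visits elements in document order
        if elem.tag == "title":
            last_title = elem
        elif elem.tag == "xref" and elem.get("xrefid") == xrefid:
            return last_title
    return None

q9 = [t.get("topicid") for t in referenced_topics(root)]
q10 = closest_preceding_title(root, "top4").text
```

Note that the closest preceding title belongs to the topic that contains the cross-reference ("Content"), not to the topic it refers to ("Structure").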
This use case is based on company profiles and a set of news documents which contain data for PR, mergers and acquisitions, etc. Given a company, the use case illustrates several different queries for searching text in news documents and different ways of providing query results by matching the information from the company profile and the content of the news items.
In this use case, searches for company names are to be interpreted as word-based searches. The words in a company name may be in any case and may be separated by any kind of white space.
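One possible reading of this word-based matching rule is sketched below in Python with the standard re module: the words of the company name are matched case-insensitively, separated by any run of white space, and the match may not begin or end in the middle of a longer word. The function names and the exact treatment of word boundaries are assumptions, not part of the use case.

```python
import re

def name_pattern(name):
    """Compile a case-insensitive, white-space-tolerant pattern for a name."""
    words = name.split()
    body = r"\s+".join(re.escape(word) for word in words)
    # Lookarounds rather than \b, so names ending in "." (e.g. "Inc.") still match.
    return re.compile(r"(?<!\w)" + body + r"(?!\w)", re.IGNORECASE)

def mentions(name, text):
    """Return True if the text contains the name as a sequence of whole words."""
    return name_pattern(name).search(text) is not None

examples = [
    mentions("Foobar Corporation", "taking a position in foobar\n   CORPORATION."),
    mentions("TheAppCompany Inc.", "TheAppCompany Inc. immediately announced"),
    mentions("Foobar Corporation", "compete with Foobar's Foo 20.9 servers"),
    mentions("Foobar Corporation", "Foobar Corporations"),
]
print(examples)  # [True, True, False, False]
```

Under this reading a mention of "Foobar" alone does not count as a mention of "Foobar Corporation", since every word of the name must be present.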
This use case is based on an implicit (unnamed) input data set based on the following two DTDs:
<!-- company.dtd --> <!ELEMENT company (name, ticker_symbol?, description?, business_code, partners?, competitors?)> <!ELEMENT name (#PCDATA)> <!ELEMENT ticker_symbol (#PCDATA)> <!ELEMENT description (#PCDATA)> <!ELEMENT business_code (#PCDATA)> <!ELEMENT partners (partner+)> <!ELEMENT partner (#PCDATA)> <!ELEMENT competitors (competitor+)> <!ELEMENT competitor (#PCDATA)>
<!-- news.dtd --> <!ELEMENT news (news_item*)> <!ELEMENT news_item (title, content, date, author?, news_agent)> <!ELEMENT title (#PCDATA)> <!ELEMENT content (par | figure)+ > <!ELEMENT date (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT news_agent (#PCDATA)> <!ELEMENT par (#PCDATA | quote | footnote)*> <!ELEMENT quote (#PCDATA)> <!ELEMENT footnote (#PCDATA)> <!ELEMENT figure (title, image)> <!ELEMENT image EMPTY> <!ATTLIST image source CDATA #REQUIRED >
The queries in this use case are based on the following sample data.
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE company SYSTEM "company.dtd"> <company> <name>Foobar Corporation</name> <ticker_symbol>FOO</ticker_symbol> <description>Foobar Corporation is a maker of Foo(TM) and Foobar(TM) products and a leading software company with a 300 Billion dollar revenue in 1999. It is located in Alaska. </description> <business_code>Software</business_code> <partners> <partner>YouNameItWeIntegrateIt.com</partner> <partner>TheAppCompany Inc.</partner> </partners> <competitors> <competitor>Gorilla Corporation</competitor> </competitors> </company>
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE news SYSTEM "news.dtd"> <news> <news_item> <title> Gorilla Corporation acquires YouNameItWeIntegrateIt.com </title> <content> <par> Today, Gorilla corparation announced that it will purchase YouNameItWeIntegrateIt.com. The shares of YouNameItWeIntegrateIt.com dropped $3.00 as a result of this announcement. </par> <par> As a result of this acquisition, the CEO of YouNameItWeIntegrateIt.com Bill Smarts resigned. He did not announce what he will do next. Sources close to YouNameItWeIntegrateIt.com hint that Bill Smarts might be taking a position in Foobar Corporation. </par> <par> YouNameItWeIntegrateIt.com is a leading systems integrator that enables <quote>brick and mortar</quote> companies to have a presence on the web. </par> </content> <date>1-20-2000</date> <author>Mark Davis</author> <news_agent>News Online</news_agent> </news_item> <news_item> <title>Foobar Corporation releases its new line of Foo products today</title> <content> <par> Foobar Corporation releases the 20.9 version of its Foo products. The new version of Foo products solve known performance problems which existed in 20.8 line and increases the speed of Foo based products tenfold. It also allows wireless clients to be connected to the Foobar servers. </par> <par> The President of Foobar Corporation announced that they were proud to release 20.9 version of Foo products and they will upgrade existing customers <footnote>where service agreements exist</footnote> promptly. TheAppCompany Inc. immediately announced that it will release the new version of its products to utilize the 20.9 architecture within the next three months. </par> <figure> <title>Presidents of Foobar Corporation and TheAppCompany Inc. 
Shake Hands</title> <image source="handshake.jpg"/> </figure> </content> <date>1-20-2000</date> <news_agent>Foobar Corporation</news_agent> </news_item> <news_item> <title>Foobar Corporation is suing Gorilla Corporation for patent infringement </title> <content> <par> In surprising developments today, Foobar Corporation announced that it is suing Gorilla Corporation for patent infringement. The patents that were mentioned as part of the lawsuit are considered to be the basis of Foobar Corporation's <quote>Wireless Foo</quote> line of products. </par> <par> The tension between Foobar and Gorilla Corporations has been increasing ever since the Gorilla Corporation acquired more than 40 engineers who have left Foobar Corporation, TheAppCompany Inc. and YouNameItWeIntegrateIt.com over the past 3 months. The engineers who have left the Foobar corporation and its partners were rumored to be working on the next generation of server products and applications which will directly compete with Foobar's Foo 20.9 servers. Most of the engineers have relocated to Hawaii where the Gorilla Corporation's server development is located. </par> </content> <date>1-20-2000</date> <news_agent>Reliable News Corporation</news_agent> </news_item> </news>
Find all news items where the name "Foobar Corporation" appears in the title.
Expected Results: The expected results are the news item elements with the following titles:
Foobar Corporation releases its new line of Foo products today
Foobar Corporation is suing Gorilla Corporation for patent infringement
Find news items where the Foobar Corporation and one or more of its partners are mentioned in the same paragraph and/or title. List each news item by its title and date.
<news_item> <title>Gorilla Corporation acquires YouNameItWeIntegrateIt.com</title> <date>1-20-2000</date> </news_item> <news_item> <title>Foobar Corporation releases its new line of Foo products today</title> <date>1-20-2000</date> </news_item> <news_item> <title>Foobar Corporation is suing Gorilla Corporation for patent infringement</title> <date>1-20-2000</date> </news_item>
Find titles of news items where Foobar Corporation and one or more of its partners are mentioned in the same sentence, but none of its competitors are mentioned in the news item. (The "." character designates the end of a sentence.)
<title>Foobar Corporation releases its new line of Foo products today </title>
Find news items where a company and one of its partners are mentioned in the same news item and the news item is not authored by the company itself.
Expected Results: The expected results are the news item elements with the following titles:
Gorilla Corporation acquires YouNameItWeIntegrateIt.com
Foobar Corporation is suing Gorilla Corporation for patent infringement
For each news item that is relevant to the Gorilla Corporation, create an "item summary" element. The content of the item summary is the content of the title, date, and first paragraph of the news item, separated by periods. A news item is relevant if the name of the company is mentioned anywhere within the content of the news item.
<item_summary> Gorilla Corporation acquires YouNameItWeIntegrateIt.com. 1-20-2000. Today, Gorilla corparation announced that it will purchase YouNameItWeIntegrateIt.com. The shares of YouNameItWeIntegrateIt.com dropped $3.00 as a result of this announcement. </item_summary> <item_summary> Foobar Corporation is suing Gorilla Corporation for patent infringement. 1-20-2000. In surprising developments today, Foobar Corporation announced that it is suing Gorilla Corporation for patent infringement. The patents that were mentioned as part of the lawsuit are considered to be the basis of Foobar Corporation's <quote>Wireless Foo</quote> line of products. </item_summary>
Find news items where two company names and some form of the word "acquire" appear in the title or in the same sentence in one of the paragraphs. A company name is defined as the content of a <name>, <partner>, or <competitor> element within a <company> element.
Expected Results: The expected results are the news item elements with the following titles:
Gorilla Corporation acquires YouNameItWeIntegrateIt.com
Foobar Corporation is suing Gorilla Corporation for patent infringement
This use case performs a variety of queries on namespace-qualified names.
This use case is based on a scenario in which a neutral mediator interacts with public auction servers on behalf of clients. A client might use this imaginary service for anonymity, better insurance, or the ability to cover more than one market at a time. The following aspects of namespaces are illustrated by this use case:
Syntactic disambiguation when combining XML data from different sources
Re-use of predefined modules, such as XLinks or XML Schema
Support for global classification schemas, such as the Dublin Core
The sample data consists of two records. The schema for this data uses W3C XML Schema's schema composition to create a schema from predefined, namespace-separated modules, and uses XLink to express references. Each record describes a running auction. It embeds data specific to an auctioneer (e.g. the company's credit rating system) and a taxonomy specific to a particular good (jazz records) in a framework that contains data common to all auctions (e.g. start and end time), using namespaces to distinguish the three vocabularies.
Note that namespace prefixes must be resolved to their Namespace URIs before matching namespace-qualified names. It is not sufficient to use the literal prefixes to denote namespaces. Furthermore, there are several possible ways to represent namespace declarations. Therefore, processing must be done on the namespace-processed XML Information Set, not on the XML text representation.
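As a minimal sketch of why prefixes must be resolved before matching, the following Python fragment uses the standard `xml.etree.ElementTree` module. The `traders` wrapper element is hypothetical and exists only for this illustration; the namespace URI and the `eachbay` and `seller` prefixes are taken from the sample data.

```python
import xml.etree.ElementTree as ET

# Two prefixes bound to the same namespace URI, as in the sample data.
EACHBAY = "http://www.AuctionMediatorCompany.com/auctioneers#eachbay"

# The <traders> wrapper is a hypothetical container used only for this sketch.
doc = f"""
<traders xmlns:eachbay="{EACHBAY}" xmlns:seller="{EACHBAY}">
  <eachbay:ID>VintageRecordFreak</eachbay:ID>
  <seller:ID>StarsOn45</seller:ID>
</traders>
"""

root = ET.fromstring(doc)

# ElementTree exposes each name in "{namespace-uri}local" form, so the
# literal prefixes ("eachbay", "seller") play no part in the comparison.
ids = [el.text for el in root.iter(f"{{{EACHBAY}}}ID")]
print(ids)
```

Both `ID` elements are found because they share a Namespace URI, even though their prefixes differ. A comparison on the literal prefix would have missed the second one.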
DTDs are not fully compatible with namespaces because they cannot express that two nodes are in the same namespace when they use different namespace prefixes. In a later version of this paper, an XML Schema should be added here.
This use case is based on an implicit (unnamed) input data set.
<?xml version="1.0" encoding="ISO-8859-1"?> <ma:AuctionWatchList xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:anyzone="http://www.AuctionMediatorCompany.com/auctioneers#anyzone" xmlns:eachbay="http://www.AuctionMediatorCompany.com/auctioneers#eachbay" xmlns:yabadoo="http://www.AuctionMediatorCompany.com/auctioneers#yabadoo" > <!-- ________________________________________________________________________________ --> <ma:Auction anyzone:ID="0321K372910"> <ma:AuctionHomepage xlink:type="simple" xlink:href="http://auction.anyzone.com/item/0321K372910" /> <ma:Schedule> <ma:Open xmlns:dt="http://www.w3.org/1999/XMLSchema-datatypes" dt:type="timeInstant">2000-03-21:07:41:34-05:00</ma:Open> <ma:Close xmlns:dt="http://www.w3.org/1999/XMLSchema-datatypes" dt:type="timeInstant">2000-03-23:07:41:34-05:00</ma:Close> </ma:Schedule> <ma:Price> <ma:Start ma:currency="USD">3.00</ma:Start> <ma:Current ma:currency="USD">10.00</ma:Current> <ma:Number_of_Bids>5</ma:Number_of_Bids> </ma:Price> <ma:Trading_Partners> <ma:High_Bidder> <eachbay:ID>RecordsRUs</eachbay:ID> <eachbay:PositiveComments>231</eachbay:PositiveComments> <eachbay:NeutralComments>2</eachbay:NeutralComments> <eachbay:NegativeComments>5</eachbay:NegativeComments> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/members?get=RecordsRUs" xlink:role="ma:MemberInfoPage" /> </ma:High_Bidder> <ma:Seller> <anyzone:ID>VintageRecordFreak</anyzone:ID> <anyzone:Member_Since>October 1999</anyzone:Member_Since> <anyzone:Rating>5</anyzone:Rating> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.anyzone.com/members/VintageRecordFreak" xlink:role="ma:MemberInfoPage" /> </ma:Seller> </ma:Trading_Partners> <ma:Details> <record xmlns="http://www.musicdatabase.org/music/records"> <artist>Miles Davis</artist> <title>In a Silent Way</title> <recorded>1969</recorded> <label>Columbia Records</label> <comment> With Miles Davis (trumpet), 
Herbie Hancock (Electric Piano), Chick Corea (Electric Piano), Wayne Shorter (Tenor Sax), Josef Zawinul (Electric Piano &amp; Organ), John McLaughlin (Guitar), and Tony Williams (Drums). The liner notes were written by Frank Glenn, and the record is in fine condition. </comment> </record> </ma:Details> </ma:Auction> <!-- ________________________________________________________________________________ --> <ma:Auction yabadoo:ID="13143816"> <ma:AuctionHomepage xlink:type="simple" xlink:href="http://auctions.yabadoo.com/auction/13143816" /> <ma:Schedule> <ma:Open xmlns:dt="http://www.w3.org/1999/XMLSchema-datatypes" dt:type="timeInstant">2000-03-19:17:03:00-04:00</ma:Open> <ma:Close xmlns:dt="http://www.w3.org/1999/XMLSchema-datatypes" dt:type="timeInstant">2000-03-29:17:03:00-04:00</ma:Close> </ma:Schedule> <ma:Price> <ma:Start ma:currency="USD">3.00</ma:Start> <ma:Current ma:currency="USD">3.00</ma:Current> <ma:Number_of_Bids>0</ma:Number_of_Bids> </ma:Price> <ma:Trading_Partners> <ma:High_Bidder> <eachbay:ID>VintageRecordFreak</eachbay:ID> <eachbay:PositiveComments>232</eachbay:PositiveComments> <eachbay:NeutralComments>0</eachbay:NeutralComments> <eachbay:NegativeComments>0</eachbay:NegativeComments> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/showRating/user=VintageRecordFreak" xlink:role="ma:MemberInfoPage" /> </ma:High_Bidder> <ma:Seller xmlns:seller="http://www.AuctionMediatorCompany.com/auctioneers#eachbay"> <seller:ID>StarsOn45</seller:ID> <seller:PositiveComments>80</seller:PositiveComments> <seller:NeutralComments>1</seller:NeutralComments> <seller:NegativeComments>2</seller:NegativeComments> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/showRating/user=StarsOn45" xlink:role="ma:MemberInfoPage" /> </ma:Seller> </ma:Trading_Partners> <ma:Details> <record xmlns="http://www.musicdatabase.org/music/records"> <artist>Wynton Marsalis</artist> <title>Think of One ...</title> <recorded>1983</recorded> 
<label>Columbia Records</label> <comment xml:lang="en"> Columbia Records 12" 33-1/3 rpm LP, #FC-38641, Stereo. The record is still clean and shiny and looks unplayed (looks like NM condition). The cover has very light surface and edge wear. </comment> <comment xml:lang="de"> Columbia Records 12" 33-1/3 rpm LP, #FC-38641, Stereo. Die Platte ist noch immer sauber und glänzend und sieht ungespielt aus (NM Zustand). Das Cover hat leichte Abnutzungen an Oberfläche und Ecken. </comment> </record> </ma:Details> </ma:Auction> </ma:AuctionWatchList>
List all unique namespaces used in the sample data.
<aux:Q1> http://www.AuctionMediatorCompany.com/AuctionWatch http://www.w3.org/1999/xlink http://www.AuctionMediatorCompany.com/auctioneers#anyzone http://www.AuctionMediatorCompany.com/auctioneers#eachbay http://www.AuctionMediatorCompany.com/auctioneers#yabadoo http://www.w3.org/1999/XMLSchema-datatypes http://www.musicdatabase.org/music/records </aux:Q1>
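One possible reading of this query can be sketched in Python with the standard `xml.etree.ElementTree` module. The sketch assumes that a namespace counts as "used" when it qualifies an element or attribute name. The `used_namespaces` helper, the `unused` prefix, and its `http://example.org/never-used` URI are hypothetical and serve only to show that a declaration alone is not reported. Under this reading, an `xml:lang` attribute would also contribute the reserved XML namespace, which the expected result above does not list.

```python
import xml.etree.ElementTree as ET

def used_namespaces(root):
    """Return namespace URIs of element and attribute names, in first-use order."""
    seen = []
    for el in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if not isinstance(el.tag, str):
            continue
        for name in [el.tag, *el.attrib]:
            # Qualified names are stored as "{namespace-uri}local".
            if name.startswith("{"):
                uri = name[1:].split("}", 1)[0]
                if uri not in seen:
                    seen.append(uri)
    return seen

# A trimmed fragment of the sample data plus one declared-but-unused prefix.
sample = """
<ma:Auction xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch"
            xmlns:xlink="http://www.w3.org/1999/xlink"
            xmlns:unused="http://example.org/never-used">
  <ma:AuctionHomepage xlink:type="simple"
                      xlink:href="http://auction.anyzone.com/item/0321K372910"/>
</ma:Auction>
"""

print(used_namespaces(ET.fromstring(sample)))
```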
Select the title of each record that is for sale.
<aux:Q2 xmlns:music="http://www.musicdatabase.org/music/records"> <music:title>In a Silent Way</music:title> <music:title>Think of One ...</music:title> </aux:Q2>
Select all elements that use datatypes from "XML Schema Part 2: Datatypes".
<aux:Q3 xmlns:dt="http://www.w3.org/1999/XMLSchema-datatypes" xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch" > <ma:Open dt:type="timeInstant">2000-03-21:07:41:34-05:00</ma:Open> <ma:Close dt:type="timeInstant">2000-03-23:07:41:34-05:00</ma:Close> <ma:Open dt:type="timeInstant">2000-03-19:17:03:00-04:00</ma:Open> <ma:Close dt:type="timeInstant">2000-03-29:17:03:00-04:00</ma:Close> </aux:Q3>
Select all XLinks in the document.
<aux:Q4 xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch" > <ma:AuctionHomepage xlink:type="simple" xlink:href="http://auction.anyzone.com/item/0321K372910" /> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/members?get=RecordsRUs" xlink:role="ma:MemberInfoPage" /> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.anyzone.com/members/VintageRecordFreak" xlink:role="ma:MemberInfoPage" /> <ma:AuctionHomepage xlink:type="simple" xlink:href="http://auctions.yabadoo.com/auction/13143816" /> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/showRating/user=VintageRecordFreak" xlink:role="ma:MemberInfoPage" /> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/showRating/user=StarsOn45" xlink:role="ma:MemberInfoPage" /> </aux:Q4>
Select all records which have a comment in German.
<aux:Q5 xmlns:music="http://www.musicdatabase.org/music/records"> <music:record> <music:artist>Wynton Marsalis</music:artist> <music:title>Think of One ...</music:title> <music:recorded>1983</music:recorded> <music:label>Columbia Records</music:label> <music:comment xml:lang="en"> Columbia Records 12" 33-1/3 rpm LP, #FC-38641, Stereo. The record is still clean and shiny and looks unplayed (looks like NM condition). The cover has very light surface and edge wear. </music:comment> <music:comment xml:lang="de"> Columbia Records 12" 33-1/3 rpm LP, #FC-38641, Stereo. Die Platte ist noch immer sauber und glänzend und sieht ungespielt aus (NM Zustand). Das Cover hat leichte Abnutzungen an Oberfläche und Ecken. </music:comment> </music:record> </aux:Q5>
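The following Python sketch illustrates one way such a selection could work. It relies on the fact that the `xml` prefix is bound to the reserved namespace `http://www.w3.org/XML/1998/namespace`. The `records` wrapper element is hypothetical, the comment texts are abbreviated from the sample data, and the exact comparison with `"de"` is a simplifying assumption that would not match subtags such as `de-AT`.

```python
import xml.etree.ElementTree as ET

MUSIC = "http://www.musicdatabase.org/music/records"
# The xml: prefix is always bound to this reserved namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical <records> wrapper around abbreviated copies of the two records.
doc = f"""
<records>
  <record xmlns="{MUSIC}">
    <title>In a Silent Way</title>
    <comment>With Miles Davis (trumpet).</comment>
  </record>
  <record xmlns="{MUSIC}">
    <title>Think of One ...</title>
    <comment xml:lang="en">The record is still clean and shiny.</comment>
    <comment xml:lang="de">Die Platte ist noch immer sauber.</comment>
  </record>
</records>
"""

ns = {"music": MUSIC}
# Keep the title of every record that has at least one German comment.
german = [
    rec.findtext("music:title", namespaces=ns)
    for rec in ET.fromstring(doc).iter(f"{{{MUSIC}}}record")
    if any(c.get(XML_LANG) == "de" for c in rec.findall("music:comment", ns))
]
print(german)
```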
Select the closing time elements of all AnyZone auctions currently monitored.
<aux:Q6 xmlns:dt="http://www.w3.org/1999/XMLSchema-datatypes" xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch" > <ma:Close dt:type="timeInstant">2000-03-23:07:41:34-05:00</ma:Close> </aux:Q6>
Select the homepage of all auctions where both seller and high bidder are registered at the same auctioneer.
<aux:Q7 xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch" > <ma:AuctionHomepage xlink:type="simple" xlink:href="http://auctions.yabadoo.com/auction/13143816" /> </aux:Q7>
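A hedged Python sketch of this query follows. It assumes that a trader's auctioneer is identified by the Namespace URI of the trader's `ID` child element, which is consistent with the expected result above but is not stated by the use case. The `auctioneer_of` helper is hypothetical, and the data is a trimmed copy of the sample that keeps only the elements the query inspects.

```python
import xml.etree.ElementTree as ET

MA = "http://www.AuctionMediatorCompany.com/AuctionWatch"
XLINK = "http://www.w3.org/1999/xlink"
AUCTIONEERS = "http://www.AuctionMediatorCompany.com/auctioneers"

# Trimmed copy of the sample data: only the elements this query inspects.
doc = f"""
<ma:AuctionWatchList xmlns:ma="{MA}" xmlns:xlink="{XLINK}"
    xmlns:anyzone="{AUCTIONEERS}#anyzone" xmlns:eachbay="{AUCTIONEERS}#eachbay">
  <ma:Auction>
    <ma:AuctionHomepage xlink:type="simple"
        xlink:href="http://auction.anyzone.com/item/0321K372910"/>
    <ma:Trading_Partners>
      <ma:High_Bidder><eachbay:ID>RecordsRUs</eachbay:ID></ma:High_Bidder>
      <ma:Seller><anyzone:ID>VintageRecordFreak</anyzone:ID></ma:Seller>
    </ma:Trading_Partners>
  </ma:Auction>
  <ma:Auction>
    <ma:AuctionHomepage xlink:type="simple"
        xlink:href="http://auctions.yabadoo.com/auction/13143816"/>
    <ma:Trading_Partners>
      <ma:High_Bidder><eachbay:ID>VintageRecordFreak</eachbay:ID></ma:High_Bidder>
      <ma:Seller xmlns:seller="{AUCTIONEERS}#eachbay">
        <seller:ID>StarsOn45</seller:ID>
      </ma:Seller>
    </ma:Trading_Partners>
  </ma:Auction>
</ma:AuctionWatchList>
"""

def auctioneer_of(trader):
    """Namespace URI of the trader's ID child (assumed to identify the auctioneer)."""
    for child in trader:
        uri, _, local = child.tag[1:].partition("}")
        if local == "ID":
            return uri
    return None

ns = {"ma": MA}
homepages = []
for auction in ET.fromstring(doc).findall("ma:Auction", ns):
    bidder = auction.find("ma:Trading_Partners/ma:High_Bidder", ns)
    seller = auction.find("ma:Trading_Partners/ma:Seller", ns)
    # Compare resolved URIs, so "eachbay:" and "seller:" count as the same auctioneer.
    if auctioneer_of(bidder) == auctioneer_of(seller):
        homepages.append(auction.find("ma:AuctionHomepage", ns).get(f"{{{XLINK}}}href"))
print(homepages)
```

The second auction qualifies only because the comparison is made on resolved Namespace URIs: its seller uses the `seller` prefix while its high bidder uses `eachbay`.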
Select all traders (either seller or high bidder) without negative comments.
<aux:Q8 xmlns:ma="http://www.AuctionMediatorCompany.com/AuctionWatch" xmlns:eachbay="http://www.AuctionMediatorCompany.com/auctioneers#eachbay" xmlns:xlink="http://www.w3.org/1999/xlink" > <ma:High_Bidder> <eachbay:ID>VintageRecordFreak</eachbay:ID> <eachbay:PositiveComments>232</eachbay:PositiveComments> <eachbay:NeutralComments>0</eachbay:NeutralComments> <eachbay:NegativeComments>0</eachbay:NegativeComments> <ma:MemberInfoPage xlink:type="simple" xlink:href="http://auction.eachbay.com/showRating/user=VintageRecordFreak" xlink:role="ma:MemberInfoPage" /> </ma:High_Bidder> </aux:Q8>
This use case illustrates how a recursive query might be used to construct a hierarchic document of arbitrary depth from flat structures stored in a database.
This use case is based on a "parts explosion" database that contains information about how parts are used in other parts.
The input to the use case is a "flat" document in which each different part is represented by a <part> element with partid and name attributes. Each part may or may not be part of a larger part; if so, the partid of the larger part is contained in a partof attribute. This input document might be derived from a relational database in which each part is represented by a row of a table with partid as primary key and partof as a foreign key referencing partid.
The challenge of this use case is to write a query that converts the "flat" representation of the parts explosion, based on foreign keys, into a hierarchic representation in which part containment is represented by the structure of the document.
The input data set uses the following DTD:
<!DOCTYPE partlist [ <!ELEMENT partlist (part*)> <!ELEMENT part EMPTY> <!ATTLIST part partid CDATA #REQUIRED partof CDATA #IMPLIED name CDATA #REQUIRED> ]>
Although the partid and partof attributes could have been of type ID and IDREF, respectively, in this schema they are treated as character data, possibly materialized in a straightforward way from a relational database. Each partof attribute matches exactly one partid. Parts having no partof attribute are not contained in any other part.
The output data conforms to the following DTD:
<!DOCTYPE parttree [ <!ELEMENT parttree (part*)> <!ELEMENT part (part*)> <!ATTLIST part partid CDATA #REQUIRED name CDATA #REQUIRED> ]>
<?xml version="1.0" encoding="ISO-8859-1"?> <partlist> <part partid="0" name="car"/> <part partid="1" partof="0" name="engine"/> <part partid="2" partof="0" name="door"/> <part partid="3" partof="1" name="piston"/> <part partid="4" partof="2" name="window"/> <part partid="5" partof="2" name="lock"/> <part partid="10" name="skateboard"/> <part partid="11" partof="10" name="board"/> <part partid="12" partof="10" name="wheel"/> <part partid="20" name="canoe"/> </partlist>
Convert the sample document from "partlist" format to "parttree" format (see DTD section for definitions). In the result document, part containment is represented by containment of one <part> element inside another. Each part that is not part of any other part should appear as a separate top-level element in the output document.
The expected result of running the query on the sample data is as follows:
<?xml version="1.0" encoding="ISO-8859-1"?> <parttree> <part partid="0" name="car"> <part partid="1" name="engine"> <part partid="3" name="piston"/> </part> <part partid="2" name="door"> <part partid="4" name="window"/> <part partid="5" name="lock"/> </part> </part> <part partid="10" name="skateboard"> <part partid="11" name="board"/> <part partid="12" name="wheel"/> </part> <part partid="20" name="canoe"/> </parttree>
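The recursion that this use case calls for can be sketched outside any query language. The following Python fragment is an illustration only, using the standard `xml.etree.ElementTree` module on a subset of the sample data. The `build` helper is a hypothetical name. It rescans the whole part list at every level, which is simple but quadratic, and it relies on the stated guarantee that each partof value matches exactly one partid (a cycle in the data would recurse forever).

```python
import xml.etree.ElementTree as ET

# A subset of the sample "partlist" document.
partlist = ET.fromstring("""
<partlist>
  <part partid="0" name="car"/>
  <part partid="1" partof="0" name="engine"/>
  <part partid="2" partof="0" name="door"/>
  <part partid="3" partof="1" name="piston"/>
  <part partid="10" name="skateboard"/>
  <part partid="11" partof="10" name="board"/>
</partlist>
""")

def build(parent_out, parent_id):
    """Append every part whose partof equals parent_id, then recurse into it."""
    for part in partlist.findall("part"):
        # get() returns None for a missing attribute, which selects top-level parts.
        if part.get("partof") == parent_id:
            node = ET.SubElement(parent_out, "part",
                                 partid=part.get("partid"), name=part.get("name"))
            build(node, part.get("partid"))

parttree = ET.Element("parttree")
build(parttree, None)  # parts with no partof attribute form the top level
print(ET.tostring(parttree, encoding="unicode"))
```

The foreign-key attribute partof is dropped from the output because containment now carries the same information.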
References are an important aspect of XML. This use case describes a database in which references play a significant role, and contains several representative queries that exploit these references.
Suppose that the file "census.xml" contains an element for each person recorded in a recent census. For each person element, the person's name, job, and spouse (if any) are recorded as attributes. The "spouse" attribute is an IDREF-type attribute that matches the ID-type "name" attribute of the spouse's person element.
The parent-child relationship among persons is recorded by containment in the element hierarchy. In other words, the element that represents a child is contained within the element that represents the child's father or mother. Due to deaths, divorces, and remarriages, a child might be recorded under either its father or its mother (but not both). For the purposes of this exercise, the term "children of X" includes "children of the spouse of X." For example, if Joe and Martha are spouses, and Joe's element contains an element Sam, and Martha's element contains an element Dave, then Joe's children are considered to be Sam and Dave, and Martha's children are also considered to be Sam and Dave. Each person in the census has zero, one, or two parents.
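The definition of "children of X" can be made concrete with a short Python sketch. The miniature census below is hypothetical and models only the Joe and Martha example from the preceding paragraph; the `children` helper is likewise an illustrative name, not part of the use case.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature census matching the Joe/Martha example above.
census = ET.fromstring("""
<census>
  <person name="Joe" spouse="Martha"><person name="Sam"/></person>
  <person name="Martha" spouse="Joe"><person name="Dave"/></person>
</census>
""")

# Index every person by name, mirroring the ID-type "name" attribute.
by_name = {p.get("name"): p for p in census.iter("person")}

def children(name):
    """Names of persons contained in X's element or in X's spouse's element."""
    person = by_name[name]
    kids = [c.get("name") for c in person.findall("person")]
    # Follow the IDREF-style "spouse" attribute, if present.
    spouse = by_name.get(person.get("spouse"))
    if spouse is not None:
        kids += [c.get("name") for c in spouse.findall("person")]
    return kids

print(children("Joe"))     # ['Sam', 'Dave']
print(children("Martha"))  # ['Dave', 'Sam']
```

Both calls return the same two children, differing only in order, which reflects the rule that Sam and Dave are children of Joe and of Martha alike.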
This use case is based on an input document named "census.xml", with the following DTD:
<!DOCTYPE census [ <!ELEMENT census (person*)> <!ELEMENT person (person*)> <!ATTLIST person name ID #REQUIRED spouse IDREF #IMPLIED job CDATA #IMPLIED > ]>
The following census data describes two friendly families that have several intermarriages.
<census> <person name="Bill" job="Teacher"> <person name="Joe" job="Painter" spouse="Martha"> <person name="Sam" job="Nurse"> <person name="Fred" job="Senator" spouse="Jane"> </person> </person> <person name="Karen" job="Doctor" spouse="Steve"> </person> </person> <person name="Mary" job="Pilot"> <person name="Susan" job="Pilot" spouse="Dave"> </person> </person> </person> <person name="Frank" job="Writer"> <person name="Martha" job="Programmer" spouse="Joe"> <person name="Dave" job="Athlete" spouse="Susan"> </person> </person> <person name="John" job="Artist"> <person name="Helen" job="Athlete"> </person> <person name="Steve" job="Accountant" spouse="Karen"> <person name="Jane" job="Doctor" spouse="Fred"> </person> </person> </person> </person> </census>
Find Martha's spouse.
<person name="Joe" job="Painter" spouse="Martha" />
Find Joe's children.
<result> <person name="Sam" job="Nurse" /> <person name="Karen" job="Doctor" spouse="Steve" /> <person name="Dave" job="Athlete" spouse="Susan" /> </result>
Find parents of athletes.
<result> <person name="Joe" job="Painter" spouse="Martha" /> <person name="Martha" job="Programmer" spouse="Joe" /> <person name="John" job="Artist" /> </result>
Find people who have the same job as one of their parents.
<result> <person name="Susan" job="Pilot" spouse="Dave" /> <person name="Jane" job="Doctor" spouse="Fred" /> </result>
List names of parents and children who have the same job, and their jobs.
<result> <match parent="Mary" child="Susan" job="Pilot" /> <match parent="Karen" child="Jane" job="Doctor" /> </result>
Find Bill's grandchildren.
<result> <person name="Sam" job="Nurse" /> <person name="Karen" job="Doctor" spouse="Steve" /> <person name="Susan" job="Pilot" spouse="Dave" /> <person name="Dave" job="Athlete" spouse="Susan" /> </result>
List name-pairs of grandparents and grandchildren.
<result> <grandparent name="Bill" grandchild="Sam" /> <grandparent name="Bill" grandchild="Karen" /> <grandparent name="Bill" grandchild="Susan" /> <grandparent name="Bill" grandchild="Dave" /> <grandparent name="Frank" grandchild="Dave" /> <grandparent name="Frank" grandchild="Helen" /> <grandparent name="Frank" grandchild="Steve" /> <grandparent name="Frank" grandchild="Sam" /> <grandparent name="Frank" grandchild="Karen" /> <grandparent name="Joe" grandchild="Fred" /> <grandparent name="Joe" grandchild="Jane" /> <grandparent name="Martha" grandchild="Fred" /> <grandparent name="Martha" grandchild="Jane" /> <grandparent name="John" grandchild="Jane" /> </result>
Find Dave's parents-in-law (parents of his spouse, if any).
<result> <person name="Mary" job="Pilot" /> </result>
Find people with no children.
<result> <person name="Fred" job="Senator" spouse="Jane" /> <person name="Susan" job="Pilot" spouse="Dave" /> <person name="Dave" job="Athlete" spouse="Susan" /> <person name="Helen" job="Athlete" /> <person name="Jane" job="Doctor" spouse="Fred" /> </result>
Find single parents (people with children but no spouse).
<result> <person name="Bill" job="Teacher" /> <person name="Sam" job="Nurse" /> <person name="Mary" job="Pilot" /> <person name="Frank" job="Writer" /> <person name="John" job="Artist" /> </result>
List the names of all Joe's descendants. Show each descendant as an element with the descendant's name as content and his or her marital status and number of children as attributes. Sort the descendants in descending order by number of children, and secondarily in alphabetical order by name.
<result> <descendant married="Yes" kids="1">Karen</descendant> <descendant married="No" kids="1">Sam</descendant> <descendant married="Yes" kids="0">Dave</descendant> <descendant married="Yes" kids="0">Fred</descendant> <descendant married="Yes" kids="0">Jane</descendant> </result>
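As an illustration of how this result follows from the data, here is a Python sketch over the sample census. It assumes that "descendants" is the transitive closure of the "children of X" relation defined earlier, including children recorded under a spouse. The `children` and `descendants` helpers are hypothetical names.

```python
import xml.etree.ElementTree as ET

census = ET.fromstring("""
<census>
  <person name="Bill" job="Teacher">
    <person name="Joe" job="Painter" spouse="Martha">
      <person name="Sam" job="Nurse">
        <person name="Fred" job="Senator" spouse="Jane"/>
      </person>
      <person name="Karen" job="Doctor" spouse="Steve"/>
    </person>
    <person name="Mary" job="Pilot">
      <person name="Susan" job="Pilot" spouse="Dave"/>
    </person>
  </person>
  <person name="Frank" job="Writer">
    <person name="Martha" job="Programmer" spouse="Joe">
      <person name="Dave" job="Athlete" spouse="Susan"/>
    </person>
    <person name="John" job="Artist">
      <person name="Helen" job="Athlete"/>
      <person name="Steve" job="Accountant" spouse="Karen">
        <person name="Jane" job="Doctor" spouse="Fred"/>
      </person>
    </person>
  </person>
</census>
""")

by_name = {p.get("name"): p for p in census.iter("person")}

def children(person):
    """Child elements of the person plus those of the person's spouse, if any."""
    kids = person.findall("person")
    spouse = by_name.get(person.get("spouse"))
    if spouse is not None:
        kids = kids + spouse.findall("person")
    return kids

def descendants(person, seen=None):
    """Transitive closure of children(), keyed by name to avoid duplicates."""
    seen = {} if seen is None else seen
    for kid in children(person):
        if kid.get("name") not in seen:
            seen[kid.get("name")] = kid
            descendants(kid, seen)
    return seen

rows = [(len(children(p)), name, "Yes" if p.get("spouse") else "No")
        for name, p in descendants(by_name["Joe"]).items()]
# Descending by number of children, then ascending by name.
rows.sort(key=lambda r: (-r[0], r[1]))
for kids, name, married in rows:
    print(f'<descendant married="{married}" kids="{kids}">{name}</descendant>')
```

Jane appears among Joe's descendants only through Karen's spouse Steve, which shows that the spouse rule has to be applied at every level of the recursion.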
The editors thank the members of the XML Query Working Group, which produced the material in this document.
The use cases in this document were contributed by the following individuals:
Use Case "R" | Don Chamberlin |
Use Case "XMP" | Mary Fernandez, Jerome Simeon, Phil Wadler |
Use Case "TREE" | Jonathan Robie |
Use Case "PARTS" | Michael Rys |
Use Case "NS" | Ingo Macherius |
Use Case "REF" | Don Chamberlin |
Use Case "TEXT" | Umit Yalcinalp |
Use Case "SEQ" | Jonathan Robie |
Use Case "SGML" | Paula Angerstein |
Use case "XMP" was previously published in [Fernandez]. Use cases "TREE" and "SEQ" were previously published in [Robie99].
The editors also wish to thank the members of the other W3C Working Groups who have commented on earlier drafts.