Summary of Requirements Gleaned From Workshop Position Papers 30-November-1998

Editors:
Paul Cotton (IBM) <cotton@ca.ibm.com>
Ashok Malhotra (IBM) <petsa@us.ibm.com>

Candidate Requirements for XML Query

Nov 30, 1998

Table of Contents

1. Motivation
2. XML Query Requirements
    2.1 Query Language and Structure
    2.2 Query Language Facilities
    2.3 Querying many documents from many, possibly non-XML, sources
    2.4 Security
    2.5 Using the XML Query Language
    2.6 Other Requirements
3. Papers We Did Not Consider
4. Bibliography

1. Motivation

We took the position papers that dealt with requirements for XML query and were available online to us on November 30, and attempted to extract a list of requirements from them. That list is presented below. If we have not represented your position fairly and accurately, please accept our apologies and suggest corrections. If you wrote in support of a particular requirement and your paper is not cited below, it is probably because we felt it more important to cover all the requirements than to cite every paper that supported a particular requirement. Although we have attempted to group related requirements, the grouping and ordering are not meant to indicate importance or priority.

2. XML Query Requirements

2.1 Query Language and Structure

  1. Need for non-procedural query language.
    Paper 17 (Eisenberg).
  2. XML Query should use XML syntax.
    Paper 5 (Agranat).
  3. Build upon syntax used by other XML standards.
    XPointer: Paper 15 (DeRose). XSL: Paper 38 (Schach), paper 49 (XSL), paper 50 (Bosworth) and paper 53 (MathWG). All of them: paper 41 (Simeonov).
  4. Ability to transmit a query as part of a URL.
    Implies syntactic constraints. Paper 46 (Vishwanath).
  5. Several query languages addressing different user sets.
    Paper 26 (Malloy).
  6. Queries should be XLink and XPointer cognizant.
    Paper 23 (Maier). Paper 40 (Shea). Paper 29 (Mecca) would like typed links. Queries should also support namespaces: paper 23 (Maier) and paper 41 (Simeonov) discuss some issues related to namespace support.
  7. Support for querying data as well as metadata.
    Paper 28 (Masuda), paper 31 (Mihaila), paper 47 (Ward).
  8. Uniform treatment of attributes and elements.
    Paper 34 (Olken), paper 35 (Quark), paper 40 (Shea).
  9. Need to have a GUI for queries.
    Paper 35 (Quark).
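
Requirement 4 above notes that carrying a query in a URL implies syntactic constraints: characters such as quotes, angle brackets, ampersands and spaces must be escaped. The following is a minimal sketch in Python of that round trip. The endpoint http://example.org/search, the parameter name q and the query string itself are hypothetical illustrations and do not come from any of the position papers.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def query_to_url(base_url: str, query: str, param: str = "q") -> str:
    """Percent-encode a query string and attach it to base_url as a parameter."""
    return f"{base_url}?{urlencode({param: query})}"


def query_from_url(url: str, param: str = "q") -> str:
    """Recover the original query string from a URL built by query_to_url."""
    return parse_qs(urlsplit(url).query)[param][0]


# Hypothetical query text containing characters that are not URL-safe.
query = 'book[author="Smith" & price<20]'
url = query_to_url("http://example.org/search", query)

# The encoded URL carries no raw quotes, angle brackets or spaces,
# yet the receiver can reconstruct the query exactly.
assert query_from_url(url) == query
```

A query syntax that leans heavily on such characters still works, but the encoded form becomes long and hard to read, which is one reason the requirement constrains the syntax.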

2.2 Query Language Facilities

  1. Support for standard query operations.
    Several papers. Paper 23 (Maier) asks for them from a database perspective. Paper 15 (DeRose) contains an excellent exposition of queries in a hierarchical space with linking.
  2. "it must express joins."
    Paper 18 (Fernandez).
  3. Support for insert, update and delete operations.
    Paper 28 (Masuda). Paper 3 (Beech) asks for transaction management.
  4. Support for nested queries.
    Paper 40 (Shea).
  5. Support for full-text queries.
    Paper 33 (Murata) discusses word containment, containment in order, wildcards and proximity queries, as well as support for regular expressions. Paper 20 (Ishikawa) and paper 31 (Mihaila) discuss wildcards and regular expressions. Paper 3 (Beech) asks for SQL/MM-like facilities. Paper 37 (Rys) wants a "mixture of exact queries on the structured part and information retrieval queries on the unstructured part."
  6. Provide facilities to construct XML documents.
    This is controversial! Paper 23 (Maier) states it as its first requirement citing the benefits of closure. Paper 33 (Murata) is equivocal: the query language may support construction but it may also return data that can be used by the environment in which it executes, such as XSL or a DOM program for construction or transformation. Paper 3 (Beech) is also equivocal. Paper 5 (Agranat) wants query to not support construction. Paper 50 (Bosworth) wants the query language to describe how the resultant graph is serialized.
  7. RDF query requirements
    such as selection based on property values, navigating over properties, boolean results from queries and support for alternate representations are discussed in papers 12 (Decker), 14 (Cranor) and 24 (Malhotra). Paper 52 (Shklar) discusses integration of full-text query with RDF query.
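
The full-text facilities named in requirement 5 above (word containment, containment in order, wildcards and proximity) can be made concrete with a small sketch. The Python below is only an illustration under simple assumptions: text is split into lowercase word tokens, distance is counted in tokens, and wildcards follow shell-style globbing. All function names are hypothetical and none of this is taken from the cited papers.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_word(text, word):
    """Word containment: does the word occur anywhere in the text?"""
    return word.lower() in tokenize(text)


def contains_in_order(text, words):
    """Containment in order: do the words occur in the given sequence?"""
    tokens = iter(tokenize(text))
    # Each membership test consumes the iterator up to the match,
    # so later words must appear after earlier ones.
    return all(w.lower() in tokens for w in words)


def within_proximity(text, first, second, max_distance):
    """Proximity: do the two words occur within max_distance tokens?"""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(abs(a - b) <= max_distance for a in pos_a for b in pos_b)


def matches_wildcard(text, pattern):
    """Wildcards: does any token match a glob pattern such as 'lang*'?"""
    return any(fnmatchcase(t, pattern.lower()) for t in tokenize(text))


sample = "XML query languages should support full text search"
assert contains_word(sample, "Query")
assert contains_in_order(sample, ["query", "text"])
assert not contains_in_order(sample, ["text", "query"])
assert within_proximity(sample, "query", "support", 3)
assert matches_wildcard(sample, "lang*")
```

A real query language would apply such predicates to the text content of selected elements rather than to a flat string, which is the mixture of structured and unstructured querying that paper 37 asks for.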

2.3 Querying many documents from many, possibly non-XML, sources

  1. Ability to query multiple documents.
    Paper 34 (Olken), paper 35 (Quark), paper 43 (Tompa).
  2. Ability to query distributed data stored on websites in a variety of formats: relational and OO databases, HTML, XML or ASCII. The XML query is translated to a query or view on the underlying data representation.
    Paper 2 (Baru), paper 5 (Agranat), paper 25 (Madnick), paper 30 (Mendelsohn), paper 43 (Tompa), paper 44 (Valkenburg), paper 46 (Vishwanath), paper 50 (Bosworth).
  3. Create XML schemas from non-XML data sources.
    Paper 29 (Mecca) as well as some of the papers that discuss querying over diverse data sources cited above.
  4. Support for "live" data: i.e. data that changes while user is viewing it.
    Paper 46 (Vishwanath).

2.4 Security

  1. "Security is essential on document collections, parts of collections, and on parts of individual documents."
    Paper 39 (Seligman). Paper 47 (Ward).
  2. Authorization on insert, update, delete operations.
    Paper 3 (Beech). Paper 8 (Buneman) wants to store information about the update: time, date, author.

2.5 Using the XML Query Language

  1. Query should be usable on documents without a schema.
    Paper 23 (Maier).
  2. If a schema is available it should be possible to use it to check query correctness.
    Paper 5 (Agranat), paper 23 (Maier).
  3. Queries should incorporate variables from a local context.
    Paper 23 (Maier).
  4. It should be possible to run queries from several environments/contexts.
    Paper 11 (Cotton), paper 23 (Maier), paper 39 (Seligman).
  5. Ability to name, store and retrieve queries.
    Paper 40 (Shea).
  6. Support for annotating XML documents.
    Paper 8 (Buneman).

2.6 Other Requirements

  1. Support for constraints on elements.
    Paper 8 (Buneman) wants referential integrity. Paper 26 (Malloy) wants "A language for specifying and enforcing constraints between arbitrary document elements." Paper 37 (Rys) wants "constraints specification and triggers." But isn't this an XML Schema requirement?
  2. Inference or Semantic Mediation
    Several papers worry about the problem of semantic mismatch between the query and the data. Paper 25 (Madnick) speaks of extracting price information where the price is expressed in different currencies. Paper 46 (Vishwanath) speaks of finding "related" or "similar" data. Paper 19 (Guha) and paper 12 (Decker) address the problem of inference mainly from an RDF perspective.

3. Papers We Did Not Consider

Papers were omitted from the above summary for a variety of reasons. Some (4,27,42,48) were not available to us on November 24. Others (1,7,9,13,16,20,21,36) proposed solutions in the form of a particular syntax or discussed specific systems rather than discussing requirements. This is perfectly legitimate in a position paper, but our summary only attempted to extract requirements rather than list proposed solutions.

Paper 6 (Arocena) is a model of the web that is neither XML nor RDF. Paper 10 (Christian) discusses the Global Information Locator Service, paper 22 (LeVan) discusses online searching from a library perspective, paper 32 (Mitchell) discusses querying business documents and paper 45 (Vanderbilt) discusses query languages for scientific data. While these areas are undoubtedly important, they did not provide new and different requirements for XML query. Perhaps we missed some subtle distinctions.

4. Bibliography

1. Serge Abiteboul (INRIA), Jennifer Widom, Tirthankar Lahiri (Stanford University) "A Unified Approach for Querying Structured Data and XML"

2. C. Baru, B. Ludäscher, Y. Papakonstantinou, P. Velikhov, V. Vianu "Features and Requirements for an XML View Definition Language: Lessons from XML Information Mediation"

3. David Beech (Oracle) "Position Paper on Query Languages for the Web"

4. Adam Bosworth (Microsoft) "Querying XML"

5. Agranat Systems "Agranat Systems XML QL Position"

6. Gustavo Arocena (IBM Toronto Laboratory), Alberto Mendelzon (University of Toronto), George Mihaila (University of Toronto) "Query Languages for the Web"

7. Tim Bray (Textuality) "Element Sets: A Minimal Basis for an XML Query Engine"

8. Peter Buneman, Alin Deutsch, Wenfei Fan, Hartmut Liefke, Arnaud Sahuguet, Wang-Chiew Tan (University of Pennsylvania) "Beyond XML Query Languages"

9. Stefano Ceri, Sara Comai, Ernesto Damiani, Piero Fraternali, Stefano Paraboschi, Letizia Tanca (Politecnico di Milano, Universita' di Milano) "XML-GL: A Graphical Language for Querying and Reshaping XML Documents"

10. Eliot Christian (United States Geological Survey) "Experiences with Information Locator Services"

11. Paul Cotton, David Fallside, Ashok Malhotra (IBM) "Position paper for the W3C Query Languages Workshop"

12. Stefan Decker (University of Karlsruhe), Dan Brickley (University of Bristol), Janne Saarela (W3C), Jurgen Angele (University of Karlsruhe) "A Query Service for RDF"

13. Steven J. DeRose (Inso Corporation and Brown University) "XQuery: A unified syntax for linking and querying general XML documents"

14. Lorrie Faith Cranor (AT&T) "Requirements for a P3P Query Language"

15. Steven J. DeRose (Inso and Brown University), C. M. Sperberg-McQueen (W3C and University of Illinois at Chicago), Bill Smith (Sun Microsystems) "Queries on Links and Hierarchies"

16. Alin Deutsch (University of Pennsylvania), Mary Fernandez (AT&T Labs), Daniela Florescu (INRIA), Alon Levy (University of Washington), Dan Suciu (AT&T Labs) "XML-QL"

17. Andrew Eisenberg (Sybase, Inc.) "QL'98 - Position Paper"

18. Mary Fernandez, Dan Suciu (AT&T Labs) "A Query Language for XML"

19. R.V. Guha (Netscape), Ora Lassila (Nokia), Eric Miller (OCLC), Dan Brickley (Bristol) "Enabling Inferencing"

20. Hiroshi Ishikawa, Kazumi Kubota, Yasuhiko Kanemasa (Fujitsu Laboratories Ltd.) "XQL: A Query Language for XML Data"

21. David Konopnicki, Oded Shmueli (Technion) "WWW Data and Services: Querying, Integration and Automation"

22. Ralph LeVan (OCLC Online Computer Library Center, Inc.) "Library Experience in Online Searching"

23. David Maier (Oregon Graduate Institute) "Database Desiderata for an XML Query Language"

24. Ashok Malhotra, Neel Sundaresan (IBM) "RDF Query Specification"

25. Stuart Madnick, Michael Siegel, Thomas Lee (MIT Sloan) "The COntext INterchange (COIN) Project: Data Extraction and Interpretation from Semi-Structured Web Sources"

26. Mary Ann Malloy, John C. Schneider (The MITRE Corporation) "Experiences Designing Query Languages for Hierarchically Structured Text Documents"

27. Massimo Marchiori, Janne Saarela (W3C) "Query + Metadata + Logic = Metalog"

28. Isao Masuda (Information Broadcasting Laboratories, Inc.) "Position Paper for "Query Language" Workshop"

29. Giansalvatore Mecca (Universita' della Basilicata), Paolo Merialdo (Universita' della Basilicata, Universita' di Roma Tre), Paolo Atzeni (Universita' di Roma Tre) "Do we really need a new query language for XML?"

30. Noah Mendelsohn (Lotus Development Corp.) "Query Languages Workshop Position Paper"

31. George Mihaila (University of Toronto), Louiqa Raschid (University of Maryland) "Locating Data Repositories using XML"

32. Gail Mitchell (GTE Laboratories Incorporated) "Querying Business Documents"

33. Makoto Murata (Fuji Xerox Information Systems), Jonathan Robie (Texcel Research) "Observations on Structured Document Query Languages"

34. Frank Olken, John McCarthy (Lawrence Berkeley National Laboratory) "Requirements and Desiderata for an XML Query Language"

35. Quark, Inc. "Non-Position Paper for Quark, Inc."

36. Jonathan Robie (Texcel), Joe Lapp (webMethods Inc.), David Schach (Microsoft) "XML Query Language (XQL)"

37. Michael Rys (Stanford University) "Query Languages for XML Documents: A QL '98 Position Paper"

38. David Schach (Microsoft), Joe Lapp (webMethods Inc.), Jonathan Robie (Texcel) "Querying and Transforming XML"

39. Len Seligman, Arnon Rosenthal (The MITRE Corporation) "XML Query Language Requirements of Large, Heterogeneous Organizations"

40. William Shea, Paul Kanevsky, Ramesh Lekshmynarayanan (Merrill Lynch) "QL'98 Position Paper"

41. Simeon Simeonov (Allaire Corporation) "Position paper for the W3C Query Language Workshop 3-Dec-98"

42. Ralph Swick (W3C, Cambridge, USA) "RDF, the Resource Description Framework" (Tutorial)

43. Frank Tompa (University of Waterloo) "Providing flexible access in a query language for XML"

44. Peter Valkenburg (SURFnet), Dan Brickley (University of Bristol) "Query Languages Issues in a Distributed Indexing Environment"

45. Peter Vanderbilt (NASA) "Query languages for scientific data"

46. Chidambaram Vishwanath, Gerhard Wetzel, Sankar Virdhagriswaran (Crystaliz, Inc.) "Querying Database-Backed Web Sites"

47. Nigel Ward, Renato Iannella, Hoylen Sue, Rob McArthur, Jane Hunter (DSTC) "Position Paper: DSTC Requirements for a Web Query Language"

48. Jennifer Widom (Stanford University) "Querying XML with Lore"

49. W3C XSL Working Group "The Query Language Position Paper of the XSL Working Group"

LATE ARRIVALS

50. Adam Bosworth (Microsoft) "Querying XML"

51. Ray Denenberg (Library of Congress) "The Library Perspective"

52. Leon Shklar (Pencom Web Works and Rutgers University) "QL'98 Position Paper"

53. W3C Math Working Group "The Query Language Position Paper of the Math Working Group"