Extending XQuery with collections, indexes, and integrity constraints

Event details

Date:
Coordinated Universal Time
Location:
Prague, Czech Republic
Speakers:
Matthias Brantner, Daniela Florescu, and Markos Zaharioudakis

XQuery was designed by the World Wide Web Consortium as a general-purpose XML information processing language, useful in a variety of architectures and environments. For example, XQuery can be used to process XML data at the edge of existing software architectures, where the information is temporary and is searched, transformed, or modified just before being passed along for further processing in other programming languages (e.g. SQL, Java, Python, Ruby, JavaScript). Another increasingly popular use of XQuery is in XML databases or XML end-to-end architectures. In such architectures, XML is the primary form in which the information is stored and processed, the information persists across successive invocations of programs, and XQuery is the primary language for accessing the information: for searching, filtering, transforming, and updating it, and for writing more complex application workflows.

Unfortunately, XQuery as currently standardized by the W3C is incomplete and cannot be used as such (without proprietary language extensions or rich APIs from other programming languages) in the second type of architecture: persistent databases or XML end-to-end architectures. Unlike its cousin query language, SQL, XQuery lacks the capability to model, describe, and reason about the persistent state of the "database". XQuery 1.0 does have the capability to access collections of nodes at runtime, and such collections could be envisioned as modeling the persistent state of the XML database, yet the language is underspecified in this area. Such collections have no detailed semantics (about copy, order, or multiplicity, for example); the language lacks the ability to declare such collections statically; it lacks the static and/or dynamic information required for proper compilation and/or execution (e.g. type, update patterns); and it lacks operations to create and modify such collections. Moreover, the language lacks the ability to declare and manage access structures (e.g. indexes) and integrity constraints.

All such concepts are required for a complete XML/XQuery database story. Unless they are included in the standard language itself, each XQuery implementation will need proprietary extensions to overcome these limitations, or the functionality will have to be supplied through rich non-XQuery APIs. In the first case, the portability of XQuery applications will be limited; in the second, the simplicity and elegance of XML end-to-end architectures will suffer.

This talk proposes an extension of XQuery called the XQuery Data Definition Facility (XQDDF) to deal with such persistent artifacts: collections, indexes, and integrity constraints. The talk defines the lifetime and evolution of such artifacts: how they are declared, how they come into existence, how they are used in the compilation and execution of XQuery programs, and how they are shared by multiple XQuery programs.