Research notebook: On edge-labelled graphs in XML

This document serves as an informal survey of XML applications that adopt an edge-labelled graph data model similar to that used in W3C's Resource Description Framework (RDF). It also points to discussion and proposals regarding improvements to the RDF XML syntax. Hopefully these pointers will prove useful even while the document is incomplete.

Status: very very sketchy, disorganised collection of URLs, excerpts and commentary. This is a pile of stuff for some (possibly non-existent) future version to flesh out. Suggestions for other related materials welcomed (W3 folk, feel free to edit directly).

Author: dan brickley

Overview

There seems to be some consensus around the claim that RDF has a useful data model but a problematic XML syntax. This document is an attempt to gather together the various discussion documents and proposals that relate this topic to the broader context of XML-based graph serialization systems.

XML Graph resources

@@in-progress. This is a dump of some resources / references to collect and summarise.

RDF Syntax proposals are tracked in the discussion documents section of the RDF Interest Group home page.

Layman et al

XML-Data submission. Subsequent publications in the same tradition: QL'98 position paper from Layman, "XML Syntax Recommendation for Serializing Graphs of Data" (Dec 2nd 1998).

BizTalk white papers: Serializing Graphs of Data in XML, Adam Bosworth, Andrew Layman, Michael Rys:

XML is evolving as the standard format of exchanging data among heterogeneous, distributed computer systems and as such is used to represent data of various origins in a common format. Often, this data possesses rich structure and represents relationship among various entities. These relationships form graphs, where the relations are directed from one entity to another (and may have inverses) and where there may be multiple paths to an entity. Thus, an important goal of the encoding of this data is to preserve the exact graph structure in the serialization to XML. The aim of this paper is to describe a specific way to use XML to serialize graphs of data (such as database tables and relations or nodes and edges from directed labeled graphs) in such a way that the graph structure is preserved and can be reconstructed.
Biztalk.org Resources: Canonical Reference
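To make the goal in the excerpt above concrete, here is a minimal sketch of a round trip that preserves graph structure, including multiple paths to one entity and inverse relations. The element and attribute names (graph, node, edge, id, label, ref) are invented for illustration; they are not taken from the BizTalk paper or from any RDF syntax.

```python
import xml.etree.ElementTree as ET


def graph_to_xml(edges):
    """Serialize (source, label, target) triples of node-id strings.

    Each node is written once and given an id; edges point at their
    target by reference, so shared nodes and cycles survive.
    """
    root = ET.Element("graph")
    nodes = {}
    for source, label, target in edges:
        for node_id in (source, target):
            if node_id not in nodes:
                nodes[node_id] = ET.SubElement(root, "node", {"id": node_id})
        ET.SubElement(nodes[source], "edge", {"label": label, "ref": target})
    return ET.tostring(root, encoding="unicode")


def xml_to_graph(text):
    """Rebuild the set of (source, label, target) triples from the XML."""
    root = ET.fromstring(text)
    return {
        (node.get("id"), edge.get("label"), edge.get("ref"))
        for node in root.findall("node")
        for edge in node.findall("edge")
    }


edges = {
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),   # a second path to the same entity
    ("acme", "employs", "alice"),  # an inverse relation, forming a cycle
}
assert xml_to_graph(graph_to_xml(edges)) == edges
```

Referring to targets by id rather than nesting them is what keeps the shared "acme" node from being duplicated; a purely nested encoding would turn the graph into a tree.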
The elements and attributes in an XML document often are a representation of objects from a specific data model such as Directed-Labelled-Graph or Database Relations. We can annotate a schema so that a reader can determine the mapping from an XML document instance to an instance of the other data model. Mapping information for a specific data model is expressed using attributes from a namespace specific to the mapping. Each mapping system will have its own rules. For example, mapping from XML instances to Directed-Labelled-Graph instances has the rule that all attributes and all elements whose names differ from their type represent edges. However, elements without a name distinct from the type may represent either nodes or edges, and we must indicate which by using a role attribute in the type declaration in the schema.
XML Schemas NG Guide Mon, 14 Jun 1999 05:54:24 GMT
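The mapping rule quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the SCHEMA dictionary stands in for an annotated schema, the "role" key stands in for the role attribute mentioned in the excerpt, and the generated node labels (such as Person#1) are made up here. None of this is the actual BizTalk or XML-Data machinery.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Hypothetical stand-in for an annotated schema: element name -> declaration.
SCHEMA = {
    "Person": {"type": "Person", "role": "node"},
    "Company": {"type": "Company", "role": "node"},
    "employer": {"type": "Company"},  # name differs from type, so an edge
}


def is_edge(tag):
    """Apply the quoted rule to an element name."""
    decl = SCHEMA[tag]
    if tag != decl["type"]:
        return True                      # name differs from type: edge
    return decl.get("role") == "edge"    # otherwise the role decides


def extract_edges(root):
    """Return (source, label, target) triples for a document instance."""
    ids = count(1)
    edges = []

    def visit_node(elem):
        node = f"{elem.tag}#{next(ids)}"
        # All attributes represent edges (here, to literal values).
        for name, value in elem.attrib.items():
            edges.append((node, name, value))
        for child in elem:
            if is_edge(child.tag):
                for target in child:
                    edges.append((node, child.tag, visit_node(target)))
            else:
                # A directly nested node has no edge label to attach;
                # this sketch visits it without linking it to its parent.
                visit_node(child)
        return node

    visit_node(root)
    return edges


doc = ET.fromstring(
    '<Person name="Alice"><employer><Company name="Acme"/></employer></Person>'
)
for triple in extract_edges(doc):
    print(triple)
```

The point of the sketch is that the document instance itself carries no graph-specific markup; all of the node/edge knowledge lives in the schema annotations.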

XML-DEV discussion

xml-dev threads: object-oriented serialization, Dec 1999.

I honestly feel that XML provides all the tools to do what RDF is trying to do, without an additional syntactic layer. What is missing from the picture is a mechanism for modelling object structures according to object-oriented principles, and this is why an OO schema language is necessary. The only other thing the RDF brings to the game is that it turns relationships into first-class objects that can be referenced as well
xml-dev-Dec-1999: Object-oriented serialization (Was Re: Some q Sat, 11 Dec 1999 00:14:28 GMT
If you're interested in a collection of objects in the first place, why should you have to see or know about XML elements and attributes at all? Or to put it a different way, why should people constantly have to redo the work of extracting objects from XML, when they're all trying to do the same thing? I think that reasonable people can argue that RDF is not the best solution to the problem of object exchange in XML, but I am somewhat surprised to hear people deny that the problem even exists: there is an enormous demand for exchanging objects in XML (businesses exchange a lot of structured data), and it's hard work to have to figure out over and over how to construct objects from a SAX stream or a DOM tree especially when programmers with XML knowledge are scarce and expensive. I have no doubt that we need an abstract object layer on top of XML. Right now, RDF is the best solution currently available (XMI also has its advocates), but I'm ready to listen about anything better.
xml-dev-Dec-1999: Re: Object-oriented serialization (Was Re: So Fri, 03 Dec 1999 14:25:07 GMT
As a recap: There are, broadly, two approaches to serializing a graph in XML. One is to invent a meta-grammar, a set of canonicalization rules. That is what RDF syntax did, and what the attribute-centric and element-centric canonical format papers do, what SOAP section eight does. I think of this as "tunnelling the graph through XML." The other is to allow XML documents to follow any pattern described in a schema, and augmenting the schema with a set of mapping rules. There appears to be significant value to each approach. (In particular, however, I disagree with the sometimes-asserted claim that graphs capture the semantics of a communication while grammars do not. Graphs are just another grammar. This makes me reluctant to deprecate grammars.)
xml-dev-Dec-1999: RE: Object-oriented serialization (Was Re: So Fri, 10 Dec 1999 23:07:19 GMT
In this vein, schematron-rdf at http://www.ascc.net/xml/resource/schematron/schematron.html generates RDF documents (currently with bogus XLinks, but you can customize it easily) based on Schematron schemas. In this case, the schema is not converted to RDF, rather the RDF shows which assertions in the schema apply to each element in the instance. This is a rather different use for schemas: as programs for automated annotation. The thing that became immediately clear from working on it was that RDF is good for arcs (relationships) but grammar-based schemas largely hide these relationships (between elements, attributes, data) behind a few generic but superficial types: containment, sequence, repetition. Schematron assertions now allow a "role" attribute, for labelling classes of arcs. I think developers of other schema languages might also consider this kind of thing too: that the connectors between particles of patterns (e.g., compositors in the content models in a grammar-based schema language) should have some role attribute (and documentation?) for labelling their significance. For example, if element A must be followed by element B, to say why. The nodes that conventional schemas define (e.g. elements and attributes) are interesting, but the arcs between them can also be very interesting for automatic annotation using RDF.
xml-dev-Dec-1999: Re: Object-oriented serialization (Was Re: So Sun, 05 Dec 1999 17:58:02 GMT

Presentations by Michael Rys at WWW8 (@@URL?)

Topic-maps syntax effort -- @@TODO: egroups URL, charter, example RDF mapping etc., emiller's msg...

Henrik Nielsen, WWW9 presentation on RDF/SOAP. SOAP as an RDF serialization syntax:

At the WWW9 Conference, I was excited to give a presentation on dev day as part of the Semantic Web track on SOAP serialization. Dan has been so kind to make the slides available.

The purpose of the presentation was to explain the model behind the SOAP serialization as well as how it might be used to serialize RDF graphs. Similarly, SOAP may be used to serialize object graphs etc. For more information on the SOAP specification, see the W3C Note which was submitted May 8 by 11 W3C Member organizations.

There are also a set of more specific examples.

As is stated in the slides, take this as input rather than anything else.

Henrik Frystyk Nielsen mailto:frystyk@microsoft.com
www-rdf-interest@w3.org from August 2000: Slides from WWW9 pres Tue, 22 Aug 2000 07:32:46 GMT

Papers from Lore(l) group at Stanford. Also Pennsylvania work and some of the XML query proposals. Point to discussion point in XMLQ data model work.

Other XML specs: XML Schema 'edge-labelled graph' mention. XML Infoset RDF appendix (current status?). XML-Linking RDF model (Ron Daniel's Note draft). Context: XArc proposal, (X)HTML typed links. Web architecture stuff.

RDF Syntax proposals: Sergey's strawman (and Java parser). TimBL's strawman. EricP's syntax (and Perl parser).

RDF dump syntax proposal(s) on www-rdf-interest. Issues (raise one in RDF issue list on the dependencies between model + syntax).

Dan Connolly notes on Jigsaw's Java serialization system:

Intuition: RDF, SOAP, WebDAV, and Java Beans share a data model, and should be able to share many implementation details
A review of Jigsaw Tue, 29 Aug 2000 07:44:36 GMT
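One way to read the intuition above: an object with named properties is already an edge-labelled graph, with the object as a node and each property as a labelled arc. The sketch below illustrates that reading with Python dataclasses standing in for beans; the classes, the function name and the "_:n0" style node labels are all invented for this example and have nothing to do with Jigsaw's actual serialization code.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Company:
    name: str


@dataclass
class Person:
    name: str
    employer: Company


def object_to_triples(root):
    """Walk an object graph and return (node, property, value) triples.

    Objects become nodes with generated labels; property names become
    edge labels; non-object values are kept as literals.
    """
    labels = {}   # id(obj) -> node label, so shared objects get one node
    triples = []

    def visit(obj):
        key = id(obj)
        if key in labels:
            return labels[key]
        label = f"_:n{len(labels)}"
        labels[key] = label
        for field in fields(obj):
            value = getattr(obj, field.name)
            if is_dataclass(value):
                triples.append((label, field.name, visit(value)))
            else:
                triples.append((label, field.name, value))
        return label

    visit(root)
    return triples


alice = Person("Alice", Company("Acme"))
for triple in object_to_triples(alice):
    print(triple)
```

Once the data is in triple form, the choice of wire syntax (RDF/XML, a SOAP-style encoding, or something else) becomes a separate question from the data model, which is roughly the point of the quoted note.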

XSLT-based screenscraping approach. Cambridge communique, online demos, DanC stuff.

Annotated DTDs / schemata. Do we have any implementation experience of this? Henry Thompson had a good presentation on this topic (@@url??). Point into extensibility mechanism in XML Schema. Also issue (who owns this problem?) that XML Schema constructs need URIs.

Recent Changes (CVS Log)

$Log: Overview.html,v $
Revision 1.5  2000/09/06 19:38:22  danbri
minor tidyup, added H3s, XHTML valid.
linked from /RDF/Interest/

Revision 1.4  2000/09/06 14:14:15  danbri
added danc notes on jigsaw / rdf serialisation

Revision 1.3  2000/09/06 11:18:19  danbri
Added a bunch of excerpts from XML-DEV thread and related discussion, papers etc.

Revision 1.2  2000/09/05 18:53:45  danbri
added few links

maintained: dan brickley