W3C

XML Database Mapping

Eric's Notes

This version:
http://www.w3.org/1999/07/20-XML-DB-Mapping.html
Latest version:
http://www.w3.org/1999/07/20-XML-DB-Mapping.html
Previous versions
http://www.w3.org/1999/07/20-XML-DB-Mapping.html
Author:
Eric Prud'hommeaux <eric@w3.org>

Status of Document

This document represents an preliminary proposal for an XML database schema. It does not represent the views of the W3C team or member companies.

Table of Contents

Overview

The notion of transfering data relational databases via XML is growing in popularity. A database schema can be written in XML pretty trivially. The hard part, figuring out how to perform the inserts and updates specified by the data has been solved by anyone of a number of database interface impelementors.

Goals

In this paper, I attempt to outline a database schema and serialization that provides sufficient information for a the recipient of the data to store it in a relational database without losing synch with the presenter of the data.

Unique Keys

The secret to managing data relationships is to establish what identifies a record. For some data, the only thing that identifies a record is its primary key. This may be an employee number or a customer record (@@@ need better example). Other data, for instance, and address, may be identified by a series of fields which, when taken together, are guaranteed to be unique in the universe where that database lives.

primary keys

An employee record is likely to not have a unique on the emloyee first and last name as it is possible for a corporation to have more than one employee named "John Smith". Therefore, it is necessary to identify a particular employee by their employee id. This also helps people feel like numbers and makes them more docile and easy to pen into cubicles.

any unique keys

Field groups specified as being unique are a way of saying "There's only one of these in the universe and any time I refer to it this way, I mean the same one." An address record is likely to have a unique across the suite, street number, street name, city, state and country. This is because you want the database to be able to tell that two references to the same address are the same.

using a key

These key definitions should be all you need to uniquely identify any piece of data that you serialize into XML. If you get an employee record for for "John Smith", it better have an employee id attached if you are to be expected to add it to your relational database. On the other hand, if I maintain a secret police database and want to tag some address as being suspect, I should be able to pass that data to my bigger brother without an primary key identifier. The advantage of this is that this suspect tag may enter bigger brother's database as an update to a record that they may already be maintaining on this obvious den of ill-meaning malcontents.

Example

An employee record update may look like this:

<transfer type="employee">
  <id>123</id>
  <firstName>John</firstName>
  <lastName>Smith</lastName>
  <title>Sanitation Engineer</title>
  <jobDescription>clean up the web so it's more aesthetically pleasing<jobDescription>
</transfer>
and a secret police record transfer may look like this:
<transfer type="surveilancePoint">
  <suite>B1</suite>
  <streetNum>123</streetNum>
  <streetName>Houston</streetName>
  <city>Manhattan</city>
  <state>NY</state>
  <country>USA</country>
  <status>up to no good</status>
</transfer>

Implementation Status

I have a running, but not a all tested, object database system based on a DBI interface to a MySQL database. The system consists of an object builder that queries a database and generates . The object_builder requires guidance from the user in the form of command line interface to decide which external keys link to which other fields. It would be nice to build a pretty Tk interface, but I doubt it will happen any time soon. It also has no support for using unique keys for updates, merely for selecting records. Someday ...

The implementation currently reads a perl structure like:

%W3C::Forum::MysqlObjects::contacts::_TableDesc = ( '-primaryKey' => 'id',
                                                    '-class' => 'W3C::Forum::MysqlObjects::contacts',
                                                    '-fields' => { 'email' => { '-size' => 10,
                                                                                '-default' => '0',
                                                                                '-target' => 'uris',
                                                                                '-type' => '0' },
                                                                   'username' => { '-size' => 255,
                                                                                   '-default' => '',
                                                                                   '-type' => 4 },
                                                                   'location' => { '-size' => 64,
                                                                                   '-null' => 1,
                                                                                   '-type' => 4 },
                                                                   'comment' => { '-size' => 255,
                                                                                  '-null' => 1,
                                                                                  '-type' => 4 },
                                                                   'password' => { '-size' => 32,
                                                                                   '-null' => 1,
                                                                                   '-type' => 4 },
                                                                   'realname' => { '-size' => 64,
                                                                                   '-default' => '',
                                                                                   '-type' => 4 },
                                                                   'id' => { '-size' => 10,
                                                                             '-default' => '0',
                                                                             '-referents' => [ 'events.contact',
                                                                                               'forums.contact',
                                                                                               'links.contact',
                                                                                               'schemas.contact',
                                                                                               'skills.contact' ],
                                                                             '-type' => '0' },
                                                                   'homepage' => { '-size' => 10,
                                                                                   '-default' => '0',
                                                                                   '-target' => 'uris',
                                                                                   '-type' => '0' } },
                                                    '-index' => { 'username' => { '-fields' => { 'username' => '0' },
                                                                                  '-unique' => 1,
                                                                                  '-sequence' => [ 'username' ] } },
                                                    '-fieldOrder' => [ 'id',
                                                                       'username',
                                                                       'password',
                                                                       'realname',
                                                                       'location',
                                                                       'email',
                                                                       'homepage',
                                                                       'comment' ],
                                                    '-table' => 'contacts' );
$W3C::Forum::MysqlObjects::_AllTables{'contacts'} = \%W3C::Forum::MysqlObjects::contacts::_TableDesc;
sub W3C::Forum::MysqlObjects::contacts::getPrimaryKey {return shift->{FIELD_VALUES}{'id'};}
sub W3C::Forum::MysqlObjects::contacts::getId {return shift->{FIELD_VALUES}{'id'};}
sub W3C::Forum::MysqlObjects::contacts::getUsername {return shift->{FIELD_VALUES}{'username'};}
sub W3C::Forum::MysqlObjects::contacts::getPassword {return shift->{FIELD_VALUES}{'password'};}
sub W3C::Forum::MysqlObjects::contacts::getRealname {return shift->{FIELD_VALUES}{'realname'};}
sub W3C::Forum::MysqlObjects::contacts::getLocation {return shift->{FIELD_VALUES}{'location'};}
sub W3C::Forum::MysqlObjects::contacts::getEmail {return shift->{FIELD_VALUES}{'email'};}
sub W3C::Forum::MysqlObjects::contacts::getHomepage {return shift->{FIELD_VALUES}{'homepage'};}
sub W3C::Forum::MysqlObjects::contacts::getComment {return shift->{FIELD_VALUES}{'comment'};}

%W3C::Forum::MysqlObjects::links::_TableDesc = ( '-primaryKey' => 'id',
                                                 '-class' => 'W3C::Forum::MysqlObjects::links',
                                                 '-fields' => { 'forum' => { '-size' => 10,
                                                                             '-default' => '0',
                                                                             '-target' => 'forums',
                                                                             '-type' => '0' },
                                                                'source' => { '-size' => 10,
                                                                              '-default' => '0',
                                                                              '-target' => 'uris',
                                                                              '-type' => '0' },
                                                                'destination' => { '-size' => 10,
                                                                                   '-default' => '0',
                                                                                   '-target' => 'uris',
                                                                                   '-type' => '0' },
                                                                'comment' => { '-size' => 255,
                                                                               '-null' => 1,
                                                                               '-type' => 4 },
                                                                'rel' => { '-size' => 64,
                                                                           '-default' => '',
                                                                           '-type' => 4 },
                                                                'contact' => { '-size' => 10,
                                                                               '-default' => '0',
                                                                               '-target' => 'contacts',
                                                                               '-type' => '0' },
                                                                'rel_type' => { '-size' => 10,
                                                                                '-default' => '0',
                                                                                '-target' => 'uris',
                                                                                '-type' => '0' },
                                                                'id' => { '-size' => 10,
                                                                          '-default' => '0',
                                                                          '-type' => '0' } },
                                                 '-index' => { 'source' => { '-fields' => { 'forum' => 4,
                                                                                            'source' => '0',
                                                                                            'destination' => 1,
                                                                                            'rel' => 2,
                                                                                            'rel_type' => 3 },
                                                                             '-unique' => 1,
                                                                             '-sequence' => [ 'source',
                                                                                              'destination',
                                                                                              'rel',
                                                                                              'rel_type',
                                                                                              'forum' ] } },
                                                 '-fieldOrder' => [ 'id',
                                                                    'source',
                                                                    'destination',
                                                                    'rel',
                                                                    'rel_type',
                                                                    'forum',
                                                                    'contact',
                                                                    'comment' ],
                                                 '-table' => 'links' );
$W3C::Forum::MysqlObjects::_AllTables{'links'} = \%W3C::Forum::MysqlObjects::links::_TableDesc;
sub W3C::Forum::MysqlObjects::links::getPrimaryKey {return shift->{FIELD_VALUES}{'id'};}
sub W3C::Forum::MysqlObjects::links::getId {return shift->{FIELD_VALUES}{'id'};}
sub W3C::Forum::MysqlObjects::links::getSource {return shift->{FIELD_VALUES}{'source'};}
sub W3C::Forum::MysqlObjects::links::getDestination {return shift->{FIELD_VALUES}{'destination'};}
sub W3C::Forum::MysqlObjects::links::getRel {return shift->{FIELD_VALUES}{'rel'};}
sub W3C::Forum::MysqlObjects::links::getRel_type {return shift->{FIELD_VALUES}{'rel_type'};}
sub W3C::Forum::MysqlObjects::links::getForum {return shift->{FIELD_VALUES}{'forum'};}
sub W3C::Forum::MysqlObjects::links::getContact {return shift->{FIELD_VALUES}{'contact'};}
sub W3C::Forum::MysqlObjects::links::getComment {return shift->{FIELD_VALUES}{'comment'};}

%W3C::Forum::MysqlObjects::uris::_TableDesc = ( '-primaryKey' => 'id',
                                                '-class' => 'W3C::Forum::MysqlObjects::uris',
                                                '-fields' => { 'uri' => { '-size' => 255,
                                                                          '-default' => '',
                                                                          '-type' => 4 },
                                                               'luri' => { '-null' => 1,
                                                                           '-type' => undef },
                                                               'id' => { '-size' => 10,
                                                                         '-default' => '0',
                                                                         '-referents' => [ 'contacts.email',
                                                                                           'contacts.homepage',
                                                                                           'links.source',
                                                                                           'links.destination',
                                                                                           'links.rel_type' ],
                                                                         '-type' => '0' } },
                                                '-index' => { 'uri' => { '-fields' => { 'uri' => '0' },
                                                                         '-unique' => 1,
                                                                         '-sequence' => [ 'uri' ] },
                                                              'uri_idx' => { '-fields' => { 'uri' => '0' },
                                                                             '-unique' => '',
                                                                             '-sequence' => [ 'uri' ] } },
                                                '-fieldOrder' => [ 'id',
                                                                   'uri',
                                                                   'luri' ],
                                                '-table' => 'uris' );
$W3C::Forum::MysqlObjects::_AllTables{'uris'} = \%W3C::Forum::MysqlObjects::uris::_TableDesc;
sub W3C::Forum::MysqlObjects::uris::getPrimaryKey {return shift->{FIELD_VALUES}{'id'};}
sub W3C::Forum::MysqlObjects::uris::getId {return shift->{FIELD_VALUES}{'id'};}
sub W3C::Forum::MysqlObjects::uris::getUri {return shift->{FIELD_VALUES}{'uri'};}
sub W3C::Forum::MysqlObjects::uris::getLuri {return shift->{FIELD_VALUES}{'luri'};}

...
and has the ability to efficiently create or load any record and all the structures it references. The perl structure contains all the data that one would want to store in the XML database schema. I will build an XML interface after I finish a little more RDF work. I will not be releasing this in the W3C::Database module until I have it working hapilly with the rest of the modules. I want to make sure that folks can always grab the latest and modules and get a coherent package.

Links to background materials

Here are links to relevent specifications and products:


Last revised: $Date: 2001/07/05 14:03:56 $ by: $Author: eric $