Copyright
©
2011
©
2010
W3C
®
®
(
MIT
,
ERCIM
,
Keio
),
All
Rights
Reserved.
W3C
liability
,
trademark
and
document
use
rules
apply.
The need to share data with collaborators motivates custodians and users of relational databases (RDB) to expose relational data on the Web of Data. This document defines a direct mapping from relational data to RDF. This definition provides extension points for refinements within and outside of this document.
This
section
describes
the
status
of
this
document
at
the
time
of
its
publication.
Other
documents
may
supersede
this
document.
A
list
of
current
W3C
publications
and
the
latest
revision
of
this
technical
report
can
be
found
in
editor's
draft,
started
after
the
W3C
technical
reports
index
at
http://www.w3.org/TR/.
This
2011-09-20
Last
Call
,
is
a
Public
Working
Draft
of
the
"A
Direct
Mapping
of
Relational
Data
submitted
to
RDF"
for
review
by
W3C
members
and
other
interested
parties.
This
document
was
developed
by
the
W3C
RDB2RDF
Working
Group
.
The
Working
Group
expects
to
advance
this
Working
Draft
director
for
a
transition
to
Candidate
Recommendation
Status.
A
complete
list
of
changes
to
this
document
is
available.
Comments
on
this
document
should
be
sent
to
public-rdb2rdf-comments@w3.org
,
a
mailing
list
with
a
public
archive
(CR)
.
High-level
diffs
since
Last
Call:
For
more
detail,
see
the
W3C
Patent
Policy
HTML
diffs
.
1
Introduction
2
Direct
Mapping
Description
(Informative)
2.1
Direct
Mapping
Example
2.2
Preliminaries:
Generating
IRIs
2.2.1
IRIs
generated
for
the
initial
example
2.3
Mapping
Rules
2.3.1
Triples
generated
for
the
example
in
Section
Direct
Mapping
Example
2.4
Additional
Examples
and
Corner
Cases
2.4.1
Foreign
keys
referencing
candidate
keys
2.4.2
2.3
Multi-column
primary
keys
2.4.3
2.4
Empty
(non-existent)
primary
keys
2.4.4
2.5
Referencing
tables
with
empty
primary
keys
2.5
Hierarchical
Tables
3
Direct
Graph
Definition
3
4
References
A
Direct
Mapping
Definition
Algebra
(Informative)
3.1
A.1
Notations
3.2
A.2
Relational
Data
Model
3.2.1
A.2.1
RDB
Abstract
Data
Type
(Normative)
3.2.2
A.2.2
RDB
accessor
functions
(Normative)
3.3
A.3
RDF
Data
Model
(Non-normative)
3.4
A.4
Denotational
semantics
(Normative)
4
B
Direct
Mapping
as
Rules
(Normative)
(Informative)
4.1
B.1
Generating
Table
Row
Type
Triples
4.1.1
B.1.1
Table
has
a
primary
key
4.1.2
B.1.2
Table
does
not
have
a
primary
key
4.2
B.2
Generating
Literal
Triples
4.2.1
B.2.1
Table
has
a
primary
key
4.2.2
B.2.2
Table
does
not
have
a
primary
key
4.3
B.3
Generating
Reference
Triples
4.3.1
B.3.1
Table
r1
has
a
primary
key
and
table
r2
has
a
primary
key
4.3.2
B.3.2
Table
r1
has
a
primary
key
and
table
r2
does
not
have
a
primary
key
4.3.3
B.3.3
Table
r1
does
not
have
primary
key
and
table
r2
has
a
primary
key
4.3.4
B.3.4
Table
r1
does
not
have
primary
key
and
table
r2
does
not
have
a
primary
key
5
References
Relational databases proliferate both because of their efficiency and their precise definitions, allowing for tools like SQL [SQLFN] to manipulate and examine the contents predictably and efficiently. Resource Description Framework (RDF) [RDF-concepts] is a data format based on a web-scalable architecture for identification and interpretation of terms. This document defines a mapping from relational representation to an RDF representation.
Strategies for mapping relational data to RDF abound. The direct mapping defines a simple transformation, providing a basis for defining and comparing more intricate transformations. This document includes an informal and a formal description of the transformation.
The
Direct
Mapping
is
intended
to
provide
a
default
behavior
for
R2RML:
RDB
to
RDF
Mapping
Language
.
It
can
be
also
be
used
to
materialize
RDF
graphs
or
define
virtual
graphs,
which
can
be
queried
by
SPARQL
or
traversed
by
an
RDF
graph
API.
The
direct
mapping
defines
an
RDF
Graph
[RDF-concepts]
representation
of
the
data
in
any
a
relational
database.
The
direct
mapping
takes
as
input
a
relational
database
(data
and
schema),
and
generates
an
RDF
graph
that
is
called
the
direct
graph
.
This
The
algorithms
in
this
document
compose
a
graph
is
composed
of
relative
IRIs
that
may
which
must
be
resolved
against
a
base
IRI
per
[RFC3987]
.
to
form
an
RDF
graph.
Foreign
keys
in
relational
databases
establish
a
named
reference
from
any
row
in
a
table
to
exactly
one
row
in
a
(potentially
different)
table.
The
direct
graph
conveys
these
references,
as
well
as
each
value
in
the
rows.
row.
The
concepts
in
direct
mapping
can
be
introduced
with
an
example
RDF
graph
produced
by
a
relational
database.
Following
is
SQL
(DDL)
to
create
a
simple
example
with
two
tables
with
single-column
primary
keys
and
one
foreign
key
reference
between
them:
CREATE TABLE Addresses (
ID INT,
city CHAR(10),
state CHAR(2),
PRIMARY KEY(ID)
CREATE TABLE "Addresses" ( "ID" INT, PRIMARY KEY("ID"), "city" CHAR(10), "state" CHAR(2) )CREATE TABLE People ( ID INT, fname CHAR(10), addr INT, PRIMARY KEY(ID), FOREIGN KEY(addr) REFERENCES Addresses(ID)CREATE TABLE "People" ( "ID" INT, PRIMARY KEY("ID"), "fname" CHAR(10), "addr" INT, FOREIGN KEY("addr") REFERENCES "Addresses"("ID") )INSERT INTO Addresses (ID, city, state) VALUES (18, "Cambridge", "MA") INSERT INTO People (ID, fname, addr) VALUES (7, "Bob", 18) INSERT INTO People (ID, fname, addr) VALUES (8, "Sue", NULL)INSERT INTO "Addresses" ("ID", "city", "state") VALUES (18, 'Cambridge', 'MA') INSERT INTO "People" ("ID", "fname", "addr") VALUES (7, 'Bob', 18) INSERT INTO "People" ("ID", "fname", "addr") VALUES (8, 'Sue', NULL)
HTML
tables
will
be
used
in
this
document
to
convey
SQL
tables.
The
primary
key
of
these
tables
will
be
marked
with
the
PK
class
to
convey
an
SQL
primary
key
such
as
ID
in
CREATE
TABLE
.
Foreign
keys
will
be
illustrated
with
a
notation
like
"
Addresses
(ID
"Addresses"
("ID"
INT,
...
PRIMARY
KEY(ID))
KEY("ID"))
→
Address(ID)
"
to
convey
an
SQL
foreign
key
such
as
CREATE
TABLE
.
People
"People"
(...
addr
"addr"
INT,
FOREIGN
KEY(addr)
KEY("addr")
REFERENCES
Addresses(ID))
"Addresses"("ID"))
PK | → Address(ID) | |
---|---|---|
ID | fname | addr |
7 | Bob | 18 |
8 | Sue | NULL |
PK | ||
---|---|---|
ID | city | state |
18 | Cambridge | MA |
Given
a
base
IRI
http://foo.example/DB/
,
the
direct
mapping
of
this
database
produces
a
direct
graph:
@base <http://foo.example/DB/>
@base <http://foo.example/DB/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . <People/ID-7> rdf:type <People> . <People/ID-7> <People#ID> 7 . <People/ID-7> <People#fname> "Bob" . <People/ID-7> <People#addr> 18 . <People/ID-7> <People#ref-addr> <Addresses/ID-18> . <People/ID-8> rdf:type <People> . <People/ID-8> <People#ID> 8 . <People/ID-8> <People#fname> "Sue" . <Addresses/ID-18> rdf:type <Addresses> . <Addresses/ID-18> <Addresses#ID> 18 . <Addresses/ID-18> <Addresses#city> "Cambridge" . <Addresses/ID-18> <Addresses#state> "MA" .
In
this
expression,
each
row,
e.g.
(7,
"Bob",
18)
,
produces
a
set
of
triples
with
a
common
subject.
The
subject
is
an
IRI
formed
from
the
concatenation
of
the
base
IRI,
table
name
(
People
),
primary
key
column
name
(
ID
)
and
primary
key
value
(
7
).
The
predicate
for
each
column
is
an
IRI
formed
from
the
concatenation
of
the
base
IRI,
table
name
and
the
column
name.
The
values
are
either
RDF
literals
formed
from
the
lexical
form
of
the
column
value,
or,
in
the
case
of
value.
Each
foreign
keys,
row
identifiers
(
<Addresses/ID=18>
).
Note
that
these
reference
row
identifiers
must
coincide
with
the
subject
used
for
the
triples
generated
from
the
referenced
row.
2.2
Preliminaries:
Generating
IRIs
In
the
process
of
translating
relational
data
into
RDF,
the
direct
mapping
must
create
IRIs
for
identifying
tables,
the
columns
in
a
table,
and
each
row
in
a
table.
In
this
section,
we
assume
that
http://foo.example/DB
is
the
the
base
IRI.
All
the
examples
in
this
section
will
contain
relative
IRIs
which
are
to
be
understood
as
relative
to
this
base
IRI.
The
following
are
the
IRIs
that
need
to
be
generated:
Table
IRI:
The
IRI
that
identifies
keys
produces
a
table
is
created
by
concatenating
the
base
IRI
triple
with
the
table
name.
Specifically,
if
base_IRI
is
the
base
IRI
and
table_name
is
the
table
name,
then
base_IRI/table_name
is
the
Table
IRI
for
the
table.
Column
IRI:
Single-column
IRI:
The
IRI
that
identifies
a
column
of
a
table
is
created
by
concatenating
the
base
IRI
with
the
table
name
and
the
column
name.
Specifically,
if
base_IRI
is
the
base
IRI,
table_name
is
the
table
name
and
column_name
is
predicate
composed
from
the
foreign
key
column
name,
then
base_IRI/table_name#column_name
is
the
Column
IRI
for
the
column.
Multi-column
IRI:
The
IRI
that
identifies
a
sequence
of
two
or
more
columns
of
a
table
is
created
by
concatenating
the
base
IRI
with
names,
the
table
name
referenced
table,
and
the
referenced
column
names.
Specifically,
if
base_IRI
is
the
base
IRI,
table_name
is
the
table
name
and
column_name_1
,
column_name_2
,
...,
column_name_k
is
a
sequence
of
k
columns
(k
>
1),
then
base_IRI/table_name#column_name_1,column_name_2,...,column_name_k
is
the
Column
IRI
for
the
columns.
Row
RDF
Node:
Row
RDF
Node
for
a
row
with
a
single-column
primary
key:
The
IRI
that
identifies
a
row
is
created
by
concatenating
the
base
IRI
with
the
table
name,
the
column
name
of
the
primary
key
and
the
value
of
the
row
in
that
column.
Specifically,
if
base_IRI
is
the
base
IRI,
table_name
is
the
table
name,
column_name
is
the
column
name
object
of
the
primary
key
and
value
these
triples
is
the
value
of
the
row
in
that
column,
then
identifiers
(
base_IRI/table_name/column_name=value
<Addresses/ID
-
18>
is
the
Row
RDF
Node
(or
Row
IRI)
)
for
the
row.
Row
RDF
Node
for
a
row
with
a
multi-column
primary
key:
The
IRI
that
identifies
a
row
is
created
by
concatenating
the
base
IRI
with
the
table
name,
the
names
of
the
columns
that
constitute
the
primary
key
and
the
values
of
the
row
in
those
columns.
Specifically,
if
base_IRI
is
the
base
IRI,
table_name
is
the
table
name,
column_name_1
,
column_name_2
,
...,
column_name_k
is
the
sequence
of
k
columns
(k
>
1)
that
constitute
the
primary
key,
and
value_1
,
value_2
,
...,
value_k
is
the
sequence
of
values
of
the
columns
referenced
triple.
Note
that
constitute
the
primary
key
of
the
row,
then
base_IRI/table_name/column_name_1=value_1,column_name_2=value_2,...,column_name_k=value_k
is
the
Row
RDF
Node
(or
Row
IRI)
for
the
row.
Row
RDF
Node
for
a
row
without
a
primary
key:
A
fresh
Blank
Node
is
created,
which
is
used
as
the
Row
RDF
Node
for
the
row.
Issue
(hash-vs-slash):
The
direct
graph
may
be
offered
as
Linked
Open
Data,
raising
the
issue
of
distinguishing
row
identifiers
from
the
information
resources
which
describe
them
.
This
edition
of
this
document
presumes
hash
identifiers,
allowing
a
GET
on
a
row
identifier
to
retrieve
a
small
resource
(i.e.
not
all
rows
from
the
same
table)
and
distinguish
between
the
retrieved
resource
People/ID=7
and
the
these
reference
row
People/ID=7
.
The
"slash"
alternative
would
offer
a
direct
graph
with
identifiers
like
People/ID=7
but
would
demand
the
server
respond
to
GET
/People/ID=7
must
coincide
with
a
303
redirect
to
some
other
resource.
Resolution:
None
recorded.
2.2.1
IRIs
generated
for
the
initial
example
Given
the
base
IRI
http://foo.example/DB/
,
the
following
are
some
of
the
IRIs
that
are
subject
used
when
translating
into
RDF
the
relational
data
given
in
the
initial
example:
For
the
table
People
,
the
following
IRIs
are
considered
in
the
translation
process:
Table
IRI:
<People>
Column
IRIs:
<People#ID>
<People#fname>
<People#addr>
Row
IRIs:
<People/ID=7>
<People/ID=8>
For
the
table
Addresses
,
the
following
IRIs
are
considered
in
the
translation
process:
Table
IRI:
<Addresses>
Columns
IRIs:
<Addresses#ID>
<Addresses#city>
<Addresses#state>
Row
IRI:
<Addresses/ID=18>
2.3
Mapping
Rules
Each
row
in
for
the
database
produces
a
set
of
RDF
triples
with
a
subject
,
predicate
,
and
object
composed
as
follows:
Shared
Subject:
A
Row
RDF
Node
,
which
may
be
an
IRI
or
a
Blank
Node,
is
generated
for
each
from
the
referenced
row.
Table
Triples:
The
row
generates
a
triple
with
the
following:
Predicate:
the
rdf:type
property
Object:
the
Table
IRI
for
the
table
Literal
Triples:
Each
column
with
a
non-null
value,
including
the
column(s)
that
constitute
the
primary
key,
and
that
either
is
direct
mapping
does
not
the
only
constituent
of
a
foreign
key
or
is
the
only
constituent
of
a
foreign
key
that
references
a
candidate
key,
generates
a
triple
with
the
following:
Predicate:
the
Column
IRI
for
the
column
Object:
an
RDF
Literal
with
an
XML
Schema
datatype
corresponding
to
the
SQL
datatype
of
that
value.
String
datatypes
are
expressed
as
an
RDF
plain
literal
Reference
Triples:
Columns
that
constitute
a
foreign
key
and
with
non-null
values
in
the
row
generate
triples
with
the
following:
Predicate:
the
Column
IRI
for
the
columns
that
constitute
the
foreign
key
Object:
the
Row
RDF
Node
for
the
corresponding
referenced
row
(according
to
the
foreign
key)
Issue
(primary-is-candidate-key):
Should
the
following
exception
be
included
in
the
definition
of
the
direct
mapping?
Primary-is-Candidate-Key
Exception
:
If
the
primary
key
is
also
a
candidate
key
K
to
table
R:
The
shared
subject
is
the
subject
of
the
referenced
row
in
R.
The
foreign
key
K
generates
no
reference
triple.
Even
if
K
is
a
single-column
foreign
key,
it
generates
a
literal
triple.
Resolution:
None
recorded.
2.3.1
Triples
generated
for
the
example
in
Section
Direct
Mapping
Example
Next
we
show
how
the
11
triples
in
the
example
of
Section
Direct
Mapping
Example
are
classified
into
the
above
categories:
Triples
generated
from
table
People
:
Table
Triples:
Literal
Triples:
Reference
Triple:
Triples
generated
from
table
Addresses
:
NULL
values.
More
complex
schemas
include
compound
and
composite
primary
keys.
In
this
example,
the
columns
deptName
and
deptCity
in
the
People
table
reference
name
and
city
in
the
Department
table.
The
following
is
the
schema
of
the
augmented
database:
CREATE TABLE Addresses (
ID INT,
city CHAR(10),
state CHAR(2),
PRIMARY KEY(ID)
table:
CREATE TABLE "Addresses" ( "ID" INT, "city" CHAR(10), "state" CHAR(2), PRIMARY KEY("ID") )CREATE TABLE Deparment ( ID INT, name CHAR(10), city CHAR(10), manager INT, PRIMARY KEY(ID), UNIQUE (name, city), FOREIGN KEY(manager) REFERENCES People(ID)CREATE TABLE "Department" ( "ID" INT, "name" CHAR(10), "city" CHAR(10), "manager" INT, PRIMARY KEY("ID"), UNIQUE ("name", "city") )CREATE TABLE People ( ID INT, fname CHAR(10), addr INT, deptName CHAR(10), deptCity CHAR(10), PRIMARY KEY(ID), FOREIGN KEY(addr) REFERENCES Addresses(ID), FOREIGN KEY(deptName, deptCity) REFERENCES Department(name, city)CREATE TABLE "People" ( "ID" INT, "fname" CHAR(10), "addr" INT, "deptName" CHAR(10), "deptCity" CHAR(10), PRIMARY KEY("ID"), FOREIGN KEY("addr") REFERENCES "Addresses"("ID"), FOREIGN KEY("deptName", "deptCity") REFERENCES "Department"("name", "city") ) ALTER TABLE "Department" ADD FOREIGN KEY("manager") REFERENCES "People"("ID")
The
following
Following
is
an
instance
of
the
augmented
relational
this
schema:
PK | → Addresses(ID) | → Department(name, city) | ||
---|---|---|---|---|
ID | fname | addr | deptName | deptCity |
7 | Bob | 18 | accounting | Cambridge |
8 | Sue | NULL | NULL | NULL |
PK | ||
---|---|---|
ID | city | state |
18 | Cambridge | MA |
PK | Unique Key | → People(ID) | |
---|---|---|---|
ID | name | city | manager |
23 | accounting | Cambridge | 8 |
Per the People tables's compound foreign key to Department:
deptName="accounting"
and
deptCity="Cambridge"
references
a
row
in
Department
with
a
primary
key
of
ID=23
.
deptName,deptCity
deptName
"
and
"
deptCity
",
reflecting
the
order
of
the
column
names
in
the
foreign
key.
Department
"
and
the
primary
key
value
"
ID=23
ID-23
".
In
this
example,
the
direct
mapping
generates
the
following
triples:
@base <http://foo.example/DB/>
@base <http://foo.example/DB/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . <People/ID-7> rdf:type <People> . <People/ID-7> <People#ID> 7 . <People/ID-7> <People#fname> "Bob" . <People/ID-7> <People#addr> 18 . <People/ID-7> <People#ref-addr> <Addresses/ID-18> . <People/ID-7> <People#deptName> "accounting" . <People/ID-7> <People#deptCity> "Cambridge" . <People/ID-7> <People#ref-deptName.deptCity> <Department/ID-23> . <People/ID-8> rdf:type <People> . <People/ID-8> <People#ID> 8 . <People/ID-8> <People#fname> "Sue" . <Addresses/ID-18> rdf:type <Addresses> . <Addresses/ID-18> <Addresses#ID> 18 . <Addresses/ID-18> <Addresses#city> "Cambridge" . <Addresses/ID-18> <Addresses#state> "MA" . <Department/ID-23> rdf:type <Department> . <Department/ID-23> <Department#ID> 23 . <Department/ID-23> <Department#name> "accounting" . <Department/ID-23> <Department#city> "Cambridge" . <Department/ID-23> <Department#manager> 8 . <Department/ID-23> <Department#ref-manager> <People#ID-8> .
The
green
triples
above
are
generated
by
considering
the
new
elements
in
the
augmented
database.
It
should
be
noticed
that:
Note:
The
Reference
Triple
is
generated
by
considering
a
foreign
key
referencing
a
candidate
key
<People/ID=7>
<People#deptName,deptCity>
<Department/ID=23>
<People/ID
-
7>
<People#ref-deptName,deptCity>
<Department/ID
-
23>
(instead
of
(different
from
the
primary
key):
(deptName,
deptCity)
is
a
multi-column
foreign
key
in
the
table
People
which
references
the
multi-column
candidate
key
(name,
city)
in
the
table
Department
.
key).
We
note
that
primary
Primary
keys
may
also
be
composite.
For
example,
if
If,
in
the
above
example
,
the
primary
key
for
Department
were
(
name
,
city
)
instead
of
ID
in
the
example
in
Section
Foreign
keys
referencing
candidate
keys
,
then
,
the
identifier
for
the
only
row
in
this
table
would
be
.
The
triples
involving
<Department/name=accounting,city=Cambridge>
,
and
the
following
<Department/name
-
accounting
.
city
-
Cambridge>
<Department/ID
-
23>
would
have
been
generated
by
be
substituted
with
the
direct
mapping:
following
triples:
<People/ID-7> <People#ref-deptName.deptCity> <Department/name-accounting.city-Cambridge> . <Department/name-accounting.city-Cambridge> rdf:type <Department> . <Department/name-accounting.city-Cambridge> <Department#ID> 23 . <Department/name-accounting.city-Cambridge> <Department#name> "accounting" .<Department/name-accounting.city-Cambridge> <Department#city> "Cambridge" .
Even
if
If
there
is
no
primary
key,
rows
generate
implies
a
set
of
triples
with
a
shared
subject,
but
that
subject
is
a
blank
node.
For
instance,
assume
that
the
following
A
Tweets
table
is
can
be
added
to
the
schema
of
the
above
example
in
Section
Foreign
keys
referencing
candidate
keys
(for
keeping
to
keep
track
of
employees'
tweets
in
Twitter):
CREATE TABLE Tweets (
tweeter INT,
when TIMESTAMP,
text CHAR(140),
FOREIGN KEY(tweeter) REFERENCES People(ID)
Twitter:
CREATE TABLE "Tweets" ( "tweeter" INT, "when" TIMESTAMP, "text" CHAR(140), FOREIGN KEY("tweeter") REFERENCES "People"("ID") )
The following is an instance of table Tweets :
→ People(ID) | ||
---|---|---|
tweeter | when | text |
7 | 2010-08-30T01:33 | I really like lolcats. |
7 | 2010-08-30T09:01 | I take it back. |
Given
that
table
Tweets
does
not
have
a
primary
key,
each
row
in
this
table
is
identified
by
a
Blank
Node.
In
fact,
when
translating
the
above
table
the
direct
mapping
generates
the
following
triples:
@base <http://foo.example/DB/> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . _:a rdf:type <Tweets> . _:a <Tweets#ref-tweeter> <People/ID-7> . _:a <Tweets#when> "2010-08-30T01:33"^^xsd:dateTime . _:a <Tweets#text> "I really like lolcats." . _:b rdf:type <Tweets> . _:b <Tweets#tweeter> <People/ID-7> . _:b <Tweets#when> "2010-08-30T09:01"^^xsd:dateTime . _:b <Tweets#text> "I take it back." .
Rows in tables with no primary key may still be referenced by foreign keys. (Relational database theory tells us that these rows must be unique as foreign keys reference candidate keys and candidate keys are unique across all the rows in a table.) References to rows in tables with no primary key are expressed as RDF triples with blank nodes for objects, where that blank node is the same node used for the subject in the referenced row.
This
example
includes
several
foreign
keys
with
mutual
column
names.
For
clarity;
here
is
the
DDL
to
clarify
these
keys:
CREATE TABLE Projects (
lead INT,
FOREIGN KEY (lead) REFERENCES People(ID),
name VARCHAR(50),
UNIQUE (lead, name),
deptName VARCHAR(50),
deptCity VARCHAR(50),
UNIQUE (name, deptName, deptCity),
FOREIGN KEY (deptName, deptCity) REFERENCES Department(name, city)
CREATE TABLE "Projects" ( "lead" INT, FOREIGN KEY ("lead") REFERENCES "People"("ID"), "name" VARCHAR(50), UNIQUE ("lead", "name"), "deptName" VARCHAR(50), "deptCity" VARCHAR(50), UNIQUE ("name", "deptName", "deptCity"), FOREIGN KEY ("deptName", "deptCity") REFERENCES "Department"("name", "city") )CREATE TABLE TaskAssignments ( worker INT, FOREIGN KEY (worker) REFERENCES People(ID), project VARCHAR(50), PRIMARY KEY (worker, project), deptName VARCHAR(50), deptCity VARCHAR(50), FOREIGN KEY (worker) REFERENCES People(ID), FOREIGN KEY (project, deptName, deptCity) REFERENCES Projects(name, deptName, deptCity), FOREIGN KEY (deptName, deptCity) REFERENCES Department(name, city)CREATE TABLE "TaskAssignments" ( "worker" INT, FOREIGN KEY ("worker") REFERENCES "People"("ID"), "project" VARCHAR(50), PRIMARY KEY ("worker", "project"), "deptName" VARCHAR(50), "deptCity" VARCHAR(50), FOREIGN KEY ("worker") REFERENCES "People"("ID"), FOREIGN KEY ("project", "deptName", "deptCity") REFERENCES "Projects"("name", "deptName", "deptCity"), FOREIGN KEY ("deptName", "deptCity") REFERENCES "Department"("name", "city") )
The following is an instance of the preceding schema:
Unique key | |||
---|---|---|---|
Unique key | |||
→ People(ID) | → Department(name, city) | ||
lead | name | deptName | deptCity |
8 | pencil survey | accounting | Cambridge |
8 | eraser survey | accounting | Cambridge |
PK | |||
---|---|---|---|
→ Projects(name, deptName, deptCity) | |||
→ People(ID) | → Departments(name, city) | ||
worker | project | deptName | deptCity |
7 | pencil survey | accounting | Cambridge |
In
this
case,
the
direct
mapping
generates
the
following
triples
from
the
preceding
tables:
@base <http://foo.example/DB/> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .@prefix pencil: <http://foo.example/DB/TaskAssignment/worker=7,project=pencil+survey> ._:c rdf:type <Projects> . _:c <Projects#lead> <People/ID-8> . _:c <Projects#name> "pencil survey" . _:c <Projects#deptName> "accounting" . _:c <Projects#deptCity> "Cambridge" . _:c <Projects#ref-deptName.deptCity> <Department/ID-23> . _:d rdf:type <Projects> . _:d <Projects#lead> <People/ID-8> . _:d <Projects#name> "eraser survey" . _:d <Projects#deptName> "accounting" . _:d <Projects#deptCity> "Cambridge" . _:d <Projects#ref-deptName.deptCity> <Department/ID-23> . <TaskAssignment/worker-7.project-pencil+survey> rdf:type <TaskAssignments> . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#worker> 7 . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#ref-worker> <People/ID-7> . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#project> "pencil survey" . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#deptName> "accounting" . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#deptCity> "Cambridge" . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#ref-deptName.deptCity> <Department/ID-23> . <TaskAssignment/worker-7.project-pencil+survey> <TaskAssignments#ref-project.deptName.deptCity> _:c .
The absence of a primary key forces the generation of blank nodes, but does not change the structure of the direct graph or names of the predicates in that graph.
It
The
Direct
Graph
is
common
to
express
specializations
of
some
concept
as
multiple
tables
sharing
a
common
primary
key.
In
such
cases,
formula
for
creating
an
RDF
graph
from
the
primary
keys
rows
of
each
table
and
view
in
a
database
schema.
A
base
IRI
defines
a
web
space
for
the
inherited
tables
are
IRIs
in
turn
foreign
keys
this
graph;
for
the
purposes
of
this
specification,
all
IRIs
are
generated
by
appending
to
a
base.
Terms
enclosed
in
<>
are
defined
in
the
table
from
which
they
derive.
Addresses
PK
ID
city
state
18
Cambridge
MA
Offices
PK
→
Addresses(ID)
ID
building
ofcNumber
18
32
G528
ExecutiveOffices
PK
→
Offices(ID)
ID
desk
18
oak
SQL
specification
[SQLFN]
.
In
this
example,
Offices
are
An
SQL
table
has
a
specialization
set
of
Addresses
uniquely-named
columns
and
ExecutiveOffices
are
a
specialization
set
of
Offices
.
The
subjects
for
foreign
keys,
each
mapping
a
<column
name
list>
to
a
<unique
column
list>
(a
list
of
columns
in
some
table).
SQL
table
and
column
identifiers
compose
RDF
IRIs
in
the
triples
implied
direct
graph.
These
identifiers
are
separated
by
rows
the
punctuation
characters
'#',
'
.
',
'/'
and
'
-
'.
All
SQL
identifiers
are
escaped
following
URL-encoding
HTML
form
data
except
that
only
the
above
punctuation
and
the
characters
not
permitted
in
RDF
IRIs
are
escaped.
Definition
Offices
percent-encode
:
(a
subset
of
HTML5
form
dataset
encoding
):
There is either a blank node or IRI assigned to each each row in a table:
Definition
ExecutiveOffices
are
row
node
:
A table forms a table IRI:
Definition table IRI : the relative IRI consisting of the percent-encoded form of the table name.
A column in a table forms a literal property IRI:
Definition literal property IRI : the concatenation of:
A
foreign
key
in
a
rearrangement
table
forms
a
reference
property
IRI:
Definition reference property IRI : the concatenation of:
Any
input
database
with
a
primary
key
(project,
worker),
had
given
schema
has
a
foreign
key
(worker,
project)?
Resolution:
direct
graph
defined
as:
Definition direct graph : the union of the table graph s for each table in a database schema.
Definition
table
graph
:
the
union
of
the
row
graph
Issue
(many-to-many-as-repeated-properties):
s
for
each
row
in
a
table.
Definition
row
graph
is
arguably
more
faithful
to
:
an
RDF
graph
consisting
of
the
conceptual
model
if
it
reflects
e.g.
following
triples:
Definition row type triple : an RDF triple with:
rdf:type
.Definition literal triple : an RDF triple with:
Definition reference triple : an RDF triple with:
The
RDB
and
RDF
data
models
make
use
of
the
commonly
defined
Abstract
Data
Types
Set
,
List
and
MultiSet
,
used
here
as
type
constructors.
For
example,
Set(A)
denotes
the
type
for
the
sets
of
elements
of
type
A
.
We
assume
that
they
come
with
their
common
operations,
such
as
the
function
.
size : Set → Int
size : Set → Int
The
definitions
follow
a
type-as-specification
approach,
thus
the
models
are
based
on
dependent
types
.
For
example,
is
a
type
denoting
the
sets
for
elements
of
type
A,
such
that
those
sets
have
at
most
one
element.
{ s:Set(A) | size(s) ≤ 1 }
{ s:Set(A) | size(s) ≤ 1 }
The denotational RDF semantics makes use of the set-builder notation for building the RDF sets.
[1] |
Database
|
::= |
|
|
[1] |
Database
| ::= |
{
Table
}
| |
A relational database is a set of tables. | ||||
[2] |
Table
|
::= |
|
|
[2] |
Table
| ::= |
(
TableName
,
{
ColumnName
→
Datatype
},
{
CandidateKey
},
PrimaryKey
?,
{
ForeignKey
},
Body
)
| |
A
relation
has
|
||||
[3] |
|
::= |
|
|
|
Body
|
::= |
|
|
A
body
is
a
set
of
|
||||
|
Row
|
::= |
|
|
[4] |
Row
| ::= |
{
ColumnName
→
CellValue
}
| |
A row is a associative array mapping each column in a row to a value. | ||||
|
CellValue
|
::= |
|
|
[5] |
CellValue
| ::= |
Value
|
Null
| |
A cell value is either a lexical value or NULL, denoting the absence of value. | ||||
|
ForeignKey
|
::= |
|
|
[6] |
ForeignKey
| ::= |
{
[
ColumnName
]
→
(
Table
,
[
ColumnName
]
)
}
| |
A
foreign
key
=
operator)
to
the
values
of
a
|
||||
[7] |
PrimaryKey
| ::= |
CandidateKey
| |
[7] |
PrimaryKey
| ::= |
CandidateKey
|
|
A primary key is a candidate key with the additional constraint that none of the columns can have a NULL value. | ||||
[8] |
CandidateKey
|
::= |
|
|
[8] |
CandidateKey
| ::= |
[
ColumnName
]
| |
A
candidate
key
is
=
operator).
|
||||
[9] |
Datatype
|
::= |
Int
|
Float
|
Date
|
…
|
|
[9] |
Datatype
| ::= |
{
INT
|
FLOAT
|
DATE
|
TIME
|
TIMESTAMP
|
CHAR
|
VARCHAR
|
STRING
}
| |
A datatype is a common SQL datatype. | ||||
[10] |
TableName
|
::= |
String
|
|
[10] |
TableName
| ::= |
String
| |
A table name is a string. | ||||
[11] |
ColumnName
|
::= |
String
|
|
[11] |
ColumnName
| ::= |
String
| |
A column name is a string. |
[12] |
tablename
|
: |
Table
→
TableName
|
|
Given a table, tablename returns its name. | ||||
[13] |
header
|
: |
Table
→
|
|
Given a table, header returns its header. | ||||
[14] |
candidateKeys
|
: |
Table
→
|
|
Given a table, candidateKeys returns the list of candidate keys. | ||||
[15] |
primaryKey
|
: |
Table
→
{
|
|
Given a table, primaryKey returns a set containing the primary key if it exists, otherwise it returns an empty set. | ||||
[16] |
foreignKeys
|
: |
Table
→
|
|
Given a table, foreignKeys returns the set of foreign keys. | ||||
[17] |
unary
|
: |
ForeignKey
→
Boolean
|
|
Given a foreign key, unary tells if this is a unary foreign key, meaning it has exactly one column. | ||||
[18] |
lexicals
|
: |
Table
→
Set({
|
|
Given a table, lexicals returns the set of columns that do not constitute a unary foreign key. | ||||
[19] |
body
|
: |
Table
→
Body
|
|
Given a table, body returns its body. | ||||
[20] |
datatype
|
: |
{
|
|
Given a header and a column in this header, datatype returns the datatype associated with this column. | ||||
[21] |
table
|
: |
{
|
|
Given a row, table returns the table to which this row belongs. | ||||
[22] |
value
|
: |
{
|
|
Given a row and a column in this row, value returns the cell value (can be NULL) for this column. | ||||
[23] |
dereference
|
: |
{
|
|
Given
a
row
and
a
foreign
key
from
the
table
containing
this
row,
dereference
returns
the
row
which
is
referenced
by
this
foreign
=
operator)
to
the
values
|
Per RDF Concepts and Abstract Syntax , an RDF graph is a set of triples of a subject, predicate and object. The subject may be an IRI or a blank node, the predicate must be an IRI and the object may be an IRI, blank node, or an RDF literal.
This section recapitulates for convience the formal definition of RDF.
[24] |
Graph
|
::= |
|
|
[24] |
Graph
| ::= |
{
Triple
}
| |
An RDF graph is a set of RDF triples. | ||||
[25] |
Triple
|
::= |
|
|
[25] |
Triple
| ::= |
(
Subject
,
Predicate
,
Object
)
| |
An RDF triple is composed of a subject, predicate and object. | ||||
[26] |
Subject
|
::= |
IRI
|
BlankNode
|
|
[26] |
Subject
| ::= |
IRI
|
BlankNode
| |
A subject is either an IRI or a blank node. | ||||
[27] |
Predicate
|
::= |
IRI
|
|
[27] |
Predicate
| ::= |
IRI
| |
A predicate is always an IRI. | ||||
[28] |
Object
|
::= |
IRI
|
BlankNode
|
Literal
|
|
[28] |
Object
| ::= |
IRI
|
BlankNode
|
Literal
| |
An object is either an IRI, a blank node, or a literal. | ||||
[29] |
BlankNode
|
::= |
RDF
blank
node
|
|
[29] |
BlankNode
| ::= |
RDF
blank
node
| |
A blank node is an arbitrary term used only to establish graph connectivity. | ||||
[30] |
Literal
|
::= |
PlainLiteral
|
TypedLiteral
|
|
[30] |
Literal
| ::= |
PlainLiteral
|
TypedLiteral
| |
A literal is either a plain literal or a typed literal. | ||||
[31] |
PlainLiteral
|
::= |
lexicalForm
|
|
|
[31] |
PlainLiteral
| ::= |
(
lexicalForm
)
|
(
lexicalForm
,
langageTag
)
| |
A plain literal has a lexical form and an optional language tag . | ||||
[32] |
TypedLiteral
|
::= |
|
|
[32] |
TypedLiteral
| ::= |
(
lexicalForm
,
IRI
)
| |
An typed literal is composed of lexical form and a datatype IRI . | ||||
[33] |
IRI
|
::= |
RDF
URI-reference
as
subsequently
restricted
by
SPARQL
|
|
[33] |
IRI
| ::= |
RDF
URI-reference
as
subsequently
restricted
by
SPARQL
| |
An IRI is an RDF URI reference as subsequently restricted by SPARQL . | ||||
[34] |
lexicalForm
|
::= |
a
Unicode
String
|
|
[34] |
lexicalForm
| ::= |
a
Unicode
String
| |
SQL string representing a value. |
In this model, Databases are inhabitants of RDB and they are denoted by mathematical objects living in the RDF domain . This denotational semantics is what we call the Direct Mapping .
The url-encoding function renders strings in a form suitable to insert into IRIs. Data values are expressed in the XML Schema canonical form before url-encoding.
[35] |
ue
| : |
String
→
String
| |
[35] |
UE(s)
| = | s percent-encoded . | |
| ||||
⟦ , ⟧
canon
|
: |
(
Row
,
Column
)
→
String
| ||
[36] |
⟦r,
c⟧
canon
| = |
let
v
=
value
(r,
c)
in
| |
[36] |
canon(A)
| = |
canonical
RDF
literal
(A)
| |
lexical form of the canonical RDF literal representation of the column value as defined in R2RML section 10.2 Natural Mapping of SQL Values |
Most
of
the
functions
defining
the
Direct
Mapping
are
higher-order
functions
parameterized
by
a
function
φ : Row → Node
.
This
function
φ
(r)
row_node
(r)
which
maps
any
row
to
a
unique
node
IRI
or
Blank
Node
.
φ
is
formally
defined
by
the
following
axioms:
|
|
: |
∀
|
|
[37] |
row_node
|
= |
if
(pk(R)
≠
∅)
then
| |
|
||||
⟦ ⟧
tableIRI
|
: |
TableName
→
IRI
| ||
|
⟦t⟧
tableIRI
|
= | ue ( tablename (t)) | |
[38] |
| = |
IRI
(R.name)
|
|
the relative IRI consisting of the percent-encoded form of the table name. | ||||
⟦ , ⟧
litcol
|
: |
(
Row
,
Column
)
→
IRI
|
||
[39] |
|
=
|
ue
(
tablename
(
table
(r)))
+
'#'
+
ue
(c))
|
|
[39] |
literal_property_IRI(R,
A)
| = |
IRI
(UE(R.name)
+
"#"
+
UE(A.name))
|
|
the
concatenation
of:
| ||||
⟦ , ⟧
refcol
|
: |
(
Row
,
ForeignKey
)
→
IRI
|
||
|
|
= |
let
(from*,
reftable,
to*)
=
fk
in
|
|
[40] |
reference_property_IRI(R,
As)
| = |
IRI
(UE(R.name)
+
"#ref-"
+
join('.',
UE(A.name))
∣
A
∈
As
)
| |
the
concatenation
of:
|
The
Direct
Mapping
is
defined
by
induction
on
the
structure
of
RDB.
Thus
it
is
defined
for
any
relational
database.
The
entry
point
for
the
Direct
Mapping
is
the
function
direct_graph
(r)
.⟦ ⟧
⟦ ⟧
φ
database
.
|
|
: |
Database
→
Graph
|
|
|
⟦db⟧
φ
database
|
= |
{
triple
|
triple
∈
⟦t⟧
φ
table
|
t
∈
db
}
|
|
[41] |
direct_graph()
| = |
{
table_graph
(R)
∣
R
∈
DB
}
| |
the
union
of
the
|
||||
|
|
: |
Table
→
|
|
|
⟦t⟧
φ
table
|
= |
{
triple
|
triple
∈
⟦r⟧
φ
row
|
r
∈
body(t)
}
|
|
[42] |
table_graph(R)
| = |
{
row_graph
(T,
R)
∣
T
∈
R.Body
}
| |
the
union
of
the
row
graph
s
for
each
row
in
| ||||
noNULLs
| : |
Row
→
ForeignKey
→
Boolean
|
||
[43] |
noNULLs(r,
fk)
| = |
let
(columnNames,
_,
_)
=
fk
in
|
|
[43] |
noNULLs(T,
As)
| = |
∄(T(A)
=
Null
∣
A
∈
As)
| |
|
: |
Row
→
|
||
|
⟦r⟧
φ
row
|
= |
let
s
=
|
|
[44] |
row_graph(T,
R)
| = |
{
type_triple
(R)
}
| |
an
RDF
graph
consisting
of
the
|
||||
|
|
: |
|
|
|
|
= |
let
|
|
[45] |
type_triple(R)
| = |
| |
an RDF triple with: | ||||
|
|
: |
|
|
[46] |
⟦r,
c⟧
lex
|
= |
let
s
=
φ
(r)
in
|
|
|
|
|
|
|
an
RDF
triple
with:
|
||||
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
an
RDF
triple
with:
|
In
this
section,
we
formally
present
the
Direct
Mapping
as
rules
in
Datalog
syntax.
syntax,
inspired
by
previous
approach
[SQL2SW]
[DMSurvey]
.
The
left
hand
side
of
each
rule
is
the
RDF
Triple
output.
The
right
hand
side
of
each
rule
consists
of
a
sequence
of
predicates
from
the
relational
database
and
built-in
predicates.
The
built-in
predicates
are
divided
into
three
four
groups.
The
first
group
contains
some
built-in
predicates
for
dealing
with
repeated
rows
in
a
table
without
a
primary
key.
The second group contains a predicate to deal with null values.
Finally,
the
The
third
group
of
built-in
predicates
is
used
to
generate
IRIs
for
identifying
tables
and
the
columns
in
a
table,
and
to
generate
IRIs
or
blank
nodes
for
identifying
each
row
in
a
table.
Finally, the fourth group of built-in predicates is used to generate typed literals.
Throughout the section, boxes containing Direct Mapping rules and examples will appear. These boxes are color-coded. Yellow boxes contain Direct Mapping rules:
This box contains a Direct Mapping rule
Green boxes contain examples of applying the previous Direct Mapping rule:
This box contains examples of applying a Direct Mapping rule
Consider
again
the
example
from
Section
Direct
Mapping
Example
.
It
should
be
noticed
that
in
the
rules
presented
in
this
section,
a
formula
of
the
form
Addresses(X,
Y,
Z)
indicates
that
the
variables
X,
Y
and
Z
are
used
to
store
the
values
of
a
row
in
the
three
columns
of
the
table
Addresses
(according
to
the
order
specified
in
the
schema
of
the
table,
that
is,
X,
Y
and
Z
store
the
values
of
ID,
city
and
state,
respectively).
In
particular,
uppercase
letters
like
X,
Y,
Z,
S,
P
and
O
are
used
to
denote
variables.
Moreover,
double
quotes
are
used
in
the
rules
to
refer
to
the
string
with
the
name
of
a
table
or
a
column.
For
example,
a
formula
of
the
form
generateRowIRI("Addresses",
["ID"],
[X],
S)
is
used
to
generate
the
Row
RDF
Node
row
node
(or
Row
IRI)
for
the
row
of
table
"Addresses"
whose
value
in
the
primary
key
"ID"
is
the
value
stored
in
the
variable
X.
The
value
of
this
Row
IRI
is
stored
in
the
variable
S.
Assume
that
r
is
a
table
with
columns
a
1
,
...,
a
m
and
such
that
[a
p
1
,
...,
a
p
n
]
is
the
primary
key
of
r,
where
1
≤
n
≤
m
and
1
≤
p
1
<
...
<
p
n
≤
m.
Then
the
following
is
the
direct
mapping
rule
to
generate
Table
Triples
row
type
triples
from
r:
Triple(S, "rdf:type", O) ← r(X], S), generateTableIRI("r", O)
Triple(S, "rdf:type", O) ← r(X1, ..., Xm), generateRowIRI("r", ["ap1", ..., "apn"], [Xp1, ..., Xpn], S), generateTableIRI("r", O)
For
example,
table
Addresses
in
the
Direct
Mapping
Example
has
columns
ID
,
city
and
state
,
and
it
has
column
ID
as
its
primary
key.
Then
the
following
is
the
direct
mapping
rule
to
generate
Table
Triples
row
type
triples
from
Addresses
:
Triple(S, "rdf:type", O) ← Addresses(X1, X2, X3), generateRowIRI("Addresses", ["ID"], [X1], S), generateTableIRI("Addresses", O)
As
a
second
example,
consider
table
Department
from
the
example
in
Section
Foreign
keys
referencing
candidate
keys
,
which
has
columns
ID
,
name
,
city
and
manager
,
and
assume
that
(
name
,
city
)
is
the
multi-column
primary
key
of
this
table
(instead
of
ID
).
Then
the
following
is
the
direct
mapping
rule
to
generate
Table
Triples
row
type
triples
from
Department
:
Triple(S, "rdf:type", O) ← Department(X], S), generateTableIRI("Department", O)
Triple(S, "rdf:type", O) ← Department(X1, X2, X3, X4), generateRowIRI("Department", ["name","city"], [X2, X3], S), generateTableIRI("Department", O)
Assume
that
r
is
a
table
with
columns
a
1
,
...,
a
m
and
such
that
r
does
not
have
a
primary
key.
Then
the
following
is
the
direct
mapping
rule
to
generate
Table
Triples
row
type
triples
from
r:
Triple(S, "rdf:type", O) ← r(X], V, S),
Triple(S, "rdf:type", O) ← r(X1, ..., Xm), card("r", [X1, ..., Xm], U), V ≤ U, generateRowBlankNode("r", [X1, ..., Xm], V, S), generateTableIRI("r", O)
For
example,
table
Tweets
from
Section
Empty
(non-existent)
primary
keys
has
columns
tweeter
,
when
and
text
,
and
it
does
not
have
a
primary
key.
Then
the
following
is
the
direct
mapping
rule
to
generate
Table
Triples
row
type
triples
from
Tweets
:
Triple(S, "rdf:type", O) ← Tweets(X], V, S),
Triple(S, "rdf:type", O) ← Tweets(X1, X2, X3), card("Tweets", [X1, X2, X3], U), V ≤ U, generateRowBlankNode("Tweets", [X1, X2, X3], V, S), generateTableIRI("Tweets", O)
Assume
that
r
is
a
table
with
columns
a
1
,
...,
a
m
and
such
that
[a
p
1
,
...,
a
p
n
]
is
the
primary
key
of
r,
where
1
≤
n
≤
m
and
1
≤
p
1
<
...
<
p
n
≤
m.
Then
for
every
a
j
(1
≤
j
≤
m)
that
is
not
the
only
constituent
of
a
foreign
key
of
r
or
is
the
only
constituent
of
a
foreign
key
of
r
that
references
a
candidate
key,
m),
the
direct
mapping
includes
the
following
rule
for
r
and
a
j
to
generate
Literal
Triples:
Triple(S, P, X"], P)
literal
triples
:
Triple(S, P, V) ← r(X1, ..., Xm), nonNull(Xj), generateRowIRI("r", ["ap1", ..., "apn"], [Xp1, ..., Xpn], S), generateLiteralPropertyIRI("r", "aj", P), generateTypedLiteral(Xj, "aj", "r", V)
For
example,
table
Addresses
in
the
Direct
Mapping
Example
has
columns
ID
,
city
and
state
,
and
it
has
column
ID
as
its
primary
key.
Then
the
following
are
the
direct
mapping
rules
to
generate
Literal
Triples
literal
triples
from
Addresses
:
Triple(S, P, X], S), generateColumnIRI("Addresses", ["ID"], P)
Triple(S, P, X], S), generateColumnIRI("Addresses", ["city"], P)
Triple(S, P, X], S), generateColumnIRI("Addresses", ["state"], P)
Triple(S, P, V) ← Addresses(X1, X2, X3), nonNull(X1), generateRowIRI("Addresses", ["ID"], [X1], S), generateLiteralPropertyIRI("Addresses", "ID", P), generateTypedLiteral(X1, "ID", "Addresses", V) Triple(S, P, V) ← Addresses(X1, X2, X3), nonNull(X2), generateRowIRI("Addresses", ["ID"], [X1], S), generateLiteralPropertyIRI("Addresses", "city", P), generateTypedLiteral(X2, "city", "Addresses", V) Triple(S, P, V) ← Addresses(X1, X2, X3), nonNull(X3), generateRowIRI("Addresses", ["ID"], [X1], S), generateLiteralPropertyIRI("Addresses", "state", P), generateTypedLiteral(X3, "state", "Addresses", V)
As
a
second
example,
consider
again
table
Department
from
the
example
in
Section
Foreign
keys
referencing
candidate
keys
,
which
has
columns
ID
,
name
,
city
and
manager
,
and
assume
that
(
name
,
city
)
is
the
multi-column
primary
key
of
this
table
(instead
of
ID
).
Then
the
following
are
the
direct
mapping
rules
to
generate
Literal
Triples
literal
triples
from
Department
:
Triple(S, P, X], S),
generateColumnIRI("Department", ["name"], P)
Triple(S, P, X], S),
generateColumnIRI("Department", ["city"], P)
Triple(S, P, X], S),
generateColumnIRI("Department", ["ID"], P)
Triple(S, P, V) ← Department(X1, X2, X3, X4), nonNull(X1), generateRowIRI("Department", ["name", "city"], [X2, X3], S), generateLiteralPropertyIRI("Department", "ID", P), generateTypedLiteral(X1, "ID", "Department", V) Triple(S, P, V) ← Department(X1, X2, X3, X4), nonNull(X2), generateRowIRI("Department", ["name", "city"], [X2, X3], S), generateLiteralPropertyIRI("Department", "name", P), generateTypedLiteral(X2, "name", "Department", V) Triple(S, P, V) ← Department(X1, X2, X3, X4), nonNull(X3), generateRowIRI("Department", ["name", "city"], [X2, X3], S), generateLiteralPropertyIRI("Department", "city", P), generateTypedLiteral(X3, "city", "Department", V) Triple(S, P, V) ← Department(X1, X2, X3, X4), nonNull(X4), generateRowIRI("Department", ["name", "city"], [X2, X3], S), generateLiteralPropertyIRI("Department", "manager", P), generateTypedLiteral(X4, "manager", "Department", V)
Assume
that
r
is
a
table
with
columns
a
1
,
...,
a
m
and
such
that
r
does
not
have
a
primary
key.
Then
for
every
a
j
(1
≤
j
≤
m)
that
is
not
the
only
constituent
of
a
foreign
key
of
r
or
is
the
only
constituent
of
a
foreign
key
of
r
that
references
a
candidate
key,
m),
the
direct
mapping
includes
the
following
rule
for
r
and
a
j
to
generate
Literal
Triples:
Triple(S, P, X], V, S),
generateColumnIRI("r", ["a"], P)
literal
triples
:
Triple(S, P, V) ← r(X1, ..., Xm), nonNull(Xj), card("r", [X1, ..., Xm], U), V ≤ U, generateRowBlankNode("r", [X1, ..., Xm], V, S), generateLiteralPropertyIRI("r", "aj", P), generateTypedLiteral(Xj, "aj", "r", V)
For
example,
table
Tweets
from
Section
Empty
(non-existent)
primary
keys
has
columns
tweeter
,
when
and
text
,
and
it
does
not
have
a
primary
key.
Then
the
following
are
the
direct
mapping
rules
to
generate
Literal
Triples
literal
triples
from
Tweets
:
Triple(S, P, X], V, S),
generateColumnIRI("Tweets", ["when"], P)
Triple(S, P, X], V, S),
generateColumnIRI("Tweets", ["text"], P)
Triple(S, P, V) ← Tweets(X1, X2, X3), nonNull(X1), card("Tweets", [X1, X2, X3], U), V ≤ U, generateRowBlankNode("Tweets", [X1, X2, X3], V, S), generateLiteralPropertyIRI("Tweets", "tweeter", P), generateTypedLiteral(X1, "tweeter", "Tweets", V) Triple(S, P, V) ← Tweets(X1, X2, X3), nonNull(X2), card("Tweets", [X1, X2, X3], U), V ≤ U, generateRowBlankNode("Tweets", [X1, X2, X3], V, S), generateLiteralPropertyIRI("Tweets", "when", P), generateTypedLiteral(X2, "when", "Tweets", V) Triple(S, P, V) ← Tweets(X1, X2, X3), nonNull(X3), card("Tweets", [X1, X2, X3], U), V ≤ U, generateRowBlankNode("Tweets", [X1, X2, X3], V, S), generateLiteralPropertyIRI("Tweets", "text", P), generateTypedLiteral(X3, "text", "Tweets", V)
For each foreign key from a table r 1 to a table r 2 , one of the following four cases is applied.
Assume that:
r 1 is a table with columns a 1 , ..., a i and such that [a p 1 , ..., a p j ] is the primary key of r 1 , where 1 ≤ j ≤ i and 1 ≤ p 1 < ... < p j ≤ i
r 2 is a table with columns c 1 , ..., c k and such that [c q 1 , ..., c q m ] is the primary key of r 2 , where 1 ≤ m ≤ k and 1 ≤ q 1 < ... < q m ≤ k
the
foreign
key
indicates
that
the
columns
a
p
s
1
,
...,
a
p
s
n
of
r
1
reference
the
columns
c
q
t
1
,
...,
c
q
t
n
of
r
2
,
where
(1)
1
≤
p
s
1
,
...,
p
s
n
≤
i,
(2)
1
≤
q
t
1
,
...,
q
t
n
≤
k,
and
(3)
n
≥
1
Then
the
direct
mapping
includes
the
following
rule
for
r
1
and
r
2
to
generate
Reference
Triples:
Triple(S, P, O) ← r], S),
r], O),
nonNull(X"], P)
Triple(S, P, O) ← r1(X1, ..., Xi), generateRowIRI("r1", ["ap1", ..., "apj"], [Xp1, ..., Xpj], S), r2(Y1, ..., Yk), generateRowIRI("r2", ["cq1", ..., "cqm"], [Yq1, ..., Yqm], O), nonNull(Xs1), ..., nonNull(Xsn), Xs1 = Yt1, ..., Xsn = Ytn, generateReferencePropertyIRI("r1", ["as1", ..., "asn"], P)
For
example,
...
to-do
...
table
Addresses
in
the
Direct
Mapping
Example
has
columns
ID
,
city
and
state
,
where
column
ID
is
the
primary
key.
Table
People
in
this
example
has
columns
ID
,
fname
and
addr
,
where
column
ID
is
the
primary
key,
and
it
has
a
foreign
key
in
the
column
addr
that
references
the
column
ID
in
the
table
Addresses
.
In
this
case,
the
following
is
the
direct
mapping
rule
to
generate
Reference
Triples
:
Triple(S, P, O) ← People(X1, X2, X3), generateRowIRI("People", ["ID"], [X1], S), Addresses(Y1, Y2, Y3), generateRowIRI("Addresses", ["ID"], [Y1], O), nonNull(X3), X3 = Y1, generateReferencePropertyIRI("People", ["addr"], P)
Assume that:
r 1 is a table with columns a 1 , ..., a i and such that [a p 1 , ..., a p j ] is the primary key of r 1 , where 1 ≤ j ≤ i and and 1 ≤ p 1 < ... < p j ≤ i
r 2 is a table with columns c 1 , ..., c k , and it does not have a primary key
the
foreign
key
indicates
that
the
columns
a
p
s
1
,
...,
a
p
s
n
of
r
1
reference
the
columns
c
q
t
1
,
...,
c
q
t
n
of
r
2
,
where
(1)
1
≤
p
s
1
,
...,
p
s
n
≤
i,
(2)
1
≤
q
t
1
,
...,
q
t
n
≤
k,
and
(3)
n
≥
1
Then
the
direct
mapping
includes
the
following
rule
for
r
1
and
r
2
to
generate
Reference
Triples:
Triple(S, P, O) ← r], S),
r, O),
nonNull(X"], P)
Triple(S, P, O) ← r1(X1, ..., Xi), generateRowIRI("r1", ["ap1", ..., "apj"], [Xp1, ..., Xpj], S), r2(Y1, ..., Yk), card("r2", [Y1, ..., Yk], U), V ≤ U, generateRowBlankNode("r2", [Y1, ..., Yk], V, O), nonNull(Xs1), ..., nonNull(Xsn), Xs1 = Yt1, ..., Xsn = Ytn, generateReferencePropertyIRI("r1", ["as1", ..., "asn"], P)
For
example,
...
to-do
...
assume
that
table
Addresses
in
the
Direct
Mapping
Example
has
columns
ID
,
city
and
state
,
and
that
column
ID
is
a
candidate
key
(instead
of
a
primary
key),
so
that
table
Addresses
does
not
have
a
primary
key.
Moreover,
assume
that
table
People
in
this
example
has
columns
ID
,
fname
and
addr
,
it
has
column
ID
as
its
primary
key,
and
it
has
a
foreign
key
in
the
column
addr
to
the
candidate
key
ID
in
the
table
Addresses
.
In
this
case,
the
following
is
the
direct
mapping
rule
to
generate
Reference
Triples
:
Triple(S, P, O) ← People(X1, X2, X3), generateRowIRI("People", ["ID"], [X1], S), Addresses(Y1, Y2, Y3), card("Addresses", [Y1, Y2, Y3], U), V ≤ U, generateRowBlankNode("Addresses", [Y1, Y2, Y3], V, O), nonNull(X3), X3 = Y1, generateReferencePropertyIRI("People", ["addr"], P)
Assume that:
r 1 is a table with columns a 1 , ..., a i , and it does not have a primary key
r 2 is a table with columns c 1 , ..., c k and such that [c q 1 , ..., c q m ] is the primary key of r 2 , where 1 ≤ m ≤ k and 1 ≤ q 1 < ... < q m ≤ k
the
foreign
key
indicates
that
the
columns
a
p
s
1
,
...,
a
p
s
n
of
r
1
reference
the
columns
c
q
t
1
,
...,
c
q
t
n
of
r
2
,
where
(1)
1
≤
p
s
1
,
...,
p
s
n
≤
i,
(2)
1
≤
q
t
1
,
...,
q
t
n
≤
k,
and
(3)
n
≥
1
Then
the
direct
mapping
includes
the
following
rule
for
r
1
and
r
2
to
generate
Reference
Triples:
Triple(S, P, O) ← r, S),
r], O),
nonNull(X"], P)
Triple(S, P, O) ← r1(X1, ..., Xi), card("r1", [X1, ..., Xi], U), V ≤ U, generateRowBlankNode("r1", [X1, ..., Xi], V, S), r2(Y1, ..., Yk), generateRowIRI("r2", ["cq1", ..., "cqm"], [Yq1, ..., Yqm], O), nonNull(Xs1), ..., nonNull(Xsn), Xs1 = Yt1, ..., Xsn = Ytn, generateReferencePropertyIRI("r1", ["as1", ..., "asn"], P)
For
example,
...
to-do
...
table
People
in
the
Direct
Mapping
Example
has
columns
ID
,
fname
and
addr
,
and
it
has
column
ID
as
its
primary
key,
while
table
Tweets
from
Section
Empty
(non-existent)
primary
keys
has
columns
tweeter
,
when
and
text
,
it
does
not
have
a
primary
key,
and
it
has
a
foreign
key
in
column
tweeter
that
references
column
ID
in
table
People
.
In
this
case,
the
following
is
the
direct
mapping
rule
to
generate
Reference
Triples
:
Triple(S, P, O) ← Tweets(X1, X2, X3), card("Tweets", [X1, X2, X3], U), V ≤ U, generateRowBlankNode("Tweets", [X1, X2, X3], V, S), People(Y1, Y2, Y3), generateRowIRI("People", ["ID"], [Y1], O), nonNull(X1), X1 = Y1, generateReferencePropertyIRI("Tweets", ["tweeter"], P)
Assume that:
r 1 is a table with columns a 1 , ..., a i , and it does not have a primary key
r 2 is a table with columns c 1 , ..., c k , and it does not have a primary key
the
foreign
key
indicates
that
the
columns
a
p
s
1
,
...,
a
p
s
n
of
r
1
reference
the
columns
c
q
t
1
,
...,
c
q
t
n
of
r
2
,
where
(1)
1
≤
p
s
1
,
...,
p
s
n
≤
i,
(2)
1
≤
q
t
1
,
...,
q
t
n
≤
k,
and
(3)
n
≥
1
Then
the
direct
mapping
includes
the
following
rule
for
r
1
and
r
2
to
generate
Reference
Triples:
Triple(S, P, O) ← r, S),
r, O),
nonNull(X"], P)
Triple(S, P, O) ← r1(X1, ..., Xi), card("r1", [X1, ..., Xi], U1), V1 ≤ U1, generateRowBlankNode("r1", [X1, ..., Xi], V1, S), r2(Y1, ..., Yk), card("r2", [Y1, ..., Yk], U2), V2 ≤ U2, generateRowBlankNode("r2", [Y1, ..., Yk], V2, O), nonNull(Xs1), ..., nonNull(Xsn), Xs1 = Yt1, ..., Xsn = Ytn, generateReferencePropertyIRI("r1", ["as1", ..., "asn"], P)
For
example,
...
to-do
...
...
to-do
...
5
References
assume
that
table
People
in
the
Direct
Mapping
Example
SPARQL
SPARQL
Query
Language
for
RDF,
Eric
Prud'hommeaux
has
columns
ID
,
fname
and
Andy
Seaborne
2008.
(See
http://www.w3.org/TR/rdf-sparql-query/.)
SQLFW
SQL.
ISO/IEC
9075-1:2008
SQL
–
Part
1:
Framework
(SQL/Framework)
International
Organization
for
Standardization,
27
January
2009.
SQLFN
ISO/IEC
9075-2:2008
SQL
–
Part
2:
Foundation
(SQL/Foundation)
International
Organization
for
Standardization,
27
January
2009.
RDF-concepts
Resource
Description
Framework
(RDF):
Concepts
addr
,
and
Abstract
Syntax,
G.
Klyne,
J.
J.
Carroll,
Editors,
W3C
Recommendation,
10
February
2004
(See
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/.)
that
column
ID
is
a
candidate
key
(instead
of
a
primary
key),
so
that
People
does
not
have
a
primary
key.
Moreover,
assume
that
table
Tweets
from
Section
Empty
(non-existent)
primary
keys
ReuseableIDs
Reusable
Identifiers
has
columns
tweeter
,
when
and
text
,
it
does
not
have
a
primary
key,
and
it
has
a
foreign
in
column
tweeter
that
references
candidate
key
ID
in
table
People
.
In
this
case,
the
RDB2RDF
following
is
the
direct
mapping
language,
Michael
Hausenblas
and
Themis
Palpanas,
2009.
(See
http://esw.w3.org/topic/Rdb2RdfXG/ReusableIdentifier.)
URI
RFC3986
-
Uniform
Resource
Identifier
(URI):
Generic
Syntax
(See
http://tools.ietf.org/html/rfc3986.)
IRI
RFC3987
-
Internationalized
Resource
Identifier
(IRIs)
(See
http://tools.ietf.org/html/rfc3987.)
$Log: LC-to-CR.html,v $
Revision 1.1 2012/02/22 15:52:46 ivan
*** empty log message ***
Revision 1.1 2012/02/21 19:04:11 eric
cp ~/WWW/2001/sw/rdb2rdf/directMapping/LC/
Revision 1.1 2012/01/26 22:59:54 eric
CREATED: http://www.w3.org/2007/10/htmldiff?doc1=http%3A%2F%2Fwww.w3.org%2FTR%2F2011%2FWD-rdb-direct-mapping-20110324%2F&doc2=http%3A%2F%2F128.30.6.156%2F2001%2Fsw%2Frdb2rdf%2FdirectMapping%2FLC%2F
Revision 1.1 2011/03/23 22:17:52 bertails
+ snapshot of rdb-direct-mapping
Revision 1.21 2011/03/23 20:53:12 bertails
~ cleaning before moving to TR space
Revision 1.20 2011/03/17 23:16:34 eric
- fragments on node IRIs
Revision 1.19 2011/03/08 04:14:06 bertails
~ fix some typos
Revision 1.18 2011/03/07 00:48:06 bertails
+ phi function mapping rows to RDF nodes
Revision 1.17 2011/03/07 00:13:06 bertails
+ RDB accessor functions
Revision 1.16 2011/03/06 21:56:29 bertails
~ migrating to cleaner denotational semantics
Revision 1.15 2011/03/02 17:26:34 marenas
Datalog rules in Section 4 were simplified
Revision 1.14 2011/03/01 02:35:49 marenas
Section 4 now includes all the Datalog rules that define the direct mapping
Revision 1.13 2011/02/01 16:15:17 marenas
Section 4 now includes an example for each type of Datalog rule used to define the direct mapping
Revision 1.12 2011/01/27 01:16:42 marenas
New version of Datalog rules to deal with repeated tuples in a table without a primary key
Revision 1.11 2010/11/17 21:36:44 eric
~ validated HTML, CSS, links for publication
Revision 1.10 2010/11/16 17:45:35 eric
~ xml well-formed
Revision 1.9 2010/11/16 17:43:47 eric
~ 2010-11-16T17:34:38Z <ericP> mhausenblas: s/very simple direct mapping/direct mapping/
~ re-title notation
~ addressed nunolopes's issue with rule 23
~ text from #rdb2rdf 2010-11-16T17:40:18Z <juansequeda>...
Revision 1.8 2010/11/16 17:30:18 eric
~ fixed Notation title
Revision 1.7 2010/11/16 17:25:52 eric
~ re-oranized algebra section
Revision 1.6 2010/11/16 17:22:39 eric
~ re-ordered authors
Revision 1.5 2010/11/11 18:39:27 marenas
rephrasing the definition of table tuples
Revision 1.4 2010/11/11 17:58:25 marenas
New Section 2.2: "Preliminaries: Generating IRIs"
Examples are now grouped in Section 2.4: "Additional Examples and Corner Cases"
Revision 1.3 2010/11/10 14:54:48 marenas
Removing "(Editor)" from the list of authors
Revision 1.2 2010/11/10 12:56:22 eric
+
Revision 1.1 2010/11/10 02:51:03 eric
moved from ../directMapping
Revision 1.56 2010/11/10 02:47:08 eric
~ well-formedness error
Revision 1.55 2010/11/10 02:45:37 eric
~ finished adopting the all-relative-IRI model in order to sync with the merged text from alt/
~ adopted "direct" mapping per the resolution of the 2010-11-09 telecon
~ made Juan and Marcello editors instead of authors
~ fixed a couple typos
: I believe this specification follows the intent of:
RESOLUTION: http://www.w3.org/2001/sw/rdb2rdf/directMapping/ with Juan,
Marcelo and Eric as editors based on Richard's proposal as of
http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Nov/0052.html and
try to work in J&M's IRI and Triple generations part; move hierarchical
table and the M-M mappings into Ed note; datalog as a separate section; Eric
perform merge with review/approval/consensus of Juan, Marcelo, & Eric
Revision 1.54 2010/11/09 22:46:56 eric
+
Revision 1.53 2010/11/09 22:41:02 eric
~ date
Revision 1.52 2010/11/09 22:39:12 eric
~ s/stem/base/g
+ inclusion of collapsible sections from alt/
Revision 1.51 2010/11/09 15:39:46 eric
~ removed collapsible sections per request mid:AANLkTikvnrgXuu5fDAw+c2nUv5ENkmngPAJJ05c2gASk@mail.gmail.com
Revision 1.50 2010/11/09 15:06:34 eric
+ exp sections
Revision 1.49 2010/11/09 14:12:08 eric
~ addressed
Revision 1.48 2010/11/09 04:11:35 eric
~ addressed
+ inclusion of some explanatory details from
Revision 1.47 2010/11/04 12:42:21 eric
~ working on style for editorial choices
Revision 1.46 2010/11/04 06:10:08 eric
~ hilit triples in query in Use of Direct Mapping
Revision 1.45 2010/11/04 05:42:55 eric
~ incorporated
Revision 1.44 2010/11/02 08:18:07 eric
~ updates per DanC's feedback
Revision 1.43 2010/10/29 03:10:12 eric
~ s/relational terminology/SQL terminology/
Revision 1.42 2010/10/17 13:46:48 eric
+ SQL constraints
Revision 1.41 2010/10/12 14:21:36 eric
~ renumbered
Revision 1.40 2010/10/12 12:14:52 eric
+ SQL for example 1
Revision 1.39 2010/10/11 03:12:21 eric
~ prettied up mutual-hilights
Revision 1.38 2010/10/10 22:09:55 eric
+ pfkexception
Revision 1.37 2010/10/10 14:25:41 eric
~ re-worked front-loaded informative rules
Revision 1.36 2010/10/10 11:59:01 eric
~ prettied-up pre@class=turtle
~ experimenting with new presentation of transformation rules
~ validated XSLT output
Revision 1.35 2010/10/09 15:12:40 eric
+ crosslinks for hier-tabl
Revision 1.34 2010/10/09 14:52:31 eric
+ crosslinks for ref-no-pk
Revision 1.33 2010/10/09 13:45:17 eric
~ symmetric xrefs between tables and triples for emp-addr and multi-key
Revision 1.32 2010/10/08 21:59:54 eric
+ hilights
Revision 1.31 2010/09/29 19:53:37 eric
~ align with https://dvcs.w3.org/hg/stemGraph/
Revision 1.30 2010/09/29 15:13:18 eric
~ align with https://dvcs.w3.org/hg/stemGraph/rev/75cf39ef7d74
Revision 1.29 2010/09/29 03:34:55 eric
+ 2nd gen hierarchical example
Revision 1.28 2010/09/28 03:10:53 eric
validation
Revision 1.27 2010/09/28 03:08:52 eric
+ hierarchical (untested)
Revision 1.26 2010/09/27 21:49:18 eric
~ XML validation (per xsltproc)
Revision 1.25 2010/09/27 21:46:42 eric
~ fixed reference table name
Revision 1.24 2010/09/27 18:48:46 eric
+ noticed another key in ref-no-pk
Revision 1.23 2010/09/27 18:13:03 eric
+ ref-no-pk
Revision 1.22 2010/09/27 14:50:44 eric
+ nodemap
+ a rough pass on <scala>scala code</scala>
Revision 1.21 2010/09/26 04:50:07 eric
~ fix load state for syntax display
Revision 1.20 2010/09/25 18:40:39 eric
+ some tips
Revision 1.19 2010/09/24 16:34:02 eric
+ some tips
Revision 1.18 2010/09/24 16:00:53 eric
+ some tips
Revision 1.17 2010/09/24 15:50:41 eric
+ buttons for different languages
Revision 1.16 2010/09/07 12:14:44 eric
~ fixed pk invocation errors per mid:04C1B62C-42A5-424C-974B-6E894ED7B11A@cyganiak.de
Revision 1.15 2010/08/30 18:37:19 eric
+ section
Revision 1.14 2010/08/30 14:05:45 eric
+ fks
Revis
rule
to
generate
Reference
Triples
:
Triple(S, P, O) ← Tweets(X1, X2, X3), card("Tweets", [X1, X2, X3], U1), V1 ≤ U1, generateRowBlankNode("Tweets", [X1, X2, X3], V1, S),