1 Abstract Data Models
The RDB and RDF Abstract Data Types (ADTs) make use of the commonly defined ADTs Set, List and MultiSet, used here as type constructors. For example, Set(A) denotes the type for the sets of elements of type A. We assume that they come with their common operations, such as the function size : Set → Int.
We follow a type-as-specification approach, thus the ADTs are actually dependent types. For example, { s:Set(A) | size(s) ≤ 1 } is a subtype of Set(A) with at most one element.
1.1 RDB Abstract Data Type
Database | ::= | Set(Table) |
Table | ::= | (Header, List(CandidateKey), Set(ForeignKey), Body) |
Header | ::= | Set((ColumnName, Datatype)) |
Body | ::= | MultiSet(Row) |
Row | ::= | Set((ColumnName, CellValue)) |
CellValue | ::= | LexicalValue | NULL |
ForeignKey | ::= | (List(ColumnName), Table, CandidateKey) |
CandidateKey | ::= | List(ColumnName) |
Datatype | ::= | Int | Float | Date | … |
TableName | ::= | String |
ColumnName | ::= | String |
|
1.2 RDB accessor functions
tablename | : | Table → String |
header | : | Table → Header |
candidateKeys | : | Table → List(CandidateKey) |
primaryKey | : | Table → {s:Set(CandidateKey) | size(s) ≤ 1} |
body | : | Table → Body |
datatype | : | {h:Header} → {c:ColumnName | ∃ d, (c,d) ∈ h)} → {d:Datatype | (c,d) ∈ h)} |
table | : | {r:Row} → {t:Table | r ∈ t} |
value | : | {r:Row} → {a:ColumnName | a ∈ r} → CellValue |
lexicals | : | {r:Row} → List({a:ColumnName | a ∈ r} → List(LexicalValue) |
references | : | {t:Table} → {r:Row | r ∈ t} → Set({fk:ForeignKey | fk ∈ references(table(r), r)}) |
scalars | : | {t:Table} → {r:Row | r ∈ t} → Set(ColumnName) |
dereference | : | {r:Row} → {fk:ForeignKey | fk ∈ references(table(r), r)} |
| | → {targetRow:Row | let (columnNames, targetTable, ck) = fk in |
| | targetRow ∈ body(targetTable) |
| | and ∀ (ci, vi) ∈ r, ∀ (c′i, v′i) ∈ targetRow, ci ∈ columnNames → c′i ∈ ck → vi = v′i) } |
|
1.3 RDF Abstract Data Type
Graph | ::= | Set(Triple) |
Triple | ::= | (Subject, Predicate, Object) |
Subject | ::= | IRI | BlankNode |
Predicate | ::= | IRI |
Object | ::= | IRI | BlankNode | Literal |
BlankNode | ::= | RDF blank node |
Literal | ::= | PlainLiteral | TypedLiteral |
PlainLiteral | ::= | lexicalForm | (lexicalForm, langageTag) |
TypedLiteral | ::= | (lexicalForm, IRI) |
IRI | ::= | RDF URI−reference [] 1 |
lexicalForm | ::= | a Unicode string 2 [] |
|
We don’t need to provide accessors as we are simply constructing RDF graphs and their components. In order to stay simple, we’ll use a Turtle [] like syntax for injecting elements in these types.
So armed, we set forth to define the Direct Mapping.
2 Direct Mapping
Inhabitants of RDB 1.1 are denoted by mathematical objects living in the RDF domain 1.1. We call this denotational semantics of RDB the Direct Mapping.
Most of the functions are higher-order functions, relying on a function φ : Row → Node. We assume that this function maps any Row to a unique row IRI. φ is formally defined by the following axioms:
| | ∀ db:Database, ∀ row:Row, | | | | | | | | | (1) |
| row ∈ db → primaryKey(row) ≠ ∅ → φ(row) is an IRI
| | | | | | | | | |
| ∀ db:Database, ∀ row:Row, | | | | | | | | | (2) |
| row ∈ db → primaryKey(row) = ∅ → φ(row) is a BlankNode
| | | | | | | | | |
|
2.1 Denotational semantics
The Direct Mapping is defined by induction on the structure of RDB.
Thus it is defined for any database and its execution is guarantied to terminate.
The entry point is ⟦ ⟧databaseφ. Note that not all the functions need to be parameterized by φ.
⟦ ⟧databaseφ | : | Database → Graph |
⟦db⟧databaseφ | = | { triple | triple ∈ ⟦t⟧tableφ | t ∈ db } |
⟦ ⟧tableφ | : | Table → Set(Triple) |
⟦t⟧tableφ | = | { triple | triple ∈ ⟦r⟧rowφ | r ∈ body(t) } |
⟦ ⟧rowφ | : | Row → Set(Triple) |
⟦r⟧rowφ | = | let s = φ(r) in |
| | { (s, p, o) | (p, o) ∈ ⟦r, fk⟧refφ | fk ∈ references(t) } |
| ⋃ | { (s, p, o) | (p, o) ∈ ⟦r, c⟧lex | c ∈ lexicals(t) } |
| ⋃ | (s, rdf : type, ue(tablename(table(r)))) |
⟦r, fk⟧refφ | : | (Row, ForeignKey) → (Predicate, Object) |
⟦r, fk⟧refφ | = | let p = ⟦table(r), fk⟧col in |
| | let targetRow = dereference(r, fk) in |
| | let o = φ(targetRow) in |
| | (p, o) |
⟦ , ⟧lex | : | (Row, Column) → { s:Set((Predicate, Object)) | size(s) ≤ 1 } |
⟦r, c⟧lex | = | let p = ⟦table(r), c⟧col in |
| | let v = value(r, c) in |
| | let d = datatype(header(table(r))(c)) in |
| | if v is NULL then empty set |
| | else if d is String then {(p, v)} |
| | else let datatype_iri = ⟦d⟧datatype in |
| | {(p, (v, datatype_iri))} |
⟦ , ⟧col | : | (Row, List[Column]) → IRI |
⟦r, c*⟧col | = | ue(tablename(table(r))) + ‵#‵ + c1 + ‵_‵ + ⋯ + ‵_‵ + cn |
⟦ ⟧datatype | : | Datatype → IRI |
⟦Int⟧datatype | = | XSD : integer |
⟦Float⟧datatype | = | XSD : float |
⟦Date⟧datatype | = | XSD : date |
ue | : | String → String |
ue(s) | = | url_encode(s) [] |
|
This document was translated from LATEX by
HEVEA.