Direct Mapping: denotational demantics

1 Abstract Data Models

The RDB and RDF Abstract Data Types (ADTs) make use of the commonly defined ADTs Set, List and MultiSet, used here as type constructors. For example, Set(A) denotes the type for the sets of elements of type A. We assume that they come with their common operations, such as the function size : Set → Int.

We follow a type-as-specification approach, thus the ADTs are actually dependent types. For example, { s:Set(A) | size(s) ≤ 1 } is a subtype of Set(A) with at most one element.

1.1 RDB Abstract Data Type

Database	::=	Set(Table)
Table	::=	(Header, List(CandidateKey), Set(ForeignKey), Body)
Header	::=	Set((ColumnName, Datatype))
Body	::=	MultiSet(Row)
Row	::=	Set((ColumnName, CellValue))
CellValue	::=	LexicalValue \| NULL
ForeignKey	::=	(List(ColumnName), Table, CandidateKey)
CandidateKey	::=	List(ColumnName)
Datatype	::=	Int \| Float \| Date \| …
TableName	::=	String
ColumnName	::=	String

1.2 RDB accessor functions

tablename	:	Table → String
header	:	Table → Header
candidateKeys	:	Table → List(CandidateKey)
primaryKey	:	Table → {s:Set(CandidateKey) \| size(s) ≤ 1}
body	:	Table → Body
datatype	:	{h:Header} → {c:ColumnName \| ∃ d, (c,d) ∈ h)} → {d:Datatype \| (c,d) ∈ h)}
table	:	{r:Row} → {t:Table \| r ∈ t}
value	:	{r:Row} → {a:ColumnName \| a ∈ r} → CellValue
lexicals	:	{r:Row} → List({a:ColumnName \| a ∈ r} → List(LexicalValue)
references	:	{t:Table} → {r:Row \| r ∈ t} → Set({fk:ForeignKey \| fk ∈ references(table(r), r)})
scalars	:	{t:Table} → {r:Row \| r ∈ t} → Set(ColumnName)
dereference	:	{r:Row} → {fk:ForeignKey \| fk ∈ references(table(r), r)}
		→ {targetRow:Row \| *let* (columnNames, targetTable, ck) = fk in
		targetRow ∈ body(targetTable)
		*and* ∀ (c_i, v_i) ∈ r, ∀ (c′_i, v′_i) ∈ targetRow, c_i ∈ columnNames → c′_i ∈ ck → v_i = v′_i) }

1.3 RDF Abstract Data Type

Graph	::=	Set(Triple)
Triple	::=	(Subject, Predicate, Object)
Subject	::=	IRI \| BlankNode
Predicate	::=	IRI
Object	::=	IRI \| BlankNode \| Literal
BlankNode	::=	RDF blank node
Literal	::=	PlainLiteral \| TypedLiteral
PlainLiteral	::=	lexicalForm \| (lexicalForm, langageTag)
TypedLiteral	::=	(lexicalForm, IRI)
IRI	::=	RDF URI−reference [] ¹
lexicalForm	::=	a Unicode string ² []

We don’t need to provide accessors as we are simply constructing RDF graphs and their components. In order to stay simple, we’ll use a Turtle [] like syntax for injecting elements in these types.

So armed, we set forth to define the Direct Mapping.

2 Direct Mapping

Inhabitants of RDB 1.1 are denoted by mathematical objects living in the RDF domain 1.1. We call this denotational semantics of RDB the Direct Mapping.

Most of the functions are higher-order functions, relying on a function φ : Row → Node. We assume that this function maps any Row to a unique row IRI. φ is formally defined by the following axioms:

	∀ db:Database, ∀ row:Row,									(1)
	row ∈ db → primaryKey(row) ≠ ∅ → φ(row) is an IRI
	∀ db:Database, ∀ row:Row,									(2)
	row ∈ db → primaryKey(row) = ∅ → φ(row) is a BlankNode

2.1 Denotational semantics

The Direct Mapping is defined by induction on the structure of RDB. Thus it is defined for any database and its execution is guarantied to terminate.

The entry point is ⟦ ⟧_database^φ. Note that not all the functions need to be parameterized by φ.

⟦ ⟧_database^φ	:	Database → Graph
⟦db⟧_database^φ	=	{ triple \| triple ∈ ⟦t⟧_table^φ \| t ∈ db }
⟦ ⟧_table^φ	:	Table → Set(Triple)
⟦t⟧_table^φ	=	{ triple \| triple ∈ ⟦r⟧_row^φ \| r ∈ body(t) }
⟦ ⟧_row^φ	:	Row → Set(Triple)
⟦r⟧_row^φ	=	*let* s = φ(r) in
		{ (s, p, o) \| (p, o) ∈ ⟦r, fk⟧_ref^φ \| fk ∈ references(t) }
	⋃	{ (s, p, o) \| (p, o) ∈ ⟦r, c⟧_lex \| c ∈ lexicals(t) }
	⋃	(s, rdf : type, ue(tablename(table(r))))
⟦r, fk⟧_ref^φ	:	(Row, ForeignKey) → (Predicate, Object)
⟦r, fk⟧_ref^φ	=	*let* p = ⟦table(r), fk⟧_col in
		*let* targetRow = dereference(r, fk) in
		*let* o = φ(targetRow) in
		(p, o)
⟦ , ⟧_lex	:	(Row, Column) → { s:Set((Predicate, Object)) \| size(s) ≤ 1 }
⟦r, c⟧_lex	=	*let* p = ⟦table(r), c⟧_col in
		*let* v = value(r, c) in
		*let* d = datatype(header(table(r))(c)) in
		if v is NULL *then* empty set
		*else* if d is String *then* {(p, v)}
		*else* *let* datatype_iri = ⟦d⟧_datatype in
		{(p, (v, datatype_iri))}
⟦ , ⟧_col	:	(Row, List[Column]) → IRI
⟦r, c*⟧_col	=	ue(tablename(table(r))) + ‵#‵ + c₁ + ‵_‵ + ⋯ + ‵_‵ + c_n
⟦ ⟧_datatype	:	Datatype → IRI
⟦Int⟧_datatype	=	XSD : integer
⟦Float⟧_datatype	=	XSD : float
⟦Date⟧_datatype	=	XSD : date
ue	:	String → String
ue(s)	=	url_encode(s) []

1: subsequently restricted by SPARQL to exclude character x20
2: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-literal

This document was translated from L^AT_EX by H^EV^EA.