Warning:
This wiki has been archived and is now readonly.
DatabaseInstanceOnly and DatabaseInstancesandSchema Mapping
We present two mapping languages. The first mapping language, "DatabaseInstanceOnly", is a simple language that takes into account only the instances of the database. The schema is not considered, and the output is only RDF.
The second mapping language, "DatabaseInstancesandSchema", is a more expressive language that considers the schema and instances of the database. The output is an ontology in RDFS/OWL and RDF.
Preliminaries
Relational schemas and instances
Let U be an infinite set of constants and V an infinite set of variables. We assume that U and V are disjoint sets.
A relational schema R, or just schema, is a finite set {R_{1}, ..., R_{k}} of relation symbols, with each relation symbol R_{i} having a fixed arity n_{i} > 0. An instance D of R assigns to each relation symbol R_{i} a finite n_{i}-ary relation R_{i}^{D} of elements from U (that is, R_{i}^{D} is a finite subset of U^{n_{i}}).
Consider the following running example:
Relational schema R
student(s_id, name)
course(c_id, title, d_id)
enrolled(s_id, c_id)
department(d_id, title)
Database instance D
student(1, John)
course(2, CS101, 3)
enrolled(1, 2)
department(3, CS)
Then we have that R = {student, course, enrolled, department}, where the arity of relation symbols student, enrolled and department is 2, and the arity of relation symbol course is 3. Moreover, we have that:
student^{D} = { (1, John) }
course^{D} = { (2, CS101, 3) }
enrolled^{D} = { (1, 2) }
department^{D} = { (3, CS) }
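As an illustration only, the running example can be written down as plain Python data. The names `schema`, `instance` and `arity` are ours and are not part of the mapping languages; key values are stored as integers for convenience.

```python
# Relational schema R: relation symbol -> tuple of attribute names.
schema = {
    "student": ("s_id", "name"),
    "course": ("c_id", "title", "d_id"),
    "enrolled": ("s_id", "c_id"),
    "department": ("d_id", "title"),
}

# Instance D: relation symbol -> finite set of tuples of constants.
instance = {
    "student": {(1, "John")},
    "course": {(2, "CS101", 3)},
    "enrolled": {(1, 2)},
    "department": {(3, "CS")},
}

# The arity of a relation symbol is the number of its attributes.
arity = {r: len(attrs) for r, attrs in schema.items()}

# Every tuple in the instance must have the arity fixed by the schema.
for r, tuples in instance.items():
    assert all(len(t) == arity[r] for t in tuples)

print(arity)
```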
Datalog
Syntax of Datalog rules
A Datalog rule is a rule of the form:

P(x) ← P_{1}(x_{1}), ..., P_{k}(x_{k}), not Q_{1}(y_{1}), ..., not Q_{m}(y_{m}), u_{1} ≠ v_{1}, ..., u_{n} ≠ v_{n}
where:
 k > 0, m ≥ 0 and n ≥ 0
 P, P_{1}, ..., P_{k}, Q_{1}, ..., Q_{m} are (not necessarily distinct) relation symbols
 each of x, x_{1}, ..., x_{k}, y_{1}, ..., y_{m} is a tuple of variables (elements from V) and constants (elements from U)
 every u_{i} (1 ≤ i ≤ n), and every v_{i}, is either a variable or a constant
 the following safety conditions are satisfied:
 every variable in x is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 every variable in y_{i} (1 ≤ i ≤ m) is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 if u_{i} is a variable (1 ≤ i ≤ n), then u_{i} is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 if v_{i} is a variable (1 ≤ i ≤ n), then v_{i} is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
Note that if m = 0 in the previous Datalog rule, then the rule does not include any element of the form "not Q(x)". Similarly, if n = 0, then the rule does not include any inequalities.
The following is a Datalog rule for our running example:
triple(x, "name", y) ← student(x, y)
In this rule, x and y are variables, and "name" is a constant (an element from U). From now on, we use double quotes in Datalog rules to denote constants.
As a second example, consider the following Datalog rules for our running example:
A(x) ← enrolled(x, y), enrolled(x, z), y ≠ z
B(x) ← enrolled(x, y_{1}), enrolled(x, y_{2}), course(y_{1}, w_{1}, z_{1}), course(y_{2}, w_{2}, z_{2}), y_{1} ≠ y_{2}, z_{1} ≠ "CS", z_{2} ≠ "CS"
Intuitively, the first rule retrieves all the students that are taking at least two distinct courses, while the second rule retrieves all the students that are taking at least two distinct courses that are not given in the department identified by "CS" (the third attribute of course is d_id, so z_{1} and z_{2} range over department identifiers). Finally, consider the following Datalog rule that uses the relation symbol A just defined:
C(x) ← enrolled(x, y), not A(x)
Intuitively, this rule retrieves all the students that are taking exactly one course: enrolled(x, y) indicates that x is taking at least one course, while not A(x) indicates that x is not in the table A, that is, x is not taking at least two distinct courses.
We conclude this section by mentioning that in the Datalog rule shown at the beginning, P(x) is the head of the rule and P_{1}(x_{1}), ..., P_{k}(x_{k}), not Q_{1}(y_{1}), ..., not Q_{m}(y_{m}), u_{1} ≠ v_{1}, ..., u_{n} ≠ v_{n} is the body of the rule.
Nonrecursive Datalog programs
A Datalog program Π is a finite set of Datalog rules. In Π, a relation symbol P is intensional if it is mentioned in the head of some rule of Π, and it is extensional otherwise. Then a Datalog program Π is said to be defined over a schema R if the set of extensional relation symbols of Π is a subset of R. For example, the following is a Datalog program over the schema of our running example:
A(x) ← enrolled(x, y), enrolled(x, z), y ≠ z
C(x) ← enrolled(x, y), not A(x)
To see why this is the case, note that {A, C} is the set of intensional relation symbols of this program, while {enrolled} is the set of extensional relation symbols of this program.
A Datalog program Π is said to be nonrecursive if there exists a function f that assigns a positive number to each relation symbol in Π in such a way that for every rule in Π, if P is the relation symbol in the head of the rule and Q is a relation symbol in the body of the rule, then f(P) > f(Q). Thus, for example, the Datalog program:
A(x) ← enrolled(x, y), enrolled(x, z), y ≠ z
C(x) ← enrolled(x, y), not A(x)
is nonrecursive as the function f defined as f(C) = 3, f(A) = 2, f(enrolled) = 1 satisfies the aforementioned condition. On the other hand, the following Datalog program:
A(x) ← E(x,y), A(y)
as well as:
A(x) ← E(x,y)
A(x) ← B(x)
B(x) ← C(x)
C(x) ← A(x)
are recursive Datalog programs.
Intuitively, a Datalog program is recursive if one of its intensional predicates is defined in terms of itself. The following is a typical example of a recursive Datalog program:
flight(x,y) ← non_stop_flight(x,y)
flight(x,y) ← non_stop_flight(x,z), flight(z,y)
This program retrieves the pairs of cities (x,y) such that there is a way to fly from x to y that may include an arbitrary number of stopovers.
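The existence of a function f as in the definition of nonrecursive programs can be checked mechanically. The following is a minimal sketch, assuming each rule is reduced to its head symbol and the list of relation symbols in its body; the name `level_function` and this rule representation are ours. It returns one possible f, or None when the program is recursive.

```python
def level_function(rules):
    """Return a dict f with f[P] > f[Q] for every rule with head P and body
    symbol Q, or None if no such function exists (the program is recursive).

    `rules` is a list of (head_symbol, [body_symbols]) pairs.
    """
    deps = {}
    for head, body in rules:
        deps.setdefault(head, set()).update(body)
        for q in body:
            deps.setdefault(q, set())

    f = {}
    visiting = set()

    def visit(p):
        if p in f:
            return True
        if p in visiting:
            return False  # p depends on itself, directly or indirectly
        visiting.add(p)
        level = 1
        for q in deps[p]:
            if not visit(q):
                return False
            level = max(level, f[q] + 1)
        visiting.discard(p)
        f[p] = level
        return True

    for p in deps:
        if not visit(p):
            return None
    return f


nonrecursive = [("A", ["enrolled", "enrolled"]), ("C", ["enrolled", "A"])]
flights = [("flight", ["non_stop_flight"]),
           ("flight", ["non_stop_flight", "flight"])]
cyclic = [("A", ["E"]), ("A", ["B"]), ("B", ["C"]), ("C", ["A"])]

print(level_function(nonrecursive))  # f(enrolled) = 1, f(A) = 2, f(C) = 3
print(level_function(flights))       # None
print(level_function(cyclic))        # None
```

For the nonrecursive program the sketch happens to return the same function f as the one used in the text, but any function satisfying the condition would do.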
Important: In what follows, we consider only nonrecursive Datalog programs.
Semantics of Datalog programs
To define the semantics of Datalog programs, we need to introduce some terminology. A substitution σ is a function from V ∪ U to U that is the identity on the constants (that is, σ(c) = c for every c ∈ U). Given a tuple x=(x_{1}, ..., x_{k}) of constants and variables and a substitution σ, tuple σ(x) is defined as (σ(x_{1}), ..., σ(x_{k})). For example, if x= (x, "name", y) and σ is a substitution such that σ(x) = "1" and σ(y) = "John", then σ(x) = ("1", "name", "John").
Let R be a schema, D an instance of R, and Π a Datalog program over R, and assume that f is a function that assigns a positive number to each relation symbol in Π in such a way that for every rule in Π, if P is the relation symbol in the head of the rule and Q is a relation symbol in the body of the rule, then f(P) > f(Q) (such a function exists since Π is assumed to be nonrecursive). The evaluation of Π over D assigns to every predicate symbol R mentioned in Π a relation R^{D} of the corresponding arity. Formally, this evaluation is recursively defined as follows.
(1) For every extensional relation symbol R mentioned in Π, relation R^{D} is just the relation assigned to R by D.
(2) Assume that P is an intensional relation symbol in Π and that for every relation symbol Q such that f(P) > f(Q), relation Q^{D} has already been computed. Then a tuple c of constants is in P^{D} if and only if there exists a rule in Π:
P(x) ← P_{1}(x_{1}), ..., P_{k}(x_{k}), not Q_{1}(y_{1}), ..., not Q_{m}(y_{m}), u_{1} ≠ v_{1}, ..., u_{n} ≠ v_{n}
and a substitution σ such that:
 σ(x) = c
 for every i ∈ {1, ..., k}: σ(x_{i}) is a tuple in P_{i}^{D}
 for every i ∈ {1, ..., m}: σ(y_{i}) is not a tuple in Q_{i}^{D}
 for every i ∈ {1, ..., n}: σ(u_{i}) ≠ σ(v_{i})
It is important to notice that in the previous definition, a function f is used to define the evaluation of Π over D. Thus, it is natural to ask what would happen if one replaces this function by another function g satisfying the condition mentioned in the definition. Given that f is only used to determine the order in which the rules of Π have to be evaluated, it can be formally proved that if one uses g instead of f when evaluating Π over D, then the result is the same.
We now show an example of the evaluation process. Let Π be the following Datalog program over the schema of our running example:
A(x) ← enrolled(x, y), enrolled(x, z), y ≠ z
C(x) ← enrolled(x, y), not A(x)
Moreover, let f be a function defined as f(C) = 3, f(A) = 2 and f(enrolled) = 1, which actually shows that Π is a nonrecursive Datalog program. To evaluate Π over our running database instance D, we first notice that:
enrolled^{D} = { (1,2) }
Then we consider the intensional predicate A. We have that f(A) > f(enrolled), enrolled is the only relation symbol mentioned in Π whose image under f is smaller than f(A), and enrolled^{D} has already been computed. In this case, we have that:
A^{D} = ∅
as there is no substitution σ such that (σ(x), σ(y)) is in enrolled^{D}, (σ(x), σ(z)) is in enrolled^{D} and σ(y) ≠ σ(z). Finally, we consider intensional predicate C, for which we have that:
C^{D} = { 1 }
since for the substitution σ such that σ(x) = 1 and σ(y) = 2, we have that (σ(x), σ(y)) is in enrolled^{D} and σ(x) is not in A^{D}. Notice that this result corresponds with our intuition about the definition of relation symbol C, as C^{D} contains the set of students in D that are taking exactly one course.
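The evaluation above can be replayed with Python set comprehensions, where each comprehension plays the role of searching for a substitution σ. This is a sketch for this particular program only, not a general Datalog engine; the function name `evaluate` and the second test instance are ours.

```python
def evaluate(enrolled):
    """Evaluate the two rules of the example program over a given enrolled^D.

    Relations are evaluated in increasing order of f: enrolled, then A, then C.
    """
    # A(x) <- enrolled(x, y), enrolled(x, z), y != z
    A = {x for (x, y) in enrolled for (x2, z) in enrolled
         if x == x2 and y != z}
    # C(x) <- enrolled(x, y), not A(x)
    C = {x for (x, y) in enrolled if x not in A}
    return A, C


# Running example: student 1 takes only course 2.
A, C = evaluate({(1, 2)})
print(A)  # set()
print(C)  # {1}

# Hypothetical larger instance: student 1 takes two courses, student 4 one.
A2, C2 = evaluate({(1, 2), (1, 5), (4, 2)})
print(A2)  # {1}
print(C2)  # {4}
```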
Builtin predicates
Fix a relational schema R, and assume that B is a set of relation symbols that are not mentioned in R. Moreover, assume that for every relation symbol R in B of arity n, there exists a (not necessarily finite) n-ary relation I of elements from U such that for every instance D of R, it holds that R^{D} = I. That is, the interpretation of each relation symbol in B is fixed (it does not depend on the database instances) and may be infinite.
Each relation symbol in B is called a builtin predicate. For example, if U is the set of natural numbers, then < is a builtin predicate whose interpretation in each database instance D is:
<^{D} = { (n,m) | n and m are natural numbers and n is smaller than m }
Builtin predicates can be included in Datalog rules. More precisely, a Datalog rule with builtin predicates is a rule of the form:

P(x) ← P_{1}(x_{1}), ..., P_{k}(x_{k}), not Q_{1}(y_{1}), ..., not Q_{m}(y_{m}), u_{1} ≠ v_{1}, ..., u_{n} ≠ v_{n}, R_{1}(w_{1}), ..., R_{s}(w_{s})
where:
 k > 0, m ≥ 0, n ≥ 0 and s ≥ 0
 P, P_{1}, ..., P_{k}, Q_{1}, ..., Q_{m} are (not necessarily distinct) relation symbols not mentioned in B
 R_{1}, ..., R_{s} are (not necessarily distinct) builtin predicate symbols (relation symbols from B)
 each of x, x_{1}, ..., x_{k}, y_{1}, ..., y_{m}, w_{1}, ..., w_{s} is a tuple of variables and constants
 every u_{i} (1 ≤ i ≤ n), and every v_{i}, is either a variable or a constant
 the following safety conditions are satisfied:
 every variable in x is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 every variable in y_{i} (1 ≤ i ≤ m) is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 if u_{i} is a variable (1 ≤ i ≤ n), then u_{i} is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 if v_{i} is a variable (1 ≤ i ≤ n), then v_{i} is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
 every variable in w_{i} (1 ≤ i ≤ s) is mentioned in some tuple x_{j} (1 ≤ j ≤ k)
It is important to notice that the last safety condition is imposed to avoid rules like the following:
A(x) ← B(y), y < x
which may have an infinite number of solutions if U is the set of natural numbers and < is the usual order on these numbers.
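The safety conditions are purely syntactic, so they can be checked without looking at any database instance. Below is a minimal sketch under our own conventions: variables are strings starting with "?", anything else is a constant, and relation names are dropped because they play no role in the check. The function names are hypothetical.

```python
def variables(terms):
    """Return the variables among `terms` (strings starting with '?')."""
    return {t for t in terms if isinstance(t, str) and t.startswith("?")}


def is_safe(head, positive, negated=(), inequalities=(), builtins=()):
    """Check the safety conditions of a Datalog rule with builtin predicates.

    head: tuple of terms of the head atom.
    positive, negated, builtins: lists of term tuples, one per atom.
    inequalities: list of (u, v) pairs.
    """
    # Variables mentioned in some positive atom P_j(x_j).
    bound = set()
    for atom in positive:
        bound |= variables(atom)
    # Variables of the head, negated atoms, inequalities and builtin atoms
    # must all be among the bound variables.
    others = variables(head)
    for atom in list(negated) + list(inequalities) + list(builtins):
        others |= variables(atom)
    return others <= bound


# triple(x, "name", y) <- student(x, y)
print(is_safe(("?x", "name", "?y"), [("?x", "?y")]))              # True
# C(x) <- enrolled(x, y), not A(x)
print(is_safe(("?x",), [("?x", "?y")], negated=[("?x",)]))        # True
# A(x) <- B(y), y < x   (x occurs only in the builtin atom)
print(is_safe(("?x",), [("?y",)], builtins=[("?y", "?x")]))       # False
```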
A Datalog program Π with builtin predicates is a finite set of Datalog rules with builtin predicates. A Datalog program Π is said to be defined over relational schema R with builtin predicates B if the set of extensional relation symbols of Π is a subset of R ∪ B.
The semantics of Datalog programs with builtin predicates is a simple extension of the semantics defined above. Let D be an instance of schema R and Π a Datalog program over R with builtin predicates B, and assume that f is a function that assigns a positive number to each relation symbol in Π in such a way that for every rule in Π, if P is the relation symbol in the head of the rule and Q is a relation symbol in the body of the rule, then f(P) > f(Q) (such a function exists since Π is assumed to be nonrecursive). The evaluation of Π over D assigns to every predicate symbol R mentioned in Π a relation R^{D} of the corresponding arity. Formally, this evaluation is recursively defined as follows.
(1) For every extensional relation symbol R mentioned in Π, relation R^{D} is just the relation assigned to R by D (in particular, for every builtin predicate R mentioned in Π, relation R^{D} is just the fixed interpretation assigned to R by D).
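(2) Assume that P is an intensional relation symbol in Π and that for every relation symbol Q such that f(P) > f(Q), relation Q^{D} has already been computed. Then a tuple c of constants is in P^{D} if and only if there exists a rule in Π:
P(x) ← P_{1}(x_{1}), ..., P_{k}(x_{k}), not Q_{1}(y_{1}), ..., not Q_{m}(y_{m}), u_{1} ≠ v_{1}, ..., u_{n} ≠ v_{n}, R_{1}(w_{1}), ..., R_{s}(w_{s})
and a substitution σ such that: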
 σ(x) = c
 for every i ∈ {1, ..., k}: σ(x_{i}) is a tuple in P_{i}^{D}
 for every i ∈ {1, ..., m}: σ(y_{i}) is not a tuple in Q_{i}^{D}
 for every i ∈ {1, ..., n}: σ(u_{i}) ≠ σ(v_{i})
 for every i ∈ {1, ..., s}: σ(w_{i}) is a tuple in R_{i}^{D}
Assumptions
As mentioned before, we only consider nonrecursive Datalog programs in this document. Moreover, for the sake of readability, we only consider unary keys and unary foreign keys. We also assume that the following builtin predicate is given:
generateURI(pk, s) is a builtin predicate which holds if s is a URI generated with pk
DatabaseInstanceOnly Mapping
This mapping makes the following assumptions:
 The user only cares about the database instances
 There is no need to take into account the schema of the database or to create an ontology (RDFS/OWL)
 The output of the mapping consists only of RDF triples
 The predicate in the RDF triple is always of type rdf:Property (even though this is not explicit)
 A relation can be either an existing table in the database or a user generated SQL query
 The mapping is written in Datalog. However, this can be syntactically translated to W3C's RIF
Formal definition of the language
Given a relational schema R, a database-instance-only mapping over R is a Datalog program over R with builtin predicates { generateURI }.
In what follows, we show two alternative approaches to writing database-instance-only mappings.
First approach: Database-instance-only mapping as a default mapping
The rules are generated automatically in this case, so the user does not need to know any of the rules or have any ontology in mind to translate his/her relational data into RDF.
To present the mapping language, for each type of rule we show its template, apply it to the running example, and present the output that an RDB2RDF system should give for the example.
Case 1: Generate triples for each attribute in a relation
Mapping template:
Triple(s, "p", p) ← r(..., pk, ..., p, ...), generateURI(pk, s)
where pk is the primary key of relation r and "p" is the attribute label of p.
Example mapping:
Triple(s, "name", name) ← student(s_id, name), generateURI(s_id, s)
Triple(s, "title", title) ← course(c_id, title, _), generateURI(c_id, s)
Triple(s, "title", title) ← department(d_id, title), generateURI(d_id, s)
Output:
Triple(http://..../student#1, name, John)
Triple(http://..../course#2, title, CS101)
Triple(http://..../department#3, title, CS)
Case 2: Generate triples from a foreign key relationship between two relations
Mapping template:
Triple(s, "fk", o) ← r(..., pk, ..., fk, ...), generateURI(pk, s), generateURI(fk, o)
where pk is the primary key of relation r, fk is a foreign key in relation r and "fk" is the attribute label of fk.
Example mapping:
Triple(s, "d_id", o) ← course(c_id, _, d_id), generateURI(c_id, s), generateURI(d_id, o)
Output:
Triple(http://..../course#2, d_id, http://..../department#3)
Case 3: Generate triples from a many-to-many relation (binary relation)
Mapping Template:
Triple(s, "p", o) ← r(fk1, fk2), generateURI(fk1, s), generateURI(fk2, o)
where r is a binary many-to-many relation (fk1 and fk2 are foreign keys in relation r) and "p" is the concatenation of the attribute label of fk1, the symbol _ and the attribute label of fk2.
Example mapping:
Triple(s, "s_id_c_id", o) ← enrolled(fk1, fk2), generateURI(fk1, s), generateURI(fk2, o)
Output:
Triple(http://..../student#1, s_id_c_id, http://..../course#2)
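The three default cases above can be sketched in Python as follows. Several things here are assumptions of ours and not part of the mapping language: the base URI `http://example.org/`, the helper `generate_uri` (which, unlike generateURI(pk, s), also receives the relation name so that it can produce URIs shaped like the ones in the outputs above), and the dictionaries describing keys. Following the examples, Case 1 is applied only to attributes that are neither the primary key nor a foreign key.

```python
BASE = "http://example.org/"  # hypothetical base URI


def generate_uri(relation, key):
    """Hypothetical stand-in for generateURI: <BASE><relation>#<key>."""
    return f"{BASE}{relation}#{key}"


schema = {
    "student": ("s_id", "name"),
    "course": ("c_id", "title", "d_id"),
    "enrolled": ("s_id", "c_id"),
    "department": ("d_id", "title"),
}
primary_key = {"student": "s_id", "course": "c_id", "department": "d_id"}
# (relation, attribute) -> referenced relation
foreign_keys = {
    ("course", "d_id"): "department",
    ("enrolled", "s_id"): "student",
    ("enrolled", "c_id"): "course",
}
instance = {
    "student": {(1, "John")},
    "course": {(2, "CS101", 3)},
    "enrolled": {(1, 2)},
    "department": {(3, "CS")},
}


def default_mapping():
    triples = set()
    for r, attrs in schema.items():
        fks = [a for a in attrs if (r, a) in foreign_keys]
        for row in instance[r]:
            values = dict(zip(attrs, row))
            if r not in primary_key:
                # Case 3: binary many-to-many relation with two foreign keys.
                fk1, fk2 = fks
                s = generate_uri(foreign_keys[(r, fk1)], values[fk1])
                o = generate_uri(foreign_keys[(r, fk2)], values[fk2])
                triples.add((s, f"{fk1}_{fk2}", o))
                continue
            s = generate_uri(r, values[primary_key[r]])
            for a in attrs:
                if a == primary_key[r]:
                    continue
                if (r, a) in foreign_keys:
                    # Case 2: the object is the URI of the referenced tuple.
                    o = generate_uri(foreign_keys[(r, a)], values[a])
                    triples.add((s, a, o))
                else:
                    # Case 1: the object is the attribute value itself.
                    triples.add((s, a, values[a]))
    return triples


for t in sorted(default_mapping(), key=str):
    print(t)
```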
Second approach: User can also create his/her own database-instance-only mapping rules
The rules are created by the user in this case, although he/she could use some rules that are generated automatically (like in the first approach).
To present the mapping language, we show how the rules in the first approach could be modified to consider an existing ontology (for example, name is replaced by foaf:name), and also we show some extra rules generated by a user.
Case 1: Generate triples for each attribute in a relation
Mapping template:
Triple(s, "q", p) ← r(..., pk, ..., p, ...), generateURI(pk, s)
where pk is the primary key of relation r and "q" is either the attribute label of p or a user generated property label.
Example mapping:
Triple(s, "foaf:name", name) ← student(s_id, name), generateURI(s_id, s)
Triple(s, "dc:title", title) ← course(c_id, title, _), generateURI(c_id, s)
Triple(s, "dc:title", title) ← department(d_id, title), generateURI(d_id, s)
Output:
Triple(http://..../student#1, foaf:name, John)
Triple(http://..../course#2, dc:title, CS101)
Triple(http://..../department#3, dc:title, CS)
Case 2: Generate triples from a foreign key relationship between two relations
Mapping template:
Triple(s, "q", o) ← r(..., pk, ..., fk, ...), generateURI(pk, s), generateURI(fk, o)
where pk is the primary key of relation r, fk is a foreign key in relation r and "q" is the attribute label of fk or a user generated property label.
Example mapping:
Triple(s, "ex:given_at", o) ← course(c_id, _, d_id), generateURI(c_id, s), generateURI(d_id, o)
Output:
Triple(http://..../course#2, ex:given_at, http://..../department#3)
Case 3: Generate triples from a many-to-many relation (binary relation)
Mapping Template:
Triple(s, "q", o) ← r(fk1, fk2), generateURI(fk1, s), generateURI(fk2, o)
where r is a binary many-to-many relation (fk1 and fk2 are foreign keys in relation r) and "q" is some user generated property label.
Example mapping:
Triple(s, "ex:enrolled", o) ← enrolled(fk1, fk2), generateURI(fk1, s), generateURI(fk2, o)
Output:
Triple(http://..../student#1, ex:enrolled, http://..../course#2)
Case 4: Rules generated by the user
The database-instance-only mapping language can also be used to map additional knowledge that the user has about the information in the relational database.
For example, assume that the user knows that every student that takes a "CS" course must belong to the "CS" department. Then he/she can add this additional knowledge with the following rule:
Triple(s, "ex:student_at", o) ← enrolled(s_id, c_id), course(c_id, _, d_id), department(d_id, "CS"), generateURI(s_id, s), generateURI(d_id, o)
Output:
Triple(http://..../student#1, ex:student_at, http://..../department#3)
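The user-defined rule is a three-way join followed by URI generation, which can be sketched as a single set comprehension. As before, the base URI and the helper `generate_uri` (taking the relation name as an extra argument) are hypothetical choices of ours.

```python
BASE = "http://example.org/"  # hypothetical base URI


def generate_uri(relation, key):
    """Hypothetical stand-in for generateURI: <BASE><relation>#<key>."""
    return f"{BASE}{relation}#{key}"


enrolled = {(1, 2)}
course = {(2, "CS101", 3)}
department = {(3, "CS")}

# Triple(s, "ex:student_at", o) <- enrolled(s_id, c_id), course(c_id, _, d_id),
#                                  department(d_id, "CS"), ...
triples = {
    (generate_uri("student", s_id), "ex:student_at",
     generate_uri("department", d_id))
    for (s_id, c_id) in enrolled
    for (c_id2, _title, d_id) in course
    for (d_id2, title) in department
    if c_id == c_id2 and d_id == d_id2 and title == "CS"
}

print(triples)
```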
DatabaseInstancesandSchema Mapping
This mapping makes the following assumptions:
 The user is interested in having the database instances and the schema
 The schema of the database gets mapped to an ontology (RDFS/OWL)
 The output of the mapping is in RDF and OWL
 A relation can be either an existing table in the database or a user generated SQL query
 The mapping is written in Datalog. However, this can be syntactically translated to W3C's RIF
Characteristics
 More expressive mapping language
Representing relational schemas and instances
The following predicate symbols are used to represent the schema and the instances of a relational database. These predicates can be automatically computed, and they are necessary for the database-instances-and-schema mapping, as this language takes into consideration both the instances and the schema of a relational database.
RDBschema predicates
Rel(r) = r is a relation
Attr(x, r, t) = x is an attribute in relation r of type t
PK(x, r) = attribute x is the primary key of relation r
FK(x, r, y, s) = attribute x is a foreign key in relation r that references attribute y in relation s
RDBinstances predicates
Value(r, tuple_id, a, v): tuple_id is the identifier of a tuple in the relation r, which has value v in attribute a
Continuing our running example, the following RDBschema predicates are used to represent the schema of the database:
Rel(student)
Rel(course)
Rel(enrolled)
Rel(department)
Attr(s_id, student, int)
Attr(name, student, string)
Attr(c_id, course, int)
Attr(title, course, string)
Attr(d_id, course, int)
Attr(s_id, enrolled, int)
Attr(c_id, enrolled, int)
Attr(d_id, department, int)
Attr(title, department, string)
PK(s_id, student)
PK(c_id, course)
PK(d_id, department)
FK(d_id, course, d_id, department)
FK(s_id, enrolled, s_id, student)
FK(c_id, enrolled, c_id, course)
and the following RDBinstances predicates are used to represent the given instance:
Value(student, t1, s_id, 1)
Value(student, t1, name, John)
Value(course, t2, c_id, 2)
Value(course, t2, title, CS101)
Value(course, t2, d_id, 3)
Value(enrolled, t3, s_id, 1)
Value(enrolled, t3, c_id, 2)
Value(department, t4, d_id, 3)
Value(department, t4, title, CS)
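To illustrate that the RDBinstances predicates can be computed automatically, the following sketch derives the Value facts from the schema and the instance. The tuple identifier scheme (t1, t2, ... in order of traversal) is an assumption of ours; any scheme that gives each tuple a distinct identifier would serve.

```python
schema = {
    "student": ("s_id", "name"),
    "course": ("c_id", "title", "d_id"),
    "enrolled": ("s_id", "c_id"),
    "department": ("d_id", "title"),
}
# Lists are used so that tuples are visited in a fixed order.
instance = {
    "student": [(1, "John")],
    "course": [(2, "CS101", 3)],
    "enrolled": [(1, 2)],
    "department": [(3, "CS")],
}


def value_facts(schema, instance):
    """Return the set of Value(r, tuple_id, a, v) facts as 4-tuples."""
    facts = set()
    counter = 0
    for r, attrs in schema.items():
        for row in instance[r]:
            counter += 1
            tuple_id = f"t{counter}"  # hypothetical identifier scheme
            for a, v in zip(attrs, row):
                facts.add((r, tuple_id, a, v))
    return facts


value = value_facts(schema, instance)
for fact in sorted(value, key=str):
    print("Value" + str(fact))
```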
Formal definition of the language
A database-instances-and-schema mapping is a Datalog program over { Rel, Attr, PK, FK, Value } with builtin predicates { generateURI }.
In what follows, we show how to translate relational data into RDF triples through database-instances-and-schema mappings.
Representing ontologies (RDFS/OWL)
In the database-instances-and-schema mapping language, the user can define any number of additional predicates by writing Datalog programs over the RDBschema and RDBinstances predicates. In this section, we show some predicates that can be generated in this way. The predicates and the rules defining them are useful when mapping relational data into RDFS/OWL, and they could be generated automatically as they are generic.
Auxiliary rules: these are useful when defining the ontology predicates
ExistsFK(x, r) ← FK(x, r, _, _)
NonFK(x, r) ← Attr(x, r, _), not ExistsFK(x, r)
Notice that ExistsFK(x, r) holds if attribute x is a foreign key in relation r, while NonFK(x, r) holds if x is an attribute of relation r that is not a foreign key in r.
Rule 1: Identify Binary Relations
BinRel(r, s, t) ← Rel(r), FK(p, r, _, s), FK(q, r, _, t), p ≠ q, not ExistThreeFK(r), not ExistsNonFKAtt(r)
ExistThreeFK(r) ← FK(p1, r, _, _), FK(p2, r, _, _), FK(p3, r, _, _), p1 ≠ p2, p1 ≠ p3, p2 ≠ p3
ExistsNonFKAtt(r) ← NonFK(x, r)
Notice that ExistThreeFK(r) holds if relation r has at least three distinct foreign keys, and ExistsNonFKAtt(r) holds if relation r has at least one attribute that is not a foreign key of r.
Rule 2: Identify Ontology Classes (relations that are not binary relations)
Class(r) ← Rel(r), not IsBinRel(r)
IsBinRel(r) ← BinRel(r, _, _)
Notice that Class(r) holds if relation r represents a class, and IsBinRel(r) holds if relation r is a binary relation.
Rule 3: Identify Object Properties through binary relations
ObjP(r, s, t) ← BinRel(r, s, t), not IsBinRel(s), not IsBinRel(t)
Notice that ObjP(r, s, t) holds if relation r represents an object property with domain s and range t.
Rule 4: Identify Object Properties through a foreign key relationship
ObjP(x, s, t) ← FK(x, s, y, t), not IsBinRel(s), not IsBinRel(t)
Rule 5: Identify Datatype Properties
DTP(x, r, t) ← NonFK(x, r), Attr(x, r, t)
Notice that DTP(x, r, t) holds if attribute x represents a data type property with domain r and range t.
The following is the result of applying the previous rules to our running example:
Auxiliary rules:
ExistsFK(d_id, course)
ExistsFK(s_id, enrolled)
ExistsFK(c_id, enrolled)
NonFK(s_id, student)
NonFK(name, student)
NonFK(c_id, course)
NonFK(title, course)
NonFK(d_id, department)
NonFK(title, department)
Rule 1:
BinRel(enrolled, student, course)
ExistsNonFKAtt(student)
ExistsNonFKAtt(course)
ExistsNonFKAtt(department)
Rule 2:
Class(student)
Class(course)
Class(department)
IsBinRel(enrolled)
Rule 3:
ObjP(enrolled, student, course)
Rule 4:
ObjP(d_id, course, department)
Rule 5:
DTP(name, student, string)
DTP(title, course, string)
DTP(title, department, string)
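The rules above can be evaluated directly with set comprehensions, one per rule, in an order compatible with nonrecursiveness. The sketch below uses our own Python names for the predicates. Note that when the rules are applied literally, they also derive BinRel(enrolled, course, student), since Rule 1 is symmetric in p and q, and DTP facts for the key attributes s_id, c_id and d_id, since these are non-foreign-key attributes; the listing above shows only a subset of the derived facts.

```python
rel = {"student", "course", "enrolled", "department"}
attr = {
    ("s_id", "student", "int"), ("name", "student", "string"),
    ("c_id", "course", "int"), ("title", "course", "string"),
    ("d_id", "course", "int"),
    ("s_id", "enrolled", "int"), ("c_id", "enrolled", "int"),
    ("d_id", "department", "int"), ("title", "department", "string"),
}
fk = {
    ("d_id", "course", "d_id", "department"),
    ("s_id", "enrolled", "s_id", "student"),
    ("c_id", "enrolled", "c_id", "course"),
}

# Auxiliary rules
exists_fk = {(x, r) for (x, r, _, _) in fk}
non_fk = {(x, r) for (x, r, _) in attr if (x, r) not in exists_fk}

# Rule 1
exist_three_fk = {
    r for (p1, r, _, _) in fk for (p2, r2, _, _) in fk for (p3, r3, _, _) in fk
    if r == r2 == r3 and len({p1, p2, p3}) == 3
}
exists_non_fk_att = {r for (_, r) in non_fk}
bin_rel = {
    (r, s, t)
    for (p, r, _, s) in fk for (q, r2, _, t) in fk
    if r in rel and r == r2 and p != q
    and r not in exist_three_fk and r not in exists_non_fk_att
}

# Rule 2
is_bin_rel = {r for (r, _, _) in bin_rel}
cls = rel - is_bin_rel

# Rules 3 and 4
obj_p = {(r, s, t) for (r, s, t) in bin_rel
         if s not in is_bin_rel and t not in is_bin_rel}
obj_p |= {(x, s, t) for (x, s, _, t) in fk
          if s not in is_bin_rel and t not in is_bin_rel}

# Rule 5
dtp = {(x, r, t) for (x, r, t) in attr if (x, r) in non_fk}

print(sorted(cls))
print(sorted(bin_rel))
print(sorted(obj_p))
print(sorted(dtp))
```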
Translating relational data into RDF triples
In this section, we present an example of the rules that the user can write in the database-instances-and-schema mapping language. Note that the mapping is written using the ontology predicates, and it can be considered a default mapping (one that can be generated automatically), as it is generic. Moreover, we present the output that an RDB2RDF system should give for each of the example rules.
Case 1: Generate rdf:type triple instances
Example mapping:
Triple(s, "rdf:type", m) ← Class(m), PK(p, m), Value(m, _, p, v), generateURI(v, s)
Output:
Triple(http://..../student#1, rdf:type, student)
Notice that this triple is generated by considering the facts: Class(student), PK(s_id, student), Value(student, t1, s_id, 1), generateURI(1, http://..../student#1).
Case 2: Generate owl:DatatypeProperty triple instances
Example mapping:
Triple(s, p, o) ← DTP(p, d, r), Value(d, t, p, o), PK(pk, d), Value(d, t, pk, v), generateURI(v, s)
Output:
Triple(http://..../student#1, name, John)
Notice that this triple is generated by considering the facts: DTP(name, student, string), Value(student, t1, name, John), PK(s_id, student), Value(student, t1, s_id, 1), generateURI(1, http://..../student#1).
Case 3a: Generate owl:ObjectProperty triple instances from a binary relation
Example mapping:
Triple(s, p, o) ← BinRel(p, d, r), PK(pk1, d), Value(d, _, pk1, v1), PK(pk2, r), Value(r, _, pk2, v2), Value(p, t, pk1, v1), Value(p, t, pk2, v2), generateURI(v1, s), generateURI(v2, o)
Output:
Triple(http://..../student#1, enrolled, http://..../course#2)
Notice that this triple is generated by considering the facts: BinRel(enrolled, student, course), PK(s_id, student), Value(student, t1, s_id, 1), PK(c_id, course), Value(course, t2, c_id, 2), Value(enrolled, t3, s_id, 1), Value(enrolled, t3, c_id, 2), generateURI(1, http://..../student#1), generateURI(2, http://..../course#2)
Case 3b: Generate owl:ObjectProperty triple instances from a foreign key relationship
Example mapping:
Triple(s, p, o) ← ObjP(p, d, r), PK(pk, d), Value(d, t, pk, v1), FK(fk, d, _, r), Value(d, t, fk, v2), generateURI(v1, s), generateURI(v2, o)
Output:
Triple(http://..../course#2, d_id, http://..../department#3)
Notice that this triple is generated by considering the facts: ObjP(d_id, course, department), PK(c_id, course), Value(course, t2, c_id, 2), FK(d_id, course, d_id, department), Value(course, t2, d_id, 3), generateURI(2, http://..../course#2), generateURI(3, http://..../department#3)
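The four example rules can be sketched as set comprehensions over the facts listed earlier for the running example. The fact sets below are copied from those listings; the base URI and the helper `generate_uri`, which takes the relation name in addition to the key value, are hypothetical choices of ours made so that the URIs have the same shape as in the outputs above.

```python
BASE = "http://example.org/"  # hypothetical base URI


def generate_uri(relation, key):
    """Hypothetical stand-in for generateURI: <BASE><relation>#<key>."""
    return f"{BASE}{relation}#{key}"


cls = {"student", "course", "department"}
pk = {("s_id", "student"), ("c_id", "course"), ("d_id", "department")}
fk = {("d_id", "course", "d_id", "department"),
      ("s_id", "enrolled", "s_id", "student"),
      ("c_id", "enrolled", "c_id", "course")}
dtp = {("name", "student", "string"), ("title", "course", "string"),
       ("title", "department", "string")}
bin_rel = {("enrolled", "student", "course")}
obj_p = {("enrolled", "student", "course"), ("d_id", "course", "department")}
value = {
    ("student", "t1", "s_id", 1), ("student", "t1", "name", "John"),
    ("course", "t2", "c_id", 2), ("course", "t2", "title", "CS101"),
    ("course", "t2", "d_id", 3),
    ("enrolled", "t3", "s_id", 1), ("enrolled", "t3", "c_id", 2),
    ("department", "t4", "d_id", 3), ("department", "t4", "title", "CS"),
}
# (relation, attribute, value) triples, ignoring the tuple identifier.
keyed = {(r, a, v) for (r, _t, a, v) in value}

# Case 1: rdf:type triples
case1 = {(generate_uri(m, v), "rdf:type", m)
         for m in cls for (p, m2) in pk for (r, _t, a, v) in value
         if m2 == m and r == m and a == p}

# Case 2: datatype property triples
case2 = {(generate_uri(d, v), p, o)
         for (p, d, _r) in dtp
         for (d1, t, a, o) in value if d1 == d and a == p
         for (k, d2) in pk if d2 == d
         for (d3, t2, a2, v) in value if d3 == d and t2 == t and a2 == k}

# Case 3a: object property triples from a binary relation
case3a = {(generate_uri(d, v1), p, generate_uri(r, v2))
          for (p, d, r) in bin_rel
          for (k1, d1) in pk if d1 == d
          for (k2, r1) in pk if r1 == r
          for (p1, t, a1, v1) in value if p1 == p and a1 == k1
          for (p2, t2, a2, v2) in value if p2 == p and t2 == t and a2 == k2
          if (d, k1, v1) in keyed and (r, k2, v2) in keyed}

# Case 3b: object property triples from a foreign key relationship
case3b = {(generate_uri(d, v1), p, generate_uri(r, v2))
          for (p, d, r) in obj_p
          for (k, d1) in pk if d1 == d
          for (f, d2, _y, r2) in fk if d2 == d and r2 == r
          for (d3, t, a, v1) in value if d3 == d and a == k
          for (d4, t2, a2, v2) in value if d4 == d and t2 == t and a2 == f}

for triple in sorted(case1 | case2 | case3a | case3b):
    print(triple)
```

On the running example, each case also produces the triples for the other relations (for instance, rdf:type triples for course and department), not just the single triple shown in each output above.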
Rules generated by the user
As in the case of the database-instance-only mapping language, the database-instances-and-schema mapping language can be used to map additional knowledge that the user has about the information in the relational database.
Example 1. To use an existing ontology, the user can use his/her own rules:
Triple(s, "ex:given_at", o) ← Value("course", t, "c_id", x), Value("course", t, "d_id", y), generateURI(x, s), generateURI(y, o)
Output:
Triple(http://..../course#2, ex:given_at, http://..../department#3)
Example 2. Assume that the user knows that every student that takes a "CS" course must belong to the "CS" department. Then he/she can add this additional knowledge with the following rule:
Triple(s, "ex:student_at", o) ← Value("enrolled", t1, "s_id", x), Value("enrolled", t1, "c_id", c), Value("course", t3, "c_id", c), Value("course", t3, "d_id", y), Value("department", t2, "d_id", y), Value("department", t2, "title", "CS"), generateURI(x, s), generateURI(y, o)
Output:
Triple(http://..../student#1, ex:student_at, http://..../department#3)