W3C

R2RML: RDB to RDF Mapping Language

W3C Working Draft 20 September 2011

This version:
http://www.w3.org/TR/2011/WD-r2rml-20110920/
Latest Editor's Draft:
http://www.w3.org/2001/sw/rdb2rdf/r2rml/
Latest published version:
http://www.w3.org/TR/r2rml/
Previous version:
http://www.w3.org/TR/2011/WD-r2rml-20110324/
Editors:
Souripriya Das, Oracle
Seema Sundara, Oracle
Richard Cyganiak, DERI, National University of Ireland, Galway

Abstract

This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice. R2RML mappings are themselves RDF graphs and are written down in Turtle syntax. R2RML enables different types of mapping implementations. Processors could, for example, offer a virtual SPARQL endpoint over the mapped relational data, or generate RDF dumps, or offer a Linked Data interface.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is a Last Call Working Draft of the "R2RML: RDB to RDF Mapping Language". Publication as a Last Call Working Draft indicates that the RDB2RDF Working Group believes it has addressed all substantive issues and that the document is stable. The Working Group expects to advance this specification to Recommendation Status.

Comments on this document should be sent to public-rdb2rdf-comments@w3.org, a mailing list with a public archive. Comments on this working draft are due on or before 1 November 2011.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by the W3C RDB2RDF Working Group. R2RML has changed significantly since the previous Working Draft. Much of the document has been rewritten, many terms were renamed and other design details have changed. New language features include a detailed account of the conversion of SQL datatypes to RDF. The triples in the output dataset are now more accurately specified. One major open question remains, and the working group seeks feedback on it: should R2RML processors be required to support the Turtle syntax? Apart from this, the working group anticipates no major further changes.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1 Introduction

This specification describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice.

This specification has a companion that defines a direct mapping from relational databases to RDF [DM]. In the direct mapping of a database, the structure of the resulting RDF graph directly reflects the structure of the database, the target RDF vocabulary directly reflects the names of database schema elements, and neither structure nor target vocabulary can be changed. With R2RML on the other hand, a mapping author can define highly customized views over the relational data.

Every R2RML mapping is tailored to a specific database schema and target vocabulary. The input to an R2RML mapping is a relational database that conforms to that schema. The output is an RDF dataset [SPARQL], as defined in SPARQL, that uses predicates and types from the target vocabulary. The mapping is conceptual; R2RML processors are free to materialize the output data, or to offer virtual access through an interface that queries the underlying database, or to offer any other means of providing access to the output RDF dataset.

R2RML mappings are themselves expressed as RDF graphs and written down in Turtle syntax [TURTLE].

The intended audience of this specification is implementors of software that generates or processes R2RML mapping documents, as well as mapping authors looking for a reference to the R2RML language constructs. The document uses concepts from RDF Concepts and Abstract Syntax [RDF] and from the SQL language specifications [SQL1][SQL2]. A reader's familiarity with the contents of these documents, as well as with the Turtle syntax, is assumed.

The R2RML language is designed to meet the use cases and requirements identified in Use Cases and Requirements for Mapping Relational Databases to RDF [UCNR].

1.1 Document Conventions

In this document, examples assume the following namespace prefix bindings unless otherwise stated:

Prefix IRI
rr: http://www.w3.org/ns/r2rml#
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
xsd: http://www.w3.org/2001/XMLSchema#
ex: http://example.com/ns#

Throughout the document, boxes containing Turtle markup and SQL data will appear. These boxes are color-coded. Gray boxes contain RDFS definitions of R2RML vocabulary terms:

# This box contains RDFS definitions of R2RML vocabulary terms

Yellow boxes contain example fragments of R2RML mappings in Turtle syntax:

# This box contains example R2RML mappings

Blue tables contain example input into an R2RML mapping:

EXAMPLE
ID INTEGER PRIMARY KEY | DESC VARCHAR(100)
1 | This is an example input table.
2 | The table name is EXAMPLE.
3 | It has six rows.
4 | It has two columns, ID and DESC.
5 | ID is the table's primary key and of type INTEGER.
6 | DESC is of type VARCHAR(100)

Green boxes contain example output:

# This box contains example output RDF triples or fragments

2 R2RML Overview and Example (Informative)

This section gives a brief overview of the R2RML mapping language, followed by a simple example relational database with an R2RML mapping document and its output RDF. Further R2RML examples can be found in the R2RML and Direct Mapping Test Cases [TC].

An R2RML mapping refers to logical tables to retrieve data from the input database. A logical table can be one of the following:

  1. A base table,
  2. a view, or
  3. a valid SQL query (called an “R2RML view” because it emulates a SQL view without modifying the database).

Each logical table is mapped to RDF using a triples map. The triples map is a rule that maps each row in the logical table to a number of RDF triples. The rule has two main parts:

  1. A subject map that generates the subject of all RDF triples that will be generated from a logical table row.
  2. Multiple predicate-object maps that in turn consist of predicate maps and object maps (or referencing object maps).

Triples are produced by combining the subject map with a predicate map and object map, and applying these three to each logical table row. For example, the complete rule for generating a set of triples might be:

By default, all RDF triples are in the default graph of the output dataset. A triples map can contain graph maps that place some or all of the triples into named graphs instead.

Figure 1: An overview of R2RML (UML diagram)

2.1 Example Input Database

The following example database consists of two tables, EMP and DEPT, with one row each:

EMP
EMPNO INTEGER PRIMARY KEY | ENAME VARCHAR(100) | JOB VARCHAR(20) | DEPTNO INTEGER REFERENCES DEPT (DEPTNO)
7369 | SMITH | CLERK | 10

DEPT
DEPTNO INTEGER PRIMARY KEY | DNAME VARCHAR(30) | LOC VARCHAR(100)
10 | APPSERVER | NEW YORK

2.2 Desired RDF Output

The desired RDF triples to be produced from this database are as follows:

<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".
<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.

<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:name "APPSERVER".
<http://data.example.com/department/10> ex:location "NEW YORK".
<http://data.example.com/department/10> ex:staff 1.

Note in particular:

2.3 Example: Mapping a Simple Table

The following partial R2RML mapping document will produce the desired triples from the EMP table (except the ex:department triple, which will be added later):

@prefix rr: <http://www.w3.org/ns/r2rml#>.

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "EMP" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "ENAME" ];
    ].

<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".
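
How this triples map turns one EMP row into the two triples above can be sketched in Python. This is an informal illustration, not part of R2RML: the function names expand_template and apply_triples_map and the dictionary representation of a row are assumptions made for this example, and percent-encoding of column values and datatype handling are ignored.

```python
import re

EX = "http://example.com/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand_template(template, row):
    """Replace each {COLUMN} placeholder with the row's value for that column."""
    return re.sub(r"\{([^}]+)\}", lambda m: str(row[m.group(1)]), template)


def apply_triples_map(row):
    """Hypothetical hard-coded equivalent of <#TriplesMap1> for one EMP row."""
    subject = expand_template("http://data.example.com/employee/{EMPNO}", row)
    return [
        (subject, RDF_TYPE, EX + "Employee"),  # from rr:class ex:Employee
        (subject, EX + "name", row["ENAME"]),  # from rr:column "ENAME"
    ]


emp_row = {"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 10}
triples = apply_triples_map(emp_row)
for triple in triples:
    print(triple)
```

The subject is computed once per row and shared by every triple generated from that row.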

2.4 Example: Computing a Property with an R2RML View

Next, the DEPT table needs to be mapped. Instead of using the table directly as the basis for that mapping, an “R2RML view” will be defined based on a SQL query. This allows computation of the staff number. (Alternatively, one could define this view directly in the database.)

<#DeptTableView> rr:sqlQuery """
SELECT DEPTNO,
       DNAME,
       LOC,
       (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
FROM DEPT;
""".

The definition of a triples map that generates the desired DEPT triples based on this R2RML view follows.

<#TriplesMap2>
    rr:logicalTable <#DeptTableView>;
    rr:subjectMap [
        rr:template "http://data.example.com/department/{DEPTNO}";
        rr:class ex:Department;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:staff;
        rr:objectMap [ rr:column "STAFF" ];
    ].

<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:name "APPSERVER".
<http://data.example.com/department/10> ex:location "NEW YORK".
<http://data.example.com/department/10> ex:staff 1.
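
The SQL query behind <#DeptTableView> can be tried out on the example data. The following sketch uses Python's built-in sqlite3 module with an in-memory database purely for illustration; SQLite is an assumption of this example and is not claimed to conform to Core SQL 2008.

```python
import sqlite3

# Build the example input database from Section 2.1 in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (
        DEPTNO INTEGER PRIMARY KEY,
        DNAME  VARCHAR(30),
        LOC    VARCHAR(100)
    );
    CREATE TABLE EMP (
        EMPNO  INTEGER PRIMARY KEY,
        ENAME  VARCHAR(100),
        JOB    VARCHAR(20),
        DEPTNO INTEGER REFERENCES DEPT (DEPTNO)
    );
    INSERT INTO DEPT VALUES (10, 'APPSERVER', 'NEW YORK');
    INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 10);
""")

# The query of <#DeptTableView>, here without the optional trailing semicolon.
view_query = """
SELECT DEPTNO,
       DNAME,
       LOC,
       (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
FROM DEPT
"""

cursor = conn.execute(view_query)
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

The result has one row per department, and its STAFF column supplies the value that the ex:staff predicate-object map reads through rr:column "STAFF".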

2.5 Example: Linking Two Tables

To complete the mapping document, the ex:department triples need to be generated. Their subjects come from the first triples map (<#TriplesMap1>), the objects come from the second triples map (<#TriplesMap2>).

This can be achieved by adding another rr:predicateObjectMap to <#TriplesMap1>. This one uses the other triples map, <#TriplesMap2>, as a parent triples map:

<#TriplesMap1>
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [
            rr:parentTriplesMap <#TriplesMap2>;
            rr:joinCondition [
                rr:child "DEPTNO";
                rr:parent "DEPTNO";
            ];
        ];
    ].

This performs a join between the EMP table and the R2RML view, on the DEPTNO columns. The objects will be generated from the subject map of the parent triples map, yielding the desired triple:

<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.

This completes the R2RML mapping document. An R2RML processor will generate the triples listed above from this mapping document.
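
The join performed by the parent triples map reference can be sketched as a nested loop over the child rows (EMP) and the parent rows (the R2RML view). The function name department_links and the list-of-dictionaries representation are assumptions of this sketch; a real processor would more likely push the join down into SQL.

```python
def department_links(emp_rows, dept_rows):
    """Pair each child (EMP) row with the parent rows whose DEPTNO matches."""
    triples = []
    for child in emp_rows:
        for parent in dept_rows:
            # rr:joinCondition: rr:child "DEPTNO" equals rr:parent "DEPTNO"
            if child["DEPTNO"] == parent["DEPTNO"]:
                # Subject comes from the child's subject map template.
                subject = "http://data.example.com/employee/%d" % child["EMPNO"]
                # Object comes from the parent's subject map template.
                obj = "http://data.example.com/department/%d" % parent["DEPTNO"]
                triples.append((subject, "http://example.com/ns#department", obj))
    return triples


emp_rows = [{"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 10}]
dept_rows = [{"DEPTNO": 10, "DNAME": "APPSERVER", "LOC": "NEW YORK", "STAFF": 1}]
links = department_links(emp_rows, dept_rows)
print(links)
```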

2.6 Example: Many-to-Many Tables

A final example assumes that a many-to-many relationship exists between the extended versions of the EMP and DEPT tables shown below. This many-to-many relationship is captured by the content of the EMP2DEPT table. The database consisting of the EMP, DEPT, and EMP2DEPT tables is shown below:

EMP
EMPNO INTEGER PRIMARY KEY | ENAME VARCHAR(100) | JOB VARCHAR(20) | DEPTNO INTEGER REFERENCES DEPT (DEPTNO)
7369 | SMITH | CLERK | 10
7369 | SMITH | NIGHTGUARD | 20
7400 | JONES | ENGINEER | 10

DEPT
DEPTNO INTEGER PRIMARY KEY | DNAME VARCHAR(30) | LOC VARCHAR(100)
10 | APPSERVER | NEW YORK
20 | RESEARCH | BOSTON

EMP2DEPT PRIMARY KEY (EMPNO, DEPTNO)
EMPNO INTEGER REFERENCES EMP (EMPNO) | DEPTNO INTEGER REFERENCES DEPT (DEPTNO)
7369 | 10
7369 | 20
7400 | 10

<http://data.example.com/employee=7369/department=10>
    ex:employee   <http://data.example.com/employee/7369> ;
    ex:department <http://data.example.com/department/10> .

<http://data.example.com/employee=7369/department=20> 
    ex:employee <http://data.example.com/employee/7369> ;
    ex:department <http://data.example.com/department/20> .

<http://data.example.com/employee=7400/department=10> 
    ex:employee <http://data.example.com/employee/7400> ;
    ex:department <http://data.example.com/department/10> .
    

The following R2RML mapping will produce the desired triples listed above:

<#TriplesMap3>
    rr:tableName "EMP2DEPT";
    rr:subjectMap [ rr:template "http://data.example.com/employee={EMPNO}/department={DEPTNO}" ];
    rr:predicateObjectMap [
        rr:predicate ex:employee;
        rr:objectMap [ rr:template "http://data.example.com/employee/{EMPNO}" ; rr:termType rr:IRI ]
    ];
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ; rr:termType rr:IRI ]
    ].
    

However, if one does not require that the subjects in the desired output uniquely identify the rows in the EMP2DEPT table, the desired output may look as follows:

<http://data.example.com/employee/7369> 
    ex:department <http://data.example.com/department/10> ;
    ex:department <http://data.example.com/department/20> .

<http://data.example.com/employee/7400> 
    ex:department <http://data.example.com/department/10>.
    

The following R2RML mapping will produce the desired triples:

<#TriplesMap3>
    rr:tableName "EMP2DEPT";
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
    ];
    rr:predicateObjectMap [
      rr:predicate ex:department;
      rr:objectMap [ rr:template "http://data.example.com/department/{DEPTNO}"; rr:termType rr:IRI ]
    ].

3 Conformance

As well as sections marked as non-normative in the section heading, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in RFC 2119 [RFC2119].

This specification describes conformance criteria for:

A collection of test cases for R2RML processors and R2RML data validators is available in the R2RML and Direct Mapping Test Cases [TC].

This specification defines R2RML for databases that conform to Core SQL 2008, as defined in ISO/IEC 9075-1:2008 [SQL1] and ISO/IEC 9075-2:2008 [SQL2]. Processors and mappings may have to deviate from the R2RML specification in order to support databases that do not conform to this version of SQL.

Where SQL queries are embedded into R2RML mappings, SQL version identifiers can be used to indicate the specific version of SQL that is being used.

4 R2RML Processors and Mapping Documents

An R2RML mapping defines a mapping from a relational database to RDF. It is a structure that consists of one or more triples maps.

The input to an R2RML mapping is called the input database.

An R2RML processor is a system that, given an R2RML mapping and an input database, provides access to the output dataset.

There are no constraints on the method of access to the output dataset provided by a conforming R2RML processor. An R2RML processor MAY materialize the output dataset into a file, or offer virtual access through an interface that queries the input database, or offer any other means of providing access to the output dataset.

An R2RML processor also has access to an execution environment consisting of a SQL connection to the input database and a base IRI.

The SQL connection is used by the R2RML processor to evaluate SQL queries against the input database. It MUST be established with sufficient privileges for read access to all base tables and views that are referenced in the R2RML mapping. It MUST be configured with a default catalog and default schema that will be used when tables and views are accessed without an explicit catalog or schema reference.

How the SQL connection is established, or how users are authenticated against the database, is outside of the scope of this document.

The base IRI MUST be a valid IRI. It SHOULD end in a slash (“/”) character.

Resolution of relative IRIs in R2RML uses simple string concatenation instead of the more complex algorithm defined in RFC 3986. This ensures that the original database value can be reconstructed from the generated IRI.
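
The difference between the two resolution strategies can be illustrated with a short Python sketch. The base IRI and the database value below are hypothetical, and urllib.parse.urljoin stands in for the RFC 3986 reference resolution algorithm.

```python
from urllib.parse import urljoin


def resolve_by_concatenation(base_iri, relative_iri):
    """Resolve a relative IRI by simple string concatenation."""
    return base_iri + relative_iri


base = "http://data.example.com/base/"
value = "../dept/10"  # hypothetical value read from the database

concatenated = resolve_by_concatenation(base, value)
rfc3986_style = urljoin(base, value)

print(concatenated)
print(rfc3986_style)

# With concatenation, stripping the base IRI gives back the original value.
recovered = concatenated[len(base):]
print(recovered)
```

With concatenation the dot segments of the value survive in the generated IRI, so the value can be recovered by removing the base IRI prefix. RFC 3986 style resolution removes the dot segments, and the original value can no longer be told apart from a plain "dept/10" resolved against a different base.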

An R2RML data validator is a system that takes as its input an R2RML mapping, a base IRI, and a SQL connection to an input database, and checks for the presence of data errors. When checking the input database, a data validator MUST report any data errors that are raised in the process of generating the output dataset.

An R2RML processor MAY include an R2RML data validator, but this is not required.

4.1 Mapping Graphs and the R2RML Vocabulary

An R2RML mapping is represented as an RDF graph. In other words, RDF is used not just as the target data model of the mapping, but also as a formalism for representing the R2RML mapping itself.

An RDF graph that represents an R2RML mapping is called an R2RML mapping graph.

The R2RML vocabulary is the set of IRIs defined in this specification that start with the rr: namespace IRI:

http://www.w3.org/ns/r2rml#

An R2RML mapping graph:

The R2RML vocabulary also includes the following R2RML classes, which represent various R2RML mapping constructs. Using these classes is OPTIONAL in a mapping graph. The applicable class of a resource can always be inferred from its properties.

Many of these classes differ only in capitalization from properties in the R2RML vocabulary.

4.2 RDF-based Turtle Syntax; Media Type

ISSUE-57: Should R2RML require a specific syntax?

The working group has put forward two alternative proposals for this issue:

  1. The R2RML mapping document specifies both the vocabulary and the syntax. The R2RML document MUST be a Turtle document and R2RML processors MUST support Turtle to be able to read such documents. Conformance criteria require support of the R2RML vocabulary written in Turtle.
  2. The R2RML mapping document specifies only the vocabulary. There is no syntax specified. There can be an accompanying "R2RML mapping document in Turtle" that specifies the Turtle syntax. Conformance criteria for the R2RML mapping document require supporting the vocabulary in any language (Turtle, N-Triples, RDF/XML, etc.). Additionally, if an implementation supports the Turtle syntax, it can claim conformance to the "R2RML mapping document in Turtle".

The advantage of the first approach is that it promotes interoperability between different producers and consumers of R2RML files by requiring all to support at least one shared syntax. Without such a shared syntax, an R2RML file created in one tool may be rejected by another tool because both assume different RDF syntaxes. R2RML examples found in educational material may not work in actual implementations due to different syntaxes. This is seen as an impediment to the uptake of R2RML.

The second approach distinguishes between the R2RML vocabulary and the syntax and wants to keep them separate. The advantage of the second approach is that the R2RML mapping document remains independent of any exchange format. This gives flexibility as different syntax flavours of R2RML could be easily defined. It is in the spirit of RDF as an abstract format. Users may have to convert between different RDF syntaxes in order to use R2RML files, but such conversion is not difficult and therefore not seen as an impediment. Thereby, it allows conformance with the R2RML mapping document using any of the standard exchange formats.

There is consensus that Turtle should be used for the examples in this document, as well as for the test cases.

The working group seeks comments and opinions on this question and encourages reports to the public-rdb2rdf-comments mailing list.

An R2RML mapping document is any document written in the Turtle [TURTLE] RDF syntax that encodes an R2RML mapping graph.

The media type for R2RML mapping documents is the same as for Turtle documents in general: text/turtle. The content encoding of Turtle content is always UTF-8 and the charset parameter on the media type SHOULD always be used: text/turtle;charset=utf-8. The preferred file extension is .ttl.

A conforming R2RML processor MUST accept R2RML mapping documents in Turtle syntax. It MAY accept R2RML mapping graphs encoded in other RDF syntaxes.

It is common to use document-local IRIs in mapping documents by defining the default prefix in the beginning of the document, and using it for creating IRIs for mapping components such as triples maps:

@prefix : <#> .
…
:EmpQuery rr:sqlQuery """SELECT * FROM EMP WHERE …""".
…
:EmpTriples rr:logicalTable :EmpQuery.

4.3 Data Errors

A data error is a condition of the data in the input database that would lead to the generation of an invalid RDF term, such as an invalid IRI or an ill-typed literal.

When providing access to the output dataset, an R2RML processor MUST abort any operation that requires inspecting or returning an RDF term whose generation would give rise to a data error, and report an error to the agent invoking the operation. A conforming R2RML processor MAY, however, allow other operations that do not require inspecting or returning these RDF terms, and thus MAY provide partial access to an output dataset that contains data errors. Nevertheless, an R2RML processor SHOULD report data errors as early as possible.

The following conditions give rise to data errors:

  1. A term map with term type rr:IRI results in the generation of an invalid IRI.
  2. A term map with a datatype override produces an ill-typed literal of a supported RDF datatype.

The presence of data errors does not make an R2RML mapping non-conforming.

Data errors cannot generally be detected by analyzing the table schema of the database, but only by scanning the data in the tables. For large and rapidly changing databases, this can be impractical. Therefore, R2RML processors are allowed to answer queries that do not “touch” a data error, and the behavior of such operations is well-defined. For the same reason, the conformance of R2RML mappings is defined without regard for the presence of data errors.

R2RML data validators can be used to explicitly scan a database for data errors.
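
A minimal sketch of such a scan is shown below, assuming a term map whose datatype override is xsd:integer. The function name find_data_errors, the row representation, and the sample rows are hypothetical; the regular expression covers only the lexical form of xsd:integer.

```python
import re

# Lexical form of xsd:integer: optional sign followed by decimal digits.
XSD_INTEGER = re.compile(r"[+-]?[0-9]+")


def find_data_errors(rows, column):
    """Return (row index, value) pairs that would yield an ill-typed xsd:integer literal."""
    errors = []
    for index, row in enumerate(rows):
        value = str(row[column])
        if XSD_INTEGER.fullmatch(value) is None:
            errors.append((index, value))
    return errors


rows = [{"STAFF": "1"}, {"STAFF": "twelve"}, {"STAFF": "-3"}]
errors = find_data_errors(rows, "STAFF")
print(errors)
```

Because the check needs every value of the column, its cost grows with the size of the table, which is why a processor is not obliged to run it before answering queries.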

5 Defining Logical Tables

Figure 2: The properties of logical tables

A logical table is a possibly virtual database table that is to be mapped to RDF triples. A logical table is either a SQL base table or view, or an R2RML view.

Every logical table has an effective SQL query that, if executed over the SQL connection, produces as its result the contents of the logical table.

A logical table row is a row in a logical table.

A column name is the name of a column of a logical table. A column name MUST be a valid SQL identifier. Column names do not include any qualifying table, view or schema names.

A SQL identifier is the name of a SQL object, such as a column, table, view, schema, or catalog. A SQL identifier MUST match the <identifier> production in [SQL2]. When comparing identifiers for equality, the comparison rules of [SQL2] MUST be used.

An informative summary of SQL identifier syntax rules:

  1. SQL identifiers can be delimited identifiers (with double quotes), or regular identifiers.
  2. Regular identifiers must start with a Unicode character from any of the following character classes: upper-case letter, lower-case letter, title-case letter, modifier letter, other letter, or letter number. Subsequent characters may be any of these, or a nonspacing mark, spacing combining mark, decimal number, connector punctuation, or formatting code.
  3. Regular identifiers are case-insensitive.
  4. Delimited identifiers can contain any character.
  5. Double quotes inside delimited identifiers must be immediately followed by another double quote.
  6. Delimited identifiers are case-sensitive.
  7. deptno and "deptno" are not equivalent (delimited identifiers that are not in all-upper-case are not equivalent to any undelimited identifiers).
  8. DEPTNO and "DEPTNO" are equivalent (all-upper-case delimited and undelimited identifiers are equivalent).
  9. Five examples of valid column names: deptno, dept_no, "dept_no", "Department Number", "Identifier ""with quotes""".

Note that in R2RML, a column name specified as an RDF plain literal or within curly braces is considered a delimited SQL identifier. Thus the SQL column name identifiers deptno, dept_no, "dept_no", and "Department Number" can be used as (part of) the object value of the relevant R2RML properties as follows:

[] rr:column "DEPTNO".
[] rr:parent "DEPT_NO".
[] rr:child "dept_no".
[] rr:template "http://data.example.com/department/{Department Number}".

Note that Turtle string syntax requires escaping of double quotes with a backslash, so the identifier "Identifier ""with quotes""" can be used as (part of) the value of the relevant R2RML properties as follows:

[] rr:column "Identifier \"\"with quotes\"\"".
[] rr:template "http://data.example.com/department/{Identifier \"\"with quotes\"\"}".

These rules are for Core SQL 2008. See Section 3, Conformance, regarding databases that do not conform to this version of SQL.
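
The equivalence rules summarized above can be approximated in a few lines of Python. This sketch is informal: the function names are hypothetical, and str.upper is only an approximation of the case folding that [SQL2] prescribes for regular identifiers.

```python
def normalize_identifier(identifier):
    """Map a SQL identifier, as written in SQL text, to a comparable form."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        # Delimited identifier: drop the outer quotes, unescape doubled quotes,
        # and keep the case as written.
        return identifier[1:-1].replace('""', '"')
    # Regular identifier: case-insensitive, folded to upper case here.
    return identifier.upper()


def same_identifier(a, b):
    """Compare two SQL identifiers under the simplified rules above."""
    return normalize_identifier(a) == normalize_identifier(b)


print(same_identifier("deptno", '"deptno"'))
print(same_identifier("DEPTNO", '"DEPTNO"'))
print(normalize_identifier('"Identifier ""with quotes"""'))
```

Under this approximation the regular identifier deptno normalizes to DEPTNO, which matches the value given to rr:column in the example above.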

5.1 Base Tables and SQL Views (rr:tableName)

A SQL base table or view is a logical table containing SQL data from a base table or view in the input database. A SQL base table or view is represented by a resource that has exactly one rr:tableName property.

The value of rr:tableName specifies the table or view name of the base table or view. Its value MUST be a valid schema-qualified name that names an existing base table or view in the input database.

A schema-qualified name is a sequence of one, two or three valid SQL identifiers, separated by the dot character (“.”). The three identifiers name, respectively, a catalog, a schema, and a table or view. If no catalog or schema is specified, then the default catalog and default schema of the SQL connection are assumed.

The effective SQL query of a SQL base table or view is:

SELECT * FROM {table}

with {table} replaced with the table or view name.

The following example shows a logical table specified using a schema-qualified table name.

[] rr:tableName "SCOTT.DEPT".

The following example shows a logical table specified using an unqualified table name. The SQL connection's default schema will be used.

[] rr:tableName "DEPT".

5.2 R2RML Views (rr:sqlQuery, rr:sqlVersion)

An R2RML view is a logical table whose contents are the result of executing a SQL query against the input database. It is represented by a resource that has exactly one rr:sqlQuery property, whose value MUST be a valid SQL query.

R2RML mappings sometimes require data transformation, computation, or filtering before generating triples from the database. This can be achieved by defining a SQL view in the input database and referring to it with rr:tableName. However, this approach may not be practical for lack of database privileges or other reasons. R2RML views achieve the same effect without requiring changes to the input database.

Note that unlike “real” SQL views, an R2RML view cannot be used as an input table in further SQL queries.

A SQL query is a SELECT query in the SQL language that can be executed over the input database. The value of rr:sqlQuery MUST conform to the production <direct select statement: multiple rows> in [SQL2] with an OPTIONAL trailing semicolon character and OPTIONAL surrounding white space (excluding comments) as defined in [TURTLE]. It MUST be a valid SQL query if executed over the SQL connection. It MUST NOT have duplicate column names or unnamed derived columns in the SELECT list. For any database objects referenced without an explicit catalog name or schema name, the default catalog and default schema of the SQL connection are used.

An R2RML view MAY have one or more SQL version identifiers. They MUST be valid IRIs and are represented as values of the rr:sqlVersion property. The following SQL version identifier indicates that the SQL query conforms to Core SQL 2008:

http://www.w3.org/ns/r2rml#SQL2008

The absence of a SQL version identifier indicates that no claim to Core SQL 2008 conformance is made.

No further identifiers besides rr:SQL2008 are defined in this specification. The RDB2RDF Working Group intends to maintain a non-normative list of identifiers for other SQL versions [SQLIRIS].

The effective SQL query of an R2RML view is the value of its rr:sqlQuery property.

The following example shows a logical table specified as an R2RML view conforming to Core SQL 2008.

[] rr:sqlQuery """
        Select ('Department' || DEPTNO) AS DEPTID
             , DEPTNO
             , DNAME
             , LOC
          from SCOTT.DEPT
    """;
    rr:sqlVersion rr:SQL2008.
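
The two definitions of the effective SQL query (for base tables and views, and for R2RML views) can be combined in one small function. This is a sketch under stated assumptions: the logical table is modelled as a Python dictionary, and the keys "tableName" and "sqlQuery" are hypothetical stand-ins for the rr:tableName and rr:sqlQuery properties.

```python
def effective_sql_query(logical_table):
    """Return the effective SQL query of a logical table given as a dict."""
    has_table = "tableName" in logical_table
    has_query = "sqlQuery" in logical_table
    if has_table == has_query:
        # Either neither or both properties are present.
        raise ValueError("expected exactly one of tableName or sqlQuery")
    if has_table:
        # SQL base table or view: SELECT * FROM {table}
        return "SELECT * FROM " + logical_table["tableName"]
    # R2RML view: the query is used as written.
    return logical_table["sqlQuery"]


print(effective_sql_query({"tableName": "SCOTT.DEPT"}))
print(effective_sql_query({"sqlQuery": "SELECT DEPTNO, DNAME FROM SCOTT.DEPT"}))
```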

6 Mapping Logical Tables to RDF with Triples Maps

Figure 3: The properties of triples maps

A triples map specifies a rule for translating each row of a logical table to zero or more RDF triples.

The RDF triples generated from one row in the logical table all share the same subject.

A triples map is represented by a resource that references the following other resources:

The referenced columns of all term maps of a triples map (subject map, predicate maps, object maps, graph maps) MUST be column names that exist in the term map's logical table. Furthermore, the columns carrying these names in the logical table MUST be of a SQL datatype for which conversion to string is defined.

Conversion to string is undefined in SQL 2008 for row types, array types, user-defined datatypes that do not have a user-defined string CAST, and a few other exotic types.
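
The first of these requirements, that every referenced column exists in the logical table, could be checked along the following lines. The sketch assumes term maps given as Python dictionaries with hypothetical keys "column" and "template", compares column names as exact strings, and does not handle escaped curly braces in templates.

```python
import re


def referenced_columns(term_map):
    """Collect the column names that a single term map refers to."""
    columns = set()
    if "column" in term_map:
        columns.add(term_map["column"])
    if "template" in term_map:
        # Column names appear between curly braces in a template.
        columns.update(re.findall(r"\{([^}]+)\}", term_map["template"]))
    return columns


def missing_columns(term_maps, table_columns):
    """Return referenced column names that the logical table does not have."""
    needed = set()
    for term_map in term_maps:
        needed |= referenced_columns(term_map)
    return needed - set(table_columns)


term_maps = [
    {"template": "http://data.example.com/department/{DEPTNO}"},
    {"column": "DNAME"},
    {"column": "LOCATION"},  # deliberately wrong: the DEPT column is LOC
]
missing = missing_columns(term_maps, ["DEPTNO", "DNAME", "LOC"])
print(missing)
```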

The following example shows a triples map including its logical table, subject map, and two predicate-object maps.

[]
    rr:logicalTable [ rr:tableName "DEPT" ];
    rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ].

The logical table may also be specified directly on the same resource, without introducing an intermediate resource:

[]
    rr:tableName "DEPT";
    rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    # …
    .

6.1 Creating Resources with Subject Maps

A subject map is a term map. It specifies a rule for generating the subjects of the RDF triples generated by a triples map.

6.2 Typing Resources (rr:class)

A subject map MAY have one or more class IRIs. They are represented by the rr:class property. The values of the rr:class property MUST be IRIs. For each RDF term generated by the subject map, RDF triples with predicate rdf:type and the class IRI as object will be generated.

This property is merely a shortcut for specifying an rr:predicateObjectMap with predicate rdf:type and the rr:class IRI as a constant object. Mappings where the class IRI is not constant, but needs to be computed based on the contents of the database, can be achieved by defining such an rr:predicateObjectMap with a non-constant object.

In the following example, the generated subject will be asserted as an instance of the ex:Employee class.

[] rr:template "http://data.example.com/employee/{EMPNO}"; 
   rr:class ex:Employee.

Using the example EMP table, the following RDF triple will be generated:

<http://data.example.com/employee/7369> rdf:type ex:Employee.

6.3 Creating Properties and Values with Predicate-Object Maps

A predicate-object map is a function that creates predicate-object pairs from logical table rows. It is used in conjunction with a subject map to generate RDF triples in a triples map.

A predicate-object map is represented by a resource that references the following other resources:

A predicate map is a term map.

An object map is a term map.

7 Creating RDF Terms with Term Maps

Figure 4: The properties of term maps

An RDF term is either an IRI, or a blank node, or a literal.

A term map is a function that generates an RDF term from a logical table row. The result of that function is known as the term map's generated RDF term.

Term maps are used to generate the subjects, predicates and objects of the RDF triples that are generated by a triples map. Consequently, there are several kinds of term maps, depending on where in the mapping they occur: subject maps, predicate maps, object maps and graph maps.

A term map MUST be exactly one of the following: a constant-valued term map, a column-valued term map, or a template-valued term map.

The referenced columns of a term map are the set of column names referenced in the term map and depend on the type of term map.

7.1 Constant RDF Terms (rr:constant)

A constant-valued term map is a term map that ignores the logical table row and always generates the same RDF term. A constant-valued term map is represented by a resource that has exactly one rr:constant property.

The constant value of a constant-valued term map is the RDF term that is the value of its rr:constant property.

If the constant-valued term map is a subject map, predicate map or graph map, then its constant value MUST be an IRI.

If the constant-valued term map is an object map, then its constant value MUST be an IRI or literal.

The referenced columns of a constant-valued term map is the empty set.

Constant-valued term maps can be expressed more concisely using the constant shortcut properties rr:subject, rr:predicate, rr:object and rr:graph. Occurrences of these properties MUST be treated exactly as if the following triples were present in the mapping graph instead:

Triple involving constant shortcut property Replacement triples
aaa rr:subject bbb. aaa rr:subjectMap [ rr:constant bbb ].
aaa rr:predicate bbb. aaa rr:predicateMap [ rr:constant bbb ].
aaa rr:object bbb. aaa rr:objectMap [ rr:constant bbb ].
aaa rr:graph bbb. aaa rr:graphMap [ rr:constant bbb ].

The following example shows a predicate-object map that uses a constant-valued term map both for its predicate and for its object.

[] rr:predicateMap [ rr:constant rdf:type ];
   rr:objectMap [ rr:constant ex:Employee ].

If added to a triples map, this predicate-object map would add the following triple to all resources ?x generated by the triples map:

?x rdf:type ex:Employee.

The following example uses constant shortcut properties and is equivalent to the example above:

[] rr:predicate rdf:type;
   rr:object ex:Employee.
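
The shortcut expansion from the replacement table can be sketched in Python. This is an illustrative sketch only, not part of R2RML: triples are modelled as plain tuples of strings, and the function name `expand_shortcuts` and the `_:tm1`, `_:tm2` blank node labels are assumptions made for the example.

```python
from itertools import count

# Constant shortcut property -> the term map property it stands for.
SHORTCUTS = {
    "rr:subject": "rr:subjectMap",
    "rr:predicate": "rr:predicateMap",
    "rr:object": "rr:objectMap",
    "rr:graph": "rr:graphMap",
}


def expand_shortcuts(triples):
    """Replace every shortcut triple with its two replacement triples."""
    fresh = count(1)  # source of hypothetical blank node labels
    expanded = []
    for s, p, o in triples:
        if p in SHORTCUTS:
            node = "_:tm{}".format(next(fresh))
            # aaa rr:xxx bbb.  becomes  aaa rr:xxxMap [ rr:constant bbb ].
            expanded.append((s, SHORTCUTS[p], node))
            expanded.append((node, "rr:constant", o))
        else:
            expanded.append((s, p, o))
    return expanded
```

Applied to the two shortcut triples of the example above, the sketch yields the four triples of the longer form that uses rr:predicateMap and rr:objectMap.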

7.2 From a Column (rr:column)

A column-valued term map is a term map that is represented by a resource that has exactly one rr:column property.

The value of the rr:column property MUST be a valid column name. The column value of the term map is the data value of that column in a given logical table row.

The referenced columns of a column-valued term map is the singleton set containing the value of rr:column.

The following example defines an object map that generates literals from the DNAME column of some logical table.

[] rr:objectMap [ rr:column "DNAME" ].

Using the sample row from the DEPT table as a logical table row, the column value of the object map would be “APPSERVER”.

7.3 From a Template (rr:template)

A template-valued term map is a term map that is represented by a resource that has exactly one rr:template property. The value of the rr:template property MUST be a valid string template.

A string template is a format string that can be used to build strings from multiple components. It can reference column names by enclosing them in curly braces. The following syntax rules apply to valid string templates:

The template value of the term map for a given logical table row is determined as follows:

  1. Let result be the template string
  2. For each pair of unescaped curly braces in result:
    1. Let value be the data value of the column whose name is enclosed in the curly braces
    2. If value is NULL, then return NULL
    3. Apply conversion to string to value
    4. If the term type is rr:IRI, then replace the pair of curly braces with an IRI-safe version of value; otherwise, replace the pair of curly braces with value
  3. Return result

The IRI-safe version of a string is obtained by applying the following transformation to any character that is not in the iunreserved production in [RFC3987]:

  1. Convert the character to a sequence of one or more octets using UTF-8 [RFC3629]
  2. Percent-encode each octet [RFC3986]

The referenced columns of a template-valued term map is the set of column names enclosed in unescaped curly braces in the template string.

The following example defines a subject map that generates IRIs from the DEPTNO column of a logical table.

[] rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ].

Using the sample row from the DEPT table as a logical table row, the template value of the subject map would be:

http://data.example.com/department/10

The following example shows how an IRI-safe template value is created:

[] rr:subjectMap [ rr:template "http://data.example.com/site/{LOC}" ].

Using the sample row from the DEPT table as a logical table row, the template value of the subject map would be:

http://data.example.com/site/NEW%20YORK

The space character is not in the iunreserved set, and therefore percent-encoding is applied to the character, yielding “%20”.
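
The template evaluation steps and the IRI-safe transformation can be sketched in Python. This is a minimal illustration under stated assumptions, not a normative implementation: the names `iri_safe` and `template_value` are invented for the example, a logical table row is modelled as a dict from column name to value with `None` standing for NULL, Python's `str()` stands in for conversion to string, and a backslash is assumed to escape the character that follows it.

```python
# ucschar ranges from RFC 3987; together with ASCII letters, digits and
# "-._~" they make up the iunreserved production.
_UCSCHAR_RANGES = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    + [(plane << 16, (plane << 16) + 0xFFFD) for plane in range(1, 14)]
    + [(0xE1000, 0xEFFFD)]
)


def is_iunreserved(ch):
    if ch.isascii():
        return ch.isalnum() or ch in "-._~"
    return any(lo <= ord(ch) <= hi for lo, hi in _UCSCHAR_RANGES)


def iri_safe(value):
    """Percent-encode the UTF-8 octets of every non-iunreserved character."""
    out = []
    for ch in value:
        if is_iunreserved(ch):
            out.append(ch)
        else:
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)


def template_value(template, row, term_type_is_iri=True):
    """Return the template value for one row, or None if a column is NULL."""
    result = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\" and i + 1 < len(template):
            result.append(template[i + 1])  # escaped character, taken literally
            i += 2
        elif ch == "{":
            end = template.index("}", i)
            value = row[template[i + 1:end]]
            if value is None:
                return None
            text = str(value)  # stand-in for SQL conversion to string
            result.append(iri_safe(text) if term_type_is_iri else text)
            i = end + 1
        else:
            result.append(ch)
            i += 1
    return "".join(result)
```

For a row with LOC set to “NEW YORK”, `template_value("http://data.example.com/site/{LOC}", row)` produces the percent-encoded IRI shown above; passing `term_type_is_iri=False` leaves the column value unencoded.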

The following example shows the use of backslash escapes in string templates. The template will generate a fancy title such as

{{{ Hello World! }}}

from a string “Hello World!” in the TITLE column.

[] rr:objectMap [ rr:template "\\{\\{\\{ {TITLE} \\}\\}\\}" ].

Note that because backslashes need to be escaped by a second backslash in the Turtle syntax [TURTLE], a double backslash is needed to escape each curly brace.

7.4 IRIs, Literals, Blank Nodes (rr:termType)

The term type of a column-valued term map or template-valued term map determines the kind of generated RDF term (IRIs, blank nodes or literals).

If the term map has an optional rr:termType property, then its term type is the value of that property. The value MUST be an IRI and MUST be one of the following options:

If the term map does not have a rr:termType property, then its term type is:

Term maps with term type rr:IRI cause data errors if the value is not a valid IRI (see generated RDF term for details). Data values from the input database may require percent-encoding before they can be used in IRIs. Template-valued term maps are a convenient way of percent-encoding data values.

7.5 Language Tags (rr:language)

A term map with a term type of rr:Literal MAY have a specified language tag. It is represented by the rr:language property on a term map. If present, its value MUST be a valid language tag.

A specified language tag causes generated literals to be language-tagged plain literals. In the following example, plain literals with language tag “en-us” (U.S. English) will be generated for the data values in the DNAME column.

[] rr:objectMap [ rr:column "DNAME"; rr:language "en-us" ].

7.6 Typed Literals (rr:datatype)

A typeable term map is a term map with a term type of rr:Literal that does not have a specified language tag.

Typeable term maps may generate typed literals. The datatype of these literals can be explicitly specified using rr:datatype, or automatically determined based on the SQL datatype of the underlying logical table column.

A typeable term map MAY have a rr:datatype property. Its value MUST be an IRI. This IRI is the specified datatype of the term map.

A term map MUST NOT have more than one rr:datatype value.

A term map that is not a typeable term map MUST NOT have an rr:datatype property.

A typeable term map has an implicit datatype and an implicit transform. They are determined as follows:

A datatype override is in effect on a typeable term map if it has a specified datatype, and the specified datatype is different from its implicit datatype.

See generated RDF term for further details.

R2RML does not allow generating plain literals without language tag from non-string columns. One can use a derived column that uses a SQL CAST expression instead.

The following example shows an object map that overrides the default datatype of the logical table with an explicitly specified xsd:positiveInteger type. Whatever is in the EMPNO column will be subjected to conversion to string, and turned into a literal of that type.

[] rr:objectMap [ rr:column "EMPNO"; rr:datatype xsd:positiveInteger ].

7.7 Inverse Expressions (rr:inverseExpression)

An inverse expression is a string template associated with a column-valued term map or template-valued term map. It is represented by the value of the rr:inverseExpression property. This property is OPTIONAL and there MUST NOT be more than one for a term map.

Inverse expressions are useful for optimizing term maps that reference derived columns in R2RML views. An inverse expression specifies an expression that allows “reversing” of a generated RDF term and the construction of a SQL query that efficiently retrieves the logical table row from which the term was generated. In particular, it allows the use of indexes on the underlying relational tables.

Every pair of unescaped curly braces in the inverse expression is a column reference in an inverse expression. The string between the braces MUST be a valid column name.

An inverse expression MUST satisfy the following condition:

For example, for the DEPTID column in the logical table used for mapping the DEPT table in this example mapping, an inverse expression could be defined as follows:

[] rr:column "DEPTID";
   rr:inverseExpression "{DEPTNO} = substr({DEPTID},length('Department')+1)".

This facilitates the use of an existing index on the DEPTNO column of the DEPT table.

A quoted and escaped data value is a SQL literal that can be used in a SQL query, such as:

8 Foreign Key Relationships among Logical Tables (rr:parentTriplesMap, rr:joinCondition, rr:child and rr:parent)

Figure 5: The properties of referencing object maps

A referencing object map allows using the subjects of another triples map as the objects generated by a predicate-object map. Since both triples maps may be based on different logical tables, this may require a join between the logical tables. A referencing object map is represented by a resource that:

A join condition is a resource that has exactly two properties:

The child query of a referencing object map is the effective SQL query of the logical table containing the referencing object map.

The parent query of a referencing object map is the effective SQL query of the logical table of its parent triples map.

If the child query and parent query of a referencing object map are not identical, then the referencing object map MUST have at least one join condition.

The joint SQL query of a referencing object map is:

The following example shows a referencing object map as part of a predicate-object map:

[] rr:predicateObjectMap [
    rr:predicate ex:department;
    rr:refObjectMap [
        rr:parentTriplesMap <#TriplesMap2>;
        rr:joinCondition [
            rr:child "DEPTNO";
            rr:parent "DEPTNO";
        ];
    ];
].

If the logical table of the surrounding triples map is EMP, and the logical table of <#TriplesMap2> is DEPT, this would result in a join between these two tables with the condition

EMP.DEPTNO = DEPT.DEPTNO

and the objects of the triples would be generated using the subject map of <#TriplesMap2>.

Given the two example tables, and subject maps as defined in the example mapping, this would result in a triple:

<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.
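
The effect of this referencing object map can be reproduced with a small Python sketch that uses the standard `sqlite3` module. Everything here is illustrative: the in-memory tables hold only the sample values that appear in this document's examples, the helper names are invented, and the join query is a simplified stand-in rather than the joint SQL query as defined by this specification.

```python
import sqlite3


def build_example_db():
    """Create in-memory EMP and DEPT tables with one sample row each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT, LOC TEXT);
        CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, JOB TEXT, DEPTNO INTEGER);
        INSERT INTO DEPT VALUES (10, 'APPSERVER', 'NEW YORK');
        INSERT INTO EMP VALUES (7369, 'CLERK', 10);
    """)
    return conn


def referencing_triples(conn):
    """Join child (EMP) and parent (DEPT) rows and emit ex:department triples."""
    sql = (
        "SELECT child.EMPNO, parent.DEPTNO "
        "FROM EMP AS child, DEPT AS parent "
        "WHERE child.DEPTNO = parent.DEPTNO"
    )
    triples = []
    for empno, deptno in conn.execute(sql):
        subject = "http://data.example.com/employee/{}".format(empno)
        obj = "http://data.example.com/department/{}".format(deptno)
        triples.append((subject, "ex:department", obj))
    return triples
```

The subject comes from the child row and the object from the parent row, mirroring how the subject map of the parent triples map supplies the object.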

9 Assigning Triples to Named Graphs

Figure 6: The properties of graph maps

Each triple generated from an R2RML mapping is placed into one or more graphs of the output dataset. Possible target graphs are the unnamed default graph, and the IRI-named named graphs.

Any subject map or predicate-object map MAY have one or more associated graph maps. They are specified in one of two ways:

  1. using the rr:graphMap property, whose value MUST be a graph map,
  2. using the constant shortcut property rr:graph.

Graph maps are themselves term maps. When RDF triples are generated, the set of target graphs is determined by taking into account any graph maps associated with the subject map or predicate-object map.

If a graph map generates the special IRI rr:defaultGraph, then the target graph is the default graph of the output dataset.

In the following subject map example, all generated RDF triples will be stored in the named graph ex:DepartmentGraph.

[] rr:subjectMap [
    rr:template "http://data.example.com/department/{DEPTNO}";
    rr:graphMap [ rr:constant ex:DepartmentGraph ];
].

This is equivalent to the following example, which uses a constant shortcut property:

[] rr:subjectMap [
    rr:template "http://data.example.com/department/{DEPTNO}";
    rr:graph ex:DepartmentGraph;
].

In the following example, RDF triples are placed into named graphs according to the job title of employees:

[] rr:subjectMap [
    rr:template "http://data.example.com/employee/{EMPNO}";
    rr:graphMap [ rr:template "http://data.example.com/jobgraph/{JOB}" ];
].

The triples generated from the EMP table would be placed in the named graph with the following IRI:

<http://data.example.com/jobgraph/CLERK>

9.1 Scope of Blank Nodes

Blank nodes in the output dataset are scoped to a single RDF graph. If the same blank node identifier occurs in multiple RDF triples that are in the same graph, then the triples will share the same single blank node. If, however, the same blank node identifier occurs in multiple graphs, then a distinct blank node is created for each graph. An R2RML-generated blank node can never be shared by two triples in two different graphs.

This implies that triples generated from a single logical table row will have different subjects if the subjects are blank nodes and the triples are placed into different graphs.
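
One way to picture this scoping rule is a registry that keys blank nodes by graph as well as by identifier. The Python sketch below is only an illustration; the class name `BlankNodeRegistry` and the graph names used with it are assumptions, not part of R2RML.

```python
class BlankNodeRegistry:
    """Hands out blank nodes that are unique per (graph, identifier) pair."""

    def __init__(self):
        self._nodes = {}

    def node(self, graph, identifier):
        key = (graph, identifier)
        if key not in self._nodes:
            # A fresh, opaque object stands for a new blank node.
            self._nodes[key] = object()
        return self._nodes[key]
```

Asking twice for the same identifier in the same graph returns the same node, while the same identifier in a second graph yields a distinct node.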

10 Datatype Conversions

This section defines various conversion rules applicable to data values. The rules are invoked in various places throughout this specification, in particular around rr:datatype and in the term generation rules.

A typed literal of a supported RDF datatype is ill-typed if its lexical form is not in the lexical space of the RDF datatype identified by its datatype IRI.

For example, "X"^^xsd:boolean is ill-typed because “X” is not in the lexical space of xsd:boolean [XMLSCHEMA2].

10.1 Table of Corresponding Datatypes

The corresponding RDF datatype of a SQL datatype is given in the table below, or empty if the SQL datatype does not occur in the table.

The RDF transformation of a SQL datatype is a transformation rule given in the table below, or conversion to string if the SQL datatype does not occur in the table.

The supported RDF datatypes are the datatypes for which an implementation can detect ill-typed literals. This set MUST include all datatypes mentioned in the table below in the column “Corresponding RDF datatype”, according to their definitions in [XMLSCHEMA2]. This set MAY include arbitrary further datatypes.

SQL datatype                                 Corresponding RDF datatype  Transformation
BINARY, BINARY VARYING, BINARY LARGE OBJECT  xsd:base64Binary            base64 encoding
NUMERIC, DECIMAL                             xsd:decimal                 conversion to string
SMALLINT, INTEGER, BIGINT                    xsd:integer                 conversion to string
FLOAT, REAL, DOUBLE PRECISION                xsd:double                  conversion to string
BOOLEAN                                      xsd:boolean                 conversion to boolean
DATE                                         xsd:date                    conversion to datetime
TIME                                         xsd:time                    conversion to datetime
TIMESTAMP                                    xsd:dateTime                conversion to datetime
INTERVAL                                     undefined                   undefined

Any types not appearing in the table, including all character string types and vendor-specific types, will default to producing RDF plain literals by using conversion to string.
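
A processor might hold the table of corresponding datatypes as a simple lookup structure. The Python sketch below is illustrative only: the function name `corresponding_datatype` is an assumption, `None` stands for an empty or undefined entry, and the fallback for unlisted types follows the two definitions at the start of this section.

```python
_TABLE = {}
for _types, _datatype, _transform in [
    (("BINARY", "BINARY VARYING", "BINARY LARGE OBJECT"),
     "xsd:base64Binary", "base64 encoding"),
    (("NUMERIC", "DECIMAL"), "xsd:decimal", "conversion to string"),
    (("SMALLINT", "INTEGER", "BIGINT"), "xsd:integer", "conversion to string"),
    (("FLOAT", "REAL", "DOUBLE PRECISION"), "xsd:double", "conversion to string"),
    (("BOOLEAN",), "xsd:boolean", "conversion to boolean"),
    (("DATE",), "xsd:date", "conversion to datetime"),
    (("TIME",), "xsd:time", "conversion to datetime"),
    (("TIMESTAMP",), "xsd:dateTime", "conversion to datetime"),
    (("INTERVAL",), None, None),  # left undefined by the table
]:
    for _name in _types:
        _TABLE[_name] = (_datatype, _transform)


def corresponding_datatype(sql_type):
    """Return (RDF datatype, transformation) for a SQL datatype name."""
    # Types absent from the table have no corresponding RDF datatype and
    # fall back to conversion to string.
    return _TABLE.get(sql_type.upper(), (None, "conversion to string"))
```

A vendor-specific extension of the table would amount to adding further entries to the lookup.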

R2RML processor implementations are expected to augment the table with additional rows for mapping vendor-specific datatypes to appropriate XSD types.

The translation of INTERVAL is left undefined due to the complexity of the translation. [SQL14] describes a translation of INTERVAL to xdt:yearMonthDuration and xdt:dayTimeDuration.

The following table shows examples of various SQL data values after conversion to string, and a typed literal of the corresponding RDF datatype, derived by applying the SQL datatype's RDF transformation:

SQL datatype  Conversion to string example   Typed literal example
DECIMAL       2000000000005.9                "2000000000005.9"^^xsd:decimal
DECIMAL       2000000000000                  "2000000000000"^^xsd:decimal
INTEGER       -1                             "-1"^^xsd:integer
REAL          5.0E-1                         "5.0E-1"^^xsd:double
REAL          0E0                            "0E0"^^xsd:double
DATE          DATE 2011-08-23                "2011-08-23"^^xsd:date
TIME          TIME 22:17:00                  "22:17:00"^^xsd:time
TIME          TIME 22:17:00.0000             "22:17:00.0000"^^xsd:time
TIME          TIME 22:17:00+01:00            "22:17:00+01:00"^^xsd:time
TIMESTAMP     TIMESTAMP 2011-08-23 22:17:00  "2011-08-23T22:17:00"^^xsd:dateTime
10.2 Conversion to string

Conversion to string is the process of transforming a SQL data value to a Unicode string. Its result MUST be the same as evaluating the following SQL expression, as defined in [SQL2]:

CAST(value AS CHARACTER VARYING(max))

where value is the quoted and escaped form of the SQL data value, and max is the implementation-dependent maximum length of a variable-length character string.

An informative summary of the rules for casting standard SQL 2008 datatypes to string follows. The right column of the table contains a regular expression that is matched by the string-converted form of all SQL data values of the types in the left column:

Type                           Pattern
DECIMAL, NUMERIC               -?(\.\d+|\d+(\.\d+)?)
SMALLINT, INTEGER, BIGINT      -?\d+
FLOAT, REAL, DOUBLE PRECISION  0E0|-?[1-9]\.\d+
DATE                           \d\d\d\d-\d\d-\d\d
TIME                           \d\d:\d\d:\d\d(.\d+)?([+-]\d\d:\d\d)?
TIMESTAMP                      \d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d(.\d+)?([+-]\d\d:\d\d)?
BOOLEAN                        TRUE|FALSE

The result of conversion to string is always the shortest possible string that, if interpreted as a SQL literal of the original SQL datatype, has the same value as the original SQL data value. For example, converting the DECIMAL value 1 to string yields 1, not the longer equal-valued strings 01 or 1.0.

10.3 Conversion to xsd:boolean

Conversion to boolean is the process of transforming a SQL data value of datatype BOOLEAN to a string that is compatible with the xsd:boolean datatype. It consists of the following steps:

  1. Apply conversion to string to the SQL data value,
  2. convert the resulting string to lowercase.

Example: The result of converting a BOOLEAN SQL data value to string is either TRUE or FALSE. The resulting typed literal is either "true"^^xsd:boolean or "false"^^xsd:boolean.

10.4 Conversion to Datetime

Conversion to datetime is the process of transforming a SQL data value of datatype DATE, TIME or TIMESTAMP to a string that is compatible with the corresponding XSD datatype. It consists of the following steps:

  1. Apply conversion to string to the SQL data value.
  2. Remove any initial string “DATE”, “TIME” or “TIMESTAMP” and any leading spaces from the resulting string.
  3. If the SQL data value is of datatype TIMESTAMP, then replace the 11th character of the string (a space) with an upper-case “T”.

Any fractional seconds and/or time zone interval present after conversion to string is included in the resulting string.

Examples for conversion to datetime can be found in the table above.
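
The three steps can be sketched in Python as follows. This is an illustrative sketch: the function name `to_datetime_lexical` is an assumption, and its input is taken to be the string that conversion to string has already produced (step 1), in the form shown in the examples table.

```python
def to_datetime_lexical(converted, sql_type):
    """Apply steps 2 and 3 to the result of conversion to string."""
    text = converted
    # Step 2: drop an initial keyword; TIMESTAMP is tested before TIME
    # because the longer keyword starts with the shorter one.
    for keyword in ("TIMESTAMP", "TIME", "DATE"):
        if text.startswith(keyword):
            text = text[len(keyword):]
            break
    text = text.lstrip(" ")
    # Step 3: for TIMESTAMP, the 11th character (index 10) becomes "T".
    if sql_type.upper() == "TIMESTAMP":
        text = text[:10] + "T" + text[11:]
    return text
```

Fractional seconds and time zone offsets are untouched by these steps, so they carry over into the result unchanged.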

10.5 Conversion to xsd:base64Binary

Base64 encoding is the process of transforming a binary SQL data value to a string that is compatible with the xsd:base64Binary datatype, by applying base64 encoding as restricted for xsd:base64Binary [XMLSCHEMA2] on the binary value.
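
In Python, for instance, the standard `base64` module produces a string without line breaks that can serve as the lexical form. The function name `to_base64_lexical` is an assumption made for this sketch, and the binary value is modelled as a `bytes` object.

```python
import base64


def to_base64_lexical(data):
    """Encode a binary value as a base64 string without line breaks."""
    return base64.b64encode(data).decode("ascii")
```

For example, the three octets 0x00 0x01 0x02 are encoded as “AAEC”.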

11 The Output Dataset

The output dataset of an R2RML mapping is an RDF dataset that contains the generated RDF triples for each of the triples maps of the R2RML mapping. The output dataset MUST NOT contain any other RDF triples or named graphs besides these.

If a table or column is not explicitly referenced in a triples map, then no RDF triples will be generated for that table or column.

Conforming R2RML processors MAY rename blank nodes when providing access to the output dataset. This means that client applications may see actual blank node identifiers that differ from those produced by the R2RML mapping. Client applications SHOULD NOT rely on the specific text of the blank node identifier for any purpose.

RDF datasets may contain empty named graphs. R2RML cannot generate such output datasets.

11.1 The Generated RDF Triples of a Triples Map

This subsection describes the process of generating RDF triples from a triples map. This process adds RDF triples to the output dataset. Each generated triple is placed into one or more particular graphs of the output dataset.

The generated RDF triples are determined by the following algorithm. R2RML processors MAY use other means than implementing this algorithm to compute the generated RDF triples, as long as the result is the same.

  1. Let sm be the subject map of the triples map
  2. Let rows be the result of evaluating the effective SQL query of the triples map's logical table using the SQL connection
  3. Let classes be the class IRIs of sm
  4. Let sgm be the set of graph maps of sm
  5. For each logical table row row in rows, apply the following steps:
    1. Let subject be the generated RDF term that results from applying sm to row
    2. Let subject_graphs be the union of the generated RDF terms that result from applying any term maps in sgm to row
    3. If classes is not empty, then for each IRI in classes, add the following triples to the output dataset:

      Subject: subject
      Predicate: rdf:type
      Object: classes
      Target graphs: If sgm is empty: rr:defaultGraph; otherwise: subject_graphs

    4. For each predicate-object map of the triples map, apply the following steps:
      1. If the predicate-object map has no object map (but a referencing object map), then skip these substeps for this predicate-object map
      2. Let predicate be the generated RDF term that results from applying the predicate-object map's predicate map to row
      3. Let object be the generated RDF term that results from applying the predicate-object map's object map to row
      4. Let pogm be the set of graph maps of the predicate-object map
      5. Let predicate-object_graphs be the union of the generated RDF terms that result from applying any graph maps in pogm to row
      6. Add the following triples to the output dataset:

        Subject: subject
        Predicate: predicate
        Object: object
        Target graphs: If sgm and pogm are empty: rr:defaultGraph; otherwise: union of subject_graphs and predicate-object_graphs

  6. For each predicate-object map of the triples map, apply the following steps:
    1. If the predicate-object map has no referencing object map (but a normal object map), then skip these substeps for this predicate-object map
    2. Let psm be the subject map of the parent triples map of the referencing object map
    3. Let pogm be the set of graph maps of the predicate-object map
    4. Let rows be the result of evaluating the joint SQL query of the referencing object map
    5. For each row in rows, apply the following steps:
      1. Let child_row be the subset of row whose columns are present in the referencing object map's child query
      2. Let parent_row be the subset of row whose columns are present in the referencing object map's parent query
      3. Let subject be the generated RDF term that results from applying sm to child_row
      4. Let predicate be the generated RDF term that results from applying the predicate-object map's predicate map to child_row
      5. Let object be the generated RDF term that results from applying psm to parent_row
      6. Let subject_graphs be the union of the generated RDF terms that result from applying any graph maps of sgm to child_row
      7. Let predicate-object_graphs be the union of the generated RDF terms that result from applying any graph maps in pogm to child_row
      8. Add the following triples to the output dataset:

        Subject: subject
        Predicate: predicate
        Object: object
        Target graphs: If neither sgm nor pogm has any graph maps: rr:defaultGraph; otherwise: union of subject_graphs and predicate-object_graphs

The process of adding triples to the output dataset takes as its input:

For each possible combination <s, p, o>, where s is a member of Subjects, p a member of Predicates and o a member of Objects:

  1. Generate an RDF triple <s, p, o>
  2. If the set of target graphs includes rr:defaultGraph, add the triple to the default graph of the output dataset.
  3. For each IRI in the set of target graphs that is not equal to rr:defaultGraph, add the triple to a named graph of that name in the output dataset. If the output dataset does not contain a named graph with that IRI, create it first.

RDF graphs cannot contain duplicate RDF triples. Placing multiple equal triples into the same graph has the same effect as placing the triple into the graph only once.
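
The process of adding triples to the output dataset can be sketched in Python. The representation is an assumption made for illustration: the dataset is a dict that maps a graph name to a set of triples, the key `None` stands for the default graph, and the function name `add_triples` is invented.

```python
from itertools import product

DEFAULT_GRAPH = "rr:defaultGraph"


def add_triples(dataset, subjects, predicates, objects, target_graphs):
    """Add every <s, p, o> combination to each of the target graphs."""
    for triple in product(subjects, predicates, objects):
        for graph in target_graphs:
            key = None if graph == DEFAULT_GRAPH else graph
            # setdefault creates the graph on first use; storing triples
            # in a set makes duplicate additions harmless.
            dataset.setdefault(key, set()).add(triple)
    return dataset
```

Because the combinations are a cross product, an empty set of subjects, predicates or objects (for example after a NULL value) adds no triples and creates no graphs.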

11.2 The Generated RDF Terms of a Term Map

A term map is a function that generates a set of RDF terms from a logical table row. The result of that function can be:

The generated RDF terms of a term map for a given logical table row are determined as follows:

The term generation rules are as follows:

  1. If the value is NULL, then no RDF term is generated.
  2. Otherwise, if the term map's term type is rr:IRI:
    1. Apply conversion to string to the value.
    2. If the value is a valid absolute IRI [RFC3987], then generate an IRI.
    3. Otherwise, prepend the value with the base IRI. If the result is a valid absolute IRI [RFC3987], then generate an IRI from the result.
    4. Otherwise, raise a data error.
  3. Otherwise, if the term type is rr:BlankNode:
    1. Apply conversion to string to the value.
    2. Generate a blank node whose blank node identifier is the value.
  4. Otherwise, if the term type is rr:Literal:
    1. If the term map has a specified language tag, then apply conversion to string to the value, and generate a plain literal with that language tag.
    2. Otherwise, if a datatype override is in effect on the term map:
      1. Apply conversion to string to the value.
      2. Generate a typed literal whose datatype IRI is the specified datatype.
      3. If the specified datatype is a supported RDF datatype and the generated typed literal is ill-typed, then raise a data error.
    3. Otherwise, if the term map's implicit datatype is empty, then apply conversion to string to the value, and generate a plain literal without language tag.
    4. Otherwise, apply the term map's implicit transform to the value, and generate a typed literal whose datatype IRI is the implicit datatype.

The algorithm uses simple string concatenation for obtaining an absolute IRI from a relative IRI, rather than the more complex algorithm defined in RFC 3986. This ensures that the original database value can be reconstructed from the generated IRI.
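
The term generation rules can be sketched in Python as follows. This is a simplified illustration, not a conforming implementation: the names `generate_term` and `DataError` are invented, `str()` stands in for conversion to string, the absolute-IRI test is a rough scheme check rather than full RFC 3987 validation, terms are returned as plain tuples, and the ill-typed literal check of rule 4.2.3 is omitted.

```python
import re

_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")
_FORBIDDEN = set(' <>"{}|\\^`')


class DataError(Exception):
    """Raised when a value cannot be turned into the requested RDF term."""


def _looks_absolute(text):
    # Rough stand-in for "valid absolute IRI".
    return bool(_SCHEME.match(text)) and not (_FORBIDDEN & set(text))


def generate_term(value, term_type, base_iri="", language=None,
                  specified_datatype=None, implicit_datatype=None,
                  implicit_transform=str):
    if value is None:                      # rule 1: NULL yields no term
        return None
    if term_type == "rr:IRI":              # rule 2
        text = str(value)
        if _looks_absolute(text):
            return ("iri", text)
        candidate = base_iri + text        # plain string concatenation
        if _looks_absolute(candidate):
            return ("iri", candidate)
        raise DataError("not a valid IRI: " + text)
    if term_type == "rr:BlankNode":        # rule 3
        return ("bnode", str(value))
    if term_type == "rr:Literal":          # rule 4
        if language is not None:
            return ("literal", str(value), language, None)
        if specified_datatype is not None and specified_datatype != implicit_datatype:
            return ("literal", str(value), None, specified_datatype)
        if implicit_datatype is None:
            return ("literal", str(value), None, None)
        return ("literal", implicit_transform(value), None, implicit_datatype)
    raise ValueError("unknown term type: " + term_type)
```

The caller is expected to pass in the column value or template value of the term map, together with the implicit datatype and implicit transform derived from the SQL datatype of the underlying column.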

A. RDF Terminology (Informative)

This section lists some terms normatively defined in other specifications.

The following terms are defined in RDF Concepts and Abstract Syntax [RDF] and used in R2RML:

The following terms are defined in SPARQL Query Language for RDF [SPARQL] and used in R2RML:

B. Index of R2RML Vocabulary Terms (Informative)

This appendix lists all the classes, properties and other terms defined by this specification within the R2RML vocabulary.

An RDFS representation of the vocabulary is available from the namespace IRI.

B.1 Classes

Class                  Represents
rr:GraphMap            graph map
rr:Join                join condition
rr:LogicalTable        logical table
rr:ObjectMap           object map
rr:PredicateMap        predicate map
rr:PredicateObjectMap  predicate-object map
rr:RefObjectMap        referencing object map
rr:SubjectMap          subject map
rr:TriplesMap          triples map

B.2 Properties

The cardinality column indicates how often this property occurs within its context. Note that additional constraints not stated in this table might apply, making a property forbidden or required in certain situations.

Property               Represents                  Context                            Cardinality
rr:child               child column                join condition                     1
rr:class               class IRI                   subject map                        0…∞
rr:column              column name                 column-valued term map             1
rr:datatype            specified datatype          term map                           0…1
rr:constant            constant value              constant-valued term map           1
rr:graph               constant shortcut property  subject map, predicate-object map  0…∞
rr:graphMap            graph map                   subject map, predicate-object map  0…∞
rr:inverseExpression   inverse-expression          term map                           0…1
rr:joinCondition       join condition              referencing object map             0…∞
rr:language            specified language tag      term map                           0…1
rr:logicalTable        logical table               triples map                        1
rr:object              constant shortcut property  predicate-object map               1
rr:objectMap           object map                  predicate-object map               1
rr:parent              parent column               join condition                     1
rr:parentTriplesMap    parent triples map          referencing object map             1
rr:predicate           constant shortcut property  predicate-object map               1
rr:predicateMap        predicate map               predicate-object map               1
rr:predicateObjectMap  predicate-object map        triples map                        0…∞
rr:sqlQuery            SQL query                   R2RML view                         1
rr:sqlVersion          SQL version identifier      R2RML view                         0…∞
rr:subject             constant shortcut property  triples map                        1
rr:subjectMap          subject map                 triples map                        1
rr:tableName           table or view name          SQL base table or view             1
rr:template            string template             template-valued term map           1
rr:termType            term type                   term map                           0…1

B.3 Other Terms

Term             Denotes        Used with property
rr:defaultGraph  default graph  rr:graph
rr:SQL2008       Core SQL 2008  rr:sqlVersion
rr:IRI           IRI            rr:termType
rr:BlankNode     blank node     rr:termType
rr:Literal       literal        rr:termType

C. References

C.1 Normative References

[RDF]
Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne, Jeremy J. Carroll, Editors. World Wide Web Consortium, 10 February 2004. This version is http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/. The latest version is http://www.w3.org/TR/rdf-concepts/.
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, March 1997. Internet RFC 2119, http://tools.ietf.org/html/rfc2119.
[RFC3629]
UTF-8, a transformation format of ISO 10646, F. Yergeau. November 2003. Internet RFC 3629, http://tools.ietf.org/html/rfc3629.
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. January 2005. Internet RFC 3986, http://tools.ietf.org/html/rfc3986.
[RFC3987]
Internationalized Resource Identifiers (IRIs), M. Duerst, M. Suignard. January 2005. Internet RFC 3987, http://tools.ietf.org/html/rfc3987.
[SPARQL]
SPARQL Query Language for RDF, Eric Prud'hommeaux, Andy Seaborne, Editors. World Wide Web Consortium, 15 January 2008. This version is http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/. The latest version is http://www.w3.org/TR/rdf-sparql-query/.
[SQL1]
ISO/IEC 9075-1:2008 SQL - Part 1: Framework (SQL/Framework). International Organization for Standardization, 27 January 2009.
[SQL2]
ISO/IEC 9075-2:2008 SQL - Part 2: Foundation (SQL/Foundation). International Organization for Standardization, 27 January 2009.
[TURTLE]
Turtle - Terse RDF Triple Language, Dave Beckett, Tim Berners-Lee. World Wide Web Consortium, 14 January 2008. This version is http://www.w3.org/TeamSubmission/2008/SUBM-turtle-20080114/. The latest version is http://www.w3.org/TeamSubmission/turtle/.
[XMLSCHEMA2]
XML Schema Part 2: Datatypes Second Edition, Paul V. Biron, Ashok Malhotra. World Wide Web Consortium, 28 October 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/. The latest version is http://www.w3.org/TR/xmlschema-2/.

C.2 Other References

[DM]
A Direct Mapping of Relational Data to RDF, Alexandre Bertails, Marcelo Arenas, Eric Prud'hommeaux, Juan Sequeda, Editors. World Wide Web Consortium, 20 September 2011. This version is http://www.w3.org/TR/2011/WD-rdb-direct-mapping-20110920/. The latest version is http://www.w3.org/TR/rdb-direct-mapping/. This document is work in progress.
[SQL14]
ISO/IEC 9075-14:2008 SQL - Part 14: XML-Related Specifications (SQL/XML). International Organization for Standardization, 27 January 2009.
[SQLIRIS]
SQL Version IRIs, Members of the W3C RDB2RDF Working Group. The latest version is http://www.w3.org/2001/sw/rdb2rdf/wiki/SQL_Version_IRIs. This is a public wiki page.
[TC]
R2RML and Direct Mapping Test Cases (Editor's Draft), Boris Villazón-Terrazas, Michael Hausenblas, Alexander de Leon, Editors. World Wide Web Consortium, 31 August 2011. The latest version is http://www.w3.org/2001/sw/rdb2rdf/test-cases/. This document is work in progress.
[UCNR]
Use Cases and Requirements for Mapping Relational Databases to RDF, Eric Prud'hommeaux, Michael Hausenblas, Editors. World Wide Web Consortium, 8 June 2010. This version is http://www.w3.org/TR/2010/WD-rdb2rdf-ucr-20100608/. The latest version is http://www.w3.org/TR/rdb2rdf-ucr/. This document is work in progress.

D. Acknowledgements (Informative)

The Editors would like to give special thanks to the following members: Nuno Lopes for help in designing the datatyping-related text, David McNeil for raising many of the issues that needed addressing, Eric Prud'hommeaux for designing the SQL compatibility text, and Boris Villazón-Terrazas for drawing all the diagrams.

In addition, the Editors gratefully acknowledge contributions from: Marcelo Arenas, Sören Auer, Samir Batla, Alexander de Leon, Orri Erling, Lee Feigenbaum, Enrico Franconi, Howard Greenblatt, Wolfgang Halb, Harry Halpin, Michael Hausenblas, Patrick Hayes, Ivan Herman, Nophadol Jekjantuk, Li Ma, Nan Ma, Ashok Malhotra, Ivan Mikhailov, Percy Enrique Rivera Salas, Juan Sequeda, Ben Szekely, Ted Thibodeau, and Edward Thomas.