10 Datatype Conversions

This section defines mappings from SQL data values to RDF literals.

10.1 Introduction (Informative)

The following mappings from SQL data values are defined:

  1. The natural RDF literal is a mapping to typed literals. It is used in R2RML and in the Direct Mapping of Relational Data to RDF [DM] as the default mapping when literals are created. It maps SQL datatypes to corresponding XML Schema datatypes [XMLSCHEMA2] and loosely follows ISO/IEC 9075-14:2008 [SQL14].
  2. The natural RDF lexical form is similar, but produces only the lexical form of the typed literal and recommends that implementations perform XSD canonicalization. It is used in R2RML when non-string columns are used in a string context, for example when a TIMESTAMP is used in an IRI template.
  3. The canonical RDF lexical form is again similar, but requires XSD canonicalization. It is used in the Direct Mapping when IRIs are generated.
  4. The datatype-override RDF literal is a mapping that constructs typed literals by using the natural RDF lexical form and applying a specified datatype IRI. The mapping author is responsible for ensuring that the generated lexical form is valid for the datatype. It is used in R2RML when the target datatype of a literal-generating term map is overridden using rr:datatype.
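
As an informal illustration of mappings 2 and 4 above, the following Python sketch uses a TIMESTAMP in an IRI template and overrides the datatype of an INTEGER value. The helper names `expand_template` and `datatype_override_literal`, the `http://example.com/` template, and the use of `urllib.parse.quote` for percent-encoding are assumptions of this sketch and are not defined by this section.

```python
from datetime import datetime
from urllib.parse import quote

XSD = "http://www.w3.org/2001/XMLSchema#"


def expand_template(template, **lexical_forms):
    """Fill a template with percent-encoded lexical forms.

    The encoding step is an assumption of this sketch, not a rule of this section.
    """
    encoded = {name: quote(form, safe="") for name, form in lexical_forms.items()}
    return template.format(**encoded)


def datatype_override_literal(lexical_form, datatype_iri):
    """Pair a natural RDF lexical form with an author-specified datatype IRI.

    Nothing is validated here: the mapping author must ensure that the
    lexical form is valid for the chosen datatype.
    """
    return (lexical_form, datatype_iri)


# Mapping 2: a TIMESTAMP used in a string context (an IRI template).
stamp = datetime(2012, 1, 12, 19, 34, 2).isoformat()
iri = expand_template("http://example.com/event/{ts}", ts=stamp)
# iri == "http://example.com/event/2012-01-12T19%3A34%3A02"

# Mapping 4: an INTEGER column holding a year, overridden to xsd:gYear.
year = datatype_override_literal(str(2012), XSD + "gYear")
```

An INTEGER value such as 12 would produce the lexical form `12`, which is not a valid xsd:gYear; the sketch would still emit it, which is why the responsibility rests with the mapping author.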

The mappings cover all predefined SQL 2008 datatypes except INTERVAL. INTERVAL, vendor-specific types, and types added in future SQL specifications should be mapped to RDF by extending the natural RDF literal mapping.

The mappings are referenced in the R2RML term generation rules.

An informative summary of XSD lexical forms is provided to aid implementers.

10.2 Natural Mapping of SQL Values

The natural RDF literal corresponding to a SQL data value is the result of applying the following steps:

  1. Let dt be the SQL datatype of the SQL data value.
  2. If dt is a character string type, then the result is a plain literal without language tag whose lexical form is the SQL data value.
  3. Otherwise, if dt is listed in the table below: The result is a typed literal whose datatype IRI is the IRI indicated in the RDF datatype column in the same row as dt. The lexical form may be any lexical form that represents the same value as the SQL data value, according to the definition of the RDF datatype. If there are multiple lexical forms available that represent the same value (e.g., 1, +1, 1.0 and 1.0E0), then the choice is implementation-dependent. However, the choice MUST be made so that given a target RDF datatype and value, the same lexical form is chosen consistently (e.g., INTEGER 5 and BIGINT 5 must be mapped to the same lexical form, as both are mapped to the RDF datatype xsd:integer and are equal values; mapping one to 5 and the other to +5 would be in error). The canonical lexical representation [XMLSCHEMA2] MAY be chosen. (See also: Summary of XSD Lexical Forms)

SQL datatype                                 RDF datatype      Lexical transformation (informative)
-------------------------------------------  ----------------  ------------------------------------
BINARY, BINARY VARYING, BINARY LARGE OBJECT  xsd:base64Binary  base64 encoding
NUMERIC, DECIMAL                             xsd:decimal       none required
SMALLINT, INTEGER, BIGINT                    xsd:integer       none required
FLOAT, REAL, DOUBLE PRECISION                xsd:double        none required
BOOLEAN                                      xsd:boolean       ensure lowercase (true, false)
DATE                                         xsd:date          none required
TIME                                         xsd:time          none required
TIMESTAMP                                    xsd:dateTime      replace space character with “T”
INTERVAL                                     undefined         undefined
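
The steps and table above can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: the function names, the representation of a literal as a `(lexical form, datatype IRI)` pair, and the spellings in `CHARACTER_TYPES` are assumptions of the sketch. It picks one lexical form per value (trailing zeros of decimals are stripped) so that equal values of the same RDF datatype always receive the same lexical form.

```python
import base64
import math
from datetime import datetime
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed spellings of character string types; this section does not list them.
CHARACTER_TYPES = {"CHARACTER", "CHARACTER VARYING", "CHARACTER LARGE OBJECT"}

# SQL datatype -> natural RDF datatype, transcribed from the table above.
NATURAL_DATATYPE = {
    "BINARY": "base64Binary",
    "BINARY VARYING": "base64Binary",
    "BINARY LARGE OBJECT": "base64Binary",
    "NUMERIC": "decimal",
    "DECIMAL": "decimal",
    "SMALLINT": "integer",
    "INTEGER": "integer",
    "BIGINT": "integer",
    "FLOAT": "double",
    "REAL": "double",
    "DOUBLE PRECISION": "double",
    "BOOLEAN": "boolean",
    "DATE": "date",
    "TIME": "time",
    "TIMESTAMP": "dateTime",
}


def lexical_form(xsd_type, value):
    """Choose one lexical form per (datatype, value) pair, consistently."""
    if xsd_type == "base64Binary":
        return base64.b64encode(value).decode("ascii")
    if xsd_type == "boolean":
        return "true" if value else "false"
    if xsd_type == "integer":
        return str(int(value))
    if xsd_type == "decimal":
        number = Decimal(value)
        if number == 0:
            return "0"
        text = format(number, "f")  # plain notation, never an exponent
        return text.rstrip("0").rstrip(".") if "." in text else text
    if xsd_type == "double":
        number = float(value)
        if math.isnan(number):
            return "NaN"
        if math.isinf(number):
            return "INF" if number > 0 else "-INF"
        return repr(number)
    # date, time and datetime objects; isoformat() already puts "T" in dateTime
    return value.isoformat()


def natural_rdf_literal(sql_type, value):
    """Return (lexical form, datatype IRI); the IRI is None for plain literals."""
    if sql_type in CHARACTER_TYPES:
        return (value, None)
    if sql_type not in NATURAL_DATATYPE:
        raise ValueError("no natural mapping defined for " + sql_type)
    xsd_type = NATURAL_DATATYPE[sql_type]
    return (lexical_form(xsd_type, value), XSD + xsd_type)


print(natural_rdf_literal("INTEGER", 5))
print(natural_rdf_literal("TIMESTAMP", datetime(2011, 8, 23, 22, 17)))
```

INTERVAL is deliberately absent from the lookup table, so the sketch raises an error for it rather than guessing a translation.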

The natural RDF literal is defined for SQL 2008 datatypes. It should be extended to map vendor-specific datatypes to RDF by behaving as if the table above contained additional rows that associate the SQL datatypes with appropriate RDF-compatible datatypes (e.g., the XML Schema built-in types [XMLSCHEMA2]), and appropriate lexical transformations where required. If there is no appropriate datatype, the value may be cast to string and expressed as an RDF plain literal. Future versions of this specification may define mappings for vendor-specific datatypes or datatypes added to the SQL specification.

The translation of INTERVAL is left undefined due to the complexity of the translation. [SQL14] describes a translation of INTERVAL to xdt:yearMonthDuration and xdt:dayTimeDuration.

In [SQL2], the precision of many SQL datatypes is not fixed, but left implementation-defined. Therefore, the mapping to XML Schema datatypes must rely on arbitrary-precision types such as xsd:decimal, xsd:integer and xsd:dateTime. Implementers of the mapping may wish to set upper limits for the supported precision of these XSD types. The XML Schema specification allows such partial implementations of infinite datatypes [XMLSCHEMA2], and defines specific minimum requirements.

The natural RDF datatype corresponding to a SQL datatype is the value of the RDF datatype column in the row corresponding to the SQL datatype in the table above.

The natural RDF lexical form corresponding to a SQL data value is the lexical form of its corresponding natural RDF literal, with the additional constraint that the canonical lexical representation [XMLSCHEMA2] SHOULD be chosen.

The canonical RDF lexical form corresponding to a SQL data value is the lexical form of its corresponding natural RDF literal, with the additional constraint that the canonical lexical representation [XMLSCHEMA2] MUST be chosen.
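
To illustrate what XSD canonicalization does to a lexical form, here is a small Python sketch for three of the datatypes in the table. The function names are hypothetical, the rules follow the canonical representations of XML Schema 1.0 as far as the author of this sketch understands them, and the functions assume well-formed input rather than validating it.

```python
from decimal import Decimal


def canonical_integer(lexical):
    """xsd:integer: no leading '+', no leading zeros."""
    return str(int(lexical))


def canonical_decimal(lexical):
    """xsd:decimal: decimal point required, no redundant zeros, no '+'."""
    number = Decimal(lexical)
    if number == 0:
        return "0.0"
    text = format(number, "f")  # plain notation, never an exponent
    if "." not in text:
        return text + ".0"
    text = text.rstrip("0")
    # Keep at least one digit to the right of the decimal point.
    return text + "0" if text.endswith(".") else text


def canonical_boolean(lexical):
    """xsd:boolean: 'true' or 'false' rather than '1' or '0'."""
    return {"true": "true", "1": "true", "false": "false", "0": "false"}[lexical]


print(canonical_integer("+5"))
print(canonical_decimal("+1.50"))
print(canonical_boolean("1"))
```

A processor producing the natural RDF lexical form is encouraged to apply such a step, while one producing the canonical RDF lexical form has to.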

Cast to string is an implementation-dependent function that maps SQL data values to equivalent Unicode strings. It is undefined for the following kinds of SQL datatypes: collection types, row types, user-defined types without a user-defined string CAST, reference types whose referenced type does not have a user-defined string CAST, binary types, and OPTIONAL further implementation-defined types.

Cast to string is a fallback that handles vendor-specific and user-defined datatypes not supported by the R2RML processor. It can be implemented in a number of ways, including explicit SQL casts (“CAST(value AS VARCHAR(n))”, where n is an arbitrarily large integer), implicit SQL casts (concatenation with the empty string), or by employing a database access API that presents return values as strings.
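
The three approaches can be tried out with Python's built-in `sqlite3` module. This is only a sketch: an in-memory SQLite database stands in for the source database, the table `t` and column `v` are made up for the example, and converting the fetched value with `str()` merely approximates an API that returns strings. Other database engines may format the same value differently, which is why cast to string is implementation-dependent.

```python
import sqlite3


def cast_to_string_demo():
    """Return the same INTEGER value cast to a string in three ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")
    conn.execute("INSERT INTO t VALUES (42)")

    # Explicit SQL cast; 1000 plays the role of the large n.
    explicit = conn.execute("SELECT CAST(v AS VARCHAR(1000)) FROM t").fetchone()[0]

    # Implicit SQL cast: concatenation with the empty string.
    implicit = conn.execute("SELECT v || '' FROM t").fetchone()[0]

    # Client-side conversion of the value returned by the access API.
    via_api = str(conn.execute("SELECT v FROM t").fetchone()[0])

    conn.close()
    return explicit, implicit, via_api


print(cast_to_string_demo())
```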

Changes

$Log: sec-10.html,v $
Revision 1.1  2012/01/12 19:34:02  eric
cp §10.html sec-10.html

Revision 1.2  2012/01/12 19:27:45  eric
~ changes outlined in ericP's proposed changes to §10


Revision 1.1  2012/01/12 19:06:37  eric
~ extracted from R2RML Editor's Draft