Ashok's UCR Review

Comments on RDB2RDF Requirements and Usecases – Ashok Malhotra

Version reviewed: Apr 20, 2010 Revised 2010/05/11 17:21:15


GENERAL COMMENTS:

The Requirements document is produced by the WG after it is chartered and its deliverables have been agreed to. There is no need to go over this material again.

The requirements mentioned in this document bind the WG. The final standards we produce MUST meet these requirements or they will be rejected by the Director. So, we should list only requirements that are really necessary.

Ideally, requirements should follow from the usecases. I do not see that link here. I have suggested such links in the detailed comments below.

One of the requirements that was discussed in the XG and seems to have got lost here is that R2RML should be able to produce RDF that is materialized and stored as well as a virtual RDF schema. The virtual schema is used to translate SPARQL queries to the underlying SQL. This is not spelled out clearly in the document.
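
To make the distinction concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is not a proposed R2RML syntax: the patient table, the ex:name property, the MAPPING structure and both function names are assumptions invented for this illustration, and only a single triple pattern is handled.

```python
import sqlite3

# Hypothetical mapping: RDF property -> (table, key column, value column).
MAPPING = {"ex:name": ("patient", "id", "name")}

def materialize(conn):
    """Materialized (ETL) mode: run the mapping once and return every triple."""
    triples = []
    for prop, (table, key, col) in MAPPING.items():
        query = f"SELECT {key}, {col} FROM {table} ORDER BY {key}"
        for k, v in conn.execute(query):
            triples.append((f"ex:{table}{k}", prop, v))
    return triples

def translate(prop):
    """Virtual mode: rewrite the pattern '?s <prop> ?o' as SQL; no RDF is stored."""
    table, key, col = MAPPING[prop]
    return f"SELECT {key} AS s, {col} AS o FROM {table}"

# Small in-memory database standing in for the relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO patient VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

stored = materialize(conn)                # triples that would be loaded into a store
sql = translate("ex:name")                # SQL produced from the same mapping
answers = conn.execute(sql).fetchall()    # evaluated against the live data
```

The point of the sketch is that one mapping definition drives both modes, which is why the requirement should name both explicitly.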

STATUS of this Document: No need to mention the documents to be produced by the WG. That’s in the charter.

TOC: A1?

INTRODUCTION: Remove 1.1, 1.2

1.3 Add “This means that data can be extracted from the Relational Database but not updated.” Remove Ed. Note following this.

1.4 Glossary needs to be completed or removed. I suggest removing it because time is short.

2. USECASES Replace section with the following:

Apart from Usecase 3, which is concerned with the creation of identifiers for entities described by the data, the other three usecases can be classified along two dimensions:

1. whether the mapping is to an RDF Schema or OWL ontology derived from the Relational Schema or to a schema/ontology obtained from an analysis of the domain
2. whether the RDF is materialized and stored (sometimes called ETL, for extract, transform, load) or whether the mapping is a virtual mapping that is used to translate SPARQL queries based on the schema/ontology to SQL queries on the underlying data.

Usecase 1 describes a mapping from Relational data to two ontologies derived from the Relational Schema. The data is not stored but SPARQL queries against the ontologies are translated to SQL queries against the underlying data.

Usecase 2 describes a mapping from a simple Relational schema to an RDF Schema where the RDF is materialized and stored.

Usecase 4 describes two situations. In 2.4.1, the OWL ontology is derived from the database schema. In 2.4.2, the data from the Relational database is mapped onto an existing domain ontology.

2.1 UC1-Patient Recruitment

First line second para “Accompanying each table is are two RDF views (represented in Turtle) corresponding to RDF HL7/RIM and CDISK SDTM data structures”.

2.1.3, 2.1.4, 2.1.5 have only one RDF view.

2.1.7 Remove first two sentences.

ADD at the end: This usecase leads to the following requirements

  • Map Relational data to an RDF schema derived from the Relational Schema
  • Create a virtual RDF view that is used to translate SPARQL queries over the RDF views to SQL queries over the relational data.
  • Mapping of Relational datatypes to RDF/XML datatypes, e.g. SQL date/time formats to XML Schema date/time formats
  • Mapping column names to RDF property names e.g. in 2.1.4 DaysToTake is mapped to hl7:durationInDays
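
As an illustration of the last two requirements, the following is a minimal sketch in Python, not a proposed mapping syntax. Only the DaysToTake to hl7:durationInDays renaming comes from 2.1.4; the StartDate column, the hl7:effectiveDate property, the subject identifier and the function names are assumptions made up for the example.

```python
from datetime import date

# Hypothetical mapping table: SQL column name -> RDF property name.
# Only DaysToTake -> hl7:durationInDays is taken from 2.1.4.
PROPERTY_FOR_COLUMN = {
    "DaysToTake": "hl7:durationInDays",
    "StartDate": "hl7:effectiveDate",  # assumed for illustration
}

def to_literal(value):
    """Render a Python value as a Turtle literal with an XML Schema datatype.

    String escaping is deliberately left out of this sketch.
    """
    if isinstance(value, bool):
        return f'"{str(value).lower()}"^^xsd:boolean'
    if isinstance(value, int):
        return f'"{value}"^^xsd:integer'
    if isinstance(value, date):
        return f'"{value.isoformat()}"^^xsd:date'
    return f'"{value}"^^xsd:string'

def row_to_triples(subject, row):
    """Map one relational row (a dict) to (subject, property, literal) triples."""
    return [
        (subject, PROPERTY_FOR_COLUMN[col], to_literal(val))
        for col, val in row.items()
        if col in PROPERTY_FOR_COLUMN
    ]

triples = row_to_triples(
    "<medication/1>",
    {"DaysToTake": 30, "StartDate": date(2010, 4, 20)},
)
```

Keeping the renaming in a separate table from the datatype conversion mirrors the fact that these are two distinct requirements.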

2.2 UC2 - Web applications (Wordpress)

Para 4, need reference for MySQL

ADD at the end: This usecase leads to the following requirements

  • Map Relational data to an RDF schema derived from the Relational Schema
  • Extract-Transform-Load the RDF created by the mapping.

2.3 UC3 - Integrating Enterprise Relational databases for tax control

Add at end of last para: “The RDF generated from the two databases is materialized and joined using the generated unique identifiers.”

ADD at the end: This usecase leads to the following requirements

  • Create unique identifiers for the entities described by the data.
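
One possible reading of this requirement, sketched in Python under stated assumptions: identifiers are URIs built from a base URI, the table name and the primary-key values. The base URI, the taxpayer table, the key values and the mint_uri function are all hypothetical; the usecase does not prescribe any particular scheme.

```python
from urllib.parse import quote

BASE = "http://example.org/data/"  # assumed base URI

def mint_uri(table, key_values):
    """Build an identifier from a table name and its primary-key values.

    Each part is percent-encoded with no safe characters, so a "/" inside
    a key value can never be confused with the "/" separator.
    """
    key = "/".join(quote(str(v), safe="") for v in key_values)
    return f"{BASE}{quote(table, safe='')}/{key}"

uri = mint_uri("taxpayer", ["ES", "B-123 45"])
```

If both databases identify an entity by the same key values, the same function yields the same identifier for each, which is what allows the materialized RDF to be joined.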

2.4 UC4 - rCAD: RNA Comparative Analysis Database

Based on Microsoft SQL-server, we have designed and implemented the RNA Comparative Analysis Database -rCAD which supports comparative analysis of RNA sequence and structure, …

Lose Rob!

ADD at the end: This usecase leads to the following requirement

  • Map Relational data to a preexisting OWL ontology derived from an analysis of the domain

3 Approaches

I don’t think this section adds anything to the document. I recommend we remove it.

Specifically, does Option 2 require 2 transformations?

4 Requirements

“The RDB2RDF Working Group will prioritize the proposed requirements and accept those that enable useful mappings without excessively delaying development of the specification or implementations of the specification.”

Sorry, you cannot say this. The requirements in the document MUST be satisfied. Others can be listed as Optional Requirements.

4.1 Core Requirements

4.1.1

Add “This requirement comes from Usecases 1, 2 and the first part of usecase 4.”

Why is this requirement a SHOULD and the following requirement a MUST? I recommend we remove the sentences with SHOULD/MUST. All core requirements are MUST.

4.1.2

Remove Ed. Note.

Remove heading 4.1.2.1 – no need for additional layer of hierarchy

Add “This requirement comes from the second part of usecase 4.”

4.1.2.2 LABELGEN - Label Generation – Change to 4.1.3

Add “This requirement comes from usecase 3.”

4.1.2.3 DATATYPES – Datatypes Change to 4.1.4

Add “This requirement comes from usecase 1 but, in fact, all usecases require this.”

Remove Ed. Note.

4.1.3 SQLGEN - Query Translation

Add “This requirement comes from the second part of usecase 4.”

No need for capital, emphasized MUST.

4.1.4 CONNECTION - Database Connection -- Remove this requirement

ADD NEW REQUIREMENT: 4.1.? Ability to Rename SQL Column Names

When mapping from a Relational Schema to an RDF Schema or OWL ontology, column names are mapped to relation/property names, and it must be possible to rename the columns/properties.

This requirement comes from Usecase 1.

CREATE NEW SECTION: 4.2 Additional Core Requirements for Mapping

The Relational model and the RDF/OWL model differ in significant ways and the modeling languages enable concepts to be expressed as different structures. The following requirements are based on general modeling and model transformation capabilities.

4.2.1 Table Parsing – I don’t understand what this means. Please explain.

4.2.2 Value type RENAME as Creating Classes based on Attribute Values

ADD as first para:

The requirement is for creating multiple classes from a single column based on the values of a related attribute.
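
A minimal sketch of what this could look like, in Python. The role column, its values, the ex: class names and the type_triples function are all hypothetical and serve only to show the shape of the rule.

```python
# Hypothetical rule: value of the "role" column -> RDF class.
CLASS_FOR_VALUE = {
    "doctor": "ex:Physician",
    "nurse": "ex:Nurse",
}

def type_triples(rows, key_column, type_column, default_class="ex:Person"):
    """Emit one rdf:type triple per row, choosing the class from a column value."""
    return [
        (
            f"ex:person{row[key_column]}",
            "rdf:type",
            CLASS_FOR_VALUE.get(row[type_column], default_class),
        )
        for row in rows
    ]

rows = [
    {"id": 1, "role": "doctor"},
    {"id": 2, "role": "nurse"},
    {"id": 3, "role": "clerk"},  # no rule for this value, so the default applies
]
typed = type_triples(rows, "id", "role")
```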

4.2.3 Many-to-Many

ADD as first para:

Relational databases typically use three tables to represent Many-to-Many relationships. This requirement is to allow such relationships to be mapped using direct links between the entities.
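
For example, the third (join) table can be collapsed so that each of its rows becomes a single triple. The sketch below is in Python; the student/course tables, the ex:enrolledIn property and the link_triples function are hypothetical names chosen for illustration.

```python
def link_triples(join_rows, left_col, right_col, prop):
    """Turn each row of a join table into one direct link between two entities."""
    return [
        (f"ex:student{row[left_col]}", prop, f"ex:course{row[right_col]}")
        for row in join_rows
    ]

# Rows of the hypothetical join table between students and courses.
enrolment = [
    {"student_id": 1, "course_id": 10},
    {"student_id": 1, "course_id": 11},
]
links = link_triples(enrolment, "student_id", "course_id", "ex:enrolledIn")
```

No resource is created for the join row itself; only the two entities and the link between them appear in the RDF.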

4.2.4 Microparsing RENAME as Apply a Function before Mapping

NEW TEXT: In some cases what you want to map is not the original value in the database but the result of applying a function to the value. For example, the value may be temperature in Centigrade and you may want to convert to Fahrenheit. Or from Euros to Dollars. It is easy to think of other examples. Position may be stored as a tuple {latitude, longitude} and you may want to map to two separate properties. Or the address may be stored in a number of columns and you may want to map as a single string. A more complex example: a database row might contain Wiki text, which should be transformed into HTML.

Comment: I don’t think we want to legislate which functions we allow. The user can always use a special function library.
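
A short Python sketch of the examples in the text above, with user-supplied functions applied before the value is mapped. The column names, the ex: properties and the map_row function are assumptions made for the illustration.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from Centigrade to Fahrenheit."""
    return c * 9 / 5 + 32

def map_row(row):
    """Apply functions to the stored values, then emit triples for the results."""
    lat, lon = row["position"]  # one stored tuple -> two properties
    address = ", ".join(row[c] for c in ("street", "city", "country"))
    return [
        (row["id"], "ex:temperatureF", celsius_to_fahrenheit(row["temp_c"])),
        (row["id"], "ex:latitude", lat),
        (row["id"], "ex:longitude", lon),
        (row["id"], "ex:address", address),  # several columns -> one string
    ]

mapped = map_row({
    "id": "ex:site1",
    "temp_c": 20,
    "position": (40.4, -3.7),
    "street": "1 Main St",
    "city": "Madrid",
    "country": "Spain",
})
```

Nothing in the sketch restricts which functions may be used, which is consistent with leaving the choice to the user's function library.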

4.2 Non-Core Requirements – Rename as 4.3 Optional Requirements

4.3.1 Named Graphs

4.3.2 NSDecl

4.3.3 Metadata

4.3.4 Provenance

4.3.5 Updates

4.3 Application Requirements – Remove this section

6 References

SQL Reference

ISO (International Organization for Standardization). ISO/IEC 9075-2:1999, Information technology --- Database languages --- SQL --- Part 2: Foundation (SQL/Foundation). [Geneva]: International Organization for Standardization, 1999. See http://www.iso.ch/cate/d26197.html

MySQL Reference - http://www.mysql.com/