Draft of Use Cases

From RDB2RDF
Revision as of 10:16, 20 April 2010 by Tthibodeau (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Expressivity

Analysis of these use cases yields a set of functional requirements for the transformation from relational to RDF. Relational data, with it's foreign and primary keys, is a potentially cyclic graph, as is RDF. Edges in this relational graph are realised join constraints in SQL queries, often corresponding to foreign/primary key declarations in the relational schema. These edges can be expressed as RDF triples, as can all of the attribute in an relation.

Some use cases require that a relational graph be represented directly as an isomorphic RDF graph, while others require transformations of the graph. Below, the former use cases are labeled isomorphic.

example patient database graph:

27-db_data.png

RDF graph isomorphic to patient database graph:

27-db_rdf.png

graphs not isomorphic to DB graph:

27-rim_sdtm_data.png

RDF identifiers for objects in the [conceptual model] can, in some cases, be generated from a transformation of the schema and data in a tuple representing that conceptual model. For example, it may be sufficient to identify a patient in a clinical database with primary key patientID and value 1234561 as http://myclinic.example/patient/patientID.12334561#x, while but if the patient IDs are shared with another database, it will be necessary to transform one or both of these identifiers into a common form, e.g. http://allclinics.example/sharedRecords/patient.12334561#y. These use cases are labeled custom-identifier.


Migration Use Case

One of the reasons for open standards is to allow easy migration between different systems. Just as a single web-page in HTML can be viewed both my two different web browsers from different vendors, a single relational data to RDF mapping standard should allow a custodian of one database to expose their data as RDF and then, when they export their data to another database, allow the newly imported data to be queried as RDF without changing the mapping file.

For example, imagine that a database administrator is working on exposing weather data as linked data to be found and consumed by other applications. At first, this weather data is stored in a light-weight database (such as MySQL). However, as more and more weather data is collected, and more and more users access this RDF, the light-weight database has difficulty scaling, providing problems for users. Therefore, the database administrator migrates their database to a more heavy-weight database (such as Oracle Database 11g).

Of course, the database administrator does not want to have re-create the ability to view the data as RDF using a vendor-specific mapping file, but instead wants to seamlessly migrate the view of their data as RDF. A standardized mapping between relational data and RDF allows the database administrator to migrate the view of their data as RDF across databases, allowing the vendors to compete on functionality and features rather than forcing database administrators to rewrite their entire relational data to RDF mapping when they want to migrate their data from one database to another.

HCLS RDB2RDF Use Cases

The Semantic Web for Health Care and Life Sciences Interest Group has created several demonstrators using SPARQL to query clinical and biological relational databases. Included is the database structure, sample data, a SPARQL query with results, and an equivalent SQL query.

COI Patient Recruitment Demo

Diabetic patient data from the University of Texas Health Science Center is expressed as RDF views corresponding to HL7/RIM and CDISK SDTM data structures. Some columns have been omitted from this use case for brevity.

While there are many motivations for providing a common interface to administratively distinct databases (access to patient history, shared rules for clinical decision support, etc), the example queries following the table description are used to find candidates for clinical studies.

Namespace Prefixes

@prefix xsd: <http://www.w3.org/2001/XMLSchema#>                                      .
@prefix hl7: <http://www.hl7.org/v3ballot/xml/infrastructure/vocabulary/vocabulary#>  .
@prefix spl: <http://www.hl7.org/v3ballot/xml/infrastructure/vocabulary/vocabulary#>  .

Person

ID MiddleName SexDE DateOfBirth LastEditedDTTM
1234561 2 1983-01-02 00:00:00 2007-11-13 15:49:20
1234562 3 1963-12-27 00:00:00 2008-01-30 17:08:42
1234563 2 1983-02-25 00:00:00 2007-03-10 06:01:55
<http://hospital.example/DB/Person/ID.1234561#record> a hl7:Person ;
   hl7:entityName "" ; hl7:administrativeGenderCodePrintName Sex_DE:M ;
   hl7:livingSubjectBirthTime "1983-01-02T00:00:00Z"^^xsd:dateTime .
<http://hospital.example/DB/Person/ID.1234562#record> a hl7:Person ;
   hl7:entityName "" ; hl7:administrativeGenderCodePrintName Sex_DE:F ;
   hl7:livingSubjectBirthTime "1963-12-27T00:00:00Z"^^xsd:dateTime .
<http://hospital.example/DB/Person/ID.1234563#record> a hl7:Person ;
   hl7:entityName "" ; hl7:administrativeGenderCodePrintName Sex_DE:M ;
   hl7:livingSubjectBirthTime "1983-02-25T00:00:00Z"^^xsd:dateTime .
<http://hospital.example/DB/Person/ID.1234561#record> a sdtm:Patient ;
  stdm:middleName "" ;
  stdm:dateTimeOfBirth "1983-01-02T00:00:00Z"^^xsd:dateTime .
<http://hospital.example/DB/Person/ID.1234562#record> a sdtm:Patient ;
  sdtm:middleName ""  ;
  sdtm:dateTimeOfBirth "1963-12-27T00:00:00Z"^^xsd:dateTime ;
  sdtm:sex <http://hospital.example/DB/Sex_DE#F> .
<http://hospital.example/DB/Person/ID.1234563#record> a sdtm:Patient ;
  sdtm:middleName ""  ;
  sdtm:dateTimeOfBirth "1983-02-25T00:00:00Z"^^xsd:dateTime ;
  sdtm:sex <http://hospital.example/DB/Sex_DE#F> .

Sex_DE

ID EntryCode EntryName EntryMnemonic
2 1 Male M
3 2 Female F
<http://hospital.example/DB/Zex_DE/ID.2#record>
    hl7:administrativeGenderCodePrintName "Male"@en-us ;
    Sex_DE:EntryMnemonic "M"@en-us .
<http://hospital.example/DB/Zex_DE/ID.3#record>
    hl7:administrativeGenderCodePrintName "Female"@en-us ;
    Sex_DE:EntryMnemonic "F"@en-us .
<http://hospital.example/DB/Person/ID.1234561#record>
    stdm:sex <http://hospital.example/DB/Sex_DE#M> .
<http://hospital.example/DB/Person/ID.1234562#record>
    stdm:sex <http://hospital.example/DB/Sex_DE#F> .
<http://hospital.example/DB/Person/ID.1234563#record>
    stdm:sex <http://hospital.example/DB/Sex_DE#M> .

Item_Medication

ID PatientID ItemType PerformedDTTM
99999999002 1234561 ME 2007-09-28 00:00:00
99999999003 1234562 ME 2007-09-28 00:00:00
99999999004 1234562 ME 2008-07-28 00:00:00
<http://hospital.example/DB/Person/ID.1234561#record>
    hl7:substanceAdministration <http://hospital.example/DB/Item_Medication/ID.99999999002#record> .
<http://hospital.example/DB/Item_Medication/ID.99999999002#record>
    a hl7:SubstanceAdministration ;
    hl7:effectiveTime _:t1 .
_:t1 hl7:start "2007-09-28T00:00:00"^^xsd:dateTime .
<http://hospital.example/DB/Person/ID.1234562#record>
    hl7:substanceAdministration <http://hospital.example/DB/Item_Medication/ID.99999999003#record> ;
    hl7:substanceAdministration <http://hospital.example/DB/Item_Medication/ID.99999999004#record> .
<http://hospital.example/DB/Item_Medication/ID.99999999003#record>
    a hl7:SubstanceAdministration ;
    hl7:effectiveTime _:t2 .
_:t2 hl7:start "2007-09-28T00:00:00"^^xsd:dateTime .
<http://hospital.example/DB/Item_Medication/ID.99999999004#record>
    a hl7:SubstanceAdministration ;
    hl7:effectiveTime _:t3 .
_:t3 hl7:start "2008-07-28T00:00:00"^^xsd:dateTime .

Medication

ID ItemID Dose Refill QuantityToDispense DaysToTake PrescribedByID MedDictDE
88888888002 99999999002 2 6 180 45 1004682 132139
88888888003 99999999002 2 0 180 45 1004683 132139
88888888004 99999999003 2 6 180 45 1004682 132139
88888888005 99999999004 4 6 180 45 1004682 132139
<http://hospital.example/DB/Item_Medication/ID.99999999002#record>
    hl7:consumable <http://hospital.example/DB/Medication_DE/ID.132139#record> .
_:t1 hl7:durationInDays 45 .
<http://hospital.example/DB/Item_Medication/ID.99999999003#record>
    hl7:consumable <http://hospital.example/DB/Medication_DE/ID.132139#record> .
<http://hospital.example/DB/Item_Medication/ID.99999999004#record>
    hl7:consumable <http://hospital.example/DB/Medication_DE/ID.132139#record> .
_:t2 hl7:durationInDays 45 .
<http://hospital.example/DB/Item_Medication/ID.99999999005#record>
    hl7:consumable <http://hospital.example/DB/Medication_DE/ID.132139#record> .
_:t3 hl7:durationInDays 45 .

Medication_DE

ID Entry EntryCode EntryName NDC Strength Form UnitOfMeasure DrugName DisplayName
132139 131933 98630 GlipiZIDE-Metformin HCl 2.5-250 MG Tablet 54868079500 2.5-250 TABS MG GlipiZIDE-Metformin HCl GlipiZIDE-Metformin HCl 2.5-250 MG Tablet
<http://hospital.example/DB/Medication_DE/ID.132139#record>
    hl7:displayName "GlipiZIDE-Metformin HCl 2.5-250 MG Tablet" .

NDCcodes

ingredient RxCUI labelType name NDC
6809 351273 Clinical Glipizide 2.5 MG / Metformin 500 MG Oral Tablet 54868079500
<http://hospital.example/DB/Medication_DE/ID.132139#record>
    spl:activeIngredient _:i1 .
_:i1 spl:classCode 54868079500 .
[ a sdtm:ConcomitantMedication ;
    sdtm:subject <http://hospital.example/DB/Person/ID.1234561#record> ;
    sdtm:standardizedMedicationName "GlipiZIDE-Metformin HCl 2.5-250 MG Tablet" ;
    hl7:activeIngredient [hl7:classCode 54868079500 ] ;
    sdtm:startDateTimeOfMedication "2007-09-28 00:00:00"^^xsd:dateTime ] .

[ a sdtm:ConcomitantMedication ;
    sdtm:subject <http://hospital.example/DB/Person/ID.1234562#record> ;
    sdtm:standardizedMedicationName "GlipiZIDE-Metformin HCl 2.5-250 MG Tablet" ;
    hl7:activeIngredient [ hl7:classCode 54868079500 ] ;
    sdtm:startDateTimeOfMedication "2007-09-28 00:00:00"^^xsd:dateTime ] .

[ a sdtm:ConcomitantMedication ;
    sdtm:subject <http://hospital.example/DB/Person/ID.1234562#record> ;
    sdtm:standardizedMedicationName "GlipiZIDE-Metformin HCl 2.5-250 MG Tablet" ;    
    hl7:activeIngredient [ hl7:classCode 54868079500 ] ;
    sdtm:startDateTimeOfMedication "2008-07-28 00:00:00"^^xsd:dateTime ] .

Queries Over the RDF Graph.

The RDF graphs place the relational data into the Semantic Web. There are many ways to consume RDF data, integration with other data sources, inference according to OWL or RIF rules, browsing with a linked data browser like Bubbles or Tabulator. This patient recruitment use case can be realised as a SPARQL query over the RDF graphs. Following is a query which extracts patients taking a particular class of medication (an anticoagulant, in this example) and not another (weight loss, here).

PREFIX sdtm: <http://www.sdtm.org/vocabulary#>
PREFIX spl: <http://www.hl7.org/v3ballot/xml/infrastructure/vocabulary/vocabulary#>

SELECT ?patient ?dob ?sex # ?takes ?indicDate ?indicEnd ?contra
 WHERE {
  ?patient a sdtm:Patient ;
          sdtm:middleName ?middleName ;
          sdtm:dateTimeOfBirth ?dob ;
          sdtm:sex ?sex .

  [	  sdtm:subject ?patient ;
	  sdtm:standardizedMedicationName ?takes ;

	  spl:activeIngredient [ spl:classCode 6809 ] ;
          sdtm:startDateTimeOfMedication ?indicDate
  ] .
  OPTIONAL {
  [	  sdtm:subject ?patient ;
	  sdtm:standardizedMedicationName ?disqual ;
	  spl:activeIngredient [ spl:classCode 11289 ]
          sdtm:startDateTimeOfMedication ?indicDate
  ] }
} LIMIT 30

RDF Graphs as Non-materialized Views

The RDF graphs defined by a mapping over a relational database need not be materialised. Indeed, queries or inference may be expressed in terms of a notional representation of relational data, but executed over the original relational store. Executing the following SQL query over any data in the relational tables should give results identical to a SPARQL query over the RDF graph mapped from that relational data:

SELECT R_patient.ID AS patient, R_patient.DateOfBirth AS dob, R__0_sexEntry.EntryName AS sex
  FROM Person AS R_patient
       INNER JOIN Sex_DE AS R_sexEntry
       INNER JOIN Item_Medication AS R_indicItem
       INNER JOIN Medication AS R_indicMed
       INNER JOIN Medication_DE AS R_indicDE
       INNER JOIN NDCcodes AS R_indicCode
       LEFT OUTER JOIN (
       SELECT 6 AS _DISJOINT_, R_disqualMed.MedDictDE AS disqualDE, 
              R_disqualItem.PerformedDTTM AS disqualDate, R_disqualItem.ID AS disqualItem, 
              R_disqualMed.ID AS disqualMed, R_disqualDE.NDC AS disqualNDC, 
              R_disqualItem.PatientID AS patient, R_disqualItem.EntryName AS takes
         FROM Item_Medication AS R_disqualItem
              INNER JOIN Medication AS R_disqualMed
              INNER JOIN Medication_DE AS R_disqualDE
              INNER JOIN NDCcodes AS R_disqualCode
        WHERE (R_disqualCode.NDC=R_disqualDE.NDC)
          AND (R_disqualCode.ingredient=11289)
          AND (R_disqualDE.ID=R_disqualMed.MedDictDE)
          AND (R_disqualItem.EntryName IS NOT NULL)
          AND (R_disqualItem.PatientID IS NOT NULL)
          AND (R_disqualItem.PerformedDTTM IS NOT NULL)
          AND (R_disqualMed.ID IS NOT NULL)
          AND (R_disqualMed.ItemID=R_disqualItem.ID)
                  ) AS G_opt6 ON (G_opt6.patient=R_patient.ID)
   AND (G_opt6.takes=R_indicItem.EntryName)
 WHERE (R_indicCode.NDC=R_indicDE.NDC)
   AND (R_indicCode.ingredient=6809)
   AND (R_indicDE.ID=R_indicMed.MedDictDE)
   AND (R_indicItem.EntryName IS NOT NULL)
   AND (R_indicItem.PatientID=R_patient.ID)
   AND (R_indicItem.PerformedDTTM IS NOT NULL)
   AND (R_indicMed.ID IS NOT NULL)
   AND (R_indicMed.ItemID=R_indicItem.ID)
   AND (R_patient.DateOfBirth IS NOT NULL)
   AND (R_patient.MiddleName IS NOT NULL)
   AND (R_sexEntry.EntryName IS NOT NULL)
   AND (R_sexEntry.ID=R_patient.SexDE);

This query is not a normative representation of the earlier SPARQL query; it is included here only to illustrate the ability of R2ML tools to produce efficient SQL queries.


BioPAX Use Cases

Set of use cases mapping pathway data to BioPAX or PRO .

DB 2 BioPAX

source

Namespace Prefixes

@prefix bp: <http://www.biopax.org/release/biopax-level1.owl#> .
@prefix bpdemo: <biopax-short-demo-pathway.rdf#> .

Molecule Tables

Molecules
_pk NAME SHORTNAME MOLECULAR-WEIGHT CHEMICAL-FORMULA STRUCTURE AVAILABILITY
10 alpha-D-glucose a-D-glu 180.16 C6H12O6 12 see http://www.amaze.ulb.ac.be/
MoleculeSynonyms
MOLECULE SYNONYM
10 α-D-glucose
MoleculeCrossReferences
MOLECULE CROSSREFERENCE
10 11
MoleculePhysicalEntityParticipant
_pk PHYSICAL-ENTITY_molecule STOICHIOMETRIC-COEFFICIENT CELLULAR-LOCATION_physent
9 10 1.0 31
MoleculeDataSources
MOLECULE DATASOURCE
10 14
Structures
_pk FORMAT DATA
12 SMILES C1(C(O)C(O)C(O)C(O1)CO)(O)

Protein Tables

Protiens
_pk NAME SHORTNAME SEQUENCE ORGANISM AVAILABILITY
32 glucokinase GLK MTK...HIL @blob@ 33 see http://www.amaze.ulb.ac.be/
ProteinSynonyms
PROTEIN SYNONYM
32 Glucose kinase
32 GLK_ECOLI
ProteinCrossReferences
PROTEIN CROSSREFERENCE
32 35
ProtienPhysicalEntityParticipant
_pk PHYSICAL-ENTITY_protein STOICHIOMETRIC-COEFFICIENT CELLULAR-LOCATION_ocv
31 32 1.0 15 @@
ProteinDataSources
PROTEIN DATASOURCE
32 14

Other Tables

BioSources
_pk NAME TAXON-XREF
33 Escherichia coli 34
UnificationCrossReferences
_pk ID DB
11 C00267 KEGG
34 562 taxon
35 P46880 Swiss-Prot/TrEMBL
DataSources
_pk NAME
14 aMAZE
bpdemo:smallMolecule10 a bp:smallMolecule ;
    bp:NAME  "alpha-D-glucose"^^xsd:string ;
    bp:SHORT-NAME  "a-D-glu"^^xsd:string ;
    bp:SYNONYMS  "α-D-glucose"^^xsd:string ;
    bp:MOLECULAR-WEIGHT  180.16 ;
    bp:CHEMICAL-FORMULA  "C6H12O6"^^xsd:string ;
    bp:STRUCTURE  bpdemo:chemicalStructure12 ;
    bp:XREF  bpdemo:unificationXref11 ;
    bp:XREF  bpdemo:unificationXref13 ;
    bp:DATA-SOURCE  bpdemo:KB_439584_Individual_47 ; # !! doesn't fit
    bp:DATA-SOURCE  bpdemo:dataSource14 ;
    bp:AVAILABILITY  "see http://www.amaze.ulb.ac.be/"^^xsd:string .

bpdemo:physicalEntityParticipant9 a bp:physicalEntityParticipant ;
    bp:STOICHIOMETRIC-COEFFICIENT  1.0 ;
    bp:PHYSICAL-ENTITY  bpdemo:smallMolecule10 ;
    bp:CELLULAR-LOCATION  bpdemo:physicalEntityParticipant31 .

bpdemo:physicalEntityParticipant31 a bp:physicalEntityParticipant ;
    bp:STOICHIOMETRIC-COEFFICIENT  1.0 ;
    bp:CELLULAR-LOCATION  bpdemo:openControlledVocabulary15 ;
    bp:PHYSICAL-ENTITY  bpdemo:protein32 .

bpdemo:openControlledVocabulary15 a openControlledVocabulary ;
    bp:TERM  "cytoplasm" ;
    bp:XREF  bpdemo:unificationXref16 .

bpdemo:protein32 a bp:protein ;
    bp:NAME  "glucokinase"^^xsd:string ;
    bp:SHORT-NAME  "GLK"^^xsd:string ;
    bp:SEQUENCE  "MTK...HIL"^^xsd:string ;
    bp:ORGANISM bpdemo:bioSource33 .
    bp:SYNONYMS  "Glucose kinase"^^xsd:string ;
    bp:SYNONYMS  "GLK_ECOLI"^^xsd:string ;
    bp:XREF  bpdemo:unificationXref35 ;
    bp:DATA-SOURCE  bpdemo:KB_439584_Individual_48 ; # doesn't fit
    bp:DATA-SOURCE  bpdemo:dataSource14 ;
    bp:AVAILABILITY  "see http://www.amaze.ulb.ac.be/"^^xsd:string ;

bpdemo:bioSource33 a bp:bioSource ;
    bp:NAME  "Escherichia coli"^^xsd:string ;
    bp:TAXON-XREF bpdemo:unificationXref34 .

bpdemo:unificationXref11 a bp:unificationXref ;
    bp:ID  "C00267"^^xsd:string ;
    bp:DB  "KEGG"^^xsd:string .
bpdemo:unificationXref34 a bp:unificationXref 
    bp:ID  "562"^^xsd:string ;
    bp:DB  "taxon"^^xsd:string .
bpdemo:unificationXref35 a bp:unificationXref ;
    bp:ID  "P46880"^^xsd:string ;
    bp:DB  "Swiss-Prot/TrEMBL"^^xsd:string .

bpdemo:chemicalStructure12 a bp:chemicalStructure ;
    bp:STRUCTURE-FORMAT "SMILES"^^xsd:string ;
    bp:STRUCTURE-DATA "C1(C(O)C(O)C(O)C(O1)CO)(O)"^^xsd:string .

Integrating Enterprise Relational databases for tax control

Responsible: Angela Fogarolli


Goal: Integrating relational databases and exposing them on the web or intranet based on the final RDB2RDF XG 1.1.3 and 1.1.2 use cases ) through the use of unique identifiers. This approach consist of integrating and interlinking data about entities on different databases.

Problem:

a. Join between entities described in different databases.

b. Join structured data (SQL) to structured data, from incompatible schema, or where data is dirty, poorly normalized, lacking proper keys/indices.

Requirements:

  • mapping RDB2RDF (need to write User friendly tools that implement the mappings);
  • federated SPARQL queries (SPARQL2SQL...);
  • unique identifiers ;

Use Case Description: This use case is a pilot project for the Trentino region tax agency. Trentino is an autonomous region in the north of Italy. The region has a population of 1 mil. People and more than 200 municipalities with their on information systems. The goal of the project is to integrate and link tax related data about people, organizations, buildings etc. This data come from different databases especially from the region’s many municipalities, each with their own individual data structures, and other sources. With our methodology we will provide a lightweight method for aggregating the data. In this way we are providing the user, a tax agent in our case an intelligent tool for navigating through the data present in the many different databases. The tool aggregates data and creates a profile for each tax payer. Each user profile shows different type of information , with links to other entities such as the buildings owned , payments made , location of residence etc.

Example

Supposed that we have two tables (Anagrafe and Urban_Cadastre) from different databases, we select some typical attributes for the two tables to explain our conversion method. Table “Angrafe” includs the information about two type of entities, persons and locations (person’s residence place), and some other information which are not properties of persons or locations:

Firstname Lastname City_Residence_Place Country_Residence_Place Other_Info
Paolo Bouquet Trento Italy xyz...

Table “Urban_Cadastre” contains the information about buildings and their owners:

Owner_Name Building_LocalID Building_Address Building_Type
Paolo Bouquet 123456 VIA G.LEOPARDI 3

DDL

Ddl.JPG

Using traditional RDB2RDF translation methods the RDF representation for the two example tables coming from two different databases is shown below:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:database_anagrafe="http://www.database1.org/anagrafe/">
 <rdf:Description rdf:about="http://www.database1.org/anagrafe/entry_row1">
   <database_anagrafe:Other_Info>xyz</database_anagrafe:Other_Info>
   <database_anagrafe:Country_Residence_Place>Italy</database_anagrafe:Country_Residence_Place>
   <database_anagrafe:City_Residence_Place>Trento</database_anagrafe:City_Residence_Place>
   <database_anagrafe:Last_Name>Bouquet</database_anagrafe:Last_Name>
   <database_anagrafe:First_Name>Paolo</database_anagrafe:First_Name>
 </rdf:Description> </rdf:RDF>
<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:database_urbano="http://www.database1.org/Urban_Cadastre/">
 <rdf:Description rdf:about="http://www.database1.org/Urban_Cadastre/entry_row1">
   <database_urbano:Building_Type>3</database_urbano:Building_Type>
   <database_urbano:Building_Address>VIA G.LEOPARDI</database_urbano:Building_Address>
   <database_urbano:Building_LocalID>123456</database_urbano:Building_LocalID>
   <database_urbano:Owner_Name>Paolo Bouquet</database_urbano:Owner_Name>
 </rdf:Description> </rdf:RDF>

Querying

In SQL there is no way to create a query which joins data of two tables coming from different databases. For solving the identity problem which is required by the use case creating a RDF representation is not enough. The use case demands the use of unique identifier to refer to entities in order to join descriptions about the same entity coming from different datasources.

Web applications - Wordpress

In order to make the Semantic Web useful to ordinary Web users, RDF and OWL have to be deployed on the Web on a much larger scale. Web applications such as Content Management Systems, online shops or community applications (e.g. Wikis, Blogs, Fora) already store their data in relational databases. Providing a standardized way to map the relational data structures behind these Web applications into RDF, RDF-Schema and OWL will facilitate broad penetration and enrich the Web with RDF data and ontologies and facilitate novel semantic browsing and search applications.

By supporting the long tail of Web applications and thus counteracting the centralization of the Web 2.0 applications the planned RDB2RDF standardization will help to give control over data back to end-users and thus promote a democratization of the Web.

To support this usecase scenario, the mapping language should be easily implementable for lightweight Web applications and have a shallow learning curve to foster early adoption by Web developers.

We illustrate this use case with the example of Wordpress. Wordpress is a popular blogging Web application and installed on tens of thousands of Web servers. Wordpress used a relational database (MySQL) with a relatively simple schema:

Wordpress SQL Schema

A mapping should be able to reuse existing vocabularies.

Mapped to RDF the resulting ontology should contain the classes post, attachment, tag, category, user and comment. An example instance of the post class, for example, should look like:

<http://blog.aksw.org/triplify/post/8> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdfs.org/sioc/ns#Post> .
<http://blog.aksw.org/triplify/post/8> <http://rdfs.org/sioc/ns#has_creator>             <http://blog.aksw.org/triplify/user/5> .
<http://blog.aksw.org/triplify/post/8> <http://purl.org/dc/terms/created>                "2007-02-27T17:23:36"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://blog.aksw.org/triplify/post/8> <http://purl.org/dc/elements/1.1/title>           "Submissions open: 3rd Workshop on Scripting for the Semantic Web" .
<http://blog.aksw.org/triplify/post/8> <http://rdfs.org/sioc/ns#content>                 "The submissions web-site is now open for the 3rd Workshop on Scripting for the Semantic Web..." .
<http://blog.aksw.org/triplify/post/8> <http://purl.org/dc/terms/modified>               "2008-02-22T21:41:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

An example SPARQL query on the resulting ontology could be:

SELECT ?name, ?title
WHERE {
  ?post   rdf:type       sioc:Post .
  ?post   dc:title       ?title .
  ?post   dc:has_creator ?author
  ?author foaf:name      ?name
}

Uniprot

+---------------------+
| Tables_in_uniProt   |
+---------------------+
| accToKeyword        | 
+--------+---------+
| acc    | keyword |
+--------+---------+
| Q197F8 |       1 |
| Q197F8 |       2 |
+--------+---------+
| accToTaxon          | 
+--------+--------+
| acc    | taxon  |
+--------+--------+
| Q197F8 | 345201 |
| Q197F7 | 345201 |
+--------+--------+
| author              | 
+----+-------------+
| id | val         |
+----+-------------+
|  1 | Delhon G.   |
|  2 | Tulman E.R. |
+----+-------------+
| citation            | 
+----+--------+-----------+----+
| id | acc    | reference | rp |
+----+--------+-----------+----+
|  1 | Q197F8 |         1 |  1 |
|  2 | Q197F7 |         1 |  1 |
+----+--------+-----------+----+
| citationRc          | 
+----------+--------+-------+
| citation | rcType | rcVal |
+----------+--------+-------+
|      101 |      1 |     1 |
|      102 |      1 |     2 |
+----------+--------+-------+
| citationRp          | 
+----+------------------------------------------------+
| id | val                                            |
+----+------------------------------------------------+
|  1 | NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. |
|  2 | GENOME REANNOTATION.                           |
+----+------------------------------------------------+
| comment             | 
+--------+-------------+------------+
| acc    | commentType | commentVal |
+--------+-------------+------------+
| Q91G88 |           1 |          1 |
| Q91G88 |           1 |          2 |
+--------+-------------+------------+
| commentType         |
+----+----------------------+
| id | val                  |
+----+----------------------+
|  1 | SIMILARITY           |
|  2 | SUBCELLULAR LOCATION |
+----+----------------------+
| commentVal          |
+----+--------------------------------------------------+
| id | val                                              |
+----+--------------------------------------------------+
|  1 | Belongs to the IIV-6 006L/238R/313L/468L family. |
|  2 | Contains 1 KilA-N domain.                        |
+----+--------------------------------------------------+
| commonName          |
+--------+---------------------------+
| taxon  | val                       |
+--------+---------------------------+
| 345201 | IIV-3                     |
| 345201 | Mosquito iridescent virus |
+--------+---------------------------+
| description         |
+--------+---------------------------------------------+
| acc    | val                                         |
+--------+---------------------------------------------+
| Q197F8 | RecName: Full=Uncharacterized protein 002R; |
| Q197F7 | RecName: Full=Uncharacterized protein 003L; |
+--------+---------------------------------------------+
| displayId           |
+--------+-----------+
| acc    | val       | (display ID)
+--------+-----------+
| Q197F8 | 002R_IIV3 |
| Q197F7 | 003L_IIV3 |
+--------+-----------+
| extDb               |
+----+--------+
| id | val    |
+----+--------+
|  1 | EMBL   |
|  2 | RefSeq |
+----+--------+
| extDbRef            |
+--------+-------+-------------+------------+---------+
| acc    | extDb | extAcc1     | extAcc2    | extAcc3 |
+--------+-------+-------------+------------+---------+
| Q197F8 |     1 | DQ643392    | ABF82032.1 | -       |
| Q197F8 |     2 | YP_654574.1 | -          |         |
+--------+-------+-------------+------------+---------+
| feature             |
+--------+-------+-----+--------------+-------------+-------------+-----------+
| acc    | start | end | featureClass | featureType | softEndBits | featureId |
+--------+-------+-----+--------------+-------------+-------------+-----------+
| Q197F8 |     0 | 458 |            1 |           1 |           0 |         1 |
| Q197F7 |     0 | 156 |            1 |           2 |           0 |         2 |
+--------+-------+-----+--------------+-------------+-------------+-----------+
| featureClass        |
+----+--------+
| id | val    |
+----+--------+
|  1 | CHAIN  |
|  2 | SIGNAL |
+----+--------+
| featureId           |
+----+----------------+
| id | val            |
+----+----------------+
|  1 | PRO_0000377938 |
|  2 | PRO_0000377939 |
+----+----------------+
| featureType         |
+----+------------------------------+
| id | val                          |
+----+------------------------------+
|  1 | Uncharacterized protein 002R |
|  2 | Uncharacterized protein 003L |
+----+------------------------------+
| gene                |
+--------+-----------+
| acc    | val       |
+--------+-----------+
| Q197F8 | IIV3-002R |
| Q197F7 | IIV3-003L |
+--------+-----------+
| geneLogic           |
+--------+---------------------+
| acc    | val                 |
+--------+---------------------+
| Q197F8 | ORFNames=IIV3-002R; |
| Q197F7 | ORFNames=IIV3-003L; |
+--------+---------------------+
| history             |
+----+---------+-------+------+-----------------------------------------------------+---------------------+--------+
| ix | startId | endId | who  | what                                                | modTime             | errata |
+----+---------+-------+------+-----------------------------------------------------+---------------------+--------+
|  1 |       0 |    10 | kent | New                                                 | 2009-08-23 12:31:16 | NULL   |
|  2 |       0 |     0 | kent | Add contents of 1 text file(s) to table varProtein. | 2009-08-23 12:31:16 | NULL   |
+----+---------+-------+------+-----------------------------------------------------+---------------------+--------+
| info                |
+--------+-----------+--------+-----------+------------+------------+------------+-----------+
| acc    | isCurated | aaSize | molWeight | createDate | seqDate    | annDate    | organelle |
+--------+-----------+--------+-----------+------------+------------+------------+-----------+
| Q197F8 |         1 |    458 |     53921 | 2009-06-16 | 2006-07-11 | 2009-07-28 |         0 |
| Q197F7 |         1 |    156 |     17043 | 2009-06-16 | 2006-07-11 | 2009-07-28 |         0 |
+--------+-----------+--------+-----------+------------+------------+------------+-----------+
| keyword             |
+----+------------------------+
| id | val                    |
+----+------------------------+
|  1 | Complete proteome      |
|  2 | Virus reference strain |
+----+------------------------+
| organelle           |
+----+--------------------------+
| id | val                      | (organelle name)
+----+--------------------------+
|  1 | Plasmid megaplasmid Rsp. |
|  2 | Plasmid pZM2.            |
+----+--------------------------+
| otherAcc            |
+--------+--------+
| acc    | val    | (secondary accession IDs)
+--------+--------+
| P15711 | Q4N2B5 |
| P68744 | P18556 |
+--------+--------+
| pathogenHost        |
+----------+-------+
| pathogen | host  |
+----------+-------+
|   345201 |  7163 |
|   345201 | 42431 |
+----------+-------+
| protein             |
+--------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| acc    | val                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+--------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Q197F8 | MASNTVSAQGGSNRPVRDFSNIQDVAQFLLFDPIWNEQPGSIVPWKMNREQALAERYPELQTSEPSEDYSGPVESLELLPLEIKLDIMQYLSWEQISWCKHPWLWTRWYKDNVVRVSAITFEDFQREYAFPEKIQEIHFTDTRAEEIKAILETTPNVTRLVIRRIDDMNYNTHGDLGLDDLEFLTHLMVEDACGFTDFWAPSLTHLTIKNLDMHPRWFGPVMDGIKSMQSTLKYLYIFETYGVNKPFVQWCTDNIETFYCTNSYRYENVPRPIYVWVLFQEDEWHGYRVEDNKFHRRYMYSTILHKRDTDWVENNPLKTPAQVEMYKFLLRISQLNRDGTGYESDSDPENEHFDDESFSSGEEDSSDEDDPTWAPDSDDSDWETETEEEPSVAARILEKGKLTITNLMKSLGFKPKPKKIQSIDRYFCSLDSNYNSEDEDFEYDSDSEDDDSDSEDDC |
| Q197F7 | MYQAINPCPQSWYGSPQLEREIVCKMSGAPHYPNYYPVHPNALGGAWFDTSLNARSLTTTPSLTTCTPPSLAACTPPTSLGMVDSPPHINPPRRIGTLCFDFGSAKSPQRCECVASDRPSTTSNTAPDTYRLLITNSKTRKNNYGTCRLEPLTYGI                                                                                                                                                                                                                                                                                                               |
+--------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| proteinEvidence     |
+--------+---------------------+
| acc    | proteinEvidenceType |
+--------+---------------------+
| Q197F8 |                   4 |
| Q197F7 |                   4 |
+--------+---------------------+
| proteinEvidenceType |
+----+------------------------------+
| id | val                          |
+----+------------------------------+
|  1 | Evidence at protein level    |
|  2 | Evidence at transcript level |
+----+------------------------------+
| rcType              |
+----+--------+
| id | val    |
+----+--------+
|  1 | STRAIN |
|  2 | TISSUE |
+----+--------+
| rcVal               |
+----+--------+
| id | val    |
+----+--------+
|  1 | Ankara |
|  2 | Muguga |
+----+--------+
| reference           |
+----+-------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+----------+----------+------------------------+
| id | title                                                                                                                               | cite                          | pubMed   | medline  | doi                    |
+----+-------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+----------+----------+------------------------+
|  1 | Genome of invertebrate iridescent virus type 3 (mosquito iridescent virus).                                                         | J. Virol. 80:8439-8449(2006). | 16912294 |          | 10.1128/JVI.00464-06   |
|  2 | Analysis of the first complete DNA sequence of an invertebrate iridovirus: coding strategy of the genome of Chilo iridescent virus. | Virology 286:182-196(2001).   | 11448171 | 21342589 | 10.1006/viro.2001.0963 |
+----+-------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+----------+----------+------------------------+
| referenceAuthors    |
+-----------+--------+
| reference | author |
+-----------+--------+
|         1 |      1 |
|         1 |      2 |
+-----------+--------+
| tableDescriptions   |
| taxon               |
+--------+---------------------------------+----------------------------------------------------------------------+
| id     | binomial                        | toGenus                                                              |
+--------+---------------------------------+----------------------------------------------------------------------+
| 345201 | Invertebrate iridescent virus 3 | Viruses; dsDNA viruses, no RNA stage; Iridoviridae; Chloriridovirus. |
| 176652 | Invertebrate iridescent virus 6 | Viruses; dsDNA viruses, no RNA stage; Iridoviridae; Iridovirus.      |
+--------+---------------------------------+----------------------------------------------------------------------+
| varAcc              |
+----------+--------+---------+
| varAcc   | parAcc | variant |
+----------+--------+---------+
| P15455-2 | P15455 | 2       |
| P48347-2 | P48347 | 2       |
+----------+--------+---------+
| varProtein          |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| acc      | val                                                                                                                                                                                                                                                                                                                                                                              |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| P15455-2 | MGKVIPGCAETFQDSSEFQPRFEGQGQSQRFRDMHQKVEHIRSGDTIATTPGVAQWFYNDGQEPLVIVSVFDLASHQNQLDRNPRPFYLAGNNPQGQVWLQGREQQPQKNIFNGFGPEVIAQALKIDLQTAQQLQNQDDNRGNIVRVQGPFGVIRPPLRGQRPQEEEEEEGRHGRHGNGLEETICSARCTDNLDDPSRADVYKPQLGYISTLNSYDLPILRFIRLSALRGSIRQNAMVLPQWNANANAILYVTDGEAQIQIVNDNGNRVFDGQVSQGQLIAVPQGFSVVKRATSNRFQWVEFKTNANAQINTLAGRTSVLRGLPLEVITNGFQISPEEARRVKFNTLETTLTHSSGPASYGRPRVAAA |
| P48347-2 | MENEREKQVYLAKLSEQTERYDEMVEAMKKVAQLDVELTVEERNLVSVGYKNVIGARRASWRILSSIEQKEESKGNDENVKRLKNYRKRVEDELAKVCNDILSVIDKHLIPSSNAVESTVFFYKMKGDYYRYLAEFSSGAERKEAADQSLEAYKAAVAAAENGLAPTHPVRLGLALNFSVFYYEILNSPESACQLAKQAFDDAIAELDSLNEESYKDSTLIMQLLRDNLTLWTSDLNEEGDERTKGADEPQDEV                                                                                                                   |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+---------------------+

rCAD: RNA Comparative Analysis Database

Responsible: Juan Sequeda

Goal: Exposing rCAD relational databases as RDF with a SPARQL endpoint and linking it to other datasets in the life science Linked Data cloud

Requirements:

  • mapping part of a schema to an existing domain ontology
  • automatically generating an ontology from the SQL-DDL, because a domain ontology does not exist

rCAD - RNA Comparative Analysis using SQLServer: The tremendous increase in available biological information creates opportunities to decipher the structure, function and evolution of cellular components while presenting new computational challenges for performance and scalability. To fully utilize this large increase in knowledge, it must be organized for efficient retrieval and integrated for multi-dimensional analysis. Given this, biologists are able to invent new comparative sequence analysis protocols that will yield new and different structural and functional information. Based on Microsoft SQL-server, we have designed and implemented the RNA Comparative Analysis Database -rCAD which supports comparative analysis of RNA sequence and structure, and unites, for the first time in a single environment, multiple dimensions of information necessary for alignment viewing, sequence metadata, structural annotations, structure prediction studies, structural statistics of different motifs, and phylogenetic analysis. This system provides a queryable environment that hosts efficient updates and rich analytics.


The rCAD consists of different schema: Sequence Metadata, Evolutionary Relationships, Structural Relationships and Sequence Alignment.


For this use-case, we will be presenting only the Sequence Alignment schema. The SQL-DDL for Microsoft SQL Server can be found here: rCAD Sequence Alignment SQL DDL


This use-case presents the issue of a schema that does not have an existing domain ontology in which it can be mapped to. The closest domain ontology is the Multiple Alignment Ontology (MAO) which can only be mapped to the Sequence Alignment part of the entire rCAD database. However MAO is in the OBO language. Nevertheless, OBO ontologies can be translated to OWL (and back).

Use Case: Expose Relational Data as RDF when there is no existing Domain Ontology to map the relational schema to

Rob, from the RNA lab would like to expose the Sequence Alignment data from the rCAD database as RDF. However, there is no existing domain ontology in which the relational schema can be mapped to. Therefore, an ontology should be derived automatically from the relational schema. The RDF data will be come instance of this automatically generated ontology.

The Alignment table from the rCAD database is the following:

CREATE TABLE [AlignmentClassic].[Alignment]  ( 
    [AlnID]           	int NOT NULL,
    [SeqTypeID]       	tinyint NOT NULL,
    [AlignmentName]   	varchar(max) NULL,
    [ParentAlnID]     	int NULL,
    [NextColumnNumber]	int NOT NULL,
    CONSTRAINT [PK_Alignment] PRIMARY KEY([AlnID])
)
ON [PRIMARY]
GO
ALTER TABLE [AlignmentClassic].[Alignment]
    ADD CONSTRAINT [FK_Alignment_SequenceType]
	FOREIGN KEY([SeqTypeID])
	REFERENCES [dbo].[SequenceType]([SeqTypeID])
	ON DELETE NO ACTION 
	ON UPDATE NO ACTION 
GO
ALTER TABLE [AlignmentClassic].[Alignment]
    ADD CONSTRAINT [FK_Alignment_Alignment]
	FOREIGN KEY([ParentAlnID])
	REFERENCES [AlignmentClassic].[Alignment]([AlnID])
	ON DELETE NO ACTION 
	ON UPDATE NO ACTION 
GO

and the desired ontology that is generated automatically from the relational schema is the following:

PREFIX rcad: <http://rcad.org/vocabulary/rcad.owl#>
rcad:Alignment rdf:type owl:Class .
rcad:AlignmentName rdf:type owl:DatatypeProperty.
rcad:AlignmentName rdfs:domain rcad:Alignment.
rcad:AlignmentName rdfs:range xsd:string.
rcad:NextColumnNumber rdf:type owl:DatatypeProperty.
rcad:NextColumnNumber rdfs:domain rcad:Alignment.
rcad:NextColumnNumber rdfs:range xsd:integer.
rcad:ParentAlnID rdf:type owl:ObjectProperty
rcad:ParentAlnID rdfs:domain rcad:Alignment
rcad:ParentAlnID rdfs:range rcad:Alignment

and the desired RDF triples, which are instances of the automatically generated ontology are the following:

PREFIX rcad: <http://rcad.org/vocabulary/rcad.owl#>
PREFIX rcad-data: <http://rcad.org/vocabulary/rcad-data.rdf#>

rcad-data:alignment1000 a rcad:Alignment;
	rcad:AlignmentName "My Alignment 1000"^^xsd:string;
	rcad:NextColumnNumber "123"^^xsd:int;
	rcad:ParentAlnID rcad-data:alignment2000

Use Case: Expose Mapping a Relational Database to an OWL DL ontology

Rob, from the RNA lab would like to expose the Sequence Alignment data from the rCAD database as RDF. Just recently the Multiple Alignment Ontology (MAO) as been released, which is an "ontology for data retrieval and exchange in the fields of multiple DNA/RNA alignment, protein sequence and protein structure alignment." However, this ontology has been developed in OBO. Nevertheless, OBO ontologies can be translated to OWL ontologies, specifically OWL DL. Therefore, Rob will like to map his rCAD database to the MAO ontology

The Alignment and Alignment Column tables from the rCAD database is the following:


CREATE TABLE [AlignmentClassic].[Alignment]  ( 
    [AlnID]           	int NOT NULL,
    [SeqTypeID]       	tinyint NOT NULL,
    [AlignmentName]   	varchar(max) NULL,
    [ParentAlnID]     	int NULL,
    [NextColumnNumber]	int NOT NULL,
    CONSTRAINT [PK_Alignment] PRIMARY KEY([AlnID])
)
GO
ALTER TABLE [AlignmentClassic].[Alignment]
    ADD CONSTRAINT [FK_Alignment_SequenceType]
	FOREIGN KEY([SeqTypeID])
	REFERENCES [dbo].[SequenceType]([SeqTypeID])
	ON DELETE NO ACTION 
	ON UPDATE NO ACTION 
GO
ALTER TABLE [AlignmentClassic].[Alignment]
    ADD CONSTRAINT [FK_Alignment_Alignment]
	FOREIGN KEY([ParentAlnID])
	REFERENCES [AlignmentClassic].[Alignment]([AlnID])
	ON DELETE NO ACTION 
	ON UPDATE NO ACTION 
GO

---

CREATE TABLE [AlignmentClassic].[AlignmentColumn]  ( 
    [AlnID]        	int NOT NULL,
    [ColumnNumber] 	int NOT NULL,
    [ColumnOrdinal]	int NOT NULL,
    CONSTRAINT [PK_AlignmentColumn] PRIMARY KEY([AlnID],[ColumnNumber])
)
GO
ALTER TABLE [AlignmentClassic].[AlignmentColumn]
    ADD CONSTRAINT [FK_AlignmentColumn_Alignment]
	FOREIGN KEY([AlnID])
	REFERENCES [AlignmentClassic].[Alignment]([AlnID])
	ON DELETE NO ACTION 
	ON UPDATE NO ACTION 
GO

This is just part of the Multiple Alignment Ontology in OWL DL


PREFIX ro: <http://purl.org/obo/owl/RO#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX _global: <http://purl.org/obo/owl/_global#>

_global:part_of	rdf:type	owl:ObjectProperty
_global:part_of	rdf:type	owl:TransitiveProperty
_global:part_of	rdfs:label	 "part_of"

ro:RO_0000031	rdf:type	owl:Class
ro:RO_0000031	rdfs:label	 "sub_alignment"
genid:A129876	rdf:type	owl:Restriction
ro:RO_0000031	rdfs:subClassOf	genid:A129876
genid:A129876	owl:onProperty	_global:part_of
genid:A129876	owl:someValuesFrom	ro:RO_0000000

ro:RO_0000084	rdf:type	owl:Class
ro:RO_0000084	rdfs:label	 "alignment_column"
genid:A129906	rdf:type	owl:Restriction
ro:RO_0000084	rdfs:subClassOf	genid:A129906
genid:A129906	owl:onProperty	_global:part_of
genid:A129906	owl:someValuesFrom	ro:RO_0000031

Exposing many-to-many join tables as simple triples

The community's school district maintains an RDB with basic student and personnel information, including a STUDENTS and a TEACHERS table. The relationship between the two is given in a STUDENT_TEACHER table:


+------------+-------------+
| StudentId  |  StudentName |
+------------+-------------+
|    1       |     Lee      |
|    2       |     Eric     |
|    3       |     Juan     |
|   ...      |    ...       |
+------------+-------------+


+------------+-------------+
| StudentId  |  TeacherID  |
+------------+-------------+
|    1       |     1       |
|    1       |     2       |
|    2       |     3       |
|    3       |     2       |
|   ...      |    ...      |
+------------+-------------+

+------------+-------------+
| TeacherID  |  TeacherName |
+------------+-------------+
|    1       |     Michael  |
|    2       |     Ahmed    |
|    3       |     Marcelo  |
|   ...      |    ...       |
+------------+-------------+

The SQL DDL for these tables are the following:

CREATE TABLE student (
  studentID int PRIMARY KEY,
  studentName varchar
)

CREATE TABLE teacher (
  teacherID int PRIMARY KEY,
  teacherName varchar
)

CREATE TABLE student_teacher (
  studentID int,
  teacherID int,
  PRIMARY KEY(studentID, teacherID),
  FOREIGN KEY(studentID) REFERENCES student(studentID), 
  FOREIGN KEY(teacherID) REFERENCES teacher(teacherID)
)


SemantEducaTrix, the most recent Semantic Web company to burst into the educational software market, is mapping the school system's relational database to RDF / SPARQL. They'd like to access the relationships modeled with this join table as simple links between students and teachers:

 ex:student1 ex:studentName "Lee".
 ex:student2 ex:studentName "Eric".
 ex:student2 ex:studentName "Juan".
 ex:teacher1 ex:teacherName "Michael".
 ex:teacher2 ex:teacherName "Ahmed".
 ex:teacher3 ex:teacherName "Marcelo".
 ex:student1 ex:has_teacher ex:teacher1, ex:teacher 2 ;
 ex:student2 ex:has_teacher ex:teacher3 ;
 ex:student3 ex:has_teacher ex:teacher2 ;
 ...

Value-based type specification

The TEACHERS table, see above, has a Classification column:

+------------+-----------------+--- ... ---+
| TeacherId  |  Classification |    ...    |
+------------+-----------------+--- ... ---+
|    1       |   History       |           |
|    2       |   Physics       |           |
|    3       |   Music         |           |
+------------+-----------------+-----------+

SemantEducaTrix wants to instantiate these teachers as different {{{rdf:type}}}s depending on the value in the Classification column:

 ex:teacher1 a ex:HistoryTeacher .
 ex:teacher2 a ex:PhysicsTeacher .
 ex:teacher3 a ex:MusicTeacher .