SWObjects - Semantic Web Toolbox

Decentralized (Biomedical) Data Access with SPARQL tools.


Resources: <SWObjects sparql binary> and <sample queries>
(contains <goProt.map>, <goProt.rq>, <goProt2.map>, <goProt2.rq>, and <goProt2-bug.rq>.)

Eric Prud'hommeaux, Sanitation Engineer.
Last modified: 2011/10/19 20:23:27
This work is licensed under a Creative Commons Attribution 3.0 License, with attribution to W3C.

Data Merging Problem

SPARQL State of the Art

Exercise: run SPARQL binary

Exercise: SPARQL as Web Server

CONSTRUCT Usage Patterns

... typically used to materialize transformations:

PREFIX mydb: <http://cityhospital.example/dbs>
CONSTRUCT { ?o a               study:SubjectObservation .
            ?o study:subject   ?p .
            ?o study:clinician ?d .
            ?d foaf:name       ?dName }

    WHERE { ?o mydb:patient ?p .
            ?o mydb:doctor  ?d .
            ?d mydb:name    ?dName }
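
To make "materialize" concrete, here is a minimal Python sketch of what a CONSTRUCT does: solve the WHERE patterns against the source triples, then instantiate the template once per solution. The tiny matcher, the helper names, and the sample rows (obs1, pat7, doc3, "Dr. Jones") are illustrative assumptions; this is not how SWObjects is implemented.

```python
# Sketch of CONSTRUCT-style materialization over plain tuples.
# Sample rows and helper names are hypothetical, not SWObjects APIs.

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):              # variable: bind it or check the earlier binding
            if result.setdefault(p, t) != t:
                return None
        elif p != t:                       # constant: must match exactly
            return None
    return result

def solve(patterns, triples):
    """Return every binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

def construct(template, where, triples):
    """Instantiate `template` once per solution of `where`."""
    out = set()
    for b in solve(where, triples):
        for pattern in template:
            out.add(tuple(b.get(term, term) for term in pattern))
    return out

# Hypothetical hospital rows in the mydb: vocabulary.
source = [
    ("obs1", "mydb:patient", "pat7"),
    ("obs1", "mydb:doctor", "doc3"),
    ("doc3", "mydb:name", "Dr. Jones"),
]
where = [("?o", "mydb:patient", "?p"),
         ("?o", "mydb:doctor", "?d"),
         ("?d", "mydb:name", "?dName")]
template = [("?o", "a", "study:SubjectObservation"),
            ("?o", "study:subject", "?p"),
            ("?o", "study:clinician", "?d"),
            ("?d", "foaf:name", "?dName")]

materialized = construct(template, where, source)
for triple in sorted(materialized):
    print(*triple, ".")
```

The output graph uses only the study: and foaf: vocabulary; nothing from mydb: leaks through.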

Exercise: Simple CONSTRUCT

Virtual Views

Query Transformation

CONSTRUCT
{
  ?g uniprot:id ?id . 
  ?g skos:prefLabel ?gene_symbol .
}
WHERE
{
  ?g <uniProt/gene#acc> ?id .
  ?g <uniProt/gene#val> ?gene_symbol .
}

Exercise: Query Transformation

goProt0.rq

SELECT ?symbol  WHERE {
  _:prot uniprot:id 'P04637' .
  _:prot skos:prefLabel ?symbol .
}

transformed query

SELECT  ?symbol
 WHERE {
   _:p <uniProt/gene#acc> "P04637"  .
   _:p <uniProt/gene#val> ?symbol .
 }

goProt0.map

CONSTRUCT
{
  ?g uniprot:id ?id . 
  ?g skos:prefLabel ?gene_symbol .
}
WHERE
{
  ?g <uniProt/gene#acc> ?id .
  ?g <uniProt/gene#val> ?gene_symbol .
}
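
The rewrite from goProt0.rq to the transformed query can be sketched in a few lines of Python: unify each query pattern with a triple in the map's CONSTRUCT template, then emit the map's WHERE body under the resulting substitution. This is a simplified illustration that assumes a single rule used once; the function names are hypothetical and do not describe SWObjects internals.

```python
# Sketch of query rewriting through a CONSTRUCT map (one rule, used once).
# Function names are hypothetical.

def unify(head, query, subst):
    """Bind the map's variables in `head` to the terms of `query`, or return None."""
    subst = dict(subst)
    for h, q in zip(head, query):
        if h.startswith("?"):              # map variable: bind to the query term
            if subst.setdefault(h, q) != q:
                return None
        elif h != q:                       # constants (predicates here) must agree
            return None
    return subst

def rewrite(query_patterns, map_head, map_body):
    """Rewrite patterns over the map's head vocabulary into its body vocabulary."""
    subst = {}
    for q in query_patterns:
        for h in map_head:
            s = unify(h, q, subst)
            if s is not None:
                subst = s
                break
        else:
            raise ValueError(f"no template triple covers {q}")
    return [tuple(subst.get(term, term) for term in pattern)
            for pattern in map_body]

# goProt0.map
head = [("?g", "uniprot:id", "?id"),
        ("?g", "skos:prefLabel", "?gene_symbol")]
body = [("?g", "<uniProt/gene#acc>", "?id"),
        ("?g", "<uniProt/gene#val>", "?gene_symbol")]

# goProt0.rq
query = [("_:prot", "uniprot:id", "'P04637'"),
         ("_:prot", "skos:prefLabel", "?symbol")]

transformed = rewrite(query, head, body)
for pattern in transformed:
    print(*pattern, ".")
```

Apart from the blank node label, the printed patterns correspond to the transformed query shown on this slide.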

goProt.ttl

<gene#_1> <uniProt/gene#acc> 'P04637' .
<gene#_1> <uniProt/gene#val> 'P53' .
<gene#_1> <uniProt/gene#val> 'TP53' .

results

?symbol
  "P53"
 "TP53"

Exercise: Query Execution

goProt0.rq

SELECT ?symbol  WHERE {
  _:prot uniprot:id 'P04637' .
  _:prot skos:prefLabel ?symbol .
}

transformed query

SELECT  ?symbol
 WHERE {
   _:p <uniProt/gene#acc> "P04637"  .
   _:p <uniProt/gene#val> ?symbol .
 }

goProt0.map

CONSTRUCT
{
  ?g uniprot:id ?id . 
  ?g skos:prefLabel ?gene_symbol .
}
WHERE
{
  ?g <uniProt/gene#acc> ?id .
  ?g <uniProt/gene#val> ?gene_symbol .
}

goProt.ttl

<gene#_1> <uniProt/gene#acc> 'P04637' .
<gene#_1> <uniProt/gene#val> 'P53' .
<gene#_1> <uniProt/gene#val> 'TP53' .

results

?symbol
  "P53"
 "TP53"

Mergeable Data

What makes data sharable?

How do we make this happen?

Producing Computable Shared Names

If it's not computable, you need a lookup.
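
A small Python sketch of the distinction, under stated assumptions: when the shared name is a function of data you already hold (here an accession appended to the UniProt prefix that appears in the federated query later in this deck), you compute it; otherwise you fall back to a table. The lookup table and the function name are hypothetical.

```python
# Computable names versus lookup. LOOKUP and shared_name() are hypothetical.

UNIPROT = "http://www.uniprot.org/uniprot/"

# Needed only for nodes whose shared name cannot be derived from their own data.
LOOKUP = {"<gene#_1>": UNIPROT + "P04637"}

def shared_name(local_node, accession=None):
    """Return the shared IRI for a local node."""
    if accession is not None:
        return UNIPROT + accession      # computable: no extra data required
    return LOOKUP[local_node]           # not computable: someone must maintain this table

computed = shared_name("<gene#_1>", accession="P04637")
looked_up = shared_name("<gene#_1>")
print(computed == looked_up)
```

The computed form scales with the data; the lookup form scales with the effort of keeping the table current.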

Exercise: Term Transformation

goProt.ttl

<gene#_1> <uniProt/gene#acc> 'P04637' .
<gene#_1> <uniProt/gene#val> 'P53' .
<gene#_1> <uniProt/gene#val> 'TP53' .

result

<http…/P04637> skos:prefLabel "p53" .
<http…/P04637> skos:prefLabel "tp53" .

goProt1.map

CONSTRUCT
{
  ?gene uniprot:id ?id . 
  ?gene skos:prefLabel ?sym .
}
WHERE
{
  SELECT (IRI(fn:concat("http…/", ?id)) AS ?gene)
         (fn:lower-case(?u_sym) AS ?sym) {
    _:x <uniProt/gene#acc> ?id .
    _:x <uniProt/gene#val> ?u_sym .
  }
}
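
As a rough illustration of what goProt1.map computes, the following Python sketch applies the same two functions (concatenate a prefix with the accession, lower-case the symbol) to the goProt.ttl rows. The slides elide the IRI prefix as "http…/"; the sketch takes it as a parameter and, as an assumption, passes the UniProt prefix that appears in the federated query later in the deck. The function name is hypothetical.

```python
# Sketch of the term transformation in goProt1.map; term_transform() is hypothetical.

def term_transform(triples, iri_prefix):
    """Mint a gene IRI from each accession and lower-case its symbols."""
    accession = {}   # local subject -> accession
    symbols = {}     # local subject -> list of symbols
    for s, p, o in triples:
        if p == "<uniProt/gene#acc>":
            accession[s] = o
        elif p == "<uniProt/gene#val>":
            symbols.setdefault(s, []).append(o)
    out = []
    for s, acc in accession.items():
        gene = "<" + iri_prefix + acc + ">"          # IRI(fn:concat(prefix, ?id))
        for sym in symbols.get(s, []):
            out.append((gene, "skos:prefLabel", sym.lower()))   # fn:lower-case(?u_sym)
    return out

go_prot = [
    ("<gene#_1>", "<uniProt/gene#acc>", "P04637"),
    ("<gene#_1>", "<uniProt/gene#val>", "P53"),
    ("<gene#_1>", "<uniProt/gene#val>", "TP53"),
]

# Prefix assumed from the goProt2 output shown later in the deck.
result = term_transform(go_prot, "http://www.uniprot.org/uniprot/")
for triple in result:
    print(*triple, ".")
```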

Exercise: Query Term Transformation

goProt1.rq

SELECT ?symbol  WHERE {
  <http…/P04637>
      skos:prefLabel ?symbol .
}

goProt1.map

CONSTRUCT
{
  ?gene uniprot:id ?id . 
  ?gene skos:prefLabel ?sym .
}
WHERE
{
  SELECT (IRI(fn:concat("http…/", ?id)) AS ?gene)
         (fn:lower-case(?u_sym) AS ?sym) {
    _:x <uniProt/gene#acc> ?id .
    _:x <uniProt/gene#val> ?u_sym .
  }
}

transformed query

SELECT ?symbol WHERE {
  SELECT (<http…/P04637> AS ?gene)
         (fn:lower-case(?_r1_0_u_sym) AS ?symbol) 
  WHERE {
    _:_r1_0_x <uniProt/gene#acc> "P04637"  .
    _:_r1_0_x <uniProt/gene#val> ?_r1_0_u_sym .
  }
}

goProt.ttl

<gene#_1> <uniProt/gene#acc> 'P04637' .
<gene#_1> <uniProt/gene#val> 'P53' .
<gene#_1> <uniProt/gene#val> 'TP53' .

results

?symbol
  "p53"
 "tp53"

Query Federation

Exercise: Choreograph a Query

sparql -m goProt2.map -np goProt2.rq

SELECT ?symbol ?label 
WHERE
{
  {
    SELECT (<http://www.uniprot.org/uniprot/P04637> AS ?gene)
           (fn:lower-case(?_uniProt_0_u_gene_symbol) AS ?symbol) 
    WHERE
    {
      SERVICE <http://localhost:8001/uniProt>
      {
        _:_uniProt_0_gene <http://ucsc.example/uniProt/gene#acc> "P04637"  .
        _:_uniProt_0_gene <http://ucsc.example/uniProt/gene#val> ?_uniProt_0_u_gene_symbol .
      }
    }
  }
  SERVICE <http://localhost:8003/go>
    {
      ?_go_0_gp <http://ucsc.example/go/gene_product#Symbol> ?symbol .
      ?_go_0_association <http://ucsc.example/go/association#gene_product_id> ?_go_0_gp .
      ?_go_0_association <http://ucsc.example/go/association#term_id> ?_go_0_t .
      ?_go_0_t <http://ucsc.example/go/term#name> ?label .
    }
}
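
Conceptually, the choreographed query sends one sub-pattern to each SERVICE and joins the two result sets on ?symbol. The Python sketch below shows only that join step, as a plain hash join over lists of bindings. The rows are illustrative: the tp53 labels are taken from the result table later in this deck, while the non-matching row is invented. How SWObjects actually schedules and joins the service calls is not covered here.

```python
# Sketch of joining two SERVICE result sets on a shared variable.
# join() and the sample rows are illustrative assumptions.

def join(left, right, var):
    """Hash join of two lists of bindings (dicts) on `var`."""
    index = {}
    for row in right:
        index.setdefault(row[var], []).append(row)
    joined = []
    for l_row in left:
        for r_row in index.get(l_row[var], []):
            joined.append({**l_row, **r_row})
    return joined

# What the uniProt service would return after fn:lower-case is applied.
uniprot_rows = [{"symbol": "p53"}, {"symbol": "tp53"}]

# What the go service might return.
go_rows = [
    {"symbol": "tp53", "label": "DNA binding"},
    {"symbol": "tp53", "label": "transcription factor activity"},
    {"symbol": "brca1", "label": "DNA repair"},   # invented row that should not join
]

rows = join(uniprot_rows, go_rows, "symbol")
for row in rows:
    print(row["symbol"], row["label"])
```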

Exercise: Debug a Choreography

sparql -m goProt2.map -np goProt2-bug.rq

failed to match triples prefixed by "!" in
SELECT ?symbol ?label 
WHERE
{
     <http://www.uniprot.org/uniprot/P04637> skos:prefLabel ?symbol .
!    ?product <http://yetanothergenevocabulary.org999/#symbol> ?symbol .
!    ?id <http://yetanothergenevocabulary.org999/#product> ?product .
     ?id <http://www.geneontology.org/dtd/go.dtdterm> ?goterm .
     ?goterm <http://www.w3.org/2000/01/rdf-schema#label> ?label .
}

sparql --debug 1 -m goProt2.map -np goProt2-bug.rq

Intra-model Mapping

Direct Mapping

People
PK              → Address(ID)
ID   fname      addr
7    Bob        18
8    Sue        NULL

Addresses
PK
ID   city       state
18   Cambridge  MA

<People/ID=7#_> <People#ID> 7 .
<People/ID=7#_> <People#fname> "Bob" .
<People/ID=7#_> <People#addr> <Addresses/ID=18#_> .
<People/ID=8#_> <People#ID> 8 .
<People/ID=8#_> <People#fname> "Sue" .

<Addresses/ID=18#_> <Addresses#ID> 18 .
<Addresses/ID=18#_> <Addresses#city> "Cambridge" .
<Addresses/ID=18#_> <Addresses#state> "MA" .
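
The table-to-triples pattern above is mechanical enough to sketch in Python. The function below follows the IRI shapes used on this slide (row node <Table/PK=value#_>, predicate <Table#column>, foreign keys pointing at the referenced row node, NULL producing no triple). It assumes single-column keys, and the function name and argument layout are hypothetical.

```python
# Sketch of the direct mapping shown on this slide; direct_graph() is hypothetical.

def direct_graph(table, pk, rows, foreign_keys=None):
    """Emit (subject, predicate, object) strings for each row of `table`."""
    foreign_keys = foreign_keys or {}
    triples = []
    for row in rows:
        subject = f"<{table}/{pk}={row[pk]}#_>"
        for column, value in row.items():
            if value is None:                       # SQL NULL: emit nothing
                continue
            predicate = f"<{table}#{column}>"
            if column in foreign_keys:              # reference: point at the target row node
                ref_table, ref_column = foreign_keys[column]
                obj = f"<{ref_table}/{ref_column}={value}#_>"
            elif isinstance(value, str):
                obj = f'"{value}"'
            else:
                obj = str(value)
            triples.append((subject, predicate, obj))
    return triples

people = [
    {"ID": 7, "fname": "Bob", "addr": 18},
    {"ID": 8, "fname": "Sue", "addr": None},
]
addresses = [{"ID": 18, "city": "Cambridge", "state": "MA"}]

people_triples = direct_graph("People", "ID", people,
                              foreign_keys={"addr": ("Addresses", "ID")})
address_triples = direct_graph("Addresses", "ID", addresses)

for triple in people_triples + address_triples:
    print(*triple, ".")
```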
      

Direct Graph

<People/ID=7#_> <People#ID> 7 .
<People/ID=7#_> <People#fname> "Bob" .
<People/ID=7#_> <People#addr> <Addresses/ID=18#_> .
<People/ID=8#_> <People#ID> 8 .
<People/ID=8#_> <People#fname> "Sue" .

<Addresses/ID=18#_> <Addresses#ID> 18 .
<Addresses/ID=18#_> <Addresses#city> "Cambridge" .
<Addresses/ID=18#_> <Addresses#state> "MA" .
      

Interface Graph

One Hammer, So Many Nails

?symbol   ?label
"tp53"    "DNA binding"
"tp53"    "transcription factor activity"
… +156 rows …

SPARQL 2 SPARQL 2 SQL

Federation Dependencies

Standards Issues

Rules Languages

The semantics align well, but implementations focus on different use case optimizations.

The Hard^h^hFun Stuff

Function Inversion

Resolve as much as possible;
huge impact on performance.

  <http://www.uniprot.org/uniprot/P04637>
           skos:prefLabel ?symbol .

=>

SELECT (fn:lower-case(?_uniProt_0_u_gene_symbol) AS ?symbol) 
 WHERE {
   _:_uniProt_0_gene <http://ucsc.example/uniProt/gene#acc> "P04637"  .
   _:_uniProt_0_gene <http://ucsc.example/uniProt/gene#val> ?_uniProt_0_u_gene_symbol .
}
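
The step that turns the constant IRI into the literal "P04637" amounts to inverting a concatenation: if the map says ?gene = IRI(concat(prefix, ?id)) and the query fixes ?gene, then ?id is whatever follows the prefix. A minimal Python sketch, with a hypothetical function name:

```python
# Sketch of inverting concat(prefix, ?x) for a known result; invert_concat() is hypothetical.

def invert_concat(prefix, bound_value):
    """Solve prefix + x == bound_value for x; None means the rule cannot apply."""
    if not bound_value.startswith(prefix):
        return None
    return bound_value[len(prefix):]

accession = invert_concat("http://www.uniprot.org/uniprot/",
                          "http://www.uniprot.org/uniprot/P04637")
print(accession)

# A constant from another namespace cannot have come from this rule.
no_match = invert_concat("http://www.uniprot.org/uniprot/",
                         "http://example.org/other/P04637")
print(no_match)
```

Not every function can be inverted this way: fn:lower-case maps several inputs to one output, which is why it stays in the transformed query as a projected expression instead of being resolved into a constant.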

Function Inversion

Resolve as deeply as possible.

SWObjects Implementation Status

Acknowledgements

Questions?