HCLSHackathon2012


Semantic Web for Health Care and Life Sciences Summer School

Goals

  • Exploit Semantic Web data for health care and life sciences use cases.
  • Create code and data development communities.
  • Capture communities in VIVO.
  • Leverage communities for a lasting healthy ecosystem of SemWeb-friendly researchers and developers.

Strategy

  • Collaborate on use cases, data and code development
  • No prizes; funding covers tools and comfort only.
  • Leverage expertise focused in Boston area.
  • Grab people after they return from vacation but before their jobs get crazy.

Dates

27-30 Aug

Venue

Kiva conference room (G449)
4th floor, Gates Tower
MIT, Stata Center
32 Vassar Street, Cambridge, MA

For the Gates Tower, enter Stata via the doors nearest to Main Street. Go past the information desk (with the big "?" sign) and take the elevators at the right to the 4th floor. Exit the transporter bay doors to the right, go left at the split and then forward. The door to Kiva is on the left just past the stairs.

Dial-In

The outgoing signal is pretty hissy (room noise amplified by omni-directional microphones), but it's audible. Here's how to dial in and how to follow the events on IRC:

 phone: +1.617.761.6200
 code: 4225#
 irc:
   server: irc.w3.org
   port: 6665
   channel: #hack

For the room, the network is StataCenter.

Strategy Questions

  • planned work vs. unconference-style popularity contest
  • leverage geography without diminishing value to those not in Boston
  • dates -- 27-30 August

Agenda:

  • DAY1: Introduction to Semantic Web Technologies:

    minutes

    • 9.00 – 11.00:

      Introduction – Linked Data in Life Sciences

      Helena Deus (slides)

    • 11.00-11.15: Coffee Break
    • 11.15 – 12.30:

      Tutorial – SPARQL by Example

      Lee Feigenbaum or Eric Prud’hommeaux (slides)

      Just as RDF is machine-readable data for the Web, SPARQL is SQL for the Semantic Web. This popular tutorial will introduce the audience to the web of data and get them comfortable using any of several tools to query, integrate and manipulate that data.

    • 12.30-13.30: Lunch
    • 13.30-15.30:

      SADI – enabling distributed biological reasoning

      Luke McCarthy (slides)

      @@fill in abstract from SADI website and Luke's text@@

    • 15.30-15.40: Coffee Break
    • 15.40-17.00:

      Talk – Big datasets in Life Sciences

      Helena F. Deus (slides)

      Big Data is the buzzword these days. With the price of sequencing dropping faster than Moore's law, biotechnology is producing some of the largest datasets known to man. The added value of this data comes from our ability to extract the meaningful pieces of information so that they can be used for decision making and strategic planning. This talk will provide an overview of life sciences datasets available on the web and how they can be used for discovery and hypothesis support.

  • DAY2: Datasets and Tools:

    minutes

    • 9.00 – 10.30:

      Tutorial - Tools for data integration

      Eric Prud’hommeaux and Helena F. Deus (slides)

      BioHackers need tools to integrate data from different sources. Semantic Web and Linked Data technologies can be used to accelerate the integration of heterogeneous datasets in Life Sciences and to discover and infer knowledge. In this hands-on tutorial, we will demonstrate how some of the most complex issues in data integration can be solved using SWobjects.

    • 10.30-11.00: Coffee Break
    • 11.00 – 12.30:

      Tutorial - Tools for data integration (continued)

    • 12.30-13.30: Lunch
    • 13.30-15.30:

      Seminar – BEL Framework

      Ted Slater (slides)

      The Biological Expression Language (BEL) Framework is a set of technologies for capturing, storing and operationalizing structured biological knowledge. This talk will cover the BEL syntax and progress on its mapping to RDF.

    • 15.30-15.40: Coffee Break
    • 15.40-16.20:

      Seminar - BioPAX/SBPAX

      Oliver Ruebenacker (slides)

      Learn how to use BioPAX/SBPAX to turn biological pathways and networks into kinetic models for computer simulations and other methods for a quantitative understanding of living organisms.

    • 16.20 – 17.30:

      Talk – Healthcare Informatics Infrastructure

      Eric Prud’hommeaux (slides), Travers Franckle (slides)

      Understand the relationships between traditional standards organizations for clinical care (HL7) and clinical trials (CDISC) and the terminologies upon which they rely for practical semantics. HL7's Reference Information Model and emergent formats like I2B2 Shrine, Indivo and SMART provide a structural framework for extracting meaning from clinical records in order to enable clinical decision support, monitor efficacy, and detect and respond to adverse events.

    • 16.20 – 17.30:

      Goals Discussion

      Community

      What have we achieved so far; what are the next tasks?

  • DAY3: Reasoning, Vendor Demonstrations and Lightning Talks:

    minutes

    • 9.00 – 11.00:

      Reasoning on the Semantic Web

      Eric Prud’hommeaux

      If 80% of our data discovery and retrieval needs for any project can be met with simple data sources and SPARQL, how can we meet the rest of the project's needs? This talk will describe the reasoning systems available on the Semantic Web and show examples of scalable reasoning systems like JESS.

    • 11.00-11.30: Coffee Break
    • 11.30-12.30:

      Tutorial – Ontology Design and Constraints Checking With OWL and Protégé

      Eric Prud'hommeaux and Luke McCarthy

      When manipulating large numbers of inter-related classes, ontology design can be a life-saver or a time sink. The Protégé tool makes ontology design as easy as it can be. Learn by doing, following the University of Manchester Pizza Tutorial (a gentle introduction to classifying pizzas and parts of pizzas).

    • 12.30-13.30: Lunch
    • 14.30-15.30:

      Vendor Demonstrations

      Various Vendors, e.g. Cambridge Semantics, Metaome, Virtuoso

      Learn how to use existing products to work with data on the Semantic Web.

    • 15.30-16.00: Coffee Break
    • 16.00-16.30:

      R2RML with XSparql

      Download SaxonHE, then install its jar into the local Maven repository and build XSPARQL from source:

       unzip ~/Downloads/SaxonHE9-4-0-4J.zip
       mvn install:install-file -DgroupId=net.sf.saxon -DartifactId=saxon -Dversion=9.3 -Dpackaging=jar -Dfile=saxon9he.jar 
       svn co https://xsparql.svn.sourceforge.net/svnroot/xsparql/trunk xsparql
       cd xsparql
       mvn install
      

      Go get a cup of coffee; then come back and download dbConfig and SPAAACE-map.ttl. The first command below prints the mapped data; the second pipes it into a SPARQL query:

       java -jar cli/target/cli-0.4-SNAPSHOT-jar-with-dependencies.jar -r2rml SPAAACE-map.ttl --mysql -dbConfig dbConfig | sed 's/&gt;/\>/g'|sed 's/&lt;/\</g'
       
       java -jar cli/target/cli-0.4-SNAPSHOT-jar-with-dependencies.jar -r2rml SPAAACE-map.ttl --mysql -dbConfig dbConfig | sed 's/&gt;/\>/g'|sed 's/&lt;/\</g' | sparql -l turtle -d - -e "PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name { ?x a foaf:Person ; foaf:name  ?name }"
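
      The two sed stages in the commands above turn the XML entities &lt; and &gt; in the tool's output back into literal angle brackets, presumably so that IRIs parse as Turtle. A minimal sketch of just that step follows; the function name and the sample triple are made up for illustration (example.org is not part of the workshop data), and the replacement side is written without the backslashes used above.

```shell
# Hypothetical helper name; the original pipeline inlines two separate sed calls.
unescape_angle_brackets() {
  # Replace the XML entities &lt; and &gt; with literal < and >.
  sed -e 's/&lt;/</g' -e 's/&gt;/>/g'
}

# Made-up sample line standing in for one line of escaped output.
sample='&lt;http://example.org/person/1&gt; a &lt;http://xmlns.com/foaf/0.1/Person&gt; .'

# Print the sample with its angle brackets restored.
printf '%s\n' "$sample" | unescape_angle_brackets
```

      In the real pipeline the same filter sits between the XSPARQL command and the sparql command, reading standard input and writing standard output.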
      
    • 16.30-17.30:

      Lightning talks and project pitches

      Community

      The participants have substantial expertise in areas of interest to the community. We will ask people to present their skills and their interests in order to develop collaborative projects. In addition to exposing biological and clinical data sources as RDF, the community will have needs for interface development, outreach material, etc.

  • DAY4: Spillover and Wrap-up

    minutes

  • 9.00 – 10.00:

    Lightning Talks

  • 10.00 – 10.30:

    Elsevier - Semantic Web in HCLS for Commercial Applications

    Iker Huerga, Elsevier Inc. (slides)

    Even though the value added by the Semantic Web in HCLS has been clearly defined, there are still very few commercial applications of the Semantic Web. During this presentation we'll show how Elsevier has put the Semantic Web into practice behind the scenes in some of its commercial applications.

  • 10.30 – 11.30:

    Tutorial - SPARQL Rules

    Iker Huerga, Elsevier Inc.

    SPARQL Rules are a collection of RDF vocabularies, such as the SPARQL Inferencing Notation (SPIN), that build on the W3C SPARQL standard to let you define new functions, stored procedures, constraint checking, and inferencing rules for your Semantic Web models.

  • ??-??

    Jena Tutorial

    Ian Jacobi (slides)

Additional Ideas for the Next Hackathon

  • Going from XML to RDF
  • Macros for going from a spreadsheet to RDF
  • Going from RDF to relational (in order to use SQL-oriented tools)
  • Using the Sesame API

Registration

We're sorry but registration for food and seating is now full. You are welcome to come by Kiva but seating priority goes to those who have already registered. Please contact eric at w3 dot org with any questions.

Sponsors