
Squiggle: an application framework for model-driven development of real-world Semantic Search Engines

Contact e-mail: irene.celino # cefriel.it, emanuele.dellavalle # cefriel.it, dario.cerizza # cefriel.it, andrea.turati # cefriel.it

Application

General purpose and services to the end user

Squiggle is a framework that supports the development of domain-specific search engines, which exploit the semantics of domain ontologies to improve search. It covers both the conceptual indexing phase and the semantic search phase (the runtime interaction).

A more detailed description can be found at http://squiggle.cefriel.it, where you can also find the links to some running search engines: Squiggle Music to find songs and artists (at http://squiggle.cefriel.it/music) and Squiggle Ski to find images of alpine skiers (at http://squiggle.cefriel.it/ski).

Functionality examples

Application architecture

Squiggle is composed of two main parts. The Conceptual Indexer takes as input the contents to be indexed and the domain ontology, and produces a set of indexes. The Semantic Searcher queries the indexes to return matching results and semantic suggestions to the user. There are two kinds of indexes: syntactic ones, queried for textual matching (based on Apache Lucene), and semantic ones, queried for ontological matching (based on Sesame). In the running systems all components are on a single server; however, there is no technical reason why some of the components could not be distributed.
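The two-part design can be illustrated with a minimal sketch. It is not Squiggle's implementation: plain Python dictionaries stand in for the Lucene and Sesame indexes, and the function names, the sample contents and the short identifiers (`img-1`, `ski:Slalom`) are assumptions made for illustration.

```python
from collections import defaultdict


def conceptual_indexer(contents, triples):
    """Build a toy syntactic index and a toy semantic index.

    contents: {content_id: text}
    triples:  iterable of (subject, predicate, object) taken from the ontology
    """
    # Syntactic index: token -> ids of the contents containing it
    # (an in-memory stand-in for a Lucene index).
    syntactic_index = defaultdict(set)
    for content_id, text in contents.items():
        for token in text.lower().split():
            syntactic_index[token].add(content_id)

    # Semantic index: (subject, predicate) -> objects
    # (an in-memory stand-in for a Sesame repository).
    semantic_index = defaultdict(set)
    for subject, predicate, obj in triples:
        semantic_index[(subject, predicate)].add(obj)
    return syntactic_index, semantic_index


def semantic_searcher(query, syntactic_index, semantic_index):
    """Return (matching content ids, matching concept ids) for a query."""
    # Textual matching: contents that contain every query token.
    hits = [syntactic_index.get(t, set()) for t in query.lower().split()]
    results = set.intersection(*hits) if hits else set()
    # Ontological matching: concepts whose preferred label equals the query.
    concepts = {
        subject
        for (subject, predicate), labels in semantic_index.items()
        if predicate == "skos:prefLabel" and query in labels
    }
    return sorted(results), sorted(concepts)


# Hypothetical sample data.
contents = {"img-1": "Giorgio Rocca slalom race", "img-2": "downhill training"}
triples = [("ski:Slalom", "skos:prefLabel", "Slalom")]
syn, sem = conceptual_indexer(contents, triples)
print(semantic_searcher("Slalom", syn, sem))
```

Keeping the two indexes separate mirrors the description above: the textual lookup and the ontological lookup can be built, queried and, in principle, deployed independently.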

As with any other search engine, the data to be indexed are distributed over the network; the ontologies used to annotate and describe those data can be distributed as well.

Special strategies involved in the processing of user actions

As explained before, the vocabulary is used in the semantic interpretation of the query (by accessing all the labels in the knowledge base) and in generating semantic suggestions (by following some of the relationships between concepts).

Integration between vocabulary-linked functions and other application functions

The end user is provided with an interface he/she is already accustomed to, i.e. a textual search engine; the semantically enriched functionalities are hidden "behind the scenes" and are employed to give the user the added value of semantic searching.

During the semantic interpretation of a query, Squiggle compares it against the preferred labels (skos:prefLabel), the alternative labels (skos:altLabel) and the misspelled labels (skos:hiddenLabel).
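A minimal sketch of this label comparison follows. It assumes exact, case-insensitive matching, which the description does not confirm. The concept identifier `artist-rhcp` is a hypothetical short form, and the labels are taken from the Music ontology sample on this page.

```python
# Order in which label kinds are tried: preferred, alternative, hidden.
LABEL_PROPERTIES = ("prefLabel", "altLabel", "hiddenLabel")

# Hypothetical in-memory view of the labels of one concept.
CONCEPTS = {
    "artist-rhcp": {
        "prefLabel": ["Red Hot Chili Peppers"],
        "altLabel": ["RHCP", "The Red Hot Chili Peppers"],
        "hiddenLabel": ["Red Hot Chilli Peppers", "Red Hot Chilly Peppers"],
    },
}


def interpret(query, concepts):
    """Return (concept_id, label_kind) for each concept with a label equal to the query.

    Matching is exact and case-insensitive (an assumption of this sketch);
    for each concept only the first matching label kind is reported.
    """
    needle = query.strip().casefold()
    matches = []
    for concept_id, labels in concepts.items():
        for kind in LABEL_PROPERTIES:
            if any(label.casefold() == needle for label in labels.get(kind, [])):
                matches.append((concept_id, kind))
                break
    return matches


print(interpret("rhcp", CONCEPTS))
print(interpret("red hot chilly peppers", CONCEPTS))
```

Recording which kind of label matched lets an interface answer a misspelled query (a hidden-label match) by showing the preferred label instead.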

For semantic suggestions, Squiggle exploits the relationships between the identified concepts and other concepts within the ontology, so as to propose query expansion.
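The suggestion step can be sketched as a traversal of the relationships leaving an identified concept. This illustration assumes that only outgoing relationships are followed and that suggestions are grouped by property. The short identifiers are hypothetical, while the property names and labels come from the Music ontology sample on this page.

```python
# Hypothetical relationships and preferred labels.
RELATIONS = [
    ("artist-rhcp", "performs", "song-californication"),
    ("artist-rhcp", "performs", "song-otherside"),
    ("artist-rhcp", "hasStyle", "style-rock"),
]
PREF_LABELS = {
    "song-californication": "Californication",
    "song-otherside": "Otherside",
    "style-rock": "Rock",
}


def suggest(concept_id, relations, labels):
    """Group the labels of concepts related to concept_id by relationship."""
    suggestions = {}
    for subject, predicate, obj in relations:
        if subject == concept_id:
            # Fall back to the raw identifier when no label is known.
            suggestions.setdefault(predicate, []).append(labels.get(obj, obj))
    return suggestions


print(suggest("artist-rhcp", RELATIONS, PREF_LABELS))
```

Each group can then be offered to the user as a possible expansion of the original query, for example the songs performed by the artist that was found.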

Additional references

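Two domain-specific search engines were built on top of the Squiggle framework and are available online: Squiggle Music (http://squiggle.cefriel.it/music) and Squiggle Ski (http://squiggle.cefriel.it/ski).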

Some publications about Squiggle are available on the web at http://swa.cefriel.it/Publications#squiggle-pub.

Vocabularies

Title

The Squiggle framework uses the SKOS vocabulary. Squiggle Music uses a Music ontology. Squiggle Ski uses a Ski ontology.

General characteristics (size, coverage) of the vocabulary

The Music ontology contains more than 2 million triples in RDF/OWL. The knowledge base, derived from freely accessible sources like MusicBrainz (http://www.musicbrainz.org) and MusicMoz (http://www.musicmoz.org), describes artists and bands, songs, albums, music genres, etc. The Ski ontology contains more than 2000 triples in RDF/OWL. Its knowledge base was derived from information published by the International Ski Federation (http://www.fis-ski.com/) about athletes, disciplines, races, podiums, etc.

Language(s) in which the vocabulary is provided

The Music ontology is not multilingual (the names of artists and songs are not "translatable"). The Ski ontology contains the names of the disciplines in 7 languages (English, Italian, German, French, Swedish, Norwegian and Finnish).

Vocabulary extract

Music ontology:

Ski ontology:

Structure explanation

The domain vocabularies in Squiggle use hyperonymy/hyponymy, meronymy/holonymy (part-of relations), multiple wordings (homonymy/pseudonymy/synonymy) and generic semantic relationships (when two items are "related").

Music ontology:

and the bands can be connected to their components

Ski ontology:

Machine-readable representation of the vocabulary

All the data is rendered in RDF/OWL using some of the SKOS primitives. The ontologies are not publicly available on the Web; some sample triples are given below.

Music ontology:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:m="URN:it:cefriel:music#">
  <rdf:Description rdf:about="URN:it:cefriel:music#artist-8bfac288-ccc5-448d-9573-c33ea2aa5c30">
    <skos:prefLabel>Red Hot Chili Peppers</skos:prefLabel>
    <skos:altLabel>RHCP</skos:altLabel>
    <skos:altLabel>The Red Hot Chili Peppers</skos:altLabel>
    <skos:hiddenLabel>The Red Hot Chilli Peppers</skos:hiddenLabel>
    <skos:hiddenLabel>Red Hot Chilli Peppers</skos:hiddenLabel> 
    <skos:hiddenLabel>Red Hot Chilly Peppers</skos:hiddenLabel>
    <m:hasStyle rdf:resource="URN:it:cefriel:music#style-2150036357"/>
    <m:hasStyle rdf:resource="URN:it:cefriel:music#style-2147564081"/>
    <m:hasStyle rdf:resource="URN:it:cefriel:music#style-2149982882"/> 
    <m:performs>
      <rdf:Description rdf:about="URN:it:cefriel:music#song-1599985063">
        <skos:prefLabel>Californication</skos:prefLabel>
      </rdf:Description>
    </m:performs>
    <m:performs rdf:resource="URN:it:cefriel:music#song-0432808595"/>
  </rdf:Description>
  <rdf:Description rdf:about="URN:it:cefriel:music#style-2149982882">
    <skos:prefLabel>Punk</skos:prefLabel>
  </rdf:Description>
  <rdf:Description rdf:about="URN:it:cefriel:music#style-2147564081">
    <skos:prefLabel>Pop</skos:prefLabel>
  </rdf:Description>
  <rdf:Description rdf:about="URN:it:cefriel:music#style-2150036357">
    <skos:prefLabel>Rock</skos:prefLabel>
  </rdf:Description> 
  <rdf:Description rdf:about="URN:it:cefriel:music#song-0432808595">
    <skos:prefLabel>Otherside</skos:prefLabel>
  </rdf:Description> 
</rdf:RDF>

Ski ontology:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ski="URN:it:cefriel:ski#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="URN:it:cefriel:ski#Athlete">
    <skos:prefLabel>Athlete</skos:prefLabel>
    <skos:altLabel>Atleta</skos:altLabel>
  </rdf:Description>
  <ski:Athlete rdf:about="URN:it:cefriel:ski#athlete-1298393383">
    <ski:practice rdf:resource="URN:it:cefriel:ski#Slalom"/>
    <ski:practice rdf:resource="URN:it:cefriel:ski#Combined"/>
    <skos:prefLabel>ROCCA Giorgio</skos:prefLabel>
  </ski:Athlete>
  <rdf:Description rdf:about="URN:it:cefriel:ski#Slalom">
    <skos:altLabel>Slalom Speciale</skos:altLabel>
    <skos:prefLabel>Slalom</skos:prefLabel>
    <skos:altLabel>Slalåm</skos:altLabel>
    <skos:altLabel>SL</skos:altLabel>
    <skos:altLabel>Speciale</skos:altLabel>
  </rdf:Description>
  <rdf:Description rdf:about="URN:it:cefriel:ski#Combined">
    <skos:altLabel>Combinata</skos:altLabel>
    <skos:prefLabel>Combined</skos:prefLabel>
  </rdf:Description>
</rdf:RDF>
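To show how such a representation could be consumed, the following minimal sketch collects the SKOS labels from an RDF/XML fragment using only the Python standard library. It is an illustration, not part of Squiggle. The function name `skos_labels` is hypothetical, and only top-level nodes carrying `rdf:about` are inspected, so nested descriptions are ignored. A full RDF parser would be more robust than this element-level reading.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Fragment of the Ski ontology sample shown above.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="URN:it:cefriel:ski#Combined">
    <skos:altLabel>Combinata</skos:altLabel>
    <skos:prefLabel>Combined</skos:prefLabel>
  </rdf:Description>
</rdf:RDF>"""


def skos_labels(rdf_xml):
    """Map each top-level resource URI to its SKOS labels, grouped by kind."""
    labels = {}
    root = ET.fromstring(rdf_xml)
    for node in root:
        about = node.get(f"{{{RDF}}}about")
        if about is None:
            continue  # skip nodes without an rdf:about attribute
        for kind in ("prefLabel", "altLabel", "hiddenLabel"):
            for child in node.findall(f"{{{SKOS}}}{kind}"):
                labels.setdefault(about, {}).setdefault(kind, []).append(child.text)
    return labels


print(skos_labels(SAMPLE))
```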

Software applications used to create and/or maintain the vocabulary, features lacking for the case

The vocabulary maintenance is performed through an RDF/SKOS editor.

Structure of the database used to currently manage the vocabulary

Squiggle uses Sesame repositories with a MySQL backend to store the knowledge bases in RDF format, using Sesame pre-defined structures.

Standards and guidelines considered during the design and construction of the vocabulary

In the modeling of our SKOS-based ontologies, we made use of the "Quick Guide to Publishing a Thesaurus on the Semantic Web" (http://www.w3.org/TR/swbp-thesaurus-pubguide/) and the "SKOS Core Guide" (http://www.w3.org/TR/swbp-skos-core-guide/).

Additional references

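The sources of information used to build the knowledge bases are those named above: MusicBrainz (http://www.musicbrainz.org) and MusicMoz (http://www.musicmoz.org) for the Music ontology, and the International Ski Federation (http://www.fis-ski.com/) for the Ski ontology.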

Vocabulary mappings

[Note from the editor: this contribution refers to some mappings, but these are at the meta-language level, with domain-specific relations being mapped to standard SKOS or DC properties, as exemplified in the previous section.]