This document is no longer maintained. For a more recent report on database schemas, please see Mapping Semantic Web Data with RDBMSes.
Following is a snapshot of script output as of 22 April, 2003. The script is no longer available and the source data should be considered static.
Property | Value |
---|---|
author | Alberto Reggiori |
introduction | RDFStore implements a generic hashed data storage that can serialise RDF models, resources, properties and property values either to disk or to in-memory data structures. It supports several persistent storage models, such as SDBM, BerkeleyDB (standard and Sleepycat) and DBMS. The latter is a custom TCP/IP based storage library that allows a Perl script to transparently read and write hashed data values stored on a remote database server. One RDFStore database currently consists of 4 on-disk DB files, but a completely new indexing method that should use 5 distinct files is under development. |
implementation | The DBMS storage module is a fast networked transactional object store that uses multiple single-key hash-based BerkeleyDB databases along with object serialization in Perl and an optimized network routing daemon with a single thread/process per database. The actual running code consists of two parts: the TCP/IP daemon (written in C) and a Perl extension to tie hashes to DBMS storages. The daemon can handle multiple connections concurrently, where each table accessor is given its own thread of execution by forking. Having them forked means no locking overhead; dbmsd supports only the original BerkeleyDB 1.85 style interface. All operations are atomic and serialised using a FIFO-like algorithm; the storage supports arbitrarily sized data. To reduce latency and avoid stagger situations, dbms uses non-blocking I/O and extensive buffering, but the dbms server is still 100% I/O limited. The DBMS system has been tested in the past with threading (like rdfdb) instead of forking, but it did not show any serious performance gain (in part because threads were not yet that efficient on either FreeBSD 3.2 or Linux 2.2.3). |
query | triples-matching. Planned to add free-text search over triples and a SQL/DBI interface on persistent storages. |
inference | basic RDF Schema inference |
scalability | 1470000 triples get stored in a ~98MB database. A new indexing method is under development that should reduce the storage requirements even more. |
performance | Remote DBMS has been tested in the past for 2000/tps. Local Sleepycat BerkeleyDB storages using locking under apache+mod_perl perform ~183 read operations/second |
provenance | None |
license | BSD like license |
api | Perl5 apis |
transaction | None |
platform | Tested on FreeBSD and Linux but should run on any platform Perl runs :) |
seealso | http://rdfstore.jrc.it/dbms.html |
lastupdate | 2001-06-06 |
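
The scalability figures in the table above imply a rough per-triple storage cost. The following is a minimal arithmetic sketch, assuming the quoted ~98MB is the total on-disk size; whether "MB" means 10^6 or 2^20 bytes is not stated, so both readings are computed. The function name is hypothetical.

```python
# Figures quoted in the scalability row above.
TRIPLES = 1_470_000
DB_SIZE_MB = 98  # approximate, as reported


def bytes_per_triple(size_mb: float, triples: int, mb: int = 10**6) -> float:
    """Average on-disk bytes per stored triple for a given definition of MB."""
    return size_mb * mb / triples


decimal_estimate = bytes_per_triple(DB_SIZE_MB, TRIPLES)            # MB = 10^6 bytes
binary_estimate = bytes_per_triple(DB_SIZE_MB, TRIPLES, mb=2**20)   # MB = 2^20 bytes
print(f"{decimal_estimate:.1f} to {binary_estimate:.1f} bytes per triple")
```

Under either reading the average comes out at roughly 70 bytes per triple, including whatever index overhead the database files carry.
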
Property | Value |
---|---|
author | Tim Berners-Lee, Dan Connolly, et al |
introduction | A general purpose data processor for the Semantic Web. Not optimized, but demonstrating the feasibility of Semantic Web ideas. Please see the Cwm home page for details. |
implementation | Python, Open source |
query | Notation3 is an RDF syntax which is extended to be able to express queries and rules. |
inference | Forward chaining, with built-in functions and remote query delegation |
scalability | |
performance | Not optimized. |
provenance | |
license | W3C licence |
api | |
transaction | |
platform | Any Python platform. |
distribution | Query delegation |
seealso | |
lastupdate | 2003-02-26 |
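
Forward chaining, as listed in the inference row above, repeatedly applies rules to the known triples until nothing new can be derived. The following is a minimal sketch of that idea in plain Python. It is not Cwm's implementation and does not use Notation3 syntax; the tuple encoding, the `?`-prefixed variables and all function names are assumptions made for illustration.

```python
def match(pattern, triple, bindings):
    """Unify one pattern with one triple; terms starting with '?' are variables."""
    bindings = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if bindings.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return bindings


def solutions(premises, facts, bindings=None):
    """Yield every variable binding that satisfies all premises."""
    bindings = bindings or {}
    if not premises:
        yield bindings
        return
    first, rest = premises[0], premises[1:]
    for triple in facts:
        extended = match(first, triple, bindings)
        if extended is not None:
            yield from solutions(rest, facts, extended)


def forward_chain(facts, rules):
    """Apply rules until no new triples appear (a fixed point)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Materialise the bindings first so the fact set is not
            # modified while it is being iterated.
            for b in list(solutions(premises, facts)):
                new = tuple(b.get(term, term) for term in conclusion)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts


facts = {("alice", "parent", "bob"), ("bob", "parent", "carol")}
rules = [
    ([("?x", "parent", "?y"), ("?y", "parent", "?z")], ("?x", "grandparent", "?z")),
]
derived = forward_chain(facts, rules)
```

Here `derived` holds the two original triples plus the inferred `("alice", "grandparent", "carol")`. Built-in functions and remote query delegation are outside the scope of this sketch.
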
Property | Value |
---|---|
author | Sofia Alexaki ICS-FORTH mailto:alexaki@ics.forth.gr |
introduction | RSSDB is a persistent RDF Store for loading resource descriptions in an object-relational DBMS (ORDBMS) by exploiting the available RDF schema knowledge. It preserves the flexibility of RDF in refining schemas and/or enriching descriptions at any time whilst it can be customized in several ways (as opposed to triple-based repositories) according to the specificities of both the manipulated RDF descriptions (i.e., schemas) and the underlying RDF application queries. |
implementation | RSSDB has been implemented on top of an ORDBMS (i.e., PostgreSQL). It comprises a Loading and an Update module, both implemented in Java using a number of primitive methods (i.e., APIs) for inserting/deleting/modifying RDF triples. Access to the ORDBMS is accomplished through the JDBC interface in order to ensure interoperability with various commercial or public domain ORDBMS. |
query | Querying of stored RDF descriptions is accomplished by RQL. RQL is a typed language following a functional approach (a la ODMG OQL) and supports generalized path expressions featuring variables on both labels for nodes (i.e., classes) and edges (i.e., properties). RQL relies on a formal graph model (as opposed to triple-based approaches) that captures the RDF modeling primitives and permits the interpretation of superimposed resource descriptions by means of one or more schemas. The novelty of RQL lies in its ability to smoothly switch between schema and data querying while exploiting, in a transparent way, the taxonomies of labels and multiple classification of resources. The functionality and formal interpretation of RQL are given for several classes of useful queries required by Semantic Web Applications. For a comparison between RQL and Squish see http://swordfish.rdfweb.org:8085/tests/. |
inference | RSSDB and RQL have built-in support for recursive traversal of class and property hierarchies. Furthermore, RQL provides universal and existential quantification over RDFS classes and properties. Thus, members of a Community Web are able to query resources described according to their preferred schema, and subsequently discover how the same resources are also described using another community schema. Finally, RQL fully supports (a) XML Schema data types (for filtering literal values), (b) powerful grouping primitives (for constructing complex XML results), (c) aggregate functions (for extracting statistics) and, (d) in the near future, sorting. An online demo of the RQL filtering/navigation/restructuring capabilities is available at http://139.91.183.30:9090/RDF/RQL/. |
scalability | Our experiments showed that the size of the DBMS scales linearly with the number of triples (see http://139.91.183.30:9090/RDF/publications/semweb2001.html). We used as testbed the Open Directory RDF dump, which comprises about 6 million triples. |
performance | In most real-scale RDF applications, variations of a basic database representation are required in order to take into account the specific characteristics of the employed schema classes and properties, as well as those of the intended query functionality. The main goal of the RSSDB schema-specific representation is the separation of the RDF schema from data information, as well as the distinction between unary and binary relations holding the instances of classes and properties. We have carried out experiments in order to compare the RSSDB representation with the triple-based one, using as testbed the Open Directory RDF dump. The results illustrate that our approach yields considerable performance gains in query processing and storage volumes. Detailed information about the database size and time required for the storage of RDF descriptions, as well as about the querying time for both representations, can be found in the publication "The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases" at the URL http://139.91.183.30:9090/RDF/publications/semweb2001.html |
license | C-Web (c-web.inria.fr) Open Source Software License |
api | We are currently working on the specification of a Java API covering the whole spectrum of RDF manipulation, namely, constructing, validating, storing, updating, and querying RDF triples. |
transaction | Dependent on the underlying ORDBMS support (user option) |
platform | Java 2 Platform. Tested on Solaris and Linux (RQL+PostgreSQL). |
seealso | RDFSuite http://139.91.183.30:9090/RDF/ |
lastupdate | 2001-08-15 |
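
The performance row above contrasts a generic triple-based layout with a schema-specific one that keeps class instances in unary relations and property instances in binary relations. The following is a minimal sketch of that contrast, assuming SQLite in place of PostgreSQL so the example is self-contained. The table names, column names and sample data are hypothetical and do not reflect RSSDB's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic layout: every statement in one table.
    CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT);

    -- Schema-specific layout: one unary table per class and
    -- one binary table per property (hypothetical names).
    CREATE TABLE class_Painter (uri TEXT PRIMARY KEY);
    CREATE TABLE prop_paints (source TEXT, target TEXT);
""")

conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:picasso", "rdf:type", "ex:Painter"),
    ("ex:picasso", "ex:paints", "ex:guernica"),
])
conn.execute("INSERT INTO class_Painter VALUES ('ex:picasso')")
conn.execute("INSERT INTO prop_paints VALUES ('ex:picasso', 'ex:guernica')")

# "What do painters paint?" against the generic layout:
# a self-join on the triple table, filtered by predicate strings.
generic = conn.execute("""
    SELECT t2.object
    FROM triples t1 JOIN triples t2 ON t1.subject = t2.subject
    WHERE t1.predicate = 'rdf:type' AND t1.object = 'ex:Painter'
      AND t2.predicate = 'ex:paints'
""").fetchall()

# The same question against the schema-specific layout:
# a join between two small, dedicated tables.
specific = conn.execute("""
    SELECT p.target
    FROM class_Painter c JOIN prop_paints p ON p.source = c.uri
""").fetchall()
```

Both queries return the same rows. In the schema-specific layout the class and property names live in the table names instead of being repeated in every row, which is one plausible source of the storage and query gains the authors report.
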
Property | Value |
---|---|
author | R.V.Guha |
introduction | rdfDB is intended to be a simple, scalable, open-source database for RDF. |
implementation | Small, C based, uses Sleepycat DB for on-disk storage. |
query | Supports a graph-oriented API via a textual query language à la SQL (also known as Squish). C and Perl bindings are available for this API. |
inference | none. |
scalability | Tested with about 20 million triples. Should scale to much more. |
license | Mozilla Public License |
api | C and perl apis. |
transaction | none |
platform | Unix (linux, bsd, solaris) |
lastupdate | 2001-05-22 |
Property | Value |
---|---|
author | Dave Beckett |
introduction | Redland abstracts the storage implementation, but I'll consider the Berkeley DB (BDB) based storage which uses several (currently 3) on-disk BDB databases. |
query | triples-matching with wildcards |
inference | None |
scalability | Unknown, but tested with 1.5M stored statements. |
performance | The exact query speed is 6,200 statements/second for the 1.5M statements stored |
provenance | None |
license | 3 alternatives - GPL, LGPL or MPL (Mozilla) |
api | C (native); Perl; Python; Tcl; will compile with C++; Java support tested, not complete |
transaction | None |
platform | Pretty portable POSIX based - has been built on Linux, Solaris, OSF/1 Alpha, FreeBSD, MacOS X. |
distribution | None |
lastupdate | 2001-04-20 |
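
Triples-matching with wildcards, as listed in the query row above, means supplying a statement pattern in which any position may be left open. The following is a minimal sketch of the idea using a linear scan over an in-memory list. It is not Redland's API; the function name and the use of `None` as the wildcard are assumptions, and a real store would answer such patterns from its on-disk indexes.

```python
def find_statements(statements, subject=None, predicate=None, obj=None):
    """Return the statements matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [
        s for s in statements
        if all(want is None or want == got for want, got in zip(pattern, s))
    ]


statements = [
    ("ex:doc1", "dc:creator", "Dave"),
    ("ex:doc1", "dc:title", "Notes"),
    ("ex:doc2", "dc:creator", "Dave"),
]

by_dave = find_statements(statements, predicate="dc:creator", obj="Dave")
about_doc1 = find_statements(statements, subject="ex:doc1")
```

Leaving all three positions open returns every stored statement.
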
Property | Value |
---|---|
author | OCLC Office of Research and the Dublin Core Metadata Initiative |
introduction | EOR is an open source project, whose goal is to facilitate the rapid development of RDF applications focused on the discovery, management, integration and navigation of metadata. |
implementation | The EOR toolkit is a collection of extensible Java classes and services which serve as a code base, demonstrating by example functions and services common to RDF applications, e.g., metadata capture and search engines. |
query | Triples-matching with wildcards. Model-centric approach. RDB/JDBC data store using Melnik "Hashed With Origin" data model. |
scalability | Unknown, but tested with > 1000 RDF models. |
performance | Unknown. |
license | Dublin Core Open Source Software License |
api | SQL/JDBC |
transaction | Dependent on the underlying RDB (user option) |
platform | All Java Platforms |
lastupdate | 2001-06-12 |
Property | Value |
---|---|
author | Sesame |
introduction | Sesame is a system consisting of a repository, a query engine and an administration module for adding and deleting RDF data and Schema information. Sesame is being developed as part of the OnToKnowledge project. A public demo server running Sesame can be found at http://sesame.aidministrator.nl/. This site also contains documentation on Sesame. People who would like to test Sesame on their own data can get their own repository on this demo server. |
implementation | Sesame's repository is not only a repository for RDF, but also for RDF Schema. Sesame understands the semantics of most of the RDF Schema classes and properties and correctly handles transitive properties like rdfs:subClassOf and rdfs:subPropertyOf. Sesame currently uses PostgreSQL for its repository, but it can switch to other (kinds of) databases quite easily. The rest of Sesame is completely implemented in Java and can run on any platform for which there exists a Java 2 runtime environment. |
query | The language for the query engine is based on RQL (from ICS-FORTH), which offers full support for querying both plain RDF and RDF Schema. Our RQL implementation is slightly different than ICS-FORTH's because our interpretation of RDF Schema differs from theirs and because Sesame is less restrictive on the (RDF Schema-) ontologies that can be used. Our query engine does not support all features of RQL yet. |
inference | Sesame supports the basic inferencing needed for RDF Schema, such as transitivity of the subClassOf and subPropertyOf properties. |
scalability | Unknown, but tested with 300,000 stored statements from the wordnet nouns file available at http://www.semanticweb.org/library/ |
performance | We haven't done any serious performance testing on Sesame yet, but the aim for the OnToKnowledge project is to support *at least* ontologies of O(10^3) classes and O(10^5) triples on desk-top hardware. Note that these are minimum requirements on Sesame, and that it probably will support larger ontologies and larger numbers of triples. |
platform | Java 2 |
seealso | Some documents and papers related to Sesame: [1] OnToKnowledge deliverable 9: Query Language Definition http://www.ontoknowledge.org/countd/countdown.cgi?del9.pdf [2] Babysteps in Sesame RQL: a tutorial on Sesame's RQL http://sesame.aidministrator.nl/doc/rql-babysteps.html [3] Sesame's interpretation of RDF Schema http://sesame.aidministrator.nl/doc/rdf-interpretation.html |
lastupdate | 2001-05-22 |
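
The inference row above mentions transitivity of subClassOf: if A is a subclass of B and B is a subclass of C, then A is also a subclass of C. The following is a minimal sketch of computing the inferred superclasses by graph traversal. It is not Sesame's implementation; the dictionary encoding of the hierarchy and the class names are hypothetical.

```python
def superclasses(cls, sub_class_of):
    """Return all direct and inferred superclasses of cls.

    sub_class_of maps a class to the set of its directly stated superclasses.
    """
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


hierarchy = {
    "ex:Painter": {"ex:Artist"},
    "ex:Artist": {"ex:Person"},
}
inferred = superclasses("ex:Painter", hierarchy)
```

Here `inferred` contains both `ex:Artist`, which is stated directly, and `ex:Person`, which follows by transitivity. The same traversal applies to subPropertyOf.
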
Property | Value |
---|---|
author | The Jena Team |
introduction | Jena is an RDF toolkit. It has a storage abstraction that enables new storage subsystems to be integrated. The persistent storage mechanisms are currently based on BerkeleyDB and SQL. |
implementation | The SQL implementation of this storage system supports multiple database layouts and different database types through a mixture of Java subclassing and dynamically loaded SQL driver files. Current versions support two variants on a generic triple table layout and two variants on a hash indexed layout, and have been tested on Interbase and PostgreSQL. Other layouts are in development. The BerkeleyDB implementation uses a number of tables: the base data and the indexing tables SP->O, PO->S and OS->P. |
query | The query language is RDQL, which is a syntax and a query API that can extract information from a model. RDQL is not tied to any storage implementation but can be used with any Jena model implementation, including any storage mechanism. RDQL provides subgraph patterns and boolean expressions in an SQL-like syntax (see SquishQL). |
inference | The current toolkit does not provide any inferencing mechanisms. RDQL does not provide inference; model implementers can do so if this manifests itself through the triple interface. |
scalability | Unknown - limitations are due to the underlying storage technology. RDQL queries have been executed on 800K statement models (custom memory mapped file). In-memory storage has been used with 600K statements (wordnet). |
performance | Small scale performance tests have used a tiny fragment of the dmoz dataset and tested link following, reversed link following and search based on string and integer matching constraints. Performance on workstation class machines for the SQL store is around 10 ms per statement for loading and 1-7 ms per returned statement for search. The BDB implementation is typically 10x faster but does not provide transaction support. |
provenance | No support provided |
license | BSD (version 1) |
api | Java |
transaction | The SQL backend supports transactions through the transaction isolation capability of the underlying database. The BDB subsystem does not currently support transactions. |
platform | Java 1.2 and up. Tested on MS Windows and Linux with BerkeleyDB version 3.3.11 |
distribution | None |
lastupdate | 2001-11-23 |
seealso | More information is available at the HPLabs semantic web website |
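
The implementation row above names three indexing tables for the BerkeleyDB store: SP->O, PO->S and OS->P. Each maps a pair of statement components to the remaining one, so any lookup with two known components becomes a single keyed access. The following is a minimal in-memory sketch of that layout using Python dictionaries. It is not Jena's code; the class and method names are hypothetical.

```python
from collections import defaultdict


class TripleIndex:
    """In-memory stand-in for the three index tables SP->O, PO->S and OS->P."""

    def __init__(self):
        self.sp_o = defaultdict(set)  # (subject, predicate) -> objects
        self.po_s = defaultdict(set)  # (predicate, object)  -> subjects
        self.os_p = defaultdict(set)  # (object, subject)    -> predicates

    def add(self, s, p, o):
        """Record one statement in all three indexes."""
        self.sp_o[(s, p)].add(o)
        self.po_s[(p, o)].add(s)
        self.os_p[(o, s)].add(p)

    def objects(self, s, p):
        """Follow a link: objects for a given subject and predicate."""
        return set(self.sp_o.get((s, p), ()))

    def subjects(self, p, o):
        """Follow a link in reverse: subjects for a given predicate and object."""
        return set(self.po_s.get((p, o), ()))

    def predicates(self, o, s):
        """Find how two resources are related."""
        return set(self.os_p.get((o, s), ()))


index = TripleIndex()
index.add("ex:picasso", "ex:paints", "ex:guernica")
index.add("ex:picasso", "ex:paints", "ex:les_demoiselles")
index.add("ex:rodin", "ex:sculpts", "ex:the_thinker")
```

Writing each statement three times trades storage and load time for fast lookups in every direction, which fits the link following and reversed link following tests mentioned in the performance row.
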