W3C is pleased to receive the Semantically-Interlinked Online Communities (SIOC) Submission from DERI, Fundación CTIC, Asemantics, OpenLink Software, Opera Software, STFC Rutherford Appleton Laboratory, and DFKI.
The goals of SIOC are described in the introduction of the SIOC Core Ontology Specification:
Semantically-Interlinked Online Communities, or SIOC, is an attempt to link online community sites, to use Semantic Web technologies to describe the information that communities have about their structure and contents, and to find related information and new connections between content items and other community objects. SIOC is based around the use of machine-readable information provided by these sites.
To achieve these goals, SIOC defines a series of OWL ontologies. These common vocabularies make it possible to exchange data among different online communities, to link conversations for the purpose of a common project, and to extend personal information (for example, FOAF data) with the roles a person may play in a specific community (e.g., contributor or administrator). The vocabularies define terms such as topics of items (blogs, news items), authors, and previous and next items (e.g., to describe threads of conversations). This subject area (blogs, wikis, etc.) makes SIOC potentially very important to Semantic Web applications in the context of various “social”, Web 2.0 sites.
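As an illustration of the "previous and next items" idea, the following is a minimal sketch of how an application might reconstruct the order of a conversation from such links. It makes several assumptions: triples are modelled as plain Python tuples rather than with an RDF library, the `ex:` resources are invented example data, and the property name `sioc:next_by_date` is taken to be the SIOC term for "next item" (check the SIOC Core Ontology Specification before relying on it).

```python
# Triples are plain (subject, predicate, object) tuples; prefixed names stand in
# for full URIs. The ex: resources are hypothetical example data.
posts = {
    ("ex:post1", "sioc:next_by_date", "ex:post2"),
    ("ex:post2", "sioc:next_by_date", "ex:post3"),
    ("ex:post1", "sioc:topic", "ex:semweb"),
}


def thread_in_order(triples, first, next_prop="sioc:next_by_date"):
    """Follow next-item links starting at `first` and return the items in order."""
    # Map each item to its successor, ignoring unrelated predicates.
    next_of = {s: o for s, p, o in triples if p == next_prop}
    ordered, seen = [], set()
    item = first
    while item is not None and item not in seen:  # `seen` guards against cycles
        ordered.append(item)
        seen.add(item)
        item = next_of.get(item)
    return ordered


thread = thread_in_order(posts, "ex:post1")
```

Because the links are ordinary RDF statements, the same traversal works regardless of which community site published the individual items.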
SIOC has the potential to become one of the foundational vocabularies that make Semantic Web applications useful, alongside DOAP, FOAF, Dublin Core, etc. SIOC itself links to those vocabularies whenever possible. For example, some of its terms are subproperties or subclasses of terms defined elsewhere (in FOAF, Dublin Core, RSS1.0); other terms make explicit reference to vocabularies such as SKOS, DOAP, and W3C’s ical using, e.g., the rdfs:seeAlso property. By doing so, the SIOC vocabulary is defined very much in the spirit of the Semantic Web, i.e., by sharing, linking, and reusing vocabularies and ontologies. One of the “vision” figures (originating from the SIOC project’s site) nicely shows the interlinked nature of SIOC with some of these vocabularies.
One of the submitted documents (“Related Ontologies and RDF Vocabularies”) gives some more details on how SIOC relates to FOAF, Dublin Core, RSS1.0 and SKOS. However, the OWL specification of the “types” namespace reveals even more connections than the ones described in the submitted document (it is a pity that those connections have not been made more explicit in the accompanying text).
A possible criticism of the submission is the lack of a primer or more detailed guide (the scholarly publications, referred to from the documents in section 3.2, do not replace those). SIOC is not a small vocabulary, and the examples given in the submission are deceptively simple compared to the complexity of what can be described. It would be good to see such guides more widely available in future.
The namespaces defined by the submission are owned by the submitter, which is encouraged in Semantic Web architecture. However, the submission does not currently describe the policy with respect to changes; in our view it should do so (see the TAG finding on Disposition of Names in an XML Namespace).
As soon as there is metadata about a person, even if the person is merely linked to some “thing”, this is considered to be personal data and falls under the regime of privacy protection. A section or paragraph about those issues is clearly missing from this submission. The privacy challenge was partly addressed by W3C in the Platform for Privacy Preferences (P3P) specification. Most sioc:Site instances that collect and distribute SIOC-related data may be interested in using P3P to describe their privacy practices. Though there is a certain overlap between some of the terms used by the two specifications (e.g., sioc:name and P3P’s user.name), and an RDF Schema for P3P is also available, the fact is that P3P applications themselves do not use this RDF schema, and we are not aware of any deployed RDF privacy tools that use it. However, there is a lot of research going on in the area of privacy, constraints (e.g., access control), and rights management related to community sites. The ontologies of the PRIME project may be of some help in addressing privacy issues, and the methods of the policy aware web project might also be inspiring for developers and users of this SIOC specification. There is clearly more research needed in this area; it is important to keep the general privacy concerns in mind for future work on SIOC.
A final comment is related to a design pattern adopted by SIOC, namely the fact that a number of property pairs are explicitly defined that are the inverse of one another (e.g., sioc:subscriber_of and sioc:has_subscriber). Although these properties are properly defined as inverses via the owl:inverseOf property (thereby avoiding any ambiguities), it should be pointed out that this practice leads to a possibly unnecessary duplication of terms. The blog entry of Tim Berners-Lee and the resulting set of comments might be of interest in this respect.
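To make the role of owl:inverseOf concrete, here is a minimal sketch of the inference it licenses: from a statement using one property of a pair, a consumer can derive the statement using the other, which is why publishing both is redundant from a reasoning point of view. The sketch assumes triples are modelled as plain Python tuples rather than with an RDF library; the `ex:` resources and the helper name `with_inverses` are invented for illustration.

```python
# Declared inverse pairs; in SIOC these come from owl:inverseOf axioms.
INVERSE_OF = {
    "sioc:subscriber_of": "sioc:has_subscriber",
}


def with_inverses(triples, inverse_of):
    """Return the triples plus every statement implied by owl:inverseOf."""
    # owl:inverseOf is symmetric, so index the declared pairs in both directions.
    both = dict(inverse_of)
    both.update({v: k for k, v in inverse_of.items()})
    closed = set(triples)
    for s, p, o in triples:
        if p in both:
            # (s p o) implies (o inverse(p) s).
            closed.add((o, both[p], s))
    return closed


# Hypothetical data: only one direction is published.
data = {("ex:alice", "sioc:subscriber_of", "ex:forum1")}
closed = with_inverses(data, INVERSE_OF)
```

A single pass is enough here, because applying the inverse rule to a derived statement only reproduces the statement it was derived from.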
There are already a number of tools, browser extensions, special browsers, etc., that are built around the SIOC concepts. A separate document, part of the W3C submission, is a report on SIOC applications and implementations, which include plug-ins for popular blogging software such as WordPress or b2evolution, and an extension to Firefox for exploring SIOC data.
The SIOC vocabulary is a useful component of the Semantic Web. SIOC is based on RDF and OWL and does not appear to demand new features of those base technologies. It would be interesting to see whether the evolution of OWL (currently referred to as OWL 1.1, a proposed new work item within W3C) may provide additional features that could be exploited in future releases of SIOC.
There is an explicit relationship between SIOC and W3C work on SKOS. Indeed, SIOC uses references to SKOS categories (e.g., as a possible object of the sioc:topic predicate). Though SKOS is still under development in the Semantic Web Deployment Working Group, this aspect of SKOS appears to be quite stable, i.e., the future evolution of SKOS is not likely to invalidate the current usage of SKOS by SIOC. On the other hand, deployed SIOC applications may provide good use cases for SKOS in the future.
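The sioc:topic usage mentioned above can be sketched as follows. This is only an illustration under stated assumptions: triples are modelled as plain Python tuples rather than with an RDF library, all `ex:` resources are invented, and typing the topic resources as `skos:Concept` is one plausible way of expressing "SKOS categories", not necessarily the form used in the submission.

```python
# Hypothetical data: items from different sites share SKOS-typed topic resources.
triples = {
    ("ex:post1", "sioc:topic", "ex:concept/semantic-web"),
    ("ex:post2", "sioc:topic", "ex:concept/privacy"),
    ("ex:wikipage7", "sioc:topic", "ex:concept/semantic-web"),
    ("ex:concept/semantic-web", "rdf:type", "skos:Concept"),
    ("ex:concept/privacy", "rdf:type", "skos:Concept"),
}


def items_about(triples, concept):
    """Return, sorted, every subject linked to `concept` via sioc:topic."""
    return sorted(s for s, p, o in triples if p == "sioc:topic" and o == concept)


def untyped_topics(triples):
    """Return sioc:topic objects not typed as skos:Concept in this data."""
    concepts = {s for s, p, o in triples if p == "rdf:type" and o == "skos:Concept"}
    return {o for s, p, o in triples if p == "sioc:topic"} - concepts


related = items_about(triples, "ex:concept/semantic-web")
```

Sharing a topic resource is what lets content from otherwise unrelated community sites be found together.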
The Semantic Web architecture allows and encourages communities to develop and deploy vocabularies and ontologies with no obligation to register them with any central authority. W3C itself normally refrains from standardizing vocabularies or ontologies for specific application areas unless they have foundational character (e.g., SKOS) or they are an integral part of some other W3C activity (e.g., CC/PP). It is indeed very much in line with the ethos of the Semantic Web for vocabularies to be defined by various communities, user groups, and domain experts, and to be made publicly available. SIOC is a fine example of such a vocabulary.
It is not clear at the moment whether any W3C Working Group or Interest Group would consider SIOC as part of its future work. Given the distributed nature of vocabulary development on the Semantic Web, the community may be better served if further evolution of widely used vocabularies such as SIOC is decentralized and not directly incorporated into a W3C activity. On the other hand, knowing about SIOC and, more importantly, referring to it from other vocabularies (including those developed by W3C Working or Interest Groups) is certainly appropriate. In this respect, having a stable SIOC specification as part of a W3C Submission is extremely helpful. W3C is pleased to be able to provide a stable URI to which others can refer. (Care should be taken, however: the vocabularies themselves, as defined by their namespaces, are living documents and may evolve in future; the submission provides a snapshot of the current status only.)
Finally, SIOC usage should be another valuable use case for the Policy Languages Interest Group, if chartered by the W3C Advisory Committee (see the PLING charter draft, currently under review by the AC).