The aim is to describe documents published under the formal W3C process as 'Semantic Assets,' defined
by ADMS as:
highly reusable metadata (e.g. xml schemata, generic data models) and reference data (e.g. code lists, taxonomies, dictionaries, vocabularies) which are used for eGovernment system development.
The full ADMS data model is shown in the diagram below, which includes a lot of detail. The basic
structure of Repository-Asset-Distribution is common to several vocabularies of this general type,
and the RADion vocabulary encapsulates it, providing a substrate for ADMS. Notice that ADMS includes cardinality constraints on several properties, although many properties are optional.
To aid discussion of how we have implemented this, we'll refer to the following example (written in Turtle):
1 @prefix : <http://www.w3.org/ns/adms#> .
2 @prefix cat: <http://www.w3.org/2012/05/cat#> .
3 @prefix data: <http://www.w3.org/data#> .
4 @prefix radion: <http://www.w3.org/ns/radion#> .
5 @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
6 @prefix xhv: <http://www.w3.org/1999/xhtml/vocab#> .
7 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
8 @prefix dcterms: <http://purl.org/dc/terms/> .
9 <http://www.w3.org/TR/2009/REC-skos-reference-20090818/> a :SemanticAsset, :SemanticAssetDistribution;
10 dcterms:description """This document defines the Simple Knowledge Organization System (SKOS), a common data model for sharing and linking knowledge organization systems via the Web.""";
11 dcterms:format <http://mediatypes.appspot.com/text/html>;
12 dcterms:issued "2009-08-18";
13 dcterms:license cat:DocLicense;
14 dcterms:publisher data:W3C;
15 dcterms:title "SKOS Simple Knowledge Organization System Reference";
16 dcterms:type <http://purl.org/adms/assettype/InformationExchangePackageDescription>;
17 :accessURL <http://www.w3.org/TR/2009/REC-skos-reference-20090818/>;
18 :status <http://purl.org/adms/status/Completed>;
19 xhv:last <http://www.w3.org/TR/skos-reference>;
20 xhv:previous <http://www.w3.org/TR/2009/PR-skos-reference-20090615/>;
21 rdfs:label "SKOS Simple Knowledge Organization System Reference";
22 radion:distribution <http://www.w3.org/TR/2009/REC-skos-reference-20090818/> .
According to ADMS, each W3C document is a Semantic Asset (and therefore also a radion:Asset) and the data model makes clear that the following properties are required:
The RDF schema for ADMS uses rdfs:label to provide an asset's name. For documents, the dcterms:title property feels more natural, so we provide both (lines 15 and 21).
The description of the asset can be considered the abstract of the W3C document, and that is what we provide where it is available (line 10). Not every document in TR space has an abstract, but many do.
In line 14 we link the asset to a long-published piece of data that describes W3C.
All our Assets are also Distributions, so for each document we assert two types
in line 9 (adms:SemanticAsset and adms:SemanticAssetDistribution). Although it is not
strictly necessary, we also assert the RADion triple that links an asset to its distribution (line 22).
Queries run against the data are likely to look for the
radion:distribution property, and omitting it could lead to false negatives.
The cardinality constraints on a SemanticAssetDistribution mean that we need to add further triples.
Some catalogues make a distinction between the identifier for an asset and the URL from where it can be obtained. Although this does not apply to W3C where all identifiers are URIs from which the asset can be accessed directly, we need to include the adms:accessURL property for conformance with the ADMS model (line 17).
The format of the Asset must be given; for all documents in TR space it is text/html. ADMS recommends
using Ed Summers' work at http://mediatypes.appspot.com for this (line 11).
We do not need to use the adms:representationTechnique property, which is useful for providing a
finer-grained description of a document format than is possible through MIME types (such as "Word 6.0" versus
"Word 97", both of which have the MIME type application/msword).
Distributions are required to declare a license. For this we refer to the W3C document license in line 13. The license had not been described in RDF before, so we created a description quickly (one that could no doubt be improved). It simply says:
This creates a little class for the license that is given the label "W3C Document License"
(in English), and we declare that it is the subject of the document at http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231 (which is actually the primary document itself). We also had to choose a license type from the list provided in the ADMS spec; "No Derivative Work" is the appropriate one for W3C documents.
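As an illustration only, a description along the lines just outlined might look like the sketch below. The cat: namespace and the license document URL come from the text above. The use of rdfs:Class, foaf:isPrimaryTopicOf and dcterms:type, and the licence type URI, are assumptions made for this sketch and may differ from the published data.

```turtle
@prefix cat: <http://www.w3.org/2012/05/cat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Sketch only: the property choices below are assumptions, not the published data.
cat:DocLicense a rdfs:Class ;
    rdfs:label "W3C Document License"@en ;
    # Assumed way of stating that the license is the subject of the legal document.
    foaf:isPrimaryTopicOf <http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231> ;
    # Assumed URI for the ADMS "No Derivative Work" licence type.
    dcterms:type <http://purl.org/adms/licencetype/NoDerivativeWork> .
```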
On line 16 we can see that ADMS uses dcterms:type to point to the asset type and provides a controlled vocabulary of possible values. Of these, "Information Exchange Package Description" is less than ideal but is the nearest acceptable value. Several of the controlled vocabularies defined within ADMS are encoded as SKOS concept schemes that have been given purl.org URLs. These currently dereference to a file at https://joinup.ec.europa.eu/svn/adms/ADMS_v1.00/ADMS_SKOS_v1.00.rdf, which is seen as a temporary location (but the PURLs will remain stable, of course).
In a similar vein to the asset type, ADMS provides a controlled vocabulary for the status of assets.
The W3C process is such that documents will fall into
one of the first three of these (Completed, Deprecated or Under development) but not the fourth, Withdrawn. The simplest mappings are Recommendations, which are Completed,
and Superseded and Retired documents, which both map to Deprecated. The remainder must be classed as Under development. This
is easy to understand for Working Drafts, but we also have 'Notes.' These are documents produced by Working
Groups, Interest Groups etc. that have no formal standing. They can therefore be made obsolete at any time
and must count as being 'Under development' even if the relevant working group was disbanded long ago.
For this reason, and because the data is generated by algorithm, it is possible for a single document to be
both Under development and Deprecated. The example shows a Recommendation, which is linked to the Completed status on line 18.
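To make the double classification concrete, the sketch below shows the kind of triples the mapping could produce for a Note that has since been superseded. The document URI is hypothetical, and the UnderDevelopment and Deprecated status URIs are assumed to follow the same pattern as the Completed URI used in the example.

```turtle
@prefix adms: <http://www.w3.org/ns/adms#> .

# Hypothetical superseded Note: it is classed as Under development because it is
# a Note, and as Deprecated because it has been superseded.
# The two status URIs are assumed to mirror <http://purl.org/adms/status/Completed>.
<http://example.org/TR/2010/NOTE-example-20100101/>
    adms:status <http://purl.org/adms/status/UnderDevelopment> ,
                <http://purl.org/adms/status/Deprecated> .
```

A consumer that expects exactly one status per asset would need to decide which of the two values takes precedence.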
Finally, we add in some optional properties for which we have data readily available.
And that's it, for now. We publish other data on w3.org, notably about our
translations, and there is data available
that should allow us to link the majority of TR space documents to information about the
working groups that produced them and their subject matter (such as HTML, CSS, SKOS etc.). Those are
details we hope to add in the near future, but for now the data at http://www.w3.org/2012/06/tr2adms/adms
is, we feel, a good start.