The SmartOpenData project, SmOD, developed a Linked Data model based on the European Union's INSPIRE data specifications. The SmOD work led to the creation of a set of very small vocabularies that define classes and properties mirroring those in INSPIRE that were useful to a series of pilots focusing on the rural economy, tourism, protected sites etc.
This document describes and aggregates the set of SmOD-INSPIRE vocabularies.
This vocabulary is stable. Definitions may be updated to clarify semantics if appropriate but the basic definitions will not change. If you wish to add terms in the INSPIRE model not included here, please contact Phil Archer.
This is not a W3C standard and has not been endorsed by the W3C Membership.
The INSPIRE data model is large, complex, and designed for use in a Geospatial Information System and not for Linked Data. Rather than try to replicate the whole model in RDF, SmOD takes a much more Linked Data centric approach, re-using concepts wherever necessary but simplifying them where possible. An important reference in this work was the Study on RDF and PIDs for INSPIRE by Diederik Tirry and Danny Vandenbroucke. It summarised work by three experts: Clemens Portele, Linda van den Brink and Stuart Williams.
In addition to the vocabularies described here, the Smart Open Data Project also developed a specific vocabulary for its pilot projects and a SKOS concept scheme for the Corine Land Cover Nomenclature.
The Study on RDF and PIDs's first recommendation is that RDF namespaces should be aligned
with the XML namespaces so that, for example, the namespace used for Protected Sites
should be http://inspire.jrc.ec.europa.eu/schemas/ps/3.0/. However, the project received
anecdotal information that it would be unwise to wait for the RDF schemas to be developed and published
at those URLs. Therefore the decision was taken to use the W3C infrastructure with namespaces of the form
http://www.w3.org/2015/03/inspire/{xx}, where xx is the INSPIRE theme. The W3C website is extremely stable and these
namespaces should be considered as persistent although they are not the product of any W3C
working group. If the European Commission's Joint Research Centre were to publish its own schemas
then the ones created by SmOD would be deprecated in favour of them. If, however, the JRC or
other parties wish to extend the vocabularies hosted at W3C then this would be possible,
particularly through the Locations and Addresses Community Group.
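As a rough illustration of the namespace pattern, the following Python sketch builds the namespace URI for a theme code. The helper name `smod_namespace` is hypothetical, and the trailing `#` is an assumption taken from the namespaces listed in the theme tables later in this document.

```python
# Hypothetical helper: builds a SmOD-INSPIRE namespace URI from a theme code.
BASE = "http://www.w3.org/2015/03/inspire/"


def smod_namespace(theme: str) -> str:
    """Return the namespace URI for an INSPIRE theme code such as 'ps'."""
    code = theme.strip().lower()
    if not (code.isascii() and code.isalpha()):
        raise ValueError(f"unexpected theme code: {theme!r}")
    # Assumption: the '#' suffix follows the namespaces shown in the theme tables.
    return f"{BASE}{code}#"


print(smod_namespace("ps"))
print(smod_namespace("lu"))
```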
The most visible difference between the INSPIRE model and the SmOD interpretation is the
elimination of the Geographical Names theme. The full INSPIRE model supports the provision
of multiple spellings of place names, using multiple scripts, linked to audio files for
pronunciation and more. Initial work in the project used a model that mirrored this. However, the
result was a lot of complexity in the data with many of the properties unused and
several unnecessary blank nodes. The Study on RDF and PIDs's recommendation in section 3.2.12
is to simply use rdfs:label
(a string with an optional language tag). This simplification
risks losing some of the rich data that might be available in some situations but is absent in the
SmOD pilots. Making this change was in line with the overall recommendation of 'thinking Linked Data'
and made the data in the pilots much easier to work with at a stroke.
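To make the simplification concrete, the sketch below writes place names as plain rdfs:label statements in Turtle, one per language tag. The function name and the example.org URI are hypothetical, and the rdfs prefix is assumed to be declared elsewhere in the output file.

```python
def label_triples(subject_uri: str, labels: dict) -> list:
    """Return one Turtle statement per language-tagged label."""
    lines = []
    for lang, text in sorted(labels.items()):
        # Escape backslashes and double quotes for a Turtle string literal.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_uri}> rdfs:label "{escaped}"@{lang} .')
    return lines


# Hypothetical subject URI; the two names are the Finnish and Swedish forms.
for line in label_triples(
    "http://example.org/place/1", {"fi": "Helsinki", "sv": "Helsingfors"}
):
    print(line)
```

Each name is a single statement with no blank nodes, which is the property of the simplified model that the pilots benefited from.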
The SmOD partners have also used GeoSPARQL's gsp:SpatialObject
class in preference
to creating a new class of gcm:SpatialObject
(INSPIRE has a 'generic concept model' at its core).
The definition of gsp:SpatialObject
is:
"The class Spatial Object represents everything that can have a spatial representation. It
is superclass of feature and geometry." [GeoSPARQL, PDF, p6].
This very general class therefore fits very well
within INSPIRE and the SmOD pilots and there is no gain in defining an INSPIRE-specific class of the same name.
This section describes the decisions made when incorporating each INSPIRE theme in the SmartOpenData model.
Namespace | http://www.w3.org/2015/03/inspire/ps# |
---|---|
Class | ps:ProtectedSite |
Object Properties | ps:legalFoundationDocument |
Datatype property | ps:legalFoundationDate |
Concept Schemes used | INSPIRE Registry, Protection Classification |
External vocabularies used | FOAF, ORG |
Protected Sites are defined by a document that details the relevant protection;
this might be legislation but is more usually some sort of order or notice. Using
ps:legalFoundationDocument to link to a class describing a
document, such as the gcm:DocumentCitation class, is quite awkward from a
Linked Data point of view. The natural thing to do is simply to link to the document
itself, and that document will be an instance of the well-used foaf:Document class.
The ps:legalFoundationDocument
property has this as its range. But such
documents may not be available online (or their URL unknown) and so a method of
referring to offline documents needs to be provided. SmOD simplifies the various
properties of the INSPIRE gcm:DocumentCitation
class (title, shortname, date etc.) down
to the dcterms:bibliographicCitation
property. Where a legal foundation document
is not available online, a blank node will be created in the graph with this
property that gives a reference to the actual document. Where the document does exist online it will be linked to directly.
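The two cases described above can be sketched as follows. This is only an illustration: the function name and the example.org URIs are hypothetical, and the ps and dcterms prefixes are assumed to be declared elsewhere in the Turtle output.

```python
def legal_foundation_turtle(site_uri, document_url=None, citation=None):
    """Return a Turtle statement for a site's legal foundation document."""
    if document_url:
        # Online document: link to it directly.
        return f"<{site_uri}> ps:legalFoundationDocument <{document_url}> ."
    if citation:
        # Offline document: a blank node carrying a bibliographic citation.
        escaped = citation.replace("\\", "\\\\").replace('"', '\\"')
        return (
            f"<{site_uri}> ps:legalFoundationDocument "
            f'[ dcterms:bibliographicCitation "{escaped}" ] .'
        )
    raise ValueError("either document_url or citation is required")


print(legal_foundation_turtle(
    "http://example.org/site/1", document_url="http://example.org/doc/order.pdf"
))
print(legal_foundation_turtle(
    "http://example.org/site/1", citation="Example Order 1999"
))
```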
The Protected Sites theme is the first of many in the SmartOpenData model that make use of the SKOS concept schemes published in the INSPIRE Registry.
The ps:siteDesignation
property can point to one or more of the specialisations
of the Designation Value code list type (http://inspire.ec.europa.eu/codelist/DesignationValue/).
These are:
So, for example, if a site were designated as a special area of conservation under
Natura 2000, the value of the ps:siteDesignation property would be
http://inspire.ec.europa.eu/codelist/Natura2000DesignationValue/specialAreaOfConservation.
The provision of SKOS Concepts schemes makes this easy and avoids this or any other project
writing its own version of designation schemes like Natura 2000. However, the Registry
does not provide SKOS concepts schemes for all aspects of INSPIRE or some of the closely
related data models. For example, the ps:siteProtectionClassification
property in the
Protected Sites theme takes one of 7 enumerated values:
In XML-centric systems these would be provided as strings but in Linked Data, they are better
rendered as SKOS concepts so that they can be pointed to via their URI, with multilingual labels etc.
A very simple SKOS concept scheme was created to provide such URIs for the values in the ProtectedClassificationValue
enumeration at http://www.w3.org/2015/03/inspire/ProtectionClassification#.
Replacing the namespace URI with the prefix pspc, each of the terms in the
list can be referred to as pspc:natureConservation, pspc:archaeological etc. Note that
the lower camel case capitalisation has been preserved from the original, rather than the more
usual practice in Linked Data of naming classes using title case.
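The relationship between the prefixed names and the full URIs can be shown with a small expansion routine. The `expand` function and the `PREFIXES` table are illustrative names only; the two namespace URIs are those given in this document.

```python
# Prefix table: both namespace URIs are taken from this document.
PREFIXES = {
    "ps": "http://www.w3.org/2015/03/inspire/ps#",
    "pspc": "http://www.w3.org/2015/03/inspire/ProtectionClassification#",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'pspc:archaeological' to a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


print(expand("pspc:natureConservation"))
print(expand("pspc:archaeological"))
```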
The SmOD data model makes use of the Protected Sites Simple data model from
INSPIRE but takes one extra class and relationship from Protected Sites Full,
namely ps:isManagedBy. This has a range of foaf:Agent to keep the vocabulary
as general as possible but it is expected that, in practice, org:Organization (or one of its sub classes)
will be used. org:Organization is a sub class of foaf:Agent. In some cases
foaf:Group or even foaf:Person will be better; both of these are also sub
classes of foaf:Agent. As well as basic information like the organisation's name,
the ORG ontology is recommended as it has the following features:
The latter aspect matches the Responsible Agency class in Protected Sites Full that includes properties for recording the beginning and end of the agency's lifespan.
Namespace | http://www.w3.org/2015/03/inspire/lu# |
---|---|
Class | lu:ExistingLandUseObject |
Object Property | lu:hilucsLandUse |
In the same way that the INSPIRE Registry is used as a source of SKOS concepts
as values for the ps:siteDesignation property, the lu:hilucsLandUse
property can link a Spatial Object to one of the values from the code list at
http://inspire.ec.europa.eu/codelist/HILUCSValue/.
The domain of lu:hilucsLandUse is lu:ExistingLandUseObject and so
systems can, at least in theory, infer that any Spatial Object that has a lu:hilucsLandUse property
is also an instance of lu:ExistingLandUseObject.
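The inference mentioned above is the standard RDFS domain rule: if a property has a declared domain, any subject using that property is an instance of the domain class. A minimal sketch over triples held as Python tuples follows; the `ex:` identifiers are hypothetical and a real system would normally delegate this to an RDFS reasoner.

```python
RDF_TYPE = "rdf:type"

# Domain declaration taken from this document.
DOMAINS = {"lu:hilucsLandUse": "lu:ExistingLandUseObject"}


def infer_types(triples):
    """Apply the rdfs:domain rule and return only the newly inferred triples."""
    known = set(triples)
    inferred = set()
    for subject, predicate, _obj in known:
        if predicate in DOMAINS:
            inferred.add((subject, RDF_TYPE, DOMAINS[predicate]))
    return inferred - known


# Hypothetical data: ex:field42 and ex:someHilucsValue are placeholders.
data = [
    ("ex:field42", RDF_TYPE, "gsp:SpatialObject"),
    ("ex:field42", "lu:hilucsLandUse", "ex:someHilucsValue"),
]
print(infer_types(data))
```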
Namespace | http://www.w3.org/2015/03/inspire/au# |
---|---|
Class | au:AdministrativeUnit |
Object Properties | au:nationalLevel |
Datatype property | au:nationalCode |
Concept Schemes used | INSPIRE Registry, EEA Codelist for bio-geographical regions, Europe 2011 |
From a SmartOpenData perspective, the Administrative Units theme is very simple. The
au:AdministrativeUnit class itself is defined as a sub class of the gsp:SpatialObject
class so it inherits rdfs:label as the property for its name and the usual means of providing boundary
information (via gsp:Geometry). Administrative Units typically have a national
code associated with them and this is provided as a string value for the au:nationalCode
property, which is defined as a sub property of skos:notation.
The INSPIRE Registry provides a SKOS concept scheme for the Administrative Hierarchy Level
(http://inspire.ec.europa.eu/codelist/AdministrativeHierarchyLevel/)
with URIs for the 6 levels as
http://inspire.ec.europa.eu/codelist/AdministrativeHierarchyLevel/1stOrder/,
http://inspire.ec.europa.eu/codelist/AdministrativeHierarchyLevel/2ndOrder/ etc. These
URIs are the values for the au:nationalLevel property.
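A short sketch of how such a URI might be generated from a numeric level is given below. Only the first two URIs appear in this document; that levels 3 to 6 follow the same ordinal pattern (3rdOrder, 4thOrder and so on) is an assumption, as is the function name.

```python
LEVEL_BASE = "http://inspire.ec.europa.eu/codelist/AdministrativeHierarchyLevel/"


def hierarchy_level_uri(level: int) -> str:
    """Return the Registry URI for an administrative hierarchy level (1 to 6)."""
    if not 1 <= level <= 6:
        raise ValueError("level must be between 1 and 6")
    # Assumption: ordinal suffixes follow English usage for levels 3 to 6.
    suffix = {1: "st", 2: "nd", 3: "rd"}.get(level, "th")
    return f"{LEVEL_BASE}{level}{suffix}Order/"


print(hierarchy_level_uri(1))
print(hierarchy_level_uri(2))
```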
Finally, SmOD can make use of the Metadata Registry (MDR) provided by the
European Publications Office as a source of URIs as values for au:country. This URI set
provides the names of all countries in the world in all official languages of the EU and follows a predictable
pattern, based on a country's ISO 3166 three-character code:
http://publications.europa.eu/resource/authority/country/FIN,
http://publications.europa.eu/resource/authority/country/GBR etc.
The downside of using these URIs is that, for now, they are not resolvable. The Publications Office is known to be working on making them so but at the time of writing they are not following Linked Data principles – something the Publications Office is very aware of.
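Because the pattern is predictable, the URI for a country can be derived from its code alone. The helper below is a hypothetical sketch; it checks only the shape of the code, not whether the code is actually assigned in ISO 3166.

```python
COUNTRY_BASE = "http://publications.europa.eu/resource/authority/country/"


def country_uri(alpha3: str) -> str:
    """Return the MDR country URI for a three-letter ISO 3166 code."""
    code = alpha3.strip().upper()
    # Shape check only: three ASCII letters.
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a three-letter code: {alpha3!r}")
    return COUNTRY_BASE + code


print(country_uri("FIN"))
print(country_uri("gbr"))
```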
Namespace | http://www.w3.org/2015/03/inspire/br# |
---|---|
Class | br:Bio-geographicalRegion |
Object Properties | br:regionClassification |
Datatype property | au:nationalCode |
Concept Schemes used | INSPIRE Registry, MDR (countries) |
INSPIRE recognises 4 regional classification schemes within this theme:
The Natura 2000 And Emerald Bio-geographical Region Classification is the one of most interest for SmartOpenData. The European Environment Agency maintains this list and publishes it in a variety of formats.
Code | Name | Region | pre_2012 |
---|---|---|---|
alpine | Alpine Bio-geographical Region | Bio-geographical Region | ALP |
Anatolian | Anatolian Bio-geographical Region | Bio-geographical Region | ANA |
arctic | Arctic Bio-geographical Region | Bio-geographical Region | ARC |
atlantic | Atlantic Bio-geographical Region | Bio-geographical Region | ATL |
blackSea | Black Sea Bio-geographical Region | Bio-geographical Region | BLS |
boreal | Boreal Bio-geographical Region | Bio-geographical Region | BOR |
continental | Continental Bio-geographical Region | Bio-geographical Region | CON |
macaronesian | Macaronesian Bio-geographical Region | Bio-geographical Region | MAC |
marineAtlantic | Marine Atlantic Region | Marine Region | MATL |
marineBaltic | Marine Baltic Region | Marine Region | MBAL |
marineBlackSea | Marine Region Black Sea | Marine Region | MBLS |
marineMacaronesian | Marine Macaronesian Region | Marine Region | MMAC |
marineMediterranean | Marine Mediterranean Region | Marine Region | MMED |
Mediterranean | Mediterranean Bio-geographical Region | Bio-geographical Region | MED |
pannonian | Pannonian Bio-geographical Region | Bio-geographical Region | PAN |
steppic | Steppic Bio-geographical Region | Bio-geographical Region | STE |
An RDF vocabulary at http://rdfdata.eionet.europa.eu/eea/biogeographic-regions2011.rdf
provides URIs for each of the classifications in the form
http://rdfdata.eionet.europa.eu/eea/biogeographic-regions2011/{code}
and
these can be used as values for the br:regionClassification
property.
Since this is not published as a SKOS Concept scheme per se, the range of
br:regionClassification
is undefined. The property could therefore also be used
to link to Concept schemes for this or any of the other regional classification schemes.
In common with lu:hilucsLandUse, the domain of br:regionClassification is defined.
In this case the domain is br:Bio-geographicalRegion, which allows
systems to infer that a gsp:SpatialObject with the property is also
an instance of its subclass, br:Bio-geographicalRegion. This is not shown in the diagram
to aid readability.
The br:regionClassificationLevel property links directly to the
concept scheme in the INSPIRE Registry at http://inspire.ec.europa.eu/codelist/RegionClassificationLevelValue/
that provides URIs for the 4 possible values of International, Local, National and Regional in the form
http://inspire.ec.europa.eu/codelist/RegionClassificationLevelValue/{code}
where {code} is the term from the list in lower case.
Namespace | http://www.w3.org/2015/03/inspire/sd# |
---|---|
Class | sd:SpeciesDistributionUnit |
Object Properties | sd:eunisSpeciesCode |
Datatype properties | sd:eunomenID |
Concept Schemes used | INSPIRE Registry, EUNIS/EEA, EU-NOMEN |
The Species Distribution theme provides a framework to support detailed information about population densities, counting methodologies etc. For SmOD, and again, when 'thinking Linked Data,' it is sufficient to use a simpler model.
The sd:SpeciesDistributionUnit
class uses sd:hasSpecies
to link to a class
that represents any species of interest. This is equivalent to INSPIRE's Species Name Type.
Species can be identified in multiple ways.
The European Environment Agency maintains its European Nature Information System, EUNIS,
as a URI set for species and serves the data in HTML or RDF/XML using content negotiation. The URIs are of the form
http://eunis.eea.europa.eu/species/{species No}
so that, for example, the yellowhammer is identified by
http://eunis.eea.europa.eu/species/1023.
The data returned from the EUNIS system is comprehensive, providing the species' vernacular
name in the official languages of the EU and equivalent identifiers from many other schemes.
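Content negotiation means that the same species URI is requested with a different Accept header depending on the format wanted. The sketch below only constructs the request, using Python's standard urllib; it does not send it, and it makes no claim about how the EUNIS service currently responds. The function name is hypothetical.

```python
from urllib.request import Request


def eunis_request(species_no: int, rdf: bool = True) -> Request:
    """Build (but do not send) a request for a EUNIS species URI."""
    url = f"http://eunis.eea.europa.eu/species/{species_no}"
    # Ask for RDF/XML or HTML through the Accept header.
    accept = "application/rdf+xml" if rdf else "text/html"
    return Request(url, headers={"Accept": accept})


req = eunis_request(1023)  # the yellowhammer example from the text
print(req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval.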
SmOD defines the domain of both sd:eunisSpeciesCode and sd:occurenceCategory as
sd:Species. The range of sd:occurenceCategory is SKOS Concept and the INSPIRE
Registry provides the relevant concept scheme at http://inspire.ec.europa.eu/codelist/OccurrenceCategoryValue/
but, for sd:eunisSpeciesCode, the range is eunis:SpeciesSynonym, the type defined in the EUNIS data.
One of the other identifiers included in the EUNIS data is the EU-Nomen identifier which is
present in some of the data used in the SmOD pilots. This identifier can be included directly
in SmOD data using the sd:eunomenID property, which is defined as a subProperty of skos:notation.
The literal value, e.g. 97523, is typed as such. A further property, smod:eunomenPage,
links the species to its EU-Nomen Web page, e.g.
http://www.eu-nomen.eu/portal/taxon.php?GUID=urn:lsid:faunaeur.org:taxname:97523
This page, indeed EU-Nomen, is not Linked Data friendly since non-URI identifiers
are used and the associated information is only available as a Web page, not as RDF.
The EUNIS system uses the eunis:sameSynonymFaEu property to provide the
EU-Nomen species number and the property is defined as having a domain of eunis:SameSynonym
and range of rdfs:Literal. This is semantically close enough to define sd:eunomenID as
a subproperty of this as well as of skos:notation. The two together provide the detailed
semantics we need – that the literal value is of a specific type and that that value
can also be matched against values of the eunis:sameSynonymFaEu property.
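The effect of the two sub-property declarations can be illustrated with the RDFS sub-property rule: every statement made with sd:eunomenID also holds for each of its super-properties. The following is a minimal sketch over Python tuples; `ex:species1` is a hypothetical identifier.

```python
# Sub-property declarations described in this document.
SUBPROPERTY_OF = {
    "sd:eunomenID": ("skos:notation", "eunis:sameSynonymFaEu"),
}


def entail(triples):
    """Return the input triples plus those implied by rdfs:subPropertyOf."""
    result = set(triples)
    for subject, predicate, obj in triples:
        for super_property in SUBPROPERTY_OF.get(predicate, ()):
            result.add((subject, super_property, obj))
    return result


# Hypothetical subject; 97523 is the example identifier from the text.
for triple in sorted(entail([("ex:species1", "sd:eunomenID", "97523")])):
    print(triple)
```

A query for eunis:sameSynonymFaEu values would therefore also match data published with sd:eunomenID, provided the consuming system applies RDFS entailment.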
As with any class, properties like rdfs:label
may be used to give the name of the species as a string literal if needed.
Namespace | http://www.w3.org/2015/03/inspire/lc# |
---|---|
Class | None |
Object Properties | lc:corineLandCover |
Concept Schemes used | Corine Land Cover |
The Corine Land Cover taxonomy is defined by EIONET and
published as a set of Web pages. At the time of writing, the SmOD partners understand
that plans are in place to publish it as a SKOS Concept scheme but that
has not yet happened. Therefore, a scheme was created and published at
http://www.w3.org/2015/03/corine. The
definition text for each concept is taken from the EIONET pages and served in HTML,
RDF/XML and Turtle. The RDF serialisations also include labels and definitions in
Spanish and Slovak as well as English, the latter taken from SAŽP's website. SAŽP also supplied the RGB colours associated with each Corine Land Cover type and these are used in the HTML page.
An issue to highlight in this work is the choice of identifiers for each CLC type.
In available data these are usually three-digit numbers, sometimes with and sometimes
without separating dots (e.g. 111 or 1.1.1). It therefore proved much easier to use
these numbers in the URIs than to use the names to create URIs like http://www.w3.org/2015/03/corine#ContinuousUrbanFabric.
However, XML, and therefore RDF, requires that class names begin with either a letter or
an underscore, so each class identifier begins with 'clc', something that can easily be inserted when processing input data.
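Normalising the two input forms and inserting the prefix takes only a few lines. The sketch below assumes that concept URIs have the form http://www.w3.org/2015/03/corine#clc followed by the digits, which is inferred from the description above rather than quoted from the published scheme; the function name is hypothetical, and codes of one to three digits are accepted.

```python
import re

CORINE_NS = "http://www.w3.org/2015/03/corine#"


def clc_uri(code: str) -> str:
    """Turn a CLC code such as '111' or '1.1.1' into a concept URI."""
    digits = code.strip().replace(".", "")
    # Assumption: codes consist of one to three digits once the dots are removed.
    if not re.fullmatch(r"[1-9]\d{0,2}", digits):
        raise ValueError(f"unexpected CLC code: {code!r}")
    return f"{CORINE_NS}clc{digits}"


print(clc_uri("111"))
print(clc_uri("1.1.1"))
```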
It should be noted that Clemens Portele created a vocabulary for recording Corine Land Cover
datasets. This captures the full complexity of the original model that, again, goes
beyond what is required for SmOD. The equivalent property to lc:corineLandCover
in Portele's
work is lcv:class
defined thus:
lcv:class a owl:ObjectProperty ;
    rdfs:comment "The range is a type for which no RDF representation is known: LandCoverClassValue"@en ;
    rdfs:range owl:Class ;
    skos:definition "The assignment of a land cover class through a classification code identifier"@en ;
    skos:notation "class"^^xsd:NCName ;
    skos:prefLabel "class"@en ;
    skos:scopeNote "The identifier, eg 1, 1.1.2, ... (for CORINE LC classes) allow to access to the value and the definition or narrative description of the corresponding class."@en .
Rather than create the missing SKOS Concept scheme, Portele defines a general object
property (confusingly called 'class') that has a range of owl:Class. SmOD would
like to refer to this work but, to keep the data simple, the lc:corineLandCover property is
defined as a sub property of lcv:class and has a range of skos:Concept.
Namespace | http://www.w3.org/2015/03/inspire/ef# |
---|---|
Class | ef:EnvironmentalMonitoringFacility |
Object Property | ef:mediaMonitored |
Datatype Properties | ef:specialisedEMFType |
Concept Schemes used | INSPIRE Registry |
The EF vocabulary was created based on the INSPIRE Environmental Monitoring Facilities (EMF) theme (PDF). The scope includes the monitoring facilities and the observations linked to them. The latter are defined in a separate document; however, it is worth noting here that the RDF Data Cube vocabulary has been used extensively, as this provides a method for recording statistical hypercube data in RDF.
One of the specific use cases for using the EF theme is to combine Protected Sites from Natura2000 and various water measurements from Waterbase (Lakes, Rivers and Underground Waters). Several possible user queries were defined that needed to be addressed within the SmOD project, some of which involved further datasets. These queries dictated the requirements for the model as follows:
To satisfy these requirements, the following classes and properties were defined.
ef:EnvironmentalMonitoringFacility
is a spatial entity that collects or processes
data about real-world objects whose properties (physical, chemical, biological or other aspects of
environmental conditions) are observed or measured. Within the SmOD vocabularies, it is defined as
a sub-class of gsp:SpatialObject
.
ef:specialisedEMFType provides categorisation of EMF, such as platform,
site, station, sensor, etc. INSPIRE has a codelist for this;
however, it is empty and therefore adds little value. For the ARPA pilot, the need is to represent the EMF type for
humans, not to integrate or link it with similar datasets, and so ef:specialisedEMFType is a
datatype property that takes a literal value.
The code list for 'purpose', defined in SmOD as ef:purpose, is also
empty, so, again, SmOD defines
this as a datatype property.
The INSPIRE registry does, however, include a SKOS Concept Scheme
that offers values for the ef:mediaMonitored property
(air, biota, etc.). This object property therefore has a range of
skos:Concept and a domain of ef:EnvironmentalMonitoringFacility.
Namespace | http://www.w3.org/2015/03/inspire/cp# |
---|---|
Class | cp:CadastralParcel |
Datatype Property | cp:nationalCadastralReference |
This very simple vocabulary introduces the Cadastral Parcel class, defined as a sub class of gsp:SpatialObject.
Just one property is defined, cp:nationalCadastralReference, the value of which should be the thematic identifier at national level, generally the full national code of the basic property unit. It must ensure the link to the national cadastral register or equivalent.