Report Work on Semantic Markup
- 1 Semantic Markup
- 1.1 Review of basic annotation techniques
- 1.2 Semantic Markup for OGC standards
- 1.3 Examples
The last few years have seen an explosion in the number and variety of sensors deployed in all manner of environments around the globe, and this trend will continue as sensors become cheaper and more readily available. The outcome of this development is an avalanche of observational data that must be analyzed and explained in order to achieve an understanding of our environment. Currently, this data is too often stovepiped, with a strong tie between the sensor network, observation database, and end-user application. With the advent of projects such as the OGC Sensor Web Enablement (SWE) and the W3C Semantic Sensor Networks Incubator Group (SSN-XG), this information is now being set free and made available on the Web. With this new freedom, however, come significant challenges, such as the following:
- How do we discover, access and search sensor data on the Web?
- How do we integrate the sensor data when it comes from many heterogeneous sources?
- How do we make raw sensor data meaningful to Web applications and naive users?
SWE has taken important initial steps towards answering these questions. It includes the development of a set of XML-based languages and Web service interface specifications. The service interfaces, such as the Sensor Observation Service (SOS) [SOS 2008], provide a means to discover, access and search sensor data (to the extent that this is possible with XML-level syntactic interoperability and standardized tags). The languages, such as the Sensor Model Language (SensorML) [SENSORML 2007] and Observations and Measurements (O&M) [OM 1 2007], [OM 2 2007], provide a means to integrate data from heterogeneous sources in a standard format accessible to Web users. Such syntactic-level interoperability is a good start and provides a solid framework to begin exploring the issue of semantic-level interoperability. The latter issue falls under the charter of the W3C SSN-XG and is being explored through two separate but closely related projects: the development of an ontology for describing sensors and sensor data, and an annotation framework for adding semantic metadata to the SWE standards. This report describes the latter.
In this document, the semantic annotation of sensor data is being considered within the scope of the SWE specifications (SensorML, O&M, SOS, etc.). In addition to considerations of the documents to be encoded, several use-cases have been defined which showcase the value of semantically annotating sensor data. These use-cases include (1) data discovery and linking, (2) device discovery and selection, (3) provenance and diagnosis, and (4) device operation tasking and programming. For more detail regarding these use-cases, see Section 3.
Review of basic annotation techniques
XLink (XML Linking Language) is an XML markup language for creating hyperlinks in XML documents. XLink is a W3C Recommendation and outlines methods of describing links between resources in XML documents. Any element in an XML document can behave as a link. XLink supports simple links (like HTML) and extended links (for linking multiple resources together). In addition, with XLink, the links can be defined outside the linked files. XLink attributes can be added to SensorML and O&M documents to provide semantic annotations for the sensor data. XLink is already used in SWE documents; thus, no syntactic or structural changes are required. This explains the relative success of XLink-based approaches in earlier attempts to add semantic annotations to SWE documents. However, recognizing which XLink attributes correspond to semantic annotations and which correspond to permissible SWE usages could become difficult.
Two versions of the XLink specification are available, XLink 1.0 [XLINK 2001] and XLink 1.1 [XLINK11 2010]. The description below covers XLink 1.0 because this is the specification which is used in OGC standards like GML [GML 2007].
- xlink:type - Every element defining an XLink *must* contain a "type" attribute, which specifies what type of link it is - the value for this attribute may be any one of "simple", "extended", "locator", "arc", "resource", "title" or "none".
- xlink:href - The "href" attribute is used to specify the URL of a remote resource, and is mandatory for locator links. In addition to the URL of the remote resource, it may also contain an additional "fragment identifier", which drills down to a specific location within the target document.
- xlink:show - The "show" attribute is used to define the manner in which the endpoint of a link is presented to the user. The value of this attribute may be any one of "new" (display linked resource in a new window); "replace" (display linked resource in the current window, removing whatever is currently there); "embed" (display linked resource in a specific area of the current window); "other" (display as per other, application-dependent directives); or "none" (display method unspecified).
- xlink:actuate - The "actuate" attribute is used to specify when a link is traversed - it may take any of the values "onLoad" (display linked resource as soon as loading is complete); "onRequest" (display linked resource only when expressly directed to by the user, either via a click or other input); "other" and "none".
- xlink:label - The "label" attribute is used to identify a link for subsequent use in an arc.
- xlink:from and xlink:to - The "from" and "to" attributes are used to specify the starting and ending points for an arc respectively. Both these attributes use labels to identify the links involved.
- xlink:role and xlink:arcrole - The "role" and "arcrole" attributes reference a URL which contains information on the link's role or purpose.
- xlink:title - The "title" attribute, not to be confused with the title type of link, provides a human-readable descriptive title for a link.
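The attributes listed above can be read from any XML element with a namespace-aware parser. The following is a minimal sketch in Python, assuming a hypothetical `observedProperty` element and an `example.org` ontology URI that are not taken from any SWE document; it only collects the XLink attributes present on one element and does not implement link traversal.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the XLink specification.
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical element carrying a simple XLink; the element name and the
# example.org URI are illustrative assumptions.
SAMPLE = (
    '<observedProperty xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" '
    'xlink:href="http://example.org/ontology#air_temperature" '
    'xlink:title="Air temperature" '
    'xlink:show="none" xlink:actuate="onRequest"/>'
)


def xlink_attributes(element):
    """Return the XLink attributes of an element as {local-name: value}."""
    prefix = "{" + XLINK_NS + "}"
    return {
        name[len(prefix):]: value
        for name, value in element.attrib.items()
        if name.startswith(prefix)
    }


attrs = xlink_attributes(ET.fromstring(SAMPLE))
for name in sorted(attrs):
    print(f"xlink:{name} = {attrs[name]}")
```

ElementTree exposes namespaced attributes under the key `{namespace-uri}local-name`, which is why the function strips that prefix rather than matching on the `xlink:` prefix used in the serialized document.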
RDFa (Resource Description Framework in attributes) is a W3C Recommendation [RDFA SYNTAX 2008] that adds a set of attribute-level extensions to XHTML for embedding rich metadata within Web documents. Its mapping to the RDF data model allows RDF [RDF SYNTAX GRAMMAR 2004] triples to be embedded within XHTML documents and extracted by compliant user agents. RDFa attributes can be added to SensorML and O&M documents to provide semantic annotations for the sensor data. Approaches based on RDFa look promising at the level of SWE documents, since it would be easy to process the annotations independently of the rest of the document. Further work is required to check that the introduction of RDFa would not bring major changes for the implementers of the SWE standards, and to investigate how RDFa-enabled SWE services could be further integrated with other RDFa-based Web mashups.
- rdfa:about - A URI or CURIE [CURIE 2009] specifying the resource the metadata is about; in its absence it defaults to the current document.
- rdfa:rel and rdfa:rev - Specify a relationship or reverse-relationship with another resource.
- rdfa:href, rdfa:src, and rdfa:resource - Specify the partner resource.
- rdfa:property - Specifies a property for the content of an element.
- rdfa:content - Optional attribute that overrides the content of the element when using the property attribute.
- rdfa:datatype - Optional attribute that specifies the datatype of text specified for use with the property attribute.
- rdfa:typeof - Optional attribute that specifies the RDF type(s) of the subject (the resource that the metadata is about).
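To make the roles of these attributes concrete, the sketch below extracts a few triples from an XHTML fragment. It is a deliberately simplified illustration and not a conforming RDFa processor: it handles only `about`, `typeof`, `property` and `content`, leaves CURIEs unexpanded, and ignores `rel`, `rev`, `href`, `src`, `resource` and `datatype`. The fragment, the `ex:` prefix and the `example.org` URI are assumptions made for the example. Note that in XHTML+RDFa the attributes appear without a prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML+RDFa fragment; names and URIs are illustrative only.
SAMPLE = """
<div xmlns="http://www.w3.org/1999/xhtml"
     about="http://example.org/sensor/1" typeof="ex:Sensor">
  <span property="ex:label">Rain gauge</span>
  <span property="ex:height" content="1.5"
        datatype="xsd:decimal">one and a half metres</span>
</div>
""".strip()


def simple_triples(root):
    """Collect (subject, predicate, object) tuples from a single subject.

    Simplifications: the subject is taken from the root's about attribute,
    typeof is treated as a single value, and CURIEs are not expanded.
    """
    subject = root.get("about")
    triples = []
    if root.get("typeof"):
        triples.append((subject, "rdf:type", root.get("typeof")))
    for element in root.iter():
        predicate = element.get("property")
        if predicate is None:
            continue
        # The content attribute, when present, overrides the element text.
        value = element.get("content", (element.text or "").strip())
        triples.append((subject, predicate, value))
    return triples


triples = simple_triples(ET.fromstring(SAMPLE))
for triple in triples:
    print(triple)
```

The second `span` shows the purpose of the content attribute: the human-readable text stays in the document while the machine-readable value `1.5` is what ends up in the triple.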
Other Annotation Techniques
- GRDDL [GRDDL 2007] - A markup format for Gleaning Resource Descriptions from Dialects of Languages. It is a W3C Recommendation, and enables users to obtain RDF triples out of XML documents, including XHTML. It defines the syntax to include a reference to a lifting script in a source document - the lifting script can then be used to transform the document to RDF.
- Microdata - Allows nested groups of name-value pairs to be added to documents, in parallel with the existing content. A non-semantic alternative to RDFa.
- SAWSDL [SAWSDL 2007] - A set of extension attributes for the Web Services Description Language and XML Schema definition language that allows description of additional semantics of WSDL components. Allows the user to record the mapping of WSDL elements to concepts defined in a reference ontology and to specify the lifting scripts which can be applied to the output of a service to transform it into a RDF file using the reference ontology concepts.
- hRESTs [Kopecky et al. 2008] - A microformat to add additional meta-data to REST API descriptions in HTML and XHTML. Developers can directly embed meta-data from various models such as an ontology, taxonomy or a tag cloud into their API descriptions. The embedded meta-data can be used to improve search (for example, to perform faceted search for APIs) and data mediation (in conjunction with XML annotation), and to ease the integration of services when creating mashups.
- SA-REST [Sheth et al. 2007] and Micro-WSMO [Kopecky et al. 2009] - Two similar methods to semantically annotate REST services using the same microformat (hRESTs) and a different target ontology. This is similar to SAWSDL (including the possibility to include a reference to a lifting script) but applicable to an HTML-based description of a service.
Comparison of techniques
This report focuses on XLink and RDFa because they are the two primary techniques for adding semantic annotations. For additional information on how these approaches can be applied to the development of geospatial mashups, see [Lefort 2009].
The following tables demonstrate the mapping of XLink and RDFa attributes to RDF, respectively. Such mapping guides the translation of semantic annotations to RDF syntax.
|XLink attribute||Description||RDF mapping|
|xlink:href||Identifier of the resource which is the target of the association, given as a URI||rdf:about of range resource|
|xlink:role||Nature of the target resource, given as a URI||rdf:about of class of range resource|
|xlink:arcrole||Role or purpose of the target resource in relation to the present resource, given as a URI||rdf:about of object property linking domain element to range resource|
|xlink:title||Text describing the association or the target resource||rdfs:comment|
|RDFa attribute||Description||RDF mapping|
|rdfa:about||The identification of the resource (to state what the data is about)||rdf:about of domain resource|
|rdfa:typeof||RDF type(s) to associate with a resource||rdf:about of class of a resource|
|rdfa:href||Partner resource of a relationship ('resource object')||rdf:about of range resource|
|rdfa:property||Relationship between a subject and some literal text ('predicate')||rdf:about of datatype property|
|rdfa:rel||Relationship between two resources ('predicate')||rdf:about of object property|
|rdfa:rev||Reverse relationship between two resources ('predicate')||rdf:about of (inverse) object property|
|rdfa:src||Base resource of a relationship when the resource is embedded ('resource object')||rdf:about of domain resource|
|rdfa:resource||Partner resource of a relationship that is not intended to be 'clickable' ('object')||rdf:about of range resource|
|rdfa:datatype||Datatype of a property||XML type range of datatype property|
|rdfa:content||Machine-readable content ('plain literal object')||Value for datatype property|
Below is a comparison of the capabilities of XLink and RDFa to express types defined in RDF. In this respect, as the table demonstrates, RDFa is more expressive than XLink.
|RDF construct||XLink||RDFa|
|Domain Instance|| ||rdfa:about or rdfa:src|
|Inverse Object Property|| ||rdfa:rev|
|Range Instance Object Property||xlink:href||rdfa:href or rdfa:resource|
|Range Value|| ||rdfa:content or element content|
Semantic Markup for OGC standards
The Semantic Markup approach proposed below leverages two common features of the Sensor Web Enablement (SWE) languages, inherited from the design conventions originally defined for the Geography Markup Language [GML 2007]: their basic structure of nested resource-property pairs and the use of XLink as a linking mechanism to externally managed resources.
The overall compatibility of the GML design principles with RDF is now acknowledged ([Schade and Cox 2010]). The OGC community is now aware that "only minor changes to current SDI standards" are required to allow the augmentation of Spatial Data Infrastructure with Linked Data ([Schade et al. 2010]). Some of these changes are already under way, e.g. the replacement of the Uniform Resource Names (URNs), previously mandated for the identification of resources governed by OGC, by HTTP URIs ([Cox 2010]).
Current use of XLink in OGC standards
The OGC was an early adopter of XLink. Traditionally, however, the OGC has defined the use of XLink annotation as a composition by inclusion of remote resources. This definition regards annotation as a pointer to a remote resource such that the description is deferred. While this definition is useful for pointing to concepts in an ontology, it does not allow use of annotation for adding information to an existing resource description. To add further complexity to the situation, there are several (subtly) different usages of XLink in sub-communities of OGC. The GML specification authorizes four variants on the use of XLink:
- A reference to an object element in the same GML document may be encoded as:
- A reference to an object element in a remote XML document using the gml:id value of that object may be encoded as:
- A reference to an object element in a remote XML document (or GML object repository) using the gml:identifier property value of that object may be encoded as:
<myProperty xlink:href="http://my.big.org/test.xml#element(//gml:GeodeticCRS[./gml:identifier[ @codeSpace="urn:x-ogc:def:crs:EPSG:6.3:"]="4326"])"/>
- A reference to an object element with a uniform resource name may be encoded as follows (note that a URN resolver is required to resolve the URN and access the referenced object):
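The four variants can be told apart from the shape of the xlink:href value alone. The following sketch is a heuristic written for this report, not a rule taken from the GML specification. Only the third href is copied from the text above; the `#sensor1` fragment, the remote `gml:id` reference and the URN are illustrative assumptions.

```python
def classify_gml_href(href):
    """Guess which of the four GML XLink variants an href value follows.

    Heuristic only: it inspects the string shape and does not resolve
    or validate the reference.
    """
    if href.startswith("#"):
        return "same-document"          # bare fragment: object in this document
    if href.lower().startswith("urn:"):
        return "urn"                    # needs a URN resolver
    if "#element(" in href:
        return "remote-by-identifier"   # XPointer selecting on gml:identifier
    if "#" in href:
        return "remote-by-id"           # remote document plus gml:id fragment
    return "unclassified"


EXAMPLES = [
    "#sensor1",                                   # hypothetical gml:id
    "http://my.big.org/test.xml#sensor1",         # hypothetical gml:id
    'http://my.big.org/test.xml#element(//gml:GeodeticCRS'
    '[./gml:identifier[ @codeSpace="urn:x-ogc:def:crs:EPSG:6.3:"]="4326"])',
    "urn:x-ogc:def:crs:EPSG:6.3:4326",            # illustrative URN
]

for href in EXAMPLES:
    print(classify_gml_href(href), "<-", href)
```

The order of the checks matters: the XPointer form also contains a `#`, so it has to be tested before the generic remote-fragment case.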
These four uses of XLink correspond to the definition of XLink annotation as a composition by inclusion of remote resources. Within this framework, xlink:href is used to point to a target instance and xlink:role is used to point to its nature or type (e.g. to handle unknown features in this example). When XLink annotation is instead treated as semantic annotation, xlink:href points to an individual in an ontology and xlink:role points to a class in an ontology.
We recommend the use of XLink for providing semantic annotations to sensor data. While, as demonstrated above, RDFa is more expressive, XLink is already widely used within the sensor web community -- in particular the Open Geospatial Consortium. Therefore, XLink provides a low barrier to entry and requires no change to the existing Sensor Web Enablement specifications.
|xlink:href||link to ontology individual||rdf:about of range resource|
|xlink:role||link to ontology class||rdf:about of class of range resource|
|xlink:arcrole||link to ontology object property||rdf:about of object property linking domain element to range resource|
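The mapping in this table can be applied mechanically to an annotated property element. The sketch below is one possible reading of it, under stated assumptions: the subject URI stands for the enclosing (domain) resource and is hypothetical, the `swes` namespace URI is assumed because the examples in this report do not declare their namespaces, and the annotated element is adapted from the SOS example later in this report.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URI for the enclosing offering (the domain resource).
SUBJECT = "http://example.org/offering/1"

# Annotated property element; namespace declarations added as assumptions.
SAMPLE = """
<swes:procedureIdentifier
    xmlns:swes="http://www.opengis.net/swes/2.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xlink:role="http://purl.oclc.org/NET/ssnx/ssn#SensingDevice"
    xlink:href="http://purl.oclc.org/NET/ssnx/ssn-dev#rain_gauge_sth_esk_up_esk_rd_bridge"
    xlink:arcrole="http://purl.oclc.org/NET/ssnx/ssn#observedBy"/>
""".strip()


def xlink_to_triples(subject_uri, element):
    """Translate xlink:href/role/arcrole on one element into RDF triples."""
    href = element.get(XLINK + "href")        # ontology individual
    role = element.get(XLINK + "role")        # ontology class
    arcrole = element.get(XLINK + "arcrole")  # ontology object property
    triples = []
    if href and arcrole:
        # The object property links the domain resource to the range resource.
        triples.append((subject_uri, arcrole, href))
    if href and role:
        # The role gives the class of the range resource.
        triples.append((href, RDF_TYPE, role))
    return triples


triples = xlink_to_triples(SUBJECT, ET.fromstring(SAMPLE))
for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Elements that carry a role or arcrole but no href (such as the time elements in the SOS example) produce no triples here, since the table gives no range resource to attach them to; handling those cases would require a further convention.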
Currently, XLink should only be used to annotate property nodes within the SWE XML languages; the latest specification does not cover issues dealing with "nested" annotations, or semantic annotations embedded within inner nodes of the XML tree. The use of XLink for this type of semantic annotation is being considered as a future extension.
The recommendation issued here does not cover all the use cases identified in [Maue et al. 2009], especially the opportunity to annotate service descriptions discussed in [Lefort 2009] or the two-step mapping annotation mechanism investigated by the SAPIENCE project.
The Review of semantic annotation proposals and [group discussion on Semantic Markup] also list specific issues like the migration from Uniform Resource Names (URNs) to HTTP URIs representing semantic web resources.
The examples provided here are based on standards which fully apply the GML design principles: Observations and Measurements (O&M) and Sensor Observation Service (SOS).
Sensor Observation Service (SOS) GetCapabilities: The following example illustrates an observation offering with sensor rain_gauge_sth_esk_up_esk_rd_bridge that measures thickness_of_rainfall_amount:
<swes:offering xlink:role="http://purl.oclc.org/NET/ssnx/ssn#Observation"
               xlink:arcrole="http://www.loa-cnr.it/ontologies/DUL.owl#hasSetting">
  <sos:ObservationOffering>
    <swes:procedureIdentifier xlink:role="http://purl.oclc.org/NET/ssnx/ssn#SensingDevice"
                              xlink:href="http://purl.oclc.org/NET/ssnx/ssn-dev#rain_gauge_sth_esk_up_esk_rd_bridge"
                              xlink:arcrole="http://purl.oclc.org/NET/ssnx/ssn#observedBy">
      http://csiro.au/sw/rain_gauge_sth_esk_up_esk_rd_bridge
    </swes:procedureIdentifier>
    <swes:observableProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#thickness_of_rainfall_amount"
                             xlink:arcrole="http://purl.oclc.org/NET/ssnx/ssn#observedProperty"
                             xlink:role="http://purl.oclc.org/NET/ssnx/qu/dim#Distance"/>
    <sos:phenomenonTime xlink:role="http://www.w3.org/2006/time-entry#Interval"
                        xlink:arcrole="http://purl.oclc.org/NET/ssnx/ssn#observationTime">
      <gml:TimePeriod gml:id="phenomenonTime11">
        <gml:beginPosition xlink:role="http://www.w3.org/2006/time-entry#begins"
                           xlink:arcrole="http://www.w3.org/2001/XMLSchema#time">
          2001-01-11T16:22:25.00
        </gml:beginPosition>
        <gml:endPosition xlink:role="http://www.w3.org/2006/time-entry#ends"
                         xlink:arcrole="http://www.w3.org/2001/XMLSchema#time">
          2005-10-18T19:54:13.000Z
        </gml:endPosition>
      </gml:TimePeriod>
    </sos:phenomenonTime>
  </sos:ObservationOffering>
</swes:offering>
Sensor Discovery Example
The following provides a more comprehensive example of the annotation of a SOS GetCapabilities document useful for sensor discovery.
Description - Find all the sensors that meet certain criteria. While all (or most) criteria necessary for sensor discovery can be found in a SensorML document, parsing through hundreds or thousands of XML documents to find a sensor matching a set of criteria would be terribly inefficient. Therefore, like finding most resources on the Web, sensor discovery will likely occur through Web services rather than document searches. The Sensor Observation Service [SOS 2008] is the prominent service within the OGC Sensor Web Enablement for searching and accessing sensor data. The Sensor Observation Service, however, currently has no method for finding relevant sensors and encourages the use of catalogue services for this task. However, by semantically annotating the GetCapabilities document of an SOS, we can provide the ability to discover relevant sensors through several criteria of interest for this use-case (i.e., location, property, availability, etc.). The following table details the search criteria for the sensor discovery use-case and the resources that contain this information.
|Within geographic region (location)||yes||yes|
|Range of measurement||yes|
+ Note: The Application Domain criterion is not part of the original Sensor Discovery Use Case description, but is part of the SOS GetCapabilities and may be useful for discovery (e.g., for Water Resource Management).
The SOS DescribeSensor method returns a SensorML document with all the information about a particular sensor; however, the caller must already know the sensor (i.e., its sensor ID) before invoking the method. Therefore, DescribeSensor is not sufficient for finding relevant sensors based on particular attributes. The following table describes the parameters of DescribeSensor:
|Parameter||Description|
|outputFormat||The outputFormat attribute specifies the desired output format of the DescribeSensor operation.|
|sensorId||The sensorId parameter specifies the sensor for which the description is to be returned. This value must match the value advertised in the xlink:href attribute of a procedure element advertised in the SOS GetCapabilities response.|
|service||Service type identifier (i.e., SOS)|
|version||Specification version for operation|
The SOS GetCapabilities method, on the other hand, returns a service description that contains several attributes that could be useful for sensor discovery (i.e., location, property, availability, etc.).
|Element||Description||Discovery criterion|
|time||Time period for which observations can be obtained. This supports the advertisement of historical as well as real-time observations.||Availability|
|observedProperty||The observable/phenomenon that can be requested in this offering.||Measured phenomenon|
|featureOfInterest||Features or feature collections that represent the identifiable object(s) on which the sensor systems are making observations. In the case of an in-situ sensor this may be a station to which the sensor is attached representing the environment directly surrounding the sensor. For remote sensors this may be the area or volume that is being sensed, which is not co-located with the sensor. The feature types may be generic Sampling Features (see O&M) or may be specific to the application domain of interest to the SOS. However, features should include spatial information (such as the GML boundedBy) to allow the location to be harvested by OGC service registries.||Location|
|intendedApplication||The intended category of use for this offering such as homeland security or natural resource planning||Application Domain|
|procedure||A reference to one or more procedures, including sensor systems, instruments, simulators, etc., that supply observations in this offering. The DescribeSensor operation can be called to provide a SensorML or TML description for each system.|| |
The following example illustrates an SOS GetCapabilities document describing a service for a weather station. The sensor description is semantically annotated with model references to classes and individuals related to a sensor, observed properties, and a location.
<sos:Contents>
  <sos:ObservationOfferingList>
    <sos:ObservationOffering gml:id="urn:ogc:def:procedure:JPEO-CBD::WeatherStation_1">
      <gml:name>urn:ogc:def:procedure:JPEO-CBD::WeatherStation_1</gml:name>
      <gml:boundedBy>
        <gml:Envelope srsName="urn:ogc:def:crs:EPSG:4326">
          <gml:lowerCorner>18.556825 -72.297935</gml:lowerCorner>
          <gml:upperCorner>18.556825 -72.297935</gml:upperCorner>
        </gml:Envelope>
      </gml:boundedBy>
      <sos:time>
        <gml:TimeInstant xsi:type="gml:TimeInstantType">
          <gml:timePosition indeterminate="now"/>
        </gml:TimeInstant>
      </sos:time>
      <sos:procedure xlink:href="http://purl.oclc.org/NET/ssnx/ssn-dev#WeatherStation_1"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#snow-precipitation"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#temperature"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#windspeed"/>
      <sos:featureOfInterest xlink:href="http://sws.geonames.org/5248611/"/>
      <sos:responseFormat>text/xml;subtype="om/1.0.0"</sos:responseFormat>
      <sos:resultModel xmlns:ns="http://www.opengis.net/om/1.0">ns:Observation</sos:resultModel>
      <sos:responseMode>inline</sos:responseMode>
      <sos:responseMode>resultTemplate</sos:responseMode>
    </sos:ObservationOffering>
  </sos:ObservationOfferingList>
</sos:Contents>
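A client could use such an annotated document for discovery by matching the xlink:href values against the ontology URIs it is interested in. The following sketch illustrates the idea on an abridged copy of the offering above. The namespace declarations are added as assumptions (the fragment in this report does not declare them), and the function name `find_offerings` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for the prefixes used in the example.
NS = {
    "sos": "http://www.opengis.net/sos/1.0",
    "gml": "http://www.opengis.net/gml",
}
HREF = "{http://www.w3.org/1999/xlink}href"
CF = "http://purl.oclc.org/NET/ssnx/cf/cf-property#"

# Abridged copy of the offering, with namespace declarations added.
CAPABILITIES = """
<sos:Contents xmlns:sos="http://www.opengis.net/sos/1.0"
              xmlns:gml="http://www.opengis.net/gml"
              xmlns:xlink="http://www.w3.org/1999/xlink">
  <sos:ObservationOfferingList>
    <sos:ObservationOffering>
      <gml:name>urn:ogc:def:procedure:JPEO-CBD::WeatherStation_1</gml:name>
      <sos:procedure xlink:href="http://purl.oclc.org/NET/ssnx/ssn-dev#WeatherStation_1"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#temperature"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#windspeed"/>
      <sos:featureOfInterest xlink:href="http://sws.geonames.org/5248611/"/>
    </sos:ObservationOffering>
  </sos:ObservationOfferingList>
</sos:Contents>
""".strip()


def hrefs(offering, tag):
    """Return the xlink:href values of all child elements with this tag."""
    return [el.get(HREF) for el in offering.findall(tag, NS)]


def find_offerings(root, observed_property):
    """Return a summary of each offering that advertises the given property."""
    matches = []
    for offering in root.iter("{" + NS["sos"] + "}ObservationOffering"):
        if observed_property in hrefs(offering, "sos:observedProperty"):
            matches.append({
                "name": offering.findtext("gml:name", namespaces=NS),
                "procedures": hrefs(offering, "sos:procedure"),
                "features": hrefs(offering, "sos:featureOfInterest"),
            })
    return matches


root = ET.fromstring(CAPABILITIES)
matches = find_offerings(root, CF + "temperature")
print(matches)
```

This performs exact URI matching only. Matching on subclasses or related concepts would require loading the referenced ontology and reasoning over it, which is outside the scope of this sketch.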
An important benefit of semantically annotating sensor data is the additional expressivity supplied by an ontological representation. For example, if a weather ontology provides definitions for winter storms and their observable properties, then annotating sensor data with concepts (i.e., classes, individuals, and relations) from this ontology makes it possible to reason over such concepts. The GetCapabilities document above represents an offering with a weather station capable of observing three properties. Each property is semantically annotated with a link to its definition within an ontology. From observations generated by this weather station, a blizzard may be inferred. Such knowledge may then be added to the original GetCapabilities document. The GetCapabilities document below illustrates this example. In particular, notice the added sos:featureOfInterest tag which links to a blizzard individual. Within a standard Sensor Observation Service that has been extended with semantic annotations, we have now provided the ability to query for high-level weather events, along with low-level weather phenomena.
<sos:Contents>
  <sos:ObservationOfferingList>
    <sos:ObservationOffering gml:id="urn:ogc:def:procedure:JPEO-CBD::WeatherStation_1">
      <gml:name>urn:ogc:def:procedure:JPEO-CBD::WeatherStation_1</gml:name>
      <gml:boundedBy>
        <gml:Envelope srsName="urn:ogc:def:crs:EPSG:4326">
          <gml:lowerCorner>18.556825 -72.297935</gml:lowerCorner>
          <gml:upperCorner>18.556825 -72.297935</gml:upperCorner>
        </gml:Envelope>
      </gml:boundedBy>
      <sos:time>
        <gml:TimeInstant xsi:type="gml:TimeInstantType">
          <gml:timePosition indeterminate="now"/>
        </gml:TimeInstant>
      </sos:time>
      <sos:procedure xlink:href="http://purl.oclc.org/NET/ssnx/ssn-dev#WeatherStation_1"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#snow-precipitation"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#temperature"/>
      <sos:observedProperty xlink:href="http://purl.oclc.org/NET/ssnx/cf/cf-property#windspeed"/>
      <sos:featureOfInterest xlink:href="http://sws.geonames.org/5248611/"/>
      <!-- added annotation: inferred blizzard -->
      <sos:featureOfInterest xlink:href="http://purl.oclc.org/NET/ssnx/ssn-dev#Blizzard_1"/>
      <sos:responseFormat>text/xml;subtype="om/1.0.0"</sos:responseFormat>
      <sos:resultModel xmlns:ns="http://www.opengis.net/om/1.0">ns:Observation</sos:resultModel>
      <sos:responseMode>inline</sos:responseMode>
      <sos:responseMode>resultTemplate</sos:responseMode>
    </sos:ObservationOffering>
  </sos:ObservationOfferingList>
</sos:Contents>