W3C Incubator Report

W3C Geospatial Vocabulary

W3C Incubator Group Report 23 October 2007

This version:
http://www.w3.org/2005/Incubator/geo/XGR-geo-20071023/
Latest version:
http://www.w3.org/2005/Incubator/geo/XGR-geo/
Authors:
Joshua Lieberman
Raj Singh
Chris Goad

Abstract

This is a report of the W3C Geospatial Incubator Group (GeoXG) as specified in the Deliverables section of its charter.

In this report we define a basic ontology and OWL vocabulary for representation of geospatial properties for Web resources.

Specifically the report:

The report identifies further applications of this vocabulary which require additional discussion and specification. The intention is that it form input for a subsequent W3C geospatial activity.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of Final Incubator Group Reports is available. See also the W3C technical reports index at http://www.w3.org/TR/.

This document was developed by the W3C Geospatial Incubator Group. It represents the consensus view of the group, in particular those listed in the acknowledgements, on the use cases, requirements and general approach that should be taken in meeting the identified needs. The vocabulary as presented is a complete update to the 2003 geo vocabulary and is recommended to supersede it, but it retains backwards compatibility with the essential properties of the earlier vocabulary. Earlier informal drafts of this report are archived.

Publication of this document by W3C as part of the W3C Incubator Activity indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. Participation in Incubator Groups and publication of Incubator Group Reports at the W3C site are benefits of W3C Membership.


1 Introduction

The geospatial incubator group was chartered to begin addressing issues of location and geographical properties of resources for the Web of today and tomorrow, by taking a concrete step to update the W3C GEO vocabulary, laying the groundwork for a more comprehensive geospatial ontology, and formulating a proposal for a W3C Working Group to develop recommendations to further the Web representation of physical location and geography.

The Incubator's work has been greatly influenced by the work of the Open Geospatial Consortium (OGC), ISO/TC 211, and georss.org. While the rigor of the OGC and ISO/TC 211 General Feature Model is essential for clarity of spatial representations, the breadth and depth of geographic information handling developed by those organizations is considered to be beyond the needs of most Web use cases. The Incubator has followed the lead of GeoRSS in seeking to complement those efforts with a simpler baseline implementation of geospatial resource description for the Web.

A set of use cases demonstrates the aims in more detail. A set of high level requirements was derived from the use cases and then formalized for the work presented in this report. A model has been developed that encapsulates the issues discussed and discovered during the XG's work. Comments are also made on possible system architectures for geotag use, and a detailed glossary is provided. Throughout the Incubator Activity, decisions have been taken via consensus during regular telephone conferences, online collaboration, and face to face meetings.

The Incubator Group is now considering the approach of re-forming as a W3C geospatial interest group to further the activity of developing geospatial foundation vocabularies and considering the geospatial aspects of other W3C activities.

1.1 Examples

For continuity, these examples follow the general pattern of those from the Basic Geo Vocabulary.

A basic example showing one way to assign location to a person that combines Geo and FOAF vocabularies:

<rdf:RDF xmlns="http://xmlns.com/foaf/0.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:geo="http://www.georss.org/georss/">
  <Person>
    <name>Josh Lieberman</name>
    <geo:point>42.34 -71.21</geo:point>
  </Person>
</rdf:RDF>

A similar example using Geo's GML syntax that increases the semantic meaning of the location using FOAF's based_near concept:

<rdf:RDF xmlns="http://xmlns.com/foaf/0.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:gml="http://www.opengis.net/gml/">
  <Person>
    <name>Josh Lieberman</name>
    <based_near>
      <gml:Point>
        <gml:pos>42.34 -71.21</gml:pos>
      </gml:Point>
    </based_near>
  </Person>
</rdf:RDF>

An example of geo-coding with RSS 1.0:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:geo="http://www.georss.org/georss/"
         xmlns="http://purl.org/rss/1.0/">

  ...

  <item rdf:about="http://example.com/geo">
    <title>A walk in the park</title>
    <link>http://example.com/geo</link>
    <description>Just an example</description>
    <geo:line>40.73158 -73.999559 40.732188 -73.999079 40.732688 -74.0234</geo:line>
  </item>

  ...

</rdf:RDF>

1.2 Detailed Requirements

Based on the use cases and the original high level requirements that were derived from them, a set of more detailed requirements was established. The group decided to adopt the GeoRSS feature model, allowing the description of rectangles, points, lines, and polygons as geometric representation properties of discerned geographic features. This model differs slightly from, and is substantially reduced relative to, the OpenGIS® Simple Features model, which is widely used in spatial databases. Simple Features' geometric representations consist of point and multipoint, curve (a.k.a. line) and multicurve, surface (polygon) and multisurface. Note that Simple Features does not specify a rectangle. A rectangle is a specialization of a surface, but for clarity and simplicity this group felt it worthwhile to follow the example of georss.org and define it separately. In all cases, however, the essential nature of the ISO General Feature Model is preserved, which separates the discernment of a feature object such as a city from the particular coordinate geometry property such as a point or a polygon by which it may be represented.

At this point, the Incubator has completed work on the geometric model described above, along with its instantiation in RDF and OWL. The group was not able to get as far with a model for spatial relationships. The most common spatial relationships in use are equals, disjoint, intersects, touches, crosses, within, contains, and overlaps. These can be tremendously useful in many of the group's use cases. For example, in use case 4 the researcher wants to find county-level recycling programs, which requires describing the fact that a recycling program is within a particular county. Similarly, a semantic model of a facility might need to describe that a printer is within a certain room, and that the room touches hallway A-14. While these core spatial relationships are well established within the geographic sciences, this group was not able to validate them as a necessary and sufficient representation in the Web context, nor develop specific semantic encodings in the initial time frame.
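To make the relationship names concrete, the following is a minimal sketch of three of them (within, disjoint, touches) for the simplest possible case: axis-aligned rectangles on a flat plane. The tuple layout `(south, west, north, east)`, the function names, and the room and hallway coordinates are all illustrative assumptions; the report itself does not define an encoding for spatial relationships, and full definitions for arbitrary geometries are considerably more involved.

```python
# A rectangle as (south, west, north, east); this layout is an assumption
# made for illustration, not an encoding defined by the report.
Box = tuple[float, float, float, float]


def within(inner: Box, outer: Box) -> bool:
    """True if `inner` lies entirely inside (or on the boundary of) `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def disjoint(a: Box, b: Box) -> bool:
    """True if the two rectangles share no point at all."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def touches(a: Box, b: Box) -> bool:
    """True if the boundaries meet but the interiors do not overlap."""
    if disjoint(a, b):
        return False
    # For non-disjoint rectangles both extents below are >= 0; the interiors
    # overlap only when both are strictly positive.
    lat_overlap = min(a[2], b[2]) - max(a[0], b[0])
    lon_overlap = min(a[3], b[3]) - max(a[1], b[1])
    return lat_overlap == 0 or lon_overlap == 0


# Hypothetical floor plan in arbitrary planar units.
room = (0.0, 0.0, 10.0, 10.0)
hallway_a14 = (0.0, 10.0, 10.0, 12.0)
printer = (2.0, 2.0, 3.0, 3.0)
```

With these values, `within(printer, room)` and `touches(room, hallway_a14)` both hold, mirroring the printer and hallway example in the text.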

In summary, the GeoXG has successfully developed a basic geography model that can update the W3C GEO vocabulary, as stated in its charter. The group identified a core set of spatial relationships as well as other significant geospatial ontology components or categories, but implementation of these in W3C is an item for future work.

1.3 Participants

The companies and organizations that participated in or supported the GeoXG are as follows:

  • Open Geospatial Consortium
  • SRI
  • USC ISI
  • Stanford University
  • Oracle Corporation
  • Traverse Technologies
  • Platial
  • High Earth Orbit

2 The GeoRSS Model

The requirements discussed above lead to the GeoRSS Feature Model shown in Figure 1. The model provides a general feature property which can be used to characterize any appropriate content as a geographic feature. Specific subproperties such as <where> associate the discerned feature with one of a limited number of geometry types which provide a numerical representation for analysis and visualization. Other subproperties describe additional commonly used feature attributes such as feature name and feature type.

Figure 1: GeoRSS Feature Model

While this model is consistent in essence with ISO standards, it supports a subtle difference in emphasis, providing a Web-like feature view or aspect to existing content rather than a database-like content-specialized subclass to an existing abstract feature. The latter is suitable and effective for many geographic information applications but is too constrained for working with Web resources in general.

Specific model objects shown in Figure 1 are described below.

<where>

This is the general geometry property for GeoRSS GML. The GeoRSS Simple properties combine this property with the corresponding geometry and coordinate property to produce single-tag geometry properties.

<point>

A point contains a single coordinate pair. The coordinate pair contains a latitude value and a longitude value in that order. The preferred serialization of this uses a space to separate the two values.
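As a minimal sketch of reading this serialization, the function below splits a point string on whitespace and returns latitude and longitude in that order. The function name and the range checks (which assume decimal degrees) are illustrative assumptions, not part of the GeoRSS definition.

```python
def parse_point(text: str) -> tuple[float, float]:
    """Parse a point string such as "42.34 -71.21" into (latitude, longitude)."""
    parts = text.split()
    if len(parts) != 2:
        raise ValueError(f"a point needs exactly 2 values, got {len(parts)}")
    lat, lon = float(parts[0]), float(parts[1])
    # Range checks assume decimal degrees; this is an assumption of the sketch.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return lat, lon
```

Applied to the value used in the examples of section 1.1, `parse_point("42.34 -71.21")` yields the latitude first and the longitude second.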

<line>

A line contains two or more coordinate pairs. Each pair contains a latitude value and a longitude value in that order. The preferred serialization of this uses a space to separate the two values. Pairs are separated from each other by a space.
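A line string can be read the same way, by grouping the flat list of values into latitude and longitude pairs. This is a sketch only; the function name is hypothetical.

```python
def parse_line(text: str) -> list[tuple[float, float]]:
    """Parse a line string into a list of (latitude, longitude) pairs."""
    values = [float(v) for v in text.split()]
    if len(values) % 2 != 0:
        raise ValueError("a line needs an even number of values")
    # Even positions are latitudes, odd positions are longitudes.
    pairs = list(zip(values[0::2], values[1::2]))
    if len(pairs) < 2:
        raise ValueError("a line needs at least two coordinate pairs")
    return pairs
```

For the RSS 1.0 example in section 1.1, this returns three pairs, the first being `(40.73158, -73.999559)`.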

<polygon>

A polygon contains at least four coordinate pairs. Each pair contains a latitude value and a longitude value in that order. The preferred serialization of this uses a space to separate the two values. Pairs are separated from each other by a space. The last coordinate pair must be identical to the first.
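The two polygon constraints (at least four pairs, and a closed ring) can be checked as in the sketch below. The function name and the sample coordinates are illustrative assumptions.

```python
def parse_polygon(text: str) -> list[tuple[float, float]]:
    """Parse a polygon string and check the constraints described above."""
    values = [float(v) for v in text.split()]
    if len(values) % 2 != 0:
        raise ValueError("a polygon needs an even number of values")
    pairs = list(zip(values[0::2], values[1::2]))
    if len(pairs) < 4:
        raise ValueError("a polygon needs at least four coordinate pairs")
    if pairs[0] != pairs[-1]:
        raise ValueError("the last coordinate pair must equal the first")
    return pairs


# Illustrative triangle: three distinct corners plus the repeated first pair.
triangle = parse_polygon("45.256 -110.45 46.46 -109.48 43.84 -109.86 45.256 -110.45")
```

The minimum of four pairs follows from the closure rule: the smallest polygon is a triangle, whose first corner appears again as the last pair.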

<box>

A box contains exactly two coordinate pairs. Each pair contains a latitude value and a longitude value in that order. The preferred serialization of this uses a space to separate the two values. Pairs are separated from each other by a space. The first coordinate pair (lower corner) must be a point further west and south of the second coordinate pair (upper corner). The box is always interpreted as not containing the 180 (or -180) degree longitude line other than on its boundary, and as not containing the North or South pole other than on its boundary. A box is generally used to roughly demarcate an area within which other data lie.
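The corner-ordering rule for a box can be checked as in the following sketch. The function name and sample values are illustrative, and the strict comparison reflects one reading of "further west and south".

```python
def parse_box(text: str) -> tuple[tuple[float, float], tuple[float, float]]:
    """Parse a box string into ((south, west), (north, east))."""
    values = [float(v) for v in text.split()]
    if len(values) != 4:
        raise ValueError("a box needs exactly two coordinate pairs")
    south, west, north, east = values
    # Because a box never crosses the 180 degree line, a plain numeric
    # comparison of longitudes is enough to test "further west".
    if not (south < north and west < east):
        raise ValueError("lower corner must be south and west of upper corner")
    return (south, west), (north, east)
```

A string whose corners are given in the opposite order is rejected rather than silently swapped, since swapping could hide a box that was meant to cross the 180 degree line.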

<featuretypetag>

GeoRSS geometry is meant to represent a real feature of the Earth's surface. The GeoRSS model allows for a single string containing a featuretypetag. No constraints are placed on this string. The intent is to allow a Feature Type folksonomy to emerge. The default is "location".

<featurename>

A GeoRSS geometry may represent a well-known feature with a well-known name. The GeoRSS model allows for a single string containing a feature name. 

<relationshiptag>

GeoRSS is a way of relating Web content to Earth features. The GeoRSS model allows for a single string containing a relationshiptag. No constraints are placed on this string. The intent is to allow a relationship folksonomy to emerge. The default relationship, "is-located-at", simply indicates that the subject of the content is located at the GeoRSS feature.

<elev>

Elev is meant to contain "common" GPS elevation readings, i.e. height in meters from the WGS84 geoid, which is a reading that should be easy to get from any GPS device.

<floor>

Floor is meant to contain the floor number of a building. In some countries the numbering is different than in other countries, but since we'll know the location of the building, it should be fairly unambiguous.

<radius>

Since a GeoRSS geometry is only one representation of a given feature, coordinate precision may not give a clear idea of the precision of representation. The radius property is a distance in meters expressing that precision, e.g. within 1000 meters of the given point rather than exactly on the point.
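One way a consumer might use a radius is to test whether some other location is consistent with a point and its stated precision. The sketch below does this with the haversine great-circle formula on a spherical Earth. The mean Earth radius constant, the function names, and the choice of haversine are all assumptions made for illustration; the report does not prescribe a distance calculation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center: tuple[float, float],
                  candidate: tuple[float, float],
                  radius_m: float) -> bool:
    """True if `candidate` lies within `radius_m` meters of `center`."""
    return haversine_m(*center, *candidate) <= radius_m
```

For a point at `(42.34, -71.21)` with a radius of 1000 meters, a location 0.005 degrees of latitude to the north (a little over 550 meters away) is consistent with the tag, while one 0.02 degrees to the north is not.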

3 Serializations and Encodings

As shown in Figure 1, GeoRSS feature properties may be applied to a range of content in a variety of Web resource contexts. This has led to development of alternative serializations. Specifically, GeoRSS Simple collapses the object structure of GeoRSS GML to support single-element feature properties but is exactly equivalent in meaning to the GeoRSS GML serialization. 
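The equivalence between the two serializations can be illustrated by expanding a GeoRSS Simple point element into the corresponding where, Point, and pos structure. This is a sketch using Python's standard `xml.etree.ElementTree`; the namespace URIs are copied from the examples in section 1.1 of this report, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in this report's examples (an assumption
# of the sketch; deployed feeds may declare them differently).
GEORSS_NS = "http://www.georss.org/georss/"
GML_NS = "http://www.opengis.net/gml/"


def simple_point_to_gml(point_el: ET.Element) -> ET.Element:
    """Expand a Simple <point> element into <where><Point><pos> form."""
    if point_el.tag != f"{{{GEORSS_NS}}}point":
        raise ValueError(f"not a GeoRSS Simple point: {point_el.tag}")
    where = ET.Element(f"{{{GEORSS_NS}}}where")
    gml_point = ET.SubElement(where, f"{{{GML_NS}}}Point")
    pos = ET.SubElement(gml_point, f"{{{GML_NS}}}pos")
    # The coordinate text carries over unchanged apart from whitespace cleanup.
    pos.text = " ".join((point_el.text or "").split())
    return where


simple = ET.fromstring(
    '<geo:point xmlns:geo="http://www.georss.org/georss/">42.34 -71.21</geo:point>'
)
expanded = simple_point_to_gml(simple)
```

The coordinate string is identical in both forms; only the element nesting differs, which is what "collapses the object structure" refers to.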

The primary encoding presented in this report is OWL RDF, but this encoding has been defined to support XML elements as close as possible to those defined by the GeoRSS XML Schema, which in turn has been developed for use with RSS and Atom.

3.1 GeoRSS XML

For an XML encoding of the GeoRSS model, an XML Schema definition is contained in two files. The first file is a strict and valid GeoRSS GML profile which includes only the elements of the larger GML schema required to support GeoRSS GML. This schema profile is a convenience only: by definition any XML document which validates against the profile schema will also validate against the full schema. Nevertheless it is useful for constraining what GML elements are compatible with the GeoRSS model. The latest version will be available from georss.org and OGC for this purpose.

The second file defines those GeoRSS XML elements belonging to georss itself which in turn derive from or include GML elements from the profile schema. The latest version of this schema will be available from georss.org.

It is important to note that XML Schema is not sufficiently expressive to fully constrain the expressions and usage of GeoRSS XML, particularly within RSS and Atom but also in other contexts. The full definition of GeoRSS XML therefore includes text explanations maintained at georss.org.

3.2 Geo OWL

Geo OWL provides an ontology which closely matches the GeoRSS feature model and which utilizes the existing GeoRSS vocabulary for geographic properties and classes. The practical consequence is that fragments of GeoRSS XML within RSS 1.0 or Atom which conform to the GeoRSS specification will also conform to the Geo OWL ontology (front-matter aside). Thus, the ontology provides a compatible extension of GeoRSS practice for use in more general RDF contexts.

The ontology consists of a root property _featureproperty which takes as its domain any OWL/RDF class that it makes sense (after ISO 19109) to cast as a geographic feature. The property _featureproperty has a series of subproperties. A particular subproperty is geo:where which takes as its range the abstract class _geometry.

Subclasses of _geometry include gml:Point, gml:LineString, gml:Polygon, and gml:Envelope after the corresponding GML objects. The properties of these classes are a subset of the corresponding properties defined in the GML model and schema. This represents GeoRSS GML.

Other subproperties of geo:where represent GeoRSS Simple and include geo:point, geo:line, geo:polygon, geo:circle, and geo:box. These properties each take a literal list of doubles as their range, but are equivalent in definition to (are a shorthand for) geo:where plus the corresponding GeoRSS GML classes and their properties.

For backwards compatibility, geo:lat and geo:long are retained as subproperties of geo:where, but are taken together as the equivalent of geo:where plus gml:Point plus gml:pos, or of geo:point.
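The "taken together" rule can be sketched as a small conversion from a pair of lat and long values to the single point string used in this report's examples. The dictionary representation, the prefixed key names, and the function name are illustrative assumptions rather than part of the ontology.

```python
def upgrade_lat_long(props: dict) -> dict:
    """Return a copy of `props` with a geo:lat/geo:long pair merged into geo:point."""
    has_lat, has_long = "geo:lat" in props, "geo:long" in props
    if has_lat != has_long:
        # One without the other does not identify a position.
        raise ValueError("geo:lat and geo:long are only meaningful together")
    result = {k: v for k, v in props.items() if k not in ("geo:lat", "geo:long")}
    if has_lat:
        lat, lon = float(props["geo:lat"]), float(props["geo:long"])
        # Latitude first, then longitude, separated by a single space.
        result["geo:point"] = f"{lat} {lon}"
    return result


person = upgrade_lat_long(
    {"foaf:name": "Josh Lieberman", "geo:lat": "42.34", "geo:long": "-71.21"}
)
```

Properties other than the two coordinates pass through untouched, so a description written against the 2003 vocabulary keeps the rest of its content.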

Another set of subproperties of _featureproperty further define the "featureness" of whatever class the geometry properties are applied to. They include geo:featurename, geo:featuretype, geo:relationship, geo:elev, geo:floor, and geo:radius. The nominal ranges of the first three properties are literal strings (for the latter three, doubles), but are envisioned to represent or evolve first into "folksonomies" and later into more formal ontology concepts.

The Geo OWL vocabulary is nominally classified as OWL Lite, but what this designation means for decidability in terms of spatial reasoning is at present uncertain.

4 Summary

The Geospatial Incubator Group began with a simple mission and a likely candidate vocabulary. 

The open questions raised throughout the XG process as reported in this document are collated and presented below in no particular order of priority.


5 Glossary

The following terms are used throughout this report. Definitions have been collected from W3C glossaries where possible and provided a priori where necessary.

Assertion
Any expression which is claimed to be true. [W3C definition source]

Category
A thematically-related sub-group of terms within a vocabulary.

Classification
A specialization of a description; one that is pre-defined.

Resource
Anything that might be identified by a URI. [W3C definition source]

Schema (pl., schemata)
A document that describes an XML or RDF vocabulary. Any document which describes, in a formal way, a language or parameters of a language. [W3C definition source]

Vocabulary
A collection of vocabulary terms, usually linked to a document that defines the precise meaning of the descriptors and the domain in which the vocabulary is expected to be used. When associated with a schema, attributes are expressed as URI references. [This definition is an amalgam of those provided in Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies 1.0 and OWL Web Ontology Language Guide.]

Vocabulary term
An attribute that can describe one or more resources using a defined set of values or data type. Attributes may be expressed as a URI reference. See also descriptor and expression.

Well-formed
Syntactically legal. [W3C definition source]

6 Links and References

Dublin Core
http://dublincore.org/
Georss.org
http://www.georss.org/
OWL Time
http://www.isi.edu/~pan/OWL-Time.html
vCard
IMC specification; W3C Note on encoding vCard in RDF/XML
RDF
http://www.w3.org/RDF
ATOM
http://www.atomenabled.org/developers/syndication/atom-format-spec.php
RSS 1.0
http://web.resource.org/rss/1.0/spec
RSS 2.0
http://cyber.law.harvard.edu/rss/rss.html
RDFa
http://www.w3.org/TR/xhtml-rdfa-primer/
GRDDL
Gleaning Resource Descriptions from Dialects of Languages
SPARQL
SPARQL Query Language for RDF

7 Acknowledgements

The editors acknowledge significant contributions from:


Appendix

Original Use Cases

The demand for flexible and powerful geospatial enablement of the Web is exemplified in the following use cases.

Use case 1: Find stuff nearby

Web publishers have tagged their HTML content with a variety of standard geographic properties, including absolute geometries, well-known placenames, street addresses, and geospatial domain addresses. Internet search engines have translated and indexed these geospatial properties according to location and content relationship. Web user Harold shares his location in a search request for available sports-related resources within 15 minutes travel time. An initial search for nearby transportation uncovers roads, trails, and a commuter rail line which define a travel time envelope. A second search finds a number of Web pages which refer to sports-related resources within the envelope. The resources include a sports bar within walking distance and the segment of a lake shore recreation area within driving distance. It does not include the travel blog of Maude, a former professional triathlete sitting at a cafe nearby, because the current blog entry is tagged by a geospatial domain name which can only be resolved to an absolute location by requests from an identified group of friends or emergency response organizations. Since the local time of the search is 9:15 pm and the lake park closes at 9 pm, the home page of the sports bar is listed first.

Use case 2: News of the world

Web news services provide their stories in the form of GeoRSS feeds. Sven at UNHCR is tasked with monitoring both new and known areas for refugee issues. He utilizes an aggregator service which plots on a world map the locations of public news items which also reference refugee issues. Sven's GeoRSS client also allows him to visualize private news feeds of current UNHCR activities and available relief resources. Sven is able to use several map visualization techniques to look at the combined distribution and nature of events referenced by the public and private news feeds. Clicking on a particular entry, he brings up that entry's source news story or internal report. Once he has identified a significant collection of events and commented on it, he saves a Web map context document (WMC) with GeoRSS annotations, specific Web Map Server requests, and general map tile references to his weblog. UN colleagues who subscribe to Sven's weblog feed receive a GeoRSS news item outlining his area of interest and follow it to bring up the news map he has constructed for them.

Use case 3: New knowledge from old geography

A new educational initiative has published to the Web in geo-enabled form the results of many years of scientific and cultural study related to Breechcloth National Monument. Joe, a Park Service volunteer, organizes virtual tours by publishing Web pages which reference those Web resources related to a particular theme along popular hiking trails. Mary, a park visitor, is able to assemble her own personal tours by drawing a path of interest on a visitor center kiosk and searching for resources of a particular time and theme of interest. Since the wireless connectivity in the Monument is not yet widespread, she downloads the tour into her GPS-equipped phone to take along. Her personal tour includes geoweblog entries and photos posted by visitors two years previously, at a time when heavy rains caused many unusual plants to bloom along her chosen (and now quite dusty) path. Another tour resource is a page describing the site of a rare archaeological find. Mary is able to view the photographs and drawings on her phone, but the public page is only tagged with a rather large bounding box to reduce the risk of a visitor finding and damaging the site itself. Park personnel and researchers have access to a separate page tagged with the actual GPS coordinates of the site.

Use case 4: Follow the geography

Alice is preparing a grant proposal to support a new recycling initiative in Nepotist County. She wants to research county-level recycling programs worldwide. Firing up her semantic search client, she initiates a SPARQL query which includes among others the concepts of "county", "spatialScaleOf", and "recycling". Referencing a geospatial ontology, the query agent infers further geospatial concepts such as county instances and the names of county equivalents such as "parish" within the state of Louisiana. Inferred queries are passed on to other query agents which resolve county locations and synonyms, as well as concepts related to "recycling" such as "waste disposal", "sanitation", and "reuse". Filter agents reason on "spatialScaleOf" to eliminate discovered knowledge which is too limited in scale. Semantic similarity analysis finally returns to Alice information about a recycling program only two counties over which is a good model for her proposal but has been sparsely documented as "regional resource recovery". The query agent also processes her personal context with the query and returns unexpected references to two foundations with new programs to fund combined recycling and clean government initiatives.

Use case summary

These use cases serve to illustrate that tagging Web pages with latitude-longitude coordinates is only a starting point to the geospatial representations, relationships, resources, and interfaces which will form the functional basis of the Local Web.

Requirements

The following requirements have been approved by the group.

  1. A consistent vocabulary is needed to identify geospatial aspects of a variety of Web resources.

  2. The geospatial vocabulary should be consistent with OGC and ISO standards and the General Feature Model, but simple enough for widespread application.

  3. The vocabulary should describe geospatial aspects of existing resources and not require their reformulation as geodata.

  4. The vocabulary should include both highly constrained single-term properties for ubiquitous use and more configurable terms from GML for more specialized uses.

  5. The vocabulary should be serialized both in XML and as an equivalent OWL ontology for greatest flexibility.

  6. The vocabulary should be backwards compatible with the 2003 geo vocabulary (i.e. include its most commonly used terms).

  7. The vocabulary should be complete in itself but include points of extension for future formalization of additional spatial concepts and knowledge.