Spatial Data on the Web Best Practices

Abstract

This document advises on best practices related to the publication of spatial data on the Web, that is, the use of Web technologies as they may be applied to location. The best practices presented here are intended for practitioners, including Web developers and geospatial experts, and are compiled based on evidence of real-world application. These best practices suggest a significant change of emphasis from traditional Spatial Data Infrastructures by adopting an approach based on general Web standards. As location is often the common factor across multiple datasets, spatial data is an especially useful addition to the Web of data.

RDF namespaces used in the document
Prefix	Namespace IRI	Source
`admingeo`	http://data.ordnancesurvey.co.uk/ontology/admingeo/	Ordnance Survey's Administrative geography and civil voting area ontology
`adms`	http://www.w3.org/ns/adms#	Asset Description Metadata Schema (ADMS)
`bag`	http://bag.basisregistraties.overheid.nl/def/bag#	Dutch Government Base Registry Adressen en Gebouwen (BAG)
`dcat`	http://www.w3.org/ns/dcat#	Data Catalog Vocabulary (DCAT)
`dcterms`	http://purl.org/dc/terms/	Dublin Core Metadata Initiative (DCMI) Metadata Terms
`dqv`	http://www.w3.org/ns/dqv#	DWBP Data Quality Vocabulary (DQV)
`foaf`	http://xmlns.com/foaf/0.1/	FOAF Vocabulary Specification
`geom`	http://data.ordnancesurvey.co.uk/ontology/geometry/	Ordnance Survey's Geometry Ontology
`geonames`	http://www.geonames.org/ontology#	GeoNames Ontology
`georss`	http://www.georss.org/georss/	GeoRSS :: Geographically Encoded Objects for RSS feeds, Geo OWL encoding
`geosparql`	http://www.opengis.net/ont/geosparql#	GeoSPARQL - A Geographic Query Language for RDF Data
`gml-ont`	http://www.opengis.net/ont/gml#	GeoSPARQL - A Geographic Query Language for RDF Data
`locn`	http://www.w3.org/ns/locn#	ISA Location Core Vocabulary
`osuk`	http://data.ordnancesurvey.co.uk/id/	Ordnance Survey Linked Data Platform
`ov`	http://open.vocab.org/terms/	Open.vocab.org
`owl`	http://www.w3.org/2002/07/owl#	Web Ontology Language (OWL)
`pdok`	http://data.pdok.nl/def/pdok#	PDOK Data Platform
`qudt`	http://qudt.org/schema/qudt#	Quantities, Units, Dimensions and Data Types Ontologies (QUDT)
`rdf`	http://www.w3.org/1999/02/22-rdf-syntax-ns#	Resource Description Framework (RDF)
`rdfs`	http://www.w3.org/2000/01/rdf-schema#	RDF Schema vocabulary (RDFS)
`schema`	http://schema.org/	Schema.org
`scotgov-stat`	http://statistics.gov.scot/id/statistical-geography/	STATISTICS.GOV.SCOT Geography Linked Data
`sf`	http://www.opengis.net/ont/sf#	GeoSPARQL - A Geographic Query Language for RDF Data
`skos`	http://www.w3.org/2004/02/skos/core#	Simple Knowledge Organization System (SKOS)
`ukgov-stat`	http://statistics.data.gov.uk/id/statistical-geography/	Office for National Statistics Geography Linked Data
`vcard`	http://www.w3.org/2006/vcard/ns#	vCard Ontology - for describing People and Organizations
`void`	http://rdfs.org/ns/void#	Describing Linked Datasets with the VoID Vocabulary
`w3cgeo`	http://www.w3.org/2003/01/geo/wgs84_pos#	Basic Geo (WGS 84 lat/long) Vocabulary

XML namespaces used in the document
Prefix	Namespace IRI	Source
`bagwfs`	http://bag.geonovum.nl	XML schema for the Dutch Government Base Registry Adressen en Gebouwen (BAG)
`gml`	http://www.opengis.net/gml/3.2	Geography Markup Language (GML) Encoding Standard
`sam`	http://www.opengis.net/sampling/2.0	Observations and Measurements - XML Implementation
`sams`	http://www.opengis.net/samplingSpatial/2.0	Observations and Measurements - XML Implementation
`wml2`	http://www.opengis.net/waterml/2.0	WaterML 2.0 Encoding Standard
`xlink`	http://www.w3.org/1999/xlink	XML Linking Language (XLink) Version 1.1

12. The Best Practices

12.1 Web principles for spatial data

Spatial data, like any other data, should be published on the Web. By this we mean more than providing spatial data file downloads or services; for data to be on the Web, the resources it describes need to be identified using HTTP URIs, be published in such a way that they are indexable by search engines, and be connected, or linked, to other resources. This makes the data easy to find and easy to access for non-specialist users: the spatial data becomes integrated within the wider Web of data.

12.1.1 Spatial data identifiers

As a first step in publishing your spatial data on the Web, you should assign a URI to each of your datasets (see [DWBP] Best Practice 9: Use persistent URIs as identifiers of datasets).

Note

Deciding whether your spatial data is a single dataset or not is somewhat arbitrary. To decide this, it is often useful to consider attributes such as the license under which the data will be made available, the refresh or publication schedules, the quality of the data and the governance regime applied in managing the data. Typically, all of these attributes should be consistent within a single dataset.

[VOCAB-DCAT] provides a useful definition of dataset that supports this approach: “A collection of data, published or curated by a single agent, and available for access or download in one or more formats.”

However, we need to look inside the datasets at the resources described within your data. If you want these resources to be visible within the Web’s information space, by which we mean that others can refer to or talk about those resources, then they must also be assigned URIs (see [DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets). These URIs are like 'Web-scale foreign keys' that enable information from different sources to be stitched together.

The primary topics of any spatial dataset are Spatial Things — anything from physical things like people, places and post boxes to abstractions such as administrative areas. Each Spatial Thing will be described by a set of attributes and usually at least one geometry. How your spatial data is structured will depend on the vocabulary or data model you use (see section 12.2.1 Spatial data encoding for further details on vocabulary choice). This will determine the types of entities that, along with the Spatial Things themselves, are important enough to be given identifiers so that statements can be made about them. Geometry objects are an example of an entity that is often assigned a unique identifier so that they can be referenced or reused.

Given the widespread use of the Hyper Text Transfer Protocol (HTTP) on the Web, we SHOULD use HTTP URIs to identify resources in spatial data.

This is a fundamentally different approach from that of typical data publication today, where the dataset is (often) globally identified but individual Spatial Things ("features" in SDI parlance) are assigned local identifiers which may or may not be persistent.

Note

We consider identifiers in the Web’s information space to be unaffected by the choice to serve HTTP content securely or not. For example, http://example.org/country/suriname and https://example.org/country/suriname both identify the same Spatial Thing - in this case the South American country of Suriname.
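When matching identifiers from different sources, this equivalence can be applied in code. The following is a minimal sketch, assuming that ports and trailing slashes do not need special handling; the function name `same_identifier` is hypothetical and not defined by any specification.

```python
from urllib.parse import urlsplit


def same_identifier(uri_a: str, uri_b: str) -> bool:
    """Compare two HTTP(S) URIs while ignoring the http/https difference."""
    a, b = urlsplit(uri_a), urlsplit(uri_b)
    web_schemes = ("http", "https")
    if a.scheme not in web_schemes or b.scheme not in web_schemes:
        # Not both Web URIs: fall back to plain string comparison.
        return uri_a == uri_b
    # Host names are case-insensitive; path, query and fragment are compared as-is.
    return (a.netloc.lower(), a.path, a.query, a.fragment) == (
        b.netloc.lower(), b.path, b.query, b.fragment)
```

With this sketch, the two Suriname URIs above compare as equal, while URIs with different paths do not.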

Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things

Use stable HTTP URIs to identify Spatial Things, re-using commonly used URIs where they exist and it is appropriate to do so.

Why

To publish spatial data on the Web, we need to stitch the Spatial Things and their corresponding entities into the Web’s information space; contributing to the Web of data. First: [WEBARCH] Good Practice: Identify with URIs states that "agents should provide URIs as identifiers for resources". Second: the 5 Star Data scheme states: "★★★★ use URIs to denote things, so that people can point at your stuff".

Resources identified with HTTP URIs can be specified as the target of links within the Web’s global information space, enabling information to be related, combined and referred to. This is the fundamental basis of 5★ Linked Data: "★★★★★ link your data to other data to provide context".

The HTTP URIs used to identify Spatial Things need to be stable or persistent so that relationships that link them to other resources don’t break.

Intended Outcome

Spatial Things become part of the Web’s global information space, enabling them to be linked with other Spatial Things and other resources, and for those links to be durable. In other words, spatial data becomes part of the Web of Data.

Possible Approach to Implementation

[DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets provides directly applicable guidance when identifying resources. It advises:

Seek and reuse existing URIs, ensuring that the URIs are persistent and they are published by a trusted group or organization; or
Create your own persistent URIs.

However, we need to look a little more closely at how and where to apply that guidance.

The Web of data is made up of subjects and objects; the things we talk about and the things we refer to. For example, we could say that Anne Frank's House (the subject) is within the Municipality of Amsterdam (the object). In RDF [RDF11-PRIMER], this looks like:

<https://g.co/kg/m/02s5hd> schema:containedInPlace <http://sws.geonames.org/2759793/> .
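The prefixed name `schema:containedInPlace` is shorthand for a full IRI built from the `schema` namespace listed earlier. As an illustration only, the sketch below expands the prefix and writes the same statement as a single N-Triples line; the helper names `expand` and `triple` are hypothetical, and the sketch handles only statements whose subject and object are both IRIs.

```python
# Namespace taken from the "RDF namespaces used in the document" table.
PREFIXES = {
    "schema": "http://schema.org/",
}


def expand(term: str) -> str:
    """Expand a prefixed name such as 'schema:containedInPlace' to a full IRI."""
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one IRI-to-IRI statement as an N-Triples line."""
    return f"<{subject}> <{expand(predicate)}> <{obj}> ."


line = triple(
    "https://g.co/kg/m/02s5hd",           # subject: Anne Frank's House
    "schema:containedInPlace",            # predicate
    "http://sws.geonames.org/2759793/",   # object: Municipality of Amsterdam
)
print(line)
```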

When considering HTTP URIs for objects (e.g. the target of our hyperlinks) it makes sense to reuse existing identifiers. After all, you are trying to stitch your spatial data into the Web so that we can "link your data to other data" and achieve a ★★★★★ rating! Organizations such as DBpedia, GeoNames and government mapping and cadastral authorities (that publish national registers of addresses, buildings, etc.) are good sources of stable, authoritative URIs. The steps described for discovering existing vocabularies [LD-BP] can be readily adapted to find more. For more details about how you might link to these authoritative identifiers, see section 12.1.3 Linking data.

However, HTTP URIs for subjects (e.g. the resource that we want to make statements about) can be trickier. If you are working purely with data then you can reuse existing URIs minted by other authorities for your subject URIs. But publishing spatial data on the Web means that the URIs for each Spatial Thing should dereference to Web pages or data resources that provide useful information (see Best Practice 2: Make your spatial data indexable by search engines). An HTTP request will be directed to a host Web server, identified by the internet domain name (or IP address) in the requested URI. If you use a URI with an internet domain name where you have no control over how the Web server behaves, then there is no way for your statements to be included in the Web server's response.

To take control of how information about Spatial Things is presented, data publishers need to assign their subject Spatial Things HTTP URIs from an internet domain name where they have authority over how the Web server responds. Typically, this means minting new HTTP URIs. It's also worth considering that the use of a particular internet domain may reinforce the authority of the information served. For example, a URI for Anne Frank's House is: https://monumentenregister.cultureelerfgoed.nl/monuments?MonumentId=4296. The use of the internet domain registered to the Cultural Heritage Agency of the Netherlands gives the definition authenticity.

Note

The need to control what information is provided about a given Spatial Thing means that it is not uncommon for a Spatial Thing to be identified by multiple HTTP URIs. The equality between two URIs that refer to the same resource can be stated using a property such as owl:sameAs. Care must always be taken when using owl:sameAs to determine that the two URIs actually refer to the same resource, rather than two resources that are similar. Warning: don't say it if you're not sure it's true!

For more information about the types of properties that can be used to link between Spatial Things, and between Spatial Things and other resources, see section 12.1.3 Linking data.

When minting your own URIs, [DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets cites the advice from GS1's SmartSearch Implementation Guideline [GS1] which suggests that your URIs should include the type of resource that is being identified to help human readability. Also, given the need for the HTTP URIs for Spatial Things to be used throughout their lifetime (and perhaps beyond) you should give some thought to designing a URI that is persistent.

Example 4

This URI identifies the Amsterdam Central train station:

https://brt.basisregistraties.overheid.nl/top10nl/id/gebouw/102625209

This URI was minted using the recommendations in the Dutch URI strategy. Although the URI was minted by the Kadaster, the domain ‘basisregistraties.overheid.nl’ (which translates to ‘base registries . government . nl’) was chosen because it is expected to be a more persistent name than ‘kadaster.nl’. Even though the Kadaster is over 100 years old, organization names are not considered persistent in general, as organizations may merge or their names may change. ‘top10nl’ is the name of the dataset, and ‘gebouw’ means ‘building’, giving the human reader of this URI a clue of what is being identified. The last part of the URI is the building number from the dataset.
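The structure of this URI can be reproduced programmatically. The sketch below assumes a pattern of `https://{domain}/{dataset}/id/{type}/{local id}`, which is inferred from the example URI alone and is not a complete statement of the Dutch URI strategy; the function name `mint_uri` is hypothetical.

```python
from urllib.parse import quote


def mint_uri(domain: str, dataset: str, resource_type: str, local_id: str) -> str:
    """Build https://{domain}/{dataset}/id/{resource_type}/{local_id}."""
    segments = [dataset, "id", resource_type, local_id]
    # Percent-encode each path segment so unusual local identifiers stay valid.
    return f"https://{domain}/" + "/".join(quote(s, safe="") for s in segments)


uri = mint_uri("brt.basisregistraties.overheid.nl", "top10nl", "gebouw", "102625209")
print(uri)
```

Keeping the domain, dataset name and resource type as separate inputs makes it easier to hold them stable while the systems behind the URI change.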

[DWBP] Best Practice 9: Use persistent URIs as identifiers of datasets cites the European Commission's Study on Persistent URIs [PURI] as a good source from which to gain insights about designing persistent URIs.

When an HTTP URI is dereferenced, the server will respond with a sequence of bytes: by its nature, HTTP can only serve information resources such as Web pages or JSON [RFC7159] documents. Yet a Spatial Thing is actually a real or conceptual phenomenon - a lake is made from water not information! Using a single URI to refer to both the Spatial Thing and the page/document that describes the Spatial Thing introduces a URI collision. This can impose a cost in communication due to the effort required to resolve ambiguities. [URLs-in-data] has more to say on this subject, including recommending URI design patterns that enable differentiation between the Spatial Thing and the page/document that describes it.

However, in most cases using a single URI for both Spatial Thing and the page/document is simpler to implement and meets the expectations of most end-users. As stated in [WEBARCH] section 2.2.3 Indirect Identification, identifiers are commonly used in this way. There is no obligation to distinguish between the Spatial Thing and the page/document unless your application requires this.

Note

While there is a cost to this conflation, problems can be mitigated by avoiding making statements that confuse Spatial Thing and the page/document, such as “Uluru is available in KML format”; e.g. <http://sws.geonames.org/7645281/> dcterms:hasFormat <http://www.geonames.org/kml/-25.34434_131.03282_15.kml> .

This statement is clearly not true; an ancient monolith covering more than 3 km² cannot be provided in XML [XML11]!

HTTP URIs for Spatial Things should not include any indication of the data format used to encode the page/document as this may change as your systems evolve. That said, you may wish to provide a set of complementary resources that specify a particular format as part of your content negotiation strategy. For example, the URI http://sws.geonames.org/7645281/about.rdf dereferences to provide an RDF/XML encoding of the information about Uluru in the Northern Territory of Australia (http://sws.geonames.org/7645281/).

[DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets notes that URIs can be long. You may need to define identifiers that are locally unique within your spatial dataset and provide a mechanism to programmatically convert each local identifier to a URI. For example, the Metadata Vocabulary for Tabular Data [TABULAR-METADATA] achieves this using URI Templates as described in [RFC6570].
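As a rough illustration of converting a local identifier to a URI, the sketch below expands simple `{name}` expressions only (Level 1 of [RFC6570]); it is not a complete URI Template processor. The template string and the variable name `local_id` are assumptions chosen for this example.

```python
import re
from urllib.parse import quote


def expand_template(template: str, values: dict) -> str:
    """Expand simple {name} expressions (RFC 6570 Level 1 only)."""
    def substitute(match):
        # Percent-encode everything except unreserved characters.
        # Unlike RFC 6570, an undefined variable raises KeyError here.
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


# Hypothetical template for parcels identified locally as e.g. "aan.2528".
template = "http://data.example.org/aan/id/perceel/{local_id}"
print(expand_template(template, {"local_id": "aan.2528"}))
```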

It is also good practice to use a redirection service to hide complex and potentially changing service end-point URLs, such as for a Web Feature Service [WFS] behind well-designed URIs. This means that users don’t need to be aware of the complexities of the API or changes in endpoint URIs or API versions to request information about a particular Spatial Thing. For example, the URI http://data.example.org/aan/id/perceel/aan.2528 could be used as proxy for the WFS GetFeature request http://geodata.nationaalgeoregister.nl/aan/wfs?VERSION=2.0.0&SERVICE=WFS&REQUEST=GetFeature&RESOURCEID=aan.2528.
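The mapping performed by such a redirection service can be sketched as follows. This is a minimal illustration, assuming that the last path segment of the identifier URI is the WFS resource identifier; the function name `redirect_target` is hypothetical.

```python
from urllib.parse import urlencode, urlsplit

# Service end-point hidden behind the stable identifier URIs.
WFS_ENDPOINT = "http://geodata.nationaalgeoregister.nl/aan/wfs"


def redirect_target(identifier_uri: str) -> str:
    """Map a stable identifier URI to the WFS GetFeature request serving it."""
    # Assumption: the final path segment is the feature's resource identifier.
    local_id = urlsplit(identifier_uri).path.rstrip("/").rsplit("/", 1)[-1]
    query = urlencode({
        "VERSION": "2.0.0",
        "SERVICE": "WFS",
        "REQUEST": "GetFeature",
        "RESOURCEID": local_id,
    })
    return f"{WFS_ENDPOINT}?{query}"


print(redirect_target("http://data.example.org/aan/id/perceel/aan.2528"))
```

A server could return this value in the `Location` header of an HTTP redirect response. If the end-point or API version changes, only this mapping needs to be updated; the published identifiers stay the same.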

Finally, while it is simple to use a query-pattern URL to serve information about a resource identified with a URI from a third-party internet domain, e.g. http://example.org/museums?q=http://sws.geonames.org/6618987/, these URLs are unsuitable as persistent identifiers. More often than not, your intended users will dereference the "official" URI, e.g. http://sws.geonames.org/6618987/. That said, this kind of search operation does provide a useful mechanism to find particular Spatial Things. See Best Practice 12: Expose spatial data through 'convenience APIs' for further details.

How to Test

Check that, within the data, Spatial Things such as countries, regions and people are referred to by HTTP URIs or by short identifiers that can be converted to HTTP URIs. Ideally, dereferencing a URI should return information about the Spatial Thing; however, the URIs have value as globally scoped variables whether they dereference or not.
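Part of this check can be automated. The sketch below is one possible approach, assuming that short identifiers are converted using a template with an `{id}` placeholder; the function names and the template are hypothetical, and the sketch does not attempt to dereference the URIs.

```python
from urllib.parse import urlsplit


def is_http_uri(value: str) -> bool:
    """True if the value is an absolute http or https URI with a host."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_identifiers(identifiers, template=None):
    """Return identifiers that are neither HTTP URIs nor convertible to one."""
    failures = []
    for identifier in identifiers:
        candidate = identifier
        if not is_http_uri(candidate) and template is not None:
            # Try to convert a short, local identifier to an HTTP URI.
            candidate = template.format(id=identifier)
        if not is_http_uri(candidate):
            failures.append(identifier)
    return failures
```

An empty result means that every identifier passed the check.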

Evidence

Relevant requirements: R-Linkability, R-GeoReferencedData, R-IndependenceOnReferenceSystems.

Benefits

Discoverability
Reuse
Linkability

12.1.2 Indexable data

Search engines are the common starting point for people looking for content on the Web. However, as far as search engines are concerned, something is only 'on the Web' if it has an HTTP URI and when this URI is dereferenced, information is returned (usually in the form of a Web page).

Best Practice 2: Make your spatial data indexable by search engines

Search engines should be able to crawl spatial data on the Web and index Spatial Things for direct discovery by users.

Why

In SDIs, information about spatial datasets is published as authoritative metadata records and collated in Web-based catalogues. This approach causes several problems:

the catalogues are often designed to primarily support expert users - people may not even be aware of their existence;
once you have discovered a dataset that meets your needs and identified where it is available from, a second step is required to access the data itself - often requiring the use of unfamiliar protocols or complex API requests; and
the data itself is not indexed - discovery relies on the metadata records that are often sparsely populated or out of date.

Search engines are a widely understood, common starting point for people looking for content on the Web. By publishing spatial data in a way that enables their crawlers to index spatial datasets including each Spatial Thing, the fidelity of search results should improve. Users will be able to directly search for specific entities rather than having to look for a dataset and then parse through it; e.g. to search for "Anne Frank’s House" (https://g.co/kg/m/02s5hd) rather than looking for a dataset about "Cultural Heritage in Amsterdam" and hoping that it contains a reference to what you’re interested in.

Note

At present, spatial information is not widely exploited by search engines. However, by increasing the volume of spatial information presented to search engines, and the consistency with which it is provided, we expect search engines to begin offering spatial search functions. We already see evidence of this in the form of contextual search, such as prioritization of search results from nearby entities. In addition, search engines are beginning to offer more structured, custom searches that return only results that include certain [SCHEMA-ORG] types, like Dataset, Place or City.

Intended Outcome

Information about spatial datasets and things is indexed by search engines.

Users can find Spatial Things using common search engines.

Possible Approach to Implementation

In general, you need to:

publish an HTML Web-page for the spatial dataset and each Spatial Thing that it describes; and
make sure that those pages can be crawled.

The Web-page for the dataset is an entry-point for humans to browse and for the search engines to crawl your data. This landing page should provide descriptive metadata that helps users evaluate whether the dataset meets their needs (see Best Practice 13: Include spatial metadata in dataset metadata and [DWBP] Best Practice 2: Provide descriptive metadata), and may provide links to other service end-points, APIs or tools that will help a user work with the dataset. When metadata for datasets has already been created, e.g. to create a record in a metadata catalogue or to describe the data available from a service end-point, this information should be re-used - publishing it in a Web-friendly way that humans and Web-crawlers can consume. The landing page should be indexable by the search engines so that it can be discovered too!

To enable humans and Web-crawlers to find HTML pages for the Spatial Things, the "landing page" needs to include hyperlinks that can be followed. Where you have a larger collection of Spatial Things, you should support paging through the collection.

You may also consider using Sitemaps to direct the Web-crawler. For larger datasets, multiple sitemaps can be provided and grouped by a sitemap index file. If a dataset contains millions of Spatial Things (e.g. a building dataset with national coverage), generating and maintaining the sitemaps may require a custom implementation to keep the sitemaps synchronized with the set of Spatial Things.
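A custom sitemap generator could look roughly like the sketch below. It assumes that the list of page URLs is already available (for example, from a database query); the function names are hypothetical. The Sitemaps protocol limits a single sitemap file to 50,000 URLs, which is why larger collections are split and grouped by an index.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000  # limit set by the Sitemaps protocol


def build_sitemaps(page_urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split page URLs into one or more <urlset> documents (as strings)."""
    documents = []
    for start in range(0, len(page_urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for page_url in page_urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page_url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


def build_sitemap_index(sitemap_urls):
    """Group the published sitemap files in a single <sitemapindex> document."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")
```

Regenerating these files whenever Spatial Things are added or retired is one way to keep the sitemaps and the dataset synchronized.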

For very large datasets, paging through thousands of pages is not useful for a human either. Consider supporting filtering and/or organizing the Spatial Things into subsets, as described in section 12.3 Spatial data access.

A pre-condition for this best practice is Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things as persistent identifiers are essential to support reliable indexing and linking. Traditionally spatial datasets have not been maintained with stable identifiers for Spatial Things, but to share spatial data on the Web stable identifiers are a must. Sharing spatial data is more than "just" making the dataset available on the Web.

Each Web-page, and the hyperlinks used to relate the Spatial Things to the dataset landing page, can likely be generated programmatically from the data you hold about the Spatial Thing, either directly from the data or by using an API that makes the data available on the Web.

It is important to keep in mind that the HTML representations should not mainly be designed for the search engines, but they should present the data in a clear and understandable way to human users. The page about the Spatial Thing should be useful to a user and encourage others to link to the page when they share other information about the Spatial Thing. This typically will also improve the ranking of these pages in search results.

Example 7

The Property Search in the City of Nanaimo, Canada provides a landing page and one page per property. The landing page offers a search capability and the option to browse by street. This data is indexed; for example, a search for "2100 AARON WAY, NANAIMO, BC" in a popular search engine returns the Nanaimo data for this Spatial Thing as one of the first results.

The Bathing Water Quality Explorer for England provides a landing page and one page per site. Sites can be searched, selected from a list or in a map.

In both cases, the pages of the Spatial Things are generated from the underlying data at request time.

The property Web-pages in Nanaimo also use [MICRODATA] annotations using [SCHEMA-ORG], which is discussed below.

In addition to exposing the spatial data as linked HTML Web-pages, indexing by search engines can be further enhanced by incorporating a description of the Spatial Thing as structured markup (in particular [MICRODATA] or [JSON-LD] annotations using [SCHEMA-ORG]) as this enables the search engines to make more detailed assumptions about your resource. It is important to note that this is not only helpful to search engines, but also to other tools that want to understand more about the semantics of the resource, for example, its location.

In [SCHEMA-ORG], a spatial dataset is a Dataset and a Spatial Thing is in general a Place or an Event. For some types of Spatial Things, more specific sub-types exist, for example City or Mountain.

Location information about a Spatial Thing is typically provided using a geometry (GeoCoordinates or GeoShape) or a PostalAddress. [SCHEMA-ORG] coordinates are restricted to WGS 84 with longitude and latitude. Supported geometry types are points, line strings, polygons, boxes and circles.

By using [SCHEMA-ORG] annotations, search engines and others can connect location information with other information, e.g. about the nature of the Spatial Thing, opening hours, contact details, etc.

The use of [SCHEMA-ORG] for spatial data is in its early days and should be understood as an "emerging practice".

Example 8

This code-snippet illustrates a [JSON-LD] annotation using a [SCHEMA-ORG] Dataset for an address dataset in the Netherlands that may be embedded in the HTML of the Web-page. It includes a name, a description, the spatial coverage using a bounding box, the URL of the Web-page, and a link to another dataset containing this dataset. The same annotation could also be provided using [MICRODATA], but we use [JSON-LD] here as this presents the structured data in a more human-readable way.

<script type="application/ld+json">
{
  "@context" : {
    "@vocab" : "http://schema.org/"
  },
  "@type" : "Dataset",
  "@id" : "http://www.ldproxy.net/bag/inspireadressen/",
  "name" : "Adressen",
  "description" : "INSPIRE Adressen afkomstig uit de basisregistratie Adressen, beschikbaar voor heel Nederland",
  "url" : "http://www.ldproxy.net/bag/inspireadressen/",
  "isPartOf" : {
    "@type" : "Dataset",
    "url" : "http://www.ldproxy.net/bag/"
  },
  "keywords" : "Adressen",
  "spatialCoverage" : {
    "@type" : "Place",
    "geo" : {
      "@type" : "GeoShape",
      "box" : "47.975,3.053 53.504,7.24"
    }
  }
}
</script>

This code-snippet illustrates a [JSON-LD] annotation using a [SCHEMA-ORG] Place for the address of the "Anne Frank’s House" in that dataset. It includes the location, the URL of the Web-page, and the structured postal address information.

<script type="application/ld+json">
{
  "@context" : {
    "@vocab" : "http://schema.org/"
  },
  "@type" : "Place",
  "@id" : "http://www.ldproxy.net/bag/inspireadressen/inspireadressen.3329155",
  "url" : "http://www.ldproxy.net/bag/inspireadressen/inspireadressen.3329155",
  "geo" : {
    "@type" : "GeoCoordinates",
    "longitude" : "4.88399",
    "latitude" : "52.37520"
  },
  "name": "Anne Franks House",
  "description": "Museum house where Anne Frank & her family hid from the Nazis in a secret annex, during WWII.",
  "address" : {
    "@type" : "PostalAddress",
    "streetAddress" : "Prinsengracht 267",
    "addressLocality" : "Amsterdam",
    "postalCode" : "1016GV"
  }
}
</script>
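Annotations like these can be generated from the underlying data when a page is rendered. The following is a minimal sketch, assuming each Spatial Thing is available as a record with the hypothetical keys `uri`, `name`, `lon` and `lat`; it covers only a subset of the properties shown above.

```python
import json


def place_annotation(record: dict) -> str:
    """Render a schema.org Place description as an embeddable JSON-LD script element."""
    document = {
        "@context": {"@vocab": "http://schema.org/"},
        "@type": "Place",
        "@id": record["uri"],
        "url": record["uri"],
        "name": record["name"],
        "geo": {
            "@type": "GeoCoordinates",
            "longitude": record["lon"],
            "latitude": record["lat"],
        },
    }
    body = json.dumps(document, indent=2, ensure_ascii=False)
    # Escape "</" so that text in the data cannot end the script element early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


record = {
    "uri": "http://www.ldproxy.net/bag/inspireadressen/inspireadressen.3329155",
    "name": "Anne Frank's House",
    "lon": 4.88399,
    "lat": 52.3752,
}
snippet = place_annotation(record)
print(snippet)
```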

The Web-pages should also provide a mechanism to download data in the formats you decide to support. [DWBP] Best Practice 14: Provide data in multiple formats provides guidance.

Typically, multiple formats for a resource are supported using two mechanisms: HTTP content negotiation and by adding format-specific file extensions to the resource URI like ".json", ".xml" or ".ttl". Content negotiation is the standard mechanism of HTTP and the format-specific URIs enable the use of clickable links to the resource in a specific format.
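The two mechanisms can be combined in one selection step, as in the sketch below. It is deliberately simplified: wildcards such as `*/*` in the `Accept` header are not handled, and the mapping of extensions to media types and the HTML default are assumptions for this example.

```python
# Hypothetical mapping of format-specific extensions to media types.
FORMATS = {
    ".json": "application/json",
    ".xml": "application/xml",
    ".ttl": "text/turtle",
}
DEFAULT_TYPE = "text/html"


def choose_media_type(path: str, accept: str = "") -> str:
    """Pick a representation from a file extension or, failing that, the Accept header."""
    # 1. A format-specific extension in the URI takes precedence.
    for extension, media_type in FORMATS.items():
        if path.endswith(extension):
            return media_type
    # 2. Otherwise use content negotiation: highest q-value among supported types.
    supported = set(FORMATS.values()) | {DEFAULT_TYPE}
    candidates = []
    for item in accept.split(","):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if media_type in supported and quality > 0:
            candidates.append((quality, media_type))
    if candidates:
        # On a tie, max() keeps the type that was listed first.
        return max(candidates, key=lambda c: c[0])[1]
    return DEFAULT_TYPE
```

For example, `choose_media_type("/id/7645281.ttl")` selects Turtle regardless of the `Accept` header, which is what makes format-specific URIs usable as clickable links.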

Search engines may also index resource representations in other formats than HTML.

Note

In 2016, these topics were analyzed in a testbed organized by Geonovum in the Netherlands. More details can be found in reports from the testbed: Spatial Data on the Web using the current SDI and Crawlable geospatial data using the ecosystem of the Web and Linked Data.

The use of [SCHEMA-ORG] for describing spatial information has also been investigated in two studies: the first concerns the definition of mappings between [LOCN], [VCARD-RDF] and [SCHEMA-ORG], and the second the definition of mappings between [GeoDCAT-AP] and [SCHEMA-ORG].

The use of [SCHEMA-ORG] for describing spatial information is continually evolving; spatial data publishers should familiarize themselves with current practices. A useful Introduction to Structured Data is provided in Google's developer portal.

How to Test

Using a Web browser,

search for the landing page of your dataset, and
check that you can browse to human-readable HTML pages for each Spatial Thing that the dataset describes.

Monitor the search engines' search consoles to track progress in indexing your Web-pages and their structured data. If any errors are reported, try to fix them.

Evidence

Relevant requirements: R-BoundingBoxCentroid, R-Crawlability, R-Discoverability, R-Linkability, R-MachineToMachine.

Benefits

Discoverability
Reuse

12.1.3 Linking data

Links, in whatever machine-readable form, are important. In the wider Web, it is links that enable the discovery of Web pages: from user-agents following a hyperlink to find related information to search engines using links to prioritize and refine search results. This section is concerned with the creation and use of those links to support discovery of the Spatial Things described in spatial datasets.

For data to be on the Web, the resources it describes need to be connected, or linked, to other resources. The connectedness of data is one of the fundamentals of the Linked Data approach that these best practices build upon.

Just like any type of data, spatial data benefits massively from linking when publishing on the Web. The widespread use of links within data is regarded as one of the most significant departures from contemporary practices used within SDIs. That's why this topic is included in this Best Practice.

[DWBP] identifies Linkability as one of the benefits gained from implementing the Data on the Web best practices (see [DWBP] section 8.7 Data Identifiers Best Practice 9: Use persistent URIs as identifiers of datasets and Best Practice 10: Use persistent URIs as identifiers within datasets). However, no discussion is provided about how to create links that can make use of those persistent URIs. This section of the document extends [DWBP] by providing a best practice about creating links between the resources described inside spatial datasets.

Note

Discussion of links in these best practices is limited to simple links that relate exactly two resources: the source and target. Complex links that relate an arbitrary number of resources, such as described in [XLINK11] section 5.1 Extended Links, are out of scope.

Best Practice 3: Link resources together to create the Web of data

Bind Spatial Things into the Web of data using links to other resources, providing sufficient information for a user to determine whether the target resource specified in a link will be of use.

Why

The 5★ rating for Linked Open Data asserts that to achieve the fifth star you must "link your data to other data to provide context". The benefits for consumers and publishers of linking to other data are listed as:

You can discover more (related) data while consuming the data.
You can directly learn about the data schema.
You make your data discoverable.
You increase the value of your data.
Your own organization will gain the same benefits from the links as the [other] consumers.

There is always a cost to traversal of a link, even if it is just a few milliseconds delay and the need to parse a few hundred or thousand bytes returned in response to an HTTP request. In many cases, such as when dealing with large datasets and complex queries, the costs incurred from traversing a link may be significant in terms of time and data volumes. Before a user or software agent decides to traverse a link, they should be able to determine whether acquisition of the target resource, or data about the target resource, will support their application goals. For example, what format can one expect the response in, what type of resource is the target and how is that target related to the source resource?
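One place where such information can be provided is the HTTP `Link` header, whose parameters can state the relation type and the media type of the target. The sketch below shows how a client might use these hints before traversing a link. It is a simplified parser that handles only quoted parameter values, not a full implementation of the header syntax; the `type` parameter in the sample value is an assumption added for illustration.

```python
import re

# Matches <target> followed by any number of ; name="value" parameters.
LINK_PATTERN = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_PATTERN = re.compile(r'([\w-]+)="([^"]*)"')


def parse_links(header_value: str):
    """Return one dict per link, holding the target and its parameters."""
    links = []
    for target, raw_params in LINK_PATTERN.findall(header_value):
        links.append({"target": target, **dict(PARAM_PATTERN.findall(raw_params))})
    return links


def worth_following(link, wanted_rel, wanted_types):
    """Decide from the link's metadata alone whether traversal is useful."""
    rels = link.get("rel", "").split()
    media_type = link.get("type")
    return wanted_rel in rels and (media_type is None or media_type in wanted_types)


header = ('<http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2014>; '
          'rel="predecessor-version"; type="application/geo+json"')
links = parse_links(header)
print(worth_following(links[0], "predecessor-version", {"application/geo+json"}))
```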

Intended Outcome

Links can be identified and traversed by humans and software agents.

Sufficient information is provided to help humans and software agents determine whether traversal of a given link meets their goals.

Possible Approach to Implementation

The ground-rules for linking spatial data are the same as for any type of data.

Use formats that support Web linking (as defined in [WEBARCH] section 4.4 Hypertext)

Earlier in this document (section 10. Linked Data) we explained that linked data requires only that the formats used to publish data support Web linking. In other words, linking spatial data does not automatically mean the use of RDF [RDF11-PRIMER]; links can also be created, for example, using [GML], HTML or [JSON-LD]. The two key points from [WEBARCH] are:
- Good practice: Link identification — A [data format] specification SHOULD provide ways to identify links to other resources [...].
- Good practice: Web linking — A [data format] specification SHOULD allow Web-wide linking, not just internal document linking.
The examples used in this best practice illustrate some of the data formats and mechanisms that support Web linking.
Follow the principles for 4★ — Linked [WEB-DATA]
- Always use global identifiers when linking between documents, so that link identifiers can be taken out of context and shared globally.
  
  Note
  
  4★ [WEB-DATA] is predicated on the use of global identifiers for resources. As such, we consider Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things as a prerequisite for linking.
- Links should be typed (explicitly or implicitly), so that clients can decide which link to follow when they are traversing a Web of interlinked resources to reach application goals.
  Example 10: HTTP response Link header with IANA Link Relation types
```
HTTP/1.1 200 OK
Link: <http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2014>; rel="predecessor-version"
Content-type: application/geo+json
Connection: close

{...}
```
  This example, using HTTP Link headers (as defined in [RFC5988]), illustrates the use of IANA [LINK-RELATION-TYPES] to define the link type. According to the IANA registry, predecessor-version points to a resource containing the predecessor version in the version history (as defined in [RFC5829] "Link Relation Types for Simple Version Navigation between Web Resources").
  
  Note
  
  In simple links involving only two resources, the role, or type, of each resource is implicit and can be inferred from the link relation type. It can be useful to include other information to help users judge whether to follow a link, such as human-readable labels and hints about the target resource type. Of course, target resources and the links that refer to them are often maintained by different parties, so such hints should not be assumed to be authoritative; they may or may not turn out to be true. For example, [RFC5988] "Web Linking" defines several additional attributes, including: hreflang, which hints at the language or languages that the target resource is available in; type, which indicates the expected media-type; and title, which labels the link target so that it can be used as a human-readable identifier.
  
  Also note that [DWBP] Best Practice 19: Use content negotiation for serving data available in multiple formats recommends the use of content negotiation to help ensure that a user or software agent is provided with useful content when they traverse a link and dereference to the target resource. However, HTTP Request headers are limited to specifying media-type, character set, encoding (e.g. for compression) and language. There is no mechanism to request that data is provided according to a particular data model or 'profile', nor request data in a particular coordinate reference system. This gap in current practice is discussed in section 13.1 Requesting different representations of geometries.
- Make links as specific as possible. If the linked resource supports fragment identification, and the link logically should be to a fragment of the resource (and not just the resource as a whole), try to use fragment identifiers when possible.
  
  Note
  
  Being as specific as possible with links is important; e.g. refer to a particular Spatial Thing rather than the dataset in which that Spatial Thing is described. That said, we encourage publication of data about Spatial Things as independently resolvable resources (e.g. so that they can be accessed by search engines' Web crawlers, see Best Practice 2: Make your spatial data indexable by search engines), which means that fragment identifiers are usually not required.
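
A client can inspect typed links, such as the one in Example 10, before deciding whether to follow them. The following is a minimal Python sketch, assuming a simplified Link header syntax (no commas or semicolons inside quoted parameter values). The function name parse_link_header is illustrative and is not defined by [RFC5988].

```python
import re


def parse_link_header(value):
    """Split a Link header value into a list of dicts.

    Each dict holds the link target under 'target' plus one entry per
    link parameter (rel, type, title, hreflang, ...).
    Assumption: parameter values contain no ',' or ';' characters.
    """
    links = []
    # A link-value looks like: <URI>; param="value"; param=value
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)', value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for param in raw_params.split(';'):
            if '=' in param:
                name, _, val = param.partition('=')
                params[name.strip().lower()] = val.strip().strip('"')
        links.append({'target': target, **params})
    return links


header = ('<http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2014>; '
          'rel="predecessor-version"')
for link in parse_link_header(header):
    # The relation type tells the client how the target relates to the source.
    print(link['rel'], '->', link['target'])
```

A client would typically compare the rel value against the link relation types it understands and ignore the rest.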

How to Test

Check that hyperlinks are distinguishable within the data — a string-literal that happens to contain a URL is insufficient.

Check that hyperlinks use global identifiers, preferably HTTP URIs, to identify the link target.

Check that hyperlinks use typed relationships, and that the definition of the link relation type can be located in order to determine how to interpret the hyperlink.

Evidence

Relevant requirements: R-Linkability, R-MachineToMachine.

Benefits

Comprehension
Processability
Reuse
Interoperability

12.2 Spatial data

The best practices in this section take [DWBP] as a basis and further refine them to provide more specific guidance for spatial data.

12.2.1 Spatial data encoding

This section does not elaborate on formats for publishing spatial data on the Web. The formats are basically the same as for publishing any other data on the Web: XML [XML11], JSON [RFC7159], CSV [RFC4180], RDF [RDF11-PRIMER], etc. Refer to [DWBP] section 8.6 Data Formats for more information and best practices. Refer to Appendix A. Applicability of common formats to implementation of best practices for a list of spatial data formats for the Web.

That being said, it is important to publish your spatial data with clear semantics, i.e. to provide information about the contents of your data. The primary use case for this is you have information about a collection of Spatial Things and you want to publish precise information about their attributes and how they are inter-related. Another use case is the publication on the Web of a dataset that has a spatial component in a form that search engines will understand.

Depending on the format you use, the semantics may already be described in some form. For example, in GeoJSON [RFC7946] this description is present in the specification. When using JSON it is possible to add semantics using a [JSON-LD] @context object. For providing semantics to search engines, using [SCHEMA-ORG] is a good option, as explained in Best Practice 2: Make your spatial data indexable by search engines. In a linked data setting, the attributes of a Spatial Thing can be described using existing vocabularies, where each term has a published definition. If you can't find a suitable existing vocabulary term, you should create your own, and publish a clear definition for the new term, linking it to commonly used existing ones if possible, because this increases its usefulness. An overview and high-level comparison of RDF vocabularies / OWL ontologies for spatial data is provided in section A. Applicability of common formats to implementation of best practices. We do not recommend one vocabulary because this recommendation would not remain durable as vocabularies are released or amended.
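
As an illustration of adding semantics to plain JSON with a [JSON-LD] @context object, the Python sketch below maps the keys of a small record to vocabulary terms. The record, its approximate coordinate values and the helper name add_context are assumptions made for this example; the term IRIs come from [SCHEMA-ORG] and the W3C Basic Geo vocabulary.

```python
import json

# Hypothetical plain JSON record about a Spatial Thing.
# The coordinate values are approximate and purely illustrative.
record = {"name": "Amsterdam Centraal", "lat": 52.379, "long": 4.900}


def add_context(plain):
    """Return a copy of `plain` with a JSON-LD @context and a type."""
    context = {
        # Each plain key is mapped to a term with a published definition.
        "name": "http://schema.org/name",
        "lat": "http://www.w3.org/2003/01/geo/wgs84_pos#lat",
        "long": "http://www.w3.org/2003/01/geo/wgs84_pos#long",
    }
    return {"@context": context, "@type": "http://schema.org/Place", **plain}


print(json.dumps(add_context(record), indent=2))
```

The original keys and values are left untouched, so consumers that ignore the @context can still read the data as ordinary JSON.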

[DWBP] section 8.9 Data Vocabularies provides guidance on the topic of data modelling; determining which concepts and relationships should be used to describe your area of interest, something usually done by domain experts. Data publishers should not attempt to guess all the purposes for which someone might use or reference their data - ending up with a super-complex data model that tries to cover every possible use case. Instead, data publishers should try to help data consumers make informed decisions about the best way to use the data by providing good metadata.

In most cases, the effective use of information resources requires understanding thematic concepts in addition to the spatial ones; "spatial" is just a facet of the broader information space. For example, when the Dutch Fire Service responded to an incident at a day care center, they needed to evacuate the children. In this case, the 2nd closest alternative day care center was preferred because it was operated by the same organization as the one that was the subject of the incident, and they knew who all the children were.

This best practice document provides mechanisms for determining how places and locations are related - but determining the compatibility or validity of thematic data elements is beyond our scope; we're not attempting to solve the problem of different views on the same/similar resources.

That said, there is one aspect of thematic semantics that must be mentioned. The most important semantic statement you can make when publishing spatial data - or any data - is to specify the type of a resource. For Spatial Things, there are several types that define "spatialness" (for examples in a linked data context, see the vocabularies table in Appendix A of this document). But you should also consider non-spatial aspects when designating the type of a Spatial Thing. For example, should a fire incident occur at Amsterdam Central railway station, it might seem sensible for the Municipal Fire Department to designate a type such as Building or Station (the Dutch Government Base Registry, which identifies Amsterdam Central railway station as https://brt.basisregistraties.overheid.nl/top10nl/id/gebouw/102625209, designates both of these types). However, the Fire Department is concerned with a fire incident - not the railway station itself. The fire incident is a Spatial Thing (it has spatial extent) but it is not the station. For example, the fire may spread to adjacent buildings. The Fire Department might designate their Spatial Thing as having type FireIncident or similar. Advice on how to assign a persistent identifier to the fire incident is provided in Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things, and section 12.1.3 Linking data provides guidance on how one might relate the fire incident to other coincident Spatial Things such as Amsterdam Central railway station.

Note

Thematic semantics are out of scope for this best practice document. For associated best practices, please refer to [DWBP] section 8.2 Metadata, Best Practice 3: Provide structural metadata; and [DWBP] section 8.9 Data Vocabularies, Best Practice 15: Reuse vocabularies, preferably standardized ones and Best Practice 16: Choose the right formalization level.

Why

Spatial data is used by a range of user communities, each with their own purposes, knowledge and preferred tools. Data publishers should consider which communities and purposes they want to serve and make appropriate choices for the approach to encoding data. In general terms, data usefulness is increased when it can be used for more purposes. This might involve providing data in several different formats. (See [DWBP] Best Practice 14: Provide data in multiple formats.)

Intended Outcome

Spatial data can be used easily and reliably by the target users.

Possible Approach to Implementation

A high-level objective of these best practices is to highlight approaches that data publishers can take to maximize the ease of use of their spatial data via the Web and hence present data in a way that meets the needs of as wide a range of users and applications as possible.

One way of classifying the applications of spatial data is as follows:

Web pages for people to read about Spatial Things
Web mapping or visualization applications
Data integration - combining spatial data with other data
Spatial analytics - discover meaningful patterns in spatial data

Each of these has different needs: often it will be possible or desirable to support several of these application groups.

The main objective is to encode data in a way that recipients can easily decode and understand. To decide this, you need to consider which purpose(s) and which audience(s) you are aiming to serve and the characteristics of the data that you want to share. For example:

the volume of data
how many spatial dimensions it covers (points, lines, areas, 3D)
what kind of area it covers (one building, a town, a whole country)
how frequently it changes
the level of spatial precision that exists in the data and the precision needed by users

1. Web pages for people to read about Spatial Things

In Best Practice 1 we recommend use of HTTP URIs as a way of assigning identifiers to Spatial Things. The data publisher should offer the ability to look up ('dereference') such a URI to find out useful information about that Spatial Thing in human readable form (as well as machine readable formats - see the discussion below on data integration). Each Spatial Thing therefore gets its own Web page - in addition it might be useful to have Web pages about groups of Spatial Things, but the 'page per thing' approach enables fine-grained linking of information.

To promote discovery of such Web pages in search engines, each page should contain a clear text description of what it is, ideally in a way that distinguishes it from pages about other similar Spatial Things. Including metadata using the [SCHEMA-ORG] vocabulary, embedded as [MICRODATA], [HTML-RDFa] or as [JSON-LD] in the <head> section of the page, can provide additional information to search engines to support more precise indexing. See Best Practice 2: Make your spatial data indexable by search engines for a more detailed discussion.

It is also very useful in such Web pages to include links to descriptions of the Spatial Thing in other formats (typically machine-readable formats) as well as linking to related Spatial Things.

In most cases, a web page about a Spatial Thing should include information on its location. This can be done by providing spatial coordinates (see Best Practice 7 for guidance on how to do this).

A common way of specifying the location of a building is to use its postal address. Most spatial applications require an address to be turned into spatial coordinates, so that its location can be marked on a map, or compared with locations of other things, a process known as geocoding. Although a publisher could leave this process of geocoding to the data user, ideally the publisher should take responsibility for this as they are in a better position to check the accuracy of the results. Different ways of specifying addresses can sometimes lead to errors in the geocoding process.

Other approaches can be taken to specifying location. What3words is an example of a service that assigns an alternative kind of address to a location - in this case a sequence of three common words associated with a 3m by 3m square on the ground. It allows every location to be given such an address and what3words also provides a means to relate the address to latitude and longitude coordinates. As with conventional addresses, conversion to coordinates is necessary for many spatial data applications (e.g. to calculate the distance between points or whether a point is inside a region), but the process of conversion is more reliable and precise.
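
Once locations are available as latitude and longitude coordinates, calculations such as the distance between two points become straightforward. The sketch below uses the haversine formula, which assumes a spherical Earth and is therefore an approximation; the function name and the choice of mean radius are assumptions made for this illustration.

```python
import math

# Mean Earth radius in meters (spherical approximation).
EARTH_RADIUS_M = 6371008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# One degree of longitude along the equator.
print(round(haversine_m(0.0, 0.0, 0.0, 1.0)))
```

For applications that need higher accuracy over long distances, an ellipsoidal calculation from a dedicated geodesy library would be more appropriate.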

2. Web mapping or visualization applications

A common application of spatial data on the Web is delivering map data in a tiled form, suitable for display in zoomable 'slippy maps'. The OGC's Web Map Tile Service [WMTS] is an established standard for doing this. Other approaches in common use include MBTiles or 'Tile layers' in Google Maps APIs.
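
To show how a tiled map client decides which tile to request, the sketch below computes tile column and row numbers for a longitude/latitude position. It assumes the XYZ addressing scheme on a Web Mercator grid that many 'slippy map' services use; a [WMTS] service may define different tile matrix sets, so this is an illustration rather than a description of any particular standard.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return (x, y) tile indices for a position, XYZ / Web Mercator scheme.

    Assumption: 2**zoom tiles per axis, origin at the top-left corner.
    Only meaningful for latitudes within roughly +/-85.05 degrees.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that positions on the outer edges stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


print(lonlat_to_tile(0.0, 0.0, 1))
```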

Another frequent requirement is to draw markers or polygons on top of a Web map. A typical approach is for the browser to display a base map, then separately retrieve data about Spatial Things of interest, typically as GeoJSON [RFC7946], [GeoRSS] feeds, [GML] using the Simple Features profile [GML-SF] or [KML] files, then combine the two using appropriate JavaScript libraries. For applications involving boundary polygons of geographical areas, a common consideration is how to make this process efficient at different zoom levels. A high level of detail is appropriate when zoomed in, but many areas may be visible when zoomed out, and delivering boundaries of all of those at full detail can lead to very large amounts of data and hence poor performance, so simplified lower resolution versions of polygons may be required.
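
One widely used way to derive lower resolution versions of a boundary is the Ramer-Douglas-Peucker algorithm, sketched below. It is shown as one possible technique, not as a method prescribed by this document; the tolerance is expressed in the same units as the coordinates and would normally grow as the map is zoomed out.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. a closed ring whose ends coincide.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Return a simplified copy of a line, keeping its end points."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord between the ends.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Keep that point and simplify both halves recursively.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
print(simplify(line, 0.1))
```

Simplifying each polygon independently can open gaps between neighbouring areas that share a border, so topology-aware tools may be preferable for administrative boundaries.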

See this comparison of different spatial data formats to help guide the choice of which approach is best suited to your purpose.

3. Data integration - combining spatial data with other data

Many important applications of spatial data involve combining it with other kinds of data: for example, opening times of nearby supermarkets, or statistical information on the economy of a town. Often one or more Spatial Things are at the center of the data analysis process.

Other applications involve distinguishing or selecting Spatial Things according to their non-spatial characteristics: hospitals with an emergency department, or restaurants that serve Japanese food.

To enable such questions to be answered using data from different sources, it is important to describe Spatial Things using shared identifiers and vocabularies. This is described in [DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets and [DWBP] Best Practice 15: Reuse vocabularies, preferably standardized ones.

From a spatial data perspective, the question of identifiers is discussed in Best Practice 1. How to relate a Spatial Thing to its geometry is described in Best Practice 5: Provide geometries on the Web in a usable way.

A common approach to encoding data to enable data integration is Linked Data [LD-BP] and RDF [RDF11-PRIMER]. The spatial aspects of the data can either be included in the RDF data model, or the entity in question can link to an external Web resource containing the geometry in one of the standard spatial data formats. Although RDF is well-suited to important aspects of best practice, including use of URIs as identifiers and re-use of vocabularies, other data formats are also consistent with this approach. Most spatial data formats enable associating attributes of an entity alongside its geometry.

The publisher's choice of data model to represent the data will depend on what data is available and which audiences and purposes it seems most important to support. However, a reasonable general rule is that it is always useful to provide a label and a type for each entity in the data collection. (See [DWBP] Best Practice 16: Choose the right formalization level)

Common vocabularies for describing the address or location of a Spatial Thing include: [SCHEMA-ORG], [VCARD-RDF] and [LOCN]. See this comparison of different vocabularies for describing Spatial Things to help decide which is best for your application.

Publishing explicit relationships between the Spatial Thing of interest and other related Spatial Things helps support data integration applications: for example providing hierarchical relationships between different kinds of administrative area.

4. Spatial analytics - discover meaningful patterns in spatial data

Spatial analytics (or spatial analysis) is about deriving new insights by applying formal techniques to study Spatial Things using their topological or geometric properties. Combining spatial data with other data (see item 3 above) is a typical preparatory step before analyzing the one or more datasets using spatial operators, statistical algorithms, etc.

For spatial analytics on the Web, the data should be accessible via an API as described in section 12.3 Spatial data access and results should be shared using the best practices described in this document. Current spatial data infrastructures have some limitations with respect to sharing spatial data on the Web (as discussed in section 11. Why are traditional Spatial Data Infrastructures not enough?). Nonetheless this approach is a well-established and powerful way of distributing spatial data, based on open standards and suited to a community of expert users. It is thus one of the options a data publisher should consider when deciding how to encode their spatial data.

In addition to publishing the data that represents the results of the analysis, maps and other forms of visualization (see item 2 above) are typically used to communicate the results.

Example 14

Statistics Netherlands (CBS) publishes their Neighborhoods statistics data as a [WFS] service. The capabilities of that service can be requested in the following way:

https://geodata.nationaalgeoregister.nl/wijkenbuurten2016/wfs?request=GetCapabilities

For example, the following request returns the data for neighborhoods within the specified bounding box. The bounding box is specified using EPSG:28992 ("Amersfoort / RD New"), whose coordinates are in meters, and indicates an area of 100 square kilometers (10 km by 10 km).

https://geodata.nationaalgeoregister.nl/wijkenbuurten2016/wfs?request=GetFeature&typename=wijkenbuurten2016:cbs_wijken_2016&version=2.0.0&service=WFS&bbox=120000,480000,130000,490000
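
The request above can be assembled programmatically. The Python sketch below only builds the URL and does not contact the service; the helper names are assumptions made for this illustration, and the area helper relies on EPSG:28992 coordinates being expressed in meters.

```python
from urllib.parse import urlencode

BASE_URL = "https://geodata.nationaalgeoregister.nl/wijkenbuurten2016/wfs"


def get_feature_url(type_name, bbox):
    """Build a WFS 2.0.0 GetFeature URL for a bounding box.

    bbox is (min_x, min_y, max_x, max_y) in the coordinate reference
    system expected by the service.
    """
    params = {
        "request": "GetFeature",
        "typename": type_name,
        "version": "2.0.0",
        "service": "WFS",
        "bbox": ",".join(str(v) for v in bbox),
    }
    # Keep ',' and ':' readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",:")


def bbox_area_km2(bbox):
    """Area of a bounding box in square kilometers, for coordinates in meters."""
    min_x, min_y, max_x, max_y = bbox
    return (max_x - min_x) * (max_y - min_y) / 1_000_000


bbox = (120000, 480000, 130000, 490000)
print(get_feature_url("wijkenbuurten2016:cbs_wijken_2016", bbox))
print(bbox_area_km2(bbox), "square kilometers")
```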

Balancing quality and cost

The four main classes of application above have a wide range of requirements. To support such a wide range may require a lot of effort and cost on behalf of a data publisher. There are many aspects to the 'quality' of a spatial data publishing approach, but in general terms it relates to how well the data and approach to data delivery meet the needs of the target audience. By choosing to concentrate on only some kinds of application the publisher can keep cost down. Other factors to consider include performance (the speed with which data is delivered), timeliness of updates (which can be a significant consideration if the underlying data changes frequently), software complexity and maintenance.

In many cases a mixture of technologies can be used together to find a good compromise of quality or performance and cost. The strengths of various approaches can be applied to the part of the publishing 'spectrum' that suits them best. For example, if using a Linked Data approach, one option is to keep all data in a triple store; but hybrid approaches are also possible, for example where geometrical information is stored and served from flat files, or where non-geometrical data and metadata is stored in a triple store and used to generate Web pages and machine readable descriptions of Spatial Things, while geometrical data is indexed by software such as Lucene Spatial, PostGIS or Elasticsearch. Use of shared Web-accessible identifiers for Spatial Things can help support the interconnections between a range of diverse information systems.

[EO-QB] describes a 'spectrum of linkiness' for coverage data. At one end of the spectrum, you can assign each individual data point or pixel within a coverage (such as a satellite image) an individual identifier and web page. At the other, you can link just to an entire dataset and provide metadata for that. An intermediate approach involves dividing the data into tiles, each of which can have its own identifier and metadata. The balance of quality and cost in this example corresponds to the size of tiles that can be individually referenced, described and retrieved.

How to Test

Check that spatial data is encoded so that it can be understood and re-used reliably.

Consider the main target audience or audiences of a web page or service, and check if spatial information is provided in a way appropriate for that audience.

Evidence

Relevant requirements: R-DeterminableCRS, R-Discoverability, R-GeoreferencedData, R-Linkability, R-MachineToMachine, R-SpatialRelationships

Benefits

Comprehension
Processability
Reuse
Interoperability
Access

12.2.2 Geometries and coordinate reference systems

Location information is a common constituent of spatial data and can be an important 'hook' for finding information and for integrating different datasets. There are different ways of describing the location of Spatial Things. You can use and/or refer to the name of a well-known named place, provide position coordinates in a geometry or describe one location relative to another location. Providing multiple representations i.e. several geometries for one Spatial Thing can also be helpful, allowing data users to choose the one that fits their use case. This generally requires each geometry to be represented as a structured object that includes not only coordinates of the positions defining the geometry but also an identifier and other properties that describe its specific characteristics. It is especially important to choose the coordinate reference system with care and indicate it clearly for each geometry.

Best Practice 5: Provide geometries on the Web in a usable way

Geometry data should be expressed in a way that allows its publication and use on the Web.

Why

The geospatial, Linked Data, and Web communities use different geometry formats and tools, which reflect different requirements with respect to data complexity and manipulation.

When deciding how a geometry should be described, it is therefore necessary to consider the intended uses and the related user communities. This may also imply providing alternative geometry descriptions.

This best practice helps with choosing the right format for describing geometries, based on aspects like intended use(s), performance, and tool support. It also helps with deciding when encoding geometries as literals rather than as structured objects is a useful simplification.

Note

This best practice is closely related to Best Practice 6: Provide geometries at the right level of accuracy, precision, and size, Best Practice 7: Choose coordinate reference systems to suit your user's applications, and Best Practice 8: State how coordinate values are encoded, to which we refer the reader for more information.

Intended Outcome

The format chosen to express geometry data should:

Support the dimensionality of the geometry (from points - 0D - to volumes - 3D) - not all geometry formats support all dimensions.
Support the coordinate reference system you need.
Be supported by the software tools used within particular data user communities - the geospatial and Web communities use different tools often suited to different geometry formats.
Keep geometry definitions to a level of detail and size that is appropriate for the intended applications - Web applications do not typically require detailed geometries.

Ideally, to enable their widest re-use, geometries should be described with the geospatial, Linked Data and Web communities in mind. This may not always be feasible, but the objective should at least be to describe geometries (also) for Web consumption.

Possible Approach to Implementation

Steps to follow:

Identify the intended uses and applications. In particular, it is important to verify if geometries need to be used in one or more of the following scenarios:
- specific geospatial applications;
- linked data applications;
- Web consumption.
For each of the intended uses / applications, provide descriptions of geometries, with alternatives where needed, considering:
- The appropriate geometry dimensionality (0D - points, 1D - curves, 2D - surfaces, 3D - solids). See section 6. Spatial Things, Features and Geometry for more information.
- The appropriate coordinate reference system(s). See section 9. Coordinate Reference Systems (CRS) for more information.
- The appropriate geometry encoding(s) / representation(s) - also considering the software tools that you anticipate your user community to employ. See section A. Applicability of common formats to implementation of best practices for more information.
- The appropriate level of detail. See Best Practice 6: Provide geometries at the right level of accuracy, precision, and size for more information.
Where multiple representations are required, consider offering as many as you can - balancing the benefit of ease of use against the cost of the additional storage or additional processing if converting on-the-fly. See [DWBP] Best Practice 19: Use content negotiation for serving data available in multiple formats for more information.

Note

HTTP content negotiation only works for media-type, character set, encoding and language. Consequently, it is not possible to select one representation that conforms to a given "profile" (e.g. data model, complexity level, CRS) from several that all share the same media-type; e.g. asking for the GeoJSON [RFC7946] features with "simple" geometries (compacted polygons or just points) not the "complex" geometries; or asking for the representation that uses CRS84 not Amersfoort-RD.
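
To make the limitation concrete, the sketch below shows a simplified, hypothetical server-side selection of a representation from an Accept header. The function name and the simplified parsing (only q parameters are honoured) are assumptions made for this illustration. Note that the only input is the media-type preference; nothing in the header can express a desired profile or coordinate reference system.

```python
def choose_media_type(accept_header, available):
    """Pick the best available media type for an Accept header value.

    Returns None when nothing acceptable is available.
    Simplification: media-type parameters other than q are ignored.
    """
    prefs = []
    for item in accept_header.split(','):
        parts = [p.strip() for p in item.split(';')]
        media_type, q = parts[0].lower(), 1.0
        for p in parts[1:]:
            if p.startswith('q='):
                try:
                    q = float(p[2:])
                except ValueError:
                    q = 0.0
        prefs.append((q, media_type))
    # Highest q first; the sort is stable, so header order breaks ties.
    prefs.sort(key=lambda pref: -pref[0])
    for q, media_type in prefs:
        if q <= 0:
            continue
        for candidate in available:
            if media_type in (candidate, '*/*'):
                return candidate
            if media_type.endswith('/*') and candidate.startswith(media_type[:-1]):
                return candidate
    return None


offered = ["application/gml+xml", "application/geo+json"]
print(choose_media_type("application/gml+xml;q=0.5, application/geo+json", offered))
```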

It is important to note that the steps outlined above are interrelated. For instance, the dimensionality of a geometry determines the set of coordinate reference systems that can be used, as well as the geometry encodings / representations.

Another issue to be considered when choosing the geometry format is whether the coordinate axis order is unambiguous - i.e., whether the order of the position coordinates defining each geometry is, e.g., longitude/latitude or latitude/longitude. This specific topic is covered by Best Practice 8: State how coordinate values are encoded.

Note

Multiple formats exist for representing geometries (and some of them are listed in section A. Applicability of common formats to implementation of best practices). It is important to distinguish between the structured geometry object itself and the list of two or more position coordinates that places that geometry in space and is typically the most voluminous part of geometry data. Another of the issues to be considered when choosing the format(s) to be supported is where and when to use literals or structured object formats.

For geometry literals, several encodings are available, such as Well-Known Text (WKT) representations, Geohash and other geocoding representations. Literals may lend themselves to compact storage and fast processing, but have the disadvantage that properties of the geometry are not readily Web-accessible. An alternative is to use structured geometry objects. [GeoSPARQL], for example, balances accessibility and compactness by using a literal such as WKT to encode just the position list of a geometry but represents other properties in RDF.
There are also several suitable binary data formats (e.g. Google's protocol buffers for vector tiling); however, some binary formats do not (effectively) work on the Web as there are no software tools for working with those formats from within a typical Web application; to work with data in such formats, you must first download the data and then work with it locally.
There are widespread practices for representing geometric data as linked data, such as using [W3C-BASIC-GEO] w3cgeo:lat and w3cgeo:long that are used extensively for describing w3cgeo:Point objects.
Concrete geometry types are available, such as those defined in the OpenGIS [SIMPLE-FEATURES] Specification, namely 0-dimensional Point and MultiPoint; 1-dimensional curve LineString and MultiLineString; 2-dimensional surface Polygon and MultiPolygon; and the heterogeneous GeometryCollection.
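
As an illustration of geometry literals, the Python sketch below serializes a point and a polygon as WKT and wraps the result as a [GeoSPARQL] wktLiteral with an explicit coordinate reference system IRI. The helper names and coordinate values are assumptions made for this example; the literal layout (CRS IRI in angle brackets followed by the WKT text) follows the GeoSPARQL convention.

```python
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"


def point_wkt(x, y):
    """Serialize a 0-dimensional Point as WKT."""
    return f"POINT ({x} {y})"


def polygon_wkt(ring):
    """Serialize a single-ring Polygon as WKT, closing the ring if needed."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"


def geosparql_literal(wkt, crs_iri=CRS84):
    """Wrap WKT as a typed RDF literal in Turtle syntax."""
    return f'"<{crs_iri}> {wkt}"^^<{WKT_LITERAL}>'


# Illustrative coordinates in longitude/latitude order (CRS84).
print(geosparql_literal(point_wkt(4.9, 52.37)))
print(polygon_wkt([(0, 0), (1, 0), (1, 1)]))
```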

Currently, there are two reference geometry formats widely used in the geospatial and Web communities, respectively, [GML] and GeoJSON [RFC7946].

[GML] provides the ability to express any type of geometry, in any coordinate reference system, and up to 3 dimensions (from points to solids) but is typically serialized in XML [XML11].

GeoJSON [RFC7946] supports only one coordinate reference system (CRS84 - i.e., WGS 84 longitude/latitude), and geometries up to 2 dimensions (points, lines, surfaces) but is serialized in JSON [RFC7159], which is often easier for browser-based Web applications to process.

To facilitate the use of geometry data on the Web as well as in GIS, it is desirable that complex [GML]-encoded geometries be made available also in simplified form as GeoJSON [RFC7946], by applying any required coordinate reference system transformation, as well as simplifying and generalizing the original geometry as needed (e.g., by transforming a 3D geometry into a 2D one). Simplified geometries may of course also be published in [GML], for example by conforming to the GML Simple Feature profile [GML-SF]. (On this topic, see Best Practice 6: Provide geometries at the right level of accuracy, precision, and size).
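
The sketch below shows the last step of such a conversion: wrapping a position as a GeoJSON [RFC7946] Feature after dropping its height value. It assumes the coordinates have already been transformed to CRS84 (longitude, latitude); the transformation itself needs a dedicated library and is not shown. The function names and sample values are illustrative.

```python
import json


def drop_z(position):
    """Reduce a 3D position (x, y, z) to 2D by discarding the height."""
    return list(position[:2])


def to_geojson_feature(position, properties, feature_id=None):
    """Build a GeoJSON Feature with a Point geometry.

    Assumption: `position` is already in CRS84, longitude first.
    """
    feature = {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": drop_z(position)},
        "properties": properties,
    }
    if feature_id is not None:
        feature["id"] = feature_id
    return feature


# Illustrative values only.
feature = to_geojson_feature((4.9, 52.37, 12.0), {"name": "Example station"})
print(json.dumps(feature))
```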

Note

Another approach to publishing geometries on the Web is to embed them directly in Web pages. This is, for instance, the approach used by [SCHEMA-ORG], which defines several terms to specify them (see Best Practice 2: Make your spatial data indexable by search engines for more information).

Typically, this is used just for 0D-2D geometries (points, lines, surfaces). Detailed and complex geometries cannot be published with this methodology, so also in this case only a very simplified representation of the original geometry can be published - e.g., the centroid and/or 2D bounding box. (On this topic, see Best Practice 6: Provide geometries at the right level of accuracy, precision, and size).

Finally, RDF-based representations of geometries are used in the Linked Data community. This is achieved by using specific vocabularies, such as [W3C-BASIC-GEO] (only for points), [GeoRSS] (points, lines, boxes, circles, polygons) or [GeoSPARQL] (for any simple features geometries). For a high-level comparison of common spatial data vocabularies, see section A. Applicability of common formats to implementation of best practices.

These geometry representations are either stored with the related data, or are maintained separately, and possibly denoted with HTTP URIs (see Example 18: HTTP URIs for geometries).

RDF representations of geometries can support most geometry types and dimensions (up to at least 2 dimensions), with any level of complexity, in any coordinate reference system. On the other hand, many existing Semantic Web tools such as triple stores are currently not efficient enough to perform spatial queries which are complex and/or on complex geometries. It may therefore be preferable to maintain geometries separately, in software platforms designed for these specific tasks.

It is nonetheless still desirable to make simplified geometries available for Web consumption in GeoJSON [RFC7946] or embedded in Web pages.

The following [TURTLE] snippet shows the [GeoDCAT-AP] representation of the dataset in Example 8. Here the bounding box is provided in multiple literal encodings (WKT, [GML], GeoJSON [RFC7946]), by using property locn:geometry [LOCN].

Example 15: [GeoDCAT-AP] representation of dataset spatial coverage (bounding box) in multiple encodings

@prefix dcat:      <http://www.w3.org/ns/dcat#> .
@prefix dcterms:   <http://purl.org/dc/terms/> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix locn:      <http://www.w3.org/ns/locn#> .

<http://www.ldproxy.net/bag/inspireadressen/> a dcat:Dataset ;
  dcterms:title "Adressen"@nl ;
  dcterms:title "Addresses"@en ;
  dcterms:description "INSPIRE Adressen afkomstig uit de basisregistratie Adressen, beschikbaar voor heel Nederland"@nl ;
  dcterms:description "INSPIRE addresses derived from the Addresses base registry, available for the Netherlands"@en ;
  dcterms:isPartOf <http://www.ldproxy.net/bag/> ;
  dcat:theme <http://inspire.ec.europa.eu/theme/ad> ;
  dcterms:spatial [
    a dcterms:Location ;
    locn:geometry
# Bounding box in WKT
      "POLYGON((3.053 47.975,7.24 47.975,7.24 53.504,3.053 53.504,3.053 47.975))"^^geosparql:wktLiteral ,
# Bounding box in GML (multi-line literals require triple quotes in Turtle)
      """<gml:Envelope srsName=\"http://www.opengis.net/def/crs/OGC/1.3/CRS84\">
         <gml:lowerCorner>3.053 47.975</gml:lowerCorner>
         <gml:upperCorner>7.24  53.504</gml:upperCorner>
       </gml:Envelope>"""^^geosparql:gmlLiteral ,
# Bounding box in GeoJSON
      """{ \"type\":\"Polygon\",\"coordinates\":[[
           [3.053,47.975],[7.24,47.975],[7.24,53.504],[3.053,53.504],[3.053,47.975]
         ]] }"""^^<https://www.iana.org/assignments/media-types/application/geo+json>
  ] .

In the above example, the coordinate reference system used for the bounding box is CRS84 (equivalent to WGS 84, but with coordinate axis-order longitude/latitude), which is explicitly specified in the [GML] encoding via attribute @srsName, and by using the relevant HTTP URI from the OGC CRS registry. The coordinate reference system is not specified for the WKT encoding, since CRS84 is the default coordinate reference system for WKT in [GeoSPARQL], and therefore it can be omitted. The coordinate reference system is also not specified in the GeoJSON [RFC7946] encoding, since CRS84 is the only supported coordinate reference system in GeoJSON [RFC7946].
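As an illustration only, the three literal encodings of a CRS84 bounding box used above could be derived from the same four numbers along the following lines. This is a minimal sketch: the function names are hypothetical, and the GML envelope is emitted without a namespace declaration, as in Example 15.

```python
import json

CRS84_URI = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def bbox_ring(min_lon, min_lat, max_lon, max_lat):
    """Closed exterior ring of the box, in longitude/latitude order."""
    return [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),
    ]


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """WKT polygon; no CRS prefix is needed because CRS84 is the default."""
    ring = bbox_ring(min_lon, min_lat, max_lon, max_lat)
    return "POLYGON((" + ",".join(f"{x} {y}" for x, y in ring) + "))"


def bbox_to_geojson(min_lon, min_lat, max_lon, max_lat):
    """GeoJSON polygon; CRS84 is the only CRS GeoJSON supports."""
    ring = bbox_ring(min_lon, min_lat, max_lon, max_lat)
    return json.dumps({"type": "Polygon", "coordinates": [[list(p) for p in ring]]})


def bbox_to_gml_envelope(min_lon, min_lat, max_lon, max_lat):
    """GML envelope; the CRS is stated explicitly via srsName."""
    return (
        f'<gml:Envelope srsName="{CRS84_URI}">'
        f"<gml:lowerCorner>{min_lon} {min_lat}</gml:lowerCorner>"
        f"<gml:upperCorner>{max_lon} {max_lat}</gml:upperCorner>"
        "</gml:Envelope>"
    )


print(bbox_to_wkt(3.053, 47.975, 7.24, 53.504))
```

Only the GML variant carries the CRS identifier, which mirrors the difference between the encodings discussed above.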

Again with reference to Example 8, the following snippets show the [GML] and the RDF [RDF11-PRIMER] representations of the entry in the BAG Dutch register concerning the building where Anne Frank's house is located. For the corresponding GeoJSON [RFC7946] representation, see the relevant example in Best Practice 8: State how coordinate values are encoded.

Example 16: [GML] description of a building, with detailed geometry

The [GML] representation of Anne Frank's house building (taken from the BAG WFS endpoint):

<bagwfs:pand gml:id="pand.3323294">
  <bagwfs:identificatie>363100012169587</bagwfs:identificatie>
  <bagwfs:bouwjaar>1635</bagwfs:bouwjaar>
  <bagwfs:status>Pand in gebruik (niet ingemeten)</bagwfs:status>
  <bagwfs:gebruiksdoel>woonfunctie</bagwfs:gebruiksdoel>
  <bagwfs:oppervlakte_min>1</bagwfs:oppervlakte_min>
  <bagwfs:oppervlakte_max>21</bagwfs:oppervlakte_max>
  <bagwfs:aantal_verblijfsobjecten>20</bagwfs:aantal_verblijfsobjecten>
  <bagwfs:geometrie>
    <gml:MultiSurface srsDimension="2" axisLabels="east north"
                         srsName="urn:ogc:def:crs:EPSG::28992">
      <gml:surfaceMember>
        <gml:Polygon srsDimension="2">
          <gml:exterior>
            <gml:LinearRing>
              <gml:posList>
                120749.725 487589.422  120752.55  487594.375  120751.227 487595.129
                120732.539 487605.788  120723.505 487589.745  120721.387 487585.939
                120740.668 487575.07   120743.316 487573.589  120747.735 487581.337
                120751.564 487579.154  120755.411 487576.96   120750.935 487569.172
                120755.941 487566.288  120764.369 487581.066  120749.725 487589.422
                </gml:posList>
            </gml:LinearRing>
          </gml:exterior>
        </gml:Polygon>
      </gml:surfaceMember>
    </gml:MultiSurface>
  </bagwfs:geometrie>
</bagwfs:pand>

The corresponding RDF representation is provided in the following [TURTLE] snippet (taken from the BAG Linked Data service). NB: The RDF representation below has been complemented with additional properties (marked with # Added) for demonstration purposes.

Example 17: [RDF] description of a building, with detailed geometry

@prefix bag:       <http://bag.basisregistraties.overheid.nl/def/bag#> .
@prefix dcterms:   <http://purl.org/dc/terms/> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix gml-ont:   <http://www.opengis.net/ont/gml#> .
@prefix locn:      <http://www.w3.org/ns/locn#> .
@prefix pdok:      <http://data.pdok.nl/def/pdok#> .
@prefix rdfs:      <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema:    <http://schema.org/> .
@prefix w3cgeo:    <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix xsd:       <http://www.w3.org/2001/XMLSchema#> .

<http://bag.basisregistraties.overheid.nl/bag/id/pand/0363100012169587> 
  a geosparql:Feature, bag:Pand ;
  rdfs:label "Pand 0363100012169587"@nl;
  rdfs:isDefinedBy <http://bag.basisregistraties.overheid.nl/bag/doc/2016083000000000/pand/0363100012169587> ;
  bag:identificatiecode "0363100012169587"^^xsd:string;
# Added
  dcterms:identifier "363100012169587"^^xsd:string ;
  bag:status <http://bag.basisregistraties.overheid.nl/id/begrip/PandInGebruik_nietIngemeten> ;
  bag:oorspronkelijkBouwjaar "1635"^^xsd:gYear;
# Added
  dcterms:created "1635"^^xsd:gYear ;
# Added
  locn:address <http://www.ldproxy.net/bag/inspireadressen/inspireadressen.3329155> ;
  geosparql:hasGeometry <http://bag.basisregistraties.overheid.nl/bag/id/geometry/5C1F8F11324717378B437B2CD12871FF> ;
  bag:geometriePand <http://bag.basisregistraties.overheid.nl/bag/id/geometry/5C1F8F11324717378B437B2CD12871FF>
.

<http://bag.basisregistraties.overheid.nl/bag/id/geometry/5C1F8F11324717378B437B2CD12871FF> 
  a geosparql:Geometry, gml-ont:Surface ;
  geosparql:asWKT
    """POLYGON ((
      4.8842353  52.375108 , 4.884276 52.375153 ,
      4.8842567  52.375159 , 4.883981 52.375254 ,
      4.8838502  52.375109 , 4.883819 52.375075 ,
      4.8841037  52.374979 , 4.884143 52.374965 ,
      4.8842069  52.375035 , 4.884263 52.375016 ,
      4.8843200  52.374996 , 4.884255 52.374926 ,
      4.8843289  52.374901 , 4.884451 52.375034 ,
      4.8842353  52.375108
    ))"""^^geosparql:wktLiteral ;
  pdok:asWKT-RD
    """POLYGON ((
      120749.725 487589.422 , 120752.55 487594.375  ,
      120751.227 487595.129 , 120732.539 487605.788 ,
      120723.505 487589.745 , 120721.387 487585.939 ,
      120740.668 487575.07  , 120743.316 487573.589 ,
      120747.735 487581.337 , 120751.564 487579.154 ,
      120755.411 487576.96  , 120750.935 487569.172 ,
      120755.941 487566.288 , 120764.369 487581.066 ,
      120749.725 487589.422
    ))"""^^xsd:string ;
# Added
  geosparql:asWKT
    """<http://www.opengis.net/def/crs/EPSG/0/28992> POLYGON ((
      120749.725 487589.422 , 120752.55 487594.375  ,
      120751.227 487595.129 , 120732.539 487605.788 ,
      120723.505 487589.745 , 120721.387 487585.939 ,
      120740.668 487575.07  , 120743.316 487573.589 ,
      120747.735 487581.337 , 120751.564 487579.154 ,
      120755.411 487576.96  , 120750.935 487569.172 ,
      120755.941 487566.288 , 120764.369 487581.066 ,
      120749.725 487589.422
    ))"""^^geosparql:wktLiteral
.

The different WKT encodings in the example show alternative ways of specifying the coordinate reference system used.

The two instances of property geosparql:asWKT follow the syntax recommended in [GeoSPARQL], where the specification of the coordinate reference system is required only if different from CRS84. By contrast, property pdok:asWKT-RD implies the use of a specific coordinate reference system, namely, EPSG:28992 ("Amersfoort / RD New"). The coordinate axis-order used is determined here by the coordinate reference system, and in both cases, it is longitude / latitude (more precisely, east/north for EPSG:28992).
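A consumer of geosparql:wktLiteral values therefore has to check for an optional leading CRS IRI before interpreting the coordinates. The following is a minimal sketch of that check; the function name and the regular expression are illustrative assumptions, not part of [GeoSPARQL].

```python
import re

# Default CRS assumed when a wktLiteral carries no explicit CRS IRI.
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<crs-iri>" prefix, followed by the WKT text itself.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]*)>\s*)?(.*)$", re.DOTALL)


def split_wkt_literal(lexical_form):
    """Return (crs_iri, wkt_text) for the lexical form of a wktLiteral."""
    match = _WKT_LITERAL.match(lexical_form)
    crs_iri, wkt_text = match.group(1), match.group(2).strip()
    return (crs_iri or CRS84), wkt_text


crs, wkt = split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT (120749.725 487589.422)"
)
print(crs)
print(wkt)
```

A value typed as xsd:string, such as the one given with pdok:asWKT-RD, carries no such prefix, so its CRS can only be known from the definition of the property.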

Example 17: [RDF] description of a building, with detailed geometry also shows how geometries for Spatial Things can be published as separate Web resources. This approach can be particularly suitable for giving access to huge geometries, consisting of hundreds of vertex positions (such as the detailed geometry of the boundaries of a geographical region), without attaching them to the relevant Spatial Things. Moreover, this allows the same geometry to be linked from (i.e., re-used by) different Spatial Things. Finally, it is possible to use mechanisms (including HTTP content negotiation) to provide access to different representations / encodings of the geometry ([GML], WKT, GeoJSON [RFC7946], etc.) as media types, thus addressing different use cases. (On this topic, see also Best Practice 10: Use appropriate relation types to link Spatial Things).

Example 18: HTTP URIs for geometries

As shown in Example 4, the following URI:

https://brt.basisregistraties.overheid.nl/top10nl/id/gebouw/102625209

denotes Amsterdam Central train station. However, its geometry is provided as a separate, standalone resource, denoted by the following URI:

https://brt.basisregistraties.overheid.nl/top10nl/id/geometry/2525562935f2c33152e98f65f9d8d6ff

A similar approach is used by Ordnance Survey. For instance, North Devon is denoted by the following URI:

http://data.ordnancesurvey.co.uk/id/7000000000022933

whereas its geometry is denoted by:

http://data.ordnancesurvey.co.uk/id/geometry/22933-4

An additional example is the API of the GADM-RDF project, providing access to spatial linked data concerning administrative areas. For instance, the following URI http://gadm.geovocab.org/id/0/60 returns a description of administrative area "Germany", which links to the geometry of Germany's boundaries, provided via a separate URI: http://gadm.geovocab.org/id/0/60/geometry.

Dereferencing the geometry URIs operated by the GADM-RDF API returns different geometry representations / encodings (SVG included), that can be accessed via HTTP content negotiation or by appending the format extension to the URI. For instance, URI http://gadm.geovocab.org/id/0/60/geometry.geojson returns the GeoJSON [RFC7946] representation of the geometry. Direct links to the supported geometry representations / encodings are specified in the RDF and HTML representations of the geometry.
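On the server side, content negotiation for a geometry resource amounts to matching the request's Accept header against the encodings on offer. The sketch below shows one simplified way to do this; it is not how the GADM-RDF API is implemented. The table of supported media types, the file extensions, and the function names are assumptions, and wildcard subtypes such as application/* are not handled.

```python
# Hypothetical table of encodings offered for a geometry resource,
# mapping each media type to the URI suffix that selects it directly.
SUPPORTED = {
    "application/geo+json": ".geojson",
    "application/gml+xml": ".gml",
    "text/turtle": ".ttl",
}


def parse_accept(header):
    """Return [(media_type, q)] ordered by descending quality value."""
    items = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        items.append((media_type, quality))
    # sorted() is stable, so equally weighted types keep the client's order.
    return sorted(items, key=lambda item: -item[1])


def choose_representation(accept_header, default="application/geo+json"):
    """Pick the best supported media type, or None (HTTP 406) if none fits."""
    for media_type, quality in parse_accept(accept_header):
        if quality <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return None


print(choose_representation("application/gml+xml;q=0.5, application/geo+json"))
```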

How to Test

Check if:

Geometries are made available in possibly different formats and levels of complexity, considering their intended uses and their consumption on the Web.
The chosen geometry descriptions comply with Best Practice 6: Provide geometries at the right level of accuracy, precision, and size, Best Practice 7: Choose coordinate reference systems to suit your user's applications, and Best Practice 8: State how coordinate values are encoded.
The (possibly) alternative geometry descriptions are accessible via standard mechanisms, such as HTTP content negotiation.

Evidence

Relevant requirements: R-MultipleCRSs, R-BoundingBoxCentroid, R-Compressible, R-CRSDefinition, R-EncodingForVectorGeometry, R-IndependenceOnReferenceSystems, R-MachineToMachine, R-SpatialMetadata, R-3DSupport, R-TimeDependentCRS, R-TilingSupport.

Benefits

Processability
Reuse
Interoperability
Access

Best Practice 6: Provide geometries at the right level of accuracy, precision, and size

Geometry data should be provided at levels of accuracy, precision, and size fit for their use on the Web.

Why

Geometry data always provide an approximate description of the shape and extent of Spatial Things, which is fit for specific uses. For instance, portraying a geometry on a Web map would typically not require the level of detail that is needed for using the same geometry for spatial analysis. Moreover, although a 3D description of a geometry of a building might be available, a Web map would be typically capable of portraying just its 2-dimensional footprint.

Other issues to be taken into account are network bandwidth and the processing capabilities of the target tools. For instance, a geometry with a total size of 1GB or more could be transmitted more efficiently after being compressed. On the other hand, a tool with limited processing capabilities (such as a Web browser) may not be able to handle such a geometry efficiently (e.g., for displaying it on a Web map).

This best practice complements Best Practice 5: Provide geometries on the Web in a usable way by outlining some of the approaches that can be used to publish alternative versions of geometry data, with respect to the level of accuracy, precision, and size, fit for the most general use cases and the reference target communities.

Note

This best practice is not meant to provide detailed guidelines on (a) which is the right level of accuracy and precision for different use cases, or (b) how to generate and publish alternative geometries for Spatial Things. On these specific topics, we refer the reader to, respectively, Best Practice 14: Describe the positional accuracy of spatial data and Best Practice 12: Expose spatial data through 'convenience APIs'.

Intended Outcome

Geometry data should be made available at (possibly different) levels of accuracy, precision, and size, taking into account:

The required level of precision and accuracy of the intended use case(s).
The processing capabilities of the target tools.
Optimization in terms of network bandwidth consumption.

As stated in Best Practice 5: Provide geometries on the Web in a usable way, the requirements of the geospatial, Linked Data and Web communities should ideally also be taken into account with respect to the accuracy, precision, and size of geometry data. Whenever this is not feasible, Web consumption requirements should at least be addressed.

Possible Approach to Implementation

A number of techniques can be used to deliver representations of geometries at an accuracy, precision, and size fitting the requirements of a given use case.

The following list, although not exhaustive, outlines the approaches most widely used, especially for the Web delivery and consumption of geometry data.

Choosing the right technique requires taking primarily into account whether the derived geometry is fit for the target use case. Technical limits, such as network bandwidth and processing capabilities, are important but secondary. Ideally, the chosen technique offers the right trade-off between these two types of requirements.

Whatever option is used, the key requirement is that the derived geometry data do not replace the original data, but are made available as alternative representations.

Best Practice 5: Provide geometries on the Web in a usable way, Best Practice 14: Describe the positional accuracy of spatial data and Best Practice 12: Expose spatial data through 'convenience APIs' provide general guidelines that can be used for the publication of alternative representations of geometries, providing at the same time information on their characteristics. These include, but are not limited to, the use of different URIs for different representations, and HTTP content negotiation. Moreover, whenever geometry data are made available in RDF [RDF11-PRIMER], specific properties can be used to specify the geometry type and the level of accuracy and precision. More specific examples are included in the approaches described below.

Compress geometry data

Using standard compression algorithms, such as zip and gzip, addresses the issue of efficient transmission of geometry data, without information loss. Notably, some formats come with alternative compressed encodings - e.g., KMZ is used to deliver compressed [KML] data.
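The lossless round trip can be sketched with Python's standard gzip module, as shown below. The helper names and the synthetic line geometry are illustrative assumptions; the size reduction achieved in practice depends on the data.

```python
import gzip
import json


def compress_geojson(geometry):
    """Serialize a GeoJSON object compactly and gzip the resulting bytes."""
    payload = json.dumps(geometry, separators=(",", ":")).encode("utf-8")
    return gzip.compress(payload)


def decompress_geojson(data):
    """Inverse of compress_geojson: gunzip and parse the JSON text."""
    return json.loads(gzip.decompress(data).decode("utf-8"))


# Synthetic example geometry: a 1000-vertex line along latitude 52.
line = {
    "type": "LineString",
    "coordinates": [[round(4 + i * 0.001, 3), 52.0] for i in range(1000)],
}
raw_size = len(json.dumps(line).encode("utf-8"))
compressed = compress_geojson(line)
print(raw_size, len(compressed))
```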

Compression can be easily carried out on the fly, and it is also supported by the HTTP protocol via content negotiation - see [RFC2616], section 3.5: Content Codings.

Use formats optimizing access to and processing of geometry data

Some formats support a more compact description of geometry data, which potentially results in reducing network bandwidth consumption and/or more efficient client-side processing.

This is for instance the case of TopoJSON, an extension to GeoJSON [RFC7946] which reduces redundancy in the description of a geometry, by splitting it into segments (referred to as "arcs") that can be re-used.

To achieve the same results, other formats are designed to enable the stream-based delivery of geometry data. For instance, GeoJSON Text Sequences [RFC8142] is a format designed to optimize access and processing of GeoJSON [RFC7946] data, by enabling a client application to use the received data even before the transmission is completed.
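In a GeoJSON text sequence, each GeoJSON text is preceded by an ASCII record separator (RS, 0x1E) and followed by a line feed, so a client can parse records as they arrive. The following minimal sketch shows an encoder and an incremental decoder; the function names are hypothetical. For simplicity, the decoder emits a record only once the next RS or the end of the stream is seen, and it does not handle truncated records.

```python
import json

RS = "\x1e"  # ASCII record separator that introduces each GeoJSON text


def encode_geojson_seq(features):
    """Yield one RS-prefixed, LF-terminated record per GeoJSON object."""
    for feature in features:
        yield RS + json.dumps(feature) + "\n"


def decode_geojson_seq(chunks):
    """Yield GeoJSON objects from an iterable of arbitrary text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last RS is complete; keep the rest buffered.
        *complete, buffer = buffer.split(RS)
        for record in complete:
            if record.strip():
                yield json.loads(record)
    if buffer.strip():
        yield json.loads(buffer)


features = [
    {"type": "Feature", "properties": {"n": 1},
     "geometry": {"type": "Point", "coordinates": [4.88399, 52.3752]}},
    {"type": "Feature", "properties": {"n": 2},
     "geometry": {"type": "Point", "coordinates": [4.9, 52.38]}},
]
stream = "".join(encode_geojson_seq(features))
# Simulate network delivery in small pieces that ignore record boundaries.
pieces = [stream[i:i + 7] for i in range(0, len(stream), 7)]
decoded = list(decode_geojson_seq(pieces))
print(len(decoded))
```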

Another approach, focused on efficient client-side processing, is GeoJSON-VT, a library which enables a client to create on the fly vector tiles from GeoJSON [RFC7946] data.

Finally, Geohash provides a compact way of encoding 0-dimensional geometries (points), which, at the same time, can be used for spatial indexing.
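A Geohash is obtained by repeatedly halving the longitude and latitude ranges, interleaving the resulting bits (starting with longitude), and writing each group of five bits as one base-32 character. The sketch below implements that encoding; the function name and default length are illustrative choices.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=11):
    """Encode a WGS 84 latitude/longitude pair as a Geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bit_count, value = 0, 0
    use_lon = True  # bits alternate between longitude and latitude
    while len(chars) < length:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            value <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[value])
            bit_count, value = 0, 0
    return "".join(chars)


print(geohash_encode(52.37520, 4.88399, length=6))
```

Because nearby positions share a common prefix, truncating a Geohash yields a coarser cell that contains the original point, which is what makes the encoding usable for spatial indexing.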

Example 19

The point coordinates of the address of Anne Frank's House (see Example 8) can be encoded with Geohash as u173zns7thy (corresponding to the following WGS 84 lat/long coordinates: 52.37520 4.88399).

Provide geometries at different levels of generalization

Generalization is a traditional technique used in spatial data - first of all, in cartography - to reduce the precision and/or accuracy of a geometry for specific purposes. A typical example is provided by how geometries are portrayed in maps of different scales: for instance, a large-scale map can depict the width of a road (2-dimensional geometry), whereas, at smaller scales, the same road can be shown as a line with zero width (1-dimensional geometry).

Providing geometries at different scales or resolutions is actually one of the first criteria to be considered for addressing different use cases. This is common practice in the geospatial domain, especially, but not only, for reference data. For instance, the dataset of the Nomenclature of Territorial Units for Statistics (NUTS) of the European Union is made available at five different scales - ranging from 1:1,000,000 to 1:60,000,000.

Example 20

The GADM-RDF project provides access to geometries of administrative areas at a resolution of 100m, 1km, 10km, and 100km. Each of these variants is associated with a different HTTP URI, and geometry data are made available in different formats. For instance, the geometry of Germany at 100m resolution is denoted by the following URI http://gadm.geovocab.org/id/0/60/geometry_100m, whereas the variant at 100km resolution is available from the following URI: http://gadm.geovocab.org/id/0/60/geometry_100km (see also Example 18: HTTP URIs for geometries).

Scale reduction uses a number of generalization techniques that can be used also outside this specific use case in order to provide geometries at different levels of accuracy and precision.

These techniques include the following:

Reducing precision

This consists of reducing the number of decimals in the point coordinates of a geometry. This feature is widely supported in geospatial tools and Web libraries, and it provides a way to effectively reduce the size of geometry data without losing too much information about its shape.
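For a GeoJSON geometry, reducing precision can be sketched as a recursive rounding of the nested coordinate arrays. The function names and the default of five decimals are illustrative assumptions; the appropriate number of decimals depends on the use case.

```python
def round_coordinates(coords, decimals):
    """Round every number in a (possibly nested) GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, decimals)
    return [round_coordinates(item, decimals) for item in coords]


def reduce_precision(geometry, decimals=5):
    """Return a copy of a GeoJSON geometry with rounded coordinates."""
    if geometry["type"] == "GeometryCollection":
        members = [reduce_precision(g, decimals) for g in geometry["geometries"]]
        return {**geometry, "geometries": members}
    rounded = round_coordinates(geometry["coordinates"], decimals)
    return {**geometry, "coordinates": rounded}


point = {"type": "Point", "coordinates": [4.8842353, 52.375108]}
print(reduce_precision(point, decimals=4))
```

The original geometry object is left unchanged, in line with the requirement that derived geometries are published alongside the original data rather than replacing it.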

Note

The precision with which coordinate positions are reported often does not reflect the accuracy of the measurement. For example, latitude and longitude reported to six decimal places correspond to a precision of around 10cm on the ground. GPS-enabled consumer devices are accurate to within a few meters: centimeter-accuracy can only be achieved with professional equipment. Yet a lot of software defaults to use of six, seven or even more decimal places when expressing coordinate positions, which may mislead users into thinking that the data is more accurate than it actually is!

See Best Practice 14: Describe the positional accuracy of spatial data for a discussion on precision and accuracy.
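The relation between decimal places and distance on the ground can be estimated with a short calculation. The sketch below treats the Earth as a sphere with the WGS 84 semi-major axis as radius, which is an approximation; the function name is hypothetical.

```python
import math

WGS84_SEMI_MAJOR_AXIS_M = 6378137.0


def ground_resolution_m(decimals, latitude_deg=0.0):
    """Approximate east-west distance, in metres, spanned by one unit in
    the last reported decimal place of a longitude value."""
    metres_per_degree = (
        math.pi * WGS84_SEMI_MAJOR_AXIS_M / 180.0
        * math.cos(math.radians(latitude_deg))
    )
    return metres_per_degree * 10 ** -decimals


for decimals in (4, 5, 6, 7):
    print(decimals, ground_resolution_m(decimals))
```

At the equator one degree spans roughly 111 km, so each additional decimal place refines the reported position by a factor of ten; east-west distances shrink further towards the poles.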

Simplification

This basically consists of reducing the number of vertices (point coordinates) of a geometry. Examples of algorithms used for this purpose are Ramer-Douglas-Peucker (RDP) and Visvalingam-Whyatt.
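As an illustration, the Ramer-Douglas-Peucker algorithm can be sketched as follows. This version treats coordinates as planar, which is only an approximation for longitude/latitude values, and the function names are illustrative. The tolerance is expressed in the same units as the coordinates.

```python
import math


def _perpendicular_distance(point, start, end):
    """Distance from point to the line through start and end (planar)."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring): use distance to the point.
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)


def simplify_rdp(points, tolerance):
    """Simplify a polyline, keeping vertices farther than tolerance
    from the chord joining the current end points."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    index, max_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        distance = _perpendicular_distance(points[i], start, end)
        if distance > max_distance:
            index, max_distance = i, distance
    if max_distance <= tolerance:
        return [start, end]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify_rdp(points[: index + 1], tolerance)
    right = simplify_rdp(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
print(simplify_rdp(line, 0.1))
```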

Conversion of geometry dimensions

One such case is the example mentioned earlier in this section, where the geometry of a road, originally 2-dimensional, is converted into a 1-dimensional object (a line). This can also apply to conversion from 3-dimensional geometries into 2-dimensional ones (e.g., the 3D representation of a building is converted into its 2D footprint), and to conversion of an n-dimensional geometry into a point.

Provide the centroid and bounding box of a geometry

Centroids and bounding boxes are another example of how a geometry can be generalized, but serving different purposes. More precisely, a centroid is meant to specify the position of a Spatial Thing by converting its actual geometry to a point, corresponding to its center. On the other hand, a bounding box provides a simplified description of the maximum extent of a Spatial Thing.

Although both these generalization methodologies result in a high level of information loss with respect to the original geometry, they play an important role in spatial analysis because of the topological information they provide. Moreover, centroids and bounding boxes could provide an accurate enough description of a geometry for those use cases where, respectively, the extent or precise shape of a Spatial Thing is not relevant. Finally, they are widely used also outside the geospatial domain.

Computation of centroids and bounding boxes is supported by all GIS tools and Web mapping libraries, so it can be carried out on the fly. However, performing this operation client-side can be extremely inefficient if the target tool has limited processing capabilities.

This issue can be addressed by providing access to centroids and bounding boxes as alternative representations of a given geometry.

Example 21

In the following [TURTLE] snippet, [W3C-BASIC-GEO] and [GeoRSS] are used to specify, respectively, the centroid (w3cgeo:lat and w3cgeo:long) and bounding box (georss:box) of the 2-dimensional footprint of the building hosting Anne Frank's Museum (see Example 17: [RDF] description of a building, with detailed geometry).
```
@prefix bag:       <http://bag.basisregistraties.overheid.nl/def/bag#> .
@prefix georss:    <http://www.georss.org/georss/> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs:      <http://www.w3.org/2000/01/rdf-schema#> .
@prefix w3cgeo:    <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix xsd:       <http://www.w3.org/2001/XMLSchema#> .

<http://bag.basisregistraties.overheid.nl/bag/id/pand/0363100012169587> 
  a geosparql:Feature, bag:Pand ;
  rdfs:label "Pand 0363100012169587"@nl;
  
# Detailed geometry  
  
  geosparql:hasGeometry <http://bag.basisregistraties.overheid.nl/bag/id/geometry/5C1F8F11324717378B437B2CD12871FF> ;
  bag:geometriePand     <http://bag.basisregistraties.overheid.nl/bag/id/geometry/5C1F8F11324717378B437B2CD12871FF> ;
  
# Centroid

  w3cgeo:lat  "52.37509"^^xsd:float ;
  w3cgeo:long "4.88412"^^xsd:float ;
  
# Bounding box

  georss:box "52.3749,4.8838 52.3753,4.8845"^^xsd:string
.
```
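A publisher could precompute such values from the detailed footprint. The following minimal sketch computes the bounding box and the area-weighted centroid of a simple polygon ring, using the longitude/latitude vertices of Example 17. It treats the coordinates as planar, which is a reasonable approximation only for small extents such as a building, and the function names are illustrative.

```python
def bounding_box(ring):
    """Return (min_x, min_y, max_x, max_y) of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def centroid(ring):
    """Area-weighted centroid of a closed, non-self-intersecting ring."""
    origin_x, origin_y = ring[0]  # shift to a local origin for numerical stability
    area2 = sum_x = sum_y = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        x0, y0, x1, y1 = x0 - origin_x, y0 - origin_y, x1 - origin_x, y1 - origin_y
        cross = x0 * y1 - x1 * y0
        area2 += cross  # twice the signed area (shoelace formula)
        sum_x += (x0 + x1) * cross
        sum_y += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("ring has zero area")
    return origin_x + sum_x / (3 * area2), origin_y + sum_y / (3 * area2)


# Footprint vertices (longitude, latitude) taken from Example 17.
footprint = [
    (4.8842353, 52.375108), (4.884276, 52.375153), (4.8842567, 52.375159),
    (4.883981, 52.375254), (4.8838502, 52.375109), (4.883819, 52.375075),
    (4.8841037, 52.374979), (4.884143, 52.374965), (4.8842069, 52.375035),
    (4.884263, 52.375016), (4.8843200, 52.374996), (4.884255, 52.374926),
    (4.8843289, 52.374901), (4.884451, 52.375034), (4.8842353, 52.375108),
]
print(bounding_box(footprint))
print(centroid(footprint))
```

Rounded to four decimals, the computed bounding box corresponds to the values given with georss:box in Example 21.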

How to Test

Check if:

The original and most detailed version of geometry data is available.
Compressed versions of geometry data can be obtained via HTTP content negotiation or other mechanisms.
Centroids and bounding boxes are made available, without the need to download and process the relevant geometry data.
It is possible to get a 2-dimensional representation of a 3-dimensional geometry.
Geometry data are available at different levels of precision, e.g., by allowing users to specify the maximum number of decimals in point coordinates.
Geometry data are available at different scales / spatial resolutions.

Evidence

Relevant requirements: R-BoundingBoxCentroid, R-Compatibility, R-Compressible, R-CoordinatePrecision,

Benefits

Processability
Reuse
Access

Best Practice 7: Choose coordinate reference systems to suit your user's applications

Consider your user's intended application when choosing the coordinate reference system(s) used to publish spatial data.

Why

A multitude of coordinate reference systems exist because there is no perfect solution to meet all requirements:

The Earth is a complicated shape (neither spherical nor flat!):

For each (Earth-based) coordinate reference system, the topographical surface of the Earth is approximated to a geodetic datum that is described using an ellipsoid. The trouble with approximation is that nothing is perfect everywhere, which means that compromise is inevitable. Some datums, like WGS 84, provide a reasonable (but not highly accurate) fit everywhere on the Earth, while other datums (such as the European Terrestrial Reference System 1989 - as used by ETRS89 / EPSG:4258) provide a better fit in a given region at the expense of accuracy elsewhere.

Spatial data is often projected from the curved surface of the Earth onto a flat plane (e.g. a computer screen or a topographical map) to make it easier to compute distances between positions and calculate areas. There are many choices of projection (e.g. equirectangular, Mercator, stereographic, orthographic etc.), each of which is designed for particular tasks. As with datums, projections are often chosen to better support regional, national or local needs.

It is also worth noting that as a living planet, the Earth continues to change its shape; for example, continental drift moves Australia north-eastwards several centimeters each year and New Zealand shifts in multiple directions. To retain accuracy, datums need to be adjusted from time to time - as is the case of the New Zealand Geodetic Datum (NZGD2000) that is frequently revised to take account of earth deformations.

Sometimes we don't want to measure relative to the surface of the Earth at all:

Spatial data such as descriptions of the built environment, geological surveys, satellite imagery, etc. are often captured and stored in an engineering coordinate reference system as measurements from a local datum. For example, X Y survey coordinates relative to a building corner, pixel positions within the image swath of a satellite camera, or distance along a line from a fixed origin point.

Although it is possible to convert coordinates from one CRS to another, many users will be put off by the need to do so. Furthermore, each such transformation is a point where errors can be introduced into the spatial data - especially where users have limited expertise with spatial data.

When publishing spatial data, it is best to help users avoid the need for them to transform spatial data between coordinate reference systems themselves by providing data in a form, or forms, which they can use directly. To determine which coordinate reference system(s) are needed, data publishers must consider the intended applications of their user community.

Intended Outcome

Spatial data is provided in a coordinate reference system, or systems, that are sensitive to the needs of users' intended applications.

Most of a publisher's anticipated user community does not need to transform coordinate values prior to using the spatial data.

Possible Approach to Implementation

Note

Whichever coordinate reference system is chosen for the publication of spatial data, it is imperative that that choice is made clear to users. Please refer to Best Practice 8: State how coordinate values are encoded for further details.

The first thing that publishers of spatial data need to do is consider their audience.

When publishing spatial data on the Web, the largest community of potential users will be unknown: anyone might find and use data published on the Web! To support this unanticipated reuse, we recommend always publishing your spatial data using a global coordinate reference system which allows spatial data from multiple sources to be readily combined for display or computation. For geospatial data with point, line or polygon geometries (i.e. vector data), WGS 84 Lat/Long (EPSG:4326) or WGS 84 Lat/Long/Elevation (EPSG:4979) are good choices as many of the tools and applications used by Web developers are set up to use data from GPS-enabled mobile devices that all use WGS 84. Where you have geo-imagery (i.e. raster data, comprised of a rectangular pattern of pixels on a flat plane) it is best to use Web Mercator (EPSG:3857) which has global coverage.

Note

Data publishers should be aware that the geodetic datum used by Web Mercator is spherical and not true to the shape of the earth. At high latitudes, this results in positional differences of up to 40 kilometers when compared with WGS 84. However, many Web-mapping tools transparently perform the necessary transformations to ensure that geospatial vector data is correctly plotted on the underlying base map.
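For reference, the forward Web Mercator projection applies the spherical Mercator formulas to WGS 84 longitude and latitude, using the WGS 84 semi-major axis as the sphere radius. The following is a minimal sketch with an illustrative function name; production code would normally rely on an established projection library.

```python
import math

# Web Mercator treats the Earth as a sphere with this radius (metres).
SPHERE_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS 84 longitude/latitude (degrees) to EPSG:3857 x/y (metres)."""
    if abs(lat_deg) >= 90:
        raise ValueError("latitude must lie strictly between -90 and 90 degrees")
    x = SPHERE_RADIUS_M * math.radians(lon_deg)
    y = SPHERE_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


print(lonlat_to_web_mercator(4.88399, 52.37520))
```

The y value grows without bound towards the poles, which is why Web Mercator maps are usually cut off at about 85 degrees north and south.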

Where considerations of the known user community (or communities) call for different coordinate reference systems, we recommend publishing spatial data in multiple representations: one for each of the prioritized coordinate reference systems. Clearly, the number of representations provided needs to be determined with respect to the associated effort. However, remember that a decision not to publish data in a priority CRS means that each member of your user community will need to perform that transformation themselves - or will not use your data at all.

Common reasons for needing to publish in additional coordinate reference systems include:

publication through government data portals that require use of a projected CRS defined by the national mapping agency - and similar legislative requirements;

The Basisregistraties Adressen en Gebouwen (BAG), or Basic Registers for Addresses and Buildings, provided by Kadaster, publishes data in both OGC CRS84 (using the WGS 84 geodetic datum) and the Amersfoort / RD New (EPSG:28992) coordinate reference systems.

The INSPIRE Directive 2007/2/EC of the European Commission requires that the European Terrestrial Reference System 1989 ETRS89 (EPSG:4258) is used for the referencing of spatial datasets.
applications such as augmented reality, defense and precision agriculture that require coordinates to be accurate to tens of centimeters or less, thereby requiring the use of a CRS with an alternative geodetic datum that provides a superior fit for the local or regional geographic area - noting that every CRS and datum should define the geographic area within which it is intended to be used;
the need to support applications that work in a local frame of reference using an engineering CRS - such as in an urban environment, inside a building complex or using chainage along a survey line;
avoiding computationally intensive reprojection of raster data such as satellite imagery or base maps within end-user applications - which may mean publishing vector data in the same projected CRS so that it can be easily aligned with the raster data; and
the need to retain the integrity of raster data by publishing in its original projection, thereby avoiding modification of pixel values due to the reprojection.

Note

There are many cases where WGS 84, or any Earth-based coordinate reference system, is not appropriate. Examples include describing location relative to other celestial bodies (e.g. lunar geography, and areography - the geography of Mars), the arrangement of cells on a microscope slide, tapes in a mass storage unit, or the position of an artifact in a museum warehouse. In such cases, publication of spatial data in WGS 84 is either impossible or provides no value.

That said, many of these best practices are still relevant. In particular, see Best Practice 9: Describe relative positioning.

Note

Discussion of coordinate system transformations is beyond the scope of this best practice document: converting coordinates between CRSs that use different datums and/or projections can be very involved. This is especially true where elevation values are missing from the source data. For reference, EPSG guidelines say that in such cases reasonable assumptions are:

Height = 0 meters (i.e. we are standing on the surface of the ellipsoid); or
The height is given by a digital elevation model (i.e. we are standing on the surface of the planet).

That said, we note that several open source software implementations are available to help users do such conversions. These include the Geospatial Data Abstraction Library (GDAL), the Cartographic Projections Library (PROJ.4), its associated JavaScript implementation (PROJ4.JS) and the Apache Spatial Information System Library (SIS).

How to Test

Check that geospatial data (i.e. data about things located relative to the Earth) is available, as a minimum, in a global coordinate reference system: for vector data, this should be WGS 84 Lat/Long (EPSG:4326) or WGS 84 Lat/Long/Elevation (EPSG:4979); for raster data this should be Web Mercator (EPSG:3857).

Evidence

Relevant requirements: R-AvoidCoordinateTransformations, R-CoordinatePrecision.

Benefits

Comprehension
Processability
Reuse
Interoperability
Access

Best Practice 8: State how coordinate values are encoded

Provide enough information for users to determine how coordinate values are encoded.

Why

The geometry of Spatial Things is described using position coordinates; for example, latitude and longitude. Because coordinates describe a position relative to a datum (e.g. zero latitude is the equator and zero longitude is the prime meridian - often the Greenwich Meridian), it is important to understand both the datum and the units that are used for coordinates, along with the order in which the coordinate axes are defined: the coordinate reference system (CRS). Spatial data is published in a wide variety of CRS. This variety can create confusion and inconsistencies in using and interpreting spatial data. Unless the CRS is known, errors are likely to be introduced when determining the position and extent of a Spatial Thing on the Earth and this makes comparing or combining spatial data from different sources extremely problematic.

Intended Outcome

Sufficient information is provided to enable coordinates to be related to the correct position, thereby enabling spatial data to be correctly interpreted by humans and software agents.

Spatial data from different sources can be combined without introducing unwarranted positional errors.

Possible Approach to Implementation

A user of spatial data will need to know:

which coordinate value relates to which axis;
what units are used for each coordinate; and
what datum is used.

Note

There is a predominant view that "I just need to use Lat and Long - and I'm done".

Although the clear majority of spatial data published on the Web uses WGS 84 Long/Lat (as used by GPS), we strongly recommend that spatial data is published with all the necessary information to interpret coordinate values. Even where the use of latitude and longitude angular measurements is obvious, the choices of datum and units of measurement have an impact. In particular, angular measurements appearing as floating point numbers are most likely to be provided in decimal degrees, but could also be in radians or gons (also known as grads).

The problem is that the assumption of a "predominant view" leads to ambiguity. For example, many spatial data users work entirely with information provided in their national coordinate reference system (such as the Dutch Amersfoort / RD EPSG:28992 or British National Grid EPSG:27700), which makes all coordinates in WGS 84 Long/Lat (especially the negative numbers) utterly perplexing.

In practice, a publisher not documenting their CRS and presuming that latitude and longitude can be treated as cartesian is often bailed out by fuzzy use cases and software that takes care of projections. However, CRS and coordinate axis order ambiguity leads sooner or later to serious and avoidable errors, while ignorance of datums and map projections leads to broken applications. Furthermore, these practices will also become less and less tenable as new applications such as Augmented Reality require higher data precision and accuracy.
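
To make the point about units concrete, the following sketch converts an angular value to decimal degrees from each of the three units mentioned above. The unit labels ("degree", "radian", "gon") and the function name are assumptions chosen for this example and are not taken from any particular vocabulary.

```python
import math

# Multipliers that convert a value in the given unit to decimal degrees.
TO_DEGREES = {
    "degree": 1.0,
    "radian": 180.0 / math.pi,
    "gon": 0.9,  # 400 gons make a full circle, so 1 gon = 0.9 degrees
}


def to_decimal_degrees(value, unit):
    """Convert an angular value in the named unit to decimal degrees."""
    try:
        return value * TO_DEGREES[unit]
    except KeyError:
        raise ValueError(f"unknown angular unit: {unit!r}") from None


# The same latitude expressed in three different units.
latitude_deg = 52.37514
latitude_rad = math.radians(latitude_deg)  # roughly 0.914
latitude_gon = latitude_deg / 0.9          # roughly 58.19
```

A consumer who reads the radian value as decimal degrees would place the point within one degree of the equator, which is why the unit needs to be stated alongside the number.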

There are four common ways that this information can be provided:

Describe the coordinate reference system in the dataset metadata.

Example 22: Coordinate reference system stated in [GeoDCAT-AP] (TTL encoding)

@prefix ex:      <http://data.example.org/datasets/> .
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

ex:ExampleDataset 
  a dcat:Dataset ;
  dcterms:conformsTo <http://www.opengis.net/def/crs/EPSG/0/32630> .

<http://www.opengis.net/def/crs/EPSG/0/32630> 
  a dcterms:Standard, skos:Concept ;
  dcterms:type <http://inspire.ec.europa.eu/glossary/SpatialReferenceSystem> ;
  dcterms:identifier "http://www.opengis.net/def/crs/EPSG/0/32630"^^xsd:anyURI ;
  skos:prefLabel "WGS 84 / UTM zone 30N"@en ;
  skos:inScheme <http://www.opengis.net/def/crs/EPSG/0/> .

The example above illustrates how to describe the coordinate reference system used for a dataset within [GeoDCAT-AP] metadata. The conformsTo property from [DCTERMS] is used to assert the relationship between dataset and CRS in the same way that conformance with a standard is expressed in [VOCAB-DQV].

Dataset metadata for spatial data should always provide details of the CRS used. For more information about dataset metadata, please refer to Best Practice 13: Include spatial metadata in dataset metadata.

Provide each coordinate value with explicit labels and provide metadata to indicate what each label means.

Example 23: Coordinate position provided using [W3C-BASIC-GEO]

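@prefix :        <http://example.org/poi/> .   # hypothetical namespace for the example resource
@prefix w3cgeo:  <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

:myPointOfInterest a w3cgeo:SpatialThing ;
    dcterms:description "Anne Frank's House, Amsterdam." ;
    w3cgeo:lat "52.37514"^^xsd:float ;
    w3cgeo:long "4.88412"^^xsd:float ;
    .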

The labels (or terms) w3cgeo:lat and w3cgeo:long are provided by the [W3C-BASIC-GEO] vocabulary which states that it is:

A vocabulary for representing latitude, longitude and altitude information in the WGS 84 geodetic reference datum.

The terms themselves (plus w3cgeo:alt) are defined with all the necessary information as follows:

lat: The WGS 84 latitude of a Spatial Thing (decimal degrees).
long: The WGS 84 longitude of a Spatial Thing (decimal degrees).
alt: The WGS 84 altitude of a Spatial Thing (decimal meters above the local reference ellipsoid).

Example 24: Coordinate position provided using [JSON-LD] and [SCHEMA-ORG]

<script type="application/ld+json">
{
  "@context" : {
    "@vocab" : "http://schema.org/"
  },
  "myPointOfInterest" : {
    "@type" : "Place",
    "geo" : {
      "@type": "GeoCoordinates",
      "latitude": "52.37514",
      "longitude": "4.88412"
    }
  }
}
</script>

In the example above, the labels latitude and longitude are defined in [SCHEMA-ORG], as indicated by the [JSON-LD] key @vocab. The associated definitions in [SCHEMA-ORG] are:

latitude: The latitude of a location. For example 37.42242 (WGS 84).
longitude: The longitude of a location. For example -122.08585 (WGS 84).

Note

The definitions provided in [SCHEMA-ORG] do not indicate the unit of measure. However, we have included this example as [SCHEMA-ORG] is very commonly used. The unit of measure used for latitude and longitude is decimal degrees, and decimal meters are used for the remaining coordinate position property, elevation.

The metadata for axis labels may also be provided in the documentation for an API from which the spatial data is accessed. For more information on documenting APIs, please refer to [DWBP] Best Practice 25: Provide complete documentation for your API.

Example 25: Coordinate position provided using columns in tabular data

GID,On Street,Long,Lat,Species,Trim Cycle,Diameter at Breast Ht,Inventory Date,Comments,Protected
1,ADDISON AV,-122.15649,37.44096,Celtis australis,Large Tree Routine Prune,11,10/18/2010,,
2,EMERSON ST,-122.15675,37.44096,Liquidambar styraciflua,Large Tree Routine Prune,11,6/2/2010,,
6,ADDISON AV,-122.15630,37.44115,Robinia pseudoacacia,Large Tree Routine Prune,29,6/1/2010,cavity or decay; trunk decay; codominant leaders; included bark; large leader or limb decay; previous failure root damage; root decay;  beware of BEES,YES

In this example (adapted from the City of Palo Alto tree operations database and published as tabular data and as an interactive map) the coordinate position of each tree is specified using separate columns (Long and Lat).

We see the definitions of those Long and Lat columns provided in the dataset metadata, in this case a tabular metadata document, as per approach (1) above. Long and Lat are mapped onto the definitions provided by [W3C-BASIC-GEO] to ensure that the meaning of the data values in those columns is clear:

Example 26: Abridged tabular metadata providing column meanings

{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "@id": "http://example.org/tree-ops-db",
  "url": "tree-ops-db.csv",
  "dcterms:title": "Tree Operations",
  ...
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": [
        "GID",
        "Generic Identifier"
      ],
      "dcterms:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true, 
      "suppressOutput": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dcterms:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "Long",
      "titles": "Longitude",
      "dcterms:description": "The WGS 84 longitude of the tree (decimal degrees).",
      "propertyUrl": "http://www.w3.org/2003/01/geo/wgs84_pos#long",
      "datatype": {
        "base": "number",
        "minimum": "-180",
        "maximum": "180"
      }
    }, {
      "name": "Lat",
      "titles": "Latitude",
      "propertyUrl": "http://www.w3.org/2003/01/geo/wgs84_pos#lat",
      "dcterms:description": "The WGS 84 latitude of the tree (decimal degrees).",
      "datatype": {
        "base": "number",
        "minimum": "-90",
        "maximum": "90"
      }
    },
    ...
    ],
    "primaryKey": "GID",
    "aboutUrl": "http://example.org/tree-ops-ext#gid-{GID}"
  }
}

Note

Please refer to [TABULAR-DATA-PRIMER] section 6.2 How do you support geospatial data? for more details on working with geospatial content in tabular data.
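
As an illustration of how a consumer might use the column definitions, the sketch below reads two rows adapted from Example 25 and checks the Long and Lat values against the minimum and maximum bounds given in Example 26. The function name and the abridged column set are assumptions for this example; a full implementation would read the bounds from the tabular metadata document rather than hard-coding them.

```python
import csv
import io

# Two rows adapted from Example 25 (columns abridged for brevity).
CSV_TEXT = """GID,On Street,Long,Lat,Species
1,ADDISON AV,-122.15649,37.44096,Celtis australis
2,EMERSON ST,-122.15675,37.44096,Liquidambar styraciflua
"""

# Range constraints copied by hand from the Long and Lat columns in Example 26.
BOUNDS = {"Long": (-180.0, 180.0), "Lat": (-90.0, 90.0)}


def read_positions(text):
    """Yield (GID, longitude, latitude) tuples, rejecting out-of-range values."""
    for row in csv.DictReader(io.StringIO(text)):
        coords = {}
        for column, (low, high) in BOUNDS.items():
            value = float(row[column])
            if not low <= value <= high:
                raise ValueError(
                    f"{column}={value} is outside [{low}, {high}] in row {row['GID']}"
                )
            coords[column] = value
        yield row["GID"], coords["Long"], coords["Lat"]


positions = list(read_positions(CSV_TEXT))
```

A range check of this kind catches some swapped-column mistakes (a longitude of -122 is not a valid latitude), but not all of them, so it complements rather than replaces explicit column definitions.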

Use a data format that specifies axes, their order, datum and unit of measurement for coordinates.
Example 27: Coordinates encoded using GeoJSON [RFC7946] in HTTP response
```
HTTP/1.1 200 OK
Date: Sun, 05 Mar 2017 17:12:35 GMT
Content-length: 543
Connection: close
Content-type: application/geo+json

{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [ [4.884235, 52.375108], [4.884276, 52.375153], 
        [4.884257, 52.375159], [4.883981, 52.375254], 
        [4.883850, 52.375109], [4.883819, 52.375075], 
        [4.884104, 52.374979], [4.884143, 52.374965], 
        [4.884207, 52.375035], [4.884263, 52.375016], 
        [4.884320, 52.374996], [4.884255, 52.374926], 
        [4.884329, 52.374901], [4.884451, 52.375034], 
        [4.884235, 52.375108] ]
    ]
  },
  "properties": {
    "name": "Anne Frank's House"
  }
}
```
The media type application/geo+json is used to designate that content is provided in GeoJSON format, as specified in [RFC7946].

[RFC7946] Section 4. Coordinate Reference System provides all the necessary information to interpret the coordinates, stating that:

The coordinate reference system for all GeoJSON [RFC7946] coordinates is a geographic coordinate reference system, using the World Geodetic System 1984 (WGS 84) [WGS84] datum, with longitude and latitude units of decimal degrees. This is equivalent to the coordinate reference system identified by the Open Geospatial Consortium (OGC) URN urn:ogc:def:crs:OGC::CRS84. An OPTIONAL third-position element SHALL be the height in meters above or below the WGS 84 reference ellipsoid. In the absence of elevation values, applications sensitive to height or depth SHOULD interpret positions as being at local ground or sea level.
Example 28: Coordinate position provided using [JSON-LD] and [SCHEMA-ORG]
```
<script type="application/ld+json">
{
  "@context" : {
    "@vocab" : "http://schema.org/"
  },
  "myPlaceOfInterest" : {
    "@type" : "Place",
    "name" : "Anne Frank's House",
    "geo" : {
      "@type": "GeoShape",
      "polygon": "52.375108,4.884235 52.375153,4.884276 
                  52.375159,4.884257 52.375254,4.883981 
                  52.375109,4.883850 52.375075,4.883819 
                  52.374979,4.884104 52.374965,4.884143 
                  52.375035,4.884207 52.375016,4.884263 
                  52.374996,4.884320 52.374926,4.884255 
                  52.374901,4.884329 52.375034,4.884451 
                  52.375108,4.884235"
    }
  }
}
</script>
```
The [SCHEMA-ORG] definition of GeoShape states:

The geographic shape of a place. A GeoShape can be described using several properties whose values are based on latitude/longitude pairs. Either whitespace or commas can be used to separate latitude and longitude; whitespace should be used when writing a list of several such points.

Note

In the two previous examples, we see a prime example of why coordinate axis-order is important: GeoJSON [RFC7946] uses Long/Lat while [SCHEMA-ORG] uses Lat/Long. Getting the axis order wrong puts Anne Frank's House somewhere off the coast of Somalia rather than in the Netherlands!
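
The axis-order difference can be handled explicitly when converting between the two encodings. The sketch below turns a GeoJSON linear ring into the whitespace-separated "latitude,longitude" string used in the schema.org example; the function name is an assumption for this example, and the ring is abridged to the first three vertices of the polygon plus the closing vertex.

```python
def geojson_ring_to_schema_org(ring):
    """Convert a GeoJSON linear ring ([lon, lat, ...] positions) to a
    schema.org polygon string of whitespace-separated "lat,lon" pairs."""
    # GeoJSON positions are [longitude, latitude, optional elevation];
    # the elevation, if present, is ignored here.
    return " ".join(f"{lat},{lon}" for lon, lat, *_ in ring)


# Abridged ring: Long/Lat order, as required by GeoJSON.
ring = [
    [4.884235, 52.375108],
    [4.884276, 52.375153],
    [4.884257, 52.375159],
    [4.884235, 52.375108],
]
polygon = geojson_ring_to_schema_org(ring)
```

Unpacking each position as `lon, lat` at the point of conversion keeps the swap visible in one place, rather than relying on list indices scattered through the code.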
State within the data itself which coordinate reference system is used.
Example 29: Coordinate reference system stated in [GML]
```
<gml:Polygon srsDimension="2" axisLabels="east north" 
             srsName="http://www.opengis.net/def/crs/EPSG/0/28992">
  <gml:exterior>
    <gml:LinearRing>
      <gml:posList>
        120749.725 487589.422  120752.55  487594.375  120751.227 487595.129
        120732.539 487605.788  120723.505 487589.745  120721.387 487585.939
        120740.668 487575.07   120743.316 487573.589  120747.735 487581.337
        120751.564 487579.154  120755.411 487576.96   120750.935 487569.172
        120755.941 487566.288  120764.369 487581.066  120749.725 487589.422
      </gml:posList>
    </gml:LinearRing>
  </gml:exterior>
</gml:Polygon>
```
The example above encodes the polygon for Anne Frank's House in [GML]. The XML [XML11] attribute srsName (srs meaning "spatial reference system") refers to the Amersfoort / RD CRS (EPSG:28992) used in the Netherlands. Also note that additional useful information (srsDimension and axisLabels) is provided within the document for easy reference.
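
A consumer can read the CRS directly from such a document before interpreting any coordinate. The following sketch uses the Python standard library to do so. It assumes the GML 3.2 namespace, which is declared explicitly here because the snippet above omits the namespace declaration; the function name is hypothetical and the coordinate list is abridged.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the GML 3.2 namespace.
GML_NS = {"gml": "http://www.opengis.net/gml/3.2"}

GML_TEXT = """
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2" srsDimension="2"
             srsName="http://www.opengis.net/def/crs/EPSG/0/28992">
  <gml:exterior>
    <gml:LinearRing>
      <gml:posList>
        120749.725 487589.422  120752.55 487594.375  120751.227 487595.129
        120749.725 487589.422
      </gml:posList>
    </gml:LinearRing>
  </gml:exterior>
</gml:Polygon>
"""


def read_polygon(text):
    """Return (srsName, list of coordinate tuples) for a gml:Polygon."""
    polygon = ET.fromstring(text.strip())
    srs_name = polygon.get("srsName")
    dimension = int(polygon.get("srsDimension", "2"))
    pos_list = polygon.find(".//gml:posList", GML_NS)
    values = [float(v) for v in pos_list.text.split()]
    # Group the flat list of numbers into tuples of `dimension` values each.
    points = [tuple(values[i:i + dimension]) for i in range(0, len(values), dimension)]
    return srs_name, points


srs_name, points = read_polygon(GML_TEXT)
```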
Example 30: Coordinate reference system stated in [GeoSPARQL] WKT ([JSON-LD] encoding)
```
{
  "@context": {
    "geosparql" : "http://www.opengis.net/ont/geosparql#" ,
    "rdfs" : "http://www.w3.org/2000/01/rdf-schema#" ,
    "asWKT" : {
      "@id" : "http://www.opengis.net/ont/geosparql#asWKT" ,
      "@type" : "geosparql:wktLiteral"
    }
  } ,
  "@id" : "http://example.org/register/id/building/0363100012169587" ,
  "@type" : "http://www.opengis.net/ont/geosparql#Feature" ,
  "rdfs:label" : "Building 0363100012169587" ,
  "geosparql:hasGeometry": {
    "asWKT" : "<http://www.opengis.net/def/crs/EPSG/0/4326> 
                        POLYGON ((52.375108 4.884235, 52.375153 4.884276, 
                                  52.375159 4.884257, 52.375254 4.883981, 
                                  52.375109 4.883850, 52.375075 4.883819, 
                                  52.374979 4.884104, 52.374965 4.884143, 
                                  52.375035 4.884207, 52.375016 4.884263, 
                                  52.374996 4.884320, 52.374926 4.884255, 
                                  52.374901 4.884329, 52.375034 4.884451, 
                                  52.375108 4.884235))"
  }
}
```
The "Well Known Text" (WKT) encoding, itself defined in [SIMPLE-FEATURES], is extended by [GeoSPARQL] to include designation of the coordinate reference system used, which in turn determines the coordinate axis-order. The example above encodes the polygon as a [GeoSPARQL] wktLiteral data type, designating the coordinate reference system as <http://www.opengis.net/def/crs/EPSG/0/4326> (EPSG:4326) - WGS 84 Lat/Long.

Note

When using the wktLiteral datatype specified in [GeoSPARQL], the coordinate reference system URI may be omitted. In such a case, WGS 84 Long/Lat (urn:ogc:def:crs:OGC::CRS84) is used. Please refer to [GeoSPARQL] Requirement 11 for more details.

The Basisregistraties Adressen en Gebouwen (BAG - the Dutch "Basic Registers for Addresses and Buildings"), provided by Kadaster, uses this default behavior. Anne Frank's House is identified using the URI http://bag.basisregistraties.overheid.nl/bag/id/pand/0363100012169587. HTML, JSON, TTL and XML representations are available.

Note

It is worth noting that, in the [SIMPLE-FEATURES] definition of WKT, the coordinate axis-order is by default longitude / latitude, irrespective of the coordinate reference system used. The same applies to EWKT (Extended WKT), a PostGIS extension to WKT that is also supported by other GIS tools and that includes a parameter (SRID) for specifying the coordinate reference system.

For this reason, whenever using WKT to encode geometries, it is important that the reference WKT specification can be unambiguously determined.
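
As a minimal sketch of the [GeoSPARQL] convention described above, the function below separates the optional CRS URI from the WKT text of a wktLiteral and falls back to CRS84 when no URI is given. The function name and regular expression are assumptions for this example, and the default is written as the HTTP URI that [GeoSPARQL] gives for CRS84; the sketch does not parse or validate the geometry itself.

```python
import re

# HTTP URI form of the default CRS (WGS 84 Long/Lat) assumed for wktLiterals
# that carry no explicit CRS URI.
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<uri>" followed by whitespace, then the WKT text.
WKT_LITERAL = re.compile(r"^\s*(?:<(?P<crs>[^>]+)>\s+)?(?P<wkt>.+)$", re.DOTALL)


def split_wkt_literal(literal):
    """Return (crs_uri, wkt_text) for a GeoSPARQL wktLiteral lexical value."""
    match = WKT_LITERAL.match(literal)
    if match is None:
        raise ValueError("not a wktLiteral")
    crs = match.group("crs") or CRS84
    return crs, match.group("wkt").strip()


# Explicit CRS: EPSG:4326, so the axis order is Lat/Long.
explicit = split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT (52.37514 4.88412)"
)
# No CRS given: CRS84 applies, so the axis order is Long/Lat.
implicit = split_wkt_literal("POINT (4.88412 52.37514)")
```

Both literals describe the same position; only the CRS, and therefore the axis order, differs.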

How to Test

For a given spatial data publication, check that users can find information about the coordinate axes, their order and unit of measurement, plus the datum used.

Evidence

Relevant requirements: R-DeterminableCRS, R-CRSDefinition, R-GeoreferencedData, R-LinkingCRS.

Benefits

Comprehension
Processability
Reuse
Interoperability

12.2.3 Relative positioning

Sometimes instead of using geometry and coordinates to describe a location, we want or need to describe it in relation to another location. In that case relative positioning can be used.

Best Practice 9: Describe relative positioning

Provide a relative positioning capability in which one entity can be positioned relative to another entity.

Why

Geocentric coordinate reference systems describe position relative to the earth itself. It can also be valuable or even necessary to describe the position of an entity relative to a second entity. In some cases, this is a navigation convenience, for example a tour kiosk might be described as located between the Boston Common Frog Pond and the Park Street T entrance, or in one's lower left view when looking up at the Statehouse. In other cases of moving or generalized entities, it may be that the entity can only usefully be given a relative position. For example, a package is reported left on seat 32L1 on the #59 bus, or part number PRG5460 is always located at position (51, 73, 3) in Acme warehouses.

Intended Outcome

It should be possible to describe the location of an entity in relation to one or more other entities or places, instead of specifying its own geocentric position or geometry.

The relative positioning descriptions should be machine-interpretable and/or human-readable as required by the intended application. The positions and/or geometries of reference entities, if available, should be retrievable through their link relations.

Possible Approach to Implementation

Positioning of one entity (A) relative to another referenced entity (B) is a combination of two factors: the referencing target, and the means of relative positioning. "Geocentric" referencing targets the planet itself or at least a fixed point on it. "Allocentric" referencing targets another entity. "Egocentric" referencing targets a particular field of view of an observer or camera. Positioning can take the form of a complete coordinate reference system (e.g. engineering CRS), a qualitative relation such as "beside", or a quantitative relation such as "30m northwest".

Combinations of relative positioning means and references
	Engineering CRS	Qualitative Relation	Quantitative Relation
Geocentric	Coordinate position A relative to a fixed earth datum	Not Applicable	Not Applicable
Allocentric	Coordinate position A relative to a fixed, mobile, or generic entity B	A "next to" B	A "20m south" of B
Egocentric	Coordinate position A within field of view B	A in "lower left corner" of field of view B	A "30 deg right of center" in field of view B

Descriptions of the positions of entities as explicit links to target entities.
Semantic descriptions of the target entities and type of positioning.
Encodings of the specific entity relations or the relative coordinate positions in the case of an engineering CRS.
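
Where the reference entity B has a known WGS 84 position, a quantitative allocentric relation such as A "20m south" of B can be resolved to an approximate position for A. The sketch below does this with a simple spherical approximation that is only reasonable for short distances; the function name, the choice of mean Earth radius and the use of a bearing in degrees clockwise from north are all assumptions made for this example.

```python
import math

# Mean Earth radius in meters; using a sphere is an approximation.
EARTH_RADIUS_M = 6371008.8


def offset_position(lat_deg, lon_deg, distance_m, bearing_deg):
    """Approximate the position distance_m away from (lat, lon) along
    bearing_deg (degrees clockwise from north). Short distances only."""
    bearing = math.radians(bearing_deg)
    d_lat = distance_m * math.cos(bearing) / EARTH_RADIUS_M
    d_lon = distance_m * math.sin(bearing) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat_deg))
    )
    return lat_deg + math.degrees(d_lat), lon_deg + math.degrees(d_lon)


# Entity A is "20m south" (bearing 180 degrees) of reference entity B.
b_lat, b_lon = 52.37514, 4.88412
a_lat, a_lon = offset_position(b_lat, b_lon, 20.0, 180.0)
```

Publishing the relation itself (the link to B, the distance and the direction) remains preferable to publishing only the derived coordinates, since the derived position becomes stale if B moves.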

How to Test

Check that, when positions of entities are described as relative to other entities, these descriptions can be interpreted by a machine as well as humans, and the positions of the reference entities can be retrieved through their link relations.

Evidence

Relevant requirements: R-MachineToMachine, R-SamplingTopology.

Benefits

Processability

12.2.4 Spatial links

The fundamentals of links and how they are encoded are described in section 12.1.3 Linking data. This section provides advice on the resources to use as the source and target of links in spatial data, and the common categories of link relation types that might be used.

Best Practice 10: Use appropriate relation types to link Spatial Things

Ensure that hyperlinks between Spatial Things and related resources use appropriate semantics.

Why

Geography is often described as the "glue that binds Linked Data"; the links between Spatial Things - and between other resources and Spatial Things - describe how the world around us is structured and interrelated and form an important facet of the Web of Data.

Spatial relationships can often be derived mathematically based on geometry - but this can be computationally expensive. Topological relationships such as these can be asserted, thereby removing the need to do geometry-based calculations. A useful secondary benefit is that these relationships are easier for humans to understand!

Different authorities and agencies seek to describe the world around them by publishing spatial data, and in doing so each mints its own URIs (as recommended in Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things). Where Spatial Things are of common interest to multiple agents, it is almost inevitable that a given Spatial Thing will end up being identified with several URIs. Given necessary due diligence, multiple identifiers may be linked, thereby supporting combination of multiple sets of information and yielding new perspectives on Spatial Things.

Application domains often require Spatial Things to be related; to convey the correct meaning, specific link relation types need to be used.

Intended Outcome

Spatial Things are related to other resources in the Web of data using links with appropriate semantics.

Possible Approach to Implementation

Before examining the link relation types that might be used in spatial data, let's consider what we should link to.

Link to the Spatial Thing.

The geometry description or extent of a Spatial Thing may be expressed using an object with its own URI. For example:
Example 35: Independently identified geometry extent for the City of Edinburgh Council Area (TTL format)
```
@prefix rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:      <http://www.w3.org/2000/01/rdf-schema#> .
@prefix admingeo:  <http://data.ordnancesurvey.co.uk/ontology/admingeo/> .
@prefix geom:      <http://data.ordnancesurvey.co.uk/ontology/geometry/> .

<http://data.ordnancesurvey.co.uk/id/7000000000030505>
  a admingeo:District ;
  rdfs:label "City of Edinburgh" ;
  geom:extent <http://data.ordnancesurvey.co.uk/id/geometry/30505-11> .

<http://data.ordnancesurvey.co.uk/id/geometry/30505-11>
  a geom:AbstractGeometry ;
  geom:asGML "<gml:MultiPolygon>...</gml:MultiPolygon>"^^rdf:XMLLiteral ;
  geom:hectares 27300.411 .
```
As can be seen in the example above, the geometry 30505-11 is an attribute of the City of Edinburgh. If your intent is to make a statement about, or refer to, the real-world entity then make sure you link to the Spatial Thing rather than the geometry. Furthermore, note that the geometry record may be updated and re-published with a new identifier (for example, if the city boundary were resurveyed), in which case a link to the old geometry would be broken.

Data publishers should also be aware of a common pattern used in the publication of Linked Data, where the Spatial Thing and the information resource that describes it are identified separately - often, but not always, using /id as part of the URI for the Spatial Thing, and /doc for the corresponding page/document/record. When the URI for the Spatial Thing is dereferenced, an HTTP 303 (See Other) response is used to redirect the browser to the page/document/record URL. For example:
- http://statistics.gov.scot/id/statistical-geography/S12000036 redirects to http://statistics.gov.scot/doc/statistical-geography/S12000036
- http://dbpedia.org/resource/Anne_Frank_House redirects to http://dbpedia.org/page/Anne_Frank_House
While this disambiguation has its advantages, it often seems to confuse users (and even some experts). Be aware of this redirect pattern, and make sure you use the correct URI i.e. the identifying one — especially if you're copying the URI from a browser's address bar which usually ends up showing the page/document/record URL.
Link to Spatial Things from popular repositories.

Linking with URIs from popular repositories may improve discoverability of your data. Not only does this provide users with better context by enabling them to browse the information published by the popular repository, it also helps relate your data with datasets from other parties who have also used those URIs as points of reference.

There are many popular repositories containing sets of identifiers for Spatial Things; the following list suggests the primary sources worth checking:
- GeoNames
- Wikidata
- DBpedia
- National open spatial datasets, such as those made available by, for example, the UK and Dutch governments.
Finding out which national open spatial datasets are available, and how they can be accessed, currently requires some insider knowledge — in most cases because these datasets are often not easily discoverable. Look for national data portals / geoportals such as Nationaal Georegister (Dutch national register of spatial datasets) or Dataportaal van de Nederlandse overheid (Dutch national governmental data portal).

Once you've found well-known URIs for Spatial Things that you want to link to, proceed to create links using properties such as those described above — owl:sameAs (if you're careful!) and geosparql:sfWithin, or perhaps qualitative relationships like geonames:nearby or the proposed schema:samePlaceAs (see related discussion in section 13.6 Defining that two places are the same).

However, don't try to make links to everything. It is not always feasible to link your Spatial Things to well-known resources. For example, if you were maintaining a registry of cultural heritage in Amsterdam, it would be reasonably simple to look up identifiers for the city's 50 or so museums and map these to your Spatial Things. But it would be a huge task for, say, a topographic mapping agency to cross-reference their entire catalogue of named places containing tens of thousands of Spatial Things with third-party resources (although in the spirit of crowd-sourcing, if someone else found those links useful, they may take on the task of relating the Spatial Things and publishing those relationships to the Web as a complementary resource!). In essence, you should only create the data that you have the resources to maintain.

Now, let's take a look at link relation types that may be applicable to spatial data. These fall into three broad categories: spatial relations, equality relations and domain-specific relations.

In this best practice document, we cannot cover all the possible vocabularies and ontologies that provide link relation types for spatial data. Other than a few areas of specific guidance, we are not recommending specific vocabularies for spatial linking. Instead, we hope to have introduced patterns that show the types of spatial linking that might be used and leave it to spatial data publishers to determine which specific vocabulary best suits their purpose. In this regard, [DWBP] section 8.9 Data Vocabularies and, in particular, [DWBP] Best Practice 15: Reuse vocabularies, preferably standardized ones are highly relevant.

Also, readers should note that there will often be value in linking Spatial Things with multiple relationships, each of which provides different semantics. Having identified your intended user communities and the vocabularies that they commonly use, choose those link relation types that meet their specific needs, and then add more generalized link relation types to support broader reuse of your data.

However, data publishers should only assert those relationships that they know about and that they think will be of interest to their user community. Don't try to cover all possible requirements! That said, publishers should try to avoid making assumptions about what the user may or may not know. For example, users may lack the expertise or resources to calculate a topological relationship, or lack the domain knowledge to determine how two Spatial Things are related, if at all. As the data publisher, you are likely to be in a better position to make these judgements than the user — so help them out by making these relationships clear.

Spatial relationships

Topological relationships between Spatial Things can be computed based on assessment of their geometry. [GeoSPARQL] defines families of topological relationships (based on the DE-9IM pattern) that, in mathematical terms, specify the spatial dimension of the intersections of the interiors, boundaries and exteriors of two geometric objects that may be 2-dimensional (e.g. area), 1-dimensional (e.g. linear) or 0-dimensional (e.g. point).

Most commonly used is the simple features relation family, described in [SIMPLE-FEATURES] section 6.1.15.3 Named spatial relationship predicates based on the DE-9IM. The set of eight named relationships, or spatial predicates, and their associated [GeoSPARQL] properties are listed below:
- Equals — geosparql:sfEquals
- Disjoint — geosparql:sfDisjoint
- Touches — geosparql:sfTouches
- Crosses — geosparql:sfCrosses
- Within — geosparql:sfWithin
- Contains — geosparql:sfContains
- Intersects — geosparql:sfIntersects
- Overlaps — geosparql:sfOverlaps
We recommend use of the Simple Features relation families for describing topological relations between points, lines and areas. Further details are provided in [GeoSPARQL] section 7 Topology Vocabulary Extension.
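As a minimal sketch of how the named predicates relate to DE-9IM, the following Python matches an intersection matrix (a nine-character string) against the patterns commonly given for a subset of the simple features predicates. The names `SF_PATTERNS`, `de9im_matches` and `named_relations` are illustrative, the Crosses pattern shown applies only to two lines, and the matrix is assumed to have been computed from the geometries beforehand (for example by a spatial library). The patterns should be checked against [GeoSPARQL] before being relied upon.

```python
# DE-9IM matrix cells, in order: II, IB, IE, BI, BB, BE, EI, EB, EE
# (I = interior, B = boundary, E = exterior of the two geometries).
# Cell values: "F" (empty) or the dimension of the intersection: "0", "1", "2".

# Patterns for a subset of the simple features predicates.
# "T" = any non-empty intersection, "F" = empty, "*" = don't care.
SF_PATTERNS = {
    "sfEquals": ["TFFFTFFFT"],
    "sfDisjoint": ["FF*FF****"],
    "sfTouches": ["FT*******", "F**T*****", "F***T****"],
    "sfWithin": ["T*F**F***"],
    "sfContains": ["T*****FF*"],
    "sfCrosses": ["0********"],  # assumption: both geometries are lines
}


def de9im_matches(matrix: str, pattern: str) -> bool:
    """Return True if a DE-9IM matrix string satisfies a pattern string."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must have exactly 9 characters")
    for cell, wanted in zip(matrix, pattern):
        if wanted == "*":
            continue
        if wanted == "T":
            if cell not in "012":
                return False
        elif cell != wanted:
            return False
    return True


def named_relations(matrix: str) -> list:
    """List the predicates (from SF_PATTERNS, plus sfIntersects) that hold."""
    found = [
        name
        for name, patterns in SF_PATTERNS.items()
        if any(de9im_matches(matrix, p) for p in patterns)
    ]
    # Intersects is the negation of Disjoint.
    if "sfDisjoint" not in found:
        found.append("sfIntersects")
    return found


# Two lines whose interiors meet in a single point, such as a bridge and a canal.
print(named_relations("0F1FF0102"))  # ['sfCrosses', 'sfIntersects']
```

Several predicates can hold at once; two equal lines, for instance, also satisfy the Within and Contains patterns.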
Example 36: Asserting topological relationship 'crosses' (JSON format)
```
<script type="application/hal+json">
{
  "ex:type-nl": "brug",
  "ex:type-en": "bridge",
  "ex:name": "Leliesluis",
  "_links": {
    "self": { "href" : "http://data.example.org/topo/ams/brug/Leliesluis" },
    "curies": [ 
      { 
        "name": "geosparql", 
        "href": "http://www.opengis.net/ont/geosparql#{rel}", 
        "templated": true 
      } , {
        "name": "ex",
        "href": "http://data.example.org/def/topo#{rel}",
        "templated": true
      } 
    ],
    "geosparql:sfCrosses": { "href" : "http://data.example.org/topo/ams/kanaal/Prinsengracht" }
  }, 
  "_embedded": {
    "ex:type-nl": "kanaal",
    "ex:type-en": "canal",
    "ex:name": "Prinsengracht",
    "_links": {
      "self": { "href" : "http://data.example.org/topo/ams/kanaal/Prinsengracht" }
    }
  }
}
</script>
```
The example above uses the Hypertext Application Language (HAL) conventions for expressing hyperlinks in JSON [RFC7159]. It illustrates how one would use geosparql:sfCrosses to indicate that two linear Spatial Things, a bridge and a canal, cross each other.
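To show how a client might interpret such a document, the sketch below expands the compact link relation name into the full URI of the link relation type using the `curies` templates. It is a simplified illustration: the function names are hypothetical, the document is a condensed copy of the example, and each relation is assumed to hold a single link object rather than an array.

```python
import json

hal_document = json.loads("""
{
  "ex:name": "Leliesluis",
  "_links": {
    "self": { "href": "http://data.example.org/topo/ams/brug/Leliesluis" },
    "curies": [
      { "name": "geosparql",
        "href": "http://www.opengis.net/ont/geosparql#{rel}",
        "templated": true }
    ],
    "geosparql:sfCrosses": {
      "href": "http://data.example.org/topo/ams/kanaal/Prinsengracht"
    }
  }
}
""")


def expand_relation(document: dict, relation: str) -> str:
    """Expand a compact name such as 'geosparql:sfCrosses' to a full URI."""
    prefix, separator, reference = relation.partition(":")
    if not separator:
        return relation  # no prefix, e.g. "self"
    for curie in document.get("_links", {}).get("curies", []):
        if curie.get("name") == prefix:
            return curie["href"].replace("{rel}", reference)
    return relation  # unknown prefix: leave unchanged


def typed_links(document: dict) -> list:
    """Return (relation URI, target URI) pairs, skipping 'self' and 'curies'."""
    pairs = []
    for relation, link in document.get("_links", {}).items():
        if relation in ("self", "curies"):
            continue
        pairs.append((expand_relation(document, relation), link["href"]))
    return pairs


print(typed_links(hal_document))
```

The expanded relation URI is what allows an application to look up the definition of the link relation type and decide how to interpret the hyperlink.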

Note

The spatial predicates specified in [GeoSPARQL] describe 2-dimensional topological relations. There is no evidence of common practice for describing 3-dimensional topological relationships.

In addition to the mathematically precise spatial predicates described above, several vocabularies define similar relationships but without the formal mathematical underpinning. For example, [SCHEMA-ORG] defines a pair of basic containment relationships for use with schema:Place:
- schema:containsPlace: The basic containment relation between a place and another that it contains.
- schema:containedInPlace: The basic containment relation between a place and one that contains it.
It is also commonplace to use spatial relationships to convey distance (e.g. at, nearby or far-away) and direction (e.g. left, inFrontOf, astern and below). However, we find no evidence that points to use of common vocabularies to express these relationships - perhaps because these relationships are often subjective and dependent on application context (e.g. the meaning of “near” will be quite different between an endurance cycling App and the App I use to find the Bluetooth tag attached to my house keys!).

Two notable examples of distance relations are:
- foaf:based_near which states "We do not say much about what 'near' means in this context; it is a 'rough and ready' concept."; and
- geonames:nearby which simply states, "A feature close to the reference feature".
Example 37: Asserting distance spatial relationship using GeoNames ontology (GeoJSON [RFC7946] format)
```
<script type="application/geo+json">
{
  "id" : "http://sws.geonames.org/6618987/",
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [ [ ... ] ]
  },
  "properties": {
    "http://www.geonames.org/ontology#name": "Anne Frank's House",
    "http://www.geonames.org/ontology#nearby" : 
      [ "http://sws.geonames.org/6950949/",
        "http://sws.geonames.org/6951798/",
        "http://sws.geonames.org/6944503/",
        ... ]
  }
}
</script>
```
This example snippet, adapted to use the GeoJSON [RFC7946] format, shows a list of Spatial Things (e.g. Westerkerk, Homomonument and Westertoren) that are deemed 'nearby' Anne Frank's House according to GeoNames.
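The context dependence of 'nearby' can be made concrete with a small sketch. The code below computes a great-circle distance (haversine formula on a spherical Earth) and applies a different threshold per application. The thresholds, application names and the two positions are purely illustrative assumptions; no vocabulary discussed here defines them.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical thresholds: what "near" means depends on the application.
NEAR_THRESHOLD_M = {
    "key-finder": 10,
    "city-walk": 500,
    "endurance-cycling": 20000,
}


def is_near(application, position_a, position_b):
    """Apply the application's own threshold to the computed distance."""
    distance = haversine_m(*position_a, *position_b)
    return distance <= NEAR_THRESHOLD_M[application]


visitor = (52.37590, 4.88452)  # (lat, lon), illustrative position in Amsterdam
bridge = (52.37608, 4.88435)   # (lat, lon), illustrative position a few tens of metres away

for application in NEAR_THRESHOLD_M:
    print(application, is_near(application, visitor, bridge))
```

The same pair of positions is 'near' for two of the applications and not for the third, which is why a single shared definition of nearness is hard to standardize.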

Note

The JSON [RFC7159] format provides only simple primitive types: string, number, boolean, etc. The lack of a datatype for URIs means that they must be encoded as strings. As such, conventions (such as those defined in HAL) are required to tell applications that a given string value is a URI. However, GeoJSON [RFC7946] does not define any conventions for describing URIs and forbids any extension of the data format specification.

To mitigate this, details about object types etc. included in the data payload should be provided in the documentation for the API or service end-point from which the data is accessed. See [DWBP] Best Practice 25: Provide complete documentation for your API for further details.
Synonyms and equality

As described above, it is not uncommon for a Spatial Thing to be identified using more than one URI (also known as the "non-unique naming problem"). If you think that this is the case, the property owl:sameAs may be used to express this. However, caution is advised as owl:sameAs is an extremely strong statement; literally "these two URIs identify the same resource". As there is only one Spatial Thing, all the properties and attributes returned when resolving any of the equated URIs are considered to apply to that Spatial Thing. Given that spatial data is often published by different parties, each concerned with their own perspective, equality between Spatial Things is often difficult to determine and depends heavily on the semantics involved.

So, the advice is: if in doubt, don't use owl:sameAs.

By way of example, let's explore some data for Edinburgh.

The City of Edinburgh Council Area (i.e. the geographical area that Edinburgh City Council is responsible for) is identified by the Office for National Statistics (the recognized national statistical institute of the UK) using its GSS code (a 9-character alphanumeric identifier) S12000036 and the URI http://statistics.data.gov.uk/id/statistical-geography/S12000036. At the same time, the devolved government in Scotland, operating under its own jurisdiction, retains the GSS code but uses the URI http://statistics.gov.scot/id/statistical-geography/S12000036. Furthermore, Ordnance Survey maintains yet another URI for the City of Edinburgh Council Area as part of its 'Boundary Line' service that contains administrative and statistical geography areas in the UK: http://data.ordnancesurvey.co.uk/id/7000000000030505. Similarly, GeoNames identifies Edinburgh, a second-order administrative division, as http://sws.geonames.org/2650225/. All of these URIs refer to the same Spatial Thing and are equated using owl:sameAs.
Example 38: Asserting equality between URIs for the City of Edinburgh Council Area (TTL format)
```
@prefix owl:          <http://www.w3.org/2002/07/owl#> .
@prefix scotgov-stat: <http://statistics.gov.scot/id/statistical-geography/> .
@prefix ukgov-stat:   <http://statistics.data.gov.uk/id/statistical-geography/> .
@prefix osuk:         <http://data.ordnancesurvey.co.uk/id/> .
@prefix geonames:     <http://sws.geonames.org/> .

scotgov-stat:S12000036 owl:sameAs ukgov-stat:S12000036 .
osuk:7000000000030505 owl:sameAs ukgov-stat:S12000036 .
geonames:2650225 owl:sameAs ukgov-stat:S12000036 .
```
Also note that in this [TURTLE] snippet one could easily include additional properties to help users determine whether the link is worth traversing, such as providing human-readable labels and specifying the type designated by each data publisher.
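The strength of owl:sameAs can be illustrated with a short sketch. Because the property is symmetric and transitive, the three statements above place all four URIs in one equivalence class, and every statement made about any of them then applies to the single Spatial Thing. The code below is an illustrative stand-in for what a reasoner would do; the function names and the "publisher" values are assumptions made for the example, not data retrieved from the services.

```python
def same_as_classes(pairs):
    """Group URIs into sets connected by owl:sameAs (symmetric, transitive)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in pairs:
        parent[find(left)] = find(right)

    classes = {}
    for uri in list(parent):
        classes.setdefault(find(uri), set()).add(uri)
    return list(classes.values())


def merged_description(uri, pairs, statements):
    """Collect every statement made about any URI equated with `uri`."""
    group = {uri}
    for candidate in same_as_classes(pairs):
        if uri in candidate:
            group = candidate
            break
    merged = {}
    for member in group:
        for prop, value in statements.get(member, {}).items():
            merged.setdefault(prop, set()).add(value)
    return merged


UK = "http://statistics.data.gov.uk/id/statistical-geography/S12000036"
SCOT = "http://statistics.gov.scot/id/statistical-geography/S12000036"
OS = "http://data.ordnancesurvey.co.uk/id/7000000000030505"
GN = "http://sws.geonames.org/2650225/"

same_as = [(SCOT, UK), (OS, UK), (GN, UK)]

# Illustrative statements only: one "publisher" value per URI.
statements = {
    UK: {"publisher": "Office for National Statistics"},
    SCOT: {"publisher": "Scottish Government"},
    OS: {"publisher": "Ordnance Survey"},
    GN: {"publisher": "GeoNames"},
}

print(merged_description(GN, same_as, statements))
```

Asking about the GeoNames URI returns the statements from all four publishers, which is exactly why an incorrect owl:sameAs assertion can silently mix descriptions of different things.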

In contrast, the resource identified by http://data.os.uk/id/4000000074558316 defines the named place Edinburgh - a colloquial definition for the city itself. This is not the same as the City of Edinburgh Council Area and therefore use of the owl:sameAs relationship is inappropriate.

Note

Determining whether the information provided when resolving two or more URIs does indeed describe the same Spatial Thing is a complex topic in its own right and way beyond the scope of this best practice document. Tools such as Open Refine and the Silk Linked Data Integration Framework are designed to work with, transform and integrate heterogeneous data sources. Their documentation may provide further insight regarding these challenges.

Given the very strong semantics of the owl:sameAs property, alternative properties with weaker semantics are commonly used. Examples include:
- schema:sameAs defined by [SCHEMA-ORG] whose description states:
  
  URL of a reference Web page that unambiguously indicates the item's identity. E.g. the URL of the item's Wikipedia page, Freebase page, or official website.
- ov:similarTo defined by Open.vocab.org, with the description:
  
  Having two things that are not the owl:sameAs but are similar to a certain extent. It is thought of being used where owl:sameAs is too strong but rdfs:seeAlso is too loose.
- http://www.bbc.co.uk/ontologies/coreconcepts/sameAs, defined by the BBC, whose description states that the property:
  
  Indicates that something is the same as something else, but in a way that is slightly weaker than owl:sameAs. Its purpose is to connect separate identities of the same thing, whilst keeping separation between the original statements of each.
All of the properties listed above are concerned with equality or similarity of the resources themselves. However, we often want to talk about the similarity of Spatial Things in terms of location or place. Spatial relations (see above) can be used to describe how locations are related, either using rigorous topological relationships derived from geometry, such as geosparql:sfEquals, or ones without formal mathematical underpinning, such as geonames:nearby. But place is a social concept that reflects how we humans perceive the space around us, often with a vague or imprecise notion of location; you can’t always define a boundary for a place like The Sahara because not everyone agrees where its edge lies!

Talking of places, the City of Edinburgh [Administrative] Area and Edinburgh the named place are strongly related; you might say that they are the same place if that makes sense for your application. This also provides an example where it is worthwhile to provide multiple relationships between Spatial Things: Ordnance Survey uses the within link relation type to relate the named place Edinburgh and the City of Edinburgh administrative area. within complements a qualitative same-place-as relation between two places.

However, while we see people wanting to assert such qualitative same-place-as relationships based on human perception of place, there is no evidence of a best practice in how to achieve this; see section 13.6 Defining that two places are the same for more details about possible approaches that could be adopted.
Domain-specific relationships involving Spatial Things

In addition to the spatial relationships that are applicable to a wide variety of domains, there are a huge number of cases where asserting a relationship between Spatial Things is useful. Clearly, enumerating all these cases is more than we can do here - but we can look at some of those that commonly occur.

First, there are the properties used to describe relationships between Spatial Things in a gazetteer. These properties are often used in combination with spatial predicates to describe the relationship between administrative units. For example, Ordnance Survey define specific properties to describe the relationships between the administrative units used within the UK: county, district, ward, etc.
Example 39: Asserting gazetteer relationships for the City of Edinburgh Council Area (TTL format)
```
@prefix rdfs:      <http://www.w3.org/2000/01/rdf-schema#> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix admingeo:  <http://data.ordnancesurvey.co.uk/ontology/admingeo/> .

<http://data.ordnancesurvey.co.uk/id/7000000000030505>
  a admingeo:District ;
  rdfs:label "City of Edinburgh" ;
  admingeo:gssCode "S12000036" ;
  admingeo:ward
    <http://data.ordnancesurvey.co.uk/id/7000000000043412> , 
    <http://data.ordnancesurvey.co.uk/id/7000000000043415> , 
    <http://data.ordnancesurvey.co.uk/id/7000000000043411> ,
    ... ;
  geosparql:sfTouches
    <http://data.ordnancesurvey.co.uk/id/7000000000036552> , 
    <http://data.ordnancesurvey.co.uk/id/7000000000030509> , 
    <http://data.ordnancesurvey.co.uk/id/7000000000030634> , 
    <http://data.ordnancesurvey.co.uk/id/7000000000030632> ;
  ...
  .
```
The example snippet above, provided in [TURTLE] format, shows the relationships between the City of Edinburgh district and the electoral wards it contains. Also note the complementary use of geosparql:sfTouches to relate the City of Edinburgh to its adjacent districts: Midlothian, West Lothian etc.

A second domain where relationships between Spatial Things and non-spatial resources occur is earth observing. The example below, provided in [GML], relates a monitoring point at Deddington on the Nile River, Tasmania, to the sensor that is deployed there (using the sams:hostedProcedure property) and relates that monitoring point to the waterbody whose properties are being measured (using the sam:sampledFeature property). Here, the links are defined using [XLINK11].
Example 40: Asserting spatial relationship for sensing/earth observation using [XLINK11] and [GML]
```
<wml2:MonitoringPoint gml:id="xsd-monitoring-point.example"
  xmlns:wml2="http://www.opengis.net/waterml/2.0"
  xmlns:gml="http://www.opengis.net/gml/3.2" 
  xmlns:sam="http://www.opengis.net/sampling/2.0"
  xmlns:sams="http://www.opengis.net/samplingSpatial/2.0"
  xmlns:xlink="http://www.w3.org/1999/xlink">
  <gml:description>Hydrological monitoring point for Nile river at 
    Deddington, South Esk catchment, Tasmania</gml:description>
  <gml:identifier codeSpace="http://www.example.com/">
    http://www.example.com/catchment/south-esk/mpoint/deddington
  </gml:identifier>
  <sam:sampledFeature xlink:href="http://sws.geonames.org/2155327/" 
    xlink:title="Nile river"/> 
  <sams:shape>
    <gml:Point gml:id="location_deddington">
      <gml:pos srsName="urn:ogc:def:crs:EPSG::4326">
        -41.814935 147.568517
      </gml:pos> 
    </gml:Point>
  </sams:shape>
  <sams:hostedProcedure>
    <wml2:ObservationProcess gml:id="sensor:4c40fd3acdbf">
      <wml2:processType xlink:href="http://www.opengis.net/def/waterml/2.0/processType/Sensor" 
        xlink:title="Sensor"/>
      <wml2:processReference xlink:href="http://www.example.com/sensor/00d97bbc-77ca-4b3d-91ca-4c40fd3acdbf/conf/1489405706" 
        xlink:title="Sensor configuration (updated:2017-03-13)"/>
    </wml2:ObservationProcess>
  </sams:hostedProcedure>
  ...
</wml2:MonitoringPoint>
```
For further information about sensors, sampling, observations and measurements, please refer to [OandM] and [VOCAB-SSN].
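As an illustration of how an application might follow the links in such a document, the sketch below uses Python's standard `xml.etree.ElementTree` module to list every element carrying an `xlink:href` attribute. The XML is a condensed copy of the example above, and the function name `xlinks` is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
XLINK_TITLE = "{http://www.w3.org/1999/xlink}title"

GML_SNIPPET = """
<wml2:MonitoringPoint gml:id="xsd-monitoring-point.example"
  xmlns:wml2="http://www.opengis.net/waterml/2.0"
  xmlns:gml="http://www.opengis.net/gml/3.2"
  xmlns:sam="http://www.opengis.net/sampling/2.0"
  xmlns:sams="http://www.opengis.net/samplingSpatial/2.0"
  xmlns:xlink="http://www.w3.org/1999/xlink">
  <sam:sampledFeature xlink:href="http://sws.geonames.org/2155327/"
    xlink:title="Nile river"/>
  <sams:hostedProcedure>
    <wml2:ObservationProcess>
      <wml2:processType
        xlink:href="http://www.opengis.net/def/waterml/2.0/processType/Sensor"
        xlink:title="Sensor"/>
    </wml2:ObservationProcess>
  </sams:hostedProcedure>
</wml2:MonitoringPoint>
"""


def xlinks(xml_text):
    """Return (element name, href, title) for every element with xlink:href."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for element in root.iter():
        href = element.get(XLINK_HREF)
        if href is not None:
            found.append((element.tag, href, element.get(XLINK_TITLE)))
    return found


for tag, href, title in xlinks(GML_SNIPPET):
    print(tag, href, title)
```

The element name (here in `{namespace}local-name` form) plays the role of the link relation type, so the application can tell the sampled feature apart from the sensor's process type.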

Note

[GML] adopted the [XLINK11] standard to represent links between resources. At the time of adoption, XLink was the only W3C-endorsed standard mechanism for describing links between resources within XML [XML11] documents. The Open Geospatial Consortium anticipated broad adoption of XLink over time - and, with that adoption, provision of support within software tooling. While XML Schema, XPath, XSLT and XQuery etc. have seen good software support over the years, this never happened with XLink. The authors of [GML] note that, given the lack of widespread support, use of XLink within [GML] provided no significant advantage over the use of a bespoke mechanism tailored to the needs of [GML].

Our final example of a domain-specific relationship concerns creative works. For example, one may want to indicate the location a social media message was sent from. In the example below, we assume that Maurits, a tourist in Amsterdam, wants to comment on his visit to Anne Frank's House. His social media App uses the [GEOLOCATION-API] to determine his location (Lat=52.37590 and Long=4.88452) and suggests several places that Maurits might choose from in order to geo-tag his message. Maurits wants people to know roughly where he is, so he chooses "Amsterdam-Centrum" and presses 'send'. The App encodes the message in [SCHEMA-ORG] and pushes the message to the server for distribution. The geo-information is provided using the schema:locationCreated property.
Example 41: Asserting location that a creative work was created using [SCHEMA-ORG] ([JSON-LD] format)
```
<script type="application/ld+json">
{
  "@context" : {
    "@vocab" : "http://schema.org/"
  },
  "@id" : "http://app.example.com/message/867a52e3-6687-4471-b1f2-c7561673552e",
  "@type" : "Message",
  "sender" : { "@type" : "Person", "name" : "Maurits" },
  "datePublished" : "2017-03-12",
  "locationCreated" : {
    "@id" : "https://g.co/kg/m/0gh6_3j",
    "@type" : "Place",
    "name" : "Amsterdam-Centrum"
  }
}
</script>
```
If Maurits had wanted to indicate that the subject of the photograph he took moments later was Leliesluis bridge, then the following [SCHEMA-ORG] markup and schema:mainEntity property could be used:
Example 42: Asserting the subject of a creative work using [SCHEMA-ORG] ([JSON-LD] format)
```
<script type="application/ld+json">
{
  "@context" : {
    "@vocab" : "http://schema.org/"
  },
  "@id" : "http://app.example.com/user/Maurits/photo/e35f1132-461e-4acb-8a76-a5d622a85958",
  "@type" : "Photograph",
  "sender" : { "@type" : "Person", "name" : "Maurits" },
  "datePublished" : "2017-03-12",
  "mainEntity" : {
    "@id" : "http://data.example.org/topo/ams/brug/Leliesluis",
    "@type" : "Bridge",
    "name" : "Leliesluis bridge",
    "geo" : {
      "@type" : "GeoCoordinates",
      "longitude" : "4.88435",
      "latitude" : "52.37608"
    }
  }
}
</script>
```

How to Test

Check that hyperlinks use typed relationships, and that the definition of each link relation type can be located in order to determine how to interpret the hyperlink.

Check that the source and target of the hyperlink are Spatial Things, unless the link relation type definition indicates that this should be otherwise (e.g. when relating a Spatial Thing to its geometry).

Evidence

Relevant requirements: R-Linkability, R-MachineToMachine, R-SpatialRelationships, R-SpatialOperators.

Benefits

Comprehension
Processability
Linkability
Interoperability

12.2.5 Spatial data versioning

Spatial Things and their attributes can change over time. For example, a lake may grow or shrink due to changes in climate, water extraction or any number of other reasons. For many applications, it is important that information about Spatial Things is kept up to date. When new information is available, the data publisher may make this available on the Web according to their update schedule and policies. [DWBP] section 8.6 Data Versioning and Best Practice 21: Provide data up to date provide directly applicable guidance.

When dealing with change to a Spatial Thing, you should consider its lifecycle; in particular, how much change is acceptable before a Spatial Thing can no longer be considered as the same resource. Consider Eddystone Lighthouse for example: the “Eddystone Light”, a maritime navigation aid, has existed in (more or less) the same place on Eddystone Rocks since 1698. A single HTTP URI (such as http://dbpedia.org/resource/Eddystone_Lighthouse) is used to identify “the lighthouse on Eddystone rocks” for all that period. The lighthouse's attributes (such as its focal height, visible range and light characteristic) have changed over that period, but we still consider it to be the same lighthouse. However, if our interest is historic buildings, we would identify the four different structures that have stood on that site as different Spatial Things, from Winstanley's Eddystone Lighthouse (the first incarnation) to Douglass' Eddystone Lighthouse (the 4th and current incarnation). In that context, incremental change for these structures during the entire period from 1698 is not appropriate; one structure replaces another and so each structure should be assigned a unique identifier. In summary, different things are important to different people!

All that said, if you consider that the change affects the fundamental nature of the Spatial Thing, then you should assign a new identifier. See section 12.1.1 Spatial data identifiers for more details. Otherwise, read on for guidance on how to describe properties that change over time.

Best Practice 11: Provide information on the changing nature of spatial things

Spatial data should include metadata that allows a user to determine the time or time-period for which it is valid.

Why

Spatial Things and their attributes change over time. For Spatial Things, or any other resources, that change over time, it is important to provide metadata about the life cycle of those entities and the resources used to describe them. Given that information, data consumers can make considered choices about which resource they want to link to. Mostly, they are interested in current information. They need to be able to determine whether the published description of a Spatial Thing meets their needs. For example, is the published geographic extent of the City of Amsterdam relevant for a land-usage study of the nineteenth century? (Gemeentegeschiedenis.nl, "Municipality History", illustrates how the extent of Amsterdam has changed during the past 200 years, in HTML and GeoJSON). Where the information is available, a user may want to browse older versions of the published information to understand the nature of any changes or to find historical information.

Intended Outcome

Users are provided with the most recent version of information about a Spatial Thing and its attributes by default.

Users can determine the time-period for which data is applicable.

If a version history of changes is available, users can browse through a set of changes to see how a Spatial Thing and its attributes have changed over time.

Possible Approach to Implementation

When publishing information about a Spatial Thing that is subject to change, there are four approaches to consider in response to a change:

1. simply updating the description of the Spatial Thing;
2. republishing the entire dataset with a new URI;
3. providing a series of immutable snapshots that describe the Spatial Thing at various points in its lifecycle; and
4. capturing a time-series of data values within an attribute of the Spatial Thing.

Whichever approach is chosen, publishers of spatial data should consider how dataset metadata plays an important part in helping users determine whether a dataset is fit for their use. Particularly where the contents of a dataset change with time, statements about the (most recent) publication date, the frequency of update and the time-period for which the dataset is relevant (i.e. temporal extent) should be provided. Please refer to [DWBP] section 8.2 Metadata for more details about dataset metadata.
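As a rough sketch of such a check, the code below tests whether a dataset description carries the three temporal statements mentioned above. The property names come from the `dcterms` and `dcat` vocabularies; the dictionary layout, the example values and the function name are illustrative assumptions rather than a prescribed metadata profile.

```python
# Metadata keys a user needs in order to judge temporal fitness for use.
TEMPORAL_METADATA = {
    "dcterms:issued": "publication date",
    "dcterms:accrualPeriodicity": "frequency of update",
    "dcterms:temporal": "temporal extent",
}

# Hypothetical dataset description (all values are illustrative).
dataset = {
    "@type": "dcat:Dataset",
    "dcterms:title": "Example municipal boundaries",
    "dcterms:issued": "2016-01-01",
    "dcterms:accrualPeriodicity": "http://purl.org/cld/freq/annual",
    "dcterms:temporal": {
        "dcat:startDate": "2016-01-01",
        "dcat:endDate": "2016-12-31",
    },
}


def missing_temporal_metadata(metadata):
    """Return the names of the temporal statements that are absent or empty."""
    return [
        label
        for key, label in TEMPORAL_METADATA.items()
        if not metadata.get(key)
    ]


print(missing_temporal_metadata(dataset))                       # []
print(missing_temporal_metadata({"dcterms:issued": "2016-01-01"}))
```

A publisher could run a check of this kind before release; a consumer could use the same statements to filter out datasets that do not cover the period of interest.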

A description of the lifecycle of the Spatial Things (e.g. what triggers a change and whether those changes are versioned etc.) should also be provided in either the dataset's metadata, schema or specification. For example, the UK's Digital National Framework policy states that data publishers must provide these lifecycle rules.

Approach (1) is lightweight and should only be used where there are no user requirements that require access to older descriptions of the Spatial Things. Data publishers simply replace the old description of the Spatial Thing with the amended description and keep users informed about updates by providing the appropriate metadata (e.g. when the data was changed). This may be achieved using dataset metadata (as outlined above) or by including the metadata attributes in the description of each Spatial Thing.

Where users are anticipated to need to understand how a Spatial Thing has changed over time, approaches (2), (3) and (4) should be considered.

Approach (2) is a simple variant on approach (1); the difference being that the entire dataset is assigned a new URI when changes are made, thereby enabling older versions of the dataset to be addressed separately. See [DWBP] Best Practice 11: Assign URIs to dataset versions and series for further details. Using this approach, a user should be able to compare two versions of the dataset to determine what has changed. Although simple for data publishers, the downside of this approach is that the effort is passed on to the users.

Approach (3) requires the data publisher to publish immutable resources that describe the Spatial Thing at specific points in time (i.e. "snapshots") and provide a mechanism for users to browse between those snapshots. Effectively, the dataset becomes an accumulation of these snapshots that users can browse through. However, given that each snapshot of the Spatial Thing is published as a separate resource, this approach is suited to infrequent changes so that the number of snapshots does not become unwieldy.

The URI for the Spatial Thing, the base URI, should dereference to provide the current information and a link to its version history of snapshots. [DWBP] Best Practice 8: Provide version history describes how a version history may be implemented. Each snapshot resource within the version history must be uniquely identified; a common approach is to append a date/time stamp to the base URI as a version indicator. [DWBP] Best Practice 7: Provide a version indicator provides relevant guidance.
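The pattern described above might be sketched as follows. Each published snapshot receives its own URI, formed by appending an ISO 8601 date to the base URI, and the base URI returns the latest description together with links to the version history. The class, the property names `versionedUri`, `replaces` and `versionHistory`, and the example lake are assumptions made for illustration.

```python
from datetime import date


def versioned_uri(base_uri: str, version_date: date) -> str:
    """Append an ISO 8601 date to the base URI as the version indicator."""
    return f"{base_uri.rstrip('/')}/{version_date.isoformat()}"


class VersionHistory:
    """Accumulates snapshots of one Spatial Thing, oldest first."""

    def __init__(self, base_uri: str):
        self.base_uri = base_uri
        self.snapshots = []

    def publish(self, version_date: date, description: dict) -> str:
        """Store a snapshot under a new URI; earlier snapshots are left untouched."""
        uri = versioned_uri(self.base_uri, version_date)
        snapshot = dict(description, versionedUri=uri)
        if self.snapshots:
            snapshot["replaces"] = self.snapshots[-1]["versionedUri"]
        self.snapshots.append(snapshot)
        return uri

    def dereference_base(self) -> dict:
        """What the base URI returns: current description plus version links."""
        current = dict(self.snapshots[-1])
        current["versionHistory"] = [s["versionedUri"] for s in self.snapshots]
        return current


# Hypothetical Spatial Thing whose surface area changes between surveys.
history = VersionHistory("http://data.example.org/id/lake/1")
history.publish(date(2014, 1, 1), {"name": "Example lake", "areaKm2": 12.5})
history.publish(date(2016, 1, 1), {"name": "Example lake", "areaKm2": 11.8})

print(history.dereference_base())
```

Copying the description into a new dictionary on every call keeps stored snapshots from being altered through the returned object, which mirrors the requirement that published snapshots are immutable.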

Example 43: Changing boundary of Amsterdam

The extent of the City of Amsterdam has changed during the last 200 years. This example, based on Gemeentegeschiedenis.nl ("Municipality history") (condensed and changed to reflect the recommendations in this best practice), shows how the version history of Amsterdam's boundary can be provided as a series of immutable snapshots in GeoJSON [RFC7946].

The current information on Amsterdam including the current boundary:

{
  "uri": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam",
  "name": "Amsterdam",
  "inProvince": "Noord-Holland",
  "cbscode": "0363",
  "absorbed": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Weesperkarspel",
  "2016": {
    "type": "FeatureCollection",
    "features": [{
      "type": "Feature",
      "versionedUri": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2016",
      "replaces": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2014",
      "year": "2016",
      "geometry": {
        "type": "MultiPolygon",
        "coordinates": [...]
      }
    }]
  }
}

The previous boundary of Amsterdam:

{
  "2014": {
    "type": "FeatureCollection",
    "features": [{
      "type": "Feature",
      "versionedUri": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2014",
      "replacedBy": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2016",
      "replaces": "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam/2013",
      "year": "2014",
      "geometry": {
        "type": "MultiPolygon",
        "coordinates": [...]
      }
    }]
  }
}
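A client could browse such a version history by following the `replaces` links from the newest snapshot backwards. The sketch below assumes a simplified, flattened form of the snapshots and uses an in-memory dictionary in place of HTTP requests; `walk_history` and `SNAPSHOTS` are hypothetical names.

```python
BASE = "http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam"

# Stand-in for dereferencing snapshot URIs over HTTP (geometry omitted).
SNAPSHOTS = {
    f"{BASE}/2016": {"year": "2016", "replaces": f"{BASE}/2014"},
    f"{BASE}/2014": {
        "year": "2014",
        "replacedBy": f"{BASE}/2016",
        "replaces": f"{BASE}/2013",
    },
}


def walk_history(start_uri, fetch):
    """Yield (uri, snapshot) from newest to oldest by following 'replaces'."""
    uri = start_uri
    seen = set()
    while uri is not None and uri not in seen:
        seen.add(uri)  # guard against circular links
        snapshot = fetch(uri)
        if snapshot is None:
            break  # the linked snapshot is not available
        yield uri, snapshot
        uri = snapshot.get("replaces")


for uri, snapshot in walk_history(f"{BASE}/2016", SNAPSHOTS.get):
    print(snapshot["year"], uri)
```

The walk stops at 2014 because the 2013 snapshot is not present in this condensed stand-in, which is also how a client should behave when an older version has not been published.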

Approach (4) is suitable where a Spatial Thing has a small number of attributes that are frequently updated, for example the GPS position of a runner or data streamed from a sensor, such as the water level from a stream gauge.

With this approach, the description of the Spatial Thing must include a property that contains a sequentially-ordered set of data-points, each of which defines a time-stamp and the values for the time-varying attribute(s). By definition, this property can be considered as a time-series coverage. Standard data encodings are available for time-series data, including: [TIMESERIESML] for [GML], plus [COVERAGE-JSON] and the SensorThings API [SENSORTHINGS] for JSON [RFC7159]. [VOCAB-DATA-CUBE] provides a generic mechanism to express well-structured data, such as time series, in RDF [RDF11-PRIMER]. Although not yet widely used enough to be considered best practices, [EO-QB] and [QB4ST] (developed alongside this best practice Note within the Spatial Data on the Web Working Group) illustrate how [VOCAB-DATA-CUBE] may be used in this way.
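The data structure implied by this approach can be sketched independently of any particular encoding. In the example below, a Spatial Thing holds one attribute whose value is a sequentially ordered series of (time stamp, value) data points. The class, the stream gauge and its readings are hypothetical; none of the standards listed above prescribes this in-memory form.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class TimeSeries:
    """Sequentially ordered (time stamp, value) data points for one attribute."""

    def __init__(self):
        self._times = []
        self._values = []

    def append(self, timestamp, value):
        """Add a data point; time stamps must be strictly increasing."""
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("data points must be appended in time order")
        self._times.append(timestamp)
        self._values.append(value)

    def value_at(self, timestamp):
        """Return the latest value at or before `timestamp`, or None."""
        index = bisect_right(self._times, timestamp)
        return self._values[index - 1] if index else None

    def temporal_extent(self):
        """Return (first, last) time stamps, or None if the series is empty."""
        if not self._times:
            return None
        return self._times[0], self._times[-1]


def utc(hour, minute):
    """Helper: a UTC time stamp on an arbitrary example day."""
    return datetime(2017, 3, 13, hour, minute, tzinfo=timezone.utc)


# Hypothetical stream gauge with a time-varying water level in metres.
gauge = {
    "id": "http://data.example.org/id/gauge/1",
    "name": "Example stream gauge",
    "waterLevel": TimeSeries(),
}
gauge["waterLevel"].append(utc(9, 0), 1.42)
gauge["waterLevel"].append(utc(9, 15), 1.47)
gauge["waterLevel"].append(utc(9, 30), 1.51)

print(gauge["waterLevel"].value_at(utc(9, 20)))  # 1.47
```

Keeping the identity of the gauge separate from the series means the URI of the Spatial Thing stays stable while its readings accumulate, and the temporal extent of the series can be reported as metadata.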

Note

The OGC [MOVING-FEATURES-XML] and [MOVING-FEATURES-CSV] specifications follow the pattern described above. A trajectory element is used to describe the position of a Spatial Thing, and varying attributes (such as orientation or rotation) can be added alongside the tuples in the trajectory. However, there is limited evidence of adoption outside of Japan.

Example 44: Changing position of a runner traversing the Alps

This example shows a snippet of a file storing the changing GPS position of a runner traversing the Alps. The format is GPX, a common format for exchanging a series of GPS positions. For each track point, the coordinates as well as a timestamp are stored.

<gpx version="1.1">
  <trk>
    <name>Move</name>
    <trkseg>
      <trkpt lat="47.24239" lon="10.749514">
        <ele>784</ele>
        <time>2016-09-06T06:01:25.009Z</time>
      </trkpt>
      <trkpt lat="47.242403" lon="10.749489">
        <ele>784</ele>
        <time>2016-09-06T06:01:26.009Z</time>
      </trkpt>
      [...]
      <trkpt lat="46.968127" lon="10.870573">
        <ele>1677</ele>
        <time>2016-09-06T17:41:50.009Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
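To indicate how an application might consume such a time-series, the sketch below reads the track points with Python's standard library and derives the elapsed time and net change in elevation. It assumes, as in the condensed example, that the elements carry no XML namespace; GPX files found in practice usually declare one, in which case the element names would need to be qualified. The function name `track_points` is illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

GPX = """<gpx version="1.1">
  <trk>
    <name>Move</name>
    <trkseg>
      <trkpt lat="47.24239" lon="10.749514">
        <ele>784</ele>
        <time>2016-09-06T06:01:25.009Z</time>
      </trkpt>
      <trkpt lat="47.242403" lon="10.749489">
        <ele>784</ele>
        <time>2016-09-06T06:01:26.009Z</time>
      </trkpt>
      <trkpt lat="46.968127" lon="10.870573">
        <ele>1677</ele>
        <time>2016-09-06T17:41:50.009Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>"""


def track_points(gpx_text):
    """Return the track points as dictionaries, in document order."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iter("trkpt"):
        time_text = trkpt.findtext("time")
        points.append({
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "ele": float(trkpt.findtext("ele")),
            # Replace the trailing "Z" so older Python versions can parse it.
            "time": datetime.fromisoformat(time_text.replace("Z", "+00:00")),
        })
    return points


points = track_points(GPX)
elapsed = points[-1]["time"] - points[0]["time"]
climb = points[-1]["ele"] - points[0]["ele"]
print(len(points), "points;", elapsed, "elapsed;", climb, "m net elevation change")
```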

How to Test

Information about a given Spatial Thing, or set of Spatial Things, will be relevant for a particular time or time-period. Check that this information is stated.

Check that dataset metadata provides details about how often the dataset is updated; e.g. date of most recent publication, frequency of update.

If a version history of changes is available, check that links to previous versions are available.

If the Spatial Thing contains an attribute that varies with time, check that those attribute values are provided as a time-series.

Evidence

Relevant requirements: R-MachineToMachine, R-MovingFeatures, R-Streamable, R-CoverageTemporalExtent

Benefits

Comprehension
Trust
Access

12.3 Spatial data access

In recent years, we have seen widespread emergence of Web applications that use spatial data. Often these applications do not access all the spatial data they use via the Web. While there are good reasons for this, e.g. licensing restrictions, it is often the case, too, that the spatial data is not available via the Web at all, or in ways that application developers find too complex to use, or with insufficient or unclear quality-of-service commitments.

[DWBP] provides best practices discussing access to data using Web infrastructure (see [DWBP] section 8.10 Data Access). This section provides additional insight for publishers of spatial data.

Making data available on the Web requires data publishers to provide some form of access to the data. There are numerous mechanisms available, each providing varying levels of utility and incurring differing levels of effort and cost to implement and maintain. Publishers of spatial data should make their data available on the Web using affordable mechanisms to ensure long-term, sustainable access to their data.

When determining the mechanism to be used to provide Web access to data, publishers need to assess utility against cost. In order of increasing usefulness and cost:

Bulk-download or streaming of the entire or pre-defined subsets of a dataset
Generalized spatial data access API
Bespoke API designed to support a particular type of use

Let's take a closer look at these options.

The download of a dataset - or a pre-defined subset of it - via a single HTTP request is mainly covered by these [DWBP] best practices:

Providing bulk-download or streaming access to data is useful in any case and is relatively inexpensive to support as it relies on standard capabilities of Web servers for datasets that may be published as downloadable files stored on a server. However, this option is more complex for frequently changing datasets or real-time data.

[DWBP] Best Practice 18: Provide Subsets for Large Datasets explains why providing subsets is important and how this could be implemented. Spatial datasets, particularly coverages such as satellite imagery, sensor measurement time-series and climate prediction data, are often very large. In these cases, it is useful to provide subsets by having identifiers for conveniently sized subsets of large datasets that Web applications can work with.

Effectively, breaking up a large coverage into pre-defined lumps that you can access via HTTP GET requests is a very simple API.
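As a minimal sketch of this idea, the following Python function maps a position to the identifier of the pre-defined subset that contains it. The regular 10-degree grid, the `/tiles/{row}/{col}` URL pattern and the `example.org` base URL are assumptions made for illustration only; they are not defined by [DWBP] or by this document.

```python
import math


def subset_url(base_url, lon, lat, tile_size_deg=10.0):
    """Return the URL of the pre-defined tile that contains (lon, lat).

    Hypothetical scheme: the globe is cut into a regular grid of
    tile_size_deg x tile_size_deg cells, numbered from the south-west corner.
    """
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range")
    cols = int(round(360.0 / tile_size_deg))
    rows = int(round(180.0 / tile_size_deg))
    # Clamp so that lon=180 and lat=90 fall into the last tile.
    col = min(int(math.floor((lon + 180.0) / tile_size_deg)), cols - 1)
    row = min(int(math.floor((lat + 90.0) / tile_size_deg)), rows - 1)
    return f"{base_url}/tiles/{row}/{col}"
```

A client that knows the grid can then retrieve just the lump it needs with a single HTTP GET on the returned URL, without any query language.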

When a subset is provided, it should include information about its relationship to the complete dataset. In HTML, this could be descriptive text, or the relationship may be implicitly clear to humans from the way the subset is presented. In [SCHEMA-ORG], the schema:isPartOf property could be used. In RDF [RDF11-PRIMER], PROV-O could be used to describe the relationship between the subset and the complete dataset as well as the mechanism used to derive the subset. In ISO 19115 metadata, the LI_Lineage element may be used for a similar purpose.

The use of APIs to access data is covered in [DWBP] by the following best practices:

For spatial data, SDIs have long been used to provide generalized access to spatial data via Web services, typically using open standard specifications from the Open Geospatial Consortium (OGC). The main examples are Web Feature Service [WFS], Web Coverage Service [WCS], Sensor Observation Service, SensorThings [SENSORTHINGS] or [GeoSPARQL] for access to data, and Web Map Service [WMS] and Web Map Tile Service [WMTS] for access to data rendered as maps. Apart from the Web Map Service, the OGC standards have not seen widespread adoption beyond the geospatial expert community.

In addition, commercial offerings for publishing spatial data on the Web often provide access via product-specific APIs, too. These APIs are typically not restricted to HTTP-based Web service APIs in the sense of [DWBP] Best Practice 24: Use Web Standards as the foundation of APIs, but include APIs targeted at a specific programming language, for example, JavaScript.

In the list of options above, a third option is included as sharing spatial data on the Web using the first two options (bulk download or generalized APIs) may not be sufficient for reaching application developers. Reasons for this include:

Generalized spatial data access APIs, like the OGC standards, typically and intentionally cover a wide range of usages, including requirements of users that are geospatial experts, and they support a broad range of Spatial Things. While they are documented comprehensively, they are often not easy to understand and the "Time to First Successful Call" (see [DWBP] Best Practice 25: Provide complete documentation for your API) may be too high for application developers. A typical convenience API is a simple API that implements complex requirements and hides the complexity, e.g. coordinate reference system handling, from the application developer.
Spatial data as it is used by expert users may be complex. As users need to understand how the data is structured to work effectively with that data, this burdens the data user with significant effort before they can even perform simple queries on data they have downloaded or access through a generalized data access API.
Spatial datasets tend to be large, often too large for direct use in Web applications.

A useful 'bespoke API' mentioned in the third option provides convenience to developers of the targeted applications, because the API designer has thought about the needs of those developers when consuming the spatial data shared via the API.

Best Practice 12: Expose spatial data through 'convenience APIs'

If you have a specific type of application in mind for your data, tailor a spatial data access API to meet that goal.

Why

Providing access to spatial data via bulk download or generalized spatial data access APIs may be too complex for application developers with relatively simple requirements, if the spatial data or the API is complex to understand or too large to handle in a Web application. Convenience APIs are tailored to meet a specific goal, enabling a user to engage with complex data structures using (a set of) simple queries, including spatial search.

Intended Outcome

The API provides a coherent set of queries and operations, including spatial ones, that help users get working with the data quickly to achieve common tasks. The API provides both machine readable data and human readable HTML markup. The human-readable markup will also support search engines' Web crawlers to enable indexing of spatial data.

Possible Approach to Implementation

The API should:

Offer both machine readable data and human readable HTML that includes the structured metadata required by search engines seeking to index content (see Best Practice 4: Make your entity-level data indexable by search engines for details);
Follow the architectural guidance of the [DWBP] Best Practice 24: Use Web Standards as the foundation of APIs and [DWBP] Best Practice 19: Use content negotiation for serving data available in multiple formats;
Be well documented and easy to understand, both in terms of the options to access / filter the data and of the data structures that are returned (see [DWBP] Best Practice 25: Provide complete documentation for your API);
Return data in chunks that are fit for use in Web applications and that form useful sets of information.
For large datasets, this is related to [DWBP] Best Practice 18: Provide Subsets for Large Datasets; this may be achieved, for example, by filtering options that return appropriately sized subsets of the specific dataset or by supporting paging (returning larger subsets in pages with forward/backward links). For paging, some patterns have been established, see for example W3C Linked Data Platform Paging or Hydra pagination.
For geometries with many coordinates, simplifying the geometries for display at small map scales - think about all administrative boundaries of Europe on a map display with scale 1:30,000,000 - is another option; the simplification may be controlled by the client using a query parameter indicating the target scale; geometry simplification, including its caveats, is discussed in Best Practice 5: Provide geometries on the Web in a usable way.
At the other end of the spectrum, overly small pieces of data are inconvenient to use, too. Data should be packaged in lumps that are convenient to work with. An approach where very small, fine-grained units of information are published that require further HTTP requests to get the related information sufficient to determine context is not useful;
Support queries for Spatial Things based on user needs. For spatial data, typical needs that should be considered are neighborhood searches (e.g., "what is near me?" or "what is near this Spatial Thing?") and searching for things located in a specific area (e.g., an area shown as a map in an application). Users will often look for a particular Spatial Thing without knowing its identifier, too, in which case a fault-tolerant, free-text search on the name, label or other property may be useful.
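Two of the points above, searching for things located in a specific area and returning results in pages with forward/backward links, can be sketched together as follows. This is an illustration under stated assumptions rather than a recommended design: the `bbox` and `page` query parameters, the `self`/`prev`/`next` link names and the in-memory list of point-located things are all hypothetical, and the simple comparison does not handle areas that cross the antimeridian.

```python
from urllib.parse import urlencode


def query_page(things, bbox, base_url, page=1, page_size=2):
    """Return one page of the things located inside bbox, plus paging links.

    things: list of dicts with "id", "lon" and "lat" keys (hypothetical model)
    bbox:   (min_lon, min_lat, max_lon, max_lat)
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    hits = [t for t in things
            if min_lon <= t["lon"] <= max_lon and min_lat <= t["lat"] <= max_lat]
    start = (page - 1) * page_size
    items = hits[start:start + page_size]

    def link(p):
        # Build the URL of page p for the same area of interest.
        params = {"bbox": ",".join(str(v) for v in bbox), "page": p}
        return f"{base_url}?{urlencode(params)}"

    links = {"self": link(page)}
    if page > 1:
        links["prev"] = link(page - 1)
    if start + page_size < len(hits):
        links["next"] = link(page + 1)
    return {"items": items, "links": links}
```

Because each response carries its own `prev` and `next` links, a client can walk through a large result set without knowing how the server computes page boundaries.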

In a White Paper about open geospatial APIs [OGC-API-WP], the Open Geospatial Consortium (OGC) has defined the concept of the "OGC API Essentials" - a set of items defined in OGC standards and other open standards that are reusable modules for use in geospatial APIs. The White Paper provides an initial list and many of the identified standards are mentioned in this document. Reuse of standardized building blocks improves consistency and interoperability across APIs. It is recommended to consider the OGC API Essentials when defining an API to access spatial data.

One such essential is a set of well-known spatial predicates for use in queries to select Spatial Things based on their geometry. Most commonly supported is the following set: equal, disjoint, touches, within, overlaps, crosses, intersects, contains. These predicates were originally defined in [SIMPLE-FEATURES], but are also supported by [GeoSPARQL], WFS [WFS] and others. For more information about the definition of the predicates, see [SIMPLE-FEATURES].
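To give a feel for how such predicates behave, the sketch below implements four of them for the simplest possible geometry: axis-aligned rectangles given as `(min_x, min_y, max_x, max_y)` with non-zero width and height. This restriction is an assumption made to keep the example short; the definitions in [SIMPLE-FEATURES] apply to arbitrary geometries and are considerably more involved.

```python
def intersects(a, b):
    """True if rectangles a and b share at least one point (boundaries included)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def disjoint(a, b):
    """True if a and b have no point in common; the negation of intersects."""
    return not intersects(a, b)


def within(a, b):
    """True if no part of rectangle a lies outside rectangle b."""
    return b[0] <= a[0] and a[2] <= b[2] and b[1] <= a[1] and a[3] <= b[3]


def contains(a, b):
    """True if b is within a; contains is within with the arguments swapped."""
    return within(b, a)
```

In a convenience API, predicates like these would typically be exposed as filter options (for example, "return everything within this map extent") rather than as functions the client has to call.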

If the data is already published in a Spatial Data Infrastructure, there are basically two options to publish the data via an additional convenience API.

Reuse your existing spatial data infrastructure

A RESTful API can be created around SDI services as a wrapper, proxy or shim layer. This aims at exposing 'generalized APIs' using 'convenience APIs' to make the data easier to use. For example, in the geospatial domain there are a lot of WFS services providing spatial data. Content from the WFS service can be provided in this way as linked data, JSON [RFC7159] or another Web friendly format using simple, navigable resources. This approach is like the use of Z39.50 in the library community; that protocol is still used but 'modern' Web sites and Web services are wrapped around it.

Example 48

An example of this approach of creating a convenience API that works dynamically on top of WFS is the experimental ldproxy. This requires relatively little effort and is an attractive option for quickly exposing spatial data from existing WFS services on the Web. The approach is to create an intermediate layer by introducing a proxy on top of the WFS so the contained resources are made available. The proxy maps the data and metadata to [SCHEMA-ORG] according to a provided mapping scheme; assigns URIs to all resources based on a pattern; supports filtering based on a property; makes each resource available in HTML, XML [XML11], [JSON-LD], [GML], GeoJSON [RFC7946]; and generates links to data in other datasets managed in triple stores using SPARQL [SPARQL11-OVERVIEW] queries. The API is documented and published using Swagger.

Example 49

Mapping a URI template (as specified in [RFC6570]) to a WFS, WCS or OPeNDAP end-point is a very simple way of specifying a set of resources in the existing infrastructure. This will not address all API aspects described in this best practice, but it is a simple start.
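A rough sketch of such a mapping is given below. The public URI pattern, the `example.org` endpoint and the function names are hypothetical. The key-value parameters (`typeNames`, `resourceId`) reflect a common reading of the WFS 2.0 GetFeature request and should be checked against the service being wrapped. Only the simplest form of [RFC6570] expansion (Level 1, plain `{name}` expressions) is handled.

```python
import re
from urllib.parse import quote, urlencode

# Hypothetical public URI pattern and backend endpoint, for illustration only.
PUBLIC_TEMPLATE = "https://example.org/data/{collection}/{id}"
WFS_ENDPOINT = "https://example.org/services/wfs"


def expand(template, **values):
    """Expand {name} expressions, percent-encoding each value (Level 1 only)."""
    return re.sub(r"\{(\w+)\}",
                  lambda m: quote(str(values[m.group(1)]), safe=""),
                  template)


def wfs_url(collection, feature_id=None):
    """Build the backend request a proxy might send for a public URI."""
    params = {"service": "WFS", "version": "2.0.0", "request": "GetFeature"}
    if feature_id is None:
        params["typeNames"] = collection    # whole collection
    else:
        params["resourceId"] = feature_id   # a single feature
    return f"{WFS_ENDPOINT}?{urlencode(params)}"
```

The public URI produced by `expand` stays the same even if the backend request built by `wfs_url` later changes, for example when the service is upgraded or replaced.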

One advantage of this approach is to be able to hide implementation details of the current backend in the URI. A Web service URL in general does not provide a good URI for a resource as it is unlikely to be persistent. A Web service URL is often technology and implementation dependent and in practice both are likely to change with time.

The Environment Agency Bathing Water Quality API example above uses URI templates, too. In this case, the Linked Data API configuration uses URI templates to provide RESTful access to SPARQL queries thereby taking away from the user the challenge of writing generalized SPARQL queries and understanding the underpinning data model.

Provide parallel Web-friendly access to the data as an alternative

A more effective route to providing an alternative 'Web friendly' access path to the spatial data may be to create a new, complementary service endpoint on top of the native storage of the dataset. This limits the load on your SDI compared to the first option, which may matter as the data access APIs of the SDI will continue to be used by expert users and their complex data management tasks.

Example 50

Expose the underpinning relational database via a SPARQL endpoint (using something like ontop-spatial) and Linked Data API. The data may be mapped dynamically to resources on the Web using the [R2RML] standard and Linked Data Publication tools. This process also allows the data represented in RDF to be enriched with additional information and links. To maintain a direct link between the Spatial Things provided through the SDI and as Linked Data, use properties that link between the [GML] and RDF representations, for example, by including an additional property rdf_seealso in the [GML] encoding pointing to the RDF representation of the Spatial Thing.

Fig. 2 Providing an alternative 'Linked Data friendly' access path to a WFS data source.

How to Test

See the "How to test" sections in [DWBP] Best Practice 23: Make data available through an API, [DWBP] Best Practice 24: Use Web Standards as the foundation of APIs and [DWBP] Best Practice 25: Provide complete documentation for your API.

Evidence

Relevant requirements: R-Compatibility, R-LightweightAPI, R-SpatialOperators, R-ReferenceDataChunks.

Benefits

Access
Reuse
Interoperability

12.4 Spatial metadata

[DWBP] provides best practices discussing the provision of metadata to support discovery and reuse of data (see [DWBP] section 8.2 Metadata for more details). Providing metadata at the dataset level supports a mode of discovery well aligned with the practices used in Spatial Data Infrastructure (SDI) where a user begins their search for spatial data by submitting a query to a catalog. Once the appropriate dataset has been located, the information provided by the catalog enables the user to find a service end-point from which to access the data itself - which may be as simple as providing a mechanism to download the entire dataset for local usage or may provide a rich API enabling the users to request only the required parts for their needs. The dataset-level metadata is used by the catalog to match the appropriate dataset(s) with the user's query.

This section includes best practices for including the spatial extent, CRS, and other spatial details of the dataset in the metadata. These are the extra metadata items needed to make spatial datasets both discoverable and usable. A third best practice in this section goes a step further in granularity: exposing spatial data on the Web in such a way that individual entities or "granules" within a dataset can be discovered, evaluated, and utilized.

Quality information is also an important part of spatial metadata, especially for asserting if data is fit for a certain purpose. [DWBP] provides a best practice discussing how the quality of data on the Web should be described (see [DWBP] section 8.5 Data Quality for more details). This section is based on the Data Quality section from [DWBP] and adds a best practice specific for spatial data, which concentrates on the accuracy of the positions in the data - how close are they to the actual positions of the real world things?

In the Spatial Metadata section, we provided a Best Practice on how to deal with CRS in spatial data on the Web. There is also a clear link between CRS and data quality, because the accuracy of spatial data depends for a large part on the CRS used. This can be seen as conformance of data with a "standard" - in this case, a (spatial or temporal) reference system. This is how you can describe spatial data quality using different vocabularies. We will provide an example in this section.

For some uses, it may be sufficient to simply state conformance to a published specification:

Example 51: GeoDCAT-AP specification of a dataset conformance with the INSPIRE Regulation on spatial data and services interoperability

a:Dataset a dcat:Dataset ;
  dcterms:conformsTo <http://data.europa.eu/eli/reg/2010/1089/oj> .

<http://data.europa.eu/eli/reg/2010/1089/oj> a dcterms:Standard , foaf:Document ;
  dcterms:title "COMMISSION REGULATION (EU) No 1089/2010 of 23 November 2010
             implementing Directive 2007/2/EC of the European Parliament
             and of the Council as regards interoperability of spatial
             data sets and services"@en ;
  dcterms:issued "2010-12-08"^^xsd:date .

However, that specification makes no statement about the positional accuracy of the data, so on its own, it is only a useful quality statement for users to whom positional accuracy is not that important.

Best Practice 13: Include spatial metadata in dataset metadata

The description of datasets that have Spatial Things should include explicit metadata about their spatial extent, coverage, and representation.

Note

This best practice extends [DWBP] Best Practice 2: Provide descriptive metadata.

Why

Since location is such a powerful organizing principle, it is usually necessary to specifically describe the spatial details and nature of a dataset to discover it as well as to determine its fitness for use. This information is used, for example, by SDI catalog services that offer spatial querying to find data - but also by users to understand the nature of the dataset. In some cases, for example when dealing with crowd-sourced data, provenance information or how the dataset came to be in its published form and with what quality, is important as well.

The first level of spatial description is the spatial extent of the dataset, the area of the world that the dataset describes. This often suffices for initial discovery, but further levels of description are needed to evaluate a dataset for use. These include the dataset spatial coverage (continuity, resolution, properties) as well as the spatial representation or geometric model (for example, grid coverage, discrete coverage, point cloud, linear network).

Dataset quality measures such as positional accuracy are also important for determining applicability. In the case of datasets whose spatial characteristics vary over their temporal coverage, spatial descriptions must include an explicit temporal aspect.

Intended Outcome

Dataset metadata should include the information necessary to enable spatial queries within catalog services such as those provided by SDIs.
Dataset metadata should also include the information required for a user to evaluate whether a spatial dataset is suitable for their intended application.

Possible Approach to Implementation

When publishing a dataset, provide as much spatial metadata as necessary, but at least the spatial extent, coverage, and representation. Other examples of spatial metadata include:

number of dimensions (1D, 2D, 3D, 4D)
spatial representation type (e.g. grid, vector, text table)
geometric property (e.g. boundary, bounding box, region, centerline, centroid, field) - expressed in the WGS 84 coordinate reference system to make the metadata consumable by as broad an audience as possible (see Best Practice 7: Choose coordinate reference systems to suit your user's applications for more information).
Coordinate Reference System(s) - refer to section 9. Coordinate Reference Systems (CRS) for an introduction to that topic
spatial resolution - see Best Practice 14: Describe the positional accuracy of spatial data
spatial significance of non-spatial properties (e.g. point value, interpolation, unit average, sum)

In Spatial Data Infrastructures, the accepted standard for describing metadata is [ISO-19115] or profiles thereof.

To provide information about the spatial attributes of the dataset on the Web one can:

As shown in [DWBP] Best Practice 2: Provide descriptive metadata: Include the spatial coverage of the things described by the dataset using [VOCAB-DCAT] and a reference to a named place in a common vocabulary for geospatial semantics (e.g. GeoNames).
Again, use [VOCAB-DCAT], but instead of a reference to a named place, use a set of coordinates to specify the boundaries of the area either as a bounding box or a polygon.

This can be done, e.g., by using the approach defined in the spatial extension of [VOCAB-DCAT], [GeoDCAT-AP] - see Example 15: [GeoDCAT-AP] representation of dataset spatial coverage (bounding box) in multiple encodings.
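Where the bounding box is not already known, it can be derived from the data. The following sketch computes the bounding box of a set of positions and writes it as a WKT polygon, one possible literal encoding for such metadata. It assumes coordinates in longitude/latitude order (CRS84), and it does not handle datasets that cross the antimeridian; the function name is illustrative only.

```python
def bounding_box_wkt(points):
    """Return the bounding box of (lon, lat) points as a WKT POLYGON string."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    west, east = min(lons), max(lons)
    south, north = min(lats), max(lats)
    # Closed ring, counter-clockwise, starting at the south-west corner.
    ring = [(west, south), (east, south), (east, north),
            (west, north), (west, south)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"
```

The resulting string can be attached to the dataset description as a geometry literal, using whichever property and datatype the chosen metadata vocabulary prescribes.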

Note

[GeoDCAT-AP] provides an RDF [RDF11-PRIMER] syntax binding for the metadata elements defined in the core profile of [ISO-19115] and in the INSPIRE metadata schema [INSPIRE-MD].
Use [GeoDCAT-AP] to specify spatial attributes that are not available in [VOCAB-DCAT].
Use geospatial ontologies to describe the spatial data for the datasets - see the matrix of spatial data vocabularies in section A. Applicability of common formats to implementation of best practices.
Publish metadata in both machine- and human-readable format, following [DWBP] Best Practice 1: Provide metadata.

Example 53: Specification of spatial representation type in [GeoDCAT-AP]

[GeoDCAT-AP] models this information by using adms:representationTechnique [VOCAB-ADMS], with URIs corresponding to the items in the appropriate ISO 19115 code list.

The following [TURTLE] snippet provides an example of the [GeoDCAT-AP] specification of two datasets using, respectively, a vector and a grid spatial representation type.

a:Dataset a dcat:Dataset ;
  adms:representationTechnique 
    <http://inspire.ec.europa.eu/metadata-codelist/SpatialRepresentationTypeCode/vector> .

another:Dataset a dcat:Dataset ;
  adms:representationTechnique 
    <http://inspire.ec.europa.eu/metadata-codelist/SpatialRepresentationTypeCode/grid> .

Note

The URIs in the examples, denoting the spatial representation type, are part of a register yet to be added to the INSPIRE Registry. Therefore, they currently do not resolve.

How to Test

Check if the spatial metadata for the dataset itself includes the overall features of the dataset in a human-readable format.

Check if the descriptive spatial metadata is available in a valid machine-readable format.

Evidence

Relevant requirements: R-Discoverability, R-Compatibility, R-BoundingBoxCentroid, R-Crawlability, R-SpatialMetadata and R-Provenance.

Benefits

Comprehension
Reuse
Trust
Discoverability

Best Practice 14: Describe the positional accuracy of spatial data

Accuracy of spatial data should be specified in machine-interpretable and human-readable form.

Why

The amount of detail that is provided in spatial data and the resolution of the data can vary. No measurement system is infinitely precise and in some cases the spatial data can be intentionally generalized (e.g. merging entities, reducing the details, and aggregation of the data) [Veregin]. Some spatial data applications, such as aircraft navigation, require highly accurate data. For others, such as human navigation, a horizontal accuracy of a few meters is good enough. For yet others, such as overlaying weather forecasts on a map, the map is only giving a general indication of place. If the positional accuracy is published together with the data, the user can determine whether it is appropriate to use for their application. Potentially, this makes existing data more re-usable.

Note

It is important to understand the difference between precision and accuracy. Seven decimal places of a latitude degree correspond to about one centimeter. Whatever the precision of the specified coordinates, the accuracy of positioning on the actual earth's surface using WGS 84 will only approach about a meter horizontally and may have apparent errors of up to 100 meters vertically, because of assumptions about reference systems, tectonic plate movements and which definition of the earth's 'surface' is used.
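The relationship between decimal places and ground distance can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6,371 km, which is an approximation chosen for simplicity; it says nothing about accuracy, only about the distance represented by the last digit of a latitude value.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation


def latitude_step_m(decimal_places):
    """Ground distance in meters of one unit in the last decimal place of a latitude."""
    step_deg = 10 ** -decimal_places
    return math.radians(step_deg) * EARTH_RADIUS_M
```

Under this approximation one degree of latitude is roughly 111 km, so `latitude_step_m(7)` comes out at a little over one centimeter, consistent with the figure given in the note above.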

Intended Outcome

For many uses, the positional accuracy of the data is an important aspect of assessing its fitness for purpose (quality). As with other data quality statements, this can be a quantitative measure, a statement of conformance to a standard or policy, or an assertion or report of fitness for a particular purpose.

Possible Approach to Implementation

Describe the accuracy of spatial data in a way that is understandable for humans.

In addition, describe the accuracy of spatial data in a machine-readable format. [VOCAB-DQV] is such a format. It is a vocabulary for describing data quality, including the details of quality metrics and measurements.

For observed (measured) datasets, it is possible to make specific quantitative statements about positional accuracy, based on knowledge of the equipment used to make the observations, and any processing carried out.

For coverages, the sampling distance is an effective way of indicating the amount of detail in the dataset - this is one of the meanings of the term "resolution". Alternatively, samples of the data could be independently checked against the real world, and the results of that check reported. Either way, this is usually a statement of absolute positional accuracy, but for some uses, relative positional accuracy is more important.

Positional accuracy measurements, whether observed or asserted based on process, can be given using QualityMeasurement.

For modelled datasets, for example in planning and construction, there is no 'real world' against which to assess the positional accuracy - but relative positional accuracy can still be stated.

For many uses, a statement of the amount of detail provided is sufficient to assess fitness for purpose; examples include "level of detail" (building models), "navigational purpose" (marine navigation), "equivalent scale" or "zoom level" (cartography). Sometimes, this is expressed as if it were a statement of positional accuracy.

These can be expressed in the same way as for non-spatial data; for example using the QualityAnnotation, Standard, and QualityPolicy statements of [VOCAB-DQV].

The following example shows how [VOCAB-DQV] can express conformance to a specified positional accuracy:

Example 56: GeoDCAT-AP specification of a dataset conformance with IHO's S44

a:Dataset a dcat:Dataset ;
  dcterms:conformsTo <https://www.iho.int/iho_pubs/standard/S-44_5E.pdf#Special> .

<https://www.iho.int/iho_pubs/standard/S-44_5E.pdf> a dcterms:Standard , foaf:Document ;
  dcterms:title "IHO Standards for Hydrographic Surveys"@en ;
  dcterms:issued "2008-02-01"^^xsd:date.

The following example shows how [VOCAB-DQV] can express the amount of detail in a coverage dataset:

Example 57: [VOCAB-DQV] specification of data quality

:myDataset a dcat:Dataset ;
   dqv:hasQualityMeasurement :myDatasetPrecision, :myDatasetAccuracy .

:myDatasetPrecision a dqv:QualityMeasurement ;
   dqv:isMeasurementOf :spatialResolutionAsDistance ;
   dqv:value "1000"^^xsd:decimal ;
   sdmx-attribute:unitMeasure  <http://www.wurvoc.org/vocabularies/om-1.8/metre>
   .

:spatialResolutionAsDistance  a  dqv:Metric;
    skos:definition "Spatial resolution of a dataset expressed as distance"@en ;
    dqv:expectedDataType xsd:decimal ;
    dqv:inDimension dqv:precision
    .

This example was taken from [VOCAB-DQV]. For more examples of expressing spatial data precision and accuracy see [VOCAB-DQV], Express dataset precision and accuracy.

How to Test

Check if the metadata contains at least one human-readable and one machine-readable statement regarding positional accuracy.

Check that the kind of statement is relevant to the kind of data, e.g. not an absolute positional accuracy measure for Atlantis.

Checking whether the accuracy statement is actually correct is beyond the scope of this best practice.

Evidence

Relevant requirements: R-MachineToMachine, R-QualityPerSample.

Benefits

Comprehension
Reuse
Trust
Interoperability

	WKT	GML	KML	GeoJSON	HTML
Based on	WKT	XML	XML	JSON	HTML
Media type	`text/plain`	`application/gml+xml`	`application/vnd.google-earth.kml+xml`, `application/vnd.google-earth.kmz`	`application/geo+json`	`text/html`
Usage	Representation of 0D-2D geometries, CRS and CRS transformation	Representation of Spatial Things and 0D-3D geometries. Comprehensive and supporting many use cases.	Representation of Spatial Things and 0D-3D geometries. Main focus on spatial data visualization and interaction	Representation of Spatial Things and 0D-2D geometries	Description of Spatial Things and geometries can be embedded by using mechanisms such as [HTML-RDFa], [MICRODATA], [JSON-LD], using vocabularies such as [SCHEMA-ORG]
Tool support	Widely supported in GIS tools; supported by some Web libraries, usually converted to GeoJSON [RFC7946]; supported by most triple stores	Widely supported in GIS tools; supported by some Web libraries, usually converted to GeoJSON [RFC7946], but not when the geometry is 3-dimensional (volumes); supported only by triple stores supporting [GeoSPARQL]	Mainly supported by Earth browsers, such as Google Earth	Supported in some GIS tools; widely supported in Web libraries and mapping APIs	Optimal for Web publication and discovery
Web discoverability	Low	Low	Low	Low	Good
Link support	No	Via [XLINK11]	Via [XLINK11]	No	Yes
Geometry specification
CRS support	Depends on the flavor - e.g., EWKT and [GeoSPARQL]'s WKT support arbitrary CRSs, and the latter defaults to WGS 84 long/lat (CRS84)	Any, and it can be explicitly specified (via attribute `@srsName`)	WGS 84 long/lat (CRS84) only	WGS 84 long/lat (CRS84) only	Depends on the vocabulary used - e.g., [SCHEMA-ORG] supports WGS 84 only
Axis order support	Any, but it cannot be explicitly specified - e.g., in [SIMPLE-FEATURES]'s WKT and EWKT it defaults to longitude/latitude, whereas in [GeoSPARQL]'s WKT it is determined by the CRS used	Determined by the CRS used	Longitude / latitude only, with optional altitude	Longitude / latitude only, with optional altitude	Depends on the vocabulary used - e.g., [SCHEMA-ORG] supports lat/long only
3D support	No	Yes	Yes	No	Depends on the vocabulary used - e.g., [SCHEMA-ORG] does not support 3D geometries

	[DCTERMS]	[W3C-BASIC-GEO]	[VCARD-RDF]	[GeoRSS]	[SCHEMA-ORG]	[GeoSPARQL]	[LOCN]
Description	includes terms for describing location and temporal information, as classes `dcterms:Location`, `dcterms:PeriodOfTime`, and properties `dcterms:spatial`, `dcterms:temporal`, and `dcterms:coverage`.	A widely used vocabulary, although not an official standard, for specifying point coordinates in the WGS 84 datum.	Includes terms for describing postal addresses and 0D geometries (points).	Vocabulary defined by the W3C Geospatial Incubator Group (GeoXG) for the representation of geospatial properties of Web resources. On 28 March 2017, [GeoRSS] has been proposed as a candidate OGC Community Standard.	Designed for annotating Web pages with machine-readable metadata, it supports a number of classes and properties for specifying location information, including geometries. See Best Practice 2: Make your spatial data indexable by search engines for more information.	Official OGC standard, defining a set of terms and functions for modeling and querying spatial information. Coordinates are encoded by using WKT or [GML].	Defines a set of general terms for describing location information that can be extended based on domain-specific requirements. Covers geographical names, geometries, and postal addresses.
Spatial things	`dcterms:Location`	`w3cgeo:SpatialThing`	`vcard:Kind`, and its subclasses; `vcard:Address`	`georss:_Feature` is placeholder for Spatial Thing	`schema:Place`, and its subclasses; `schema:PostalAddress`	`geosparql:Feature`	`dcterms:Location`, `locn:Address`
Properties to associate Spatial Things with geometries	-	`w3cgeo:location`, `w3cgeo:lat_long`, `w3cgeo:lat`, `w3cgeo:long`, `w3cgeo:alt`	`vcard:hasGeo`	`georss:where`, `georss:point`, `georss:line`, `georss:polygon`, `georss:box`	`schema:geo`	`geosparql:hasGeometry`, `geosparql:defaultGeometry`	`locn:geometry`
Geometries	-	`w3cgeo:Point` (subclass of `w3cgeo:SpatialThing`)	Geometries are represented with the `geo` URI scheme [RFC5870]	Geometries are represented with a literal encoding of point coordinates	`schema:GeoCoordinates` `schema:GeoShape` `schema:GeoCircle`	`geosparql:Geometry`, and its subclasses (`sf:Point`, `sf:Polygon`, etc.)	`locn:Geometry` (it denotes either a structured object or a literal)
Geometry specification
CRS support	-	WGS 84 only	WGS 84 only	WGS 84 only	WGS 84 only	Any	Any (depends on how the geometry is represented)
Axis order support	-	lat/long only	lat/long only	lat/long only	lat/long only	Determined by the CRS used	Any (depends on how the geometry is represented)
0D support	-	lat/long coordinate pair (`w3cgeo:lat_long`), decimal degrees (`w3cgeo:lat`, `w3cgeo:long`), decimal meters (`w3cgeo:alt`)	`geo` URI scheme [RFC5870]	lat/long coordinate pair	lat/long coordinate pair	[GML], WKT	[GML], WKT, GeoJSON [RFC7946], `geo` URI scheme [RFC5870], Geohash
1D and 2D support	-	-	-	lat/long coordinate pairs, separated by a comma	lat/long coordinate pairs, separated by a comma or a space	[GML], WKT	[GML], WKT, GeoJSON [RFC7946]
3D support	-	-	-	-	-	[GML]	[GML]

Best Practice	Benefits
Use globally unique persistent HTTP URIs for Spatial Things	Discoverability, Reuse, Linkability
Make your spatial data indexable by search engines	Discoverability, Reuse
Link resources together to create the Web of data	Comprehension, Processability, Reuse, Interoperability
Use spatial data encodings that match your target audience	Comprehension, Processability, Reuse, Interoperability, Access
Provide geometries on the Web in a usable way	Processability, Reuse, Interoperability, Access
Provide geometries at the right level of accuracy, precision, and size	Processability, Reuse, Access
Choose coordinate reference systems to suit your user's applications	Comprehension, Processability, Reuse, Interoperability, Access
State how coordinate values are encoded	Comprehension, Processability, Reuse, Interoperability
Describe relative positioning	Processability
Use appropriate relation types to link Spatial Things	Comprehension, Processability, Linkability, Interoperability
Provide information on the changing nature of spatial things	Comprehension, Trust, Access
Expose spatial data through 'convenience APIs'	Access, Reuse, Interoperability
Include spatial metadata in dataset metadata	Comprehension, Reuse, Trust, Discoverability
Describe the positional accuracy of spatial data	Comprehension, Reuse, Trust, Interoperability

Requirements	Spatial Data Best Practice	General Data Best Practice
Linkability	Use globally unique persistent HTTP URIs for Spatial Things; Make your spatial data indexable by search engines; Link resources together to create the Web of data; Use spatial data encodings that match your target audience; Use appropriate relation types to link Spatial Things	Use persistent URIs as identifiers; Use persistent URIs as identifiers within datasets; Assign URIs to dataset versions and series
GeoreferencedData	Use globally unique persistent HTTP URIs for Spatial Things; Use spatial data encodings that match your target audience; State how coordinate values are encoded
IndependenceOnReferenceSystems	Use globally unique persistent HTTP URIs for Spatial Things; Provide geometries on the Web in a usable way	Use locale-neutral data representations
BoundingBoxCentroid	Make your spatial data indexable by search engines; Provide geometries on the Web in a usable way; Provide geometries at the right level of accuracy, precision, and size; Include spatial metadata in dataset metadata
Crawlability	Make your spatial data indexable by search engines; Include spatial metadata in dataset metadata	Provide metadata; Provide descriptive metadata; Provide structural metadata; Make data available through an API
Discoverability	Make your spatial data indexable by search engines; Use spatial data encodings that match your target audience; Include spatial metadata in dataset metadata	Provide metadata; Provide descriptive metadata; Provide structural metadata; Make data available through an API
MachineToMachine	Make your spatial data indexable by search engines; Link resources together to create the Web of data; Use spatial data encodings that match your target audience; Provide geometries on the Web in a usable way; Describe relative positioning; Use appropriate relation types to link Spatial Things; Provide information on the changing nature of spatial things; Describe the positional accuracy of spatial data	Provide metadata; Provide descriptive metadata; Provide structural metadata; Make data available through an API
DeterminableCRS	Use spatial data encodings that match your target audience; State how coordinate values are encoded	Reuse vocabularies, preferably standardized ones
SpatialRelationships	Use spatial data encodings that match your target audience; Use appropriate relation types to link Spatial Things
MultipleCRS	Provide geometries on the Web in a usable way
Compressible	Provide geometries on the Web in a usable way; Provide geometries at the right level of accuracy, precision, and size	Provide bulk download; Provide Subsets for Large Datasets
CRSDefinition	Provide geometries on the Web in a usable way; State how coordinate values are encoded
EncodingForVectorGeometry	Provide geometries on the Web in a usable way	Use machine-readable standardized data formats
SpatialMetadata	Provide geometries on the Web in a usable way; Include spatial metadata in dataset metadata	Provide metadata; Provide descriptive metadata
3DSupport	Provide geometries on the Web in a usable way	Choose the right formalization level
TimeDependentCRS	Provide geometries on the Web in a usable way	Cite the Original Publication
TilingSupport	Provide geometries on the Web in a usable way
Compatibility	Provide geometries at the right level of accuracy, precision, and size; Expose spatial data through 'convenience APIs'; Include spatial metadata in dataset metadata	Reuse vocabularies, preferably standardized ones
CoordinatePrecision	Provide geometries at the right level of accuracy, precision, and size; Choose coordinate reference systems to suit your user's applications	Provide data quality information
AvoidCoordinateTransformations	Choose coordinate reference systems to suit your user's applications	Reuse vocabularies, preferably standardized ones
LinkingCRS	State how coordinate values are encoded
SamplingTopology	Describe relative positioning
SpatialOperators	Use appropriate relation types to link Spatial Things; Expose spatial data through 'convenience APIs'
MovingFeatures	Provide information on the changing nature of spatial things
Streamable	Provide information on the changing nature of spatial things	Assign URIs to dataset versions and series; Provide real-time access; Provide data up to date
CoverageTemporalExtent	Provide information on the changing nature of spatial things
LightweightAPI	Expose spatial data through 'convenience APIs'
ReferenceDataChunks	Expose spatial data through 'convenience APIs'
Provenance	Include spatial metadata in dataset metadata	Provide data provenance information
QualityPerSample	Describe the positional accuracy of spatial data
MultilingualSupport		Provide metadata; Provide descriptive metadata
SubjectEquality		Provide data in multiple formats; Use content negotiation for serving data available in multiple formats
