Abstract

This Note describes CoverageJSON, a data format for describing "coverage" data in JavaScript Object Notation (JSON), and provides an overview of its design and capabilities. The primary intended purpose of the format is to enable data transfer between servers and web browsers, to support the development of interactive, data-driven web applications. "Coverage" data is a term that encompasses many kinds of data whose properties vary with space, time and other dimensions, including (but not limited to) satellite imagery, weather forecasts and river gauge measurements. We describe the motivation and objectives of the format, and provide a high-level overview of its structure and semantics. We compare CoverageJSON with other "coverage" formats and data models and provide links to tools and libraries that can help users to produce and consume data in this format. This Note does not attempt to describe the full CoverageJSON specification in detail: this is available at the project website.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This is a largely complete document and is published as a First Public Working Draft. Work remaining includes: adding references, adding a section on how this document relates to the Spatial Data on the Web Best Practices, and making any specific improvements in response to feedback.

The editors anticipate one further release of the document in approximately June 2017.

For OGC: this is a Public Draft of a document prepared by the Spatial Data on the Web Working Group (SDWWG), a joint W3C-OGC project (see charter). The document is prepared following W3C conventions and is released at this time to solicit public comment.

This document was published by the Spatial Data on the Web Working Group as a First Public Working Draft. If you wish to make comments regarding this document, please send them to public-sdw-comments@w3.org (subscribe, archives). All comments are welcome.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2017 W3C Process Document.

1. Known issues with this document

2. Introduction

2.1 What is a Coverage?

The term "coverage" is defined in ISO19123 (REF) as a "feature that acts as a function to return values from its range for any direct position within its spatial, temporal or spatiotemporal domain". In other words, a coverage maps points in space and time to data values. For example, an aerial photograph can be modelled as a coverage that maps positions on the ground to colours. A river gauge maps points in time to flow values. A weather forecast maps points in space and time to values of temperature, wind speed, humidity and so forth.

Note

The word “coverage” is sometimes used synonymously with “gridded data” or “raster data”, but this is not accurate: as the paragraph above shows, non-gridded data (such as river gauge measurements) can also be modelled as coverages. Nevertheless, we often observe a bias toward gridded data in discussions (and software) that concern coverages.

2.2 Existing Coverage representations

The ISO19123 specification defines an abstract data model for representing the data and metadata needed to encode coverage data. This abstract model can be instantiated in many different concrete formats. The OGC Coverage Implementation Schema (CIS) specification (REF) describes a concrete, directly instantiable data model and XML serialisation based on the Geography Markup Language (GML). Other serialisations have been defined, sometimes by retrospectively mapping existing binary data formats to the ISO19123 abstract data model (e.g. NetCDF, REF Nativi et al). Version 1.1 (TODO check this) of the CIS specification supports a JSON serialization, which is created through a direct translation from GML types to JSON objects. (We provide a brief comparison of CIS-JSON with CoverageJSON below.)

2.3 Motivation and objectives

This document describes the CoverageJSON format and explains why it is well suited to representing spatial data on the Web, offering a number of advantages in a web context over established approaches to serialising coverage data. It is a good match to the Use Case Requirements [SDW-UCR] and Best Practices [SDW-BP] of the Spatial Data on the Web Working Group.

Serialisations (i.e. data formats) based upon community-specific binary formats and complex XML schemas present usability challenges for the development of web applications. JavaScript libraries do not always exist for these formats, and the formats are complex, requiring specialist knowledge on the part of the web developer. Our overall aim is to make it easier for web developers to consume coverage data in their applications, minimising the need for prior knowledge about community-specific data formats.

Therefore our objective in this work was to develop a well-structured, consistent and easy-to-use JSON format for coverage data that fulfils the following criteria:

3. Overview of CoverageJSON

The full specification for CoverageJSON is published on GitHub, which also records all discussions that led to the design decisions in the format. The specification is split up into two documents: the core part, and a set of optional domain types that ease interoperability.

3.1 High-level structure

In CoverageJSON, a Coverage consists of the following objects:

A sample skeleton document encoding a three-dimensional gridded Coverage with two Parameters (sea surface temperature and sea ice area fraction) is shown here:

{
  "type" : "Coverage",
  "domain" : {
    "type": "Domain",
    "domainType" : "Grid",
    "axes" : {
      "x" : { /* Coordinate values */ },
      "y" : {                         },
      "t" : {                         }
    },
    "referencing" : [
      /* Coordinate referencing information */
    ]
  },
  "parameters" : {
    "SST"     : { /* Description of temperature values */ },
    "sea_ice" : { ... }
  },
  "ranges" : {
    "SST"     : { /* Encoding of temperature values, or link(s) */ },
    "sea_ice" : { ... }
  }
}
      

3.2 Encoding the Domain

A Domain is a collection of named orthogonal axes containing coordinate values, coupled with information about how to reference these values to one or more real-world coordinate reference systems. An axis can contain simple numeric values like latitudes or longitudes but can also contain composite values like tuples or polygons.

The following is a complete example of a simple grid domain with longitude, latitude, and time axes, using the WGS84 longitude-latitude coordinate reference system and the Gregorian calendar:

{
  "type" : "Domain",
  "domainType" : "Grid",
  "axes": {
    "x" : { "start": -179.5, "stop": 179.5, "num": 360 },
    "y" : { "start": 89.5, "stop": -89.5, "num": 180 },
    "t" : { "values": ["2001", "2002", "2003"] }
  },
  "referencing": [{
    "coordinates": ["x","y"],
    "system": {
      "type": "GeographicCRS",
      "id": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
    }
  }, {
    "coordinates": ["t"],
    "system": {
      "type": "TemporalRS",
      "calendar": "Gregorian"
    }
  }]
}
      

Note that different CRSs can be associated with different combinations of axes, providing a very flexible model that allows complex data to be encoded without the need to create composite CRSs. Axis values can also be categorical in nature (instead of numeric), enabling data values to be associated with entities that are not spatiotemporal coordinates.
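The compact "start"/"stop"/"num" axes in the example above can be expanded into explicit coordinate values. The following Python sketch assumes that such an axis describes "num" evenly spaced values running from "start" to "stop" inclusive; the function name axis_values is ours and is not part of any CoverageJSON library.

```python
def axis_values(axis):
    """Return the coordinate values of a CoverageJSON-style axis object.

    Assumption: an axis either lists its values explicitly ("values") or is
    regularly spaced, with "num" values from "start" to "stop" inclusive.
    """
    if "values" in axis:
        return list(axis["values"])
    start, stop, num = axis["start"], axis["stop"], axis["num"]
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# The three axes of the grid domain shown above.
x = axis_values({"start": -179.5, "stop": 179.5, "num": 360})
y = axis_values({"start": 89.5, "stop": -89.5, "num": 180})
t = axis_values({"values": ["2001", "2002", "2003"]})

print(len(x), x[0], x[1], x[-1])  # 360 -179.5 -178.5 179.5
print(len(y), y[0], y[1], y[-1])  # 180 89.5 88.5 -89.5
```

Keeping regular axes in this compact form avoids transferring hundreds of coordinate values that the client can compute for itself.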

Note

This closely mirrors the structure of the RDF Data Cube, in which orthogonal dimensions are combined to form the domain of the data cube. (A "dimension" in the data cube corresponds with an "axis" in CoverageJSON.) Therefore, although a formal mapping process has yet to be performed, we expect that interoperability between the RDF Data Cube and CoverageJSON is achievable. The RDF Data Cube specification does not explicitly support spatiotemporal dimensions, but this is addressed in the QB4ST extensions. TODO: is there a better way of citing QB and QB4ST?

This mechanism allows for a huge variety of domain structures to be encoded, from multidimensional grids to one-dimensional trajectories through four-dimensional space. To ease the burden on clients, CoverageJSON allows an optional domain type property ("domainType") to be defined (see the example above). If the data provider specifies that the domain is of a known type, the client then knows in more detail what to expect when the domain is inspected. For example, if the domain type is "Grid", the client knows that the domain MUST have axes that are called "x" and "y" (corresponding to the two horizontal spatial dimensions) and MAY have axes called "z" and "t" (corresponding respectively to the vertical and temporal dimensions). A number of common domain types are specified and there is a mechanism for data providers to define and register their own types. Note that in a typical document, short names ("Grid", "PointSeries", "Trajectory", etc.) are used to indicate the domain type, but in fact these are full URIs in disguise; this becomes apparent when viewing a CoverageJSON document as RDF using the JSON-LD context (see below).
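As an illustration of how a client might use the domain type, the sketch below checks that a domain declaring the "Grid" type carries the axes it is required to have. The table DOMAIN_TYPE_AXES and the function missing_axes are hypothetical names; only the "Grid" rule (x and y required, z and t optional) is taken from the text above.

```python
# Hypothetical lookup table. Only the "Grid" entry comes from the text above;
# other domain types would need to be filled in from the specification.
DOMAIN_TYPE_AXES = {
    "Grid": {"required": {"x", "y"}, "optional": {"z", "t"}},
}


def missing_axes(domain):
    """Return the sorted names of required axes that the domain lacks.

    Domains with no domain type, or with a type this sketch does not know,
    are not checked and yield an empty list.
    """
    rules = DOMAIN_TYPE_AXES.get(domain.get("domainType"))
    if rules is None:
        return []
    present = set(domain.get("axes", {}))
    return sorted(rules["required"] - present)


good = {"type": "Domain", "domainType": "Grid",
        "axes": {"x": {}, "y": {}, "t": {}}}
bad = {"type": "Domain", "domainType": "Grid", "axes": {"x": {}}}

print(missing_axes(good))  # []
print(missing_axes(bad))   # ['y']
```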

3.3 Encoding of data values

Data in CoverageJSON is held in Range objects, which represent multi-dimensional arrays. There are two subtypes of Range objects:

Note

Readers may wonder why multi-dimensional arrays are not encoded as nested arrays in CoverageJSON. Nested arrays (i.e. "arrays of arrays") are harder to manipulate and reason over, since there is no guarantee that the inner arrays are of a consistent length. With a one-dimensional array it is easy to verify that array.length matches the required number of elements, defined by the shape of the domain and the numbers in the shape property. In addition, JavaScript engines can treat one-dimensional arrays more efficiently than nested arrays. APIs can be provided to extract slices in any dimension as if the array were truly multi-dimensional. This mirrors the approach taken by libraries such as NumPy.

Here is an example of an NdArray:

{
  "type" : "NdArray",
  "dataType": "float",
  "axisNames": ["t", "y", "x"],
  "shape": [1, 90, 90],
  "values": [
    12.2, 12.0, 13.3, ...
    /* 8100 numbers (1*90*90) in row-major order */
  ]
}
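To show how a flat "values" array can be addressed as if it were multi-dimensional, here is a minimal Python sketch of row-major indexing. The helper names flat_index and value_at are ours, and the small 1 x 2 x 3 array is invented so that every value can be written out.

```python
import math


def flat_index(shape, indices):
    """Convert per-axis indices into an offset into a row-major flat array."""
    if len(shape) != len(indices):
        raise ValueError("one index is needed per axis")
    offset = 0
    for size, i in zip(shape, indices):
        if not 0 <= i < size:
            raise IndexError(f"index {i} out of range for axis of size {size}")
        offset = offset * size + i
    return offset


def value_at(nd_array, **axis_indices):
    """Look up one value by axis name, e.g. value_at(nd, t=0, y=1, x=2)."""
    indices = [axis_indices[name] for name in nd_array["axisNames"]]
    return nd_array["values"][flat_index(nd_array["shape"], indices)]


# Invented miniature NdArray with the same layout as the example above.
small = {
    "type": "NdArray",
    "dataType": "float",
    "axisNames": ["t", "y", "x"],
    "shape": [1, 2, 3],
    "values": [10.0, 11.0, 12.0,
               20.0, 21.0, 22.0],
}

# The consistency check mentioned in the Note: length equals product of shape.
assert len(small["values"]) == math.prod(small["shape"])

print(value_at(small, t=0, y=1, x=2))    # 22.0
print(flat_index([1, 90, 90], [0, 1, 0]))  # 90
```

Because the last axis varies fastest, stepping one position along "y" in the 1 x 90 x 90 example skips a whole row of 90 values.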

Support for large datasets

For reasons of efficiency and convenience, data providers may prefer not to specify the range objects for all parameters inline in the same CoverageJSON document. In this case, there are two options:

  1. For some or all parameters, create a separate JSON document containing an NdArray, which holds the values of that parameter. Then insert a link to this document in the Coverage document in place of the inline NdArray.
  2. Additionally, the individual arrays may be split up into tiles, in which the values of each parameter are encoded in multiple JSON documents, each containing an NdArray.
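A client that wants to treat linked and inline ranges uniformly could resolve them as in the sketch below. It assumes that a link appears as a plain URL string in place of the inline NdArray object; resolve_ranges and fetch_json are hypothetical names, and the fetcher is a stand-in dictionary so that the example runs without network access.

```python
def resolve_ranges(coverage, fetch_json):
    """Return a dict mapping parameter names to range objects.

    Assumption: a linked range is given as a URL string where an inline
    range object would otherwise appear. fetch_json is a caller-supplied
    function that takes a URL and returns the parsed JSON document.
    """
    resolved = {}
    for name, entry in coverage.get("ranges", {}).items():
        resolved[name] = fetch_json(entry) if isinstance(entry, str) else entry
    return resolved


# Stand-in for an HTTP client (invented URL and values).
FAKE_WEB = {
    "http://example.com/sst.covjson": {
        "type": "NdArray", "dataType": "float",
        "axisNames": ["y", "x"], "shape": [1, 2], "values": [12.2, 12.0],
    }
}

coverage = {
    "type": "Coverage",
    "ranges": {
        "SST": "http://example.com/sst.covjson",  # linked range
        "sea_ice": {"type": "NdArray", "dataType": "float",
                    "axisNames": ["y", "x"], "shape": [1, 2],
                    "values": [0.0, 0.5]},         # inline range
    },
}

ranges = resolve_ranges(coverage, FAKE_WEB.__getitem__)
print(ranges["SST"]["values"])      # [12.2, 12.0]
print(ranges["sea_ice"]["values"])  # [0.0, 0.5]
```

In a real client the fetcher would typically be lazy, so that a parameter's values are downloaded only when they are first needed.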

The following illustrates how a coverage may be split up into a particular tileset:

Illustration of a tiled NdArray. The data array is split into a number of tiles, each of which will be represented by a separate CoverageJSON file.
{
  "type" : "TiledNdArray",
  "dataType": "float",
  "axisNames": ["t", "y", "x"],
  "shape": [3, 180, 360],
  "tileSets": [{
    "tileShape": [1, 90, 90],
    "urlTemplate": "http://example.com/{t}/{y}/{x}.covjson"
  }]
}
      

Each tile is an NdArray, encoded as above.
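The arithmetic behind a tile set can be sketched as follows. This assumes that each variable in "urlTemplate" is an axis name and that it is replaced by the zero-based index of the tile along that axis; the function names are ours, and the sketch does not handle axes that are left untiled.

```python
import math


def tile_counts(shape, tile_shape):
    """Number of tiles along each axis (the last tile may be partial)."""
    return [math.ceil(n / t) for n, t in zip(shape, tile_shape)]


def tile_for_element(tile_shape, element_indices):
    """Indices of the tile that contains the given array element."""
    return [i // t for i, t in zip(element_indices, tile_shape)]


def tile_url(template, axis_names, tile_indices):
    """Fill an urlTemplate such as 'http://example.com/{t}/{y}/{x}.covjson'.

    Assumption: template variables are axis names, replaced by zero-based
    tile indices along those axes.
    """
    url = template
    for name, index in zip(axis_names, tile_indices):
        url = url.replace("{" + name + "}", str(index))
    return url


tiled = {
    "axisNames": ["t", "y", "x"],
    "shape": [3, 180, 360],
    "tileSets": [{
        "tileShape": [1, 90, 90],
        "urlTemplate": "http://example.com/{t}/{y}/{x}.covjson",
    }],
}
tile_set = tiled["tileSets"][0]

print(tile_counts(tiled["shape"], tile_set["tileShape"]))  # [3, 2, 4]

# Which tile holds the element at t=2, y=100, x=200?
tile = tile_for_element(tile_set["tileShape"], [2, 100, 200])
print(tile)  # [2, 1, 2]
print(tile_url(tile_set["urlTemplate"], tiled["axisNames"], tile))
# http://example.com/2/1/2.covjson
```

With this tile shape the example array is served as 3 x 2 x 4 = 24 documents, so a client that needs one time step of one region fetches a single tile instead of the whole array.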

3.4 Encoding of Parameter metadata

Data values are described in CoverageJSON using Parameter objects. These contain a minimal set of metadata needed to do something useful with the data values: a definition of the quantity being recorded (e.g. relative humidity, potential temperature, reflectance) and the units of measure in which the data values are expressed.

The sample JSON document below shows a Parameter object describing the sea surface temperature variable from the above skeleton JSON.

"SST" : {
  "type" : "Parameter",
  "observedProperty" : {
    "id" : "http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/",
    "label" : {
      "en" : "Sea Surface Temperature",
      "de" : "Meeresoberflächentemperatur"
    },
    "description" : {
      "en" : "The temperature of sea water near the surface",
      "de" : "Die Temperatur des Meerwassers nahe der Oberfläche"
    }
  },
  "unit" : {
    "label" : {
      "en" : "Degree Celsius",
      "de" : "Grad Celsius"
    },
    "symbol": {
      "value" : "Cel",
      "type" : "http://www.opengis.net/def/uom/UCUM/"
    }
  }
}

Note that the main features of the Parameter metadata in this example are:

Other metadata, such as provenance information, is not part of the core CoverageJSON specification, but can be recorded via the extension mechanism.
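The language-tagged labels in the Parameter example above lend themselves to a simple lookup with fallback. The sketch below is one possible approach; the helper names localized and describe are ours, and the fallback rule (first available language) is an assumption, not something the format prescribes.

```python
def localized(i18n, preferred=("en",)):
    """Pick a string from a language map such as {"en": "...", "de": "..."}.

    Tries each preferred language tag in order, then falls back to the first
    entry in the map (an assumption of this sketch). Returns None if empty.
    """
    for lang in preferred:
        if lang in i18n:
            return i18n[lang]
    return next(iter(i18n.values()), None)


def describe(parameter, preferred=("en",)):
    """Build a short display string such as 'Sea Surface Temperature (Cel)'."""
    label = localized(parameter["observedProperty"]["label"], preferred)
    symbol = parameter["unit"]["symbol"]["value"]
    return f"{label} ({symbol})"


# Abridged copy of the Parameter shown above.
sst = {
    "type": "Parameter",
    "observedProperty": {
        "id": "http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/",
        "label": {"en": "Sea Surface Temperature",
                  "de": "Meeresoberflächentemperatur"},
    },
    "unit": {
        "label": {"en": "Degree Celsius", "de": "Grad Celsius"},
        "symbol": {"value": "Cel",
                   "type": "http://www.opengis.net/def/uom/UCUM/"},
    },
}

print(describe(sst))                  # Sea Surface Temperature (Cel)
print(describe(sst, ("de",)))         # Meeresoberflächentemperatur (Cel)
print(describe(sst, ("fr", "en")))    # Sea Surface Temperature (Cel)
```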

3.5 CoverageJSON documents

A single CoverageJSON document can contain one of the following types of object:

The top-level object within a document contains a “type” property that identifies the type of the object that it contains. Documents may be linked to other documents; in this way data providers can ensure that each individual document is of a manageable size, with large datasets being partitioned among a number of linked documents. (See "Support for large datasets" above.)

3.6 CoverageJSON, JSON-LD and RDF

To a limited extent, a CoverageJSON document can be converted into RDF through the use of a JSON-LD context header. The extent to which this is possible is discussed in REF (TODO: insert link to Montreal paper).

We did not consider that conversion to RDF should be a primary goal: we focused mainly on simplicity and readability of the format, under the assumption that few of the target users (web developers) would require a pure RDF representation of the data. Enabling a full conversion to RDF would require complicating the format (mainly for technical reasons including limitations of JSON-LD). Also, RDF is an unsuitable format for large arrays of data and so the Domain and Range would not convert efficiently.

Nevertheless, CoverageJSON makes frequent use of URIs to denote key concepts, such as units, observed properties, coordinate reference systems, domain types and links to other CoverageJSON documents. Clients can make use of these to detect these concepts unambiguously, whether or not they perform a translation to RDF.

By using the canonical CoverageJSON JSON-LD context, it is possible to convert the above Parameter directly into RDF triples:

_:SST <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://covjson.org/def/core#Parameter> .
_:SST <http://qudt.org/schema/qudt#unit> _:SST_UNIT .
_:SST_UNIT <http://qudt.org/schema/qudt#symbol> "Cel"^^<http://www.opengis.net/def/uom/UCUM/> .
_:SST_UNIT <http://www.w3.org/2004/02/skos/core#prefLabel> "Degree Celsius"@en .
_:SST_UNIT <http://www.w3.org/2004/02/skos/core#prefLabel> "Grad Celsius"@de .
_:SST <http://www.w3.org/2005/Incubator/ssn/ssnx/ssn#observedProperty> <http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/> .
<http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/> <http://purl.org/dc/terms/description> "Die Temperatur des Meerwassers nahe der Oberfläche"@de .
<http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/> <http://purl.org/dc/terms/description> "The temperature of sea water near the surface"@en .
<http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/> <http://www.w3.org/2004/02/skos/core#prefLabel> "Meeresoberflächentemperatur"@de .
<http://vocab.nerc.ac.uk/standard_name/sea_surface_temperature/> <http://www.w3.org/2004/02/skos/core#prefLabel> "Sea Surface Temperature"@en .

Note

An interesting area of future work would be to define two-way mappings between CoverageJSON and an RDF Data Cube representation (using the QB4ST extensions to the latter). As noted above, there are a number of similarities between the two representations, and defining such mappings should be possible.

3.7 Extension points

CoverageJSON allows data providers to extend the format in a controlled manner. The possible extensions that can be defined by users include:

In each case we recommend that URIs be used to denote these extensions (and to point to definitions), to avoid the possibility of clashes between extensions.

4. Examples

Complete examples of CoverageJSON documents can be found via the Playground. The same documents can be accessed directly on GitHub. These examples include Coverages, Coverage Collections and tiled Coverages.

5. Tools and libraries

We have developed a number of tools and libraries to help users produce, use and debug CoverageJSON documents. These are all published on the project website and include:

6. Relationship with other data models

In this section we compare the data model of CoverageJSON with that of other formats that are used to carry coverage data.

6.1 NetCDF and CF-NetCDF

Note

NetCDF (REF) is a binary, platform-independent data format for multidimensional data, which is independent of any community of practice. Essentially, a NetCDF file is a collection of multidimensional arrays, plus metadata provided as key-value pairs. Metadata conventions are required to specialise NetCDF for particular communities. The Climate and Forecast conventions are the pre-eminent conventions for geospatial data. NetCDF files that conform to these conventions are known as "CF-NetCDF files".

The overall structure of CoverageJSON is quite close to that of NetCDF (REF), consisting essentially of a set of orthogonal domain axes that can be combined in different ways. One major difference is that in CoverageJSON, there is an explicit Domain object, whereas in NetCDF the domain is specified implicitly by linking data variables with coordinate variables. One consequence of this is that NetCDF files can contain several domains and hence several Coverages. A NetCDF file could therefore be converted to a single Coverage or a Coverage Collection in CoverageJSON.

6.2 OGC Coverage Implementation Schema (CIS)

The overall concepts of CoverageJSON are close to those of the ISO19123 standard (REF) and the OGC standard Coverage Implementation Schema (CIS, REF), which specialises ISO19123. The main points of difference are:

6.3 TimeseriesML

Describe relationship to other data models, such as TimeseriesML.

TODO expand this, but the main difference is probably that TimeseriesML allows for a number of ways to associate values of time to data values: a data value may represent an accumulation or average of a quantity over time, and the time values in the domain may mark the start, end or middle of the time period in question.

7. Relationship to Use Cases and Requirements

The Spatial Data on the Web Working Group has created a set of Use Cases and Requirements for spatial data on the Web. A subset of these requirements are relevant to Coverage data. This section describes how CoverageJSON addresses relevant requirements.

How CoverageJSON relates to the Spatial Data on the Web Use Cases and Requirements Document
Requirement | CoverageJSON approach
4D model of space-time | Domains in CoverageJSON can have any number of dimensions. Many of the defined domain types support 4D domains, including grids and trajectories.
Compatibility with existing practices | CoverageJSON incorporates the same overall concepts (domain, range, and metadata) as other coverage data models, such as the OGC Coverage Implementation Schema standard (OGC document OGC 09-146r2. Is there a web link to this?). It differs in some respects from this standard.
Compressibility | CoverageJSON consists of JSON objects, which can be compressed using standard approaches, for example by enabling gzip and the corresponding Content-Encoding in a web server.
Coverage temporal extent | CoverageJSON has a means to add temporal references. This is defined to make the common case (using the Gregorian calendar) easy, but also allows alternative temporal reference systems.
Crawlability | Like any other 'file' on the web, CoverageJSON objects can have a URL and so can be found by crawlers. To what extent an agent is able to interpret the contents of a CoverageJSON file is another question. Also important for crawling and hence discovery might be metadata associated with CoverageJSON data.
CRS definition | CoverageJSON defines an approach for specifying the CRS as a URI.
Determinable CRS | CoverageJSON enables domains to be referenced to one or more CRSs, either through URI links or inline definitions.
Different time models | CoverageJSON supports a range of temporal reference systems, defaulting to the Gregorian calendar.
Discoverability | CoverageJSON does not define how to include discovery metadata at the level of a CoverageJSON document. It is usually assumed that this metadata is contained elsewhere, e.g. in a catalogue. However, CoverageJSON does provide an extension mechanism for adding custom metadata to a document. This could be used to add discovery metadata, for example from the DCAT vocabulary (REF). See this discussion.
Georectification | CoverageJSON has a flexible and extensible approach to specifying reference systems, in which data can be referenced to any grid.
Georeferenced spatial data | See the Georectification requirement above.
Linkability | CoverageJSON documents can contain links to external entities, including parameter definitions, CRS definitions or other CoverageJSON documents (see "Support for large datasets" above). CoverageJSON documents are intended to be published on the Web and can therefore be linked to.
Machine to machine | CoverageJSON is designed for machine processing, although it is also somewhat human-readable.
Multilingual support | CoverageJSON supports multi-language labels.
Observed property in coverage | Observed properties are defined in CoverageJSON within Parameter objects.
Provenance | CoverageJSON is intended primarily to describe the result of a procedure and does not provide a specific mechanism to describe the provenance or the procedure itself. The extension mechanism could be used to include relevant properties.
Quality per sample | Quality information can be incorporated as additional parameters of a coverage. For example, quality flags can be encoded as categorical parameters, and numerical errors can be described as continuous parameters. Parameter Group objects provide a mechanism to associate parameters with each other to provide internal links between data values and their associated quality information.
Reference data chunks | There are two main approaches in CoverageJSON to dividing a large coverage into chunks. In each case, each chunk is given its own identifier. (Also, large collections of coverages, such as large collections of in situ observations, can be divided into Coverage collections.)
Sensing procedure | As with "Provenance" above, CoverageJSON focuses on the results of the procedure, not the procedure itself. The text description of a parameter could potentially contain a description of the sensing method, or the extension mechanism could be used.
Spatial vagueness | It is possible to record spatially-vague data (i.e. data that is not associated with precise spatial coordinate values) in CoverageJSON. For example, a domain axis could be defined that records locations as identifiers, rather than numeric coordinates (e.g. "['London', 'New York', 'Paris']").
SSN-like representation | CoverageJSON does not attempt to provide a method to describe sensors (or any other provenance information, see above). However, CoverageJSON reuses the "observedProperty" property from SSN (https://www.w3.org/2005/Incubator/ssn/ssnx/ssn#observedProperty). Usage of this can be seen in "CoverageJSON, JSON-LD and RDF" above. The extension mechanism could be used to provide more information if required.
Support for 3D | CoverageJSON fully supports 3D data: see "4D model of space-time" above.
Support for tiling | CoverageJSON supports tiling. (This mechanism is probably more suited to raster than vector data.)
Use in Computational Models | CoverageJSON is a machine-readable format that can be both read by, and written by, computational models. However, it is primarily intended as a format for data exchange over wide-area networks, not an archive format. Therefore it is probably unlikely that the designer of a computational model would choose to read or write data in CoverageJSON directly. Tools to convert to and from NetCDF (a more common archive format) are under development.
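The "Compressibility" entry above can be illustrated with a short round trip through gzip using the Python standard library. The NdArray here is synthetic (one constant value repeated), so the size reduction it shows is far better than real measurements would achieve; it is meant only to demonstrate that the document survives compression unchanged.

```python
import gzip
import json

# Synthetic NdArray document: 8100 identical values, invented for this sketch.
doc = {
    "type": "NdArray",
    "dataType": "float",
    "axisNames": ["t", "y", "x"],
    "shape": [1, 90, 90],
    "values": [12.5] * (1 * 90 * 90),
}

raw = json.dumps(doc).encode("utf-8")
packed = gzip.compress(raw)

# Decompressing and parsing gives back exactly the same document.
restored = json.loads(gzip.decompress(packed).decode("utf-8"))
assert restored == doc

print(len(raw), "bytes as JSON;", len(packed), "bytes after gzip")
```

In a deployment the web server would normally apply this compression itself and announce it through the Content-Encoding header, so neither the data provider nor the client code needs to call gzip directly.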

8. Relationship to Best Practices

TODO: add a section explaining which of the Best Practices are followed by CoverageJSON, and how.

A. References

A.1 Informative references

[SDW-BP]
Spatial Data on the Web Best Practices. Jeremy Tandy; Linda van den Brink; Payam Barnaghi. W3C. 30 March 2017. W3C Note. URL: https://www.w3.org/TR/sdw-bp/
[SDW-UCR]
Spatial Data on the Web Use Cases & Requirements. Frans Knibbe; Alejandro Llaves. W3C. 25 October 2016. W3C Note. URL: https://www.w3.org/TR/sdw-ucr/