Coverage Solution Criteria

From Spatial Data on the Web Working Group

Our two main high-level requirements are:

  • have an identifier for various extracts of coverage datasets
  • have a means to deliver these over the web in one or more 'web-friendly' formats

What criteria should we use to assess potential solutions to these?

How should any new format relate to existing OGC coverage standards? Do we require it to be compatible with them at some level and, if so, in what way? Or should we just say "use WCS"?

What makes a format 'web-friendly'? Some initial ideas for discussion:

  • accessible over HTTP
  • follows the relevant parts of the SDW Best Practices
  • API follows patterns familiar to web data users and web developers?
  • in the balance between fully flexible and easy to use, we should probably err on the side of easy to use (e.g. be willing to accept that some edge cases are not handled, if that makes the more common cases easier)
  • can link to it, and can link from it to other relevant things
  • practical to receive data using HTTP over the network: extracts, the option to get data in chunks, etc.
  • play nicely with web mapping tools (does this contradict the need to be long-lived and not tied closely to a particular current technology?)
  • practical for a data owner to implement and operate a server following the standard
  • only a finite list of pre-prepared extracts available (simple and quick, but not very flexible)? Or allow arbitrary extracts to be requested and rely on the server to generate an appropriate response (flexible, but complex to implement and may pose performance challenges)?
  • supports a range of extract 'types', e.g. constrained spatially, by time, or by variable type
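To make the "identifier for an extract" and "can link to it" ideas concrete, here is a minimal sketch of one possible approach, not a proposal. It assumes an extract is described by an optional bounding box, an optional time range and a set of variable names, and that its identifier is a URL with a canonically ordered query string. The `Extract` class, the parameter names `bbox`, `time` and `var`, and the example.org base URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlencode


@dataclass(frozen=True)
class Extract:
    """Hypothetical description of an extract of a coverage dataset."""
    bbox: Optional[Tuple[float, float, float, float]] = None  # west, south, east, north
    time: Optional[Tuple[str, str]] = None                    # ISO 8601 start and end
    variables: Tuple[str, ...] = ()                           # variable names

    def url(self, base: str) -> str:
        """Build a link for this extract; parameter names are assumptions."""
        params = []
        if self.bbox is not None:
            params.append(("bbox", ",".join(str(v) for v in self.bbox)))
        if self.time is not None:
            params.append(("time", "/".join(self.time)))
        if self.variables:
            # Sort so that the same set of variables always gives the same URL.
            params.append(("var", ",".join(sorted(self.variables))))
        if not params:
            return base
        # Sort the parameters too, and keep ',', ':' and '/' readable.
        return base + "?" + urlencode(sorted(params), safe=",:/")


base = "http://example.org/coverage/sst"  # hypothetical dataset URL
a = Extract(bbox=(-10.0, 50.0, 2.0, 60.0), variables=("wind", "temp"))
b = Extract(bbox=(-10.0, 50.0, 2.0, 60.0), variables=("temp", "wind"))
print(a.url(base))
```

Canonical ordering means two clients asking for the same extract arrive at the same link, which matters if the URL is meant to act as an identifier rather than just a query. Whether such URLs should name arbitrary extracts or only a finite pre-prepared list is exactly the open question in the list above.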

How do we make any recommendation as future-proof as possible? Which aspects of web-friendliness relate to current trends (e.g. format A is more popular than format B at the moment) and which relate to fundamentals that can be applied to future toolsets as they arise?

Rough notes (from Jon Blower) on existing protocols, for background only. These are examples of things we might want to think about when preparing criteria.

  • WCS defines extracts in geospatial coordinates, so the query syntax is somewhat independent of the underlying data structure. So it’s quite useful for geospatial folk, and there are lots of use cases that can be implemented reasonably simply (e.g. extracting subsets of satellite imagery).
  • OPeNDAP (strictly the DAP-2 protocol) defines extracts in index coordinates. On the face of it this seems less helpful than WCS, but there are lots of valid use cases. For one thing, the client knows exactly how much data to expect (almost to the nearest byte), so it can implement certain optimisations, trading off the number of requests against the size of each payload. Also, the client can be 100% confident that the server hasn’t performed any interpolation or other manipulation: it will get a strict subset of the parent dataset. But OPeNDAP doesn’t understand geospatial coordinates, so it’s up to the client and server to agree on metadata conventions. [This can actually be a good thing too, because it means the protocol itself can remain very stable over time: any metadata improvements are independent of it.]
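The point about index coordinates letting the client know how much data to expect can be illustrated with a small sketch. It assumes DAP-2 style hyperslabs of the form `[start:stride:stop]` with an inclusive stop index, a fixed element size, and no allowance for protocol overhead (hence "almost" to the nearest byte). The variable name `sst`, the function names and the 200 000 byte limit are illustrative only.

```python
from math import prod


def hyperslab_count(start, stride, stop):
    """Number of elements selected by an inclusive range start:stride:stop."""
    if start < 0 or stride < 1 or stop < start:
        raise ValueError("invalid hyperslab")
    return (stop - start) // stride + 1


def expected_bytes(slabs, itemsize):
    """Size of the data values alone, ignoring any protocol overhead."""
    return prod(hyperslab_count(*s) for s in slabs) * itemsize


def constraint(var, slabs):
    """Render the index-based subset expression for one variable."""
    return var + "".join(f"[{a}:{b}:{c}]" for a, b, c in slabs)


def split_first_axis(slabs, itemsize, max_bytes):
    """Split a request along its first axis so each part fits in max_bytes.

    A single step along the first axis is never split, so one part may
    exceed max_bytes if a single step is already larger than the limit.
    """
    (start, stride, stop), rest = slabs[0], list(slabs[1:])
    n = hyperslab_count(start, stride, stop)
    per_step = prod(hyperslab_count(*s) for s in rest) * itemsize
    steps = max(1, max_bytes // per_step)
    parts = []
    for i in range(0, n, steps):
        first = start + i * stride
        last = start + (min(i + steps, n) - 1) * stride
        parts.append([(first, stride, last)] + rest)
    return parts


# Hypothetical 3-D variable indexed by (time, y, x), 4-byte values.
slabs = [(0, 1, 9), (100, 2, 199), (0, 1, 359)]
print(constraint("sst", slabs))          # sst[0:1:9][100:2:199][0:1:359]
print(expected_bytes(slabs, 4))          # 720000
parts = split_first_axis(slabs, 4, 200_000)
print(len(parts))                        # 5
```

With a coordinate-based request the client cannot do this arithmetic up front, because the number of grid cells inside a bounding box depends on the server's grid. Conversely, the index-based client has to work out for itself which indices correspond to the region it cares about, using whatever metadata conventions it shares with the server.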

It’s worth noting that OPeNDAP has been around for over 20 years and is very well supported by tools (in the geoscience community anyway). And yet the developers of OPeNDAP are still disappointed that it is not more mainstream. I think there are two main reasons for this:

1. In “most” use cases, people just want to download a data file; they don’t want web service access. However, there are of course very many important use cases that do want web service access.

2. It’s quite hard to run an operational data-extraction server reliably, securely and scalably. Neither OPeNDAP nor WCS provides mechanisms for disallowing large requests, for instance. Hence most servers are run by and for researchers, not the mainstream.

Neither of these concerns will be completely addressed just by inventing new protocols.
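As an illustration of the second point, a server operator could apply a size policy before doing any work, independently of the protocol. The sketch below assumes the server can work out the shape of the requested extract in advance (trivial for index-based requests, an extra step for coordinate-based ones). The function names and the 1 000 000 byte limit are hypothetical, and how a refusal would be signalled to the client is deliberately left open.

```python
from math import prod


def estimate_bytes(shape, itemsize):
    """Estimated size of an extract with the given array shape."""
    return prod(shape) * itemsize


def admit(shape, itemsize, max_bytes):
    """Decide whether to serve an extract; returns (allowed, reason)."""
    size = estimate_bytes(shape, itemsize)
    if size > max_bytes:
        return False, f"extract of {size} bytes exceeds limit of {max_bytes} bytes"
    return True, f"extract of {size} bytes accepted"


limit = 1_000_000  # hypothetical per-request limit chosen by the operator
small_ok, small_reason = admit((10, 50, 360), 4, limit)
large_ok, large_reason = admit((365, 720, 1440), 4, limit)
print(small_ok, small_reason)
print(large_ok, large_reason)
```

A check like this is easy to write but is only one part of running a reliable, secure and scalable service, which is consistent with the observation that a new protocol alone will not remove the operational burden.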