ACTION-194: Draft something for consideration in bp re metadata

State:
pending review
Person:
Rob Atkinson
Due on:
August 10, 2016
Created on:
August 3, 2016
Related emails:
  1. Re: No BP telecon this week(?) (from rob@metalinkage.com.au on 2016-08-10)

Related notes:

The Data on the Web Best Practices (https://www.w3.org/TR/dwbp/#metadata) clearly identifies the need for metadata in different contexts. It identifies two specific cases: embedded metadata, and provision of a separate metadata resource related to a given resource. It recommends that ontologies be developed to hold such metadata, according to the needs of a community of practice.

In the case of spatio-temporal data, the provision of such metadata has often been limited to a metadata record focused on human readability and discovery. Increasingly, however, there are opportunities to integrate data into applications, and this requires machine-readable forms. Many such examples exist. However, the BP also recommends reuse of vocabularies and ontologies, so BP for metadata about spatio-temporal aspects should provide examples of how such vocabularies can be reused in the different ways such metadata may be attached to spatial data.

We can identify multiple facets, characteristic of, if not unique to, spatio-temporal data:
1) Granularity -
a) resources may be accessed as a dataset containing many individual entities - preserving details of spatial information models within the dataset encoding, but only understandable through provision of metadata records associated with such datasets.
b) individual entities (real world things - aka 'features') with a set of properties, identified with a URI as per BP XXX
c) subsets, slices, dices, tiles etc - particularly where the spatial organisation of the subsets defines the relationships between such elements, and with their super- and sub-sets.
d) parameterised queries to generate subsets dynamically
2) Measurement and position
a) Which Coordinate Reference System (CRS) is used, i.e. the system in which a measurement or position has meaning
b) What phenomenon is being defined (e.g. is a geometry a centroid of an area, a gazetted boundary, an interpreted boundary etc)
c) What model is being used for the value (e.g. Geometry type)
3) Fitness for purpose
a) precision of measurement
b) accuracy of measurement
4) Role of metadata
a) identification of a concept (URI allows discovery, matching, triggering processing behaviour and dereferencing to access more details)
b) interpretation (human) - provision of labels and descriptions
c) symbolisation - provision of a standard symbol or form during data rendering
d) computation - provision of parameters for computation transformations - such as scaling, units conversion, map projections
5) Simplified data structures
This specifically refers to provision of a single default geometry for a real world object that may have multiple possible geometrical relationships. In this case the definition of the simplified view of an object may be the appropriate place to attach other metadata facets which are held constant across collections of such features.
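As an illustration only, the facets above could be modelled so that the definition of a simplified view carries the metadata that is constant across a collection, and each feature simply references that view. All class and field names below are hypothetical, and the use of the OGC URI for EPSG:4326 to identify the CRS is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GeometryMetadata:
    """Facets qualifying one geometry (field names are illustrative only)."""
    crs_uri: str                          # 2a: coordinate reference system
    role: str                             # 2b: what the geometry represents
    geometry_type: str                    # 2c: model used for the value
    precision_m: Optional[float] = None   # 3a: precision, in metres
    accuracy_m: Optional[float] = None    # 3b: accuracy, in metres


@dataclass(frozen=True)
class SimplifiedView:
    """Definition of a default geometry shared by a collection of features."""
    name: str
    geometry_metadata: GeometryMetadata


@dataclass(frozen=True)
class Feature:
    uri: str
    # Coordinate pair, in the axis order defined by the view's CRS.
    default_geometry: Tuple[float, float]
    view: SimplifiedView


# One view definition, stated once for the whole collection.
centroid_view = SimplifiedView(
    name="default-centroid",
    geometry_metadata=GeometryMetadata(
        crs_uri="http://www.opengis.net/def/crs/EPSG/0/4326",  # assumed CRS URI
        role="centroid",
        geometry_type="Point",
        precision_m=10.0,
    ),
)

# Hypothetical features that share the same view, and hence the same facets.
feature_a = Feature("http://example.org/feature/a", (-33.87, 151.21), centroid_view)
feature_b = Feature("http://example.org/feature/b", (-37.81, 144.96), centroid_view)
```

The design choice here is that features do not repeat the CRS, role or precision; they are read from the shared view definition.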

Note that it is possible to specify a constant assumed value for many of these facets when a specific ontology is used to define a spatial property. The assumed use of WGS84 for geo:Point (and geo:latitude, geo:longitude) is a case in point, although exactly what such a property means is potentially open to interpretation (def:"The geo coordinates of the place.").
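A minimal sketch of how a client might apply such an assumed constant: an explicitly stated CRS wins, otherwise the value implied by the vocabulary term is used. The lookup table and function name are hypothetical, and identifying WGS84 by the OGC URI for EPSG:4326 is an assumption made for the example.

```python
from typing import Dict, Optional

# Assumption: WGS84 is identified here by the OGC URI for EPSG:4326.
WGS84_CRS = "http://www.opengis.net/def/crs/EPSG/0/4326"

# Hypothetical lookup: vocabulary term -> CRS assumed when none is stated.
ASSUMED_CRS: Dict[str, str] = {
    "http://www.w3.org/2003/01/geo/wgs84_pos#Point": WGS84_CRS,
}


def effective_crs(term_uri: str, explicit_crs: Optional[str] = None) -> Optional[str]:
    """Return the explicit CRS if given, else the one assumed for the term.

    Returns None when the CRS is neither stated nor implied, which signals
    that explicit metadata is needed before the value can be interpreted.
    """
    if explicit_crs is not None:
        return explicit_crs
    return ASSUMED_CRS.get(term_uri)
```

For a term with no entry in the table and no explicit CRS, the function returns None rather than guessing.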

The emergence of the IoT means an increasing reliance on high-precision spatio-temporal data, which will need explicit metadata. Its real-time nature will also drive finer-grained data access, and hence an increasing requirement for embedding metadata in spatial objects.
Such applications may combine high-precision data with "near enough is good enough" data such as business locations: for example, an automated car finding and manoeuvring into a parking space near a particular store.

Given these considerations, and existing practices, the following approach is recommended:
1) Choose a set of ontologies for describing spatial, temporal and other characteristic aspects of your data, where these ontologies define data properties that support the following options:
a) embedding a URI identifier for the value of a given aspect (where a numeric value is not appropriate)
b) providing multi-lingual labels for such values
c) providing a URI for a Class (type) definition for models of the value
d) binding of data properties to the spatio-temporal objects being documented (i.e. allowing the properties of multiple different geometry representations to be individually qualified with CRS, precision etc)
e) embedding an instance of a data property in a feature instance or dataset description
f) defining explicitly the nature of a spatio-temporal property in relation to the subject of the property

2) Use the same ontologies and vocabularies in metadata at all levels of resource granularity
3) Provide human-readable labels in addition to URI references where supported by the data encoding
4) Embed labels, URIs, display symbols and machine-readable computation parameters, matching the roles of metadata listed above
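The following sketch shows one possible reading of these recommendations: the same metadata keys are used at dataset level and at feature level, and each value carries a URI together with labels. The key names, the example.org URIs and the rule that a feature inherits dataset-level values unless it overrides them are all assumptions made for illustration.

```python
import json
from typing import Any, Dict

# Dataset-level metadata: each facet has a URI plus human-readable labels.
# Key names and example.org URIs are hypothetical.
DATASET_METADATA: Dict[str, Any] = {
    "crs": {
        "uri": "http://www.opengis.net/def/crs/EPSG/0/4326",
        "labels": {"en": "WGS 84"},
    },
    "geometryRole": {
        "uri": "http://example.org/def/geometry-role/centroid",
        "labels": {"en": "centroid", "fr": "centroïde"},
    },
}


def feature_metadata(dataset_meta: Dict[str, Any],
                     overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Build feature-level metadata using the same keys as the dataset.

    Assumed rule: a feature inherits every dataset-level facet unless it
    supplies its own value for that facet.
    """
    merged = dict(dataset_meta)
    merged.update(overrides)
    return merged


# A hypothetical feature whose geometry is a boundary rather than a centroid.
feature = {
    "uri": "http://example.org/feature/42",
    "metadata": feature_metadata(
        DATASET_METADATA,
        {
            "geometryRole": {
                "uri": "http://example.org/def/geometry-role/gazetted-boundary",
                "labels": {"en": "gazetted boundary"},
            }
        },
    ),
}

print(json.dumps(feature, indent=2, ensure_ascii=False))
```

Because the vocabulary is the same at both levels, a consumer can interpret an embedded feature without first fetching the dataset description.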

When choosing vocabularies, if no further precision is required, it is recommended to simply use the schema.org vocabulary "geo" property to define an "indicative position". If more geographic or semantic precision is required, it is recommended to follow any available guidance from the Open Geospatial Consortium (OGC), which will in turn identify relevant W3C or other standards appropriate to a domain.
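For the simple case, an "indicative position" using the schema.org "geo" property might be generated as JSON-LD along the following lines. The function name, the place name and the coordinates are illustrative only; the schema.org terms used are Place, geo, GeoCoordinates, latitude and longitude.

```python
import json


def indicative_position(name: str, latitude: float, longitude: float) -> dict:
    """Describe a place with an indicative position using schema.org terms."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    return {
        "@context": "https://schema.org",
        "@type": "Place",
        "name": name,
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": latitude,
            "longitude": longitude,
        },
    }


# Hypothetical store with illustrative coordinates.
doc = indicative_position("Example Store", -33.8688, 151.2093)
print(json.dumps(doc, indent=2))
```

Note that nothing here states precision, accuracy or what the point represents, which is why this form suits only the "near enough is good enough" case.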

It is recommended that for maximum flexibility URIs resolve to one or more representations of value definitions through mechanisms such as "conneg" (Content negotiation):
1) SKOS
2) an RDF instance for the declared datatype for the property value (OWL and RDFS where the value is a type reference)
3) JSON-LD equivalent encoding of the RDF form
4) forms relevant to the available encodings of the data resources being described. For example, provide a .prj file for the CRS of a shapefile [https://en.wikipedia.org/wiki/Shapefile]
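A simplified sketch of the server-side selection step in content negotiation is given below. It parses an HTTP Accept header, orders the requested media types by q-value, and returns the first one for which a representation exists. The file names are hypothetical, and partial wildcards such as "text/*" are deliberately not handled, so this is not a complete implementation of HTTP content negotiation.

```python
from typing import Dict, Optional

# Hypothetical representations available for one definition URI.
AVAILABLE: Dict[str, str] = {
    "text/turtle": "definition.ttl",
    "application/rdf+xml": "definition.rdf",
    "application/ld+json": "definition.jsonld",
}


def choose_representation(accept_header: str,
                          available: Dict[str, str] = AVAILABLE) -> Optional[str]:
    """Return the media type to serve, or None if nothing acceptable exists."""
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default q-value when none is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        # Sort key: highest quality first, then order of appearance.
        candidates.append((-quality, position, media_type))

    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 means the client refuses this type
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return next(iter(available))  # fall back to the first available form
    return None
```

A caller would then serve available[media_type] with a matching Content-Type header, or respond with 406 Not Acceptable when the result is None.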




Rob Atkinson, 8 Aug 2016, 00:38:46
