From Spatial Data on the Web Working Group
- 1 BP Consolidation
- 1.1 Goal
- 1.2 Analysis pointers
- 1.3 Other thoughts
- 1.4 Next steps
- 1.5 Use case review
- 1.5.1 4.1 Meteorological Data Rescue
- 1.5.2 4.2 Habitat zone verification for designation of Marine Conservation Zones
- 1.5.3 4.3 Real-time Wildfire Monitoring
- 1.5.4 4.4 Diachronic Burnt Scar Mapping
- 1.5.5 4.5 Harvesting of Local Search Content
- 1.5.6 4.6 Locating a thing
- 1.5.7 4.7 Publishing geographical data
- 1.5.8 4.8 Consuming geographical data in a web application
- 1.5.9 4.9 Enabling publication, discovery and analysis of spatiotemporal data in the humanities
- 1.5.10 4.10 Publishing geospatial reference data
- 1.5.11 4.11 Integration of governmental and utility data to enable smart grids
- 1.5.12 4.12 Using spatial data from the web in GIS systems during emergency response operations
- 1.5.13 4.13 Publication of air quality data aggregations
- 1.5.14 4.14 Publication of transport card validation and recharging data
- 1.5.15 4.15 Combining spatial RDF data for integrated querying in a triplestore
- 1.5.16 4.16 Dutch Base Registry
- 1.5.17 4.17 Publishing Cultural Heritage Data
- 1.5.18 4.18 Dissemination of 3D geological data
- 1.5.19 4.19 Publication of Raw Subsurface Monitoring Data
- 1.5.20 4.20 Use of a place name ontology for geo-parsing text and geo-enabling searches
- 1.5.21 4.21 Driving to work in the snow
- 1.5.22 4.22 Intelligent Transportation System
- 1.5.23 4.23 Optimizing energy consumption, production, sales and purchases in Smart Grids
- 1.5.24 4.24 Linked Data for Tax Assessment
- 1.5.25 4.25 Images, e.g. a Time series of a Water Course
- 1.5.26 4.26 Droughts in geological complex environments where groundwater is important
- 1.5.27 4.27 Soil data applications
- 1.5.28 4.28 Bushfire response coordination centre
- 1.5.29 4.29 Observations on geological samples
- 1.5.30 4.30 Spatial Sampling
- 1.5.31 4.31 Select hierarchical geographical regions for use in data analysis or visualisation
- 1.5.32 4.32 Satellite data processing
- 1.5.33 4.33 Marine observations - eMII
- 1.5.34 4.34 Marine observations - data providers
- 1.5.35 4.35 Marine observations - data consumers
- 1.5.36 4.36 Building information management and data sharing
- 1.5.37 4.37 Landsat data services
- 1.5.38 4.38 Metadata and Search Granularity
- 1.5.39 4.39 Crowdsourced earthquake observation information
- 1.5.40 4.40 TCGA / Microscopy Imaging
- 1.5.41 4.41 Crop yield estimation using multiple satellites
- 1.5.42 4.42 Geospatial extensions to domain-independent metadata schemas
- 1.5.43 4.43 Improving discovery of spatial data on the Web
- 1.5.44 4.44 INSPIRE compliance using web standards
- 1.5.45 4.45 Event-like geographic features
- 1.5.46 4.46 Creation of “virtual observations” from “analysis” phase of weather prediction model
- 1.5.47 4.47 Incorporating geospatial data (e.g. geo-referenced geometry) into interactive 3D graphics on the web
- 1.5.48 4.48 Smart Cities
This is a work in progress by the BP editors: Jeremy, Payam and Lewis.
A first pass of consolidated narratives is available here: BP_Consolidated_Narratives
Analysis of use cases to determine repeated patterns, common themes. Focus is on BP document rather than the other deliverables.
- Publisher vs. consumer centric?
- Try to avoid repeating the requirements! Requirements are technical features that we need to describe; we’re trying to pull out common narratives in which those features can be applied
- Can’t get into the details of how processing algorithms work (e.g. wildfire classification from remote sensing imagery, geo-referencing etc.) - describe the information that these processing algorithms need to know in order to operate
- do we want to be recommending particular encodings? or just providing examples of how encodings are used? … (see UC#4.7)
- identify overlaps with Data on the Web Best Practices and Web Architecture
- this has been completed -- see BP Consolidated Narratives
Use case review
4.1 Meteorological Data Rescue
- consumer: search of ‘catalogue’ (domain-specific or search engine?); multi-lingual (Cambodian, French) << based on what information? date (1950’s), location (Cambodia), subject (deforestation? other essential climate variables?)
- data subject location (Cambodia)
- physical resource location (pamphlet, in Paris)
- provenance ([observation -] paper publication - digital image - digitised dataset)
- publication: identifier assignment
- publication: links between resources (digital image of pamphlet, scientific paper, digitised dataset) allowing discovery of related resources
- publication: summary record (metadata) to support discovery and evaluation
4.2 Habitat zone verification for designation of Marine Conservation Zones
- consumer: “line of sight between policy and data” … citizen science, transparency … publishing your workings
- consumer: referring to subsets (slices? ranges?) of a (very) large (coverage) dataset using dimensions- both space/time and thematic (e.g. quantity kind; specified in external vocabularies) …
- dataset linkages; this dataset (e.g. a controlled vocabulary) provides a ‘dimension’ on that dataset
- publisher: provision of “observation context” enabling the dataset to be appropriately interpreted … when / where (inc. mobile sensors), what, sensor, in-situ/ex-situ/remote sensing
- types of data? coverage, feature
- do we want to talk about annotating a specific feature?
- [use of gazetteer to] relate thematic identifier (place name; a ‘social identifier’) to real-world location - with associated accuracy/quality … fuzzy location
4.3 Real-time Wildfire Monitoring
- observation context for remote sensing data to allow downstream processing to discriminate which data to be used for the processing algorithm
- reassembly of ‘image tiles’ into a large remote sensing dataset; how to assemble in the correct sequence? (transport layer cannot guarantee order)
- sub-selection (crop for area of interest)
- geo-referencing raw imagery
- imagery / raster data processing (pixel classification for wildfire occurrence)
- feature derivation (wildfires)
- publication of derived vector product with supplementary metadata (shapefiles)
- cross reference wildfire features with features from other datasets (land cover, administrative geography, Geonames) … as input to decision-support systems
- positional accuracy not explicitly mentioned - but potentially applicable here
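A minimal sketch of the tile reassembly concern in this use case, assuming each tile carries its own row/column index as metadata. The `Tile` type and its field names are hypothetical and not taken from the use case. Because arrival order is not guaranteed, the mosaic is rebuilt from the indices rather than from the order of receipt.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Tile:
    row: int                  # hypothetical grid row index carried in tile metadata
    col: int                  # hypothetical grid column index carried in tile metadata
    pixels: List[List[int]]   # tile raster as rows of pixel values


def assemble(tiles: Iterable[Tile]) -> List[List[int]]:
    """Rebuild a mosaic from tiles received in any order."""
    by_pos = {(t.row, t.col): t for t in tiles}
    n_rows = max(r for r, _ in by_pos) + 1
    n_cols = max(c for _, c in by_pos) + 1
    missing = [(r, c) for r in range(n_rows) for c in range(n_cols)
               if (r, c) not in by_pos]
    if missing:
        raise ValueError(f"missing tiles: {missing}")
    mosaic: List[List[int]] = []
    for r in range(n_rows):
        tile_height = len(by_pos[(r, 0)].pixels)
        for y in range(tile_height):
            line: List[int] = []
            for c in range(n_cols):
                # Concatenate the same pixel row from each tile, left to right.
                line.extend(by_pos[(r, c)].pixels[y])
            mosaic.append(line)
    return mosaic
```

The point of the sketch is that the sequencing information has to travel with each tile; without it the consumer cannot recover the layout.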
4.4 Diachronic Burnt Scar Mapping
- [data discovery and acquisition]
- geo-referencing satellite imagery
- imagery / raster data processing (pixel classification for cloud masking and burn-scar classification; noise removal)
- feature derivation (burn scars)
- cross reference burn-scar features with other geo-information
- generate maps (from feature data) that can be overlaid with other map layers … digital acetates!
4.5 Harvesting of Local Search Content
- harvesting unstructured event and location information (e.g. from HTML and embedded content such as RDFa and JSON objects)
- "web pages are the canonical source of data"
- avoid the need to manually enter structured data into dedicated portals
- cross referencing events and activities to locations
- reconciliation of qualitative representations of place (noting multiple addressing standards) and [precise!] spatial co-ordinates … fuzzy matching
- publication of event information in a structured form for downstream re-use
- the issue that Google is concerned with [is] how can we identify the geospatial content on the Web rather than just crawling HTML (from eparsons' comment in SDW WG call, 22-Jul-2015)
- [eparsons] "We can parse the page and work out if it's talking about a location but it's not great. The other side of that is how can we tag our own content so that our pages can be understood. It's about a methodology for identifying spatial content on the Web."
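As an illustration of the fuzzy matching mentioned for reconciling harvested place names, the sketch below scores a harvested name against a list of known names using Python's standard `difflib`. The gazetteer contents, the `best_match` helper and the 0.8 cutoff are assumptions made for the example; a real reconciliation would also need to handle addressing standards and coordinates.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace before comparison."""
    return " ".join(name.lower().split())


def best_match(query: str, gazetteer: Sequence[str],
               cutoff: float = 0.8) -> Optional[str]:
    """Return the most similar gazetteer name, or None if nothing reaches the cutoff."""
    q = normalise(query)
    best, best_score = None, 0.0
    for candidate in gazetteer:
        score = SequenceMatcher(None, q, normalise(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= cutoff else None


# Illustrative gazetteer; the names are examples only.
names = ["Teddington", "Paddington", "Exeter"]
print(best_match("Teddingtn", names))
print(best_match("Xyz", names))
```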
4.6 Locating a thing
- expressing location and places in human terms; relative statements … to me (based on my current context- including my location) and to other things such as administrative geographies (e.g. Teddington) or building layouts (e.g. Durbin Ward at Royal Devon and Exeter hospital) … noting the hierarchical relationships between locations- the essential ‘containment’ of one place within another
- … also- would be interesting to reconcile information about places drawn from multiple sources … because someone has published information about a place, that probably means that it exists (at least in their context)
- … assessing ‘containment’ provides a basis for discussing spatial operations (e.g. within, touches etc.)
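The hierarchical 'containment' of one place within another can be sketched as a walk up a chain of parent links. The `PARENT` table below is illustrative: it reuses the hospital example from this use case, and the structure is an assumption rather than a proposed data model.

```python
from typing import Dict, Optional

# Hypothetical 'is contained in' links between named places.
PARENT: Dict[str, Optional[str]] = {
    "Durbin Ward": "Royal Devon and Exeter hospital",
    "Royal Devon and Exeter hospital": "Exeter",
    "Exeter": "Devon",
    "Devon": None,
}


def is_within(place: str, container: str,
              parent: Dict[str, Optional[str]] = PARENT) -> bool:
    """True if `container` is reachable from `place` by following parent links."""
    seen = set()
    current = parent.get(place)
    while current is not None and current not in seen:
        if current == container:
            return True
        seen.add(current)          # guards against accidental cycles in the links
        current = parent.get(current)
    return False
```

Here `is_within("Durbin Ward", "Devon")` holds through two intermediate places, which is the transitive behaviour a 'within' operation over named places would need.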
4.7 Publishing geographical data
- not really a use case- more a list of technical requirements
4.8 Consuming geographical data in a web application
- publishing an interactive map using a web application
- placing features on the map, providing a 3D render, time ‘slider' … “to do something meaningful with spatial data that is [provided]” … runtime binding of new data resources to the map
- searching for [geographic] data
- using metadata to evaluate and use discovered data … “what type of representation?” 2D / 3D, raster/vector, “which encoding”, “un/compressed”, “filter data to get the most appropriate [spatial] representation” etc.
- coordinate transformation to convert discovered data to match the map projection
- … how do I package APIs for simple re-use? e.g. so that my coordinate transformation service or script can be easily re-used by third parties rather than them writing their own (“small APIs”)
- representing positional (in)accuracy
- selecting between [conflicting] sources of information - representing data provenance / origin / timeliness (is it out of date?)
- >> reconciling duplicate information based on assessment of the data model?
- what data services / “technical features" are available for me to interact with this dataset resource? (e.g. get spatial data, perform spatial operations, asynchronous operation via ‘promises') …
- e.g. SPARQL endpoint … supporting GeoSPARQL? mapping service … with tiles?
- >> is service binding (to the application) done at run time or at design time.
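To make the coordinate transformation step concrete, here is a small sketch that converts WGS 84 longitude/latitude to spherical Web Mercator. The choice of these two reference systems is an assumption for illustration (the use case does not name a map projection), and the function name is hypothetical. It is the kind of script the "small APIs" bullet suggests packaging for re-use.

```python
import math

EARTH_RADIUS_M = 6378137.0   # WGS 84 semi-major axis, used as the sphere radius
MAX_LAT_DEG = 85.051129      # approximate latitude limit of the Web Mercator square


def lonlat_to_web_mercator(lon_deg: float, lat_deg: float) -> tuple:
    """Project WGS 84 lon/lat (degrees) to Web Mercator x/y (metres)."""
    if not -MAX_LAT_DEG <= lat_deg <= MAX_LAT_DEG:
        raise ValueError("latitude outside the Web Mercator range")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y
```

A consumer still needs to know the source CRS of the discovered data before calling anything like this, which is why the metadata bullets above matter.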
4.9 Enabling publication, discovery and analysis of spatiotemporal data in the humanities
- spatiotemporal correlation … "discover patterns in time and/or space in interlinked distributed datasets”
- simple to publish
- simple to discover
- representation of uncertainty; precision, probability, confidence
- expression of spatial containment relationships (and other topologies … enabling spatial reasoning using Region Connection Calculus RCC8) … these may vary with time
- correlating place names (toponyms) with geometry (which may change over time) and use of imprecise spatial referencing (e.g. near, north of)
- [historical] gazetteers
- correlating named events and calendars with temporal reference systems and use of imprecise temporal referencing (e.g. around, during, before, after) … enabling temporal reasoning using Allen Calculus
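For readers unfamiliar with the Allen interval relations referred to above, the following sketch classifies a pair of crisp intervals into one of the thirteen relations. It assumes intervals are given as numeric `(start, end)` pairs with `start < end`; it does not address the imprecise or uncertain bounds that the humanities data would often have.

```python
from typing import Tuple

Interval = Tuple[float, float]


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the Allen relation that holds from interval a to interval b."""
    (a0, a1), (b0, b1) = a, b
    if not (a0 < a1 and b0 < b1):
        raise ValueError("intervals must have start < end")
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Remaining cases are proper partial overlaps.
    return "overlaps" if a0 < b0 else "overlapped-by"
```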
4.10 Publishing geospatial reference data
- linking _my_ data with (one or more) [geospatial] reference data resources (e.g. administrative unit)
- identification of real-world things [Cool URIs](http://www.w3.org/TR/cooluris/)
- persistent identification of [information] resources is a pre-requisite … but the question is how best to do this; do we need to provide guidance on URI patterns?
- relating document URIs (that resolve to information resources) to the real-world thing that the document describes … document URIs will often be bound to the web-service from which the information resource is available
- which identifier should I use? real-world thing or information resource … URLs only resolve to one place; how do I find information provided by other parties about this real-world thing?
- how do I discover what information resources are available for a real-world thing? … semantic indexing service? … data publisher, or third party, provide list of related datasets / technical features / service endpoints?
- potential for non-unique naming; different identifiers may be used for the same real-world thing across different information resources … reconciliation of information about the same real-world thing to build a “complete” view of the real-world thing from each information 'facet'
- how should the semantics of a Link be made explicit? (e.g. RDF predicate is a URI that usually resolves … if not RDF, how do we locate the data model that defines the semantics of the link)
- [a service may provide multiple representations (or ‘views’) of the same real-world resource … how do I request the correct representation] << note: this is distinct to conneg; the information content differs, not the content encoding
- [assessment of trust - how do I know that this reference data is ‘good’ … the more the reference dataset is used, the more likely it is to be trustworthy (wisdom of crowds)- count the in-bound links]
- [find related data - given _my_ data, what else may be of interest … traverse the relationship to the reference dataset and see what else is linking with that- catalogue the in-bound links]
- this use case is as much about traversing the web of data (the ‘connectedness’ of datasets) as publication of reference data …
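The ideas of counting and cataloguing in-bound links can be sketched as an inverted index from a reference identifier to the datasets that link to it. The link tuples and example URIs below are hypothetical; how such links would be harvested in practice is exactly the open question raised above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def inbound_index(links: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each target URI to the set of source datasets that link to it.

    `links` is an iterable of (source_dataset, target_uri) pairs.
    """
    index: Dict[str, Set[str]] = defaultdict(set)
    for source, target in links:
        index[target].add(source)
    return index


# Hypothetical harvested links to a reference dataset of administrative units.
links = [
    ("http://example.org/dataset/schools", "http://example.org/id/admin-unit/1"),
    ("http://example.org/dataset/census", "http://example.org/id/admin-unit/1"),
    ("http://example.org/dataset/census", "http://example.org/id/admin-unit/2"),
]
index = inbound_index(links)
# Number of in-bound links: a crude 'wisdom of crowds' signal.
print(len(index["http://example.org/id/admin-unit/1"]))
# Catalogue of linking datasets: candidates for 'what else may be of interest'.
print(sorted(index["http://example.org/id/admin-unit/1"]))
```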
4.11 Integration of governmental and utility data to enable smart grids
- information reconciliation for the same, similar or related resource(s) across multiple domains
- “switch from domain-specific semantics to common semantics” << is this really achievable?
- Is a better approach to publish data models in such a way that axiomatic mappings between related concepts can be published (and maintained) allowing data conforming to one model/ontology to be reframed according to another?
- This suggests that a “feature type catalogue” or other data model registry is a necessary part of the semantic web infrastructure
- Furthermore, to support multiple communities, a data provider may publish different ‘views’ of a resource (e.g. different information constructs) and the mappings between those views … from SKOS concepts for simple classification concerns, to rich, full-featured ontologies like CIM and SSN …
- … “because people involved in integration of data from multiple domains should not be burdened with having to grasp the full complexity of each domain”
4.12 Using spatial data from the web in GIS systems during emergency response operations
- Spatial correlation (& reconciliation?) of 'managed' (authoritative) data from a GIS warehouse (served via OGC WFS) with ad-hoc data sourced from the web (in numerous web formats, like GeoJSON etc.) in support of emergency response
- creating a 'common operating picture' from disparate data and providing visualisations to emergency responders.
- how do we discover what information is "out there" on the web that might be of use? how do we decide whether to trust that data?
- Cross-reference location of features against administrative geographies
- Correlating information from multiple sources about the same real-world thing - especially relevant where an emergency response spans jurisdictions (regional, national etc.)
- potentially, a single real-world thing could be assigned identifiers by each jurisdiction, leading to a non-unique naming problem
- Defining a machine-readable vocabulary to describe the domain which is based on agreed terminology (semantics); using this vocabulary to describe real-world features ... how should this linking be done?
4.13 Publication of air quality data aggregations
- Describe context associated with air quality observation
- location described with address (e.g. using Location Core Vocabulary) rather than spatial coordinates
- observed properties (e.g. SO2, NO2, O3, CO, etc.)
- time of observation (hourly observations)
- Need to be able to distinguish between the instantaneous hourly observation and the daily aggregation derived from them
- time instant vs time period
- describing the provenance of the aggregate value; source observation, the aggregation process etc. (description of processing chain for earth observing data; PROV-O?)
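One way to keep the hourly observations and the daily aggregate distinguishable is to give them different types and have the aggregate record what it was derived from. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, the arithmetic mean is just one possible aggregation process, and `derived_from` is only loosely analogous to a PROV-O derivation link.

```python
from dataclasses import dataclass
from datetime import date, datetime
from statistics import mean
from typing import List


@dataclass
class Observation:
    id: str
    observed_property: str   # e.g. "NO2"
    instant: datetime        # a time instant
    value: float


@dataclass
class DailyAggregate:
    observed_property: str
    day: date                # a time period (one calendar day)
    value: float
    method: str              # description of the aggregation process
    derived_from: List[str]  # ids of the source observations


def daily_mean(observations: List[Observation], prop: str, day: date) -> DailyAggregate:
    """Aggregate one day's observations of a property, keeping their ids as provenance."""
    used = [o for o in observations
            if o.observed_property == prop and o.instant.date() == day]
    if not used:
        raise ValueError("no observations for that property and day")
    return DailyAggregate(prop, day, mean(o.value for o in used),
                          "arithmetic mean", [o.id for o in used])
```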
4.14 Publication of transport card validation and recharging data
- Description of domain features (metro stations, station entrance/exit points, buses, bus stops, card validation points, card recharging points) all of which have some spatial attributes (address and geographic coordinates).
- "Observations" of card validation (in/out) and recharging for a given card happen at a domain feature and a specific time; these events are captured for analysis.
- A transport card is linked with a particular user for whom a profile is maintained.
4.15 Combining spatial RDF data for integrated querying in a triplestore
- How to reconcile / combine information from disparate datasets that are inconsistent- e.g. using different geometry representations, different spatial reference systems etc.
- Can we simplify the problem through standardising on one geometry representation and using common URIs for standard reference systems? Can we have a canonical reference system in order to mitigate the need for coordinate transformations?
- ¿uniformity across everyone seems unrealistic; perhaps the underlying concern here is how a particular community should agree to use common representations and reference systems?
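To illustrate what "common URIs for standard reference systems" can look like in practice, the sketch below splits a GeoSPARQL-style WKT literal into its optional leading CRS URI and the WKT text, falling back to the CRS84 default that GeoSPARQL defines when no URI is given. The helper names and the sample coordinates are assumptions for the example; no actual coordinate transformation is attempted.

```python
import re
from typing import Tuple

# Default CRS assumed by GeoSPARQL when a WKT literal carries no CRS URI.
DEFAULT_CRS = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]+)>\s+)?(.+?)\s*$", re.DOTALL)


def split_wkt_literal(literal: str) -> Tuple[str, str]:
    """Return (crs_uri, wkt_text) for a WKT literal with an optional <crs> prefix."""
    match = _WKT_LITERAL.match(literal)
    if not match:
        raise ValueError("empty WKT literal")
    crs, wkt = match.groups()
    return (crs or DEFAULT_CRS), wkt


def needs_transformation(literal_a: str, literal_b: str) -> bool:
    """True if the two geometries declare different reference systems."""
    return split_wkt_literal(literal_a)[0] != split_wkt_literal(literal_b)[0]


# Illustrative literals: one in a national CRS, one relying on the default.
a = "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT(155000 463000)"
b = "POINT(5.38 52.15)"
print(split_wkt_literal(a))
print(needs_transformation(a, b))
```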
4.16 Dutch Base Registry
- Linked Data approach is used to mitigate these concerns:
- data is difficult to find
- data from the Dutch base registries cannot easily be linked to other data
- authenticity of data is questionable due to multiple copies of data being available [each potentially being in a different state?]
- geospatial references are informal, incorrect or outdated; need a common way to express geospatial attributes in Linked Data
- ... desire is to perform spatial queries over the entire set of base registries; current queries are limited by fragmented approaches
- CRS is cited as a particular issue;
- we have to know what CRS is used in order to interpret the data,
- use of different CRSs (e.g. because they provide greater accuracy in a local context) makes combining different datasets difficult,
- can we agree a 'default CRS'?
- geometry data (e.g. coordinates) can be large- how do we optimise performance?
- ... for data transfer?
- ... for spatial queries? (e.g. creating simpler versions or precomputing the relationships)
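"Creating simpler versions" of large geometries could be done in many ways; the sketch below uses the Douglas-Peucker line simplification algorithm as one well-known option, which is a choice made here and not something the use case prescribes. It assumes planar coordinates (for example in a projected CRS) and a tolerance in the same units.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _line_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def simplify(points: Sequence[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker: drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _line_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify each side recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right
```

The simplified geometry is cheaper to transfer, but it changes positional accuracy, so a publisher would want to state the tolerance used.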
4.17 Publishing Cultural Heritage Data
- describe events (e.g. World War 1, birth of Albert Einstein etc.) in linked data
- ... location and time.
- ... inexact references (e.g. second quarter of the 9th century; approx. 825-850)
- ... terms such as Renaissance Italy, a geographic entity, apply to a specific time range - albeit an inexact one
- domain experts (staff of cultural institutions) are not technical experts- how can we make it easy for non-techies to publish info?
- ... ¿is this the role of tooling?
- ... validator for WKT and GeoJSON cited as potentially useful things
- reasoning over inexact references (esp. temporal) is hampered by limitations of OWL (xsd:dateTime only etc.)
4.18 Dissemination of 3D geological data
- description of data in 3-dimensions;
- 3d spatial models; TINs ... these can be variable resolution in order to express complex geologies
- 3d voxel datasets ... these can have heterogeneous resolution
- 3d point clouds ...
- need to include uncertainty in position (e.g. of geological boundaries)
- want to expose simple APIs to work with complex (3d) data- e.g. cross-sections & subsetting etc. through 3d geological datasets
4.19 Publication of Raw Subsurface Monitoring Data
- publication of raw sensor data for easy consumption by 3rd party web applications (visualisation, spatio-temporal filtering, statistical analysis, alerts)
- ¿seems like the "publication for easy use" is about exposing the data through convenient APIs?
- alerts is interesting; first time we've come across a need for working with data in real-time; potentially working with data-stream
- observation context is required to enable interpretation of the data values;
- location (x,y,z [depth]) and time
- observed property
- sensor descriptions
- platform descriptions ... satellite, airborne, in-situ (borehole, surface)
4.20 Use of a place name ontology for geo-parsing text and geo-enabling searches
- use of a geological gazetteer [described as an ontology in the use case] to provide names (primary and aliases) of geological features and time periods (etc.) that enable historical records to be cross-referenced in space and time
- the gazetteer should provide geometry, topological relationships to other named features, time-reference (e.g. to geological period) - including expression of uncertainty in geometry and time
- versioning of the information resource is important- e.g. the geometry of a named feature may be updated- either because of a policy change (for administrative geography) or new measurement (for a natural phenomenon)
4.21 Driving to work in the snow
- discover observation streams within a given spatial context (and matching some criteria);
- query for observations within a given spatiotemporal context (and matching some criteria);
- publish and subscribe to an observation stream within a given spatial context (and matching some criteria);
- filter an observation stream based on given set of criteria (e.g., quality, feature-of-interest, observed-property);
- integration of SSN with existing, well-known, IoT protocols (CoAP, MQTT, etc.);
- represent "high-level" qualitative observations (such as observations of user behavior, e.g., Sue stopped at the coffee shop).
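The publish/subscribe and filtering requirements above can be sketched with a tiny in-process broker in which each subscription carries a spatial context (a bounding box) and a set of criteria. This is purely illustrative: the class, the dictionary keys and the bounding-box convention are assumptions, and it says nothing about how this would be mapped onto CoAP, MQTT or SSN.

```python
from typing import Any, Callable, Dict, List, Tuple

BBox = Tuple[float, float, float, float]   # (min_lon, min_lat, max_lon, max_lat)
Observation = Dict[str, Any]               # expects at least "lon" and "lat" keys


class ObservationBroker:
    """Deliver published observations to subscribers whose context and criteria match."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[BBox, Dict[str, Any],
                                        Callable[[Observation], None]]] = []

    def subscribe(self, bbox: BBox, criteria: Dict[str, Any],
                  callback: Callable[[Observation], None]) -> None:
        self._subscriptions.append((bbox, dict(criteria), callback))

    def publish(self, obs: Observation) -> int:
        """Send obs to every matching subscriber; return the number of deliveries."""
        delivered = 0
        for (min_lon, min_lat, max_lon, max_lat), criteria, callback in self._subscriptions:
            inside = (min_lon <= obs["lon"] <= max_lon
                      and min_lat <= obs["lat"] <= max_lat)
            matches = all(obs.get(key) == value for key, value in criteria.items())
            if inside and matches:
                callback(obs)
                delivered += 1
        return delivered
```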
4.22 Intelligent Transportation System
- exploitation of spatio-temporal data in (near-) real-time (observed/forecast weather/road conditions, traffic, incidents etc.) to compute accurate travel duration
- description of transport infrastructure; geometry and topology (e.g. road network, parking spaces), time (bus schedules, traffic light durations) etc.
- exposing (near) real-time data streams through convenient APIs
- discovery of relevant datasets based on context (individual user, time, locations etc)
4.23 Optimizing energy consumption, production, sales and purchases in Smart Grids
- energy management in smart grids leads to dynamic pricing based on availability and demand;
- demand is affected by forecast weather conditions and statistical data on energy consumption
- availability must take into account small-scale supply; e.g. from domestic solar panels etc.
- these data have both geographic and temporal attributes; in case of time, both long(er) term (for statistical data) and short(er) term for predictions and observations
4.24 Linked Data for Tax Assessment
- calculation of tax based on real-estate ownership and use of that property
- information required for that calculation is distributed across many systems
- ... establish durable links between records in these systems to simplify the acquisition of this information; allow a consumer [with necessary permissions] to traverse these links to acquire related information
- ¿ assume that a tax assessor would only use authoritative data for this (e.g. following links to bona fide data sources) rather than discovering information from the web (which may have ambiguous provenance)
- information has both spatial and temporal attributes
4.25 Images, e.g. a Time series of a Water Course
- stitching a set of images (of a water-course) together to create a time-series; metadata about each image (location of subject area, time of picture) required in order to assemble disparate images into an ordered set
- providing metadata to enable users to discover the image-set; e.g. based on time and location
- this is actually quite similar to use case 4.4 which discussed burn-scar mapping; in this case the satellite imagery is assembled from tiles that cover a portion of the area of interest
4.26 Droughts in geological complex environments where groundwater is important
- hydrogeological simulation of river basin composed of distinct (sub-) components that are operated separately ... potentially by multiple geographically distributed organisations
- each sub-component needs input information to drive the simulation;
- static data (e.g. the DEM, the aquifer shape, connectivity and composition etc.) ... including spatial and topological attributes
- dynamic data (e.g. from observations or virtual observations output from other simulations) ... including both spatial and temporal attributes
- sub-components with feed-back loops may require data to be shared in (near) real-time, necessitating the use of data streaming (?)
- sensor data (and data from 'virtual' sensors; e.g. derived from simulation) must be accompanied by sufficient context in order to understand / interpret the data values
- I see this as a number of black boxes within a (complex) processing chain, each of which import/export data with spatio-temporal attributes; describing the processing chain (e.g. linking output data to the process and its inputs) provides a line of sight between the observations and conclusions
4.27 Soil data applications
- spatial cross-reference cadastral boundaries with soil-type classification zones and soil analysis data
- soil classification uses standardised scheme (controlled vocabulary)
- expose 'digital soil map' (vector data), functional soil property data (90m resolution raster) and soil analysis data through convenient APIs for consumption in web applications that, say, farmers might use
- WMS, WFS and WCS are specifically cited
- "access to observed data [using routine queries; soil type, location] in a standard format that allows [use by downstream algorithms]"
- (re-)use existing SoilML data model to provide semantics
- extended semantics provided by ANZSoilML; need to be able to trace back to the common SoilML data model in order to, say, reconcile data from disparate providers using different extensions?
- ¿is there a need to map the SoilML data to simple(r) format for input to standard algorithms?
- combine data from different (national) sources to create global dataset; provenance of each sub-set should be clear
- datasets published to enable easy discovery by users from other domains; e.g. food security
- soil observation data needs to include sufficient context for interpretation of those values.
4.28 Bushfire response coordination centre
- discovery of features based on thematic identifier (e.g. place name "Springwood")
- thematic identifier does not provide unique identifier; features are uniquely identified using URI
- simple display on map as single point in addition to bounding box and more complex geographic representations; named features (may) have multiple geometries
- features are linked to related resources; additional sources of information (population profiles, nearby environmental sensors)
- topological relationships to other features are expressed as links
- discovery of other datasets that "use" the feature's unique identifier [back-links]; e.g. to find alternative polygon and population information from an alternative dataset, e.g. to find emergency contact and evacuation centre information from local government area data holdings (LGA reuses the identifier, but the federal dataset doesn't know about the LGA dataset)
- exposure of complex data (e.g. census data) through convenient API
4.29 Observations on geological samples
- observations taken ex-situ on specimens (sampling feature)
- the need to describe the observation context is implied
- specimens have identifiers too; often a specimen will be subdivided and shared amongst several labs, each of which assigns a local identifier [non-unique naming] ... reports from disparate labs about the same specimen need to be correlated- even though each lab uses a different identifier
- specimens have spatial attributes- both where they were extracted from (e.g. the drill hole, traverse or cruise) and where they currently are (e.g. in drawer m of lab n etc.)
- need to be able to group specimens in batches
- some information about the specimens (e.g. extraction location) is restricted (e.g. for confidentiality reasons); this means that the information provided to (some) downstream users will be a subset of the total known about the specimens
- ... this might be described as a 'view' that provides a well-defined subset
- ... to provide this restricted view, need to be able to expose the dataset through a (convenience) API that provides only the permitted subset of information
- derived samples generated by physical procedures at the lab; need to be able to describe this (physical) processing chain and how the derived sample relates to the original specimen [provenance]
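The correlation of lab-local specimen identifiers can be sketched as grouping identifiers connected by "same specimen" assertions, here with a small union-find structure. The identifiers and the existence of such assertions are hypothetical; the use case does not say how the equivalences would be recorded.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def reconcile(same_as_pairs: Iterable[Tuple[Hashable, Hashable]]) -> List[Set[Hashable]]:
    """Group identifiers that are linked, directly or transitively, by same-specimen pairs."""
    parent: Dict[Hashable, Hashable] = {}

    def find(x: Hashable) -> Hashable:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[Hashable, Set[Hashable]] = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return list(groups.values())
```

Given pairs such as `("labA:17", "labB:s-9")` and `("labB:s-9", "labC:0042")`, all three local identifiers end up in one group, so reports filed under any of them can be brought together.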
4.30 Spatial Sampling
- observations are made on samples that are representative of the subject of interest
- the sampling regime may be statistical, use a proxy phenomenon (from which the ultimate property of interest can be derived) or be based on some spatial distribution
- samples may be related; forming a "topology" of sampling
- sampling features may be associated with organisational structures (cruises, campaigns, missions, etc.)
- positional accuracy of samples within a spatial distribution may be different to the positional accuracy of the spatial distribution itself e.g. the ends of a sampling traverse may be known in a national or global coordinate reference system to ±5m accuracy; soil samples may be taken along the traverse at every 1m ±0.01m interval
4.31 Select hierarchical geographical regions for use in data analysis or visualisation
- relate statistical data to administrative geographies (or other geographic regions)
- relate geographic regions / places to each other
- specific need to support non-overlapping regions with complete coverage of a 'parent' region (a 'Mutually Exclusive Collectively Exhaustive' - MECE - set)
- used to assist aggregation of data e.g. population statistics for all councils in Scotland
- describing the variation of geometries with time
- ... versioning the "information resource" that describes, say, a given municipal region - or creating a complex information resource that describes the history of change?
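The MECE condition can be checked mechanically if regions are expressed as sets of smaller atomic units (for example grid cells or lowest-level statistical areas). That representation is an assumption made for this sketch; a geometry-based check would need a spatial library and a tolerance for boundary slivers.

```python
from typing import Dict, Hashable, Set, Tuple


def check_mece(parent: Set[Hashable],
               children: Dict[str, Set[Hashable]]) -> Tuple[set, set, set]:
    """Return (overlaps, gaps, spill) for child regions against a parent region.

    overlaps: units claimed by more than one child (not mutually exclusive)
    gaps:     parent units covered by no child (not collectively exhaustive)
    spill:    child units lying outside the parent
    """
    seen: set = set()
    overlaps: set = set()
    for units in children.values():
        overlaps |= seen & units
        seen |= units
    return overlaps, parent - seen, seen - parent


def is_mece(parent: Set[Hashable], children: Dict[str, Set[Hashable]]) -> bool:
    """True when the children partition the parent exactly."""
    return not any(check_mece(parent, children))
```

When `is_mece` holds, a statistic such as population can be aggregated to the parent region by summing over the children without double counting or omissions.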
4.32 Satellite data processing
- Detailed satellite data processing user story representing general needs
- Consumer: Needs both multi-satellite and ground-sensed data with extensive metadata and localised algorithms
- Consumer: wants to republish results as a spatial coverage
- Similar to use case 4.41
4.33 Marine observations - eMII
- Consumer/Publishers want to receive information in standard formats (where semantics are explicitly defined) for re-publishing in a topic-specific portal
- Consumer/Publishers want validation tools before uploading data to a portal
- Consumer: I want to discover, filter and download data using multiple arbitrary parameters, and get a result in a consistent self-describing format (like use case 4.35)
- need to provide observation context sufficient to interpret the data values
- data exposed through convenience APIs- e.g. to allow data that is not of interest to be filtered out
- republishing data from disparate sources (partners) via a portal as a homogeneous dataset; need to map from data formats used by partners to the 'common' format
- or get the partners to agree to use a common standard in the first place!
4.34 Marine observations - data providers
- Some of this seems well out of scope, but the in-scope part is about users of published data navigating around spatial and thematic concepts to find data, sometimes with a temporal dimension too. The story may be about publishing for many access points and rich links.
- some of the data is derived from spatially-located images with extended attributes at points within an image.
- data published through convenience APIs (web services; including WMS) ... specific query patterns are provided which should be mapped to APIs e.g. "Show me what vertebrate and invertebrate species were found at this site [...]"
- metadata (ISO 19115) published for harvest by discovery portals
- this use case indicates that ALA has "thousands of WMS layers" - how are the semantics associated with each layer expressed; only through the layer name?
- need to capture observation context - clearly including location and time
- use of photographic camera as "remote sensing" instrument
- likely need to capture the provenance of "scores" derived from each photograph; who did the scoring, when, using what method etc.
4.35 Marine observations - data consumers
- Many brief desirables from consumers of an existing portal
- Consumers: want to download bits of stuff, commonly sensor observations, extracted out of netCDF on demand with fewer clicks and an easier format such as CSV.
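As a rough illustration of the "easier format" request above, the sketch below flattens a handful of sensor observations into CSV. The records, field names and the `observations_to_csv` helper are hypothetical and not taken from the use case; in practice the values would first be extracted from netCDF on the server side, which is not shown here.

```python
import csv
import io

# Hypothetical observations, standing in for values already extracted from netCDF.
OBSERVATIONS = [
    {"time": "2015-06-01T00:00:00Z", "lat": -42.5, "lon": 147.3, "sea_temp_c": 12.4},
    {"time": "2015-06-01T01:00:00Z", "lat": -42.5, "lon": 147.3, "sea_temp_c": 12.1},
]


def observations_to_csv(records):
    """Serialise a list of flat observation dicts to CSV text with a header row."""
    if not records:
        return ""
    buffer = io.StringIO()
    # Column order follows the keys of the first record.
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(observations_to_csv(OBSERVATIONS))
```

A single request returning text like this is one way to read "fewer clicks": the consumer never has to handle the netCDF container directly.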
4.36 Building information management and data sharing
- Publisher: Wants to publish 2-D and 3-D geometries of public infrastructure and building projects, linked to aspatial information like road maintenance and materials.
- Publisher: Needs to publish around 11000 objects including the traffic network per project.
- This involves sharing/exchanging information amongst multiple parties, but it looks like only one party acts as publisher onto an access-controlled server.
4.37 Landsat data services
- Some overlap with 4.41
- Strong consumer perspective
- Consumer: wants easier access to remote sensing data for rapid integration and mashup building
- Publisher: wants to enable access via standards.
- Publisher: JSON and RESTful Web APIs are useful to our consumers.
- ARG25 'dataset' (dataset collection?) composed of 184000 scenes
- searchable via CSW API; each scene has ISO 19115 metadata record
- scenes exposed via convenience API (e.g. WMS)
- common desire to "stitch" scenes together for an area/period of interest
- ... had to work through metadata catalogue to determine which layers to select
- ... OGC web service APIs more complex than typical "web APIs" - barrier to adoption and hence limited uptake
- ... need to provide targeted (simple to use) convenience APIs that deliver against specific goals
4.38 Metadata and Search Granularity
- publisher: needs to publish geological samples
- consumer: needs to be able to search within the (attribute/feature) data, not just over collection level metadata that is available through a catalogue
4.39 Crowdsourced earthquake observation information
- Crowd-participant (publisher): I want to tell someone what is going on around me, how it feels and the damage I see right now.
- Geotechnical-expert (consumer): I want that crowd-sourced real-time sensor stream precisely geolocated and correlated with my own seismometer network.
- the small guys are the publishers here -- the big guys consume and integrate.
- Question -- is this webby enough for us?
4.40 TCGA / Microscopy Imaging
- Publisher: microscopy images of 100k+ objects spatially related within each image (automatically segmented from the image)
- consumer: images (and objects in images?) need to be linked to other data like lab results, patient groups, and other morphological features as a result of analysis and for subsequent analysis. This raises the need for the consumer (not the publisher) to do persistent linking.
- location is not geocentric here, only locally relative 2D and 3D cartesian-based. There is also a temporal dimension.
- yet another citation for use of WKT descriptions of geometry
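To make the WKT point concrete, here is a minimal sketch of how one segmented object outline might be written as WKT in local image coordinates. The `polygon_to_wkt` helper and the sample outline are hypothetical; the use case does not prescribe this code, and no coordinate reference system is implied because the positions are only relative to the image.

```python
def polygon_to_wkt(ring):
    """Format a list of (x, y) vertices as a WKT POLYGON, closing the ring if needed."""
    if len(ring) < 3:
        raise ValueError("a polygon ring needs at least three vertices")
    closed = list(ring)
    # WKT requires the first and last vertex of a ring to be identical.
    if closed[0] != closed[-1]:
        closed.append(closed[0])
    coords = ", ".join(f"{x} {y}" for x, y in closed)
    return f"POLYGON (({coords}))"


# Hypothetical outline of one segmented object, in pixel coordinates of a single image.
outline = [(10, 10), (40, 10), (40, 25), (10, 25)]
print(polygon_to_wkt(outline))  # POLYGON ((10 10, 40 10, 40 25, 10 25, 10 10))
```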
4.41 Crop yield estimation using multiple satellites
- this is a stretch use case, but even doing some of it would help the status quo
- consumer: access to several different kinds of complex spatial data
- Consumer: data includes multi-spectral satellite coverages from multiple satellites at different spatial and temporal resolutions, with different (but close) spatial locations and timestamps
- Consumer: the data is big
- Consumer: Can I send my code to the data instead of bringing the data back to me?
- consumer: data includes Synthetic Aperture Radar data that could be pre-processed to points in 3D space or 3D triangulated surfaces
- consumer: Can I pull a polygonal or named geolocation out of the data? Even informally-named regions?
- publisher: multispectral data could be pre-processed to a datacube model in dimensions of 2D space and time
- publisher: What SAR preprocessing is helpful? How do I represent that 3D-point or triangulated surface coverage?
- publisher: The amount of data is big. I have the infrastructure for this but what should my consumers do?
- publisher: how do I describe all that data anyway, including the source sensors and all that pre-processing?
4.42 Geospatial extensions to domain-independent metadata schemas
- a set of attributes is defined in (Geo)DCAT-AP as "minimum required to support [...] interoperability [of] European data portals" so that a user is provided a uniform search/discovery experience
- use case provides a list of the deficiencies of existing practices
4.43 Improving discovery of spatial data on the Web
- free text search is "far from satisfactory" but it's the most frequently used method
- users look to search engines before thematic data portals
- how can the data exposed by web portals be published/improved/optimised to enable discovery via the search engines?
4.44 INSPIRE compliance using web standards
- national implementations of the INSPIRE Directive often focus on compliance to the Implementing Rules rather than the usability of the published content
- Linked Data proposed as the best way to achieve this ... for which a gamut of new best practice needs to be provided and accepted
4.45 Event-like geographic features
- events are a very common type of feature; they have both spatial and temporal attributes
- events may be static or dynamic; position and/or geographic extent may change over time (along with other attributes)
4.46 Creation of “virtual observations” from “analysis” phase of weather prediction model
- description of observation context required to interpret the data values
- the normal attributes such as location, time and observed property
- in this case, the sensor is virtual (e.g. the "observation" is derived from a numerical simulation)
- need to be able to distinguish between real observations (e.g. taken using in-situ sensors) and virtual observations (e.g. derived from numerical simulations)
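One possible way to carry the real/virtual distinction alongside the usual observation context is sketched below. The `Observation` record, the `ProcedureKind` values and the sample data are all hypothetical and only illustrate the requirement; they are not drawn from any particular observation model or vocabulary.

```python
from dataclasses import dataclass
from enum import Enum


class ProcedureKind(Enum):
    """How the value was produced (hypothetical two-way classification)."""
    IN_SITU_SENSOR = "in-situ sensor"
    NUMERICAL_SIMULATION = "numerical simulation"


@dataclass(frozen=True)
class Observation:
    observed_property: str
    value: float
    unit: str
    time: str  # ISO 8601 timestamp
    lat: float
    lon: float
    procedure: ProcedureKind

    @property
    def is_virtual(self) -> bool:
        # A "virtual observation" is one derived from a numerical simulation.
        return self.procedure is ProcedureKind.NUMERICAL_SIMULATION


# Hypothetical sample: one measured and one model-derived air temperature.
observations = [
    Observation("air_temperature", 3.2, "Cel", "2015-01-10T06:00:00Z",
                50.7, -3.5, ProcedureKind.IN_SITU_SENSOR),
    Observation("air_temperature", 3.6, "Cel", "2015-01-10T06:00:00Z",
                50.7, -3.5, ProcedureKind.NUMERICAL_SIMULATION),
]

real_only = [o for o in observations if not o.is_virtual]
print(len(real_only))  # 1
```

Keeping the procedure as an explicit field, rather than inferring it from the data source, lets a consumer filter or weight the two kinds of value without extra knowledge of the publisher's systems.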
4.47 Incorporating geospatial data (e.g. geo-referenced geometry) into interactive 3D graphics on the web
- 3D geospatial data can be complex
- rendering (using) 3D geospatial data resources within web applications needs to be simple enough for web developers; they shouldn't have to become geospatial experts
- simplify interaction with 3D geospatial data by
- exposing data through convenience APIs
- mapping the complex internal data model to provide a simple encoding, perhaps a format that can be readily consumed by 3D rendering packages
4.48 Smart Cities
- Context-aware multi-modal real time travel planner and Public parking space availability prediction are covered by 4.22 Intelligent Transportation System
- Stimulating green behaviour requires multiple datasets (many of which have spatial and temporal attributes) to be linked together in order to determine impacts of (non-)green behaviour; noise levels, water quality, air quality, waste index, recycling activities. This complex analysis needs to be simple enough for citizens to engage with ...